From 3d00adde7c31736e65abd0e9a6cb5829ac005732 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Thu, 28 Aug 2014 17:36:55 +0200 Subject: [PATCH 0001/1195] initial draft still a few things to flesh out i thin. --- active/0000-empty-structs-with-braces.md | 158 +++++++++++++++++++++++ 1 file changed, 158 insertions(+) create mode 100644 active/0000-empty-structs-with-braces.md diff --git a/active/0000-empty-structs-with-braces.md b/active/0000-empty-structs-with-braces.md new file mode 100644 index 00000000000..12be0177d17 --- /dev/null +++ b/active/0000-empty-structs-with-braces.md @@ -0,0 +1,158 @@ +- Start Date: (fill me in with today's date, YYYY-MM-DD) +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +When a struct type `S` has no fields (a so-called "empty struct"): + + * allow `S` to be defined via either `struct S;` (as today) + or `struct S {}` (new) + * allow instances of `S` to be constructed via either the + expression `S` (as today) or the expression `S {}` (new) + * allow instances of `S` to be pattern matched via either the + pattern `S` (as today) or the pattern `S {}` (new). + +# Motivation + +Today, when writing code, one must treat an empty struct as a +special case, distinct from structs that include fields. +That is, one must write code like this: +```rust +struct S2 { x1: int, x2: int } +struct S0; // kind of different from the above. + +let s2 = S2 { x1: 1, x2: 2 }; +let s0 = S0; // kind of different from the above. + +match (s2, s0) { + (S2 { x1: y1, x2: y2 }, + S0) // you can see my pattern here + => { println!("Hello from S2({}, {}) and S0", y1, y2); } +} +``` + +While this yields code that is relatively free of extraneous +curly-braces, this special case handling of empty structs presents +problems for two cases of interest: code generators (including, but +not limited to, Rust macros) and conditionalized code (i.e. code with +`cfg` attributes). 
+ +The special case handling of empty structs is also a problem for +programmers who actively add and remove fields from structs during +development; such changes cause a struct to switch from being empty +and non-empty, and the associated revisions of changing removing and +adding curly braces is aggravating (both in effort revising the code, +and also in extra noise introduced into commit histories). + +This RFC proposes going back to the state we were in circa February +2013, when both `S0` and `S0 { }` were accepted syntaxes for an empty +struct. The parsing ambiguity that motivated removing support for +`S0 { }` is no longer present (see [#ancient_history]). + + +# Detailed design + +Revise the grammar of struct item definitions so that one can write +either `struct S;` or `struct S { }`. The two forms are synonymous. +The first is preferred with respect to coding style; for example, the +first is emitted by the pretty printer. + +Revise the grammar of expressions and patterns so that, when `S` is an +empty struct, one can write either `S` or `S { }`. The two forms are +synonymous. Again, the first is preferred with respect to coding style, +and is emitted by the pretty printer. + +# Drawbacks + +Some people like "There is only one way to do it." But, there is +precendent in Rust for violating "one way to do it" in favor of +syntactic convenience or regularity; see +[#precedent_for_flexible_syntax_in_rust]. +Also, see Alternative 1 below. + +# Alternatives + +Alternative 1: Require empty curly braces on empty structs. + +Alternative 2: Status quo. Macros and code-generators in general +will need to handle empty structs as a special case. We may +continue hitting bugs like + +# Unresolved questions + +None. + +# Appendices + +## Ancient History + +A parsing ambiguity was the original motivation for disallowing the +syntax `struct S {}` in favor of `struct S;` for an empty struct +declaration. 
The ambiguity and various options for dealing with it +were well documented on the [associated mailing list thread][RustDev +Thread]. Both syntaxes were simultaneously supported at the time. +Support for `struct S {}` was removed because that was the most +expedient option. In particular, at that time, the option of "Place a +parser restriction on those contexts where `{` terminates the +expression and say that struct literals cannot appear there unless +they are in parentheses." was explicitly not chosen, in favor of +continuing to use the disambiguation rule in use at the time, namely +that the presence of a label (e.g. `S { a_label: ... }`) was *the* way +to distinguish a struct constructor from an identifier followed by a +control block, and thus, "there must be one label." + +In particular, at the time that mailing list thread was created, the +code match `match x {} ...` would be parsed as `match (x {}) ...`, not +as `(match x {}) ...` (see [Rust PR 5137]); likewise, `if x {}` would +be parsed as an if-expression whose test component is the struct +literal `x {}`. Thus, at the time of [Rust PR 5137], if the input to +a `match` or `if` was an identifier expression, one had to put +parentheses around the identifier to force it to be interpreted as +input, and not as a struct constructor. + +Things have changed since then; namely, we have now adopted the +aforementioned parser restriction [Rust RFC 25]. (The text of RFC 25 +does not explicitly address `match`, but we have effectively expanded +it to include a curly-brace delimited block of match-arms in the +definition of "block".) Today, one uses parentheses around struct +literals in some contexts (such as `for e in (S {x: 3}) { ... }` or +`match (S {x: 3}) { ... }` + +## Precedent for flexible syntax in Rust + +There is precendent in Rust for violating "one way to do it" in favor +of syntactic convenience or regularity. 
+ +For example, one can often include an optional trailing comma, for +example in: `let x : &[int] = [3, 2, 1, ];`. + +One can also include redundant curly braces or parentheses, for +example in: +```rust +println!("hi: {}", { if { x.len() > 2 } { ("whoa") } else { ("there") } }); +``` + +One can even mix the two together when delimiting match arms: +```rust + let z: int = match x { + [3, 2] => { 3 } + [3, 2, 1] => 2, + _ => { 1 }, + }; +``` + +We do have lints for some style violations (though none catch the +cases above), but lints are different from fundamental language +restrictions. + + +[RustDev Thread]: https://mail.mozilla.org/pipermail/rust-dev/2013-February/003282.html + +[Rust Issue 5167]: https://github.com/rust-lang/rust/issues/5167 + +[Rust RFC 25]: https://github.com/rust-lang/rfcs/blob/master/complete/0025-struct-grammar.md + +[CFG parse bug]: https://github.com/rust-lang/rust/issues/16819 + +[Rust PR 5137]: https://github.com/rust-lang/rust/pull/5137 From 99301be11e48d0ee64676e90f8a83576dfd189fa Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 29 Aug 2014 11:43:26 +0200 Subject: [PATCH 0002/1195] fleshed out remainder of rfc. 
--- active/0000-empty-structs-with-braces.md | 186 ++++++++++++++++++++--- 1 file changed, 167 insertions(+), 19 deletions(-) diff --git a/active/0000-empty-structs-with-braces.md b/active/0000-empty-structs-with-braces.md index 12be0177d17..2aefdc11bf8 100644 --- a/active/0000-empty-structs-with-braces.md +++ b/active/0000-empty-structs-with-braces.md @@ -1,17 +1,13 @@ -- Start Date: (fill me in with today's date, YYYY-MM-DD) +- Start Date: (fill me in with today's date, 2014-08-28) - RFC PR: (leave this empty) - Rust Issue: (leave this empty) # Summary -When a struct type `S` has no fields (a so-called "empty struct"): - - * allow `S` to be defined via either `struct S;` (as today) - or `struct S {}` (new) - * allow instances of `S` to be constructed via either the - expression `S` (as today) or the expression `S {}` (new) - * allow instances of `S` to be pattern matched via either the - pattern `S` (as today) or the pattern `S {}` (new). +When a struct type `S` has no fields (a so-called "empty struct"), +allow it to be defined via either `struct S;` or `struct S {}`, and +allow instances of it to be constructed and pattern-matched via either +`S` or `S {}`. # Motivation @@ -34,9 +30,14 @@ match (s2, s0) { While this yields code that is relatively free of extraneous curly-braces, this special case handling of empty structs presents -problems for two cases of interest: code generators (including, but -not limited to, Rust macros) and conditionalized code (i.e. code with -`cfg` attributes). +problems for two cases of interest: automatic code generators +(including, but not limited to, Rust macros) and conditionalized code +(i.e. code with `cfg` attributes; see appendix [#the_cfg_problem]). 
+The heart of the code-generator argument is: Why force all +to-be-written code-generators and macros with special-case handling of +the empty struct case (in terms of whether or not to include the +surrounding braces), especially since that special case is likely to +be forgotten (yielding a latent bug in the code generator). The special case handling of empty structs is also a problem for programmers who actively add and remove fields from structs during @@ -49,10 +50,20 @@ This RFC proposes going back to the state we were in circa February 2013, when both `S0` and `S0 { }` were accepted syntaxes for an empty struct. The parsing ambiguity that motivated removing support for `S0 { }` is no longer present (see [#ancient_history]). - +Supporting empty braces in the syntax for empty structs is easy to do +in the language now. # Detailed design + * Allow `S` to be defined via either `struct S;` (as today) + or `struct S {}` (new) + + * Allow instances of `S` to be constructed via either the + expression `S` (as today) or the expression `S {}` (new) + + * Allow instances of `S` to be pattern matched via either the + pattern `S` (as today) or the pattern `S {}` (new). + Revise the grammar of struct item definitions so that one can write either `struct S;` or `struct S { }`. The two forms are synonymous. The first is preferred with respect to coding style; for example, the @@ -60,20 +71,39 @@ first is emitted by the pretty printer. Revise the grammar of expressions and patterns so that, when `S` is an empty struct, one can write either `S` or `S { }`. The two forms are -synonymous. Again, the first is preferred with respect to coding style, -and is emitted by the pretty printer. +synonymous. Again, the first is preferred with respect to coding +style, and is emitted by the pretty printer. + +The format of the definiton has no bearing on the format of the +expressions or pattern forms; either syntax can be used for any +empty-struct, regardless of how it is defined. 
+ +There is no ambiguity introduced by this change, because we have +already introduced a restriction to the Rust grammar to force the use +of parentheses to disambiguate struct literals in such contexts. (See +[Rust RFC 25]). + # Drawbacks Some people like "There is only one way to do it." But, there is precendent in Rust for violating "one way to do it" in favor of syntactic convenience or regularity; see -[#precedent_for_flexible_syntax_in_rust]. -Also, see Alternative 1 below. +the appendix +[Precedent for flexible syntax in Rust][#precedent_for_flexible_syntax_in_rust]. +Also, see Alternative 1: "Always Require Braces" below. # Alternatives -Alternative 1: Require empty curly braces on empty structs. +Alternative 1: "Always Require Braces". Specifically, require empty +curly braces on empty structs. People who like the current syntax of +curly-brace free structs can encode them this way: `enum S0 { S0 }` +This would address all of the same issues outlined above. (Also, the +author (pnkfelix) would be happy to take this tack.) The main reason +not to take this tack is that some people may like writing empty +structs without braces, but do not want to switch to the unary enum +version. See "I wouldn't want to force noisier syntax ..." in +[#recent_history]. Alternative 2: Status quo. Macros and code-generators in general will need to handle empty structs as a special case. We may @@ -81,10 +111,102 @@ continue hitting bugs like # Unresolved questions -None. +## Empty Tuple Structs + +The code-generation argument could be applied to tuple-structs as +well, to claim that we should allow the syntax `S0()`. I am less +inclined to add a special case for that. Note that we should not +attempt to generalize this RFC as proposed to include tuple structs, +i.e. so that given `struct S0 {}`, the expressions `T0`, `T0 {}`, and +`T0()` would be synonymous. 
The reason is that +given a tuple struct `struct T2(int, int)`, the identifier `T2` is +*already* bound to the constructor function: + +```rust +fn main() { + #[deriving(Show)] + struct T2(int, int); + + fn foo(f: |int, int| -> S) { + println!("Hello from {} and {}", f(2,3), f(4,5)); + } + foo(T2); +} +``` + +So if we were to attempt to generalize the leniency of this RFC to +tuple structs, we would be in the unfortunate situation given `struct +T0();` of trying to treat `T0` simultaneously as an instance of the +struct and as a constructor function. So, the handling of empty +structs proposed by this RFC does not generalize to tuple structs. + +(Note that if we adopt alternative 1, then the issue of how tuple +structs are handled is totally orthogonal -- we could add support for +`struct T0()` as a distinct type from `struct S0 {}`, if we so wished, +or leave it aside.) # Appendices +## The CFG problem + +A program like this works today: + +```rust +fn main() { + #[deriving(Show)] + struct Svaries { + x: int, + y: int, + + #[cfg(zed)] + z: int, + } + + let s = match () { + #[cfg(zed)] _ => Svaries { x: 3, y: 4, z: 5 }, + #[cfg(not(zed))] _ => Svaries { x: 3, y: 4 }, + }; + println!("Hello from {}", s) +} +``` + +Observe what happens when one modifies the above just a bit: +```rust + struct Svaries { + #[cfg(eks)] + x: int, + #[cfg(why)] + y: int, + + #[cfg(zed)] + z: int, + } +``` + +Now, certain `cfg` settings yield an empty struct, even though it +is surrounded by braces. Today this leads to a [CFG parse bug]. 
+ +If we want to support situations like this properly, we will probably +need to further extend the `cfg` attribute so that it can be placed +before individual fields in a struct constructor, like this: + +```rust +// You cannot do this today, +// but maybe in the future (after a different RFC) +let s = Svaries { + #[cfg(eks)] x: 3, + #[cfg(why)] y: 4, + #[cfg(zed)] z: 5, +}; +``` + +Supporting such a syntax consistently in the future should start today +with allowing empty braces as legal code. (Strictly speaking, it is +not *necessary* that we add support for empty braces at the parsing +level to support this feature at the semantic level. But supporting +empty-braces in the syntax still seems like the most consistent path +to me.) + ## Ancient History A parsing ambiguity was the original motivation for disallowing the @@ -119,6 +241,10 @@ definition of "block".) Today, one uses parentheses around struct literals in some contexts (such as `for e in (S {x: 3}) { ... }` or `match (S {x: 3}) { ... }` +Note that there was never an ambiguity for uses of `struct S0 { }` in item +position. The issue was solely about expression position prior to the +adoption of [Rust RFC 25]. + ## Precedent for flexible syntax in Rust There is precendent in Rust for violating "one way to do it" in favor @@ -146,6 +272,26 @@ We do have lints for some style violations (though none catch the cases above), but lints are different from fundamental language restrictions. +## Recent history + +There was a previous [RFC PR][RFC PR 147] that was effectively the +same in spirit to this one. It was closed because it was not +sufficient well fleshed out for further consideration by the core +team. 
However, to save people the effort of reviewing the comments on +that PR (and hopefully stave off potential bikeshedding on this PR), I +here summarize the various viewpoints put forward on the comment +thread there, and note for each one, whether that viewpoint would be +addressed by this RFC (accept both syntaxes), by Alternative 1 (accept +only `S0 {}`), or by the status quo (accept only `S0`). + + + +* "I find `let s = S0;` jarring, think its an enum initially." ==> Favors: Alternative 1 +* "Frequently start out with an empty struct and add fields as I need them." ==> Favors: This RFC or Alternative 1 +* "Foo{} suggests is constructing something that it's not; all uses of the value `Foo` are indistinguishable from each other" ==> Favors: Status Quo +* "I find it strange anyone would prefer `let x = Foo{};` over `let x = Foo;`" ==> Favors Status Quo; strongly opposes Alternative 1. +* "I agree that 'instantiation-should-follow-declation', that is, structs declared `;, (), {}` should only be instantiated [via] `;, (), { }` respectively" ==> Opposes leniency of this RFC in that it allows expression to use include or omit `{}` on an empty struct, regardless of declaration form, and vice-versa. +* "The code generation argument is reasonable, but I wouldn't want to force noisier syntax on all 'normal' code just to make macros work better." ==> Favors: This RFC [RustDev Thread]: https://mail.mozilla.org/pipermail/rust-dev/2013-February/003282.html @@ -156,3 +302,5 @@ restrictions. [CFG parse bug]: https://github.com/rust-lang/rust/issues/16819 [Rust PR 5137]: https://github.com/rust-lang/rust/pull/5137 + +[RFC PR 147]: https://github.com/rust-lang/rfcs/pull/147 From 30873b058b4c71bba43949f5c72a4e0cbf5f6dd7 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 29 Aug 2014 11:44:48 +0200 Subject: [PATCH 0003/1195] fixed a href link. 
--- active/0000-empty-structs-with-braces.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/active/0000-empty-structs-with-braces.md b/active/0000-empty-structs-with-braces.md index 2aefdc11bf8..8188f0d13d1 100644 --- a/active/0000-empty-structs-with-braces.md +++ b/active/0000-empty-structs-with-braces.md @@ -32,7 +32,7 @@ While this yields code that is relatively free of extraneous curly-braces, this special case handling of empty structs presents problems for two cases of interest: automatic code generators (including, but not limited to, Rust macros) and conditionalized code -(i.e. code with `cfg` attributes; see appendix [#the_cfg_problem]). +(i.e. code with `cfg` attributes; see appendix [The CFG problem][#the_cfg_problem]). The heart of the code-generator argument is: Why force all to-be-written code-generators and macros with special-case handling of the empty struct case (in terms of whether or not to include the From 40e608cafa1e04b5a58228abf17bdc2a88043299 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 29 Aug 2014 11:45:43 +0200 Subject: [PATCH 0004/1195] again attempt to fix href format. --- active/0000-empty-structs-with-braces.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/active/0000-empty-structs-with-braces.md b/active/0000-empty-structs-with-braces.md index 8188f0d13d1..854b4d3949a 100644 --- a/active/0000-empty-structs-with-braces.md +++ b/active/0000-empty-structs-with-braces.md @@ -32,7 +32,7 @@ While this yields code that is relatively free of extraneous curly-braces, this special case handling of empty structs presents problems for two cases of interest: automatic code generators (including, but not limited to, Rust macros) and conditionalized code -(i.e. code with `cfg` attributes; see appendix [The CFG problem][#the_cfg_problem]). +(i.e. code with `cfg` attributes; see appendix [The CFG problem]. 
The heart of the code-generator argument is: Why force all to-be-written code-generators and macros with special-case handling of the empty struct case (in terms of whether or not to include the @@ -293,6 +293,8 @@ only `S0 {}`), or by the status quo (accept only `S0`). * "I agree that 'instantiation-should-follow-declation', that is, structs declared `;, (), {}` should only be instantiated [via] `;, (), { }` respectively" ==> Opposes leniency of this RFC in that it allows expression to use include or omit `{}` on an empty struct, regardless of declaration form, and vice-versa. * "The code generation argument is reasonable, but I wouldn't want to force noisier syntax on all 'normal' code just to make macros work better." ==> Favors: This RFC +[The CFG problem]: #the_cfg_problem + [RustDev Thread]: https://mail.mozilla.org/pipermail/rust-dev/2013-February/003282.html [Rust Issue 5167]: https://github.com/rust-lang/rust/issues/5167 From 35a247ff49528ec2072b66939075880753f7db25 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 29 Aug 2014 11:55:16 +0200 Subject: [PATCH 0005/1195] more cleanup, attempted to fix many urls. --- active/0000-empty-structs-with-braces.md | 74 ++++++++++++++---------- 1 file changed, 45 insertions(+), 29 deletions(-) diff --git a/active/0000-empty-structs-with-braces.md b/active/0000-empty-structs-with-braces.md index 854b4d3949a..33fd8947523 100644 --- a/active/0000-empty-structs-with-braces.md +++ b/active/0000-empty-structs-with-braces.md @@ -32,7 +32,7 @@ While this yields code that is relatively free of extraneous curly-braces, this special case handling of empty structs presents problems for two cases of interest: automatic code generators (including, but not limited to, Rust macros) and conditionalized code -(i.e. code with `cfg` attributes; see appendix [The CFG problem]. +(i.e. code with `cfg` attributes; see the [CFG problem] appendix). 
The heart of the code-generator argument is: Why force all to-be-written code-generators and macros with special-case handling of the empty struct case (in terms of whether or not to include the @@ -49,7 +49,7 @@ and also in extra noise introduced into commit histories). This RFC proposes going back to the state we were in circa February 2013, when both `S0` and `S0 { }` were accepted syntaxes for an empty struct. The parsing ambiguity that motivated removing support for -`S0 { }` is no longer present (see [#ancient_history]). +`S0 { }` is no longer present (see the [Ancient History] appendix). Supporting empty braces in the syntax for empty structs is easy to do in the language now. @@ -89,12 +89,13 @@ of parentheses to disambiguate struct literals in such contexts. (See Some people like "There is only one way to do it." But, there is precendent in Rust for violating "one way to do it" in favor of syntactic convenience or regularity; see -the appendix -[Precedent for flexible syntax in Rust][#precedent_for_flexible_syntax_in_rust]. -Also, see Alternative 1: "Always Require Braces" below. +the [Precedent for flexible syntax in Rust] appendix. +Also, see [Always Require Braces] alternative below. # Alternatives +## Always Require Braces + Alternative 1: "Always Require Braces". Specifically, require empty curly braces on empty structs. People who like the current syntax of curly-brace free structs can encode them this way: `enum S0 { S0 }` @@ -102,25 +103,32 @@ This would address all of the same issues outlined above. (Also, the author (pnkfelix) would be happy to take this tack.) The main reason not to take this tack is that some people may like writing empty structs without braces, but do not want to switch to the unary enum -version. See "I wouldn't want to force noisier syntax ..." in -[#recent_history]. +version. See "I wouldn't want to force noisier syntax ..." in the +[Recent History] appendix. -Alternative 2: Status quo. 
Macros and code-generators in general -will need to handle empty structs as a special case. We may -continue hitting bugs like +## Status quo -# Unresolved questions +Alternative 2: Status quo. Macros and code-generators in general will +need to handle empty structs as a special case. We may continue +hitting bugs like [CFG parse bug]. Some users will be annoyed but +most will probably cope. ## Empty Tuple Structs +One might say "why are you including support for curly braces, but not +parentheses?" Or in other words, "what about empty tuple structs?" + The code-generation argument could be applied to tuple-structs as well, to claim that we should allow the syntax `S0()`. I am less -inclined to add a special case for that. Note that we should not -attempt to generalize this RFC as proposed to include tuple structs, -i.e. so that given `struct S0 {}`, the expressions `T0`, `T0 {}`, and -`T0()` would be synonymous. The reason is that -given a tuple struct `struct T2(int, int)`, the identifier `T2` is -*already* bound to the constructor function: +inclined to add a special case for that; I think tuple-structs are +less frequently used (especially with many fields); they are largely +for ad-hoc data such as newtype wrappers, not for code generators. + +Note that we should not attempt to generalize this RFC as proposed to +include tuple structs, i.e. so that given `struct S0 {}`, the +expressions `T0`, `T0 {}`, and `T0()` would be synonymous. The reason +is that given a tuple struct `struct T2(int, int)`, the identifier +`T2` is *already* bound to a constructor function: ```rust fn main() { @@ -140,10 +148,14 @@ T0();` of trying to treat `T0` simultaneously as an instance of the struct and as a constructor function. So, the handling of empty structs proposed by this RFC does not generalize to tuple structs. 
-(Note that if we adopt alternative 1, then the issue of how tuple -structs are handled is totally orthogonal -- we could add support for -`struct T0()` as a distinct type from `struct S0 {}`, if we so wished, -or leave it aside.) +(Note that if we adopt alternative 1, [Always Require Braces], then +the issue of how tuple structs are handled is totally orthogonal -- we +could add support for `struct T0()` as a distinct type from `struct S0 +{}`, if we so wished, or leave it aside.) + +# Unresolved questions + +None # Appendices @@ -281,19 +293,23 @@ team. However, to save people the effort of reviewing the comments on that PR (and hopefully stave off potential bikeshedding on this PR), I here summarize the various viewpoints put forward on the comment thread there, and note for each one, whether that viewpoint would be -addressed by this RFC (accept both syntaxes), by Alternative 1 (accept -only `S0 {}`), or by the status quo (accept only `S0`). - - +addressed by this RFC (accept both syntaxes), by [Always Require Braces], +or by [Status Quo]. -* "I find `let s = S0;` jarring, think its an enum initially." ==> Favors: Alternative 1 -* "Frequently start out with an empty struct and add fields as I need them." ==> Favors: This RFC or Alternative 1 +* "I find `let s = S0;` jarring, think its an enum initially." ==> Favors: Always Require Braces +* "Frequently start out with an empty struct and add fields as I need them." ==> Favors: This RFC or Always Require Braces * "Foo{} suggests is constructing something that it's not; all uses of the value `Foo` are indistinguishable from each other" ==> Favors: Status Quo -* "I find it strange anyone would prefer `let x = Foo{};` over `let x = Foo;`" ==> Favors Status Quo; strongly opposes Alternative 1. +* "I find it strange anyone would prefer `let x = Foo{};` over `let x = Foo;`" ==> Favors Status Quo; strongly opposes Always Require Braces. 
* "I agree that 'instantiation-should-follow-declation', that is, structs declared `;, (), {}` should only be instantiated [via] `;, (), { }` respectively" ==> Opposes leniency of this RFC in that it allows expression to use include or omit `{}` on an empty struct, regardless of declaration form, and vice-versa. * "The code generation argument is reasonable, but I wouldn't want to force noisier syntax on all 'normal' code just to make macros work better." ==> Favors: This RFC -[The CFG problem]: #the_cfg_problem +[Always Require Braces]: #always-require-braces +[Status Quo]: #status-quo +[Ancient History]: #ancient-history +[Recent History]: #recent-history +[CFG problem]: #the-cfg-problem +[Empty Tuple Structs]: #empty-tuple-structs +[Precedent for flexible syntax in Rust]: #precedent-for-flexible-syntax-in-rust [RustDev Thread]: https://mail.mozilla.org/pipermail/rust-dev/2013-February/003282.html From ff628b341ba9bb81e44f77c39a5af186e29f0f21 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 29 Aug 2014 11:58:00 +0200 Subject: [PATCH 0006/1195] clarify where the CFG parse bug actually arises so people are not scratching their heads. --- active/0000-empty-structs-with-braces.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/active/0000-empty-structs-with-braces.md b/active/0000-empty-structs-with-braces.md index 33fd8947523..77e471c03b2 100644 --- a/active/0000-empty-structs-with-braces.md +++ b/active/0000-empty-structs-with-braces.md @@ -196,7 +196,8 @@ Observe what happens when one modifies the above just a bit: ``` Now, certain `cfg` settings yield an empty struct, even though it -is surrounded by braces. Today this leads to a [CFG parse bug]. +is surrounded by braces. Today this leads to a [CFG parse bug] +when one attempts to actually construct such a struct. 
If we want to support situations like this properly, we will probably need to further extend the `cfg` attribute so that it can be placed From f7294bc88c321dfbbe1a496e0d812c6ca527ca10 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 29 Aug 2014 12:04:41 +0200 Subject: [PATCH 0007/1195] Corrected some misstatements in the ancient history. --- active/0000-empty-structs-with-braces.md | 47 ++++++++++++++---------- 1 file changed, 27 insertions(+), 20 deletions(-) diff --git a/active/0000-empty-structs-with-braces.md b/active/0000-empty-structs-with-braces.md index 77e471c03b2..4ca67b2cba2 100644 --- a/active/0000-empty-structs-with-braces.md +++ b/active/0000-empty-structs-with-braces.md @@ -223,19 +223,10 @@ to me.) ## Ancient History A parsing ambiguity was the original motivation for disallowing the -syntax `struct S {}` in favor of `struct S;` for an empty struct -declaration. The ambiguity and various options for dealing with it +syntax `S {}` in favor of `S` for constructing an instance of +an empty struct. The ambiguity and various options for dealing with it were well documented on the [associated mailing list thread][RustDev Thread]. Both syntaxes were simultaneously supported at the time. -Support for `struct S {}` was removed because that was the most -expedient option. In particular, at that time, the option of "Place a -parser restriction on those contexts where `{` terminates the -expression and say that struct literals cannot appear there unless -they are in parentheses." was explicitly not chosen, in favor of -continuing to use the disambiguation rule in use at the time, namely -that the presence of a label (e.g. `S { a_label: ... }`) was *the* way -to distinguish a struct constructor from an identifier followed by a -control block, and thus, "there must be one label." 
In particular, at the time that mailing list thread was created, the code match `match x {} ...` would be parsed as `match (x {}) ...`, not @@ -244,15 +235,31 @@ be parsed as an if-expression whose test component is the struct literal `x {}`. Thus, at the time of [Rust PR 5137], if the input to a `match` or `if` was an identifier expression, one had to put parentheses around the identifier to force it to be interpreted as -input, and not as a struct constructor. - -Things have changed since then; namely, we have now adopted the -aforementioned parser restriction [Rust RFC 25]. (The text of RFC 25 -does not explicitly address `match`, but we have effectively expanded -it to include a curly-brace delimited block of match-arms in the -definition of "block".) Today, one uses parentheses around struct -literals in some contexts (such as `for e in (S {x: 3}) { ... }` or -`match (S {x: 3}) { ... }` +input to the `match`/`if`, and not as a struct constructor. + +Of the options for resolving this discussed on the mailing list +thread, the one selected (removing `S {}` construction expressions) +was chosen as the most expedient option. + +At that time, the option of "Place a parser restriction on those +contexts where `{` terminates the expression and say that struct +literals cannot appear there unless they are in parentheses." was +explicitly not chosen, in favor of continuing to use the +disambiguation rule in use at the time, namely that the presence of a +label (e.g. `S { a_label: ... }`) was *the* way to distinguish a +struct constructor from an identifier followed by a control block, and +thus, "there must be one label." + +Naturally, if the construction syntax were to be disallowed, it made +sense to also remove the `struct S {}` declaration syntax. + +Things have changed since the time of that mailing list thread; +namely, we have now adopted the aforementioned parser restriction +[Rust RFC 25]. 
(The text of RFC 25 does not explicitly address +`match`, but we have effectively expanded it to include a curly-brace +delimited block of match-arms in the definition of "block".) Today, +one uses parentheses around struct literals in some contexts (such as +`for e in (S {x: 3}) { ... }` or `match (S {x: 3}) { ... }` Note that there was never an ambiguity for uses of `struct S0 { }` in item position. The issue was solely about expression position prior to the From 2470b6dd1608346395eeed08fe562a50f82039c3 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 29 Aug 2014 12:06:10 +0200 Subject: [PATCH 0008/1195] fixed presentation of mailing list thread hyperlink. --- active/0000-empty-structs-with-braces.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/active/0000-empty-structs-with-braces.md b/active/0000-empty-structs-with-braces.md index 4ca67b2cba2..114da96636b 100644 --- a/active/0000-empty-structs-with-braces.md +++ b/active/0000-empty-structs-with-braces.md @@ -225,8 +225,8 @@ to me.) A parsing ambiguity was the original motivation for disallowing the syntax `S {}` in favor of `S` for constructing an instance of an empty struct. The ambiguity and various options for dealing with it -were well documented on the [associated mailing list thread][RustDev -Thread]. Both syntaxes were simultaneously supported at the time. +were well documented on the [rust-dev thread]. +Both syntaxes were simultaneously supported at the time. In particular, at the time that mailing list thread was created, the code match `match x {} ...` would be parsed as `match (x {}) ...`, not @@ -319,7 +319,7 @@ or by [Status Quo]. 
[Empty Tuple Structs]: #empty-tuple-structs [Precedent for flexible syntax in Rust]: #precedent-for-flexible-syntax-in-rust -[RustDev Thread]: https://mail.mozilla.org/pipermail/rust-dev/2013-February/003282.html +[rust-dev thread]: https://mail.mozilla.org/pipermail/rust-dev/2013-February/003282.html [Rust Issue 5167]: https://github.com/rust-lang/rust/issues/5167 From d6a06629153ea579c0e80e6fee78efac237fd23f Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 29 Aug 2014 12:13:51 +0200 Subject: [PATCH 0009/1195] Okay I think this is about ready to post now. --- active/0000-empty-structs-with-braces.md | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/active/0000-empty-structs-with-braces.md b/active/0000-empty-structs-with-braces.md index 114da96636b..d9e2496fbc2 100644 --- a/active/0000-empty-structs-with-braces.md +++ b/active/0000-empty-structs-with-braces.md @@ -90,7 +90,12 @@ Some people like "There is only one way to do it." But, there is precendent in Rust for violating "one way to do it" in favor of syntactic convenience or regularity; see the [Precedent for flexible syntax in Rust] appendix. -Also, see [Always Require Braces] alternative below. +Also, see the [Always Require Braces] alternative below. + +I have attempted to summarize the previous discussion from [RFC PR +147] in the [Recent History] appendix; some of the points there +include drawbacks to this approach and to the [Always Require Braces] +alternative. # Alternatives @@ -100,11 +105,12 @@ Alternative 1: "Always Require Braces". Specifically, require empty curly braces on empty structs. People who like the current syntax of curly-brace free structs can encode them this way: `enum S0 { S0 }` This would address all of the same issues outlined above. (Also, the -author (pnkfelix) would be happy to take this tack.) 
The main reason -not to take this tack is that some people may like writing empty -structs without braces, but do not want to switch to the unary enum -version. See "I wouldn't want to force noisier syntax ..." in the -[Recent History] appendix. +author (pnkfelix) would be happy to take this tack.) + +The main reason not to take this tack is that some people may like +writing empty structs without braces, but do not want to switch to the +unary enum version. See "I wouldn't want to force noisier syntax ..." +in the [Recent History] appendix. ## Status quo From e5eecb0d7e5c78dcd9cc6f3a6b232376edcfc25c Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 29 Aug 2014 12:16:02 +0200 Subject: [PATCH 0010/1195] Could not resist adding air-quotes to "new". --- active/0000-empty-structs-with-braces.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/active/0000-empty-structs-with-braces.md b/active/0000-empty-structs-with-braces.md index d9e2496fbc2..9f5cf7d3c90 100644 --- a/active/0000-empty-structs-with-braces.md +++ b/active/0000-empty-structs-with-braces.md @@ -56,13 +56,13 @@ in the language now. # Detailed design * Allow `S` to be defined via either `struct S;` (as today) - or `struct S {}` (new) + or `struct S {}` ("new") * Allow instances of `S` to be constructed via either the - expression `S` (as today) or the expression `S {}` (new) + expression `S` (as today) or the expression `S {}` ("new") * Allow instances of `S` to be pattern matched via either the - pattern `S` (as today) or the pattern `S {}` (new). + pattern `S` (as today) or the pattern `S {}` ("new"). Revise the grammar of struct item definitions so that one can write either `struct S;` or `struct S { }`. The two forms are synonymous. From a62d42abe031fa00399dfd351520de4bce79aab3 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 29 Aug 2014 12:23:32 +0200 Subject: [PATCH 0011/1195] Add a disclaimer about the Recent History summary. 
--- active/0000-empty-structs-with-braces.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/active/0000-empty-structs-with-braces.md b/active/0000-empty-structs-with-braces.md index 9f5cf7d3c90..4606e937eb1 100644 --- a/active/0000-empty-structs-with-braces.md +++ b/active/0000-empty-structs-with-braces.md @@ -310,6 +310,11 @@ thread there, and note for each one, whether that viewpoint would be addressed by this RFC (accept both syntaxes), by [Always Require Braces], or by [Status Quo]. +Note that this list of comments is *just* meant to summarize the list +of views; it does not attempt to reflect the number of commenters who +agreed or disagreed with a particular point. (But since the RFC process +is not a democracy, the number of commenters should not matter anyway.) + * "I find `let s = S0;` jarring, think its an enum initially." ==> Favors: Always Require Braces * "Frequently start out with an empty struct and add fields as I need them." ==> Favors: This RFC or Always Require Braces * "Foo{} suggests is constructing something that it's not; all uses of the value `Foo` are indistinguishable from each other" ==> Favors: Status Quo From 936bb323484b8c2a4ea61cf0dd890bbcf1c61d81 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 29 Aug 2014 12:26:36 +0200 Subject: [PATCH 0012/1195] Add a "+1" to the Recent History summary. Figure I should at least let those commenters get a bucket in the summary, even though I personally am ambivalent at best about "+1" comments. --- active/0000-empty-structs-with-braces.md | 1 + 1 file changed, 1 insertion(+) diff --git a/active/0000-empty-structs-with-braces.md b/active/0000-empty-structs-with-braces.md index 4606e937eb1..512b410e6a5 100644 --- a/active/0000-empty-structs-with-braces.md +++ b/active/0000-empty-structs-with-braces.md @@ -315,6 +315,7 @@ of views; it does not attempt to reflect the number of commenters who agreed or disagreed with a particular point. 
(But since the RFC process is not a democracy, the number of commenters should not matter anyway.) +* "+1" ==> Favors: This RFC (or potentially [Always Require Braces]; I think the content of [RFC PR 147] shifted over time, so it is hard to interpret the "+1" comments now). * "I find `let s = S0;` jarring, think its an enum initially." ==> Favors: Always Require Braces * "Frequently start out with an empty struct and add fields as I need them." ==> Favors: This RFC or Always Require Braces * "Foo{} suggests is constructing something that it's not; all uses of the value `Foo` are indistinguishable from each other" ==> Favors: Status Quo From c62612508db617110d7cead92852f6e5e9e2272b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?G=C3=A1bor=20Lehel?= Date: Tue, 16 Sep 2014 21:11:59 +0200 Subject: [PATCH 0013/1195] Trait-based exception handling --- active/0000-trait-based-exception-handling.md | 702 ++++++++++++++++++ 1 file changed, 702 insertions(+) create mode 100644 active/0000-trait-based-exception-handling.md diff --git a/active/0000-trait-based-exception-handling.md b/active/0000-trait-based-exception-handling.md new file mode 100644 index 00000000000..836e0021afe --- /dev/null +++ b/active/0000-trait-based-exception-handling.md @@ -0,0 +1,702 @@ +- Start Date: 2014-09-16 +- RFC PR #: (leave this empty) +- Rust Issue #: (leave this empty) + + +# Summary + +Add sugar for working with existing algebraic datatypes such as `Result` and +`Option`. Put another way, use types such as `Result` and `Option` to model +common exception handling constructs. + +Add a trait which precisely spells out the abstract interface and requirements +for such types. + +The new constructs are: + + * An `?` operator for explicitly propagating exceptions. + + * A `try`..`catch` construct for conveniently catching and handling exceptions. + + * (Potentially) a `throw` operator, and `throws` sugar for function signatures. 
+ +The idea for the `?` operator originates from [RFC PR 204][204] by @aturon. + +[204]: https://github.com/rust-lang/rfcs/pull/204 + + +# Motivation and overview + +Rust currently uses algebraic `enum` types `Option` and `Result` for error +handling. This solution is simple, well-behaved, and easy to understand, but +often gnarly and inconvenient to work with. We would like to solve the latter +problem while retaining the other nice properties and avoiding duplication of +functionality. + +We can accomplish this by adding constructs which mimic the exception-handling +constructs of other languages in both appearance and behavior, while improving +upon them in typically Rustic fashion. These constructs are well-behaved in a +very precise sense and their meaning can be specified by a straightforward +source-to-source translation into existing language constructs (plus a very +simple and obvious new one). (They may also, but need not necessarily, be +implemented in this way.) + +These constructs are strict additions to the existing language, and apart from +the issue of keywords, the legality and behavior of all currently existing Rust +programs is entirely unaffected. + +The most important additions are a postfix `?` operator for propagating +"exceptions" and a `try`..`catch` block for catching and handling them. By an +"exception", we more or less just mean the `None` variant of an `Option` or the +`Err` variant of a `Result`. (See the "Detailed design" section for more +precision.) + +## `?` operator + +The postfix `?` operator can be applied to expressions of types like `Option` +and `Result` which contain either a "success" or an "exception" value, and can +be thought of as a generalization of the current `try! { }` macro. It either +returns the "success" value directly, or performs an early exit and propagates +the "exception" value further out. (So given `my_result: Result`, we +have `my_result?: Foo`.) This allows it to be used for e.g. 
conveniently +chaining method calls which may each "throw an exception": + + foo()?.bar()?.baz() + +(Naturally, in this case the types of the "exceptions thrown by" `foo()` and +`bar()` must unify.) + +When used outside of a `try` block, the `?` operator propagates the exception to +the caller of the current function, just like the current `try!` macro does. (If +the return type of the function isn't one, like `Result`, that's capable of +carrying the exception, then this is a type error.) When used inside a `try` +block, it propagates the exception up to the innermost `try` block, as one would +expect. + +Requiring an explicit `?` operator to propagate exceptions strikes a very +pleasing balance between completely automatic exception propagation, which most +languages have, and completely manual propagation, which we currently have +(apart from the `try!` macro to lessen the pain). It means that function calls +remain simply function calls which return a result to their caller, with no +magic going on behind the scenes; and this also *increases* flexibility, because +one gets to choose between propagation with `?` or consuming the returned +`Result` directly. + +The `?` operator itself is suggestive, syntactically lightweight enough to not +be bothersome, and lets the reader determine at a glance where an exception may +or may not be thrown. It also means that if the signature of a function changes +with respect to exceptions, it will lead to type errors rather than silent +behavior changes, which is always a good thing. Finally, because exceptions are +tracked in the type system, there is no silent propagation of exceptions, and +all points where an exception may be thrown are readily apparent visually, this +also means that we do not have to worry very much about "exception safety". + +## `try`..`catch` + +Like most other things in Rust, and unlike other languages that I know of, +`try`..`catch` is an *expression*. 
If no exception is thrown in the `try` block, +the `try`..`catch` evaluates to the value of `try` block; if an exception is +thrown, it is passed to the `catch` block, and the `try`..`catch` evaluates to +the value of the `catch` block. As with `if`..`else` expressions, the types of +the `try` and `catch` blocks must therefore unify. Unlike other languages, only +a single type of exception may be thrown in the `try` block (a `Result` only has +a single `Err` type); and there may only be a single `catch` block, which +catches all exceptions. This dramatically simplifies matters and allows for nice +properties. + +There are two variations on the `try`..`catch` theme, each of which is more +convenient in different circumstances. + + 1. `try { EXPR } catch IRR-PAT { EXPR }` + + For example: + + try { + foo()?.bar()? + } catch e { + let x = baz(e); + quux(x, e); + } + + Here the caught exception is bound to an irrefutable pattern immediately + following the `catch`. + This form is convenient when one does not wish to do case analysis on the + caught exception. + + 2. `try { EXPR } catch { PAT => EXPR, PAT => EXPR, ... }` + + For example: + + try { + foo()?.bar()? + } catch { + Red(rex) => baz(rex), + Blue(bex) => quux(bex) + } + + Here the `catch` is not immediately followed by a pattern; instead, its body + performs a `match` on the caught exception directly, using any number of + refutable patterns. + This form is convenient when one *does* wish to do case analysis on the + caught exception. + +While it may appear to be extravagant to provide both forms, there is reason to +do so: either form on its own leads to unavoidable rightwards drift under some +circumstances. + +The first form leads to rightwards drift if one wishes to `match` on the caught +exception: + + try { + foo()?.bar()? + } catch e { + match e { + Red(rex) => baz(rex), + Blue(bex) => quux(bex) + } + } + +This `match e` is quite redundant and unfortunate. 
+
+The second form leads to rightwards drift if one wishes to do more complex
+multi-statement work with the caught exception:
+
+    try {
+        foo()?.bar()?
+    } catch {
+        e => {
+            let x = baz(e);
+            quux(x, e);
+        }
+    }
+
+This single case arm is quite redundant and unfortunate.
+
+Therefore, neither form can be considered strictly superior to the other, and it
+is preferable to simply provide both.
+
+Finally, it is also possible to write a `try` block *without* a `catch` block:
+
+ 3. `try { EXPR }`
+
+    In this case the `try` block evaluates directly to a `Result`-like type
+    containing either the value of `EXPR`, or the exception which was thrown.
+    For instance, `try { foo()? }` is essentially equivalent to `foo()`.
+    This can be useful if you want to coalesce *multiple* potential exceptions -
+    `try { foo()?.bar()?.baz()? }` - into a single `Result`, which you wish to
+    then e.g. pass on as-is to another function, rather than analyze yourself.
+
+## (Optional) `throw` and `throws`
+
+It is possible to carry the exception handling analogy further and also add
+`throw` and `throws` constructs.
+
+`throw` is very simple: `throw EXPR` is essentially the same thing as
+`Err(EXPR)?`; in other words it throws the exception `EXPR` to the innermost
+`try` block, or to the function's caller if there is none.
+
+A `throws` clause on a function:
+
+    fn foo(arg: Foo) -> Bar throws Baz { ... }
+
+would do two things:
+
+ * Less importantly, it would make the function polymorphic over the
+   `Result`-like type used to "carry" exceptions.
+
+ * More importantly, it means that instead of writing `return Ok(foo)` and
+   `return Err(bar)` in the body of the function, one would write `return foo`
+   and `throw bar`, and these are implicitly embedded as the "success" or
+   "exception" value in the carrier type.
This removes syntactic overhead from + both "normal" and "throwing" code paths and (apart from `?` to propagate + exceptions) matches what code might look like in a language with native + exceptions. + +(This could potentially be extended to allow writing `throws` clauses on `fn` +and closure *types*, desugaring to a type parameter with a `Carrier` bound on +the parent item (e.g. a HOF), but this would be considerably more involved, and +it's not clear whether there is value in doing so.) + + +# Detailed design + +The meaning of the constructs will be specified by a source-to-source +translation. We make use of an "early exit from any block" feature which doesn't +currently exist in the language, generalizes the current `break` and `return` +constructs, and is independently useful. + +## Early exit from any block + +The capability can be exposed either by generalizing `break` to take an optional +value argument and break out of any block (not just loops), or by generalizing +`return` to take an optional lifetime argument and return from any block, not +just the outermost block of the function. This feature is independently useful +and I believe it should be added, but as it is only used here in this RFC as an +explanatory device, and implementing the RFC does not require exposing it, I am +going to arbitrarily choose the `return` syntax for the following and won't +discuss the question further. + +So we are extending `return` with an optional lifetime argument: `return 'a +EXPR`. This is an expression of type `!` which causes an early return from the +enclosing block specified by `'a`, which then evaluates to the value `EXPR` (of +course, the type of `EXPR` must unify with the type of the last expression in +that block). 
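For comparison, the other spelling mentioned above (generalizing `break` to carry a value out of a labeled block) can be tried in present-day Rust, where labeled block expressions are stable. The following is a minimal sketch of that spelling, not of the `return 'a EXPR` syntax used in this RFC; `lookup` and its parameters are hypothetical names chosen only for illustration.

```rust
// Early exit from a labeled block using `break` with a value.
// `lookup`, `have_thing`, and `thing` are hypothetical illustration names.
fn lookup(have_thing: bool, thing: u32) -> Option<u32> {
    'a: {
        let my_thing = if have_thing {
            thing
        } else {
            // Leaves the block labeled 'a; the whole block evaluates to `None`.
            break 'a None
        };
        // Reached only when the early exit above was not taken.
        Some(my_thing + 1)
    }
}
```

The type of the `break` value has to agree with the type of the block's final expression, which mirrors the unification requirement stated for `return 'a EXPR`.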
+
+A completely artificial example:
+
+    'a: {
+        let my_thing = if have_thing {
+            get_thing()
+        } else {
+            return 'a None
+        };
+        println!("found thing: {}", my_thing);
+        Some(my_thing)
+    }
+
+Here if we don't have a thing, we escape from the block early with `None`.
+
+If no lifetime is specified, it defaults to returning from the whole function:
+in other words, the current behavior. We can pretend there is a magical lifetime
+`'fn` which refers to the outermost block of the current function, which is the
+default.
+
+## The trait
+
+Here we specify the trait for types which can be used to "carry" either a normal
+result or an exception. There are several different, completely equivalent ways
+to formulate it, which differ only in the set of methods: for other
+possibilities, see the appendix.
+
+    #[lang(carrier)]
+    trait Carrier {
+        type Normal;
+        type Exception;
+        fn embed_normal(from: Normal) -> Self;
+        fn embed_exception(from: Exception) -> Self;
+        fn translate<Other: Carrier<Normal=Normal, Exception=Exception>>(from: Self) -> Other;
+    }
+
+This trait basically just states that `Self` is isomorphic to
+`Result<Normal, Exception>` for some types `Normal` and `Exception`. For greater
+clarity on how these methods work, see the section on `impl`s below. (For a
+simpler formulation of the trait using `Result` directly, see the appendix.)
+
+The `translate` method says that it should be possible to translate to any
+*other* `Carrier` type which has the same `Normal` and `Exception` types. This
+can be used to inspect the value by translating to a concrete type such as
+`Result<Normal, Exception>` and then, for example, pattern matching on it.
+
+Laws:
+
+ 1. For all `x`, `translate(embed_normal(x): A): B ` = `embed_normal(x): B`.
+ 2. For all `x`, `translate(embed_exception(x): A): B ` = `embed_exception(x): B`.
+ 3. For all `carrier`, `translate(translate(carrier: A): B): A` = `carrier: A`.
+
+Here I've used explicit type ascription syntax to make it clear that e.g. the
+types of `embed_` on the left and right hand sides are different.
+
+The first two laws say that embedding a result `x` into one carrier type and
+then translating it to a second carrier type should be the same as embedding it
+into the second type directly.
+
+The third law says that translating to a different carrier type and then
+translating back should be the identity function.
+
+
+## `impl`s of the trait
+
+    impl<T, E> Carrier for Result<T, E> {
+        type Normal = T;
+        type Exception = E;
+        fn embed_normal(a: T) -> Result<T, E> { Ok(a) }
+        fn embed_exception(e: E) -> Result<T, E> { Err(e) }
+        fn translate<Other: Carrier<Normal=T, Exception=E>>(result: Result<T, E>) -> Other {
+            match result {
+                Ok(a) => Other::embed_normal(a),
+                Err(e) => Other::embed_exception(e)
+            }
+        }
+    }
+
+As we can see, `translate` can be implemented by deconstructing ourself and then
+re-embedding the contained value into the other carrier type.
+
+    impl<T> Carrier for Option<T> {
+        type Normal = T;
+        type Exception = ();
+        fn embed_normal(a: T) -> Option<T> { Some(a) }
+        fn embed_exception(e: ()) -> Option<T> { None }
+        fn translate<Other: Carrier<Normal=T, Exception=()>>(option: Option<T>) -> Other {
+            match option {
+                Some(a) => Other::embed_normal(a),
+                None => Other::embed_exception(())
+            }
+        }
+    }
+
+Potentially also:
+
+    impl Carrier for bool {
+        type Normal = ();
+        type Exception = ();
+        fn embed_normal(a: ()) -> bool { true }
+        fn embed_exception(e: ()) -> bool { false }
+        fn translate<Other: Carrier<Normal=(), Exception=()>>(b: bool) -> Other {
+            match b {
+                true => Other::embed_normal(()),
+                false => Other::embed_exception(())
+            }
+        }
+    }
+
+The laws should be sufficient to rule out any "icky" impls. For example, an impl
+for `Vec` where an exception is represented as the empty vector, and a normal
+result as a single-element vector: here the third law fails, because if the
+`Vec` has more than element *to begin with*, then it's not possible to translate
+to a different carrier type and then back without losing information.
+
+The `bool` impl may be surprising, or not useful, but it *is* well-behaved:
+`bool` is, after all, isomorphic to `Result<(), ()>`.
This `impl` may be +included or not; I don't have a strong opinion about it. + +## Definition of constructs + +Finally we have the definition of the new constructs in terms of a +source-to-source translation. + +In each case except the first, I will provide two definitions: a single-step +"shallow" desugaring which is defined in terms of the previously defined new +constructs, and a "deep" one which is "fully expanded". + +Of course, these could be defined in many equivalent ways: the below definitions +are merely one way. + + * Construct: + + throw EXPR + + Shallow: + + return 'here Carrier::embed_exception(EXPR) + + Where `'here` refers to the innermost enclosing `try` block, or to `'fn` if + there is none. As with `return`, `EXPR` may be omitted and defaults to `()`. + + * Construct: + + EXPR? + + Shallow: + + match translate(EXPR) { + Ok(a) => a, + Err(e) => throw e + } + + Deep: + + match translate(EXPR) { + Ok(a) => a, + Err(e) => return 'here Carrier::embed_exception(e) + } + + * Construct: + + try { + foo()?.bar() + } + + Shallow: + + 'here: { + Carrier::embed_normal(foo()?.bar()) + } + + Deep: + + 'here: { + Carrier::embed_normal(match translate(foo()) { + Ok(a) => a, + Err(e) => return 'here Carrier::embed_exception(e) + }.bar()) + } + + * Construct: + + try { + foo()?.bar() + } catch e { + baz(e) + } + + Shallow: + + match try { + foo()?.bar() + } { + Ok(a) => a, + Err(e) => baz(e) + } + + Deep: + + match 'here: { + Carrier::embed_normal(match translate(foo()) { + Ok(a) => a, + Err(e) => return 'here Carrier::embed_exception(e) + }.bar()) + } { + Ok(a) => a, + Err(e) => baz(e) + } + + * Construct: + + try { + foo()?.bar() + } catch { + A(a) => baz(a), + B(b) => quux(b) + } + + Shallow: + + try { + foo()?.bar() + } catch e { + match e { + A(a) => baz(a), + B(b) => quux(b) + } + } + + Deep: + + match 'here: { + Carrier::embed_normal(match translate(foo()) { + Ok(a) => a, + Err(e) => return 'here Carrier::embed_exception(e) + }.bar()) + } { + Ok(a) => a, + 
             Err(e) => match e {
+                A(a) => baz(a),
+                B(b) => quux(b)
+            }
+        }
+
+ * Construct:
+
+        fn foo(A) -> B throws C {
+            CODE
+        }
+
+   Shallow:
+
+        fn foo<Car: Carrier<Normal=B, Exception=C>>(A) -> Car {
+            try {
+                'fn: {
+                    CODE
+                }
+            }
+        }
+
+   Deep:
+
+        fn foo<Car: Carrier<Normal=B, Exception=C>>(A) -> Car {
+            'here: {
+                Carrier::embed_normal('fn: {
+                    CODE
+                })
+            }
+        }
+
+   (Here our desugaring runs into a stumbling block, and we resort to a pun: the
+   *whole function* should be conceptually wrapped in a `try` block, and a
+   `return` inside `CODE` should be embedded as a successful result into the
+   carrier, rather than escaping from the `try` block itself. We suggest this by
+   putting the "magical lifetime" `'fn` *inside* the `try` block.)
+
+The fully expanded translations get quite gnarly, but that is why it's good that
+you don't have to write them!
+
+In general, the types of the defined constructs should be the same as the types
+of their definitions.
+
+(As noted earlier, while the behavior of the constructs can be *specified* using
+a source-to-source translation in this manner, they need not necessarily be
+*implemented* this way.)
+
+## Laws
+
+Without any attempt at completeness, and modulo `translate()` between different
+carrier types, here are some things which should be true:
+
+ * `try { foo() } ` = `Ok(foo())`
+ * `try { throw e } ` = `Err(e)`
+ * `try { foo()? } ` = `foo()`
+ * `try { foo() } catch e { e }` = `foo()`
+ * `try { throw e } catch e { e }` = `e`
+ * `try { Ok(foo()?) } catch e { Err(e) }` = `foo()`
+
+## Misc
+
+ * Our current lint for unused results could be replaced by one which warns for
+   any unused result of a type which implements `Carrier`.
+
+ * If there is ever ambiguity due to the carrier type being underdetermined
+   (experience should reveal whether this is a problem in practice), we could
+   resolve it by defaulting to `Result`. (This would presumably involve making
+   `Result` a lang item.)
+
+ * Translating between different carrier types with the same `Normal` and
+   `Exception` types *should*, but may not necessarily *currently* be, a no-op
+   most of the time.
+
+   We should make it so that:
+
+     * repr(`Option<T>`) = repr(`Result<T, ()>`)
+     * repr(`bool`) = repr(`Option<()>`) = repr(`Result<(), ()>`)
+
+   If these hold, then `translate` between these types could in theory be
+   compiled down to just a `transmute`. (Whether LLVM is smart enough to do
+   this, I don't know.)
+
+ * The `translate()` function smells to me like a natural transformation between
+   functors, but I'm not category theorist enough for it to be obvious.
+
+
+# Drawbacks
+
+ * Adds new constructs to the language.
+
+ * Some people have a philosophical objection to "there's more than one way to
+   do it".
+
+ * Relative to first-class checked exceptions, our implementation options are
+   constrained: while actual checked exceptions could be implemented in a
+   similar way to this proposal, they could also be implemented using unwinding,
+   should we choose to do so, and we do not realistically have that option here.
+
+
+# Alternatives
+
+ * Do nothing.
+
+ * Only add the `?` operator, but not any of the other constructs.
+
+ * Instead of a general `Carrier` trait, define everything directly in terms of
+   `Result`. This has precedent in that, for example, the `if`..`else` construct
+   is also defined directly in terms of `bool`. (However, this would likely also
+   lead to removing `Option` from the standard library in favor of
+   `Result<_, ()>`.)
+
+ * Add [first-class checked exceptions][notes], which are propagated
+   automatically (without an `?` operator).
+
+   This has the drawbacks of being a more invasive change and duplicating
+   functionality: each function must choose whether to use checked exceptions
+   via `throws`, or to return a `Result`.
While the two are isomorphic and
+   converting between them is easy, with this proposal, the issue does not even
+   arise, as exception handling is defined *in terms of* `Result`. Furthermore,
+   automatic exception propagation raises the specter of "exception safety": how
+   serious an issue this would actually be in practice, I don't know - there's
+   reason to believe that it would be much less of one than in C++.
+
+[notes]: https://github.com/glaebhoerl/rust-notes/blob/268266e8fbbbfd91098d3bea784098e918b42322/my_rfcs/Exceptions.txt
+
+
+# Unresolved questions
+
+ * What should the precedence of the `?` operator be?
+
+ * Should we add `throw` and/or `throws`?
+
+ * Should we have `impl Carrier for bool`?
+
+ * Should we also add the "early return from any block" feature along with this
+   proposal, or should that be considered separately? (If we add it: should we
+   do it by generalizing `break` or `return`?)
+
+
+# Appendices
+
+## Alternative formulations of the `Carrier` trait
+
+All of these have the form:
+
+    trait Carrier {
+        type Normal;
+        type Exception;
+        ...methods...
+    }
+
+and differ only in the methods, which will be given.
+
+### Explicit isomorphism with `Result`
+
+    fn from_result(Result<Normal, Exception>) -> Self;
+    fn to_result(Self) -> Result<Normal, Exception>;
+
+This is, of course, the simplest possible formulation.
+
+The drawbacks are that it, in some sense, privileges `Result` over other
+potentially equivalent types, and that it may be less efficient for those types:
+for any non-`Result` type, every operation requires two method calls (one into
+`Result`, and one out), whereas with the `Carrier` trait in the main text, they
+only require one.
+
+Laws:
+
+ * For all `x`, `from_result(to_result(x))` = `x`.
+ * For all `x`, `to_result(from_result(x))` = `x`.
+
+Laws for the remaining formulations below are left as an exercise for the
+reader.
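The `from_result`/`to_result` formulation and its two laws can be sketched in present-day Rust roughly as follows. This is an illustration under assumptions, not an API that exists in the standard library: the trait is rewritten with `self` receivers and `Self::` paths, and the free function `translate` is a hypothetical helper showing the "two method calls" cost mentioned above.

```rust
// Hypothetical modern-Rust rendering of the `from_result`/`to_result`
// formulation; nothing here exists in std under these names.
trait Carrier: Sized {
    type Normal;
    type Exception;
    fn from_result(r: Result<Self::Normal, Self::Exception>) -> Self;
    fn to_result(self) -> Result<Self::Normal, Self::Exception>;
}

impl<T> Carrier for Option<T> {
    type Normal = T;
    type Exception = ();
    fn from_result(r: Result<T, ()>) -> Self {
        r.ok()
    }
    fn to_result(self) -> Result<T, ()> {
        self.ok_or(())
    }
}

impl Carrier for bool {
    type Normal = ();
    type Exception = ();
    fn from_result(r: Result<(), ()>) -> Self {
        r.is_ok()
    }
    fn to_result(self) -> Result<(), ()> {
        if self { Ok(()) } else { Err(()) }
    }
}

// Translating between two carriers goes through `Result`:
// one call out of `A` (`to_result`) and one call into `B` (`from_result`).
fn translate<A, B>(a: A) -> B
where
    A: Carrier,
    B: Carrier<Normal = A::Normal, Exception = A::Exception>,
{
    B::from_result(a.to_result())
}
```

With these impls, `translate(Some(()))` yields `true` when the target type is `bool`, and round-tripping an `Option` through `to_result` and `from_result` gives back the original value, as the two laws require.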
+
+### Avoid privileging `Result`, most naive version
+
+    fn embed_normal(Normal) -> Self;
+    fn embed_exception(Exception) -> Self;
+    fn is_normal(&Self) -> bool;
+    fn is_exception(&Self) -> bool;
+    fn assert_normal(Self) -> Normal;
+    fn assert_exception(Self) -> Exception;
+
+Of course this is horrible.
+
+### Destructuring with HOFs (a.k.a. Church/Scott-encoding)
+
+    fn embed_normal(Normal) -> Self;
+    fn embed_exception(Exception) -> Self;
+    fn match_carrier<T>(Self, FnOnce(Normal) -> T, FnOnce(Exception) -> T) -> T;
+
+This is probably the right approach for Haskell, but not for Rust.
+
+With this formulation, because they each take ownership of them, the two
+closures may not even close over the same variables!
+
+### Destructuring with HOFs, round 2
+
+    trait BiOnceFn {
+        type ArgA;
+        type ArgB;
+        type Ret;
+        fn callA(Self, ArgA) -> Ret;
+        fn callB(Self, ArgB) -> Ret;
+    }
+
+    trait Carrier {
+        type Normal;
+        type Exception;
+        fn normal(Normal) -> Self;
+        fn exception(Exception) -> Self;
+        fn match_carrier<T>(Self, BiOnceFn<ArgA=Normal, ArgB=Exception, Ret=T>) -> T;
+    }
+
+Here we solve the environment-sharing problem from above: instead of two objects
+with a single method each, we use a single object with two methods! I believe
+this is the most flexible and general formulation (which is however a strange
+thing to believe when they are all equivalent to each other). Of course, it's
+even more awkward syntactically.
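The environment-sharing point of the "round 2" formulation can be illustrated with a small sketch in present-day Rust. It makes several assumptions that are not part of the RFC text: the argument types are type parameters rather than associated types, the methods are renamed `call_a` and `call_b`, the embedding methods are omitted, and `Describe` is a hypothetical handler type invented for the example.

```rust
// Hypothetical modern-Rust rendering of `BiOnceFn`: one object, two
// consuming methods, exactly one of which will be called.
trait BiOnceFn<A, B> {
    type Ret;
    fn call_a(self, a: A) -> Self::Ret;
    fn call_b(self, b: B) -> Self::Ret;
}

// Only the destructuring half of the carrier trait is sketched here.
trait Carrier: Sized {
    type Normal;
    type Exception;
    fn match_carrier<F>(self, f: F) -> F::Ret
    where
        F: BiOnceFn<Self::Normal, Self::Exception>;
}

impl<T, E> Carrier for Result<T, E> {
    type Normal = T;
    type Exception = E;
    fn match_carrier<F>(self, f: F) -> F::Ret
    where
        F: BiOnceFn<T, E>,
    {
        match self {
            Ok(a) => f.call_a(a),
            Err(e) => f.call_b(e),
        }
    }
}

// A single handler that owns the shared environment (`log`). Both branches
// consume the same owned value, which two separate `FnOnce` closures could
// not both capture by move.
struct Describe {
    log: Vec<String>,
}

impl BiOnceFn<i32, String> for Describe {
    type Ret = Vec<String>;
    fn call_a(mut self, a: i32) -> Vec<String> {
        self.log.push(format!("ok: {}", a));
        self.log
    }
    fn call_b(mut self, b: String) -> Vec<String> {
        self.log.push(format!("err: {}", b));
        self.log
    }
}
```

Calling `match_carrier` on an `Ok` value runs `call_a` and on an `Err` value runs `call_b`; in both cases the handler's owned `log` is moved into the branch that actually executes.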
From 63a3c4d088374a9f3929acd77cdaa407b6d80e8f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?G=C3=A1bor=20Lehel?= Date: Tue, 16 Sep 2014 21:40:09 +0200 Subject: [PATCH 0014/1195] I was going to mention "just do it with a macro" in the Alternatives, but it somehow got lost in the shuffle --- active/0000-trait-based-exception-handling.md | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/active/0000-trait-based-exception-handling.md b/active/0000-trait-based-exception-handling.md index 836e0021afe..2ef64c818ac 100644 --- a/active/0000-trait-based-exception-handling.md +++ b/active/0000-trait-based-exception-handling.md @@ -353,8 +353,8 @@ Potentially also: The laws should be sufficient to rule out any "icky" impls. For example, an impl for `Vec` where an exception is represented as the empty vector, and a normal result as a single-element vector: here the third law fails, because if the -`Vec` has more than element *to begin with*, then it's not possible to translate -to a different carrier type and then back without losing information. +`Vec` has more than one element *to begin with*, then it's not possible to +translate to a different carrier type and then back without losing information. The `bool` impl may be surprising, or not useful, but it *is* well-behaved: `bool` is, after all, isomorphic to `Result<(), ()>`. This `impl` may be @@ -586,6 +586,15 @@ carrier types, here are some things which should be true: * Only add the `?` operator, but not any of the other constructs. + * Instead of a built-in `try`..`catch` construct, attempt to define one using + macros. However, this is likely to be awkward because, at least, macros may + only have their contents as a single block, rather than two. 
Furthermore, + macros are excellent as a "safety net" for features which we forget to add + to the language itself, or which only have specialized use cases; but after + seeing this proposal, we need not forget `try`..`catch`, and its prevalence + in nearly every existing language suggests that it is, in fact, generally + useful. + * Instead of a general `Carrier` trait, define everything directly in terms of `Result`. This has precedent in that, for example, the `if`..`else` construct is also defined directly in terms of `bool`. (However, this would likely also From d569e5af72bc4703bfe06f34561abe84238c8340 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Tue, 30 Sep 2014 20:53:03 +0200 Subject: [PATCH 0015/1195] explicitly refer back to paragraph that defines the enum workaround. --- active/0000-empty-structs-with-braces.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/active/0000-empty-structs-with-braces.md b/active/0000-empty-structs-with-braces.md index 512b410e6a5..5ba2b9615b7 100644 --- a/active/0000-empty-structs-with-braces.md +++ b/active/0000-empty-structs-with-braces.md @@ -109,7 +109,8 @@ author (pnkfelix) would be happy to take this tack.) The main reason not to take this tack is that some people may like writing empty structs without braces, but do not want to switch to the -unary enum version. See "I wouldn't want to force noisier syntax ..." +unary enum version described in the previous paragraph. +See "I wouldn't want to force noisier syntax ..." in the [Recent History] appendix. ## Status quo From f44ac712b86e6c4395df2c48a8649fa259ab4fff Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Tue, 7 Oct 2014 13:48:02 +0200 Subject: [PATCH 0016/1195] Update to deal with aforementioned type- vs value- namespace issue. 
--- active/0000-empty-structs-with-braces.md | 79 +++++++++++++++++------- 1 file changed, 58 insertions(+), 21 deletions(-) diff --git a/active/0000-empty-structs-with-braces.md b/active/0000-empty-structs-with-braces.md index 5ba2b9615b7..4e2f55d8d04 100644 --- a/active/0000-empty-structs-with-braces.md +++ b/active/0000-empty-structs-with-braces.md @@ -5,9 +5,11 @@ # Summary When a struct type `S` has no fields (a so-called "empty struct"), -allow it to be defined via either `struct S;` or `struct S {}`, and -allow instances of it to be constructed and pattern-matched via either -`S` or `S {}`. +allow it to be defined via either `struct S;` or `struct S {}`. +When defined via `struct S;`, allow instances of it to be constructed +and pattern-matched via either `S` or `S {}`. +When defined via `struct S {}`, require instances to be constructed +and pattern-matched solely via `S {}`. # Motivation @@ -46,7 +48,7 @@ and non-empty, and the associated revisions of changing removing and adding curly braces is aggravating (both in effort revising the code, and also in extra noise introduced into commit histories). -This RFC proposes going back to the state we were in circa February +This RFC proposes an approach similar to the one we used circa February 2013, when both `S0` and `S0 { }` were accepted syntaxes for an empty struct. The parsing ambiguity that motivated removing support for `S0 { }` is no longer present (see the [Ancient History] appendix). @@ -55,34 +57,43 @@ in the language now. # Detailed design - * Allow `S` to be defined via either `struct S;` (as today) - or `struct S {}` ("new") +There are two kinds of empty structs: Braced empty structs and +flexible empty structs. Flexible empty structs are a slight +generalization of the structs that we have today. - * Allow instances of `S` to be constructed via either the - expression `S` (as today) or the expression `S {}` ("new") +Flexible empty structs are defined via the syntax `struct S;` (as today). 
- * Allow instances of `S` to be pattern matched via either the - pattern `S` (as today) or the pattern `S {}` ("new"). +Braced empty structs are defined via the syntax `struct S { }` ("new"). -Revise the grammar of struct item definitions so that one can write -either `struct S;` or `struct S { }`. The two forms are synonymous. -The first is preferred with respect to coding style; for example, the -first is emitted by the pretty printer. +Both braced and flexible empty structs can be constructed via the +expression syntax `S { }` ("new"). Flexible empty structs, as today, +can also be constructed via the expression syntax `S`. -Revise the grammar of expressions and patterns so that, when `S` is an -empty struct, one can write either `S` or `S { }`. The two forms are -synonymous. Again, the first is preferred with respect to coding -style, and is emitted by the pretty printer. +Both braced and flexible empty structs can be pattern-matched via the +pattern syntax `S { }` ("new"). Flexible empty structs, as today, +can also be pattern-matched via the pattern syntax `S`. -The format of the definiton has no bearing on the format of the -expressions or pattern forms; either syntax can be used for any -empty-struct, regardless of how it is defined. +Braced empty struct definitions solely affect the type namespace, +just like normal non-empty structs. +Flexible empty structs affect both the type and value namespaces. + +As a matter of style, using braceless syntax is preferred for +constructing and pattern-matching flexible empty structs. For +example, pretty-printer tools are encouraged to emit braceless forms +if they know that the corresponding struct is a flexible empty struct. +(Note that pretty printers that handle incomplete fragments may not +have such information available.) 
There is no ambiguity introduced by this change, because we have already introduced a restriction to the Rust grammar to force the use of parentheses to disambiguate struct literals in such contexts. (See [Rust RFC 25]). +The expectation is that when migrating code from a flexible empty +struct to a non-empty struct, it can start by first migrating to a +braced empty struct (and then have a tool indicate all of the +locations where braces need to be added); after that step has been +completed, one can then take the next step of adding the actual field. # Drawbacks @@ -120,6 +131,32 @@ need to handle empty structs as a special case. We may continue hitting bugs like [CFG parse bug]. Some users will be annoyed but most will probably cope. +## Synonymous in all contexts + +Alternative 3: An earlier version of this RFC proposed having `struct +S;` be entirely synonymous with `struct S { }`, and the expression +`S { }` be synonymous with `S`. + +This was deemed problematic, since it would mean that `S { }` would +put an entry into both the type and value namespaces, while +`S { x: int }` would only put an entry into the type namespace. +Thus the current draft of the RFC proposes the "flexible" versus +"braced" distinction for empty structs. + +## Never synonymous + +Alternative 4: Treat `struct S;` as requiring `S` at the expression +and pattern sites, and `struct S { }` as requiring `S { }` at the +expression and pattern sites. + +This in some ways follows a principle of least surprise, but it also +is really hard to justify having both syntaxes available for empty +structs with no flexibility about how they are used. (Note again that +one would have the option of choosing between +`enum S { S }`, `struct S;`, or `struct S { }`, each with their own +idiosyncrasies about whether you have to write `S` or `S { }`.) 
+I would rather adopt "Always Require Braces" than "Never Synonymous" + ## Empty Tuple Structs One might say "why are you including support for curly braces, but not From 221414ddc8491f660efdd2526a4bbe3a69ebc74c Mon Sep 17 00:00:00 2001 From: P1start Date: Wed, 3 Dec 2014 15:59:18 +1300 Subject: [PATCH 0017/1195] RFC: Array pattern adjustments --- text/0000-array-pattern-changes.md | 152 +++++++++++++++++++++++++++++ 1 file changed, 152 insertions(+) create mode 100644 text/0000-array-pattern-changes.md diff --git a/text/0000-array-pattern-changes.md b/text/0000-array-pattern-changes.md new file mode 100644 index 00000000000..06d7e19f7ce --- /dev/null +++ b/text/0000-array-pattern-changes.md @@ -0,0 +1,152 @@ +- Start Date: 2014-12-03 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +Summary +======= + +Change array/slice patterns in the following ways: + +- Make them only match on arrays (`[T, ..n]` and `[T]`), not slices; +- Make subslice matching yield a value of type `[T, ..n]` or `[T]`, not `&[T]` + or `&mut [T]`; +- Allow multiple mutable references to be made to different parts of the same + array or slice in array patterns (resolving rust-lang/rust [issue + #8636](https://github.com/rust-lang/rust/issues/8636)). + +Motivation +========== + +Before DST (and after the removal of `~[T]`), there were only two types based on +`[T]`: `&[T]` and `&mut [T]`. With DST, we can have many more types based on +`[T]`, `Box<[T]>` in particular, but theoretically any pointer type around a +`[T]` could be used. However, array patterns still match on `&[T]`, `&mut [T]`, +and `[T, ..n]` only, meaning that to match on a `Box<[T]>`, one must first +convert it to a slice, which disallows moves. This may prove to significantly +limit the amount of useful code that can be written using array patterns. 
+ +Another problem with today’s array patterns is in subslice matching, which +specifies that the rest of a slice not matched on already in the pattern should +be put into a variable: + +```rust +let foo = [1i, 2, 3]; +match foo { + [head, tail..] => { + assert_eq!(head, 1); + assert_eq!(tail, &[2, 3]); + }, + _ => {}, +} +``` + +This makes sense, but still has a few problems. In particular, `tail` is a +`&[int]`, even though the compiler can always assert that it will have a length +of `2`, so there is no way to treat it like a fixed-length array. Also, all +other bindings in array patterns are by-value, whereas bindings using subslice +matching are by-reference (even though they don’t use `ref`). This can create +confusing errors because of the fact that the `..` syntax is the only way of +taking a reference to something within a pattern without using the `ref` +keyword. + +Finally, the compiler currently complains when one tries to take multiple +mutable references to different values within the same array in a slice pattern: + +```rust +let foo: &mut [int] = &mut [1, 2, 3]; +match foo { + [ref mut a, ref mut b] => ..., + ... +} +``` + +This fails to compile, because the compiler thinks that this would allow +multiple mutable borrows to the same value (which is not the case). + +Detailed design +=============== + +- Make array patterns match only on arrays (`[T, ..n]` and `[T]`). For example, + the following code: + + ```rust + let foo: &[u8] = &[1, 2, 3]; + match foo { + [a, b, c] => ..., + ... + } + ``` + + Would have to be changed to this: + + ```rust + let foo: &[u8] = &[1, 2, 3]; + match foo { + &[a, b, c] => ..., + ... + } + ``` + + This change makes slice patterns mirror slice expressions much more closely. + +- Make subslice matching in array patterns yield a value of type `[T, ..n]` (if + the array is of fixed size) or `[T]` (if not). 
This means changing most code + that looks like this: + + ```rust + let foo: &[u8] = &[1, 2, 3]; + match foo { + [a, b, c..] => ..., + ... + } + ``` + + To this: + + ```rust + let foo: &[u8] = &[1, 2, 3]; + match foo { + &[a, b, ref c..] => ..., + ... + } + ``` + + It should be noted that if a fixed-size array is matched on using subslice + matching, and `ref` is used, the type of the binding will be `&[T, ..n]`, + *not* `&[T]`. + +- Improve the compiler’s analysis of multiple mutable references to the same + value within array patterns. This would be done by allowing multiple mutable + references to different elements of the same array (including bindings from + subslice matching): + + ```rust + let foo: &mut [u8] = &mut [1, 2, 3, 4]; + match foo { + &[ref mut a, ref mut b, ref c, ref mut d..] => ..., + ... + } + ``` + +Drawbacks +========= + +- This will break a non-negligible amount of code, requiring people to add `&`s + and `ref`s to their code. + +- The modifications to subslice matching will require `ref` or `ref mut` to be + used in almost all cases. This could be seen as unnecessary. + +Alternatives +============ + +- Do a subset of this proposal; for example, the modifications to subslice + matching in patterns could be removed. + +Unresolved questions +==================== + +- What are the precise implications to the borrow checker of the change to + multiple mutable borrows in the same array pattern? Since it is a + backwards-compatible change, it can be implemented after 1.0 if it turns out + to be difficult to implement. 
From 40d369ac6a5ace63d00acbeafddf61f18e046cac Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Mon, 8 Dec 2014 13:13:09 -0500 Subject: [PATCH 0018/1195] API comment conventions --- text/0000-api-comment-conventions.md | 125 +++++++++++++++++++++++++++ 1 file changed, 125 insertions(+) create mode 100644 text/0000-api-comment-conventions.md diff --git a/text/0000-api-comment-conventions.md b/text/0000-api-comment-conventions.md new file mode 100644 index 00000000000..edd7cdf878d --- /dev/null +++ b/text/0000-api-comment-conventions.md @@ -0,0 +1,125 @@ +- Start Date: 2014-12-08 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +This is a conventions RFC, providing guidance on providing API documentation +for Rust projects, including the Rust language itself. + +# Motivation + +Documentation is an extremely important part of any project. It's important +that we have consistency in our documentation. + +For the most part, the RFC proposes guidelines that are already followed today, +but it tries to motivate and clarify them. + +# Detailed design + +There are a number of indivudal guidelines: + +## Use line comments + +Avoid block comments. Use line comments instead: + +```rust +// Wait for the main task to return, and set the process error code +// appropriately. +``` + +Instead of: + +```rust +/* + * Wait for the main task to return, and set the process error code + * appropriately. + */ +``` + +Only use inner doc comments //! to write crate and module-level documentation, nothing else. + +## Formatting + +The first line in any doc comment should be a single-line short sentence +providing a summary of the code. This line is used as a summary description +throughout Rustdoc's output, so it's a good idea to keep it short. + +All doc comments, including the summary line, should begin with a capital +letter and end with a period, question mark, or exclamation point. Prefer full +sentences to fragments. 
+ +The summary line should be written in third person singular present indicative +form. Basically, this means write "Returns" instead of "Return". + +## Using Markdown + +Within doc comments, use Markdown to format your documentation. + +Use top level headings # to indicate sections within your comment. Common headings: + +* Examples +* Panics +* Failure + +Even if you only include one example, use the plural form: "Examples" rather +than "Example". Future tooling is easier this way. + +Use graves (`) to denote a code fragment within a sentence. + +Use triple graves (```) to write longer examples, like this: + + This code does something cool. + + ```rust + let x = foo(); + x.bar(); + ``` + +When appropriate, make use of Rustdoc's modifiers. Annotate triple grave blocks with +the appropriate formatting directive. While they default to Rust in Rustdoc, prefer +being explicit, so that it highlights syntax in places that do not, like GitHub. + + ```rust + println!("Hello, world!"); + ``` + + ```ruby + puts "Hello" + ``` + +Rustdoc is able to test all Rust examples embedded inside of documentation, so +it's important to mark what is not Rust so your tests don't fail. + +References and citation should be linked inline. Prefer + +``` +[some paper](http://www.foo.edu/something.pdf) +``` + +to + +``` +some paper[1] + +1: http://www.foo.edu/something.pdf +``` + +## English + +All documentation is standardized on American English, with regards to +spelling, grammar, and punctuation conventions. Language changes over time, +so this doesn't mean that there is always a correct answer to every grammar +question, but there is often some kind of formal consensus. + +# Drawbacks + +None. + +# Alternatives + +Not having documentation guidelines. + +# Unresolved questions + +None. 
From 59adf760e178eae67e502aa7aac1e9fbd3fefb36 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Sun, 14 Dec 2014 17:56:12 +0100 Subject: [PATCH 0019/1195] Added string pattern RFC --- text/0000-string-patterns.md | 374 +++++++++++++++++++++++++++++++++++ 1 file changed, 374 insertions(+) create mode 100644 text/0000-string-patterns.md diff --git a/text/0000-string-patterns.md b/text/0000-string-patterns.md new file mode 100644 index 00000000000..447450f235a --- /dev/null +++ b/text/0000-string-patterns.md @@ -0,0 +1,374 @@ +- Start Date: (fill me in with today's date, YYYY-MM-DD) +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +> One para explanation of the feature. + +Stabilize all string functions working with search patterns around a new +generic API that provides a unfied way to define and use those patterns. + +# Motivation + +> Why are we doing this? What use cases does it support? What is the expected outcome? + +Right now, string slices define a couple of methods for string +manipulation that work with user provided values that act as +search patterns. For example, `split()` takes an type implementing `CharEq` +to split the slice at all codepoints that match that predicate. + +Among these methods, the notion of what exactly is being used as a search +pattern varies inconsistently: Many work with the generic `CharEq`, +which only looks at a single codepoint at a time; and some +work with `char` or `&str` directly, sometimes duplicating a method to +provide operations for both. + +This presents a couple of issues: + +- The API is inconsistent. +- The API duplicates similar operations on different types. (`contains` vs `contains_char`) +- The API does not provide all operations for all types. (No `rsplit` for `&str` patterns) +- The API is not extensible, eg to allow splitting at regex matches. 
+- The API offers no way to statically decide between different basic search algorithms + for the same pattern, for example to use Boojer Moore string searching + +> TODO: Spelling above + +At the moment, the full set of relevant string methods roughly looks like this: + +```rust +pub trait StrExt for ?Sized { + fn contains(&self, needle: &str) -> bool; + fn contains_char(&self, needle: char) -> bool; + + fn split(&self, sep: Sep) -> CharSplits; + fn splitn(&self, sep: Sep, count: uint) -> CharSplitsN; + fn rsplitn(&self, sep: Sep, count: uint) -> CharSplitsN; + fn split_terminator(&self, sep: Sep) -> CharSplits; + fn split_str<'a>(&'a self, &'a str) -> StrSplits<'a>; + + fn match_indices<'a>(&'a self, sep: &'a str) -> MatchIndices<'a>; + + fn starts_with(&self, needle: &str) -> bool; + fn ends_with(&self, needle: &str) -> bool; + + fn trim_chars(&self, to_trim: C) -> &'a str; + fn trim_left_chars(&self, to_trim: C) -> &'a str; + fn trim_right_chars(&self, to_trim: C) -> &'a str; + + fn find(&self, search: C) -> Option; + fn rfind(&self, search: C) -> Option; + fn find_str(&self, &str) -> Option; + + // ... +} +``` + +This RFC proposes to fix those issues by providing a unified `Pattern` trait +that all "string pattern" types would implement, and that would be used by the string API +exclusively. + +As an additional design goal, the new abstractions should also not pose a problem +for optimization - like for iterators, a concrete instance should produce similar +machine code to a hardcoded optimized loop written in C. + +> Idea: Parallel trait hierachy not using unsafe, that will use checks + +# Detailed design + +> This is the bulk of the RFC. Explain the design in enough detail for somebody familiar +with the language to understand, and for somebody familiar with the compiler to implement. +This should get into specifics and corner-cases, and include examples of how the feature is used. 
+ +> Goal: A working draft with lifetimes + +## New traits + +First, new traits will be added to the `str` module in the std library: + +```rust +trait Pattern<'a> { + type MatcherImpl: Matcher<'a>; + + fn into_matcher(self, haystack: &'a str) -> Self::MatcherImpl; + + // Can be implemented to optimize the "find only" case. + fn is_contained_in(self, haystack: &'a str) -> bool { + self.into_matcher(s).next_match().is_some() + } +} +``` + +A `Pattern` represents a builder for an associated type implementing a +family of `Matcher` traits (see below), and will be implemented by all types that +represent string patterns, which includes: + +- `char` and `&str` +- Everything implementing `CharEq` +- Additional types like `&Regex` or `Ascii` + +```rust +impl<'a> Pattern<'a> for char { /* ... */ } +impl<'a, 'b> Pattern<'a> for &'b str { /* ... */ } + +impl<'a, 'b> Pattern<'a> for &'b [char] { /* ... */ } +impl<'a, F> Pattern<'a> for F where F: FnOnce(char) -> bool { /* ... */ } + +impl<'a, 'b> Pattern<'a> for &'b Regex { /* ... */ } +``` + +The lifetime paramter on `Pattern` exists in order to allow threading the lifetime +of the haystack (the string to be searched through) through the API, and is a workaround +for not having associated higher kinded types yet. + +Consumers of this API can then call `into_matcher()` on the pattern to convert it into +a type implementing a family of `Matcher` traits: + +```rust +unsafe trait Matcher<'a> { + fn haystack(&self) -> &'a str + fn next_match(&mut self) -> Option<(uint, uint)>; +} + +unsafe trait ReverseMatcher<'a>: Matcher<'a> { + fn next_match_back(&mut self) -> Option<(uint, uint)>; +} + +trait DoubleEndedMatcher<'a>: ReverseMatcher<'a> {} +``` + +> TODO: Better name for the last trait + +The basic idea of a `Matcher` is to expose a `Iterator`-like interface for +iterating through all matches of a pattern in the given haystack. 
+ +Similar to iterators, depending on the concrete implementation a matcher can have +additional capabilities that build on each other, which is why they will be +defined in terms of a three-tier hierachy: + +- `Matcher<'a>` is the basic trait that all matchers need to implement. + It contains a `next_match()` method that returns the `start` and `end` indices of + the next non-overlapping match in the haystack, with the search beginning at the front + (left) of the string. It also contains a `haystack()` getter for returning the + actual haystack, which is the source of the `'a` lifetime on the hierarchy. + The reason for this getter being made part of the trait is twofold: + - Every matcher needs to store some reference to the haystack anyway. + - Users of this trait will need access to the haystack in order + for the individual match results to be useful. +- `ReverseMatcher<'a>` adds an `next_match_back` method, for also allowing to efficiently + search for matches in reverse (starting from the right). + However, the results are not required to be equal to the results of + `next_match` in reverse, (as would be the case for the `DoubleEndedIterator` trait) + as that can not be efficiently guaranteed for all matchers. (For an example, see further below) +- Instead `DoubleEndedMatcher<'a>` is provided as an marker trait for expressing + that guarantee - If a matcher implements this trait, all results found from the + left need to be equal to all results found from the right in reverse order. + +As an important last detail, both +`Matcher` and `ReverseMatcher` are marked as `unsafe` traits, even though the actual methods +aren't. This is because every implementation of these traits need to ensure that all +indices returned by `next_match` and `next_match_back` lie on valid utf8 boundaries +in the used haystack. 
+ +Without that guarantee, every single match returned by a matcher would need to be +double-checked for validity, which would be unnecessary and most likely +unoptimizable work. + +This is in contrast to the current hardcoded implementations, which can +make use of such guarantees because the concrete types are known +and all unsafe code needed for such optimizations is contained inside a single safe impl. + +Given that most implementations of these traits will likely +live in the std library anyway, and are thoroughly tested, marking these traits `unsafe` +doesn't seem like a huge burden to bear for good, optimizable performance. + +### Example for the issue with double-ended searching + +Let the haystack be the string `"fooaaaaabar"`, and let the pattern be the string `"aa"`. + +Then a efficient, lazy implementation of the matcher searching from the left +would find these matches: + +`"foo[aa][aa]abar"` + +However, the same algorithm searching from the right would find these matches: + +`"fooa[aa][aa]bar"` + +This discrepancy can not be avoided without additional overhead or even +allocations for caching in the reverse matcher, and thus "matching from the front" needs to +be considered a different operation than "matching from the back". + +## New methods on `StrExt` + +With the `Pattern` and `Matcher` traits defined and implemented, the actual `str` +methods will be changed to make use of them: + +```rust +pub trait StrExt { + fn contains<'a, P>(&'a self, pat: P) -> bool where P: Pattern<'a>; + + fn split<'a, P>(&'a self, pat: P) -> Splits<P> where P: Pattern<'a>; + fn rsplit<'a, P>(&'a self, pat: P) -> RSplits<P> where P: Pattern<'a>; + fn split_terminator<'a, P>(&'a self, pat: P) -> TermSplits<P> where P: Pattern<'a>; + fn rsplit_terminator<'a, P>(&'a self, pat: P) -> RTermSplits<P> where P: Pattern<'a>; + fn splitn<'a, P>(&'a self, pat: P, n: uint) -> NSplits<P> where P: Pattern<'a>; + fn rsplitn<'a, P>(&'a self, pat: P, n: uint) -> RNSplits<P> where P: Pattern<'a>; + + fn matches<'a, P>(&'a self, pat: P) -> Matches<P> where P: Pattern<'a>; + fn rmatches<'a, P>(&'a self, pat: P) -> RMatches<P> where P: Pattern<'a>; + fn match_indices<'a, P>(&'a self, pat: P) -> MatchIndices<P> where P: Pattern<'a>; + fn rmatch_indices<'a, P>(&'a self, pat: P) -> RMatchIndices<P>
where P: Pattern<'a>; + + fn starts_with<'a, P>(&'a self, pat: P) -> bool where P: Pattern<'a>; + fn ends_with<'a, P>(&'a self, pat: P) -> bool where P: Pattern<'a>, + P::MatcherImpl: ReverseMatcher<'a>; + + fn trim_matches<'a, P>(&'a self, pat: P) -> &'a str where P: Pattern<'a>, + P::MatcherImpl: ReverseMatcher<'a>; + fn trim_left_matches<'a, P>(&'a self, pat: P) -> &'a str where P: Pattern<'a>; + fn trim_right_matches<'a, P>(&'a self, pat: P) -> &'a str where P: Pattern<'a>, + P::MatcherImpl: ReverseMatcher<'a>; + + fn find<'a, P>(&'a self, pat: P) -> Option where P: Pattern<'a>; + fn rfind<'a, P>(&'a self, pat: P) -> Option where P: Pattern<'a>, + P::MatcherImpl: ReverseMatcher<'a>; + + // ... +} +``` + +These are mainly the same pattern-using methods as currently existing, only +changed to uniformly use the new pattern API. The main differences are: +- Duplicates like `contains(char)` and `contains_str(&str)` got merged into single generic methods. +- `CharEq`-centric naming got changed to `Pattern`-centric naming by changing `chars` + to `matches` in a few method names. +- A `Matches` iterator has been added, that just returns the pattern matches as `&str` slices. + Its uninteresting for patterns that look for a single string fragment, like the `char` and `&str` + matcher, but useful for advanced patterns like predicates over codepoints, or regular expressions. +- All operations that can work from both the front and the back consistently exist in two versions, + the regular front version, and a `r` prefixed reverse versions. As explained above, + this is because both represent different operations, and thus need to be handled as such. + To be more precise, the two can __not__ be abstracted over by providing a `DoubleEndedIterator` + implementations, as the different results would break the requirement for double ended iterators + to behave like a double ended queues where you just pop elements from both sides. 
+ +_However_, all iterators will still implement `DoubleEndedIterator` if the underling +matcher implements `DoubleEndedMatcher`, to keep the ability to do things like `foo.split('a').rev()`. + +## Transition and deprecation plans + +Most changes in this RFC can be made in such a way that code using the old hardcoded or `CharEq`-using +methods will still compile, or give deprecation warning. + +It would even be possible to generically implement `Pattern` for all `CharEq` types, +making the transition more painless. + +Long-term, post 1.0, it would be possible to define new sets of `Pattern` and `Matcher` +without a lifetime parameter by making use of higher kinded types in order to simplify the +string APIs. Eg, instead of `fn starts_with<'a, P>(&'a self, pat: P) -> bool where P: Pattern<'a>;` +you'd have `fn starts_with<P>
(&self, pat: P) -> bool where P: Pattern;`. + +In order to not break backwards-compability, these can use the same generic-impl trick to +forward to the old traits, which would roughly look like this: + +```rust +unsafe trait NewPattern { + type MatcherImpl<'a> where MatcherImpl: NewMatcher; + + fn into_matcher<'a>(self, s: &'a str) -> Self::MatcherImpl<'a>; +} + +unsafe impl<'a, P> Pattern<'a> for P where P: NewPattern { + type MatcherImpl = ::MatcherImpl<'a>; + + fn into_matcher(self, haystack: &'a str) -> Self::MatcherImpl { + ::into_matcher(self, haystack) + } +} + +unsafe trait NewMatcher for Self<'_> { + fn haystack<'a>(self: &Self<'a>) -> &'a str; + fn next_match<'a>(self: &mut Self<'a>) -> Option<(uint, uint)>; +} + +unsafe impl<'a, M> Matcher<'a> for M<'a> where M: NewMatcher { + fn haystack(&self) -> &'a str { + ::haystack(self) + } + fn next_match(&mut self) -> Option<(uint, uint)> { + ::next_match(self) + } +} +``` + +Based on coherency experiments and assumptions about how future HKT will work, +the author is assuming that the above implementation will work, but can not experimentally prove it. + +In order for these new traits to fully replace the old ones without getting in their way, +the old ones need to not be defined in a way that makes them "final". +That is, they should be defined in their own submodule, like `str::pattern` that can grow +a sister module like `str::newpattern`, and not be exported in a global place like `str` or even +the `prelude` (which would be unneeded anyway). + +# Drawbacks + +- It complicates the whole machinery and API behind the implementation of matching on string patterns. +- The no-HKT-lifetime-workaround wart might be to confusing for something as commonplace as the string API. +- This add a few layers of generics, so compilation times and micro optimizations might suffer. + +# Alternatives + +## Alternatives in general + +- Keep status quo, with all issues listed at the beginning. 
+- Stabilize on hardcoded variants, eg providing both `contains` and `contains_str`. + Similar to status quo, but no `CharEq` and thus no generics. + +## Primary alternatives in details of this proposal + +The author identified two alternatives that might still give the same desired API long-term. +The biggest wart is the lifetime parameter on the two trait families, so both try to avoid it: + +- Stabilize on a variant around `CharEq` - This would mean hardcoded `_str` methods, + generic `CharEq` methods, and no extensibility to types like `Regex`, but has a + upgrade path for later upgrading `CharEq` to a full-fledged, HKT-using `Pattern` API, by providing + back-comp generic impls. +- Remove the lifetimes on `Matcher` and `Pattern` by requiring users of the API to store the haystack slice + themselves, duplicating it in the in-memory representation. + +## Other alternatives in details of this proposal + +- Remove the lifetime parameter on `Pattern` and `Matcher` by making them fully unsafe API's, + and require implementations to unsafely transmuting away and back the lifetime of the haystack slice. +- Remove `unsafe` from the API by not marking the `Matcher` traits as `unsafe`, requiring users of the API + to explicitly check every match on validity in regard to utf8 boundaries. +- Allow to opt-in the `unsafe` traits by providing parallel safe and unsafe `Matcher` traits or methods, + with the one per default implemented in terms of the other. +- Turn `Pattern` into `Pattern` and `ReversePattern`, starting the forward-reverse split at the level of + patterns directly. The two would still be in a inherits-from relationship like + `Matcher` and `ReverseMatcher`, and be interchangeable if the later also implement `DoubleEndedMatcher`, + but on the `str` API `where` clauses like `where P: Pattern<'a>, P::MatcherImpl: ReverseMatcher<'a>` + would turn into `where P: ReversePattern<'a>`. 
+ +# Unresolved questions + +- Concrete performance is untested compared to the current situation. +- Should the API split in regard to forward-reverse matching be as symmetrical as possible, + or as minimal as possible? + In the first case, iterators like `Matches` and `RMatches` could both implement `DoubleEndedIterator` if a + `DoubleEndedMatcher` exists, in the latter only `Matches` would, with `RMatches` only providing the + minimum to support reverse operation. + +# Additional extensions + +A similar abstraction system could be implemented for `String` APIs, so that for example `string.push("foo")`, +`string.push('f')`, `string.push('f'.to_ascii())` all work by using something like a `StringSource` trait. + +This would allow operations like `s.replace(&regex!(...), "foo")`, +which would be a method generic over both the pattern matched and the string fragment it gets replaced with: + +```rust +fn replace<P, S>(&mut self, pat: P, with: S) where P: Pattern, S: StringSource { /* ... */ } +``` From dc30bd0caf31e209e85e24a6046fd5d658e6f841 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Sun, 14 Dec 2014 18:12:14 +0100 Subject: [PATCH 0020/1195] Fix spelling, notes, and actual errors --- text/0000-string-patterns.md | 38 ++++++++++++------------------------ 1 file changed, 12 insertions(+), 26 deletions(-) diff --git a/text/0000-string-patterns.md b/text/0000-string-patterns.md index 447450f235a..7926697c94d 100644 --- a/text/0000-string-patterns.md +++ b/text/0000-string-patterns.md @@ -4,15 +4,11 @@ # Summary -> One para explanation of the feature. - Stabilize all string functions working with search patterns around a new -generic API that provides a unfied way to define and use those patterns. +generic API that provides a unified way to define and use those patterns. # Motivation -> Why are we doing this? What use cases does it support? What is the expected outcome?
- Right now, string slices define a couple of methods for string manipulation that work with user provided values that act as search patterns. For example, `split()` takes an type implementing `CharEq` @@ -31,9 +27,7 @@ This presents a couple of issues: - The API does not provide all operations for all types. (No `rsplit` for `&str` patterns) - The API is not extensible, eg to allow splitting at regex matches. - The API offers no way to statically decide between different basic search algorithms - for the same pattern, for example to use Boojer Moore string searching - -> TODO: Spelling above + for the same pattern, for example to use Boyer-Moore string searching. At the moment, the full set of relevant string methods roughly looks like this: @@ -73,16 +67,8 @@ As an additional design goal, the new abstractions should also not pose a proble for optimization - like for iterators, a concrete instance should produce similar machine code to a hardcoded optimized loop written in C. -> Idea: Parallel trait hierachy not using unsafe, that will use checks - # Detailed design -> This is the bulk of the RFC. Explain the design in enough detail for somebody familiar -with the language to understand, and for somebody familiar with the compiler to implement. -This should get into specifics and corner-cases, and include examples of how the feature is used. - -> Goal: A working draft with lifetimes - ## New traits First, new traits will be added to the `str` module in the std library: @@ -118,7 +104,7 @@ impl<'a, F> Pattern<'a> for F where F: FnOnce(char) -> bool { /* ... */ } impl<'a, 'b> Pattern<'a> for &'b Regex { /* ... */ } ``` -The lifetime paramter on `Pattern` exists in order to allow threading the lifetime +The lifetime parameter on `Pattern` exists in order to allow threading the lifetime of the haystack (the string to be searched through) through the API, and is a workaround for not having associated higher kinded types yet. 
@@ -138,14 +124,12 @@ unsafe trait ReverseMatcher<'a>: Matcher<'a> { trait DoubleEndedMatcher<'a>: ReverseMatcher<'a> {} ``` -> TODO: Better name for the last trait - The basic idea of a `Matcher` is to expose a `Iterator`-like interface for iterating through all matches of a pattern in the given haystack. Similar to iterators, depending on the concrete implementation a matcher can have additional capabilities that build on each other, which is why they will be -defined in terms of a three-tier hierachy: +defined in terms of a three-tier hierarchy: - `Matcher<'a>` is the basic trait that all matchers need to implement. It contains a `next_match()` method that returns the `start` and `end` indices of @@ -168,7 +152,7 @@ defined in terms of a three-tier hierachy: As an important last detail, both `Matcher` and `ReverseMatcher` are marked as `unsafe` traits, even though the actual methods aren't. This is because every implementation of these traits need to ensure that all -indices returned by `next_match` and `next_match_back` lie on valid utf8 boundaries +indices returned by `next_match` and `next_match_back` lay on valid utf8 boundaries in the used haystack. Without that guarantee, every single match returned by a matcher would need to be @@ -206,7 +190,7 @@ With the `Pattern` and `Matcher` traits defined and implemented, the actual `str methods will be changed to make use of them: ```rust -pub trait StrExt { +pub trait StrExt for ?Sized { fn contains<'a, P>(&'a self, pat: P) -> bool where P: Pattern<'a>; fn split<'a, P>(&'a self, pat: P) -> Splits<P> where P: Pattern<'a>; @@ -241,6 +225,7 @@ pub trait StrExt { These are mainly the same pattern-using methods as currently existing, only changed to uniformly use the new pattern API. The main differences are: + - Duplicates like `contains(char)` and `contains_str(&str)` got merged into single generic methods. - `CharEq`-centric naming got changed to `Pattern`-centric naming by changing `chars` to `matches` in a few method names. @@ -328,18 +313,19 @@ the `prelude` (which would be unneeded anyway). ## Primary alternatives in details of this proposal -The author identified two alternatives that might still give the same desired API long-term. -The biggest wart is the lifetime parameter on the two trait families, so both try to avoid it: +The author identified a primary alternative that might still give the same desired API long-term. +The biggest wart is the lifetime parameter on the two trait families, so it tries to avoid it: - Stabilize on a variant around `CharEq` - This would mean hardcoded `_str` methods, generic `CharEq` methods, and no extensibility to types like `Regex`, but has a upgrade path for later upgrading `CharEq` to a full-fledged, HKT-using `Pattern` API, by providing back-comp generic impls. -- Remove the lifetimes on `Matcher` and `Pattern` by requiring users of the API to store the haystack slice - themselves, duplicating it in the in-memory representation. ## Other alternatives in details of this proposal +- Remove the lifetimes on `Matcher` and `Pattern` by requiring users of the API to store the haystack slice + themselves, duplicating it in the in-memory representation. + However, this still runs into HKT issues with the impl of `Pattern`. - Remove the lifetime parameter on `Pattern` and `Matcher` by making them fully unsafe API's, and require implementations to unsafely transmuting away and back the lifetime of the haystack slice.
- Remove `unsafe` from the API by not marking the `Matcher` traits as `unsafe`, requiring users of the API From 7a448b59df926a06bca4289a72cb4cc81a5823be Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Sun, 14 Dec 2014 18:17:59 +0100 Subject: [PATCH 0021/1195] Added missing fluff about how to solve the issues in the summary --- text/0000-string-patterns.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/text/0000-string-patterns.md b/text/0000-string-patterns.md index 7926697c94d..c6de5cb0c4c 100644 --- a/text/0000-string-patterns.md +++ b/text/0000-string-patterns.md @@ -63,6 +63,10 @@ This RFC proposes to fix those issues by providing a unified `Pattern` trait that all "string pattern" types would implement, and that would be used by the string API exclusively. +This fixes the duplication, consistency, and extensibility problems, and also allows to define +newtype wrappers for the same pattern types that use different or specific +search implementations. + As an additional design goal, the new abstractions should also not pose a problem for optimization - like for iterators, a concrete instance should produce similar machine code to a hardcoded optimized loop written in C. From 4d9b73459cae9508b84bfbbdf5e20f7343b5eb78 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Sun, 14 Dec 2014 18:32:07 +0100 Subject: [PATCH 0022/1195] Add note --- text/0000-string-patterns.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/text/0000-string-patterns.md b/text/0000-string-patterns.md index c6de5cb0c4c..fcb492bdfd5 100644 --- a/text/0000-string-patterns.md +++ b/text/0000-string-patterns.md @@ -295,6 +295,9 @@ unsafe impl<'a, M> Matcher<'a> for M<'a> where M: NewMatcher { Based on coherency experiments and assumptions about how future HKT will work, the author is assuming that the above implementation will work, but can not experimentally prove it. 
+> Note: There might be still an issue with this upgrade path on the concrete iterator types. + That is, `Split<P>` might turn into `Split<'a, P>`... Maybe require the `'a` from the beginning? + In order for these new traits to fully replace the old ones without getting in their way, the old ones need to not be defined in a way that makes them "final". That is, they should be defined in their own submodule, like `str::pattern` that can grow From d7249b9a2bd970d65f673a6e87eb7d95a4aaa97f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Sun, 14 Dec 2014 19:06:32 +0100 Subject: [PATCH 0023/1195] Added another unsafe alternative --- text/0000-string-patterns.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/text/0000-string-patterns.md b/text/0000-string-patterns.md index fcb492bdfd5..29f842bf4ae 100644 --- a/text/0000-string-patterns.md +++ b/text/0000-string-patterns.md @@ -330,6 +330,13 @@ The biggest wart is the lifetime parameter on the two trait families, so it trie ## Other alternatives in details of this proposal +- Remove `unsafe` from the API by returning a special `SubSlice<'a>` type instead of `(uint, uint)` in each + match, that wraps the haystack and the + current match as a `(*start, *match_start, *match_end, *end)` pointer quad. It is unclear whether + those two additional words per match end up being an issue after monomorphization, but two of them + will be constant for the duration of the iteration, so changes are good. + The `haystack()` could also be removed that way, as each match already returns the haystack. + However, this still prevents removal of the lifetime parameters without HKT. - Remove the lifetimes on `Matcher` and `Pattern` by requiring users of the API to store the haystack slice themselves, duplicating it in the in-memory representation. However, this still runs into HKT issues with the impl of `Pattern`.
From 8f7c68761d51e3d2a3f05de1cab4ddc1a933ae0f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Sun, 14 Dec 2014 19:07:45 +0100 Subject: [PATCH 0024/1195] Fix --- text/0000-string-patterns.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-string-patterns.md b/text/0000-string-patterns.md index 29f842bf4ae..f82a407922b 100644 --- a/text/0000-string-patterns.md +++ b/text/0000-string-patterns.md @@ -334,7 +334,7 @@ The biggest wart is the lifetime parameter on the two trait families, so it trie match, that wraps the haystack and the current match as a `(*start, *match_start, *match_end, *end)` pointer quad. It is unclear whether those two additional words per match end up being an issue after monomorphization, but two of them - will be constant for the duration of the iteration, so changes are good. + will be constant for the duration of the iteration, so changes are they won't matter. The `haystack()` could also be removed that way, as each match already returns the haystack. However, this still prevents removal of the lifetime parameters without HKT. - Remove the lifetimes on `Matcher` and `Pattern` by requiring users of the API to store the haystack slice From 8048588882d15aa35e00d9f2f3aa5f973747e67c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Tue, 16 Dec 2014 02:13:41 +0100 Subject: [PATCH 0025/1195] Updated RFC with details about the return type of the `next_match` functions --- text/0000-string-patterns.md | 69 +++++++++++++++++++++++++++++------- 1 file changed, 56 insertions(+), 13 deletions(-) diff --git a/text/0000-string-patterns.md b/text/0000-string-patterns.md index f82a407922b..2a31e0ec969 100644 --- a/text/0000-string-patterns.md +++ b/text/0000-string-patterns.md @@ -157,7 +157,7 @@ As an important last detail, both `Matcher` and `ReverseMatcher` are marked as `unsafe` traits, even though the actual methods aren't. 
This is because every implementation of these traits need to ensure that all indices returned by `next_match` and `next_match_back` lay on valid utf8 boundaries -in the used haystack. +in the haystack. Without that guarantee, every single match returned by a matcher would need to be double-checked for validity, which would be unnecessary and most likely @@ -188,6 +188,40 @@ This discrepancy can not be avoided without additional overhead or even allocations for caching in the reverse matcher, and thus "matching from the front" needs to be considered a different operation than "matching from the back". +### Why `(uint, uint)` instead of `&str` + +It would be possible to define `next_match` and `next_match_back` to return an `&str` +to the match instead of `(uint, uint)`. + +A concrete matcher impl could then make use of unsafe code to construct such an slice cheaply, +and by its very nature it is guaranteed to lie on utf8 boundaries, +which would also allow not marking the traits as unsafe. + +However, this approach has a couple of issues. For one, not every consumer of +this API cares about only the matched slice itself: + +- The `split()` family of operations cares about the slices _between_ matches. +- Operations like `match_indices()` and `find()` need to actually return the offset + to the start of the string as part of their definition. +- The `trim()` and `Xs_with()` family of operations need to compare individual match + offsets with each other and the start and end of the string. + +In order for these use cases to work with a `&str` match, the concrete adapters +would need to unsafely calculate the offset of a match `&str` to the start of the haystack `&str`. + +But that in turn would require matcher implementors to only return actual sub slices into +the haystack, and not random `static` string slices, as the API defined with `&str` would allow. 
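As a rough illustration of the point about `split()`-style consumers, here is a minimal sketch in present-day Rust syntax (`usize` rather than the `uint` used in this RFC). The helper `pieces_between` is hypothetical and not part of the proposal, and `str::match_indices` from today's standard library stands in for a matcher. It shows that the slices between matches fall out directly from `(start, end)` offset pairs, with no pointer arithmetic against the haystack:

```rust
/// Hypothetical helper: collects the pieces of `haystack` that lie between
/// the given matches. Each match is a `(start, end)` byte-offset pair,
/// assumed to be sorted, non-overlapping, and on UTF-8 boundaries.
fn pieces_between<'a>(haystack: &'a str, matches: &[(usize, usize)]) -> Vec<&'a str> {
    let mut pieces = Vec::new();
    let mut last_end = 0;
    for &(start, end) in matches {
        // The piece before this match starts where the previous match ended.
        pieces.push(&haystack[last_end..start]);
        last_end = end;
    }
    // Whatever follows the final match is the last piece.
    pieces.push(&haystack[last_end..]);
    pieces
}

fn main() {
    let haystack = "a, b, c";
    // Stand-in for a matcher: `match_indices` yields the start offset and the
    // matched slice, from which the end offset follows.
    let matches: Vec<(usize, usize)> = haystack
        .match_indices(", ")
        .map(|(start, m)| (start, start + m.len()))
        .collect();
    println!("{:?}", pieces_between(haystack, &matches));
}
```

Had the matcher returned only a `&str` per match, `pieces_between` would first have to recover each offset from the slice's address, which is the unsafe step discussed above.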
+ +In order to resolve that issue, you'd have to do one of: + +- Add the uncheckable API constraint of only requiring true subslices, which would make the traits + unsafe again, negating much of the benefit. +- Return a more complex custom slice type that still contains the haystack offset. + (This is listed as an alternative at the end of this RFC.) + +In both cases, the API does not really improve significantly, so `uint` indices have been chosen +as the "simple" default design. + ## New methods on `StrExt` With the `Pattern` and `Matcher` traits defined and implemented, the actual `str` @@ -312,23 +346,36 @@ the `prelude` (which would be unneeded anyway). # Alternatives -## Alternatives in general +In general: - Keep status quo, with all issues listed at the beginning. - Stabilize on hardcoded variants, eg providing both `contains` and `contains_str`. Similar to status quo, but no `CharEq` and thus no generics. -## Primary alternatives in details of this proposal - -The author identified a primary alternative that might still give the same desired API long-term. -The biggest wart is the lifetime parameter on the two trait families, so it tries to avoid it: +Under the assumption that the lifetime parameter on the traits in this proposal +is too big a wart to have in the release string API, there is an primary alternative +that would avoid it: - Stabilize on a variant around `CharEq` - This would mean hardcoded `_str` methods, generic `CharEq` methods, and no extensibility to types like `Regex`, but has a upgrade path for later upgrading `CharEq` to a full-fledged, HKT-using `Pattern` API, by providing back-comp generic impls. 
-## Other alternatives in details of this proposal +Next, there are alternatives that might make a positive difference in the authors opinion, but still have +some negative trade-of: + +- With the `Matcher` traits having the unsafe constraint of returning results unique to the + current haystack already, they could just directly return a `(*const u8, *const u8)` pointing into it. + This would allow a few more micro-optimizations, as now the `matcher -> match -> final slice` + pipeline would no longer need to keep adding and subtracting the start address of the haystack + for immediate results. +- Extend `Pattern` into `Pattern` and `ReversePattern`, starting the forward-reverse split at the level of + patterns directly. The two would still be in a inherits-from relationship like + `Matcher` and `ReverseMatcher`, and be interchangeable if the later also implement `DoubleEndedMatcher`, + but on the `str` API where clauses like `where P: Pattern<'a>, P::MatcherImpl: ReverseMatcher<'a>` + would turn into `where P: ReversePattern<'a>`. + +Lastly, there are alternatives that don't seem very favorable, but are listed for completeness sake: - Remove `unsafe` from the API by returning a special `SubSlice<'a>` type instead of `(uint, uint)` in each match, that wraps the haystack and the @@ -341,16 +388,11 @@ The biggest wart is the lifetime parameter on the two trait families, so it trie themselves, duplicating it in the in-memory representation. However, this still runs into HKT issues with the impl of `Pattern`. - Remove the lifetime parameter on `Pattern` and `Matcher` by making them fully unsafe API's, - and require implementations to unsafely transmuting away and back the lifetime of the haystack slice. + and require implementations to unsafely transmuting back the lifetime of the haystack slice. 
- Remove `unsafe` from the API by not marking the `Matcher` traits as `unsafe`, requiring users of the API to explicitly check every match on validity in regard to utf8 boundaries. - Allow to opt-in the `unsafe` traits by providing parallel safe and unsafe `Matcher` traits or methods, with the one per default implemented in terms of the other. -- Turn `Pattern` into `Pattern` and `ReversePattern`, starting the forward-reverse split at the level of - patterns directly. The two would still be in a inherits-from relationship like - `Matcher` and `ReverseMatcher`, and be interchangeable if the later also implement `DoubleEndedMatcher`, - but on the `str` API `where` clauses like `where P: Pattern<'a>, P::MatcherImpl: ReverseMatcher<'a>` - would turn into `where P: ReversePattern<'a>`. # Unresolved questions @@ -360,6 +402,7 @@ The biggest wart is the lifetime parameter on the two trait families, so it trie In the first case, iterators like `Matches` and `RMatches` could both implement `DoubleEndedIterator` if a `DoubleEndedMatcher` exists, in the latter only `Matches` would, with `RMatches` only providing the minimum to support reverse operation. + A ruling in favor of symmetry would also speak for the `ReversePattern` alternative. # Additional extensions From d269a9c62540762c9961a7e3a1f8b309f5b2ab58 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Sun, 23 Nov 2014 22:01:26 -0800 Subject: [PATCH 0026/1195] RFC: Generic conversion traits --- text/0000-conversion-traits.md | 489 +++++++++++++++++++++++++++++++++ 1 file changed, 489 insertions(+) create mode 100644 text/0000-conversion-traits.md diff --git a/text/0000-conversion-traits.md b/text/0000-conversion-traits.md new file mode 100644 index 00000000000..98e3a8e8608 --- /dev/null +++ b/text/0000-conversion-traits.md @@ -0,0 +1,489 @@ +- Start Date: 2014-11-21 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +This RFC proposes several new *generic conversion* traits. 
The +motivation is to remove the need for ad hoc conversion traits (like +`FromStr`, `AsSlice`, `ToSocketAddr`, `FromError`) whose *sole role* +is for generics bounds. Aside from cutting down on trait +proliferation, centralizing these traits also helps the ecosystem +avoid incompatible ad hoc conversion traits defined downstream from +the types they convert to or from. It also future-proofs against +eventual language features for ergonomic conversion-based overloading. + +# Motivation + +The idea of generic conversion traits has come up from +[time](https://github.com/rust-lang/rust/issues/7080) +[to](http://discuss.rust-lang.org/t/pre-rfc-add-a-coerce-trait-to-get-rid-of-the-as-slice-calls/415) +[time](http://discuss.rust-lang.org/t/pre-rfc-remove-fromerror-trait-add-from-trait/783/3), +and now that multidispatch is available they can be made to work +reasonably well. They are worth considering due to the problems they +solve (given below), and considering *now* because they would obsolete +several ad hoc conversion traits (and several more that are in the +pipeline) for `std`. + +## Problem 1: overloading over conversions + +Rust does not currently support arbitrary, implicit conversions -- and +for some good reasons. However, it is sometimes important +ergonomically to allow a single function to be *explicitly* overloaded +based on conversions. + +For example, the +[recently proposed path APIs](https://github.com/rust-lang/rfcs/pull/474) +introduce an `AsPath` trait to make various path operations ergonomic: + +```rust +pub trait AsPath for Sized? { + fn as_path(&self) -> &Path; +} + +impl Path { + ... + + pub fn join(&self, path: &P) -> PathBuf { ... } +} +``` + +The idea in particular is that, given a path, you can join using a +string literal directly. 
That is: + +```rust +// write this: +let new_path = my_path.join("fixed_subdir_name"); + +// not this: +let new_path = my_path.join(Path::new("fixed_subdir_name")); +``` + +It's a shame to have to introduce new ad hoc traits every time such an +overloading is desired. And because the traits are ad hoc, it's also +not possible to program generically over conversions themselves. + +## Problem 2: duplicate, incompatible conversion traits + +There's a somewhat more subtle problem compounding the above: if the +author of the path API neglects to include traits like `AsPath` for +its core types, but downstream crates want to overload on those +conversions, those downstream crates may each introduce their own +conversion traits, which will not be compatible with one another. + +Having standard, generic conversion traits cuts down on the total +number of traits, and also ensures that all Rust libraries have an +agreed-upon way to talk about conversions. + +## Non-goals + +When considering the design of generic conversion traits, it's +tempting to try to do away will *all* ad hoc conversion methods. That +is, to replace methods like `to_string` and `to_vec` with a single +method `to::` and `to::>`. + +Unfortunately, this approach carries several ergonomic downsides: + +* The required `::< _ >` syntax is pretty unfriendly. Something like + `to` would be much better, but is unlikely to happen given + the current grammar. + +* Designing the traits to allow this usage is surprisingly subtle -- + it effectively requires *two traits* per type of generic conversion, + with blanket `impl`s mapping one to the other. Having such + complexity for *all conversions* in Rust seems like a non-starter. + +* Discoverability suffers somewhat. Looking through a method list and + seeing `to_string` is easier to comprehend (for newcomers + especially) than having to crawl through the `impl`s for a trait on + the side -- especially given the trait complexity mentioned above. 
+ +Nevertheless, this is a serious alternative that will be laid out in +more detail below, and merits community discussion. + +# Detailed design + +## Basic design + +The design is fairly simple, although perhaps not as simple as one +might expect: we introduce a total of *four* traits: + +```rust +trait As<Sized? T> for Sized? { + fn cvt_as(&self) -> &T; +} + +trait AsMut<Sized? T> for Sized? { + fn cvt_as_mut(&mut self) -> &mut T; +} + +trait To<T> for Sized? { + fn cvt_to(&self) -> T; +} + +trait Into<T> { + fn cvt_into(self) -> T; +} +``` + +These traits mirror our `as`/`to`/`into` conventions, but add a bit +more structure to them: `as`-style conversions are from references to +references, `to`-style conversions are from references to arbitrary +types, and `into`-style conversions are between arbitrary types +(consuming their argument). + +**Why the reference restrictions?** + +If all of the conversion traits were between arbitrary types, you +would have to use generalized where clauses and explicit lifetimes even for simple cases: + +```rust +// Possible alternative: +trait As<T> { + fn cvt_as(self) -> T; +} + +// But then you get this: +fn take_as<'a, T>(t: &'a T) where &'a T: As<&'a MyType>; + +// Instead of this: +fn take_as<T>(t: &T) where T: As<MyType>; +``` + +What's worse, if you need a conversion that works over any lifetime, +*there's no way to specify it*: you can't write something like + +```rust +... where for<'a> &'a T: As<&'a MyType> +``` + +This case is particularly important when you cannot name a lifetime in +advance, because it will be created on the stack within the +function. While such a `where` clause can likely be added in the +future, it's a bit of a gamble to pin conversion traits on it today. + +The proposed trait definition essentially *bakes in* the needed +lifetime connection, capturing the most common mode of use for +`as`/`to`/`into` conversions. In the future, an HKT-based version of +these traits could likely generalize further.
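To make the "baked in" lifetime connection concrete, here is a minimal sketch in present-day Rust syntax (`?Sized` bounds rather than the `Sized?` notation used in this RFC). The trait below is a hypothetical rendering of the proposed `As`, and `byte_len` and `make_and_measure` are illustrative names only. The bound on `byte_len` mentions no lifetime at all, yet it works for a value whose lifetime exists only inside the calling function:

```rust
// Hypothetical modern-syntax rendering of the proposed `As` trait.
// The `&self -> &T` signature ties the output lifetime to the input borrow.
trait As<T: ?Sized> {
    fn cvt_as(&self) -> &T;
}

impl As<str> for String {
    fn cvt_as(&self) -> &str {
        self.as_str()
    }
}

impl As<[u8]> for String {
    fn cvt_as(&self) -> &[u8] {
        self.as_bytes()
    }
}

// No explicit lifetime and no higher-ranked bound is needed here.
fn byte_len<T: As<[u8]> + ?Sized>(t: &T) -> usize {
    t.cvt_as().len()
}

fn make_and_measure() -> usize {
    // This value's lifetime cannot be named by any caller.
    let local = String::from("hello");
    byte_len(&local)
}

fn main() {
    println!("{}", make_and_measure());
}
```

With the by-value alternative shown above, the same `byte_len` would need a bound that quantifies over the borrow's lifetime.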
+ +**Why have multiple traits at all**? + +The biggest reason to have multiple traits is to take advantage of the +lifetime linking explained above. In addition, however, it is a basic +principle of Rust's libraries that conversions are distinguished by +cost and consumption, and having multiple traits makes it possible to +(by convention) restrict attention to e.g. "free" `as`-style conversions +by bounding only by `As`. + +## Blanket `impl`s + +Given the above trait design, there are a few straightforward blanket +`impl`s as one would expect: + +```rust +// As implies To +impl<'a, Sized? T, Sized? U> To<&'a U> for &'a T where T: As<U> { + fn cvt_to(&self) -> &'a U { + self.cvt_as() + } +} + +// To implies Into +impl<'a, T, U> Into<U> for &'a T where T: To<U> { + fn cvt_into(self) -> U { + self.cvt_to() + } +} + +// AsMut implies Into +impl<'a, T, U> Into<&'a mut U> for &'a mut T where T: AsMut<U> { + fn cvt_into(self) -> &'a mut U { + self.cvt_as_mut() + } +} +``` + +## An example + +Using all of the above, here are some example `impl`s and their use: + +```rust +impl As<str> for String { + fn cvt_as(&self) -> &str { + self.as_slice() + } +} +impl As<[u8]> for String { + fn cvt_as(&self) -> &[u8] { + self.as_bytes() + } +} + +impl Into<Vec<u8>> for String { + fn cvt_into(self) -> Vec<u8> { + self.into_bytes() + } +} + +fn main() { + let a = format!("hello"); + let b: &[u8] = a.cvt_as(); + let c: &str = a.cvt_as(); + let d: Vec<u8> = a.cvt_into(); +} +``` + +This use of generic conversions within a function body is expected to +be rare, however; usually the traits are used for generic functions: + +``` +impl Path { + fn join_path_inner(&self, p: &Path) -> PathBuf { ... } + + pub fn join_path<P: As<Path>>(&self, p: &P) -> PathBuf { + self.join_path_inner(p.cvt_as()) + } +} +``` + +In this very typical pattern, you introduce an "inner" function that +takes the converted value, and the public API is a thin wrapper around +that.
The main reason to do so is to avoid code bloat: given that the +generic bound is used only for a conversion that can be done up front, +there is no reason to monomorphize the entire function body for each +input type. + +### An aside: codifying the generics pattern in the language + +This pattern is so common that we probably want to consider sugar for +it, e.g. something like: + +```rust +impl Path { + pub fn join_path(&self, p: ~Path) -> PathBuf { + ... + } +} +``` + +that would desugar into exactly the above (assuming that the `~` sigil +was restricted to `As` conversions). Such a feature is out of scope +for this RFC, but it's a natural and highly ergonomic extension of the +traits being proposed here. + +## Preliminary conventions + +Would *all* conversion traits be replaced by the proposed ones? +Probably not, due to the combination of two factors: + +* You still want blanket `impl`s like `ToString` for `Show`, but: +* This RFC proposes that specific conversion *methods* like + `to_string` stay in common use. + +On the other hand, you'd expect a blanket `impl` of `To<String>` for +any `T: ToString`, and one should prefer bounding over `To<String>` +rather than `ToString` for consistency. Basically, the role of +`ToString` is just to provide the ad hoc method name `to_string` in a +blanket fashion. + +So a rough, preliminary convention would be the following: + +* An *ad hoc conversion method* is one following the normal convention + of `as_foo`, `to_foo`, `into_foo` or `from_foo`. A "generic" + conversion method is one going through the generic traits proposed + in this RFC. An *ad hoc conversion trait* is a trait providing an ad + hoc conversion method. + +* Use ad hoc conversion methods for "natural" conversions that should + have easy names and good discoverability. A conversion is "natural" + if you'd call it directly on the type in normal code; "unnatural" + conversions usually come from generic programming.
+ + For example, `to_string` is a natural conversion for `str`, while + `into_string` is not; but the latter is sometimes useful in a + generic context -- and that's what the generic conversion traits can + help with. + +* Introduce ad hoc conversion *traits* if you need to provide a + blanket `impl` of an ad hoc conversion method, or need special + functionality. For example, `to_string` needs a trait so that every + `Show` type automatically provides it. + +* For any ad hoc conversion method, *also* provide an `impl` of the + corresponding generic version; for traits, this should be done via a + blanket `impl`. + +* When using generics bounded over a conversion, always prefer to use + the generic conversion traits. For example, bound `S: To<String>` + not `S: ToString`. This encourages consistency, and also allows + clients to take advantage of the various blanket generic conversion + `impl`s. + +* Use the "inner function" pattern mentioned above to avoid code + bloat. + +# Drawbacks + +There are a few drawbacks to the design as proposed: + +* Since it does not replace all conversion traits, there's the + unfortunate case of having both a `ToString` trait and a + `To<String>` trait bound. The proposed conventions go some distance + toward at least keeping APIs consistent, but the redundancy is + unfortunate. See Alternatives for a more radical proposal. + +* It may encourage more overloading over coercions, and also more + generics code bloat (assuming that the "inner function" pattern + isn't followed). Coercion overloading is not necessarily a bad + thing, however, since it is still explicit in the signature rather + than wholly implicit. If we do go in this direction, we can consider + language extensions that make it ergonomic *and* avoid code bloat. + +# Alternatives + +The main alternative is one that attempts to provide methods that +*completely replace* ad hoc conversion methods.
To make this work, a +form of double dispatch is used, so that the methods are added to +*every type* but bounded by a separate set of conversion traits. + +In this strawman proposal, the name "view shift" is used for `as` +conversions, "conversion" for `to` conversions, and "transformation" +for `into` conversions. These names are not too important, but needed +to distinguish the various generic methods. + +The punchline is that, in the end, we can write + +```rust +let s = format!("hello"); +let b = s.shift_view::<[u8]>(); +``` + +or, put differently, replace `as_bytes` with `shift_view::<[u8]>` -- +for better or worse. + +In addition to the rather large jump in complexity, this alternative +design also suffers from poor error messages. For example, if you +accidentally typed `shift_view::` instead, you receive: + +``` +error: the trait `ShiftViewFrom` is not implemented for the type `u8` +``` + +which takes a bit of thought and familiarity with the traits to fully +digest. Taken together, the complexity, error messages, and poor +ergonomics of things like `convert::` rather than `as_bytes` led +the author to discard this alternative design. + +```rust +// VIEW SHIFTS + +// "Views" here are always lightweight, non-lossy, always +// successful view shifts between reference types + +// Immutable views + +trait ShiftViewFrom for Sized? { + fn shift_view_from(&T) -> &Self; +} + +trait ShiftView for Sized? { + fn shift_view(&self) -> &T where T: ShiftViewFrom; +} + +impl ShiftView for T { + fn shift_view>(&self) -> &U { + ShiftViewFrom::shift_view_from(self) + } +} + +// Mutable coercions + +trait ShiftViewFromMut for Sized? { + fn shift_view_from_mut(&mut T) -> &mut Self; +} + +trait ShiftViewMut for Sized? { + fn shift_view_mut(&mut self) -> &mut T where T: ShiftViewFromMut; +} + +impl ShiftViewMut for T { + fn shift_view_mut>(&mut self) -> &mut U { + ShiftViewFromMut::shift_view_from_mut(self) + } +} + +// CONVERSIONS + +trait ConvertFrom for Sized? 
{ + fn convert_from(&T) -> Self; +} + +trait Convert for Sized? { + fn convert(&self) -> T where T: ConvertFrom; +} + +impl Convert for T { + fn convert(&self) -> U where U: ConvertFrom { + ConvertFrom::convert_from(self) + } +} + +impl ConvertFrom for Vec { + fn convert_from(s: &str) -> Vec { + s.to_string().into_bytes() + } +} + +// TRANSFORMATION + +trait TransformFrom { + fn transform_from(T) -> Self; +} + +trait Transform { + fn transform(self) -> T where T: TransformFrom; +} + +impl Transform for T { + fn transform(self) -> U where U: TransformFrom { + TransformFrom::transform_from(self) + } +} + +impl TransformFrom for Vec { + fn transform_from(s: String) -> Vec { + s.into_bytes() + } +} + +impl<'a, T, U> TransformFrom<&'a T> for U where U: ConvertFrom { + fn transform_from(x: &'a T) -> U { + x.convert() + } +} + +impl<'a, T, U> TransformFrom<&'a mut T> for &'a mut U where U: ShiftViewFromMut { + fn transform_from(x: &'a mut T) -> &'a mut U { + ShiftViewFromMut::shift_view_from_mut(x) + } +} + +// Example + +impl ShiftViewFrom for str { + fn shift_view_from(s: &String) -> &str { + s.as_slice() + } +} +impl ShiftViewFrom for [u8] { + fn shift_view_from(s: &String) -> &[u8] { + s.as_bytes() + } +} + +fn main() { + let s = format!("hello"); + let b = s.shift_view::<[u8]>(); +} +``` From 4e0e8ed99b0ca7e53528f85bb061fdc2c2f4f577 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Tue, 16 Dec 2014 13:18:24 +0100 Subject: [PATCH 0027/1195] Fix char predicate impl using FnOnce --- text/0000-string-patterns.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-string-patterns.md b/text/0000-string-patterns.md index 2a31e0ec969..6d5ea88750e 100644 --- a/text/0000-string-patterns.md +++ b/text/0000-string-patterns.md @@ -103,7 +103,7 @@ impl<'a> Pattern<'a> for char { /* ... */ } impl<'a, 'b> Pattern<'a> for &'b str { /* ... */ } impl<'a, 'b> Pattern<'a> for &'b [char] { /* ... 
*/ } -impl<'a, F> Pattern<'a> for F where F: FnOnce(char) -> bool { /* ... */ } +impl<'a, F> Pattern<'a> for F where F: FnMut(char) -> bool { /* ... */ } impl<'a, 'b> Pattern<'a> for &'b Regex { /* ... */ } ``` From 33b9490f2cf6a8331d5c8a6eedb830cbf9273734 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Tue, 16 Dec 2014 13:23:06 +0100 Subject: [PATCH 0028/1195] spelling --- text/0000-string-patterns.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-string-patterns.md b/text/0000-string-patterns.md index 6d5ea88750e..c9e5535a209 100644 --- a/text/0000-string-patterns.md +++ b/text/0000-string-patterns.md @@ -277,7 +277,7 @@ changed to uniformly use the new pattern API. The main differences are: implementations, as the different results would break the requirement for double ended iterators to behave like a double ended queues where you just pop elements from both sides. -_However_, all iterators will still implement `DoubleEndedIterator` if the underling +_However_, all iterators will still implement `DoubleEndedIterator` if the underlying matcher implements `DoubleEndedMatcher`, to keep the ability to do things like `foo.split('a').rev()`. ## Transition and deprecation plans @@ -362,7 +362,7 @@ that would avoid it: back-comp generic impls. Next, there are alternatives that might make a positive difference in the authors opinion, but still have -some negative trade-of: +some negative trade-offs: - With the `Matcher` traits having the unsafe constraint of returning results unique to the current haystack already, they could just directly return a `(*const u8, *const u8)` pointing into it. 
From 343802e20e3a8f84ab002688cea8654b61c76d59 Mon Sep 17 00:00:00 2001
From: James Miller
Date: Fri, 9 Jan 2015 22:47:02 +1300
Subject: [PATCH 0029/1195] Remove ndebug variable

---
 text/0000-remove-ndebug.md | 45 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)
 create mode 100644 text/0000-remove-ndebug.md

diff --git a/text/0000-remove-ndebug.md b/text/0000-remove-ndebug.md
new file mode 100644
index 00000000000..10a733b006e
--- /dev/null
+++ b/text/0000-remove-ndebug.md
@@ -0,0 +1,45 @@
+- Start Date: (fill me in with today's date, YYYY-MM-DD)
+- RFC PR: (leave this empty)
+- Rust Issue: (leave this empty)
+
+# Summary
+
+Remove official support for the `ndebug` config variable, replace the current usage of it with a
+more appropriate 'debug_assertions` compiler-provided config variable.
+
+# Motivation
+
+The usage of 'ndebug' to indicate a release build is a strange holdover from C/C++. It is not used
+much and is easy to forget about. Since it used like any other value passed to the `cfg` flag, it
+does not interact with other flags such as `-g` or `-O`.
+
+The only current users of `ndebug` are the implementations of the `debug_assert!` macro. At the
+time of this writing integer overflow checking is will also be controlled by this variable. Since
+the optimisation setting does not influence `ndebug`, this means that code that the user expects to
+be optimised will still contain the overflow checking logic. Similarly, `debug_assert!` invocations
+are not removed, contrary to what intuition should expect. Enabling optimisations should been seen
+as a request to make the user's code faster, removing `debug_assert!` and other checks seems like
+a natural consequence.
+
+# Detailed design
+
+The `debug_assertions` variable, the replacement for the `ndebug` variable, will be compiler
+provided based on the value of the `opt-level` codegen flag, including the implied value from `-O`.
+Any value higher than 0 will disable the variable.
+ +Another codegen flag `debug-assertions` will override this, forcing it on or off based on the value +passed to it. + +# Drawbacks + +Technically backwards incompatible change. However the only usage of the `ndebug` variable in the +rust tree is in the implementation of `debug_assert!`, so it's unlikely that any external code is +using it. + +# Alternatives + +No real alternatives beyond different names and defaults. + +# Unresolved questions + +None. \ No newline at end of file From fe40a7fa3b15e4f207295b15e071f1dbe4a601d2 Mon Sep 17 00:00:00 2001 From: Richard Zhang Date: Mon, 12 Jan 2015 22:54:27 +0800 Subject: [PATCH 0030/1195] Amend RFC 544: Propose `isz/usz` as the suffixes. --- text/0544-rename-int-uint.md | 30 +++++++++++++++++++++++++++--- 1 file changed, 27 insertions(+), 3 deletions(-) diff --git a/text/0544-rename-int-uint.md b/text/0544-rename-int-uint.md index 3b7e76753d0..9eac9723d1a 100644 --- a/text/0544-rename-int-uint.md +++ b/text/0544-rename-int-uint.md @@ -46,10 +46,34 @@ However, given the discussions about the previous revisions of this RFC, and the # Detailed Design -- Rename `int/uint` to `isize/usize`, with `is/us` being their literal suffixes, respectively. +- Rename `int/uint` to `isize/usize`, with `isz/usz` being their literal suffixes, respectively. - Update code and documentation to use pointer-sized integers more narrowly for their intended purposes. Provide a deprecation period to carry out these updates. -Some would prefer using `isize/usize` directly as literal suffixes here, as `is/us` are actual words and maybe a bit *too* pleasant to use. But on the other hand, `42isize` can be too long for others. +There are different opinions about which literal suffixes to use. The following section would discuss the alternatives. + +## Choosing literal suffixes: + +### `isize/usize`: + +* Pros: They are the same as the type names, very consistent with the rest of the integer primitives. 
+* Cons: They are too long for some, and may stand out too much as suffixes. + +### `is/us`: + +* Pros: They are succinct as suffixes. +* Cons: They make an extra pair of reserved words which are actual English words, with `is` being a keyword in many programming languages and `us` being an abbreviation of "microsecond", which makes them confusing as suffixes, though technically there should be no ambiguities between "`is` the suffix" and "`is` the keyword with other use cases (in the future)". Also, `is/us` may be *too* short (shorter than `i64/u64`) and may be *too* pleasant to use, which can be a problem. + +### `isz/usz`: + +* Pros: They are the middle grounds between `isize/usize` and `is/us`, neither too long nor too short, and they are not actual English words. +* Cons: An extra pair of reserved words. + +### `iz/uz`: +* Pros and cons: Similar to those of `is/us`, except that `iz/uz` are not actual words, which is an additional advantage. However it may not be immediately clear that `iz/uz` are abbreviations of `isize/usize`. + +This author believes that `isz/usz` are the best choices here. + +(Note: Even if `is/us` don't get used as literal suffixes, it can be beneficial to reserve `is`, but this is outside the scope of this RFC.) `usize` in action: @@ -57,7 +81,7 @@ Some would prefer using `isize/usize` directly as literal suffixes here, as `is/ fn slice_or_fail<'b>(&'b self, from: &usize, to: &usize) -> &'b [T] ``` -See **Alternatives B to L** for the other alternatives that are rejected. +See **Alternatives B to L** for the alternatives to `isize/usize` that have been rejected. 
 ## Advantages of `isize/usize`:

From f74cc0d100b10667ab88df5437e1ca472a0a4481 Mon Sep 17 00:00:00 2001
From: Julian Orth
Date: Mon, 12 Jan 2015 16:49:11 +0100
Subject: [PATCH 0031/1195] drain_range

---
 text/0000-drain-range.md | 66 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)
 create mode 100644 text/0000-drain-range.md

diff --git a/text/0000-drain-range.md b/text/0000-drain-range.md
new file mode 100644
index 00000000000..13cfa1eadde
--- /dev/null
+++ b/text/0000-drain-range.md
@@ -0,0 +1,66 @@
+- Start Date: 2015-01-12
+- RFC PR #: (leave this empty)
+- Rust Issue #: (leave this empty)
+
+# Summary
+
+Replace `Vec::drain` by a method that accepts a range parameter.
+
+# Motivation
+
+Allowing a range parameter is strictly more powerful than the current version.
+E.g., see the following implementations of some `Vec` methods via the hypothetical
+`drain_range` method:
+
+```rust
+fn truncate(x: &mut Vec, len: usize) {
+    if len <= x.len() {
+        x.drain_range(len..);
+    }
+}
+
+fn remove(x: &mut Vec, index: usize) -> u8 {
+    x.drain_range(index).next().unwrap()
+}
+
+fn pop(x: &mut Vec) -> Option {
+    match x.len() {
+        0 => None,
+        n => x.drain_range(n-1).next()
+    }
+}
+
+fn drain(x: &mut Vec) -> DrainRange {
+    x.drain_range(0..)
+}
+
+fn clear(x: &mut Vec) {
+    x.drain_range(0..);
+}
+```
+
+With optimization enabled, those methods will produce code that runs as fast
+as the current versions. (They should not be implemented this way.)
+
+In particular, this method allows the user to remove a slice from a vector in
+`O(Vec::len)` instead of `O(Slice::len * Vec::len)`.
+
+# Detailed design
+
+Remove `Vec::drain` and add the following method:
+
+```rust
+/// Creates a draining iterator that clears the specified range in the Vec and
+/// iterates over the removed items from start to end.
+///
+/// # Panics
+///
+/// Panics if the range is decreasing or if the upper bound is larger than the
+/// length of the vector.
+pub fn drain(&mut self, range: T) -> RangeIter { + range.drain(self) +} +``` + +Where `Drainer` should be implemented for `Range`, `RangeTo`, +`RangeFrom`, `FullRange`, and `usize`. From c3e4beb4758f8808ba3d84e3059829781687a8b2 Mon Sep 17 00:00:00 2001 From: Richard Zhang Date: Tue, 13 Jan 2015 19:21:06 +0800 Subject: [PATCH 0032/1195] Various adjustments and discussions about `i/u`. --- text/0544-rename-int-uint.md | 32 ++++++++++++++++++-------------- 1 file changed, 18 insertions(+), 14 deletions(-) diff --git a/text/0544-rename-int-uint.md b/text/0544-rename-int-uint.md index 9eac9723d1a..77f883960f4 100644 --- a/text/0544-rename-int-uint.md +++ b/text/0544-rename-int-uint.md @@ -49,6 +49,12 @@ However, given the discussions about the previous revisions of this RFC, and the - Rename `int/uint` to `isize/usize`, with `isz/usz` being their literal suffixes, respectively. - Update code and documentation to use pointer-sized integers more narrowly for their intended purposes. Provide a deprecation period to carry out these updates. +`usize` in action: + +```rust +fn slice_or_fail<'b>(&'b self, from: &usize, to: &usize) -> &'b [T] +``` + There are different opinions about which literal suffixes to use. The following section would discuss the alternatives. ## Choosing literal suffixes: @@ -61,27 +67,23 @@ There are different opinions about which literal suffixes to use. The following ### `is/us`: * Pros: They are succinct as suffixes. -* Cons: They make an extra pair of reserved words which are actual English words, with `is` being a keyword in many programming languages and `us` being an abbreviation of "microsecond", which makes them confusing as suffixes, though technically there should be no ambiguities between "`is` the suffix" and "`is` the keyword with other use cases (in the future)". Also, `is/us` may be *too* short (shorter than `i64/u64`) and may be *too* pleasant to use, which can be a problem. 
- -### `isz/usz`: +* Cons: They are actual English words, with `is` being a keyword in many programming languages and `us` being an abbreviation of "unsigned" (losing information) or "microsecond" (misleading). Also, `is/us` may be *too* short (shorter than `i64/u64`) and *too* pleasant to use, which can be a problem. -* Pros: They are the middle grounds between `isize/usize` and `is/us`, neither too long nor too short, and they are not actual English words. -* Cons: An extra pair of reserved words. +Note: No matter which suffixes get chosen, it can be beneficial to reserve `is` as a keyword, but this is outside the scope of this RFC. ### `iz/uz`: * Pros and cons: Similar to those of `is/us`, except that `iz/uz` are not actual words, which is an additional advantage. However it may not be immediately clear that `iz/uz` are abbreviations of `isize/usize`. -This author believes that `isz/usz` are the best choices here. - -(Note: Even if `is/us` don't get used as literal suffixes, it can be beneficial to reserve `is`, but this is outside the scope of this RFC.) +### `i/u`: +* Pros: They are very succinct. +* Cons: They are *too* succinct and carry the "default integer types" connotation, which is undesirable. -`usize` in action: +### `isz/usz`: -```rust -fn slice_or_fail<'b>(&'b self, from: &usize, to: &usize) -> &'b [T] -``` +* Pros: They are the middle grounds between `isize/usize` and `is/us`, neither too long nor too short. They are not actual English words and it's clear that they are short for `isize/usize`. +* Cons: Not everyone likes the appearances of `isz/usz`, but this can be said about all the candidates. -See **Alternatives B to L** for the alternatives to `isize/usize` that have been rejected. +Thus, this author believes that `isz/usz` are the best choices here. 
## Advantages of `isize/usize`: @@ -90,6 +92,8 @@ See **Alternatives B to L** for the alternatives to `isize/usize` that have been - The names are newcomer friendly and have familiarity advantage over almost all other alternatives. - The names are easy on the eyes. +See **Alternatives B to L** for the alternatives to `isize/usize` that have been rejected. + # Drawbacks ## Drawbacks of the renaming in general: @@ -106,7 +110,7 @@ Familiarity is a double edged sword here. `isize/usize` are chosen not because t # Alternatives -## A. Keep the status quo. +## A. Keep the status quo: Which may hurt in the long run, especially when there is at least one (would-be?) high-profile language (which is Rust-inspired) taking the opposite stance of Rust. From 4a4540f65c5a68b62c37e7b308c726a3e31d490f Mon Sep 17 00:00:00 2001 From: Richard Zhang Date: Tue, 13 Jan 2015 21:54:57 +0800 Subject: [PATCH 0033/1195] RFC: Rename `BinaryHeap` to `BinHeap`. --- text/0000-rename-binary-heap.md | 74 +++++++++++++++++++++++++++++++++ 1 file changed, 74 insertions(+) create mode 100644 text/0000-rename-binary-heap.md diff --git a/text/0000-rename-binary-heap.md b/text/0000-rename-binary-heap.md new file mode 100644 index 00000000000..7c33cf058dc --- /dev/null +++ b/text/0000-rename-binary-heap.md @@ -0,0 +1,74 @@ +- Start Date: 2015-01-13 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Rename the standard correction `BinaryHeap` to `BinHeap`, in order to follow the existing naming convention. + +# Motivation + +In [this comment](http://www.reddit.com/r/programming/comments/2rvoha/announcing_rust_100_alpha/cnk31hf) in the Rust 1.0.0-alpha announcement thread in /r/programming, it was pointed out that Rust's std collections had inconsistent names. Particularly, the abbreviation rules of the names seemed unclear. 
+
+The current collection names (and their longer versions) are:
+
+* `Vec` -> `Vector`
+* `BTreeMap`
+* `BTreeSet`
+* `BinaryHeap`
+* `Bitv` -> `BitVec` -> `BitVector`
+* `BitvSet` -> `BitVecSet` -> `BitVectorSet`
+* `DList` -> `DoublyLinkedList`
+* `HashMap`
+* `HashSet`
+* `RingBuf` -> `RingBuffer`
+* `VecMap` -> `VectorMap`
+
+The abbreviation rules do seem unclear. Sometimes the first word is abbreviated, sometimes the last. However there are also cases where the names are not abbreviated. Such inconsistency is undesirable, as Rust should not give an impression as "the promising language that has strangely inconsistent naming conventions for its standard collections".
+
+# Detailed design
+
+An observation:
+
+Given the current names, Rust actually *do* have consistent name abbreviation rules that are generally followed by its standard collections:
+
+- Each word in the names must be shorter than five letters, or they should be abbreviated.
+- When choosing abbreviations for each overly-long word, prefer commonly used ones.
+
+There are four names worth mentioning: `Bitv`, `BitvSet`, `DList` and `BinaryHeap`.
+
+- `Bitv`: This can be seen as short for `Bitvector`, not `BitVector`, so it actually confirms to the rules.
+- `BitvSet`: Ditto.
+- `DList`: This should be either `DList`, or `DoublyLinkedList`, as all the "middle grounds" feel unnatural. (DLList? DblList? DoublyList? DLinkList? ...) We don't want the full one because it is too long (and more importantly, we already use other abbreviated names), so `DList` is the best choice here.
+- `BinaryHeap`: It seems that `BinHeap` can be a better name here, no reason to violate the rules.
+
+Thus, this RFC proposes the following change:
+
+**Rename `BinaryHeap` to `BinHeap`.**
+
+# Drawbacks
+
+- This is A breaking change to a standard collection that is already marked `stable`.
+- There is no guarantee that all future additions to the standard collections will have names that look pretty under these abbreviation rules.
+
+However `BinaryHeap` is only one collection, and a deprecation period can be provided if necessary, so the first drawback may not be a serious problem. Regarding the second one, this change at least isn't worse than the status quo.
+
+# Alternatives
+
+## A. Keep the status quo:
+
+And Rust will have no consistent abbreviation rules for its standard collections' names.
+
+## B. Rename all collections to their full names:
+
+This will ensure maximum consistency, both now and in the future. However, a breaking change at this scale is undesirable at this stage, and `Vec` is so frequently used that it deserves an abbreviation. Then, if one collection has an abbreviated name, it is only natural for others to also have such names.
+
+## C. Also rename `Bitv` to `BitVec`, and `BitvSet` to `BitVecSet`:
+
+Some may argue that `BitVector` is the more common spelling, not `Bitvector`, so `Bitv` is not as good as `BitVec`. Therefore, `Bitv` and `BitvSet` should also be renamed.
+
+The drawback: this alternative means more breaking changes than only renaming `BinaryHeap`.
+
+# Unresolved questions
+
+None.
From abe5cb74a3a22cb882a582c167119444d72bea47 Mon Sep 17 00:00:00 2001
From: Richard Zhang
Date: Tue, 13 Jan 2015 22:05:16 +0800
Subject: [PATCH 0034/1195] Typo: "correction" -> "collection".

---
 text/0000-rename-binary-heap.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/0000-rename-binary-heap.md b/text/0000-rename-binary-heap.md
index 7c33cf058dc..d032beb2f54 100644
--- a/text/0000-rename-binary-heap.md
+++ b/text/0000-rename-binary-heap.md
@@ -4,7 +4,7 @@

 # Summary

-Rename the standard correction `BinaryHeap` to `BinHeap`, in order to follow the existing naming convention.
+Rename the standard collection `BinaryHeap` to `BinHeap`, in order to follow the existing naming convention.
# Motivation From d373162d53cf6b6e59fd68ae77719d9f641fcb04 Mon Sep 17 00:00:00 2001 From: Richard Zhang Date: Wed, 14 Jan 2015 21:59:31 +0800 Subject: [PATCH 0035/1195] More discussions and general refinement. --- text/0000-rename-binary-heap.md | 35 ++++++++++++++++++++++----------- 1 file changed, 24 insertions(+), 11 deletions(-) diff --git a/text/0000-rename-binary-heap.md b/text/0000-rename-binary-heap.md index d032beb2f54..0345cc7e5e8 100644 --- a/text/0000-rename-binary-heap.md +++ b/text/0000-rename-binary-heap.md @@ -4,7 +4,7 @@ # Summary -Rename the standard collection `BinaryHeap` to `BinHeap`, in order to follow the existing naming convention. +Rename (maybe one of) the standard collections, so as to make the names more consistent. Currently, among all the alternatives, renaming `BinaryHeap` to `BinHeap` is the slightly preferred solution. # Motivation @@ -42,32 +42,45 @@ There are four names worth mentioning: `Bitv`, `BitvSet`, `DList` and `BinaryHea - `DList`: This should be either `DList`, or `DoublyLinkedList`, as all the "middle grounds" feel unnatural. (DLList? DblList? DoublyList? DLinkList? ...) We don't want the full one because it is too long (and more importantly, we already use other abbreviated names), so `DList` is the best choice here. - `BinaryHeap`: It seems that `BinHeap` can be a better name here, no reason to violate the rules. -Thus, this RFC proposes the following change: +Thus, this RFC proposes the following changes: -**Rename `BinaryHeap` to `BinHeap`.** +- Rename `std::collections::binary_heap::BinaryHeap` to `std::collections::bin_heap::BinHeap`. Change affected codes accordingly. +- If necessary, redefine `BinaryHeap` as an alias of `BinHeap` and mark it as deprecated. After a transition period, remove `BinaryHeap` completely. # Drawbacks - This is A breaking change to a standard collection that is already marked `stable`. 
-- There is no guarantee that all future additions to the standard collections will have names that look pretty under these abbreviation rules. +- `DList` is left unchanged, but far from ideal. It just doesn't say much about what the type actually is. +- Future additions to the standard collections may be like `DList`/`DoublyLinkedList` in that no ideal abbreviations can be found. Such additions *will* make the standard collections' names less consistent. -However `BinaryHeap` is only one collection, and a deprecation period can be provided if necessary, so the first drawback may not be a serious problem. Regarding the second one, this change at least isn't worse than the status quo. +This solution can bring *some* consistency to the collections' names, but doing so may be sweeping the real problem under the rug. Still it is better than the status quo and requires the least amount of breaking changes. # Alternatives ## A. Keep the status quo: -And Rust will have no consistent abbreviation rules for its standard collections' names. +And Rust's standard collections will have no consistent name abbreviation rules. `DList` can be excused for being an exception (if only because all the alternatives are worse), but `BinaryHeap` cannot. -## B. Rename all collections to their full names: +## B. Rename all collections with abbreviated names to their full names: -This will ensure maximum consistency, both now and in the future. However, a breaking change at this scale is undesirable at this stage, and `Vec` is so frequently used that it deserves an abbreviation. Then, if one collection has an abbreviated name, it is only natural for others to also have such names. +This will ensure maximum consistency, both now and in the future. As the referenced reddit comment (and discussions about this RFC) indicates, *Many* believe this to be the optimal solution. -## C. 
Also rename `Bitv` to `BitVec`, and `BitvSet` to `BitVecSet`: +However: -Some may argue that `BitVector` is the more common spelling, not `Bitvector`, so `Bitv` is not as good as `BitVec`. Therefore, `Bitv` and `BitvSet` should also be renamed. +- A breaking change at this scale is undesirable at this stage. +- `Vec` is so frequently used that it deserves an abbreviation. +- If one collection has an abbreviated name, it is only natural for others to also have such names. +- Most abbreviated names are clear, `DList` is the exception, not the rule. -The drawback: this alternative means more breaking changes than only renaming `BinaryHeap`. +Still, using full and consistent names may be the right choice in the long run, especially considering that people tend to follow the naming conventions of the standard library, and it's very likely that there will be future additions to the standard collections, which may or may not have "abbreviation-friendly" names. + +Also, if abbreviated names are truly needed, one can always write `type`. `Option` is not called `Opt` after all. Some may also argue that modern editors/IDEs make longer names less of an issue. + +## C. Rename `BinaryHeap`, and also `Bitv` to `BitVec`, `BitvSet` to `BitVecSet`: + +Some may argue that `BitVector` is the more common spelling, not `Bitvector`, so `Bitv` is not as good as `BitVec`. Therefore, `Bitv` and `BitvSet` should also be renamed alongside `BinaryHeap`. + +The pros and cons of this alternative is similar to only renaming `BinaryHeap`, but with more conventional names and more breaking changes. # Unresolved questions From 5e5f2803abd7a0b1e84ae855c5a8ac70b89a96d5 Mon Sep 17 00:00:00 2001 From: Richard Zhang Date: Thu, 15 Jan 2015 22:43:09 +0800 Subject: [PATCH 0036/1195] Major revision. Based on the discussions, the preferred solution is changed, and this RFC is hopefully more coherent now. 
--- text/0000-rename-binary-heap.md | 77 +++++++++++++++++++-------------- 1 file changed, 45 insertions(+), 32 deletions(-) diff --git a/text/0000-rename-binary-heap.md b/text/0000-rename-binary-heap.md index 0345cc7e5e8..c042cd4504b 100644 --- a/text/0000-rename-binary-heap.md +++ b/text/0000-rename-binary-heap.md @@ -24,63 +24,76 @@ The current collection names (and their longer versions) are: * `RingBuf` -> `RingBuffer` * `VecMap` -> `VectorMap` -The abbreviation rules do seem unclear. Sometimes the first word is abbreviated, sometimes the last. However there are also cases where the names are not abbreviated. Such inconsistency is undesirable, as Rust should not give an impression as "the promising language that has strangely inconsistent naming conventions for its standard collections". +The abbreviation rules do seem unclear. Sometimes the first word is abbreviated, sometimes the last. However there are also cases where the names are not abbreviated. `Bitv`, `BitvSet` and `DList` seem strange on first glance. Such inconsistencies are undesirable, as Rust should not give an impression as "the promising language that has strangely inconsistent naming conventions for its standard collections". # Detailed design -An observation: +First some general naming rules should be established. -Given the current names, Rust actually *do* have consistent name abbreviation rules that are generally followed by its standard collections: +1. Prefer commonly used names. +2. Prefer full names when full names and abbreviated names are almost equally elegant. -- Each word in the names must be shorter than five letters, or they should be abbreviated. -- When choosing abbreviations for each overly-long word, prefer commonly used ones. +And the new names: -There are four names worth mentioning: `Bitv`, `BitvSet`, `DList` and `BinaryHeap`. 
+* `Vec` +* `BTreeMap` +* `BTreeSet` +* `BinaryHeap` +* `Bitv` -> `BitVec` +* `BitvSet` -> `BitSet` +* `DList` -> `LinkedList` +* `HashMap` +* `HashSet` +* `RingBuf` -> `RingBuffer` +* `VecMap` -- `Bitv`: This can be seen as short for `Bitvector`, not `BitVector`, so it actually confirms to the rules. -- `BitvSet`: Ditto. -- `DList`: This should be either `DList`, or `DoublyLinkedList`, as all the "middle grounds" feel unnatural. (DLList? DblList? DoublyList? DLinkList? ...) We don't want the full one because it is too long (and more importantly, we already use other abbreviated names), so `DList` is the best choice here. -- `BinaryHeap`: It seems that `BinHeap` can be a better name here, no reason to violate the rules. +The following changes should be made: -Thus, this RFC proposes the following changes: +- Rename `Bitv`, `BitvSet`, `DList` and `RingBuf`. Change affected codes accordingly. +- If necessary, redefine the original names as aliases of the new names, and mark them as deprecated. After a transition period, remove the original names completely. -- Rename `std::collections::binary_heap::BinaryHeap` to `std::collections::bin_heap::BinHeap`. Change affected codes accordingly. -- If necessary, redefine `BinaryHeap` as an alias of `BinHeap` and mark it as deprecated. After a transition period, remove `BinaryHeap` completely. +## Why prefer full names when full names and abbreviated ones are almost equally elegant? -# Drawbacks +The naming rules should apply not only to standard collections, but also to other codes. It is (comparatively) easier to maintain a higher level of naming consistency by preferring full names to abbreviated ones *when in doubt*. Because given a full name, there are possibly many abbreviated forms to choose from. Which should be chosen and why? It is hard to write down guideline for that. -- This is A breaking change to a standard collection that is already marked `stable`. -- `DList` is left unchanged, but far from ideal. 
It just doesn't say much about what the type actually is. -- Future additions to the standard collections may be like `DList`/`DoublyLinkedList` in that no ideal abbreviations can be found. Such additions *will* make the standard collections' names less consistent. +For example, a name `BinaryBuffer` have at least three convincing abbreviated forms: `BinBuffer`/`BinaryBuf`/`BinBuf`. Which one would be the most preferred? Hard to say. But it is clear that the full name `BinaryBuffer` is not a bad name. -This solution can bring *some* consistency to the collections' names, but doing so may be sweeping the real problem under the rug. Still it is better than the status quo and requires the least amount of breaking changes. +However, if there *is* a convincing reason, one should not hesitate using abbreviated names. A series of names like `BinBuffer/OctBuffer/HexBuffer` is very natural. Also, few would think the full name of `Arc` is a good type name. + +## Advantages of the new names: + +- `Vec`: The name of the most frequently used Rust collection is left unchanged (and by extension `VecMap`), so the scope of the changes are greatly reduced. `Vec` is an exception to the rule because it is *the* collection in Rust. +- `BitVec`: `Bitv` is a very unusual abbreviation of `BitVector`, but `BitVec` is a good one given `Vector` is shortened to `Vec`. +- `BitSet`: Technically, `BitSet` is a synonym of `BitVec(tor)`, but it has `Set` in its name and can be interpreted as a set-like "view" into the underlying bit array/vector, so `BitSet` is a good name. No need to have an additional `v`. +- `LinkedList`: `DList` doesn't say much about what it actually is. `LinkedList` is not too long (like `DoublyLinkedList`) and it being a doubly-linked list follows Java/C#'s traditions. +- `RingBuffer`: `RingBuf` is a good name, but `RingBuffer` is good too. No reason to violate the rule here. 
+ +# Drawbacks + +- Preferring full names may result in people naming things with overly-long names that are hard to write and more importantly, read. +- There will be breaking changes to standard collections that are already marked `stable`. # Alternatives ## A. Keep the status quo: -And Rust's standard collections will have no consistent name abbreviation rules. `DList` can be excused for being an exception (if only because all the alternatives are worse), but `BinaryHeap` cannot. - -## B. Rename all collections with abbreviated names to their full names: +And Rust's standard collections will have some strange names and no consistent naming rules. -This will ensure maximum consistency, both now and in the future. As the referenced reddit comment (and discussions about this RFC) indicates, *Many* believe this to be the optimal solution. +## B. Also rename `Vec` to `Vector`: -However: +And by extension, `Bitv` to `BitVector` and `VecMap` to `VectorMap`. -- A breaking change at this scale is undesirable at this stage. -- `Vec` is so frequently used that it deserves an abbreviation. -- If one collection has an abbreviated name, it is only natural for others to also have such names. -- Most abbreviated names are clear, `DList` is the exception, not the rule. +This means breaking changes at a much larger scale. Undesirable at this stage. -Still, using full and consistent names may be the right choice in the long run, especially considering that people tend to follow the naming conventions of the standard library, and it's very likely that there will be future additions to the standard collections, which may or may not have "abbreviation-friendly" names. +## C. Rename `DList` to `DLinkedList`, not `LinkedList`: -Also, if abbreviated names are truly needed, one can always write `type`. `Option` is not called `Opt` after all. Some may also argue that modern editors/IDEs make longer names less of an issue. 
+It is clearer, but also inconsistent with the other names by having a single-lettered abbreviation of `Doubly`. As Java/C# also have doubly-linked `LinkedList`, it is not necessary to use the additional `D`. -## C. Rename `BinaryHeap`, and also `Bitv` to `BitVec`, `BitvSet` to `BitVecSet`: +## D. Instead of renaming `RingBuf` to `RingBuffer`, rename `BinaryHeap` to `BinHeap`. -Some may argue that `BitVector` is the more common spelling, not `Bitvector`, so `Bitv` is not as good as `BitVec`. Therefore, `Bitv` and `BitvSet` should also be renamed alongside `BinaryHeap`. +Or, reversing the second rule: prefer abbreviated names to full ones when in doubt. -The pros and cons of this alternative is similar to only renaming `BinaryHeap`, but with more conventional names and more breaking changes. +This has the advantage of encouraging succinct names, but everyone has his/her own preferences of how to abbreviate things. Naming consistency will suffer. Whether this is a problem is also a quite subjective matter. # Unresolved questions From 788dee011eaedbc5b2c917174b3a027d29596ce1 Mon Sep 17 00:00:00 2001 From: Richard Zhang Date: Thu, 15 Jan 2015 22:46:22 +0800 Subject: [PATCH 0037/1195] Rename the RFC to better reflect the contents. 
--- text/{0000-rename-binary-heap.md => 0000-rename-collections.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename text/{0000-rename-binary-heap.md => 0000-rename-collections.md} (100%) diff --git a/text/0000-rename-binary-heap.md b/text/0000-rename-collections.md similarity index 100% rename from text/0000-rename-binary-heap.md rename to text/0000-rename-collections.md From 4caece6ddf9d3eba916a320a088eface2c56747e Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Sun, 18 Jan 2015 10:28:44 +0200 Subject: [PATCH 0038/1195] Added RFC: dereferenced complement to CString --- text/0000-c-str-deref.md | 130 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 130 insertions(+) create mode 100644 text/0000-c-str-deref.md diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md new file mode 100644 index 00000000000..daad5b816b0 --- /dev/null +++ b/text/0000-c-str-deref.md @@ -0,0 +1,130 @@ +- Start Date: 2015-01-17 +- RFC PR: +- Rust Issue: + +# Summary + +Make `CString` dereference to a token type `CStr`, which designates +null-terminated string data. + +```rust +// Type-checked to only accept C strings +fn safe_puts(s: &CStr) { + unsafe { libc::puts(s.as_ptr()) }; +} + +fn main() { + safe_puts(c_str!("Look ma, a `&'static CStr` from a literal!")); +} +``` + +# Motivation + +The type `std::ffi::CString` is used to prepare string data for passing +as null-terminated strings to FFI functions. This type dereferences to a +DST, `[libc::c_char]`. The DST, however, is a poor choice for representing +borrowed C string data, since: + +1. The slice does not enforce the C string invariant at compile time. + Safe interfaces wrapping FFI functions cannot take slice references as is + without dynamic checks (when null-terminated slices are expected) or + building a temporary `CString` internally (in this case plain Rust slices + must be passed with no interior NULs). 
`CString`, for its part, is an + owning container and is not convenient for passing by reference. A string + literal, for example, would require a `CString` constructed from it at + runtime to pass into a function expecting `&CString`. +2. The primary consumers of the borrowed pointers, FFI functions, do not care + about the 'sized' aspect of the DST. The borrowed reference is + therefore needlessly 'fat' for its primary purpose. + +As a pattern of owned/borrowed type pairs has been established +throughout other modules (see e.g. +[path reform](https://github.com/rust-lang/rfcs/pull/474)), +it makes sense that `CString` gets its own borrowed counterpart. + +# Detailed design + +## CStr, an Irrelevantly Sized Type + +This proposal introduces `CStr`, a token type to designate a null-terminated +string. This type does not implement `Copy` or `Clone` and is only used in +borrowed references. `CStr` is sized, but its size and layout are of no +consequence to its users. It's only safely obtained by dereferencing +`CString` and a few other helper methods, described below. + +```rust +#[repr(C)] +pub struct CStr { + head: libc::c_char, + marker: std::marker::NoCopy +} + +impl CStr { + pub fn as_ptr(&self) -> *const libc::c_char { + &self.head as *const libc::c_char + } +} + +impl Deref for CString { + type Target = CStr; + fn deref(&self) -> &CStr { ... } +} +``` + +## Static C strings + +A way to create static references asserted as null-terminated strings is +provided by a couple of functions: + +```rust +fn static_c_str_from_bytes(bytes: &'static [u8]) -> &'static CStr +``` +```rust +fn static_c_str_from_str(s: &'static str) -> &'static CStr +``` + +## c_str! + +For added convenience in passing literal string data to FFI functions, +a macro is provided that appends a literal with `"\0"` and returns it +as `&'static CStr`: +```rust +#[macro_export] +macro_rules! 
c_str { + ($lit:expr) => { + $crate::ffi::static_c_str_from_str(concat!($lit, "\0")) + } +} +``` +Going forward, it would be good to make `c_str!` also accept byte strings +on input, through a [byte string concatenation +macro](https://github.com/rust-lang/rfcs/pull/566). Ultimately, it could be +made workable in static expressions through a compiler plugin. + +## Proof of concept + +The described additions are implemented in crate +[c_string](https://github.com/mzabaluev/rust-c-str/tree/v0.3.0). + +# Drawbacks + +The change of the deref type is another breaking change to `CString`. +In practice the main purpose of borrowing from `CString` is to obtain a +raw pointer with `.as_ptr()`; for code which only does this and does not +expose the slice in type annotations, parameter signatures and so on, +the change should not be breaking since `CStr` also provides +this method. + +# Alternatives + +The users of Rust can turn to third-party libraries for better convenience +and safety when working with C strings. This can result in proliferation of +incompatible helper types in public APIs until a dominant de-facto solution +is established. + +# Unresolved questions + +There is room for a helper type wrapping an allocated C string with a supplied +deallocation function to invoke when dropped. That type should also dereference +to `CStr`. My library crate [c_string](https://crates.io/crates/c_string) +provides an example in `OwnedCString`. 
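For readers who want to try the owned/borrowed split described in this RFC draft, here is a minimal sketch using the `std::ffi::CStr` type found in today's standard library. It is an assumption-laden illustration, not the API proposed in this patch: `CStr::from_bytes_with_nul` stands in for the proposed static-data helpers, and the helper names `c_len`, `owned_len` and `literal_len` are hypothetical.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Accepts only C strings; measures the string through its raw pointer,
/// roughly as a C function such as `strlen` would.
fn c_len(s: &CStr) -> usize {
    let ptr: *const c_char = s.as_ptr();
    // SAFETY: `ptr` was just obtained from a live `&CStr`, so it points to
    // NUL-terminated data that outlives this call.
    let reborrowed = unsafe { CStr::from_ptr(ptr) };
    reborrowed.to_bytes().len()
}

/// Owned data: `&CString` coerces to `&CStr` through `Deref`.
fn owned_len(text: &str) -> usize {
    let owned = CString::new(text).expect("input must not contain interior NULs");
    c_len(&owned)
}

/// Literal data: no allocation, only a check that the bytes end in NUL.
fn literal_len() -> usize {
    let lit = CStr::from_bytes_with_nul(b"hello\0").expect("literal ends in NUL");
    c_len(lit)
}
```

Calling `owned_len("abc")` and `literal_len()` exercises both ways of obtaining a `&CStr`; the function taking `&CStr` does not need to know which one was used.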
From ce1668e376083bd64bdd1e5ec5a05efdf56fbd1d Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Sun, 18 Jan 2015 10:49:44 +0200 Subject: [PATCH 0039/1195] Added a question regarding `Cow`s --- text/0000-c-str-deref.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index daad5b816b0..9c2419319a1 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -128,3 +128,5 @@ There is room for a helper type wrapping an allocated C string with a supplied deallocation function to invoke when dropped. That type should also dereference to `CStr`. My library crate [c_string](https://crates.io/crates/c_string) provides an example in `OwnedCString`. + +Need a `Cow`? From b04c835824cc44f1b836304847289bc862556b8e Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Sun, 18 Jan 2015 11:05:46 +0200 Subject: [PATCH 0040/1195] A small proofreading fix --- text/0000-c-str-deref.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index 9c2419319a1..3f184458779 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -103,7 +103,7 @@ made workable in static expressions through a compiler plugin. ## Proof of concept -The described additions are implemented in crate +The described changes are implemented in crate [c_string](https://github.com/mzabaluev/rust-c-str/tree/v0.3.0). 
# Drawbacks From d8cfda0c25188195723d7fa07c125e3557ac412a Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Mon, 19 Jan 2015 10:20:53 +0200 Subject: [PATCH 0041/1195] Explained the assertion policy on CStr-from-static-data --- text/0000-c-str-deref.md | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index 3f184458779..86ec39955c3 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -73,8 +73,8 @@ impl Deref for CString { ## Static C strings -A way to create static references asserted as null-terminated strings is -provided by a couple of functions: +A way to create `CStr` references from static Rust expressions asserted as +null-terminated string or byte slices is provided by a couple of functions: ```rust fn static_c_str_from_bytes(bytes: &'static [u8]) -> &'static CStr @@ -83,6 +83,13 @@ fn static_c_str_from_bytes(bytes: &'static [u8]) -> &'static CStr fn static_c_str_from_str(s: &'static str) -> &'static CStr ``` +As these functions mostly work with literals, they only assert that the +slice is terminated by a zero byte. It's the responsibility of the programmer +to ensure that the static data does not contain any unintended interior NULs +(the program will not crash, but the string will be interpreted up to the +first `'\0'` encountered). For non-literal data, `CStrBuf::from_bytes` or +`CStrBuf::from_vec` should be preferred. + ## c_str! For added convenience in passing literal string data to FFI functions, From e6a9bbdfb014fa6ff8c7af1caceb47a45f09c84c Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Mon, 19 Jan 2015 10:23:56 +0200 Subject: [PATCH 0042/1195] Put the static data helpers into the CStr impl NB: std::ffi is annoyingly vague, perhaps std::ffi::c_str would not need disambiguation on function names. 
--- text/0000-c-str-deref.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index 86ec39955c3..c765e3d03d0 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -77,10 +77,10 @@ A way to create `CStr` references from static Rust expressions asserted as null-terminated string or byte slices is provided by a couple of functions: ```rust -fn static_c_str_from_bytes(bytes: &'static [u8]) -> &'static CStr -``` -```rust -fn static_c_str_from_str(s: &'static str) -> &'static CStr +impl CStr { + pub fn from_static_bytes(bytes: &'static [u8]) -> &'static CStr { ... } + pub fn from_static_str(s: &'static str) -> &'static CStr { ... } +} ``` As these functions mostly work with literals, they only assert that the From c9cf40cda7f6fc5017f5e2661b7599d0fa17205a Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Mon, 19 Jan 2015 10:47:34 +0200 Subject: [PATCH 0043/1195] Added a section detailing API for returning C string references --- text/0000-c-str-deref.md | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index c765e3d03d0..e4276b5e4de 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -108,6 +108,34 @@ on input, through a [byte string concatenation macro](https://github.com/rust-lang/rfcs/pull/566). Ultimately, it could be made workable in static expressions through a compiler plugin. +## Returning C strings + +In cases when an FFI function returns a pointer to a non-owned C string, +it might be preferable to wrap the returned string safely as a 'thin' +`&CStr` rather than scan it into a slice up front. 
To facilitate this, +conversion from a raw pointer should be added (using the +[lifetime anchor](https://github.com/rust-lang/rfcs/pull/556) convention): +```rust +impl CStr { + pub unsafe fn from_raw_ptr<'a, T: ?Sized>(ptr: *const libc::c_char, + life_anchor: &'a T) + -> &'a CStr + { ... } +} +``` + +For getting a slice out of a `CStr` reference, method `parse_as_bytes` is +provided. The name is chosen to reflect the linear cost of calculating the +length. +```rust +impl CStr { + pub fn parse_as_bytes(&self) -> &[u8] { ... } +} +``` + +An odd consequence is that it is valid, if wasteful, to call +`parse_as_bytes` on `CString` via auto-dereferencing. + ## Proof of concept The described changes are implemented in crate From 7305d7a5c906f4b078267e37ed403cc882cbd476 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Mon, 19 Jan 2015 11:02:54 +0200 Subject: [PATCH 0044/1195] Described the DST alternative for `CStr` --- text/0000-c-str-deref.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index e4276b5e4de..e4bdf6a024e 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -152,6 +152,11 @@ this method. # Alternatives +`CStr` could be made a newtype on DST `[libc::c_char]`, allowing no-cost +slices. It's not clear if this is useful, and the need to calculate length +up front might prevent some optimized uses possible with the 'thin' +reference. + The users of Rust can turn to third-party libraries for better convenience and safety when working with C strings. 
This can result in proliferation of incompatible helper types in public APIs until a dominant de-facto solution From 0f26194fd6350691c74a3160194eeaf4b252d302 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Mon, 19 Jan 2015 11:03:22 +0200 Subject: [PATCH 0045/1195] Editorial on alternatives --- text/0000-c-str-deref.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index e4bdf6a024e..b89007b19e5 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -157,8 +157,9 @@ slices. It's not clear if this is useful, and the need to calculate length up front might prevent some optimized uses possible with the 'thin' reference. -The users of Rust can turn to third-party libraries for better convenience -and safety when working with C strings. This can result in proliferation of +If the proposed enhancements or other equivalent facilities are not adopted, +users of Rust can turn to third-party libraries for better convenience +and safety when working with C strings. This may result in proliferation of incompatible helper types in public APIs until a dominant de-facto solution is established. From d64f14ffb87c12900aa746e3c7ba081245b2b526 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Mon, 19 Jan 2015 11:05:05 +0200 Subject: [PATCH 0046/1195] Updated the definition of c_str! --- text/0000-c-str-deref.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index b89007b19e5..50dc698c877 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -99,7 +99,7 @@ as `&'static CStr`: #[macro_export] macro_rules! 
c_str { ($lit:expr) => { - $crate::ffi::static_c_str_from_str(concat!($lit, "\0")) + $crate::ffi::CStr::from_static_str(concat!($lit, "\0")) } } ``` From ea48cbe2d79ea25fce91fcaee9c8d9b5c30707c8 Mon Sep 17 00:00:00 2001 From: James Miller Date: Wed, 21 Jan 2015 19:34:37 +1300 Subject: [PATCH 0047/1195] Add discriminant_value intrinsic RFC --- text/0000-discriminant-intrinsic.md | 339 ++++++++++++++++++++++++++++ 1 file changed, 339 insertions(+) create mode 100644 text/0000-discriminant-intrinsic.md diff --git a/text/0000-discriminant-intrinsic.md b/text/0000-discriminant-intrinsic.md new file mode 100644 index 00000000000..bd7aa43a289 --- /dev/null +++ b/text/0000-discriminant-intrinsic.md @@ -0,0 +1,339 @@ +- Start Date: 2015-01-21 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Add a new intrinsic, `discriminant_value` that extracts the value of the discriminant for enum +types. + +# Motivation + +Many operations that work with discriminant values can be significantly improved with the ability to +extract the value of the discriminant that is used to distinguish between variants in an enum. While +trivial cases often optimise well, more complex ones would benefit from direct access to this value. + +A good example is the `SqlState` enum from the `postgres` crate (Listed at the end of this RFC). It +contains 233 variants, of which all but one contain no fields. The most obvious implementation of +(for example) the `PartialEq` trait looks like this: + +```rust +match (self, other) { + (&Unknown(ref s1), &Unknown(ref s2)) => s1 == s2, + (&SuccessfulCompletion, &SuccessfulCompletion) => true, + (&Warning, &Warning) => true, + (&DynamicResultSetsReturned, &DynamicResultSetsReturned) => true, + (&ImplicitZeroBitPadding, &ImplicitZeroBitPadding) => true, + . + . + . + (_, _) => false +} +``` + +Even with optimisations enabled, this code is very suboptimal, producing +[this code](https://gist.github.com/Aatch/c23a45634b10aaecad05). 
A way to extract the discriminant +would allow this code: + +```rust +match (self, other) { + (&Unknown(ref s1), &Unknown(ref s2)) => s1 == s2, + (l, r) => unsafe { + discriminant_value(l) == discriminant(r) + } +} +``` + +Which is compiled into [this IR](https://gist.github.com/Aatch/beb736b93a908aa67e84). + +# Detailed design + +## What is a discriminant? + +A discriminant is a value stored in an enum type that indicates which variant the value is. The most +common case is that the discriminant is stored directly as an extra field in the variant. However, +the discriminant may be stored in any place, and in any format. Regardless, we can always extract the +discriminant from the value somehow. + +## Implementation + +For any given type, `discriminant_value` will return a `u64` value. The values returned are as +specified: + +* **Non-Enum Type**: Always 0 +* **Enum Type**: A value that uniquely identifies that variant within its type. I.e., for a given + enum type `E`, and two values of type `E`, `a` and `b`, the expression `discriminant_value(a) == + discriminant_value(b)` is true iff `a` and `b` are the same variant. Two values of different types + may return the same discriminant value. + +The reason for this specification is to allow flexibility in usage of the intrinsic without +compromising our ability to change the representation at will. + +# Drawbacks + +* Potentially exposes implementation details. However, relying on the specific values returned from +`discriminant_value` should be considered bad practice, as the intrinsic provides no such guarantee. + +* Does not allow for the value to be used as part of ordering. + +* Allows non-enum types to be provided. This may be unexpected by some users. + +# Alternatives + +* More strongly specify the values returned. This would allow for a broader range of uses, but + requires specifying behaviour that we may not want to. + +* Disallow non-enum types. 
Non-enum types do not have a discriminant, so trying to extract might be + considered an error. However, there is no compelling reason to disallow these types as we can + simply treat them as single-variant enums and synthesise a zero constant. Note that this is what + would be done for single-variant enums anyway. + +* Do nothing. Improvements to codegen and/or optimisation could make this unnecessary. The + "Sufficiently Smart Compiler" trap is a strong case against this reasoning though. There will + likely always be cases where the user can write more efficient code than the compiler can produce. + +# Unresolved questions + +* Should `#[derive]` use this intrinsic to improve derived implementations of traits? While + intrinsics are inherently unstable, `#[derive]`d code is compiler generated and therefore can be + updated if the intrinsic is changed or removed. + +# Appendix + +```rust +pub enum SqlState { + SuccessfulCompletion, + Warning, + DynamicResultSetsReturned, + ImplicitZeroBitPadding, + NullValueEliminatedInSetFunction, + PrivilegeNotGranted, + PrivilegeNotRevoked, + StringDataRightTruncationWarning, + DeprecatedFeature, + NoData, + NoAdditionalDynamicResultSetsReturned, + SqlStatementNotYetComplete, + ConnectionException, + ConnectionDoesNotExist, + ConnectionFailure, + SqlclientUnableToEstablishSqlconnection, + SqlserverRejectedEstablishmentOfSqlconnection, + TransactionResolutionUnknown, + ProtocolViolation, + TriggeredActionException, + FeatureNotSupported, + InvalidTransactionInitiation, + LocatorException, + InvalidLocatorException, + InvalidGrantor, + InvalidGrantOperation, + InvalidRoleSpecification, + DiagnosticsException, + StackedDiagnosticsAccessedWithoutActiveHandler, + CaseNotFound, + CardinalityViolation, + DataException, + ArraySubscriptError, + CharacterNotInRepertoire, + DatetimeFieldOverflow, + DivisionByZero, + ErrorInAssignment, + EscapeCharacterConflict, + IndicatorOverflow, + IntervalFieldOverflow, + InvalidArgumentForLogarithm, 
+ InvalidArgumentForNtileFunction, + InvalidArgumentForNthValueFunction, + InvalidArgumentForPowerFunction, + InvalidArgumentForWidthBucketFunction, + InvalidCharacterValueForCast, + InvalidDatetimeFormat, + InvalidEscapeCharacter, + InvalidEscapeOctet, + InvalidEscapeSequence, + NonstandardUseOfEscapeCharacter, + InvalidIndicatorParameterValue, + InvalidParameterValue, + InvalidRegularExpression, + InvalidRowCountInLimitClause, + InvalidRowCountInResultOffsetClause, + InvalidTimeZoneDisplacementValue, + InvalidUseOfEscapeCharacter, + MostSpecificTypeMismatch, + NullValueNotAllowedData, + NullValueNoIndicatorParameter, + NumericValueOutOfRange, + StringDataLengthMismatch, + StringDataRightTruncationException, + SubstringError, + TrimError, + UnterminatedCString, + ZeroLengthCharacterString, + FloatingPointException, + InvalidTextRepresentation, + InvalidBinaryRepresentation, + BadCopyFileFormat, + UntranslatableCharacter, + NotAnXmlDocument, + InvalidXmlDocument, + InvalidXmlContent, + InvalidXmlComment, + InvalidXmlProcessingInstruction, + IntegrityConstraintViolation, + RestrictViolation, + NotNullViolation, + ForeignKeyViolation, + UniqueViolation, + CheckViolation, + ExclusionViolation, + InvalidCursorState, + InvalidTransactionState, + ActiveSqlTransaction, + BranchTransactionAlreadyActive, + HeldCursorRequiresSameIsolationLevel, + InappropriateAccessModeForBranchTransaction, + InappropriateIsolationLevelForBranchTransaction, + NoActiveSqlTransactionForBranchTransaction, + ReadOnlySqlTransaction, + SchemaAndDataStatementMixingNotSupported, + NoActiveSqlTransaction, + InFailedSqlTransaction, + InvalidSqlStatementName, + TriggeredDataChangeViolation, + InvalidAuthorizationSpecification, + InvalidPassword, + DependentPrivilegeDescriptorsStillExist, + DependentObjectsStillExist, + InvalidTransactionTermination, + SqlRoutineException, + FunctionExecutedNoReturnStatement, + ModifyingSqlDataNotPermittedSqlRoutine, + ProhibitedSqlStatementAttemptedSqlRoutine, + 
ReadingSqlDataNotPermittedSqlRoutine, + InvalidCursorName, + ExternalRoutineException, + ContainingSqlNotPermitted, + ModifyingSqlDataNotPermittedExternalRoutine, + ProhibitedSqlStatementAttemptedExternalRoutine, + ReadingSqlDataNotPermittedExternalRoutine, + ExternalRoutineInvocationException, + InvalidSqlstateReturned, + NullValueNotAllowedExternalRoutine, + TriggerProtocolViolated, + SrfProtocolViolated, + SavepointException, + InvalidSavepointException, + InvalidCatalogName, + InvalidSchemaName, + TransactionRollback, + TransactionIntegrityConstraintViolation, + SerializationFailure, + StatementCompletionUnknown, + DeadlockDetected, + SyntaxErrorOrAccessRuleViolation, + SyntaxError, + InsufficientPrivilege, + CannotCoerce, + GroupingError, + WindowingError, + InvalidRecursion, + InvalidForeignKey, + InvalidName, + NameTooLong, + ReservedName, + DatatypeMismatch, + IndeterminateDatatype, + CollationMismatch, + IndeterminateCollation, + WrongObjectType, + UndefinedColumn, + UndefinedFunction, + UndefinedTable, + UndefinedParameter, + UndefinedObject, + DuplicateColumn, + DuplicateCursor, + DuplicateDatabase, + DuplicateFunction, + DuplicatePreparedStatement, + DuplicateSchema, + DuplicateTable, + DuplicateAliaas, + DuplicateObject, + AmbiguousColumn, + AmbiguousFunction, + AmbiguousParameter, + AmbiguousAlias, + InvalidColumnReference, + InvalidColumnDefinition, + InvalidCursorDefinition, + InvalidDatabaseDefinition, + InvalidFunctionDefinition, + InvalidPreparedStatementDefinition, + InvalidSchemaDefinition, + InvalidTableDefinition, + InvalidObjectDefinition, + WithCheckOptionViolation, + InsufficientResources, + DiskFull, + OutOfMemory, + TooManyConnections, + ConfigurationLimitExceeded, + ProgramLimitExceeded, + StatementTooComplex, + TooManyColumns, + TooManyArguments, + ObjectNotInPrerequisiteState, + ObjectInUse, + CantChangeRuntimeParam, + LockNotAvailable, + OperatorIntervention, + QueryCanceled, + AdminShutdown, + CrashShutdown, + CannotConnectNow, + 
DatabaseDropped, + SystemError, + IoError, + UndefinedFile, + DuplicateFile, + ConfigFileError, + LockFileExists, + FdwError, + FdwColumnNameNotFound, + FdwDynamicParameterValueNeeded, + FdwFunctionSequenceError, + FdwInconsistentDescriptorInformation, + FdwInvalidAttributeValue, + FdwInvalidColumnName, + FdwInvalidColumnNumber, + FdwInvalidDataType, + FdwInvalidDataTypeDescriptors, + FdwInvalidDescriptorFieldIdentifier, + FdwInvalidHandle, + FdwInvalidOptionIndex, + FdwInvalidOptionName, + FdwInvalidStringLengthOrBufferLength, + FdwInvalidStringFormat, + FdwInvalidUseOfNullPointer, + FdwTooManyHandles, + FdwOutOfMemory, + FdwNoSchemas, + FdwOptionNameNotFound, + FdwReplyHandle, + FdwSchemaNotFound, + FdwTableNotFound, + FdwUnableToCreateExcecution, + FdwUnableToCreateReply, + FdwUnableToEstablishConnection, + PlpgsqlError, + RaiseException, + NoDataFound, + TooManyRows, + InternalError, + DataCorrupted, + IndexCorrupted, + Unknown(String), +} +``` \ No newline at end of file From 9a12a510ab7938974141564c45a93f87cbe2b7a5 Mon Sep 17 00:00:00 2001 From: James Miller Date: Wed, 21 Jan 2015 20:03:17 +1300 Subject: [PATCH 0048/1195] Add '_value' --- text/0000-discriminant-intrinsic.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-discriminant-intrinsic.md b/text/0000-discriminant-intrinsic.md index bd7aa43a289..bcbf079ac1b 100644 --- a/text/0000-discriminant-intrinsic.md +++ b/text/0000-discriminant-intrinsic.md @@ -39,7 +39,7 @@ would allow this code: match (self, other) { (&Unknown(ref s1), &Unknown(ref s2)) => s1 == s2, (l, r) => unsafe { - discriminant_value(l) == discriminant(r) + discriminant_value(l) == discriminant_value(r) } } ``` From b6a51f048395d234442d5163604f277a19bff41f Mon Sep 17 00:00:00 2001 From: Steven Fackler Date: Wed, 21 Jan 2015 00:09:11 -0800 Subject: [PATCH 0049/1195] Debug improvements RFC --- text/0000-debug-improvements.md | 185 ++++++++++++++++++++++++++++++++ 1 file changed, 185 insertions(+) create mode 
100644 text/0000-debug-improvements.md diff --git a/text/0000-debug-improvements.md b/text/0000-debug-improvements.md new file mode 100644 index 00000000000..aef77392b9f --- /dev/null +++ b/text/0000-debug-improvements.md @@ -0,0 +1,185 @@ +- Start Date: 2015-01-20 +- RFC PR: +- Rust Issue: + +# Summary + +The `Debug` trait is intended to be implemented by every type and display +useful runtime information to help with debugging. This RFC proposes two +additions to the fmt API, one of which aids implementors of `Debug`, and one +which aids consumers of the output of `Debug`. Specifically, the `#` format +specifier modifier will cause `Debug` output to be "pretty printed", and some +utility builder types will be added to the `std::fmt` module to make it easier +to implement `Debug` manually. + +# Motivation + +## Pretty printing + +The conventions for `Debug` format state that output should resemble Rust +struct syntax, without added line breaks. This can make output difficult to +read in the presence of complex and deeply nested structures: +```rust +HashMap { "foo": ComplexType { thing: Some(BufferedReader { reader: FileStream { path: "/home/sfackler/rust/README.md", mode: R }, buffer: 1013/65536 }), other_thing: 100 }, "bar": ComplexType { thing: Some(BufferedReader { reader: FileStream { path: "/tmp/foobar", mode: R }, buffer: 0/65536 }), other_thing: 0 } } +``` +This can be made more readable by adding appropriate indentation: +```rust +HashMap { + "foo": ComplexType { + thing: Some( + BufferedReader { + reader: FileStream { + path: "/home/sfackler/rust/README.md", + mode: R + }, + buffer: 1013/65536 + } + ), + other_thing: 100 + }, + "bar": ComplexType { + thing: Some( + BufferedReader { + reader: FileStream { + path: "/tmp/foobar", + mode: R + }, + buffer: 0/65536 + } + ), + other_thing: 0 + } +} +``` +However, we wouldn't want this "pretty printed" version to be used by default, +since it's significantly more verbose. 
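The contrast between the two output styles can be tried out with a small sketch like the one below. It assumes a toolchain in which the `#` flag is honoured by derived `Debug` implementations, as in current Rust; the `Point` type and the `compact` and `pretty` helpers are hypothetical names used only for illustration, and the exact pretty-printed layout of a given compiler may differ in detail from the layout shown in this RFC text.

```rust
// A hypothetical two-field struct, used only to compare the two formats.
#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

/// Default `Debug` output: everything on a single line.
fn compact(p: &Point) -> String {
    format!("{:?}", p)
}

/// `Debug` output with the `#` modifier: one field per line, indented.
fn pretty(p: &Point) -> String {
    format!("{:#?}", p)
}
```

Printing `compact(&p)` next to `pretty(&p)` for the same value shows why the multi-line form is opt-in: it is easier to scan for nested data but takes several lines even for a small struct.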
+ +## Helper types + +For many Rust types, a `Debug` implementation can be automatically generated by +`#[derive(Debug)]`. However, many encapsulated types cannot use the +derived implementation. For example, the types in `std::io::buffered` all have +manual `Debug` impls. They all maintain a byte buffer that is both extremely +large (64k by default) and full of uninitialized memory. Printing it in the +`Debug` impl would be a terrible idea. Instead, the implementation prints the +size of the buffer as well as how much data is in it at the moment: +https://github.com/rust-lang/rust/blob/0aec4db1c09574da2f30e3844de6d252d79d4939/src/libstd/io/buffered.rs#L48-L60 + +```rust +pub struct BufferedStream<S> { + inner: BufferedReader<InternalBufferedWriter<S>> +} + +impl<S> fmt::Debug for BufferedStream<S> where S: fmt::Debug { + fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result { + let reader = &self.inner; + let writer = &self.inner.inner.0; + write!(fmt, "BufferedStream {{ stream: {:?}, write_buffer: {}/{}, read_buffer: {}/{} }}", + writer.inner, + writer.pos, writer.buf.len(), + reader.cap - reader.pos, reader.buf.len()) + } +} +``` + +A purely manual implementation is tedious to write and error prone. These +difficulties become even more pronounced with the introduction of the "pretty +printed" format described above. If `Debug` is too painful to manually +implement, developers of libraries will create poor implementations or omit +them entirely. 
Some simple structures to help automatically create the correct +output format can significantly help ease these implementations: +```rust +impl<S> fmt::Debug for BufferedStream<S> where S: fmt::Debug { + fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result { + let reader = &self.inner; + let writer = &self.inner.inner.0; + ShowStruct::new(fmt, "BufferedStream") + .field("stream", writer.inner) + .field("write_buffer", &format_args!("{}/{}", writer.pos, writer.buf.len())) + .field("read_buffer", &format_args!("{}/{}", reader.cap - reader.pos, reader.buf.len())) + .finish() + } +} +``` + +# Detailed design + +## Pretty printing + +The `#` modifier (e.g. `{:#?}`) will be interpreted by `Debug` implementations +as a request for "pretty printed" output: + +* Non-compound output is unchanged from normal `Debug` output: e.g. `10`, + `"hi"`, `None`. +* Array, set and map output is printed with one element per line, indented four + spaces, and entries printed with the `#` modifier as well: e.g. +```rust +[ + "a", + "b", + "c" +] +``` +```rust +HashSet { + "a", + "b", + "c" +} +``` +```rust +HashMap { + "a": 1, + "b": 2, + "c": 3 +} +``` +* Struct and tuple struct output is printed with one field per line, indented + four spaces, and fields printed with the `#` modifier as well: e.g. +```rust +Foo { + field1: "hi", + field2: 10, + field3: false +} +``` +```rust +Foo( + "hi", + 10, + false +) +``` + +In all cases, pretty printed and non-pretty printed output should differ *only* +in the addition of newlines and whitespace. + +## Helper types + +Types will be added to `std::fmt` corresponding to each of the common `Debug` +output formats. They will provide a builder-like API to create correctly +formatted output, respecting the `#` flag as needed. A full implementation can +be found at https://gist.github.com/sfackler/6d6610c5d9e271146d11. (Note that +there's a lot of almost-but-not-quite duplicated code in the various impls. +It can probably be cleaned up a bit). 
An example of use of the `ShowStruct` +type is shown in the Motivation section. + +# Drawbacks + +The use of the `#` modifier adds complexity to `Debug` implementations. + +The builder types are adding extra `#[stable]` surface area to the standard +library that will have to be maintained. + +# Alternatives + +We could take the helper structs alone without the pretty printing format. +They're still useful even if a library author doesn't have to worry about the +second format. + +# Unresolved questions + +The indentation level is currently hardcoded to 4 spaces. We could allow that +to be configured as well by using the width or precision specifiers, for +example, `{:2#?}` would pretty print with a 2-space indent. It's not totally +clear to me that this provides enough value to justify the extra complexity. From 234e2ee4067f621f3f91c069262e09910b5a8200 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Thu, 22 Jan 2015 00:34:25 +0200 Subject: [PATCH 0050/1195] Added a paragraph concerning the bogus size to drawbacks --- text/0000-c-str-deref.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index 50dc698c877..58eb51958b7 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -150,6 +150,11 @@ expose the slice in type annotations, parameter signatures and so on, the change should not be breaking since `CStr` also provides this method. +While it's not possible outside of unsafe code to unintentionally copy out +or modify the nominal value of `CStr` under an immutable reference, some +unforeseen trouble or confusion can arise due to the structure having a +bogus size. 
+ # Alternatives `CStr` could be made a newtype on DST `[libc::c_char]`, allowing no-cost From f0c19edee561eed9ac91620963c39442eb14950d Mon Sep 17 00:00:00 2001 From: Steven Fackler Date: Wed, 21 Jan 2015 19:31:53 -0800 Subject: [PATCH 0051/1195] Add convinience methods to Formatter --- text/0000-debug-improvements.md | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/text/0000-debug-improvements.md b/text/0000-debug-improvements.md index aef77392b9f..fff02cc1946 100644 --- a/text/0000-debug-improvements.md +++ b/text/0000-debug-improvements.md @@ -93,7 +93,7 @@ impl fmt::Debug for BufferedStream where S: fmt::Debug { fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result { let reader = &self.inner; let writer = &self.inner.inner.0; - ShowStruct::new(fmt, "BufferedStream") + fmt.debug_struct("BufferedStream") .field("stream", writer.inner) .field("write_buffer", &format_args!("{}/{}", writer.pos, writer.buf.len())) .field("read_buffer", &format_args!("{}/{}", reader.cap - reader.pos, reader.buf.len())) @@ -161,8 +161,21 @@ output formats. They will provide a builder-like API to create correctly formatted output, respecting the `#` flag as needed. A full implementation can be found at https://gist.github.com/sfackler/6d6610c5d9e271146d11. (Note that there's a lot of almost-but-not-quite duplicated code in the various impls. -It can probably be cleaned up a bit). An example of use of the `ShowStruct` -type is shown in the Motivation section. +It can probably be cleaned up a bit). For convenience, methods will be added +to `Formatter` which create them. An example of use of the `debug_struct` +method is shown in the Motivation section. In addition, the `padded` method +returns a type implementing `fmt::Writer` that pads input passed to it. This +is used inside of the other builders, but may be useful for others. +```rust +impl Formatter { + pub fn debug_struct<'a>(&'a mut self, name: &str) -> DebugStruct<'a> { ... 
} + pub fn debug_tuple<'a>(&'a mut self, name: &str) -> DebugTuple<'a> { ... } + pub fn debug_set<'a>(&'a mut self, name: &str) -> DebugSet<'a> { ... } + pub fn debug_map<'a>(&'a mut self, name: &str) -> DebugMap<'a> { ... } + + pub fn padded<'a>(&'a mut self) -> PaddedWriter<'a> { ... } +} +``` # Drawbacks From 1917a1f2254dd7f087a64beccd9d50a9c07c76a9 Mon Sep 17 00:00:00 2001 From: Steven Fackler Date: Wed, 21 Jan 2015 19:39:36 -0800 Subject: [PATCH 0052/1195] Typo --- text/0000-debug-improvements.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-debug-improvements.md b/text/0000-debug-improvements.md index fff02cc1946..69c43ed31e1 100644 --- a/text/0000-debug-improvements.md +++ b/text/0000-debug-improvements.md @@ -86,7 +86,7 @@ A purely manual implementation is tedious to write and error prone. These difficulties become even more pronounced with the introduction of the "pretty printed" format described above. If `Debug` is too painful to manually implement, developers of libraries will create poor implementations or omit -them entirely. Some simple structures to help automatically create the correct +them entirely. Some simple structures to automatically create the correct output format can significantly help ease these implementations: ```rust impl fmt::Debug for BufferedStream where S: fmt::Debug { From be42152a610ee0c37697c74441340fd7013215f7 Mon Sep 17 00:00:00 2001 From: Steven Fackler Date: Wed, 21 Jan 2015 19:41:32 -0800 Subject: [PATCH 0053/1195] Clarification --- text/0000-debug-improvements.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/text/0000-debug-improvements.md b/text/0000-debug-improvements.md index 69c43ed31e1..7c55ca5409b 100644 --- a/text/0000-debug-improvements.md +++ b/text/0000-debug-improvements.md @@ -165,7 +165,8 @@ It can probably be cleaned up a bit). For convenience, methods will be added to `Formatter` which create them. 
An example of use of the `debug_struct` method is shown in the Motivation section. In addition, the `padded` method returns a type implementing `fmt::Writer` that pads input passed to it. This -is used inside of the other builders, but may be useful for others. +is used inside of the other builders, but is provided here for use by `Debug` +implementations that require formats not provided with the other helpers. ```rust impl Formatter { pub fn debug_struct<'a>(&'a mut self, name: &str) -> DebugStruct<'a> { ... } From 9565b477abe4e047230fdfdf254df4681d008937 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Fri, 23 Jan 2015 00:36:19 +0200 Subject: [PATCH 0054/1195] Link to the proposal for truly unsized types. --- text/0000-c-str-deref.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index 58eb51958b7..6f9ae17b0ed 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -153,7 +153,8 @@ this method. While it's not possible outside of unsafe code to unintentionally copy out or modify the nominal value of `CStr` under an immutable reference, some unforeseen trouble or confusion can arise due to the structure having a -bogus size. +bogus size. A separate [RFC PR](https://github.com/rust-lang/rfcs/issues/709), +if accepted, will solve this by opting out of `Sized`. # Alternatives @@ -170,6 +171,10 @@ is established. # Unresolved questions +`CStr` can be made a +[truly unsized type](https://github.com/rust-lang/rfcs/issues/709), +pending on that proposal's approval. + There is room for a helper type wrapping an allocated C string with a supplied deallocation function to invoke when dropped. That type should also dereference to `CStr`. 
My library crate [c_string](https://crates.io/crates/c_string) From 3f8194fd407dab8bca00c259ce434de718864480 Mon Sep 17 00:00:00 2001 From: James Miller Date: Fri, 30 Jan 2015 12:22:26 +1300 Subject: [PATCH 0055/1195] Specify values to allow for ordered comparison --- text/0000-discriminant-intrinsic.md | 21 +++++++++++++-------- 1 file changed, 13 insertions(+), 8 deletions(-) diff --git a/text/0000-discriminant-intrinsic.md b/text/0000-discriminant-intrinsic.md index bcbf079ac1b..2f397b9d9f6 100644 --- a/text/0000-discriminant-intrinsic.md +++ b/text/0000-discriminant-intrinsic.md @@ -61,21 +61,26 @@ For any given type, `discriminant_value` will return a `u64` value. The values r specified: * **Non-Enum Type**: Always 0 -* **Enum Type**: A value that uniquely identifies that variant within its type. I.E. for a given - enum typem `E`, and two values of type `E`, `a` and `b`, the expression `discriminant_value(a) == - discriminant_value(b)` is true iff `a` and `b` are the same variant. Two values of different types - may return the same discriminant value. +* **C-Like Enum Type**: If no variants have fields, then the enum is considered "C-Like". The user + is able to specify discriminant values in this case, and the return value would be equivalent to + the result of casting the variant to a `u64`. +* **ADT Enum Type**: If any variant has a field, then the enum is conidered to be an "ADT" enum. The + user is not able to specify the discriminant value in this case. The precise values are + unspecified, but have the following characteristics: -The reason for this specification is to allow flexibilty in usage of the intrinsic without -compromising our ability to change the representation at will. + * The value returned for the same variant of the same enum type will compare as + equal. I.E. `discriminant_value(v) == discriminant_value(v)`. + * Two values returned for different variants will compare as unequal relative to their respective + listed positions. 
That means that if variant `A` is listed before variant `B`, then + `discriminant_value(A) < discriminant_value(B)`. + +Note the returned values for two differently-typed variants may compare in any way. # Drawbacks * Potentially exposes implementation details. However, relying the specific values returned from `discriminant_value` should be considered bad practice, as the intrinsic provides no such guarantee. -* Does not allow for the value to be used as part of ordering. - * Allows non-enum types to be provided. This may be unexpected by some users. # Alternatives From 34a8dbb7dbd098505541b77c84550c12f7835798 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Fri, 30 Jan 2015 10:50:46 +0200 Subject: [PATCH 0056/1195] Removed the lifetime anchor on CStr::from_raw_ptr RFC PR #556 is likely to get discarded. --- text/0000-c-str-deref.md | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index 6f9ae17b0ed..61a68f1a4e9 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -113,13 +113,10 @@ made workable in static expressions through a compiler plugin. In cases when an FFI function returns a pointer to a non-owned C string, it might be preferable to wrap the returned string safely as a 'thin' `&CStr` rather than scan it into a slice up front. To facilitate this, -conversion from a raw pointer should be added (using the -[lifetime anchor](https://github.com/rust-lang/rfcs/pull/556) convention): +conversion from a raw pointer should be added: ```rust impl CStr { - pub unsafe fn from_raw_ptr<'a, T: ?Sized>(ptr: *const libc::c_char, - life_anchor: &'a T) - -> &'a CStr + pub unsafe fn from_raw_ptr<'a>(ptr: *const libc::c_char) -> &'a CStr { ... 
} } ``` From a08cc3a4d525a36a9e3805bbef53e6f044b44d46 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Fri, 30 Jan 2015 10:56:22 +0200 Subject: [PATCH 0057/1195] Dropped the paragraph on OwnedCString This would need the allocator API to get resolved, as pointed out by John Ericson: https://github.com/rust-lang/rfcs/pull/592#issuecomment-71399334 --- text/0000-c-str-deref.md | 5 ----- 1 file changed, 5 deletions(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index 61a68f1a4e9..4ddadb340c2 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -172,9 +172,4 @@ is established. [truly unsized type](https://github.com/rust-lang/rfcs/issues/709), pending on that proposal's approval. -There is room for a helper type wrapping an allocated C string with a supplied -deallocation function to invoke when dropped. That type should also dereference -to `CStr`. My library crate [c_string](https://crates.io/crates/c_string) -provides an example in `OwnedCString`. - Need a `Cow`? From ef75b6ac7b732ea249af79af2a815750f58b0aec Mon Sep 17 00:00:00 2001 From: Nathaniel Theis Date: Fri, 30 Jan 2015 14:44:55 -0800 Subject: [PATCH 0058/1195] std::iter::once --- text/0000-std-iter-once.md | 47 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 47 insertions(+) create mode 100644 text/0000-std-iter-once.md diff --git a/text/0000-std-iter-once.md b/text/0000-std-iter-once.md new file mode 100644 index 00000000000..958181420d4 --- /dev/null +++ b/text/0000-std-iter-once.md @@ -0,0 +1,47 @@ +- Start Date: (fill me in with today's date, YYYY-MM-DD) 2015-1-30 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Add a `once` function to `std::iter` to construct an iterator yielding a given value one time. + +# Motivation + +This is a common task when working with iterators. Currently, this can be done in many ways, most of which are unergonomic, do not work for all types (e.g. requiring Copy/Clone), or both. 
`once` is simple to implement, simple to use, and simple to understand. + +# Detailed design + +`once` will return a new struct, `std::iter::Once<T>`, implementing Iterator. Internally, `Once<T>` is simply a newtype wrapper around `std::option::IntoIter<T>`. The actual body of `once` is thus trivial: + +```rust +pub struct Once<T>(std::option::IntoIter<T>); + +pub fn once<T>(x: T) -> Once<T> { + Once( + Some(x).into_iter() + ) +} +``` + +The `Once` wrapper struct exists to allow future backwards-compatible changes, and hide the implementation. + +# Drawbacks + +Although a tiny amount of code, it still does come with a testing, maintenance, etc. cost. + +It's already possible to do this via `Some(x).into_iter()`, `std::iter::repeat(x).take(1)` (for `x: Clone`), `vec![x].into_iter()`, various contraptions involving `iterate`... + +The existence of the `Once` struct is not technically necessary. + +# Alternatives + +There are already many, many alternatives to this: `Option::into_iter()`, `iterate`... + +The `Once` struct could be not used, with `std::option::IntoIter<T>` used instead. + +# Unresolved questions + +Naturally, `once` is fairly bikesheddable. `one_time`? `repeat_once`? + +Are versions of `once` that return `&T`/`&mut T` desirable?
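The design in this RFC can be sketched as a self-contained program. This is only an illustration under stated assumptions: the `Iterator` impl is written out by hand because the RFC text does not show it, and the local `once` and `Once` names stand in for the proposed `std::iter` items.

```rust
use std::option;

/// Iterator that yields one value; a newtype over `option::IntoIter<T>`.
pub struct Once<T>(option::IntoIter<T>);

impl<T> Iterator for Once<T> {
    type Item = T;

    // Delegate to the wrapped `Option` iterator (assumed impl, not from the RFC).
    fn next(&mut self) -> Option<T> {
        self.0.next()
    }
}

/// Build an iterator that yields `x` exactly one time.
pub fn once<T>(x: T) -> Once<T> {
    Once(Some(x).into_iter())
}

fn main() {
    // Prepend a single element to another sequence without allocating a Vec for it.
    let v: Vec<i32> = once(0).chain(vec![1, 2]).collect();
    println!("{:?}", v);
}
```

No `Clone` or `Copy` bound is needed because the value is moved into the iterator and moved out again on the first call to `next`.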
From b30ad7141d43c7df0bac5ab76b1b5b0d1a735d9f Mon Sep 17 00:00:00 2001 From: Nathaniel Theis Date: Fri, 30 Jan 2015 14:48:46 -0800 Subject: [PATCH 0059/1195] formatting --- text/0000-std-iter-once.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-std-iter-once.md b/text/0000-std-iter-once.md index 958181420d4..5ef066e1086 100644 --- a/text/0000-std-iter-once.md +++ b/text/0000-std-iter-once.md @@ -1,4 +1,4 @@ -- Start Date: (fill me in with today's date, YYYY-MM-DD) 2015-1-30 +- Start Date: 2015-1-30 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) From 6ac0f88e1a7a13eb4485aef421f925be485375e2 Mon Sep 17 00:00:00 2001 From: James Miller Date: Tue, 3 Feb 2015 12:31:10 +1300 Subject: [PATCH 0060/1195] Alter some wording, fix markdown --- text/0000-remove-ndebug.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/text/0000-remove-ndebug.md b/text/0000-remove-ndebug.md index 10a733b006e..5cf9f4a4fcb 100644 --- a/text/0000-remove-ndebug.md +++ b/text/0000-remove-ndebug.md @@ -5,7 +5,7 @@ # Summary Remove official support for the `ndebug` config variable, replace the current usage of it with a -more appropriate 'debug_assertions` compiler-provided config variable. +more appropriate `debug_assertions` compiler-provided config variable. # Motivation @@ -23,9 +23,9 @@ a natural consequence. # Detailed design -The `debug_assertions` variable, the replacement for the `ndebug` variable, will be compiler -provided based on the value of the `opt-level` codegen flag, including the implied value from `-O`. -Any value higher than 0 will disable the variable. +The `debug_assertions` configuration variable, the replacement for the `ndebug` variable, will be +compiler provided based on the value of the `opt-level` codegen flag, including the implied value +from `-O`. Any value higher than 0 will disable the variable. 
Another codegen flag `debug-assertions` will override this, forcing it on or off based on the value passed to it. From 57690bf4fb49298cb7c7b0ffaa0f74eb491a4522 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Tue, 3 Feb 2015 18:29:24 +1300 Subject: [PATCH 0061/1195] Type ascription Closes #354 --- text/0000-type-ascription.md | 174 +++++++++++++++++++++++++++++++++++ 1 file changed, 174 insertions(+) create mode 100644 text/0000-type-ascription.md diff --git a/text/0000-type-ascription.md b/text/0000-type-ascription.md new file mode 100644 index 00000000000..733c045ae14 --- /dev/null +++ b/text/0000-type-ascription.md @@ -0,0 +1,174 @@ +- Start Date: 2015-2-3 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Add type ascription to expressions and patterns. + +Type ascription on expression has already been implemented. Type ascription on +patterns can probably wait until post-1.0. + +See also discussion on #354 and [issue 10502](https://github.com/rust-lang/rust/issues/10502). + + +# Motivation + +Type inference is imperfect. It is often useful to help type inference by +annotating a sub-expression or sub-pattern with a type. Currently, this is only +possible by extracting the sub-expression into a variable using a `let` +statement and/or giving a type for a whole expression or pattern. This is un- +ergonomic, and sometimes impossible due to lifetime issues. Specifically, a +variable has lifetime of its enclosing scope, but a sub-expression's lifetime is +typically limited to the nearest semi-colon. + +Typical use cases are where a function's return type is generic (e.g., collect) +and where we want to force a coercion. + +Type ascription can also be used for documentation and debugging - where it is +unclear from the code which type will be inferred, type ascription can be used +to precisely communicate expectations to the compiler or other programmers. 
+ +By allowing type ascription in more places, we remove the inconsistency that +type ascription is currently only allowed on top-level patterns. + +## Examples: + +Generic return type: + +``` +// Current. +let z = if ... { + let x: Vec<_> = foo.enumerate().collect(); + x +} else { + ... +}; + +// With type ascription. +let z = if ... { + foo.enumerate().collect(): Vec<_> +} else { + ... +}; +``` + +Coercion: + +``` +fn foo(a: T, b: T) { ... } + +// Current. +let x = [1u32, 2, 4]; +let y = [3u32]; +... +let x: &[_] = &x; +let y: &[_] = &y; +foo(x, y); + +// With type ascription. +let x = [1u32, 2, 4]; +let y = [3u32]; +... +foo(x: &[_], y: &[_]); +``` + +In patterns: + +``` +struct Foo { a: T, b: String } + +// Current +fn foo(Foo { a, .. }: Foo) { ... } + +// With type ascription. +fn foo(Foo { a: i32, .. }) { ... } +``` + + +# Detailed design + +The syntax of expressions is extended with type ascription: + +``` +e ::= ... | e: T +``` + +where `e` is an expression and `T` is a type. Type ascription has the same +precedence as explicit coercions using `as`. + +When type checking `e: T`, `e` must have type `T`. The `must have type` test +includes implicit coercions and subtyping, but not explicit coercions. `T` may +be any well-formed type. + +At runtime, type ascription is a no-op, unless an implicit coercion was used in +type checking, in which case the dynamic semantics of a type ascription +expression are exactly those of the implicit coercion. + +The syntax of sub-patterns is extended to include an optional type ascription. +Old syntax: + +``` +P ::= SP: T | SP +SP ::= var | 'box' SP | ... +``` + +where `P` is a pattern, `SP` is a sub-pattern, `T` is a type, and `var` is a +variable name. + +New syntax: + +``` +P ::= SP: T | SP +SP ::= var | 'box' P | ... +``` + +Type ascription in patterns has the narrowest precedence, e.g., `box x: T` means +`box (x: T)`. 
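For comparison, the "current" style that this RFC wants to shorten can be written as a runnable sketch. It deliberately avoids the proposed `e: T` syntax and uses only `let` annotations. The helper `same_type_sum` is hypothetical and exists only to force both arguments to share one type, like `foo` in the coercion example.

```rust
// Hypothetical helper: both arguments must have the same type `T`.
fn same_type_sum<T: AsRef<[u32]>>(a: T, b: T) -> u32 {
    a.as_ref().iter().chain(b.as_ref().iter()).sum()
}

fn main() {
    // Generic return type: the annotation on the temporary selects the collection.
    let pairs: Vec<(usize, char)> = "abc".chars().enumerate().collect();

    // Coercion: `[u32; 3]` and `[u32; 1]` are different types, so both are
    // coerced to `&[u32]` through annotated bindings before the call.
    let x = [1u32, 2, 4];
    let y = [3u32];
    let x: &[u32] = &x;
    let y: &[u32] = &y;

    println!("{:?} {}", pairs, same_type_sum(x, y));
}
```

Each annotated `let` here is a place where the RFC would instead allow the type to be written inline on the sub-expression.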
+ +In type checking, if an expression is matched against a pattern, when matching +a sub-pattern the matching sub-expression must have the ascribed type (again, +this check includes subtyping and implicit coercion). Types in patterns play no +role at runtime. + +@eddyb has implemented the expressions part of this RFC, +[PR](https://github.com/rust-lang/rust/pull/21836). + + +# Drawbacks + +More syntax, another feature in the language. + +Interacts poorly with struct initialisers (changing the syntax for struct +literals has been [discussed and rejected](https://github.com/rust-lang/rfcs/pull/65) +and again in [discuss](http://internals.rust-lang.org/t/replace-point-x-3-y-5-with-point-x-3-y-5/198)). + +If we introduce named arguments in the future, then it would make it more +difficult to support the same syntax as field initialisers. + + +# Alternatives + +We could do nothing and force programmers to use temporary variables to specify +a type. However, this is less ergonomic and has problems with scopes/lifetimes. +Patterns can be given a type as a whole rather than annotating a part of the +pattern. + +We could allow type ascription in expressions but not patterns. This is a +smaller change and addresses most of the motivation. + +Rely on explicit coercions - the current plan [RFC 401](https://github.com/rust-lang/rfcs/blob/master/text/0401-coercions.md) +is to allow explicit coercion to any valid type and to use a customisable lint +for trivial casts (that is, those given by subtyping, including the identity +case). If we allow trivial casts, then we could always use explicit coercions +instead of type ascription. However, we would then lose the distinction between +implicit coercions which are safe and explicit coercions, such as narrowing, +which require more programmer attention. This also does not help with patterns. + + +# Unresolved questions + +Is the suggested precedence correct? Especially for patterns. 
+ +Does type ascription on patterns have backwards compatibility issues? + From fa582dc25a04888d5025953049c7f2153e55dc0a Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Tue, 3 Feb 2015 18:33:20 +1300 Subject: [PATCH 0062/1195] Update links --- text/0000-type-ascription.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/text/0000-type-ascription.md b/text/0000-type-ascription.md index 733c045ae14..3aaa6e6e16f 100644 --- a/text/0000-type-ascription.md +++ b/text/0000-type-ascription.md @@ -9,7 +9,8 @@ Add type ascription to expressions and patterns. Type ascription on expression has already been implemented. Type ascription on patterns can probably wait until post-1.0. -See also discussion on #354 and [issue 10502](https://github.com/rust-lang/rust/issues/10502). +See also discussion on [#354](https://github.com/rust-lang/rfcs/issues/354) and +[rust issue 10502](https://github.com/rust-lang/rust/issues/10502). # Motivation From e54f3da9c0ecc18641bd6677795821688e966412 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Tue, 3 Feb 2015 20:42:26 +0200 Subject: [PATCH 0063/1195] Dropped macro c_str! The macro is not essential to the proposal. --- text/0000-c-str-deref.md | 21 ++------------------- 1 file changed, 2 insertions(+), 19 deletions(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index 4ddadb340c2..d5d96e2c190 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -14,7 +14,8 @@ fn safe_puts(s: &CStr) { } fn main() { - safe_puts(c_str!("Look ma, a `&'static CStr` from a literal!")); + let s = CString::from_slice("A Rust string"); + safe_puts(s); } ``` @@ -90,24 +91,6 @@ to ensure that the static data does not contain any unintended interior NULs first `'\0'` encountered). For non-literal data, `CStrBuf::from_bytes` or `CStrBuf::from_vec` should be preferred. -## c_str! 
- -For added convenience in passing literal string data to FFI functions, -a macro is provided that appends a literal with `"\0"` and returns it -as `&'static CStr`: -```rust -#[macro_export] -macro_rules! c_str { - ($lit:expr) => { - $crate::ffi::CStr::from_static_str(concat!($lit, "\0")) - } -} -``` -Going forward, it would be good to make `c_str!` also accept byte strings -on input, through a [byte string concatenation -macro](https://github.com/rust-lang/rfcs/pull/566). Ultimately, it could be -made workable in static expressions through a compiler plugin. - ## Returning C strings In cases when an FFI function returns a pointer to a non-owned C string, From 3c904f310afacdd3fb03857d375762652866649e Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Tue, 3 Feb 2015 21:00:03 +0200 Subject: [PATCH 0064/1195] Renamed the new functions As suggested by Alex Crichton: https://github.com/rust-lang/rfcs/pull/592#issuecomment-72556786 --- text/0000-c-str-deref.md | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index d5d96e2c190..13d2a63dc4f 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -99,22 +99,24 @@ it might be preferable to wrap the returned string safely as a 'thin' conversion from a raw pointer should be added: ```rust impl CStr { - pub unsafe fn from_raw_ptr<'a>(ptr: *const libc::c_char) -> &'a CStr - { ... } + pub unsafe fn from_raw<'a>(ptr: *const libc::c_char) -> &'a CStr { + ... + } } ``` -For getting a slice out of a `CStr` reference, method `parse_as_bytes` is -provided. The name is chosen to reflect the linear cost of calculating the -length. +For getting a slice out of a `CStr` reference, method `to_bytes` is +provided. The name is preferred over `as_bytes` to reflect the linear cost +of calculating the length. ```rust impl CStr { - pub fn parse_as_bytes(&self) -> &[u8] { ... } + pub fn to_bytes(&self) -> &[u8] { ... 
} + pub fn to_bytes_with_nul(&self) -> &[u8] { ... } } ``` -An odd consequence is that it is valid, if wasteful, to call -`parse_as_bytes` on `CString` via auto-dereferencing. +An odd consequence is that it is valid, if wasteful, to call `to_bytes` on +`CString` via auto-dereferencing. ## Proof of concept From c31768120cf27ae957bc092f3911869493e19993 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Tue, 3 Feb 2015 21:03:21 +0200 Subject: [PATCH 0065/1195] Reference to the lifetime RFC The signature of CStr::from_raw may need some explaining. --- text/0000-c-str-deref.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index 13d2a63dc4f..d1b8e85008c 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -96,7 +96,8 @@ first `'\0'` encountered). For non-literal data, `CStrBuf::from_bytes` or In cases when an FFI function returns a pointer to a non-owned C string, it might be preferable to wrap the returned string safely as a 'thin' `&CStr` rather than scan it into a slice up front. To facilitate this, -conversion from a raw pointer should be added: +conversion from a raw pointer should be added (with an inferred lifetime +as per another proposed [RFC](https://github.com/rust-lang/rfcs/pull/556)): ```rust impl CStr { pub unsafe fn from_raw<'a>(ptr: *const libc::c_char) -> &'a CStr { From 4e47d1950f763429878ddcea2a2c71c117f4eb17 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Tue, 3 Feb 2015 21:05:15 +0200 Subject: [PATCH 0066/1195] Added a question about deprecating c_str_to_bytes --- text/0000-c-str-deref.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index d1b8e85008c..b6354e6d407 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -154,6 +154,9 @@ is established. 
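The pointer-to-`&CStr` conversion and the two byte accessors discussed in this patch can be sketched as follows. This is an assumption-laden illustration: it uses `CStr::from_ptr`, the constructor available in today's `std::ffi`, where the RFC text says `from_raw`, and `borrowed_c_string` is a hypothetical stand-in for an FFI function returning a non-owned C string.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Hypothetical stand-in for an FFI call that hands back a borrowed C string.
fn borrowed_c_string(owner: &CString) -> *const c_char {
    owner.as_ptr()
}

// Return the contents without and with the trailing NUL byte.
fn bytes_of(owner: &CString) -> (Vec<u8>, Vec<u8>) {
    let ptr = borrowed_c_string(owner);
    // SAFETY: `ptr` points at a NUL-terminated buffer that `owner` keeps alive
    // for the whole body of this function.
    let s: &CStr = unsafe { CStr::from_ptr(ptr) };
    (s.to_bytes().to_vec(), s.to_bytes_with_nul().to_vec())
}

fn main() {
    let owner = CString::new("hi").expect("input has no interior NUL");
    let (plain, with_nul) = bytes_of(&owner);
    println!("{:?} {:?}", plain, with_nul);

    // `CString` dereferences to `CStr`, so `to_bytes` is also callable directly.
    println!("{:?}", owner.to_bytes());
}
```

The unbounded lifetime on the returned reference is why the conversion is `unsafe`: the caller has to ensure the pointed-to string outlives every use of the `&CStr`.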
# Unresolved questions +The present function `c_str_to_bytes(&ptr)` may be deprecated in favor of +the more composable `CStr::from_raw(ptr).to_bytes()`. + `CStr` can be made a [truly unsized type](https://github.com/rust-lang/rfcs/issues/709), pending on that proposal's approval. From 0d9943e43a5f9333d043cf2f82cb63c50045433e Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 3 Feb 2015 11:05:26 -0800 Subject: [PATCH 0067/1195] Add in std::net text from original RFC --- text/0517-io-os-reform.md | 124 +++++++++++++++++++++++++++++++++++++- 1 file changed, 122 insertions(+), 2 deletions(-) diff --git a/text/0517-io-os-reform.md b/text/0517-io-os-reform.md index 9528922999c..b89aae0155a 100644 --- a/text/0517-io-os-reform.md +++ b/text/0517-io-os-reform.md @@ -64,7 +64,10 @@ follow-up PRs against this RFC. * [stdin, stdout, stderr] * [std::env] * [std::fs] (stub) - * [std::net] (stub) + * [std::net] + * [TCP] + * [UDP] + * [Addresses] * [std::process] (stub) * [std::os] * [Odds and ends] @@ -1234,7 +1237,124 @@ This brings the constants into line with our naming conventions elsewhere. ### `std::net` [std::net]: #stdnet -> To be added in a follow-up PR. +The contents of `std::io::net` submodules `tcp`, `udp`, `ip` and +`addrinfo` will be retained but moved into a single `std::net` module; +the other modules are being moved or removed and are described +elsewhere. + +#### TCP +[TCP]: #tcp + +For `TcpStream`, the changes are most easily expressed by giving the signatures directly: + +```rust +// TcpStream, which contains both a reader and a writer + +impl TcpStream { + fn connect(addr: A) -> IoResult; + fn connect_deadline(addr: A, deadline: D) -> IoResult where + A: ToSocketAddr, D: IntoDeadline; + + fn reader(&mut self) -> &mut TcpReader; + fn writer(&mut self) -> &mut TcpWriter; + fn split(self) -> (TcpReader, TcpWriter); + + fn peer_addr(&mut self) -> IoResult; + fn socket_addr(&mut self) -> IoResult; +} + +impl Reader for TcpStream { ... 
} +impl Writer for TcpStream { ... } + +impl Reader for Deadlined { ... } +impl Writer for Deadlined { ... } + +// TcpReader + +impl Reader for TcpReader { ... } +impl Reader for Deadlined { ... } + +impl TcpReader { + fn peer_addr(&mut self) -> IoResult; + fn socket_addr(&mut self) -> IoResult; + + fn shutdown_token(&mut self) -> ShutdownToken; +} + +// TcpWriter + +impl Writer for TcpWriter { ... } +impl Writer for Deadlined { ... } + +impl TcpWriter { + fn peer_addr(&mut self) -> IoResult; + fn socket_addr(&mut self) -> IoResult; + + fn shutdown_token(&mut self) -> ShutdownToken; +} + +// ShutdownToken + +impl ShutdownToken { + fn shutdown(self); +} + +impl Clone for ShutdownToken { ... } +``` + +The idea is that a `TcpStream` provides both a reader and a writer, +and can be used directly as such, just as it can today. However, the +two sides can also be broken apart via the `split` method, which +allows them to be shipped off to separate threads. Moreover, each side +can yield a `ShutdownToken`, a `Clone` and `Send` value that can be +used to shut down that side of the socket, cancelling any in-progress +blocking operations, much like e.g. `close_read` does today. + +The implementation of the `ShutdownToken` infrastructure should ensure +that there is essentially no cost imposed when the feature is not used +-- in particular, if a `ShutdownToken` has not been requested, a +single `read` or `write` should correspond to a single syscall. + +For `TcpListener`, the only change is to rename `socket_name` to +`socket_addr`. + +For `TcpAcceptor` we will: + +* Add a `socket_addr` method. +* Possibly provide a convenience constructor for `bind`. +* Replace `close_accept` with `cancel_token()`. +* Remove `Clone`. +* Rename `IncomingConnecitons` to `Incoming`. + +#### UDP +[UDP]: #udp + +The UDP infrastructure should change to use the new deadline +infrastructure, but should not provide `Clone`, `ShutdownToken`s, or a +reader/writer split. 
In addition: + +* `recv_from` should become `recv`. +* `send_to` should become `send`. +* `socket_name` should become `socket_addr`. + +Methods like `multicast` and `ttl` are left as `#[experimental]` for +now (they are derived from libuv's design). + +#### Addresses +[Addresses]: #addresses + +For the current `addrinfo` module: + +* The `get_host_addresses` should be renamed to `lookup_host`. +* All other contents should be removed. + +For the current `ip` module: + +* The `ToSocketAddr` trait should become `ToSocketAddrs` +* The default `to_socket_addr_all` method should be removed. + +The actual address structures could use some scrutiny, but any +revisions there are left as an unresolved question. ### `std::process` [std::process]: #stdprocess From 9ba9437e5d37f722f4a70547f8661bc8c6504829 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 3 Feb 2015 12:11:56 -0800 Subject: [PATCH 0068/1195] Revise text with knowledge learned since proposed --- text/0517-io-os-reform.md | 179 ++++++++++++++++++++++---------------- 1 file changed, 105 insertions(+), 74 deletions(-) diff --git a/text/0517-io-os-reform.md b/text/0517-io-os-reform.md index b89aae0155a..28ceb05506f 100644 --- a/text/0517-io-os-reform.md +++ b/text/0517-io-os-reform.md @@ -1245,100 +1245,131 @@ elsewhere. #### TCP [TCP]: #tcp -For `TcpStream`, the changes are most easily expressed by giving the signatures directly: +The current `TcpStream` struct will be pared back from where it is today to the +following interface: ```rust // TcpStream, which contains both a reader and a writer impl TcpStream { - fn connect(addr: A) -> IoResult; - fn connect_deadline(addr: A, deadline: D) -> IoResult where - A: ToSocketAddr, D: IntoDeadline; - - fn reader(&mut self) -> &mut TcpReader; - fn writer(&mut self) -> &mut TcpWriter; - fn split(self) -> (TcpReader, TcpWriter); - - fn peer_addr(&mut self) -> IoResult; - fn socket_addr(&mut self) -> IoResult; -} - -impl Reader for TcpStream { ... 
} -impl Writer for TcpStream { ... } - -impl Reader for Deadlined { ... } -impl Writer for Deadlined { ... } - -// TcpReader - -impl Reader for TcpReader { ... } -impl Reader for Deadlined { ... } - -impl TcpReader { - fn peer_addr(&mut self) -> IoResult; - fn socket_addr(&mut self) -> IoResult; - - fn shutdown_token(&mut self) -> ShutdownToken; + fn connect(addr: &A) -> io::Result; + fn peer_addr(&mut self) -> io::Result; + fn socket_addr(&mut self) -> io::Result; + fn shutdown(&mut self, how: Shutdown) -> io::Result<()>; + fn duplicate(&self) -> io::Result; } -// TcpWriter - -impl Writer for TcpWriter { ... } -impl Writer for Deadlined { ... } +impl Read for TcpStream { ... } +impl Write for TcpStream { ... } +impl<'a> Read for &'a TcpStream { ... } +impl<'a> Write for &'a TcpStream { ... } +#[cfg(unix)] impl AsRawFd for TcpStream { ... } +#[cfg(windows)] impl AsRawSocket for TcpStream { ... } +``` -impl TcpWriter { - fn peer_addr(&mut self) -> IoResult; - fn socket_addr(&mut self) -> IoResult; +* `clone` has been replaced with a `duplicate` function. The implementation of + `duplicate` will map to using `dup` on Unix platforms and + `WSADuplicateSocket` on Windows platforms. The `TcpStream` itself will no + longer be reference counted itself under the hood. +* `close_{read,write}` are both removed in favor of binding the `shutdown` + function directly on sockets. This will map to the `shutdown` function on both + Unix and Windows. +* `set_timeout` has been removed for now (as well as other timeout-related + functions). It is likely that this may come back soon as a binding to + `setsockopt` to the `SO_RCVTIMEO` and `SO_SNDTIMEO` options. This RFC does not + currently proposed adding them just yet, however. +* Implementations of `Read` and `Write` are provided for `&TcpStream`. 
These + implementations are not necessarily ergonomic to call (requires taking an + explicit reference), but they express the ability to concurrently read and + write from a `TcpStream` + +Various other options such as `nodelay` and `keepalive` will be left +`#[unstable]` for now. + +The `TcpAcceptor` struct will be removed and all functionality will be folded +into the `TcpListener` structure. Specifically, this will be the resulting API: - fn shutdown_token(&mut self) -> ShutdownToken; +```rust +impl TcpListener { + fn bind(addr: &A) -> io::Result; + fn socket_addr(&mut self) -> io::Result; + fn duplicate(&self) -> io::Result; + fn accept(&self) -> io::Result<(TcpStream, SocketAddr)>; + fn incoming(&self) -> Incoming; } -// ShutdownToken - -impl ShutdownToken { - fn shutdown(self); +impl<'a> Iterator for Incoming<'a> { + type Item = io::Result; + ... } - -impl Clone for ShutdownToken { ... } +#[cfg(unix)] impl AsRawFd for TcpListener { ... } +#[cfg(windows)] impl AsRawSocket for TcpListener { ... } ``` -The idea is that a `TcpStream` provides both a reader and a writer, -and can be used directly as such, just as it can today. However, the -two sides can also be broken apart via the `split` method, which -allows them to be shipped off to separate threads. Moreover, each side -can yield a `ShutdownToken`, a `Clone` and `Send` value that can be -used to shut down that side of the socket, cancelling any in-progress -blocking operations, much like e.g. `close_read` does today. - -The implementation of the `ShutdownToken` infrastructure should ensure -that there is essentially no cost imposed when the feature is not used --- in particular, if a `ShutdownToken` has not been requested, a -single `read` or `write` should correspond to a single syscall. - -For `TcpListener`, the only change is to rename `socket_name` to -`socket_addr`. - -For `TcpAcceptor` we will: - -* Add a `socket_addr` method. -* Possibly provide a convenience constructor for `bind`. 
-* Replace `close_accept` with `cancel_token()`. -* Remove `Clone`. -* Rename `IncomingConnecitons` to `Incoming`. +Some major changes from today's API include: + +* The static distinction between `TcpAcceptor` and `TcpListener` has been + removed (more on this in the [socket][Sockets] section). +* The `clone` functionality has been removed in favor of `duplicate` (same + caveats as `TcpStream`). +* The `close_accept` functionality is removed entirely. This is not currently + implemented via `shutdown` (not supported well across platforms) and is + instead implemented via `select`. This functionality can return at a later + date with a more robust interface. +* The `set_timeout` functionality has also been removed in favor of returning at + a later date in a more robust fashion with `select`. +* The `accept` function no longer takes `&mut self` and returns `SocketAddr`. + The change in mutability is done to express that multiple `accept` calls can + happen concurrently. +* For convenience the iterator does not yield the `SocketAddr` from `accept`. #### UDP [UDP]: #udp -The UDP infrastructure should change to use the new deadline -infrastructure, but should not provide `Clone`, `ShutdownToken`s, or a -reader/writer split. In addition: +The UDP infrastructre will receive a similar face-lift as the TCP infrastructure +will: -* `recv_from` should become `recv`. -* `send_to` should become `send`. -* `socket_name` should become `socket_addr`. +```rust +impl UdpSocket { + fn bind(addr: &A) -> io::Result; + fn recv_from(&self, buf: &mut [u8]) -> io::Result<(usize, SocketAddr)>; + fn send_to(&self, buf: &[u8], addr: &A) -> io::Result; + fn socket_addr(&self) -> io::Result; + fn duplicate(&self) -> io::Result; +} + +#[cfg(unix)] impl AsRawFd for UdpSocket { ... } +#[cfg(windows)] impl AsRawSocket for UdpSocket { ... } +``` -Methods like `multicast` and `ttl` are left as `#[experimental]` for -now (they are derived from libuv's design). 
+Some important points of note are: + +* The `send` and `recv` function take `&self` instead of `&mut self` to indicate + that they may be called safely in concurrent contexts. +* All configuration options such as `multicast` and `ttl` are left as + `#[unstable]` for now. +* All timeout support is removed. This may come back in the form of `setsockopt` + (as with TCP streams) or with a more general implementation of `select`. +* `clone` functionality has been replaced with `duplicate`. + +#### Sockets +[Sockets]: #sockets + +The current constructors for `TcpStream`, `TcpListener`, and `UdpSocket` are +largely "convenience constructors" as they do not expose the underlying details +that a socket can be configured before it is bound, connected, or listened on. +One of the more frequent configuration options is `SO_REUSEADDR` which is set by +default for `TcpListener` currently. + +This RFC leaves it as an open question how best to implement this +pre-configuration. The constructors today will likely remain no matter what as +convenience constructors and a new structure would implement consuming methods +to transform itself to each of the various `TcpStream`, `TcpListener`, and +`UdpSocket`. + +This RFC does, however, recommend not adding multiple constructors to the +various types to set various configuration options. This pattern is best +expressed via a flexible socket type to be added at a future date. #### Addresses [Addresses]: #addresses From 7204d12f6214830793902349fbf54da2289964b7 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Tue, 3 Feb 2015 23:06:04 +0200 Subject: [PATCH 0069/1195] Reworded the motivation to better outline the need for a special type With an explanation as to why slices and DSTs in general don't cut it. 
--- text/0000-c-str-deref.md | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index b6354e6d407..4bdd9912dc7 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -23,20 +23,24 @@ fn main() { The type `std::ffi::CString` is used to prepare string data for passing as null-terminated strings to FFI functions. This type dereferences to a -DST, `[libc::c_char]`. The DST, however, is a poor choice for representing -borrowed C string data, since: +DST, `[libc::c_char]`. The slice type, however, is a poor choice for +representing borrowed C string data, since: -1. The slice does not enforce the C string invariant at compile time. +1. A slice does not express the C string invariant at compile time. Safe interfaces wrapping FFI functions cannot take slice references as is without dynamic checks (when null-terminated slices are expected) or building a temporary `CString` internally (in this case plain Rust slices - must be passed with no interior NULs). `CString`, for its part, is an - owning container and is not convenient for passing by reference. A string - literal, for example, would require a `CString` constructed from it at - runtime to pass into a function expecting `&CString`. -2. The primary consumers of the borrowed pointers, FFI functions, do not care - about the 'sized' aspect of the DST. The borrowed reference is - therefore needlessly 'fat' for its primary purpose. + must be passed with no interior NULs). +2. An allocated `CString` buffer is not the only desired source for + borrowed C string data. Specifically, it should be possible to interpret + a raw pointer, unsafely and at zero overhead, as a reference to a + null-terminated string, so that the reference can then be used safely. 
+ However, in order to construct a slice (or a dynamically sized newtype + wrapping a slice), its length has to be determined, which is unnecessary + for the consuming FFI function that will only receive a thin pointer. + Another likely data source are string and byte string literals: provided + that a static string is null-terminated, there should be a way to pass it + to FFI functions without an intermediate allocation in `CString`. As a pattern of owned/borrowed type pairs has been established thoughout other modules (see e.g. From 9cf5ca46d3d0373269ce83ecf2d8a14487671983 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Tue, 3 Feb 2015 23:28:07 +0200 Subject: [PATCH 0070/1195] Addressed the drawback of losing the length on deref from CString --- text/0000-c-str-deref.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index 4bdd9912dc7..6b27c7abc6d 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -130,13 +130,19 @@ The described changes are implemented in crate # Drawbacks -The change of the deref type is another breaking change to `CString`. +The change of the deref target is another breaking change to `CString`. In practice the main purpose of borrowing from `CString` is to obtain a raw pointer with `.as_ptr()`; for code which only does this and does not expose the slice in type annotations, parameter signatures and so on, the change should not be breaking since `CStr` also provides this method. +Making the deref target practically unsized throws away the length information +intrinsic to `CString` and makes it less useful as a container for bytes. +This is countered by the fact that there are general purpose byte containers +in the core libraries, whereas `CString` addresses the specific need to +convey string data from Rust to C-style APIs. 
+ While it's not possible outside of unsafe code to unintentionally copy out or modify the nominal value of `CStr` under an immutable reference, some unforeseen trouble or confusion can arise due to the structure having a From 7e5cb6f8eb2071db6aab9a1147b77ef890a63112 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Wed, 4 Feb 2015 00:14:41 +0200 Subject: [PATCH 0071/1195] Drop CStr::from_static_str Also better describe the case for CStr::from_static_bytes. --- text/0000-c-str-deref.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index 6b27c7abc6d..2f46cfeefb6 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -78,17 +78,18 @@ impl Deref for CString { ## Static C strings -A way to create `CStr` references from static Rust expressions asserted as -null-terminated string or byte slices is provided by a couple of functions: +An important special case is producing `CStr` references from static Rust +data, primarily from literals. To avoid copying the data, it is required that +the source slice is null-terminated. The conversion function is otherwise +safe: ```rust impl CStr { pub fn from_static_bytes(bytes: &'static [u8]) -> &'static CStr { ... } - pub fn from_static_str(s: &'static str) -> &'static CStr { ... } } ``` -As these functions mostly work with literals, they only assert that the +As this function mostly works with literal data, it only asserts that the slice is terminated by a zero byte. It's the responsibility of the programmer to ensure that the static data does not contain any unintended interior NULs (the program will not crash, but the string will be interpreted up to the @@ -126,7 +127,7 @@ An odd consequence is that it is valid, if wasteful, to call `to_bytes` on ## Proof of concept The described changes are implemented in crate -[c_string](https://github.com/mzabaluev/rust-c-str/tree/v0.3.0). 
+[c_string](https://github.com/mzabaluev/rust-c-str). # Drawbacks From cb10c25968502d27b4f7c62de1d8439ab0505014 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 3 Feb 2015 15:00:48 -0800 Subject: [PATCH 0072/1195] Clarify that types are Send/Sync Also remove some `&mut self` methods as only `&self` is necessary. --- text/0517-io-os-reform.md | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/text/0517-io-os-reform.md b/text/0517-io-os-reform.md index 28ceb05506f..ba23db65d47 100644 --- a/text/0517-io-os-reform.md +++ b/text/0517-io-os-reform.md @@ -1253,9 +1253,9 @@ following interface: impl TcpStream { fn connect(addr: &A) -> io::Result; - fn peer_addr(&mut self) -> io::Result; - fn socket_addr(&mut self) -> io::Result; - fn shutdown(&mut self, how: Shutdown) -> io::Result<()>; + fn peer_addr(&self) -> io::Result; + fn socket_addr(&self) -> io::Result; + fn shutdown(&self, how: Shutdown) -> io::Result<()>; fn duplicate(&self) -> io::Result; } @@ -1284,7 +1284,8 @@ impl<'a> Write for &'a TcpStream { ... } write from a `TcpStream` Various other options such as `nodelay` and `keepalive` will be left -`#[unstable]` for now. +`#[unstable]` for now. The `TcpStream` structure will also adhere to both `Send` +and `Sync`. The `TcpAcceptor` struct will be removed and all functionality will be folded into the `TcpListener` structure. Specifically, this will be the resulting API: @@ -1292,7 +1293,7 @@ into the `TcpListener` structure. Specifically, this will be the resulting API: ```rust impl TcpListener { fn bind(addr: &A) -> io::Result; - fn socket_addr(&mut self) -> io::Result; + fn socket_addr(&self) -> io::Result; fn duplicate(&self) -> io::Result; fn accept(&self) -> io::Result<(TcpStream, SocketAddr)>; fn incoming(&self) -> Incoming; @@ -1323,11 +1324,13 @@ Some major changes from today's API include: happen concurrently. * For convenience the iterator does not yield the `SocketAddr` from `accept`. 
+The `TcpListener` type will also adhere to `Send` and `Sync`. + #### UDP [UDP]: #udp -The UDP infrastructre will receive a similar face-lift as the TCP infrastructure -will: +The UDP infrastructure will receive a similar face-lift as the TCP +infrastructure will: ```rust impl UdpSocket { @@ -1352,6 +1355,8 @@ Some important points of note are: (as with TCP streams) or with a more general implementation of `select`. * `clone` functionality has been replaced with `duplicate`. +The `UdpSocket` type will adhere to both `Send` and `Sync`. + #### Sockets [Sockets]: #sockets From e65afae4297130261730d1b40853d3a7330171bd Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Wed, 4 Feb 2015 08:39:58 +0200 Subject: [PATCH 0073/1195] Renamed from_raw to from_ptr --- text/0000-c-str-deref.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index 2f46cfeefb6..a9dee76b09a 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -105,7 +105,7 @@ conversion from a raw pointer should be added (with an inferred lifetime as per another proposed [RFC](https://github.com/rust-lang/rfcs/pull/556)): ```rust impl CStr { - pub unsafe fn from_raw<'a>(ptr: *const libc::c_char) -> &'a CStr { + pub unsafe fn from_ptr<'a>(ptr: *const libc::c_char) -> &'a CStr { ... } } @@ -166,7 +166,7 @@ is established. # Unresolved questions The present function `c_str_to_bytes(&ptr)` may be deprecated in favor of -the more composable `CStr::from_raw(ptr).to_bytes()`. +the more composable `CStr::from_ptr(ptr).to_bytes()`. 
`CStr` can be made a [truly unsized type](https://github.com/rust-lang/rfcs/issues/709), From dbbbb11f68c27c47381ebc08761892b6b0a8e80d Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Wed, 4 Feb 2015 08:45:21 +0200 Subject: [PATCH 0074/1195] Dropped CStr::from_static_bytes The semantics are controversial, as detailed in https://github.com/rust-lang/rfcs/pull/592#issuecomment-72751416 --- text/0000-c-str-deref.md | 20 -------------------- 1 file changed, 20 deletions(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index a9dee76b09a..7d59f06fd4e 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -76,26 +76,6 @@ impl Deref for CString { } ``` -## Static C strings - -An important special case is producing `CStr` references from static Rust -data, primarily from literals. To avoid copying the data, it is required that -the source slice is null-terminated. The conversion function is otherwise -safe: - -```rust -impl CStr { - pub fn from_static_bytes(bytes: &'static [u8]) -> &'static CStr { ... } -} -``` - -As this function mostly works with literal data, it only asserts that the -slice is terminated by a zero byte. It's the responsibility of the programmer -to ensure that the static data does not contain any unintended interior NULs -(the program will not crash, but the string will be interpreted up to the -first `'\0'` encountered). For non-literal data, `CStrBuf::from_bytes` or -`CStrBuf::from_vec` should be preferred. 
- ## Returning C strings In cases when an FFI function returns a pointer to a non-owned C string, From 2bc75821640dcd4a3128e087ce94e154d14cafad Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Wed, 4 Feb 2015 08:52:19 +0200 Subject: [PATCH 0075/1195] Promoted deprecation of `c_str_to_bytes` into the proposed changes --- text/0000-c-str-deref.md | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index 7d59f06fd4e..dc30e158e58 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -104,6 +104,13 @@ impl CStr { An odd consequence is that it is valid, if wasteful, to call `to_bytes` on `CString` via auto-dereferencing. +## Remove c_str_to_bytes + +The functions `c_str_to_bytes` and `c_str_to_bytes_with_nul`, with their +problematic lifetime semantics, are deprecated and eventually removed +in favor of composition of the functions described above: +`c_str_to_bytes(&ptr)` becomes `CStr::from_ptr(ptr).to_bytes()`. + ## Proof of concept The described changes are implemented in crate @@ -145,9 +152,6 @@ is established. # Unresolved questions -The present function `c_str_to_bytes(&ptr)` may be deprecated in favor of -the more composable `CStr::from_ptr(ptr).to_bytes()`. - `CStr` can be made a [truly unsized type](https://github.com/rust-lang/rfcs/issues/709), pending on that proposal's approval. 
From 116cb2a4d8ad4d51b4db30655923e063d83a8fdc Mon Sep 17 00:00:00 2001 From: P1start Date: Fri, 6 Feb 2015 12:36:00 +1300 Subject: [PATCH 0076/1195] [T, ..n] => [T; n] --- text/0000-array-pattern-changes.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/text/0000-array-pattern-changes.md b/text/0000-array-pattern-changes.md index 06d7e19f7ce..c568e7ddb05 100644 --- a/text/0000-array-pattern-changes.md +++ b/text/0000-array-pattern-changes.md @@ -7,9 +7,9 @@ Summary Change array/slice patterns in the following ways: -- Make them only match on arrays (`[T, ..n]` and `[T]`), not slices; -- Make subslice matching yield a value of type `[T, ..n]` or `[T]`, not `&[T]` - or `&mut [T]`; +- Make them only match on arrays (`[T; n]` and `[T]`), not slices; +- Make subslice matching yield a value of type `[T; n]` or `[T]`, not `&[T]` or + `&mut [T]`; - Allow multiple mutable references to be made to different parts of the same array or slice in array patterns (resolving rust-lang/rust [issue #8636](https://github.com/rust-lang/rust/issues/8636)). @@ -21,9 +21,9 @@ Before DST (and after the removal of `~[T]`), there were only two types based on `[T]`: `&[T]` and `&mut [T]`. With DST, we can have many more types based on `[T]`, `Box<[T]>` in particular, but theoretically any pointer type around a `[T]` could be used. However, array patterns still match on `&[T]`, `&mut [T]`, -and `[T, ..n]` only, meaning that to match on a `Box<[T]>`, one must first -convert it to a slice, which disallows moves. This may prove to significantly -limit the amount of useful code that can be written using array patterns. +and `[T; n]` only, meaning that to match on a `Box<[T]>`, one must first convert +it to a slice, which disallows moves. This may prove to significantly limit the +amount of useful code that can be written using array patterns. 
Another problem with today’s array patterns is in subslice matching, which specifies that the rest of a slice not matched on already in the pattern should @@ -66,7 +66,7 @@ multiple mutable borrows to the same value (which is not the case). Detailed design =============== -- Make array patterns match only on arrays (`[T, ..n]` and `[T]`). For example, +- Make array patterns match only on arrays (`[T; n]` and `[T]`). For example, the following code: ```rust @@ -89,7 +89,7 @@ Detailed design This change makes slice patterns mirror slice expressions much more closely. -- Make subslice matching in array patterns yield a value of type `[T, ..n]` (if +- Make subslice matching in array patterns yield a value of type `[T; n]` (if the array is of fixed size) or `[T]` (if not). This means changing most code that looks like this: @@ -112,8 +112,8 @@ Detailed design ``` It should be noted that if a fixed-size array is matched on using subslice - matching, and `ref` is used, the type of the binding will be `&[T, ..n]`, - *not* `&[T]`. + matching, and `ref` is used, the type of the binding will be `&[T; n]`, *not* + `&[T]`. - Improve the compiler’s analysis of multiple mutable references to the same value within array patterns. This would be done by allowing multiple mutable From 1f4cf0e02bb2fe3a0340cacecde9e22b8d35391d Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Sun, 8 Feb 2015 18:14:53 -0500 Subject: [PATCH 0077/1195] Update per feedback --- text/0000-api-comment-conventions.md | 41 +++++++++++++++++++++------- 1 file changed, 31 insertions(+), 10 deletions(-) diff --git a/text/0000-api-comment-conventions.md b/text/0000-api-comment-conventions.md index edd7cdf878d..353d585b0a1 100644 --- a/text/0000-api-comment-conventions.md +++ b/text/0000-api-comment-conventions.md @@ -17,7 +17,9 @@ but it tries to motivate and clarify them. # Detailed design -There are a number of indivudal guidelines: +There are a number of individual guidelines. 
Most of these guidelines are for +any Rust project, but some are specific to documenting `rustc` itself and the +standard library. These are called out specifically in the text itself. ## Use line comments @@ -37,7 +39,25 @@ Instead of: */ ``` -Only use inner doc comments //! to write crate and module-level documentation, nothing else. +Only use inner doc comments //! to write crate and module-level documentation, +nothing else. When using `mod` blocks, prefer `///` outside of the block: + +```rust +/// This module contains tests +mod test { + // ... +} +``` + +over + +```rust +mod test { + //! This module contains tests + + // ... +} +``` ## Formatting @@ -45,9 +65,8 @@ The first line in any doc comment should be a single-line short sentence providing a summary of the code. This line is used as a summary description throughout Rustdoc's output, so it's a good idea to keep it short. -All doc comments, including the summary line, should begin with a capital -letter and end with a period, question mark, or exclamation point. Prefer full -sentences to fragments. +All doc comments, including the summary line, should be property punctuated. +Prefer full sentences to fragments. The summary line should be written in third person singular present indicative form. Basically, this means write "Returns" instead of "Return". @@ -91,22 +110,24 @@ being explicit, so that it highlights syntax in places that do not, like GitHub. Rustdoc is able to test all Rust examples embedded inside of documentation, so it's important to mark what is not Rust so your tests don't fail. -References and citation should be linked inline. Prefer +References and citation should be linked 'reference style.' 
Prefer ``` -[some paper](http://www.foo.edu/something.pdf) +[some paper][something] + +[something]: http://www.foo.edu/something.pdf) ``` to ``` -some paper[1] - -1: http://www.foo.edu/something.pdf +[some paper][http://www.foo.edu/something.pdf] ``` ## English +This section applies to `rustc` and the standard library. + All documentation is standardized on American English, with regards to spelling, grammar, and punctuation conventions. Language changes over time, so this doesn't mean that there is always a correct answer to every grammar From 9b73205c07b0aaf721f701b2035741bdcf028533 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 9 Feb 2015 11:53:44 -0500 Subject: [PATCH 0078/1195] Amend object safety RFC to include the complete set of rules, along with an exemption that permits individual methods to require that `Self:Sized`. --- text/0255-object-safety.md | 111 +++++++++++++++++++++++++++---------- 1 file changed, 83 insertions(+), 28 deletions(-) diff --git a/text/0255-object-safety.md b/text/0255-object-safety.md index a93e289d972..58afbd20a21 100644 --- a/text/0255-object-safety.md +++ b/text/0255-object-safety.md @@ -18,7 +18,7 @@ is stronger due to part of the DST changes. Part of the planned, in progress DST work is to allow trait objects where a trait is expected. Example: -``` +```rust fn foo(y: &T) { ... } fn bar(x: &SomeTrait) { @@ -40,7 +40,7 @@ safe, then we say `T` is object-safe. If we ignore this restriction we could allow code such as the following: -``` +```rust trait SomeTrait { fn foo(&self, other: &Self) { ... } // assume self and other have the same concrete type } @@ -70,63 +70,108 @@ where a trait object is used with a generic call and would be something like "type error: SomeTrait does not implement SomeTrait" - no indication that the non-object-safe method were to blame, only a failure in trait matching. 
+Another advantage of this proposal is that it implies that all +method-calls can always be rewritten into an equivalent [UFCS] +call. This simplifies the "core language" and makes method dispatch +notation -- which involves some non-trivial inference -- into a kind +of "sugar" for the more explicit UFCS notation. # Detailed design -To be precise about object-safety, an object-safe method: -* must not have any type parameters, -* must not take `self` by value, -* must not use `Self` (in the future, where we allow arbitrary types for the - receiver, `Self` may only be used for the type of the receiver and only where - we allow `Sized?` types). +To be precise about object-safety, an object-safe method must meet one +of the following conditions: + +* require `Self : Sized`; or, +* meet all of the following conditions: + * must not have any type parameters; and, + * must not take `self` by value; and, + * must not use `Self` (in the future, where we allow arbitrary types + for the receiver, `Self` may only be used for the type of the + receiver and only where we allow `Sized?` types). -A trait is object-safe if all of its methods are object-safe. +A trait is object-safe if all of the following conditions hold: + +* all of its methods are object-safe; and, +* the trait does not require that `Self : Sized` (see also [RFC 546]). When an expression with pointer-to-concrete type is coerced to a trait object, the compiler will check that the trait is object-safe (in addition to the usual check that the concrete type implements the trait). It is an error for the trait to be non-object-safe. - # Drawbacks -This is a breaking change and forbids some safe code which is legal today. This -can be addressed by splitting a trait into object-safe and non-object-safe -parts. We hope that this will lead to better design. We are not sure how much -code this will affect, it would be good to have data about this. +This is a breaking change and forbids some safe code which is legal +today. 
This can be addressed in two ways: splitting traits, or adding +`where Self:Sized` clauses to methods that cannot not be used with +objects. -Example, today: +### Example problem -``` +Here is an example trait that is not object safe: + +```rust trait SomeTrait { fn foo(&self) -> int { ... } - fn bar(&self, u: Box) { ... } + + // Object-safe methods may not return `Self`: + fn new() -> Self; } +``` + +### Splitting a trait -fn baz(x: &SomeTrait) { - x.foo(); - //x.bar(box 42i); // type error +One option is to split a trait into object-safe and non-object-safe +parts. We hope that this will lead to better design. We are not sure +how much code this will affect, it would be good to have data about +this. + +```rust +trait SomeTrait { + fn foo(&self) -> int { ... } } +trait SomeTraitCtor : SomeTrait { + fn new() -> Self; +} ``` -with this proposal: +### Adding a where-clause -``` +Sometimes adding a second trait feels like overkill. In that case, it +is often an option to simply add a `where Self:Sized` clause to the +methods of the trait that would otherwise violate the object safety +rule. + +```rust trait SomeTrait { fn foo(&self) -> int { ... } + + fn new() -> Self + where Self : Sized; // this condition is new } +``` -trait SomeMoreTrait: SomeTrait { - fn bar(&self, u: Box) { ... } -} +The reason that this makes sense is that if one were writing a generic +function with a type parameter `T` that may range over the trait +object, that type parameter would have to be declared `?Sized`, and +hence would not have access to the `bar` method: -fn baz(x: &SomeTrait) { - x.foo(); - //x.bar(box 42i); // type error +```rust +fn baz(t: &T) { + let v: T = SomeTrait::new(); // illegal because `T : Sized` is not known to hold } ``` +However, if one writes a function with sized type parameter, which +could never be a trait object, then the `bar()` functions becomes +available. 
+ +```rust +fn baz(t: &T) { + let v: T = SomeTrait::new(); // OK +} +``` # Alternatives @@ -148,3 +193,13 @@ approach, this is not necessary. # Unresolved questions N/A + +# Edits + +* 2014-02-09. Edited by Nicholas Matsakis to (1) include the + requirement that object-safe traits do not require `Self:Sized` and + (2) specify that methods may include `where Self:Sized` to overcome + object safety restrictions. + +[UFCS]: 0132-ufcs.md +[RFC 546]: 0546-Self-not-sized-by-default.md From fbeffe6c4922d4dd30108588926dc50747bba5b2 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 9 Feb 2015 15:51:53 -0500 Subject: [PATCH 0079/1195] Clarify the requirements on the type of `self` --- text/0255-object-safety.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/text/0255-object-safety.md b/text/0255-object-safety.md index 58afbd20a21..664b64b92ab 100644 --- a/text/0255-object-safety.md +++ b/text/0255-object-safety.md @@ -84,7 +84,10 @@ of the following conditions: * require `Self : Sized`; or, * meet all of the following conditions: * must not have any type parameters; and, - * must not take `self` by value; and, + * must have a receiver that dereferences to the `Self` type; + - for now, this means `&self`, `&mut self`, or `self: Box`, + but eventually this should be extended to custom types like + `self: Rc` and so forth. * must not use `Self` (in the future, where we allow arbitrary types for the receiver, `Self` may only be used for the type of the receiver and only where we allow `Sized?` types). 
From 6afab56505723ff71ab0baf732faa9bdd742b74b Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 9 Feb 2015 15:53:41 -0500 Subject: [PATCH 0080/1195] s/bar/new/ --- text/0255-object-safety.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0255-object-safety.md b/text/0255-object-safety.md index 664b64b92ab..85edf6aa2ef 100644 --- a/text/0255-object-safety.md +++ b/text/0255-object-safety.md @@ -158,7 +158,7 @@ trait SomeTrait { The reason that this makes sense is that if one were writing a generic function with a type parameter `T` that may range over the trait object, that type parameter would have to be declared `?Sized`, and -hence would not have access to the `bar` method: +hence would not have access to the `new` method: ```rust fn baz(t: &T) { @@ -167,7 +167,7 @@ fn baz(t: &T) { ``` However, if one writes a function with sized type parameter, which -could never be a trait object, then the `bar()` functions becomes +could never be a trait object, then the `new` function becomes available. ```rust From 1deb4f8704cdcf18a9577a426ec009f4e15eb376 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Tue, 10 Feb 2015 00:05:08 +0200 Subject: [PATCH 0081/1195] Updated the links to unsized types RFC There is now a tracking issue for the postponed RFC. --- text/0000-c-str-deref.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index dc30e158e58..f1c6d9c7e1b 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -134,7 +134,7 @@ convey string data from Rust to C-style APIs. While it's not possible outside of unsafe code to unintentionally copy out or modify the nominal value of `CStr` under an immutable reference, some unforeseen trouble or confusion can arise due to the structure having a -bogus size. A separate [RFC PR](https://github.com/rust-lang/rfcs/issues/709), +bogus size. 
A separate [RFC](https://github.com/rust-lang/rfcs/issues/813), if accepted, will solve this by opting out of `Sized`. # Alternatives @@ -153,7 +153,7 @@ is established. # Unresolved questions `CStr` can be made a -[truly unsized type](https://github.com/rust-lang/rfcs/issues/709), +[truly unsized type](https://github.com/rust-lang/rfcs/issues/813), pending on that proposal's approval. Need a `Cow`? From 7e360165746f465348607a869d046780796d8656 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Tue, 10 Feb 2015 00:09:45 +0200 Subject: [PATCH 0082/1195] Minor editorial --- text/0000-c-str-deref.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index f1c6d9c7e1b..4a1efdb2379 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -118,7 +118,7 @@ The described changes are implemented in crate # Drawbacks -The change of the deref target is another breaking change to `CString`. +The change of the deref target type is another breaking change to `CString`. In practice the main purpose of borrowing from `CString` is to obtain a raw pointer with `.as_ptr()`; for code which only does this and does not expose the slice in type annotations, parameter signatures and so on, From 96e7abfaff83ac7ba42eda424638eb886823a9bf Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Tue, 10 Feb 2015 00:11:36 +0200 Subject: [PATCH 0083/1195] RFC 556 is now approved --- text/0000-c-str-deref.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index 4a1efdb2379..218553b115a 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -82,7 +82,7 @@ In cases when an FFI function returns a pointer to a non-owned C string, it might be preferable to wrap the returned string safely as a 'thin' `&CStr` rather than scan it into a slice up front. 
To facilitate this, conversion from a raw pointer should be added (with an inferred lifetime -as per another proposed [RFC](https://github.com/rust-lang/rfcs/pull/556)): +as per [the established convention](https://github.com/rust-lang/rfcs/pull/556)): ```rust impl CStr { pub unsafe fn from_ptr<'a>(ptr: *const libc::c_char) -> &'a CStr { From 9911675cb62496c4e1805915d8919e3fbbe16f90 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 10 Feb 2015 15:34:37 -0800 Subject: [PATCH 0084/1195] RFC: Simplify `std::hash` Two alternatives are proposed for paring back the API of `std::hash` to be more ergnomic for both consumers and implementors. --- text/0000-hash-simplification.md | 405 +++++++++++++++++++++++++++++++ 1 file changed, 405 insertions(+) create mode 100644 text/0000-hash-simplification.md diff --git a/text/0000-hash-simplification.md b/text/0000-hash-simplification.md new file mode 100644 index 00000000000..56add5c41fd --- /dev/null +++ b/text/0000-hash-simplification.md @@ -0,0 +1,405 @@ +- Feature Name: hash +- Start Date: (fill me in with today's date, YYYY-MM-DD) +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Pare back the `std::hash` module's API to match more closely what other +languages such as Java and C++ have. Consequently, this alteration in default +hashing strategy alters the story for DoS protection with the standard library's +`HashMap` implementation. + +# Motivation + +There are a number of motivations for this RFC, and each will be explained in +term. + +## API ergonomics + +Today the API of the `std::hash` module is sometimes considered overly +complicated and it may not be pulling its weight. 
As a recap, the API looks +like: + +```rust +trait Hash { + fn hash(&self, state: &mut H); +} +trait Hasher { + type Output; + fn reset(&mut self); + fn finish(&self) -> Self::Output; +} +trait Writer { + fn write(&mut self, data: &[u8]); +} +``` + +The `Hash` trait is implemented by various types where the `H` type parameter +signifies the hashing algorithm that the `impl` block corresponds to. Each +`Hasher` is opaque when taken generically and is frequently paired with a bound +of `Writer` to allow feeding in arbitrary bytes. + +The purpose of not having a `Writer` supertrait on `Hasher` or on the `H` type +parameter is to allow hashing algorithms that are *not* byte-stream oriented +(e.g. Java-like algorithms). Unfortunately all primitive types in Rust are only +defined for `Hash where H: Writer + Hasher`, essentially forcing a +byte-stream oriented hashing algorithm for all hashing. + +Some examples of using this API are: + +```rust +use std::hash::{Hash, Hasher, Writer, SipHasher}; + +impl Hash for MyType { + fn hash(&self, s: &mut S) { + self.field1.hash(s); + // don't want to hash field2 + self.field3.hash(s); + } +} + +fn sip_hash>(t: &T) -> u64 { + let mut s = SipHasher::new_with_keys(0, 0); + t.hash(&mut s); + s.finish() +} +``` + +Forcing many `impl` blocks to require `Hasher + Writer` becomes onerous over +times and also requires at least 3 imports for a custom implementation of +`hash`. Taking a generically hashable `T` is also somewhat cumbersome, +especially if the hashing algorithm isn't known in advance. + +Overall the `std::hash` API is generic enough that its usage is somewhat verbose +and becomes tiresome over time to work with. This RFC strives to make this API +easier to work with. + +## Forcing byte-stream oriented hashing + +Much of the `std::hash` API today is oriented around hashing a stream of bytes +(blocks of `&[u8]`). 
This is not a hard requirement by the API (discussed +above), but in practice this is essentially what happens everywhere. This form +of hashing is not always the most efficient, although it is often one of the +more flexible forms of hashing. + +Other languages such as Java and C++ have a hashing API that looks more like: + +```rust +trait Hash { + fn hash(&self) -> usize; +} +``` + +This expression of hashing is not byte-oriented but is also much less generic +(an algorithm for hashing is predetermined by the type itself). This API is +encodable with today's traits as: + +```rust +struct Slot(u64); + +impl Hash for MyType { + fn hash(&self, slot: &mut Slot) { + *slot = Slot(self.precomputed_hash); + } +} + +impl Hasher for Slot { + type Output = u64; + fn reset(&mut self) { *self = Slot(0); } + fn finish(&self) -> u64 { self.0 } +} +``` + +This form of hashing (which is useful for performance sometimes) is difficult to +work with primarily because of the frequent bounds on `Writer` for hashing. + +## Non-applicability for well-known hashing algorithms + +One of the current aspirations for the `std::hash` module was to be appropriate +for hashing algorithms such as MD5, SHA\*, etc. The current API has proven +inadequate, however, for the primary reason of hashing being so generic. For +example it should in theory be possible to calculate the SHA1 hash of a byte +slice via: + +```rust +let data: &[u8] = ...; +let hash = std::hash::hash::<&[u8], Sha1>(data); +``` + +There are a number of pitfalls to this approach: + +* Due to slices being able to be hashed generically, each byte will be written + individually to the `Sha1` state, which is likely to not be very efficient. +* Due to slices being able to be hashed generically, the length of the slice is + first written to the `Sha1` state, which is likely not desired. + +The key observation is that the hash values produced in a Rust program are +**not** reproducible outside of Rust. 
For this reason, APIs for reproducible +hashes to be verified elsewhere will explicitly not be considered in the design +for `std::hash`. It is expected that an external crate may wish to provide a +trait for these hashing algorithms and it would not be bounded by +`std::hash::Hash`, but instead perhaps a "byte container" of some form. + +# Detailed design + +This RFC considers two possible designs as a replacement of today's `std::hash` +API. One is a "minor refactoring" of the current API while the +other is a much more radical change towards being conservative. This section +will propose the more radical change and the other may be found in the +[Alternatives](#alternatives) section. + +## API + +The new API of `std::hash` will be: + +```rust +trait Hash { + fn hash(&self) -> usize; +} + +fn combine(a: usize, b: usize) -> usize; +``` + +The `Writer`, `Hasher`, and `SipHasher` structures/traits will all be removed +from `std::hash`. This definition is more or less the Rust equivalent of the +Java/C++ hashing infrastructure. This API is a vast simplification of what +exists today and allows implementations of `Hash` as well as consumers of `Hash` +to quite ergonomically work with hash values as well as hashable objects. + +> **Note**: The choice of `usize` instead of `u64` reflects [C++'s +> choice][cpp-hash] here as well, but it is quite easy to use one instead of +> the other. + +## Hashing algorithm + +With this definition of `Hash`, each type must pre-ordain a particular hash +algorithm that it implements. Using an alternate algorithm would require a +separate newtype wrapper. + +Most implementations will still use `#[derive(Hash)]` which will leverage +`hash::combine` to combine the hash values of aggregate fields. 
Manual +implementations which only want to hash a select number of fields would look +like: + +```rust +impl Hash for MyType { + fn hash(&self) -> usize { + // ignore field2 + (&self.field1, &self.field3).hash() + } +} +``` + +A possible implementation of combine can be found [in the boost source +code][boost-combine]. + +[boost-combine]: https://github.com/boostorg/functional/blob/master/include/boost/functional/hash/hash.hpp#L209-L213 + +## `HashMap` and DoS protection + +Currently one of the features of the standard library's `HashMap` implementation +is that it by default provides DoS protection through two measures: + +1. A strong hashing algorithm, SipHash 2-4, is used which is fairly difficult to + find collisions with. +2. The SipHash algorithm is randomly seeded for each instance of `HashMap`. The + algorithm is seeded with a 128-bit key. + +These two measures ensure that each `HashMap` is randomly ordered, even if the +same keys are inserted in the same order. As a result, it is quite difficult to +mount a DoS attack against a `HashMap` as it is difficult to predict what +collisions will happen. + +The `Hash` trait proposed above, however, does not allow SipHash to be +implemented generally any more. For example `#[derive(Hash)]` will no longer +leverage SipHash. Additionally, there is no input of state into the `hash` +function, so there is no random state per-`HashMap` to generate different hashes +with. 
+ +Denial of service attacks against hash maps are no new phenomenon, they are +[well](http://www.ocert.org/advisories/ocert-2011-003.html) +[known](http://lwn.net/Articles/474912/) +and have been reported in +[Python](http://bugs.python.org/issue13703), +[Ruby](https://www.ruby-lang.org/en/news/2011/12/28/denial-of-service-attack-was-found-for-rubys-hash-algorithm-cve-2011-4815/) +([other ruby](https://www.ruby-lang.org/en/news/2012/11/09/ruby19-hashdos-cve-2012-5371/)), +[Perl](http://blog.booking.com/hardening-perls-hash-function.html), +and many other languages/frameworks. Rust has taken a fairly proactive step from +the start by using a strong and randomly seeded algorithm since `HashMap`'s +inception. + +In general the standard library does not provide many security-related +guarantees beyond memory safety. For example the new `Read::read_to_end` +function passes a safe buffer of uninitialized data to implementations of +`read` using various techniques to prevent memory safety issues. A DoS attack +against a hash map is such a common and well known exploit, however, that this +RFC considers it critical to consider the design of `Hash` and its relationship +with `HashMap`. + +## Mitigation of DoS attacks + +Other languages have mitigated DoS attacks via a few measures: + +* [C++ specifies][cpp-hash] that the return value of `hash` is not guaranteed to + be stable across program executions, allowing for a global salt to be mixed + into hashes calculated. +* [Ruby has a global seed][ruby-seed] which is randomly initialized on startup + and is used when hashing blocks of memory (e.g. strings). +* PHP and Tomcat have added limits to the maximum amount of keys allowed from a + POST HTTP request (to limit the size of auto-generated maps). This strategy is + not necessarily applicable to the standard library. 
+ +[cpp-hash]: http://en.cppreference.com/w/cpp/utility/hash +[ruby-seed]: https://github.com/ruby/ruby/blob/193ad64359b8ebcd77a2cba50a62d64311e26b22/random.c#L1248-L1251 + +It [has been claimed](http://bugs.python.org/issue13703#msg150558), however, +that a global seed may only mitigate some of the simplest attacks. The primary +downside is that a long-running process may leak the "global seed" through some +other form which could compromise maps in that specific process. + +One possible route to mitigating these attacks with the `Hash` trait above could +be: + +1. All primitives (integers, etc) are `combine`d with a global random seed which + is initialized on first use. +2. Strings will continue to use SipHash as the default algorithm and the + initialization keys will be randomly initialized on first use. + +Given the information available about other DoS mitigations in hash maps for +other languages, however, it is not clear that this will provide the same level +of DoS protection that is available today. + +## `HashMap` and `HashState` + +For both this recommendation as well as the alternative below, this RFC proposes +removing the `HashState` trait and `Hasher` structure (as well as the +`hash_state` module) in favor of the following API: + +```rust +struct HashMap; + +impl HashMap { + fn new() -> HashMap { + HashMap::with_hasher(DefaultHasher::new()) + } +} + +impl u64> HashMap { + fn with_hasher(hasher: H) -> HashMap; +} + +impl Fn(&K) -> u64 for DefaultHasher { + fn call(&self, arg: &K) -> u64 { + arg.hash() as u64 + } +} +``` + +The precise details will be affected based on which design in this RFC is +chosen, but the general idea is to move from a custom trait to the standard `Fn` +trait for calculating hashes. + +# Drawbacks + +* One of the primary drawbacks to the proposed `Hash` trait is that it is now + not possible to select an algorithm that a type should be hashed with. 
Instead + each type's definition of hashing can only be altered through the use of a + newtype wrapper. + +* Today most Rust types can be hashed using a byte-oriented algorithm, so any + number of these algorithms (e.g. SipHash, Fnv hashing) can be used. With this + new `Hash` definition they are not easily accessible. + +* Due to the lack of input state to hashing, the `HashMap` type can no longer + randomly seed each individual instance but may at best have one global seed. + This consequently may elevate the risk of a DoS attack on a `HashMap` + instance. + +# Alternatives + +As alluded to in the "Detailed design" section the primary alternative to this +RFC, which still improves ergonomics, is to refine the API to require fewer +traits and less generics on consumption. + +## API + +The new API of `std::hash` would be: + +```rust +trait Hash { + fn hash(&self, h: &mut H); +} + +trait Hasher { + type Output; + fn write(&mut self, data: &[u8]); + fn finish(&self) -> Self::Output; + + fn write_u8(&mut self, i: u8) { ... } + fn write_i8(&mut self, i: i8) { ... } + fn write_u16(&mut self, i: u16) { ... } + fn write_i16(&mut self, i: i16) { ... } + fn write_u32(&mut self, i: u32) { ... } + fn write_i32(&mut self, i: i32) { ... } + fn write_u64(&mut self, i: u64) { ... } + fn write_i64(&mut self, i: i64) { ... } + fn write_usize(&mut self, i: usize) { ... } + fn write_isize(&mut self, i: isize) { ... } +} +``` + +This API is quite similar to today's API, but has a few tweaks: + +* The `Writer` trait has been removed by folding it directly into the `Hasher` + trait. As part of this movement the `Hasher` trait grew a number of + specialized `write_foo` methods which the primitives will call. This should + help regain some performance losses where forcing a byte-oriented stream is + a performance loss. + +* The `Hasher` trait no longer has a `reset` method. + +* The `Hash` trait's type parameter is on the *method*, not on the trait. 
This + implies that the trait is no longer object-safe, but it is much more ergonomic + to operate over generically. + +> **Note**: A possible tweak would be to remove the `Output` associated type in +> favor of just always returning `usize` (or `u64`). + +The purpose of this API is to continue to allow APIs to be generic over the +hashing algorithm used. This would allow `HashMap` continue to use a randomly +keyed SipHash as its default algorithm (e.g. continuing to provide DoS +protection). An example encoding of the previous API would look like: + +```rust +impl Hasher for usize { + type Output = usize; + fn write(&mut self, data: &[u8]) { + for b in data.iter() { self.write_u8(*b); } + } + fn finish(&self) -> usize { *self } + + fn write_u8(&mut self, i: u8) { *self = combine(*self, i); } + // and so on... +} +``` + +Some downsides of this API, however, are: + +* This design is a departure from the precedent set by many other languages. +* Implementations of `Hash` cannot be specialized and are forced to operate + generically over the hashing algorithm provided. This may cause a loss of + performance in some cases. Note that this could be remedied by moving the type + parameter to the trait instead of the method, but this would lead to a loss in + ergonomics for generic consumers of `T: Hash`. +* Manual implementations of `Hash` are somewhat cumbersome still by requiring a + separate `Hasher` parameter which is not necessarily always desired. +* The API of `Hasher` is approaching the realm of serialization/reflection and + it's unclear whether its API should grow over time to support more basic Rust + types. + +# Unresolved questions + +* To what degree should `HashMap` attempt to prevent DoS attacks? Is it the + responsibility of the standard library to do so or should this be provided as + an external crate on crates.io? 
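
Note on the patch above: the design with the type parameter on the `hash` method and a byte-oriented `write` on `Hasher` can be illustrated with the `std::hash` traits as they exist in current stable Rust. This is a sketch under stated assumptions, not the RFC's own code: in today's standard library `finish` returns `u64` and there is no `Output` associated type, and the `Fnv1a` hasher, `MyType` and `hash_with` below are hypothetical names introduced only for illustration.

```rust
use std::hash::{Hash, Hasher};

// Hypothetical 64-bit FNV-1a hasher; only `write` and `finish` are
// required, the `write_u8`, `write_u32`, ... defaults forward to `write`.
struct Fnv1a(u64);

impl Fnv1a {
    fn new() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= b as u64;
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV 64-bit prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

#[allow(dead_code)]
struct MyType {
    field1: u32,
    field2: String,
    field3: u8,
}

// Manual implementation that is generic over the hashing algorithm
// and skips `field2`, as in the RFC's motivating example.
impl Hash for MyType {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.field1.hash(state);
        // field2 is deliberately not hashed
        self.field3.hash(state);
    }
}

// Hash any `T: Hash` with a caller-supplied algorithm.
fn hash_with<T: Hash, H: Hasher>(value: &T, mut hasher: H) -> u64 {
    value.hash(&mut hasher);
    hasher.finish()
}
```

Because the algorithm is chosen by the caller, the same `Hash` impl can be fed to a keyed SipHash for `HashMap` or to a cheap hasher like the one above, which is the flexibility the single-method `fn hash(&self) -> usize` design gives up.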
From 4ea7f463034d30574f02ea00cbec711c7032fd31 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Wed, 11 Feb 2015 13:32:28 -0500 Subject: [PATCH 0085/1195] Amend text to also consider by-value self methods as object safe, using the same logic as before. --- text/0255-object-safety.md | 32 +++++++++++++++++--------------- 1 file changed, 17 insertions(+), 15 deletions(-) diff --git a/text/0255-object-safety.md b/text/0255-object-safety.md index 85edf6aa2ef..327e77550f9 100644 --- a/text/0255-object-safety.md +++ b/text/0255-object-safety.md @@ -84,8 +84,8 @@ of the following conditions: * require `Self : Sized`; or, * meet all of the following conditions: * must not have any type parameters; and, - * must have a receiver that dereferences to the `Self` type; - - for now, this means `&self`, `&mut self`, or `self: Box`, + * must have a receiver that has type `Self` or which dereferences to the `Self` type; + - for now, this means `self`, `&self`, `&mut self`, or `self: Box`, but eventually this should be extended to custom types like `self: Rc` and so forth. * must not use `Self` (in the future, where we allow arbitrary types @@ -102,6 +102,15 @@ the compiler will check that the trait is object-safe (in addition to the usual check that the concrete type implements the trait). It is an error for the trait to be non-object-safe. +Note that a trait can be object-safe even if some of its methods use +features that are not supported with an object receiver. This is true +when code that attempted to use those features would only work if the +`Self` type is `Sized`. This is why all methods that require +`Self:Sized` are exempt from the typical rules. This is also why +by-value self methods are permitted, since currently one cannot invoke +pass an unsized type by-value (though we consider that a useful future +extension). 
+ # Drawbacks This is a breaking change and forbids some safe code which is legal @@ -178,21 +187,14 @@ fn baz(t: &T) { # Alternatives -We could continue to check methods rather than traits are object-safe. When -checking the bounds of a type parameter for a function call where the function -is called with a trait object, we would check that all methods are object-safe -as part of the check that the actual type parameter satisfies the formal bounds. -We could probably give a different error message if the bounds are met, but the +We could continue to check methods rather than traits are +object-safe. When checking the bounds of a type parameter for a +function call where the function is called with a trait object, we +would check that all methods are object-safe as part of the check that +the actual type parameter satisfies the formal bounds. We could +probably give a different error message if the bounds are met, but the trait is not object-safe. -Rather than the restriction on taking `self` by value, we could require a trait -is `for Sized?` in order to be object safe. The purpose of forbidding self by -value is to enforce that we always have statically known size and that we have a -vtable for dynamic dispatch. If the programmer were going to manually provide -`impl`s for each trait, we would require the `Sized?` bound on the trait to -ensure that `self` was not dereferenced. However, with the compiler-driven -approach, this is not necessary. - # Unresolved questions N/A From 74b05c600f48f573752a5e0c7bb5802d4b764de1 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Wed, 11 Feb 2015 13:34:08 -0500 Subject: [PATCH 0086/1195] Add a note about possible future extensions. 
--- text/0255-object-safety.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/text/0255-object-safety.md b/text/0255-object-safety.md index 327e77550f9..bb5284fa331 100644 --- a/text/0255-object-safety.md +++ b/text/0255-object-safety.md @@ -195,6 +195,12 @@ the actual type parameter satisfies the formal bounds. We could probably give a different error message if the bounds are met, but the trait is not object-safe. +We might in the future use finer-grained reasoning to permit more +non-object-safe methods from appearing in the trait. For example, we +might permit `fn foo() -> Self` because it (implicitly) requires that +`Self` be sized. Similarly, we might permit other tests beyond just +sized-ness. Any such extension would be backwards compatible. + # Unresolved questions N/A From d2915159ae4392ed7ad864de8d349199b7c2e90c Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 12 Feb 2015 11:59:19 -0800 Subject: [PATCH 0087/1195] Switch recommendations --- text/0000-hash-simplification.md | 250 ++++++++++++++++--------------- 1 file changed, 130 insertions(+), 120 deletions(-) diff --git a/text/0000-hash-simplification.md b/text/0000-hash-simplification.md index 56add5c41fd..5883ea83436 100644 --- a/text/0000-hash-simplification.md +++ b/text/0000-hash-simplification.md @@ -146,12 +146,133 @@ trait for these hashing algorithms and it would not be bounded by This RFC considers two possible designs as a replacement of today's `std::hash` API. One is a "minor refactoring" of the current API while the other is a much more radical change towards being conservative. This section -will propose the more radical change and the other may be found in the +will propose the minor refactoring change and the other may be found in the [Alternatives](#alternatives) section. 
## API -The new API of `std::hash` will be: +The new API of `std::hash` would be: + +```rust +trait Hash { + fn hash(&self, h: &mut H); +} + +trait Hasher { + type Output; + fn write(&mut self, data: &[u8]); + fn finish(&self) -> Self::Output; + + fn write_u8(&mut self, i: u8) { ... } + fn write_i8(&mut self, i: i8) { ... } + fn write_u16(&mut self, i: u16) { ... } + fn write_i16(&mut self, i: i16) { ... } + fn write_u32(&mut self, i: u32) { ... } + fn write_i32(&mut self, i: i32) { ... } + fn write_u64(&mut self, i: u64) { ... } + fn write_i64(&mut self, i: i64) { ... } + fn write_usize(&mut self, i: usize) { ... } + fn write_isize(&mut self, i: isize) { ... } +} +``` + +This API is quite similar to today's API, but has a few tweaks: + +* The `Writer` trait has been removed by folding it directly into the `Hasher` + trait. As part of this movement the `Hasher` trait grew a number of + specialized `write_foo` methods which the primitives will call. This should + help regain some performance losses where forcing a byte-oriented stream is + a performance loss. + +* The `Hasher` trait no longer has a `reset` method. + +* The `Hash` trait's type parameter is on the *method*, not on the trait. This + implies that the trait is no longer object-safe, but it is much more ergonomic + to operate over generically. + +> **Note**: A possible tweak would be to remove the `Output` associated type in +> favor of just always returning `usize` (or `u64`). + +The purpose of this API is to continue to allow APIs to be generic over the +hashing algorithm used. This would allow `HashMap` continue to use a randomly +keyed SipHash as its default algorithm (e.g. continuing to provide DoS +protection, more information on this below). 
An example encoding of the +alternative API (proposed below) would look like: + +```rust +impl Hasher for usize { + type Output = usize; + fn write(&mut self, data: &[u8]) { + for b in data.iter() { self.write_u8(*b); } + } + fn finish(&self) -> usize { *self } + + fn write_u8(&mut self, i: u8) { *self = combine(*self, i); } + // and so on... +} +``` + +## `HashMap` and `HashState` + +For both this recommendation as well as the alternative below, this RFC proposes +removing the `HashState` trait and `Hasher` structure (as well as the +`hash_state` module) in favor of the following API: + +```rust +struct HashMap; + +impl HashMap { + fn new() -> HashMap { + HashMap::with_hasher(DefaultHasher::new()) + } +} + +impl u64> HashMap { + fn with_hasher(hasher: H) -> HashMap; +} + +impl Fn(&K) -> u64 for DefaultHasher { + fn call(&self, arg: &K) -> u64 { + let mut s = SipHasher::new_with_keys(self.k1, self.k2); + arg.hash(&mut s); + s.finish() + } +} +``` + +The precise details will be affected based on which design in this RFC is +chosen, but the general idea is to move from a custom trait to the standard `Fn` +trait for calculating hashes. + +# Drawbacks + +* This design is a departure from the precedent set by many other languages. In + doing so, however, it is arguably easier to implement `Hash` as it's more + obvious how to feed in incremental state. We also do not lock ourselves into a + particular hashing algorithm in case we need to alternate in the future. + +* Implementations of `Hash` cannot be specialized and are forced to operate + generically over the hashing algorithm provided. This may cause a loss of + performance in some cases. Note that this could be remedied by moving the type + parameter to the trait instead of the method, but this would lead to a loss in + ergonomics for generic consumers of `T: Hash`. 
+ +* Manual implementations of `Hash` are somewhat cumbersome still by requiring a + separate `Hasher` parameter which is not necessarily always desired. + +* The API of `Hasher` is approaching the realm of serialization/reflection and + it's unclear whether its API should grow over time to support more basic Rust + types. + +# Alternatives + +As alluded to in the "Detailed design" section the primary alternative to this +RFC, which still improves ergonomics, is to remove the generic-ness over the +hashing algorithm. + +## API + +The new API of `std::hash` would be: ```rust trait Hash { @@ -161,7 +282,7 @@ trait Hash { fn combine(a: usize, b: usize) -> usize; ``` -The `Writer`, `Hasher`, and `SipHasher` structures/traits will all be removed +The `Writer`, `Hasher`, and `SipHasher` structures/traits would all be removed from `std::hash`. This definition is more or less the Rust equivalent of the Java/C++ hashing infrastructure. This API is a vast simplification of what exists today and allows implementations of `Hash` as well as consumers of `Hash` @@ -177,7 +298,7 @@ With this definition of `Hash`, each type must pre-ordain a particular hash algorithm that it implements. Using an alternate algorithm would require a separate newtype wrapper. -Most implementations will still use `#[derive(Hash)]` which will leverage +Most implementations would still use `#[derive(Hash)]` which will leverage `hash::combine` to combine the hash values of aggregate fields. Manual implementations which only want to hash a select number of fields would look like: @@ -268,39 +389,11 @@ be: Given the information available about other DoS mitigations in hash maps for other languages, however, it is not clear that this will provide the same level -of DoS protection that is available today. 
- -## `HashMap` and `HashState` - -For both this recommendation as well as the alternative below, this RFC proposes -removing the `HashState` trait and `Hasher` structure (as well as the -`hash_state` module) in favor of the following API: - -```rust -struct HashMap; - -impl HashMap { - fn new() -> HashMap { - HashMap::with_hasher(DefaultHasher::new()) - } -} - -impl u64> HashMap { - fn with_hasher(hasher: H) -> HashMap; -} +of DoS protection that is available today. For example [@DaGenix explains +well](https://github.com/rust-lang/rfcs/pull/823#issuecomment-74013800) that we +may not be able to provide any form of DoS protection guarantee at all. -impl Fn(&K) -> u64 for DefaultHasher { - fn call(&self, arg: &K) -> u64 { - arg.hash() as u64 - } -} -``` - -The precise details will be affected based on which design in this RFC is -chosen, but the general idea is to move from a custom trait to the standard `Fn` -trait for calculating hashes. - -# Drawbacks +## Alternative Drawbacks * One of the primary drawbacks to the proposed `Hash` trait is that it is now not possible to select an algorithm that a type should be hashed with. Instead @@ -313,90 +406,7 @@ trait for calculating hashes. * Due to the lack of input state to hashing, the `HashMap` type can no longer randomly seed each individual instance but may at best have one global seed. - This consequently may elevate the risk of a DoS attack on a `HashMap` - instance. - -# Alternatives - -As alluded to in the "Detailed design" section the primary alternative to this -RFC, which still improves ergonomics, is to refine the API to require fewer -traits and less generics on consumption. - -## API - -The new API of `std::hash` would be: - -```rust -trait Hash { - fn hash(&self, h: &mut H); -} - -trait Hasher { - type Output; - fn write(&mut self, data: &[u8]); - fn finish(&self) -> Self::Output; - - fn write_u8(&mut self, i: u8) { ... } - fn write_i8(&mut self, i: i8) { ... } - fn write_u16(&mut self, i: u16) { ... 
} - fn write_i16(&mut self, i: i16) { ... } - fn write_u32(&mut self, i: u32) { ... } - fn write_i32(&mut self, i: i32) { ... } - fn write_u64(&mut self, i: u64) { ... } - fn write_i64(&mut self, i: i64) { ... } - fn write_usize(&mut self, i: usize) { ... } - fn write_isize(&mut self, i: isize) { ... } -} -``` - -This API is quite similar to today's API, but has a few tweaks: - -* The `Writer` trait has been removed by folding it directly into the `Hasher` - trait. As part of this movement the `Hasher` trait grew a number of - specialized `write_foo` methods which the primitives will call. This should - help regain some performance losses where forcing a byte-oriented stream is - a performance loss. - -* The `Hasher` trait no longer has a `reset` method. - -* The `Hash` trait's type parameter is on the *method*, not on the trait. This - implies that the trait is no longer object-safe, but it is much more ergonomic - to operate over generically. - -> **Note**: A possible tweak would be to remove the `Output` associated type in -> favor of just always returning `usize` (or `u64`). - -The purpose of this API is to continue to allow APIs to be generic over the -hashing algorithm used. This would allow `HashMap` continue to use a randomly -keyed SipHash as its default algorithm (e.g. continuing to provide DoS -protection). An example encoding of the previous API would look like: - -```rust -impl Hasher for usize { - type Output = usize; - fn write(&mut self, data: &[u8]) { - for b in data.iter() { self.write_u8(*b); } - } - fn finish(&self) -> usize { *self } - - fn write_u8(&mut self, i: u8) { *self = combine(*self, i); } - // and so on... -} -``` - -Some downsides of this API, however, are: - -* This design is a departure from the precedent set by many other languages. -* Implementations of `Hash` cannot be specialized and are forced to operate - generically over the hashing algorithm provided. This may cause a loss of - performance in some cases. 
Note that this could be remedied by moving the type - parameter to the trait instead of the method, but this would lead to a loss in - ergonomics for generic consumers of `T: Hash`. -* Manual implementations of `Hash` are somewhat cumbersome still by requiring a - separate `Hasher` parameter which is not necessarily always desired. -* The API of `Hasher` is approaching the realm of serialization/reflection and - it's unclear whether its API should grow over time to support more basic Rust - types. + This consequently elevates the risk of a DoS attack on a `HashMap` instance. # Unresolved questions From dae0b784434ac30427bf3679d814d8aa2f77c955 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 12 Feb 2015 12:03:54 -0800 Subject: [PATCH 0088/1195] Clarify that siphash keys will be globally unique --- text/0000-hash-simplification.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/text/0000-hash-simplification.md b/text/0000-hash-simplification.md index 5883ea83436..c3c301c71c8 100644 --- a/text/0000-hash-simplification.md +++ b/text/0000-hash-simplification.md @@ -231,9 +231,14 @@ impl u64> HashMap { fn with_hasher(hasher: H) -> HashMap; } +fn global_siphash_keys() -> (u64, u64) { + // ... +} + impl Fn(&K) -> u64 for DefaultHasher { fn call(&self, arg: &K) -> u64 { - let mut s = SipHasher::new_with_keys(self.k1, self.k2); + let (k1, k2) = global_siphash_keys(); + let mut s = SipHasher::new_with_keys(k1, k2); arg.hash(&mut s); s.finish() } From 32eadf4b5a705b79720b0c98468690a5481bb3b9 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Thu, 12 Feb 2015 15:08:53 +1300 Subject: [PATCH 0089/1195] Some more open questions, another example, some clarifications. 
--- text/0000-type-ascription.md | 96 +++++++++++++++++++++++++++++++++--- 1 file changed, 88 insertions(+), 8 deletions(-) diff --git a/text/0000-type-ascription.md b/text/0000-type-ascription.md index 3aaa6e6e16f..7c7e8bc80d5 100644 --- a/text/0000-type-ascription.md +++ b/text/0000-type-ascription.md @@ -35,6 +35,9 @@ type ascription is currently only allowed on top-level patterns. ## Examples: +(Somewhat simplified examples; in these cases there are sometimes better +solutions with the current syntax). + Generic return type: ``` @@ -74,6 +77,19 @@ let y = [3u32]; foo(x: &[_], y: &[_]); ``` +Generic return type and coercion: + +``` +// Current. +let x: T = { + let temp: U<_> = foo(); + temp +}; + +// With type ascription. +let x: T = foo(): U<_>; +``` + In patterns: ``` @@ -106,26 +122,63 @@ At runtime, type ascription is a no-op, unless an implicit coercion was used in type checking, in which case the dynamic semantics of a type ascription expression are exactly those of the implicit coercion. -The syntax of sub-patterns is extended to include an optional type ascription. +The syntax of patterns is extended to include an optional type ascription. Old syntax: ``` -P ::= SP: T | SP -SP ::= var | 'box' SP | ... +PT ::= P: T +P ::= var | 'box' P | ... +e ::= 'let' (PT | P) = ... | ... ``` -where `P` is a pattern, `SP` is a sub-pattern, `T` is a type, and `var` is a -variable name. +where `PT` is a pattern with optional type, `P` is a sub-pattern, `T` is a type, +and `var` is a variable name. (Formal arguments are `PT`, patterns in match arms +are `P`). New syntax: ``` -P ::= SP: T | SP -SP ::= var | 'box' P | ... +PT ::= P: T | P +P ::= var | 'box' PT | ... +e ::= 'let' PT = ... | ... ``` Type ascription in patterns has the narrowest precedence, e.g., `box x: T` means -`box (x: T)`. +`box (x: T)`.
In particular, in a struct initialiser or pattern, `x : y : z` is +parsed as `x : (y: z)`, i.e., a field named `x` is initialised with a value `y` +and that value must have type `z`. If only `x: y` is given, that is considered +to be the field name and the field's contents, with no type ascription. + +The changes to pattern syntax mean that in some contexts where a pattern +previously required a type annotation, it is no longer required if all variables +can be assigned types via the ascription. Examples: + +``` +struct Foo { + a: Bar, + b: Baz, +} +fn foo(x: Foo); // Ok, type of x given by type of whole pattern +fn foo(Foo { a: x, b: y}: Foo) // Ok, types of x and y found by destructuring +fn foo(Foo { a: x: Bar, b: y: Baz}) // Ok, no type annotation, but types given as ascriptions +fn foo(Foo { a: x: Bar, _ }) // Ok, we can still deduce the type of x and the whole argument +fn foo(Foo { a: x, b: y}) // Ok, type of x and y given by Foo + +struct Qux { + a: Bar, + b: X, +} +fn foo(x: Qux); // Ok, type of x given by type of whole pattern +fn foo(Qux { a: x, b: y}: Qux) // Ok, types of x and y found by destructuring +fn foo(Qux { a: x: Bar, b: y: Baz}) // Ok, no type annotation, but types given as ascriptions +fn foo(Qux { a: x: Bar, _ }) // Error, can't find the type of the whole argument +fn foo(Qux { a: x, b: y}) // Error, can't find type of y or the whole argument +``` + +Note the above changes mean moving some errors from parsing to later in type +checking. For example, all uses of patterns have optional types, and it is a +type error if there must be a type (e.g., in function arguments) but it is not +fully specified (currently it would be a parsing error). In type checking, if an expression is matched against a pattern, when matching a sub-pattern the matching sub-expression must have the ascribed type (again, @@ -166,6 +219,7 @@ instead of type ascription.
However, we would then lose the distinction between implicit coercions which are safe and explicit coercions, such as narrowing, which require more programmer attention. This also does not help with patterns. +We could use a different symbol or keyword instead of `:`, e.g., `is`. # Unresolved questions @@ -173,3 +227,29 @@ Is the suggested precedence correct? Especially for patterns. Does type ascription on patterns have backwards compatibility issues? +Given the potential confusion with struct literal syntax, it is perhaps worth +re-opening that discussion. But given the timing, probably not. + +Should remove integer suffixes in favour of type ascription? + +### `as` vs `:` + +A downside of type ascription is the overlap with explicit coercions (aka casts, +the `as` operator). Type ascription makes implicit coercions explicit. In RFC +401, it is proposed that all valid implicit coercions are valid explicit +coercions. However, that may be too confusing for users, since there is no +reason to use type ascription rather than `as` (if there is some coercion). It +might be a good idea to revisit that decision (it has not yet been implemented). +Then it is clear that the user uses `as` for explicit casts and `:` for non- +coercing ascription and implicit casts. Although there is no hard guideline for +which operations are implicit or explicit, the intuition is that if the +programmer ought to be aware of the change (i.e., the invariants of using the +type change to become less safe in any way) then coercion should be explicit, +otherwise it can be implicit. + +Alternatively we could remove `as` and require `:` for explicit coercions, but +not for implicit ones (they would keep the same rules as they currently have). +The only loss would be that `:` doesn't stand out as much as `as` and there +would be no lint for trivial coercions. Another (backwards compatible) +alternative would be to keep `as` and `:` as synonyms and recommend against +using `as`. 
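
As a rough illustration of the distinction drawn in the `as` vs `:` discussion, the sketch below uses only syntax that compiles on stable Rust. Expression-level type ascription is not available there, so a `let` annotation stands in for `:`. The function names are invented for this example and do not come from the RFC.

```rust
/// A type annotation that triggers an implicit, safe coercion:
/// `&[i32; 3]` is unsized to `&[i32]` without any cast operator.
fn coerce_with_annotation() -> usize {
    let array = [1, 2, 3];
    let slice: &[i32] = &array;
    slice.len()
}

/// An annotation that only guides inference of a generic return type.
fn collect_with_annotation() -> Vec<u32> {
    let doubled: Vec<_> = (1u32..4).map(|n| n * 2).collect();
    doubled
}

/// An explicit cast with `as`: narrowing can change the value,
/// which is the kind of change the programmer ought to spell out.
fn narrow_with_as(value: i32) -> u8 {
    value as u8
}
```

The first two functions never change a value, which matches the intuition that such conversions can stay implicit. The third can silently wrap, so it stays behind an explicit operator.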
From 2b4d09c35c18fad927c714b599dbe69a0156bb72 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Fri, 13 Feb 2015 11:25:52 +0200 Subject: [PATCH 0090/1195] Modified into a two-stage transition plan involving DST As suggested by Aaron Turon: https://github.com/rust-lang/rfcs/pull/592#issuecomment-73927495 --- text/0000-c-str-deref.md | 67 ++++++++++++++++++++++------------------ 1 file changed, 37 insertions(+), 30 deletions(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index 218553b115a..6fa51dc0a11 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -23,8 +23,8 @@ fn main() { The type `std::ffi::CString` is used to prepare string data for passing as null-terminated strings to FFI functions. This type dereferences to a -DST, `[libc::c_char]`. The slice type, however, is a poor choice for -representing borrowed C string data, since: +DST, `[libc::c_char]`. The slice type as it is, however, is a poor choice +for representing borrowed C string data, since: 1. A slice does not express the C string invariant at compile time. Safe interfaces wrapping FFI functions cannot take slice references as is @@ -49,33 +49,53 @@ it makes sense that `CString` gets its own borrowed counterpart. # Detailed design -## CStr, an Irrelevantly Sized Type +This proposal introduces `CStr`, a type to designate a null-terminated +string. This type does not implement `Sized`, `Copy`, or `Clone`. +References to `CStr` are only safely obtained by dereferencing `CString` +and a few other helper methods, described below. A `CStr` value should provide +no size information, as there is intent to turn `CStr` into an +[unsized type](https://github.com/rust-lang/rfcs/issues/813), +pending resolution on that proposal. -This proposal introduces `CStr`, a token type to designate a null-terminated -string. This type does not implement `Copy` or `Clone` and is only used in -borrowed references. 
`CStr` is sized, but its size and layout are of no -consequence to its users. It's only safely obtained by dereferencing -`CString` and a few other helper methods, described below. +## Stage 1: CStr, a DST with a weight problem + +As current Rust does not have unsized types that are not DSTs, at this stage +`CStr` is defined as a newtype over a character slice: ```rust #[repr(C)] pub struct CStr { - head: libc::c_char, - marker: std::marker::NoCopy + chars: [libc::c_char] } impl CStr { pub fn as_ptr(&self) -> *const libc::c_char { - &self.head as *const libc::c_char + self.chars.as_ptr() } } +``` + +`CString` is changed to dereference to `CStr`: +```rust impl Deref for CString { type Target = CStr; fn deref(&self) -> &CStr { ... } } ``` +In implementation, the `CStr` value needs a length for the internal slice. +This RFC provides no guarantees that the length will be equal to the length +of the string, or be any particular value suitable for safe use. + +## Stage 2: unsized CStr + +If unsized types are enabled later one way or another, the definition +of `CStr` would change to an unsized type with statically sized contents. +The authors of this RFC believe this would constitute no breakage to code +using `CStr` safely. With a view towards this future change, it's recommended +to avoid any unsafe code depending on the internal representation of `CStr`. + ## Returning C strings In cases when an FFI function returns a pointer to a non-owned C string, @@ -102,7 +122,7 @@ impl CStr { ``` An odd consequence is that it is valid, if wasteful, to call `to_bytes` on -`CString` via auto-dereferencing. +a `CString` via auto-dereferencing. ## Remove c_str_to_bytes @@ -113,8 +133,10 @@ in favor of composition of the functions described above: ## Proof of concept -The described changes are implemented in crate -[c_string](https://github.com/mzabaluev/rust-c-str).
+The described interface changes are implemented in crate +[c_string](https://github.com/mzabaluev/rust-c-str), with a difference +that the `CStr` token type has a bogus static size, as a compromise to +offer better performance in current Rust. # Drawbacks @@ -125,25 +147,14 @@ expose the slice in type annotations, parameter signatures and so on, the change should not be breaking since `CStr` also provides this method. -Making the deref target practically unsized throws away the length information +Making the deref target unsized throws away the length information intrinsic to `CString` and makes it less useful as a container for bytes. This is countered by the fact that there are general purpose byte containers in the core libraries, whereas `CString` addresses the specific need to convey string data from Rust to C-style APIs. -While it's not possible outside of unsafe code to unintentionally copy out -or modify the nominal value of `CStr` under an immutable reference, some -unforeseen trouble or confusion can arise due to the structure having a -bogus size. A separate [RFC](https://github.com/rust-lang/rfcs/issues/813), -if accepted, will solve this by opting out of `Sized`. - # Alternatives -`CStr` could be made a newtype on DST `[libc::c_char]`, allowing no-cost -slices. It's not clear if this is useful, and the need to calculate length -up front might prevent some optimized uses possible with the 'thin' -reference. - If the proposed enhancements or other equivalent facilities are not adopted, users of Rust can turn to third-party libraries for better convenience and safety when working with C strings. This may result in proliferation of @@ -152,8 +163,4 @@ is established. # Unresolved questions -`CStr` can be made a -[truly unsized type](https://github.com/rust-lang/rfcs/issues/813), -pending on that proposal's approval. - Need a `Cow`? 
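
The borrowing pattern this patch describes can be sketched against the `std::ffi` API as it exists in current Rust, which may differ in naming and detail from the draft above. The helper names `c_len` and `round_trip` are hypothetical; the point is only that a function can accept `&CStr`, that `&CString` dereferences to it, and that a raw pointer handed across FFI can be re-borrowed as `&CStr`.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Accepts any borrowed C string; the length is found by scanning for NUL
/// (or read from the slice, depending on how `CStr` is represented).
fn c_len(s: &CStr) -> usize {
    s.to_bytes().len()
}

/// Goes through a raw pointer, the way a C function would see the string,
/// and borrows it back as a `CStr`.
fn round_trip(owned: &CString) -> Vec<u8> {
    let ptr: *const c_char = owned.as_ptr();
    // SAFETY: `ptr` comes from a live `CString`, so it is non-null,
    // NUL-terminated, and valid for the duration of this borrow.
    let borrowed: &CStr = unsafe { CStr::from_ptr(ptr) };
    borrowed.to_bytes().to_vec()
}
```

Calling `c_len(&owned)` with `owned: CString` relies on deref coercion, so no unsafe code is needed on the caller's side unless a raw pointer is involved.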
From ec3ea51b629aad9ef900e2434747a1ef003b9b30 Mon Sep 17 00:00:00 2001 From: Julian Orth Date: Fri, 13 Feb 2015 12:50:23 +0100 Subject: [PATCH 0091/1195] add String method --- text/0000-drain-range.md | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/text/0000-drain-range.md b/text/0000-drain-range.md index 13cfa1eadde..85cd289d7b9 100644 --- a/text/0000-drain-range.md +++ b/text/0000-drain-range.md @@ -4,7 +4,8 @@ # Summary -Replace `Vec::drain` by a method that accepts a range parameter. +Replace `Vec::drain` by a method that accepts a range parameter. Add +`String::drain` with similar functionality. # Motivation @@ -64,3 +65,19 @@ pub fn drain(&mut self, range: T) -> RangeIter { Where `Drainer` should be implemented for `Range`, `RangeTo`, `RangeFrom`, `FullRange`, and `usize`. + +Add `String::drain`: + +```rust +/// Creates a draining iterator that clears the specified range in the String +/// and iterates over the characters contained in the range. +/// +/// # Panics +/// +/// Panics if the range is decreasing, if the upper bound is larger than the +/// length of the String, or if the start and the end of the range don't lie on +/// character boundaries. +pub fn drain(&mut self, range: /* ? */) -> /* ? */ { + // ? +} +``` From 6641a1ee18dcfaa7f6ed6b27b8851654fef7b0a9 Mon Sep 17 00:00:00 2001 From: Richard Zhang Date: Fri, 13 Feb 2015 20:00:08 +0800 Subject: [PATCH 0092/1195] Propose `RingBuf` -> `VecDeque`. And various other adjustments. --- text/0000-rename-collections.md | 33 +++++++++++++++++++-------------- 1 file changed, 19 insertions(+), 14 deletions(-) diff --git a/text/0000-rename-collections.md b/text/0000-rename-collections.md index c042cd4504b..eba49b0d471 100644 --- a/text/0000-rename-collections.md +++ b/text/0000-rename-collections.md @@ -26,12 +26,16 @@ The current collection names (and their longer versions) are: The abbreviation rules do seem unclear. 
Sometimes the first word is abbreviated, sometimes the last. However there are also cases where the names are not abbreviated. `Bitv`, `BitvSet` and `DList` seem strange on first glance. Such inconsistencies are undesirable, as Rust should not give an impression as "the promising language that has strangely inconsistent naming conventions for its standard collections". +Also, it should be noted that traditionally *ring buffers* have fixed sizes, but Rust's `RingBuf` does not. So it is preferable to rename it to something clearer, in order to avoid incorrect assumptions and surprises. + # Detailed design First some general naming rules should be established. -1. Prefer commonly used names. -2. Prefer full names when full names and abbreviated names are almost equally elegant. +1. At least maintain module level consistency when abbreviations are concerned. +2. Prefer commonly used abbreviations. +3. When in doubt, prefer full names to abbreviated ones. +4. Don't be dogmatic. And the new names: @@ -44,7 +48,7 @@ And the new names: * `DList` -> `LinkedList` * `HashMap` * `HashSet` -* `RingBuf` -> `RingBuffer` +* `RingBuf` -> `VecDeque` * `VecMap` The following changes should be made: @@ -52,25 +56,24 @@ The following changes should be made: - Rename `Bitv`, `BitvSet`, `DList` and `RingBuf`. Change affected codes accordingly. - If necessary, redefine the original names as aliases of the new names, and mark them as deprecated. After a transition period, remove the original names completely. -## Why prefer full names when full names and abbreviated ones are almost equally elegant? +## Why prefer full names when in doubt? -The naming rules should apply not only to standard collections, but also to other codes. It is (comparatively) easier to maintain a higher level of naming consistency by preferring full names to abbreviated ones *when in doubt*. Because given a full name, there are possibly many abbreviated forms to choose from. Which should be chosen and why? 
It is hard to write down guideline for that. +The naming rules should apply not only to standard collections, but also to other codes. It is (comparatively) easier to maintain a higher level of naming consistency by preferring full names to abbreviated ones *when in doubt*. Because given a full name, there are possibly many abbreviated forms to choose from. Which one should be chosen and why? It is hard to write down guidelines for that. -For example, a name `BinaryBuffer` have at least three convincing abbreviated forms: `BinBuffer`/`BinaryBuf`/`BinBuf`. Which one would be the most preferred? Hard to say. But it is clear that the full name `BinaryBuffer` is not a bad name. +For example, the name `BinaryBuffer` has at least three convincing abbreviated forms: `BinBuffer`/`BinaryBuf`/`BinBuf`. Which one would be the most preferred? Hard to say. But it is clear that the full name `BinaryBuffer` is not a bad name. -However, if there *is* a convincing reason, one should not hesitate using abbreviated names. A series of names like `BinBuffer/OctBuffer/HexBuffer` is very natural. Also, few would think the full name of `Arc` is a good type name. +However, if there *is* a convincing reason, one should not hesitate using abbreviated names. A series of names like `BinBuffer/OctBuffer/HexBuffer` is very natural. Also, few would think that `AtomicallyReferenceCounted`, the full name of `Arc`, is a good type name. ## Advantages of the new names: -- `Vec`: The name of the most frequently used Rust collection is left unchanged (and by extension `VecMap`), so the scope of the changes are greatly reduced. `Vec` is an exception to the rule because it is *the* collection in Rust. +- `Vec`: The name of the most frequently used Rust collection is left unchanged (and by extension `VecMap`), so the scope of the changes are greatly reduced. `Vec` is an exception to the "prefer full names" rule because it is *the* collection in Rust. 
- `BitVec`: `Bitv` is a very unusual abbreviation of `BitVector`, but `BitVec` is a good one given `Vector` is shortened to `Vec`. - `BitSet`: Technically, `BitSet` is a synonym of `BitVec(tor)`, but it has `Set` in its name and can be interpreted as a set-like "view" into the underlying bit array/vector, so `BitSet` is a good name. No need to have an additional `v`. - `LinkedList`: `DList` doesn't say much about what it actually is. `LinkedList` is not too long (like `DoublyLinkedList`) and it being a doubly-linked list follows Java/C#'s traditions. -- `RingBuffer`: `RingBuf` is a good name, but `RingBuffer` is good too. No reason to violate the rule here. +- `VecDeque`: This name exposes some implementation details and signifies its "interface" just like `HashSet`, and it doesn't have the "fixed-size" connotation that `RingBuf` has. Also, `Deque` is commonly preferred to `DoubleEndedQueue`, it is clear that the former should be chosen. # Drawbacks -- Preferring full names may result in people naming things with overly-long names that are hard to write and more importantly, read. - There will be breaking changes to standard collections that are already marked `stable`. # Alternatives @@ -83,17 +86,19 @@ And Rust's standard collections will have some strange names and no consistent n And by extension, `Bitv` to `BitVector` and `VecMap` to `VectorMap`. -This means breaking changes at a much larger scale. Undesirable at this stage. +This means breaking changes at a larger scale. Given that `Vec` is *the* collection of Rust, we can have an exception here. ## C. Rename `DList` to `DLinkedList`, not `LinkedList`: It is clearer, but also inconsistent with the other names by having a single-lettered abbreviation of `Doubly`. As Java/C# also have doubly-linked `LinkedList`, it is not necessary to use the additional `D`. -## D. Instead of renaming `RingBuf` to `RingBuffer`, rename `BinaryHeap` to `BinHeap`. +## D. Also rename `BinaryHeap` to `BinHeap`. 
+ +`BinHeap` can also mean `BinomialHeap`, so `BinaryHeap` is the better name here. -Or, reversing the second rule: prefer abbreviated names to full ones when in doubt. +## E. Rename `RingBuf` to `RingBuffer`, or do not rename `RingBuf` at all. -This has the advantage of encouraging succinct names, but everyone has his/her own preferences of how to abbreviate things. Naming consistency will suffer. Whether this is a problem is also a quite subjective matter. +Doing so would fail to stop people from making the incorrect assumption that Rust's `RingBuf`s have fixed sizes. # Unresolved questions From 1a57d20621bccb4dc4d05ce3e174a139c06448ca Mon Sep 17 00:00:00 2001 From: Julian Orth Date: Fri, 13 Feb 2015 13:38:42 +0100 Subject: [PATCH 0093/1195] add drawbacks --- text/0000-drain-range.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/text/0000-drain-range.md b/text/0000-drain-range.md index 85cd289d7b9..b3e3f739a34 100644 --- a/text/0000-drain-range.md +++ b/text/0000-drain-range.md @@ -81,3 +81,10 @@ pub fn drain(&mut self, range: /* ? */) -> /* ? */ { // ? } ``` + +# Drawbacks + +- The function signature differs from other collections. +- It's not clear from the signature that `..` can be used to get the old behavior. +- The trait documentation will link to the `std::ops` module. It's not immediately apparent how the types in there are related to the `N..M` syntax. +- Some of these problems can be mitigated by solid documentation of the function itself. 
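
To make the discoverability drawbacks of the range-based `drain` more concrete, here is a small sketch written against `Vec::drain` and `String::drain` as they exist in current Rust. The final signatures may not match the draft in this patch, and the helper names are made up for illustration.

```rust
/// Removes the elements at indices 1 and 2 and returns them.
fn drain_middle(v: &mut Vec<i32>) -> Vec<i32> {
    v.drain(1..3).collect()
}

/// Removes everything, which is the old parameterless behaviour spelled `..`.
fn drain_all(v: &mut Vec<i32>) -> Vec<i32> {
    v.drain(..).collect()
}

/// Removes the first `end` bytes of the string and returns them as a `String`.
/// Panics if `end` is not on a character boundary.
fn drain_prefix(s: &mut String, end: usize) -> String {
    s.drain(..end).collect()
}
```

The `..` call in `drain_all` is the case the drawbacks list worries about: nothing in the signature hints that the full range reproduces the previous behaviour, so the function documentation has to say so.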
From aaccf2ae5d1d111934c0a2a1479320e1eb0ff87c Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Sat, 14 Feb 2015 06:35:04 +0200 Subject: [PATCH 0094/1195] Added RFC: get panics out of std::ffi::CString --- text/0000-no-panic-in-c-string.md | 80 +++++++++++++++++++++++++++++++ 1 file changed, 80 insertions(+) create mode 100644 text/0000-no-panic-in-c-string.md diff --git a/text/0000-no-panic-in-c-string.md b/text/0000-no-panic-in-c-string.md new file mode 100644 index 00000000000..ddc9a2988ca --- /dev/null +++ b/text/0000-no-panic-in-c-string.md @@ -0,0 +1,80 @@ +- Feature Name: non_panicky_cstring +- Start Date: 2015-02-13 +- RFC PR: +- Rust Issue: + +# Summary + +Remove panics from `CString::from_slice` and `CString::from_vec`, making +these functions return `Result` instead. + +# Motivation + +Currently the functions that produce `std::ffi::CString` out of Rust byte +strings panic when the input contains NUL bytes. As strings containing NULs +are not commonly seen in real-world usage, it is easy for developers to +overlook the potential panic unless they test for such atypical input. + +The panic is particularly sneaky when hidden behind an API using regular Rust +string types. Consider this example: + +```rust +fn set_text(text: &str) { + let c_text = CString::from_slice(text.as_bytes()); // panic lurks here + unsafe { ffi::set_text(c_text.as_ptr()) }; +} +``` + +This implementation effectively imposes a requirement on the input string to +contain no inner NUL bytes, which is generally permitted in pure Rust. +This restriction is not apparent in the signature of the function and needs to +be described in the documentation. Furthermore, the creator of the code may be +oblivious to the potential panic. + +The conventions on failure modes elsewhere in Rust libraries tend to limit +panics to outcomes of programmer errors. Functions validating external data +should return `Result` to allow graceful handling of the errors. 
+ +# Detailed design + +The return types of `CString::from_slice` and `CString::from_vec` are changed +to `Result`: + +```rust +impl CString { + pub fn from_slice(s: &[u8]) -> Result<CString, NulError> { ... } + pub fn from_vec(v: Vec<u8>) -> Result<CString, IntoCStrError> { ... } +} +``` + +The error type `NulError` provides information on the position of the first +NUL byte found in the string. `IntoCStrError` wraps `NulError` and also +provides the `Vec` which has been moved into `CString::from_vec`. + +`std::error::FromError` implementations are provided to convert the error +types above to `std::io::Error` of the `InvalidInput` kind. This facilitates +use of the conversion functions in input-processing code. + +# Proof-of-concept implementation + +The proposed changes are implemented in a crates.io project +[c_string](https://github.com/mzabaluev/rust-c-str), where the analog of +`CString` is named `CStrBuf`. + +# Drawbacks + +The need to extract the data from a `Result` in the success case is annoying. +However, it may be viewed as a speed bump to make the developer aware of a +potential failure and to require an explicit choice on how to handle it. +Even the least graceful way, a call to `unwrap`, makes the potential panic +apparent in the code. + +# Alternatives + +If the panicky behavior is preserved, plentiful possibilities for DoS attacks +and other unforeseen failures in the field may be introduced by code oblivious +to the input constraints. + +# Unresolved questions + +None.
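
The `Result`-returning style proposed here can be tried out with `CString::new` from current `std::ffi`, which is assumed to be the closest present-day counterpart of the `from_slice`/`from_vec` constructors named in this draft. The helpers `to_c_text` and `first_nul` are hypothetical names used only for this sketch.

```rust
use std::ffi::{CString, NulError};

/// Converts Rust text for an FFI call, reporting interior NUL bytes
/// as an error value instead of panicking.
fn to_c_text(text: &str) -> Result<CString, NulError> {
    CString::new(text)
}

/// Reports the position of the first NUL byte if the conversion fails.
fn first_nul(text: &str) -> Option<usize> {
    to_c_text(text).err().map(|e| e.nul_position())
}
```

A caller that wants the old behaviour can still write `to_c_text(s).unwrap()`, which at least makes the potential panic visible at the call site.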
From 3641ade9fcb6a9ebdf7b8c422a48b36a556820cd Mon Sep 17 00:00:00 2001 From: Alexis Date: Fri, 13 Feb 2015 23:48:22 -0500 Subject: [PATCH 0095/1195] extend upgroid --- text/0000-embrace-extend-extinguish.md | 96 ++++++++++++++++++++++++++ 1 file changed, 96 insertions(+) create mode 100644 text/0000-embrace-extend-extinguish.md diff --git a/text/0000-embrace-extend-extinguish.md b/text/0000-embrace-extend-extinguish.md new file mode 100644 index 00000000000..02423697660 --- /dev/null +++ b/text/0000-embrace-extend-extinguish.md @@ -0,0 +1,96 @@ +- Feature Name: embrace-extend-extinguish +- Start Date: 2015-02-13 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Extend the Extend trait to take IntoIterator, and make all collections +`impl<'a, T: Clone> Extend<&'a T>`. This enables both `vec.extend(&[1, 2, 3])`, and +`vec.extend(&hash_set)`. This provides a more expressive replacement for +`Vec::push_all` with literally no ergonomic loss, while leveraging established APIs. + +# Motivation + +Vec::push_all is kinda random and specific. Partially motivated by performance concerns, +but largely just "nice" to not have to do something like +`vec.extend([1, 2, 3].iter().cloned())`. The performance argument falls flat +(we *must* make iterators fast, and trusted_len should get us there). The ergonomics +argument is salient, though. Working with Plain Old Data types in Rust is super annoying +because generic APIs and semantics are tailored for non-Copy types. + +Even with Extend upgraded to take IntoIterator, that won't work with &[Copy], +because a slice can't be moved out of. Collections would have to take `IntoIterator<&T>`, +and clone out of the reference. So, do exactly that. + +As a bonus, this is more expressive than `push_all`, because you can feed in any +collection by-reference to clone the data out of it. 
+ +# Detailed design + +Here's a quick hack to get this working today: + +``` +/// A type growable from an `Iterator` implementation +pub trait Extend { + fn extend, I: IntoIterator> + (&mut self, iterator: I); +} +``` + +This isn't the signature we'd like longterm, but it's what works with today's +IntoIterator and where clauses. Longterm (like, tomorrow) this should work: + +``` +/// A type growable from an `Iterator` implementation +pub trait Extend { + fn extend>(&mut self, iterator: I); +} +``` + +And here's usage: + +``` +use std::iter::IntoIterator; + +impl<'a, T: Clone> Extend<&'a T> for Vec { + fn extend, I: IntoIterator> + (&mut self, iterator: I){ + self.extend(iterator.into_iter().cloned()) + } +} + + +fn main() { + let mut foo = vec![1]; + foo.extend(&[1, 2, 3, 4]); + let bar = vec![1, 2, 3]; + foo.extend(&bar); + foo.extend(bar.iter()); + + println!("{:?}", foo); +} +``` + +# Drawbacks + +Mo' generics, mo' magic. How you gonna discover it? + +"hidden" clones? + +# Alternatives + +Nope. + +# Unresolved questions + +FromIterator could also be extended in the same manner, but this is less useful for +two reasons: + +* FromIterator is always called by calling `collect`, and IntoIterator doesn't really +"work" right in `self` position. +* Introduces ambiguities in some cases. What is `let foo: Vec<_> = [1, 2, 3].iter().collect()`? + +Of course, context might disambiguate in many cases, and +`let foo: Vec = [1, 2, 3].iter().collect()` might still be nicer than +`let foo: Vec<_> = [1, 2, 3].iter().cloned().collect()`.
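
For comparison with the usage example in this patch, the sketch below exercises by-reference `extend` using what current `std` provides. One assumption to flag: today's standard library implements `Extend<&T>` for `Vec<T>` with a `T: Copy` bound rather than the `T: Clone` bound proposed here, so the example sticks to `i32`. The function name is invented for illustration.

```rust
/// Grows a vector from several by-reference sources without writing
/// `.iter().cloned()` at each call site.
fn extend_by_reference() -> Vec<i32> {
    let mut foo: Vec<i32> = vec![1];
    foo.extend(&[2, 3, 4]); // reference to an array
    let bar = vec![5, 6];
    foo.extend(&bar); // reference to another Vec
    foo.extend(bar.iter()); // a plain iterator of references
    foo
}
```

All three calls copy the elements out of the borrowed source, which is the "hidden clones" concern raised under Drawbacks. With `Copy` types the cost is small, but it is still not visible at the call site.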
From c5394124af5e07e69c05a2b5f7fa17c20d02512c Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Fri, 13 Feb 2015 23:53:55 -0500 Subject: [PATCH 0096/1195] Scare quotes make me feel good --- text/0000-embrace-extend-extinguish.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-embrace-extend-extinguish.md b/text/0000-embrace-extend-extinguish.md index 02423697660..c0cb6f6b762 100644 --- a/text/0000-embrace-extend-extinguish.md +++ b/text/0000-embrace-extend-extinguish.md @@ -76,7 +76,7 @@ fn main() { Mo' generics, mo' magic. How you gonna discover it? -"hidden" clones? +Hidden clones? # Alternatives From 8c7895d9f80dd9c5f7afdb7b93d7e3fedf161413 Mon Sep 17 00:00:00 2001 From: Richard Zhang Date: Sat, 14 Feb 2015 13:03:08 +0800 Subject: [PATCH 0097/1195] Promote `isize/usize` to be the main candidate for literal suffixes. --- text/0544-rename-int-uint.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/text/0544-rename-int-uint.md b/text/0544-rename-int-uint.md index 77f883960f4..0ee1928d17b 100644 --- a/text/0544-rename-int-uint.md +++ b/text/0544-rename-int-uint.md @@ -46,7 +46,7 @@ However, given the discussions about the previous revisions of this RFC, and the # Detailed Design -- Rename `int/uint` to `isize/usize`, with `isz/usz` being their literal suffixes, respectively. +- Rename `int/uint` to `isize/usize`, with them being their own literal suffixes. - Update code and documentation to use pointer-sized integers more narrowly for their intended purposes. Provide a deprecation period to carry out these updates. `usize` in action: @@ -62,7 +62,7 @@ There are different opinions about which literal suffixes to use. The following ### `isize/usize`: * Pros: They are the same as the type names, very consistent with the rest of the integer primitives. -* Cons: They are too long for some, and may stand out too much as suffixes. +* Cons: They are too long for some, and may stand out too much as suffixes. 
However, discouraging people from overusing `isize/usize` is the point of this RFC. And if they are not overused, then this will not be a problem in practice. ### `is/us`: @@ -72,9 +72,11 @@ There are different opinions about which literal suffixes to use. The following Note: No matter which suffixes get chosen, it can be beneficial to reserve `is` as a keyword, but this is outside the scope of this RFC. ### `iz/uz`: + * Pros and cons: Similar to those of `is/us`, except that `iz/uz` are not actual words, which is an additional advantage. However it may not be immediately clear that `iz/uz` are abbreviations of `isize/usize`. ### `i/u`: + * Pros: They are very succinct. * Cons: They are *too* succinct and carry the "default integer types" connotation, which is undesirable. @@ -83,7 +85,7 @@ Note: No matter which suffixes get chosen, it can be beneficial to reserve `is` * Pros: They are the middle grounds between `isize/usize` and `is/us`, neither too long nor too short. They are not actual English words and it's clear that they are short for `isize/usize`. * Cons: Not everyone likes the appearances of `isz/usz`, but this can be said about all the candidates. -Thus, this author believes that `isz/usz` are the best choices here. +After community discussions, it is deemed that using `isize/usize` directly as suffixes is a fine choice and there is no need to introduce other suffixes. ## Advantages of `isize/usize`: From 0c2cb653b73deedf6ae86f3672cbf5a6c5b2d9c2 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Sat, 14 Feb 2015 07:18:45 +0200 Subject: [PATCH 0098/1195] Added a fitting quote --- text/0000-no-panic-in-c-string.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/text/0000-no-panic-in-c-string.md b/text/0000-no-panic-in-c-string.md index ddc9a2988ca..aa25229f986 100644 --- a/text/0000-no-panic-in-c-string.md +++ b/text/0000-no-panic-in-c-string.md @@ -10,6 +10,14 @@ these functions return `Result` instead. 
# Motivation +> As I shivered and brooded on the casting of that brain-blasting shadow, +> I knew that I had at last pried out one of earth’s supreme horrors—one of +> those nameless blights of outer voids whose faint daemon scratchings we +> sometimes hear on the farthest rim of space, yet from which our own finite +> vision has given us a merciful immunity. +> +> — H. P. Lovecraft, The Lurking Fear + Currently the functions that produce `std::ffi::CString` out of Rust byte strings panic when the input contains NUL bytes. As strings containing NULs are not commonly seen in real-world usage, it is easy for developers to From e4b047c3bedd0a524bc09c29e333edbd0afc0df4 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Sat, 14 Feb 2015 08:05:31 +0200 Subject: [PATCH 0099/1195] c_string now has CStr as a pseudo-DST --- text/0000-c-str-deref.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index 6fa51dc0a11..0a9113c26cd 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -134,9 +134,7 @@ in favor of composition of the functions described above: ## Proof of concept The described interface changes are implemented in crate -[c_string](https://github.com/mzabaluev/rust-c-str), with a difference -that the `CStr` token type has a bogus static size, as a compromise to -offer better performance in current Rust. +[c_string](https://github.com/mzabaluev/rust-c-str). 
# Drawbacks From 10ad3a23d9422d5f98ab128190e9376ad27228c9 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Sun, 15 Feb 2015 06:47:08 +0200 Subject: [PATCH 0100/1195] Addressed the alternative of adding functions --- text/0000-no-panic-in-c-string.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/text/0000-no-panic-in-c-string.md b/text/0000-no-panic-in-c-string.md index aa25229f986..f05ed669d01 100644 --- a/text/0000-no-panic-in-c-string.md +++ b/text/0000-no-panic-in-c-string.md @@ -79,6 +79,12 @@ apparent in the code. # Alternatives +Non-panicky functions can be added alongside the existing functions, e.g., +as `from_slice_failing`. Adding new functions complicates the API where little +reason for that exists; composition is preferred to adding function variants. +Longer function names, together with a less convenient return value, may deter +people from using the safer functions. + If the panicky behavior is preserved, plentiful possibilities for DoS attacks and other unforeseen failures in the field may be introduced by code oblivious to the input constraints. From 3b5abbfd05049b38dce501f862bcf37b71ad587e Mon Sep 17 00:00:00 2001 From: Sean Patrick Santos Date: Sun, 15 Feb 2015 16:03:20 -0700 Subject: [PATCH 0101/1195] Update RFC 195 for RFC 246. --- text/0195-associated-items.md | 31 ++++++++++++++++++------------- 1 file changed, 18 insertions(+), 13 deletions(-) diff --git a/text/0195-associated-items.md b/text/0195-associated-items.md index 45f97efe0fb..1b48e859247 100644 --- a/text/0195-associated-items.md +++ b/text/0195-associated-items.md @@ -9,6 +9,7 @@ more convenient, scalable, and powerful. In particular, traits will consist of a set of methods, together with: * Associated functions (already present as "static" functions) +* Associated consts * Associated statics * Associated types * Associated lifetimes @@ -173,7 +174,7 @@ provide a distinct `impl` for every member of this family. 
Associated types, lifetimes, and functions can already be expressed in today's Rust, though it is unwieldy to do so (as argued above). -But associated _statics_ cannot be expressed using today's traits. +But associated _consts_ and _statics_ cannot be expressed using today's traits. For example, today's Rust includes a variety of numeric traits, including `Float`, which must currently expose constants as static functions: @@ -191,19 +192,19 @@ trait Float { ``` Because these functions cannot be used in static initializers, the modules for -float types _also_ export a separate set of constants as statics, not using +float types _also_ export a separate set of constants as consts, not using traits. -Associated constants would allow the statics to live directly on the traits: +Associated constants would allow the consts to live directly on the traits: ```rust trait Float { - static NAN: Self; - static INFINITY: Self; - static NEG_INFINITY: Self; - static NEG_ZERO: Self; - static PI: Self; - static TWO_PI: Self; + const NAN: Self; + const INFINITY: Self; + const NEG_INFINITY: Self; + const NEG_ZERO: Self; + const PI: Self; + const TWO_PI: Self; ... } ``` @@ -282,13 +283,14 @@ distinction" below. ## Trait bodies: defining associated items -Trait bodies are expanded to include three new kinds of items: statics, types, -and lifetimes: +Trait bodies are expanded to include four new kinds of items: consts, statics, +types, and lifetimes: ``` TRAIT = TRAIT_HEADER '{' TRAIT_ITEM* '}' TRAIT_ITEM = ... + | 'const' IDENT ':' TYPE [ '=' CONST_EXP ] ';' | 'static' IDENT ':' TYPE [ '=' CONST_EXP ] ';' | 'type' IDENT [ ':' BOUNDS ] [ WHERE_CLAUSE ] [ '=' TYPE ] ';' | 'lifetime' LIFETIME_IDENT ';' @@ -352,7 +354,7 @@ external to the trait. ### Defaults -Notice that associated statics and types both permit defaults, just as trait +Notice that associated consts, statics, and types permit defaults, just as trait methods and functions can provide defaults. 
Defaults are useful both as a code reuse mechanism, and as a way to expand the @@ -430,6 +432,7 @@ lifetime items are allowed: ``` IMPL_ITEM = ... + | 'const' IDENT ':' TYPE '=' CONST_EXP ';' | 'static' IDENT ':' TYPE '=' CONST_EXP ';' | 'type' IDENT' '=' 'TYPE' ';' | 'lifetime' LIFETIME_IDENT '=' LIFETIME_REFERENCE ';' @@ -767,7 +770,8 @@ as UFCS-style functions: trait Foo { type AssocType; lifetime 'assoc_lifetime; - static ASSOC_STATIC: uint; + const ASSOC_CONST: uint; + static ASSOC_STATIC: &'static [uint, ..1024]; fn assoc_fn() -> Self; // Note: 'assoc_lifetime and AssocType in scope: @@ -776,6 +780,7 @@ trait Foo { fn default_method(&self) -> uint { // method in scope UFCS-style, assoc_fn in scope let _ = method(self, assoc_fn()); + ASSOC_CONST // in scope ASSOC_STATIC // in scope } } From aa384c92b278696aa7aa2f35612c1a063e59657c Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Tue, 17 Feb 2015 18:17:32 +1100 Subject: [PATCH 0102/1195] RFC to lex binary and octal literals more eagerly. --- text/0000-small-base-lexing.md | 97 ++++++++++++++++++++++++++++++++++ 1 file changed, 97 insertions(+) create mode 100644 text/0000-small-base-lexing.md diff --git a/text/0000-small-base-lexing.md b/text/0000-small-base-lexing.md new file mode 100644 index 00000000000..9f6aaa4e4b1 --- /dev/null +++ b/text/0000-small-base-lexing.md @@ -0,0 +1,97 @@ +- Feature Name: stable, it restricts the language +- Start Date: 2015-02-17 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Lex binary and octal literals as if they were decimal. + +# Motivation + +Lexing all digits (even ones not valid in the given base) allows for +improved error messages & future proofing (this is more conservative +than the current approach) and less confusion, with little downside. + +Currently, the lexer stop lexing a binary and octal literal (`0b10` +and `0o12345670`) as soon as it sees an invalid digit (2-9 or 8-9 +respectively), and start lexing a new token, e.g. 
`0b0123` is two +tokens, `0b01` and `23`. Writing such a thing in normal code gives a +strange error message: + +```rust +:2:9: 2:11 error: expected one of `.`, `;`, `}`, or an operator, found `23` +:2 0b0123 + ^~ +``` + +However, it is valid to write such a thing in a macro (e.g. using the +`tt` non-terminal), and thus lexing the adjacent digits as two tokens +can lead to unexpected behaviour. + +```rust +macro_rules! expr { ($e: expr) => { $e } } + +macro_rules! add { + ($($token: tt)*) => { + 0 $(+ expr!($token))* + } +} +fn main() { + println!("{}", add!(0b0123)); +} +``` + +prints `24` (`add` expands to `0 + 0b01 + 23`). + +It would be much nicer for both cases to print an error like: + +```rust +error: found invalid digit `2` in binary literal +0b0123 + ^ +``` + +Code that wants two tokens can opt in to it by `0b01 23`, for +example. This is easy to write, and expresses the intent more clearly +anyway. + +# Detailed design + +The grammar that the lexer uses becomes + +``` +(0b[0-9]+ | 0o[0-9]+ | [0-9]+ | 0x[0-9a-fA-F]+) suffix +``` + +instead of just `[01]` and `[0-7]` for the first two, respectively. + +However, it is always an error (in the lexer) to have invalid digits +in a numeric literal beginning with `0b` or `0o`. In particular, even +a macro invocation like + +```rust +macro_rules! ignore { ($($_t: tt)*) => { {} } } + +ignore!(0b0123) +``` + +is an error even though it doesn't use the tokens. + + +# Drawbacks + +This adds a slightly peculiar special case, that is somewhat unique to +Rust. On the other hand, most languages do not expose the lexical +grammar so directly, and so have more freedom in this respect. That +is, in many languages it is indistinguishable if `0b1234` is one or +two tokens: it is *always* an error either way. + + +# Alternatives + +Not do it, obviously. + +# Unresolved questions + +None. 
From 805f896b228664a9d3cedd3de5e04279c4cf1f0a Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Tue, 17 Feb 2015 18:31:01 +1100 Subject: [PATCH 0103/1195] Revert "RFC to lex binary and octal literals more eagerly." This reverts commit aa384c92b278696aa7aa2f35612c1a063e59657c. --- text/0000-small-base-lexing.md | 97 ---------------------------------- 1 file changed, 97 deletions(-) delete mode 100644 text/0000-small-base-lexing.md diff --git a/text/0000-small-base-lexing.md b/text/0000-small-base-lexing.md deleted file mode 100644 index 9f6aaa4e4b1..00000000000 --- a/text/0000-small-base-lexing.md +++ /dev/null @@ -1,97 +0,0 @@ -- Feature Name: stable, it restricts the language -- Start Date: 2015-02-17 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) - -# Summary - -Lex binary and octal literals as if they were decimal. - -# Motivation - -Lexing all digits (even ones not valid in the given base) allows for -improved error messages & future proofing (this is more conservative -than the current approach) and less confusion, with little downside. - -Currently, the lexer stop lexing a binary and octal literal (`0b10` -and `0o12345670`) as soon as it sees an invalid digit (2-9 or 8-9 -respectively), and start lexing a new token, e.g. `0b0123` is two -tokens, `0b01` and `23`. Writing such a thing in normal code gives a -strange error message: - -```rust -:2:9: 2:11 error: expected one of `.`, `;`, `}`, or an operator, found `23` -:2 0b0123 - ^~ -``` - -However, it is valid to write such a thing in a macro (e.g. using the -`tt` non-terminal), and thus lexing the adjacent digits as two tokens -can lead to unexpected behaviour. - -```rust -macro_rules! expr { ($e: expr) => { $e } } - -macro_rules! add { - ($($token: tt)*) => { - 0 $(+ expr!($token))* - } -} -fn main() { - println!("{}", add!(0b0123)); -} -``` - -prints `24` (`add` expands to `0 + 0b01 + 23`). 
- -It would be much nicer for both cases to print an error like: - -```rust -error: found invalid digit `2` in binary literal -0b0123 - ^ -``` - -Code that wants two tokens can opt in to it by `0b01 23`, for -example. This is easy to write, and expresses the intent more clearly -anyway. - -# Detailed design - -The grammar that the lexer uses becomes - -``` -(0b[0-9]+ | 0o[0-9]+ | [0-9]+ | 0x[0-9a-fA-F]+) suffix -``` - -instead of just `[01]` and `[0-7]` for the first two, respectively. - -However, it is always an error (in the lexer) to have invalid digits -in a numeric literal beginning with `0b` or `0o`. In particular, even -a macro invocation like - -```rust -macro_rules! ignore { ($($_t: tt)*) => { {} } } - -ignore!(0b0123) -``` - -is an error even though it doesn't use the tokens. - - -# Drawbacks - -This adds a slightly peculiar special case, that is somewhat unique to -Rust. On the other hand, most languages do not expose the lexical -grammar so directly, and so have more freedom in this respect. That -is, in many languages it is indistinguishable if `0b1234` is one or -two tokens: it is *always* an error either way. - - -# Alternatives - -Not do it, obviously. - -# Unresolved questions - -None. From 67ef390abf8a412deb867fd18989f5d432d8ee88 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Tue, 17 Feb 2015 18:17:32 +1100 Subject: [PATCH 0104/1195] RFC to lex binary and octal literals more eagerly. 
--- text/0000-small-base-lexing.md | 101 +++++++++++++++++++++++++++++++++ 1 file changed, 101 insertions(+) create mode 100644 text/0000-small-base-lexing.md diff --git a/text/0000-small-base-lexing.md b/text/0000-small-base-lexing.md new file mode 100644 index 00000000000..65d4895c9c8 --- /dev/null +++ b/text/0000-small-base-lexing.md @@ -0,0 +1,101 @@ +- Feature Name: stable, it only restricts the language +- Start Date: 2015-02-17 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Lex binary and octal literals as if they were decimal. + +# Motivation + +Lexing all digits (even ones not valid in the given base) allows for +improved error messages & future proofing (this is more conservative +than the current approach) and less confusion, with little downside. + +Currently, the lexer stops lexing binary and octal literals (`0b10` and +`0o12345670`) as soon as it sees an invalid digit (2-9 or 8-9 +respectively), and immediately starts lexing a new token, +e.g. `0b0123` is two tokens, `0b01` and `23`. Writing such a thing in +normal code gives a strange error message: + +```rust +:2:9: 2:11 error: expected one of `.`, `;`, `}`, or an operator, found `23` +:2 0b0123 + ^~ +``` + +However, it is valid to write such a thing in a macro (e.g. using the +`tt` non-terminal), and thus lexing the adjacent digits as two tokens +can lead to unexpected behaviour. + +```rust +macro_rules! expr { ($e: expr) => { $e } } + +macro_rules! add { + ($($token: tt)*) => { + 0 $(+ expr!($token))* + } +} +fn main() { + println!("{}", add!(0b0123)); +} +``` + +prints `24` (`add` expands to `0 + 0b01 + 23`). + +It would be nicer for both cases to print an error like: + +```rust +error: found invalid digit `2` in binary literal +0b0123 + ^ +``` + +(The non-macro case could be handled by detecting this pattern in the +lexer and special casing the message, but this does not handle the +macro case.) 
+ +Code that wants two tokens can opt in to it by `0b01 23`, for +example. This is easy to write, and expresses the intent more clearly +anyway. + +# Detailed design + +The grammar that the lexer uses becomes + +``` +(0b[0-9]+ | 0o[0-9]+ | [0-9]+ | 0x[0-9a-fA-F]+) suffix +``` + +instead of just `[01]` and `[0-7]` for the first two, respectively. + +However, it is always an error (in the lexer) to have invalid digits +in a numeric literal beginning with `0b` or `0o`. In particular, even +a macro invocation like + +```rust +macro_rules! ignore { ($($_t: tt)*) => { {} } } + +ignore!(0b0123) +``` + +is an error even though it doesn't use the tokens. + + +# Drawbacks + +This adds a slightly peculiar special case, that is somewhat unique to +Rust. On the other hand, most languages do not expose the lexical +grammar so directly, and so have more freedom in this respect. That +is, in many languages it is indistinguishable if `0b1234` is one or +two tokens: it is *always* an error either way. + + +# Alternatives + +Don't do it, obviously. + +# Unresolved questions + +None. From 3989c5a0e18e57d578e069c21d79c52bc366d68c Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Tue, 17 Feb 2015 19:10:15 +1100 Subject: [PATCH 0105/1195] Add suffix alternative. --- text/0000-small-base-lexing.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/text/0000-small-base-lexing.md b/text/0000-small-base-lexing.md index 65d4895c9c8..2936b7e6b79 100644 --- a/text/0000-small-base-lexing.md +++ b/text/0000-small-base-lexing.md @@ -96,6 +96,11 @@ two tokens: it is *always* an error either way. Don't do it, obviously. +Consider `0b123` to just be `0b1` with a suffix of `23`, and this is +an error or not depending if a suffix of `23` is valid. Handling this +uniformly would require `"foo"123` and `'a'123` also being lexed as a +single token. (Which may be a good idea anyway.) + # Unresolved questions None. 
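The rule in the detailed design (consume every decimal digit after `0b` or `0o`, then reject digits that do not belong to the base) can be sketched as a small standalone function. This is only an illustration under stated assumptions: `lex_small_base` and `InvalidDigit` are hypothetical names, the literal suffix is ignored, and none of this is the compiler's actual lexer code.

```rust
/// Error for a digit that is lexed as part of the literal but is not
/// valid in the literal's base. (Hypothetical type for this sketch.)
#[derive(Debug, PartialEq)]
struct InvalidDigit {
    digit: char,
    offset: usize, // byte offset of the offending digit within the input
}

/// Lexes a leading `0b`/`0o` literal from `src`.
///
/// Returns `None` if `src` does not start with such a prefix followed by at
/// least one decimal digit. Otherwise returns the literal text and the rest
/// of the input, or an error for the first digit that is invalid in the base.
fn lex_small_base(src: &str) -> Option<Result<(&str, &str), InvalidDigit>> {
    let radix = if src.starts_with("0b") {
        2
    } else if src.starts_with("0o") {
        8
    } else {
        return None;
    };

    // Consume *all* decimal digits, valid in this base or not: `0b[0-9]+`.
    let digits_len = src[2..].bytes().take_while(|b| b.is_ascii_digit()).count();
    if digits_len == 0 {
        return None;
    }
    let end = 2 + digits_len;

    // A digit outside the base is an error in the lexer itself, so the
    // input is never split into two tokens.
    for (i, c) in src[2..end].char_indices() {
        if c.to_digit(radix).is_none() {
            return Some(Err(InvalidDigit { digit: c, offset: 2 + i }));
        }
    }
    Some(Ok((&src[..end], &src[end..])))
}
```

With this shape, `0b0123` is rejected at the `2`, while `0b01 23` still lexes `0b01` and leaves ` 23` for the next token, matching the opt-in described in the Motivation.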
From 82e9e2c0448e41517a5f734a7601e254125e4bfd Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Tue, 17 Feb 2015 20:42:11 +0100 Subject: [PATCH 0106/1195] Updated RFC with the current protype design of the traits --- text/0000-string-patterns.md | 137 ++++++++++++++++++++--------------- 1 file changed, 80 insertions(+), 57 deletions(-) diff --git a/text/0000-string-patterns.md b/text/0000-string-patterns.md index c9e5535a209..e74fac67c81 100644 --- a/text/0000-string-patterns.md +++ b/text/0000-string-patterns.md @@ -24,9 +24,9 @@ This presents a couple of issues: - The API is inconsistent. - The API duplicates similar operations on different types. (`contains` vs `contains_char`) -- The API does not provide all operations for all types. (No `rsplit` for `&str` patterns) +- The API does not provide all operations for all types. (For example, no `rsplit` for `&str` patterns) - The API is not extensible, eg to allow splitting at regex matches. -- The API offers no way to statically decide between different basic search algorithms +- The API offers no way to explicitly decide between different search algorithms for the same pattern, for example to use Boyer-Moore string searching. At the moment, the full set of relevant string methods roughly looks like this: @@ -79,24 +79,24 @@ First, new traits will be added to the `str` module in the std library: ```rust trait Pattern<'a> { - type MatcherImpl: Matcher<'a>; + type Searcher: Searcher<'a>; + fn into_matcher(self, haystack: &'a str) -> Self::Searcher; - fn into_matcher(self, haystack: &'a str) -> Self::MatcherImpl; - - // Can be implemented to optimize the "find only" case. 
- fn is_contained_in(self, haystack: &'a str) -> bool { - self.into_matcher(s).next_match().is_some() - } + fn is_contained_in(self, haystack: &'a str) -> bool { /* default*/ } + fn match_starts_at(self, haystack: &'a str, idx: usize) -> bool { /* default*/ } + fn match_ends_at(self, haystack: &'a str, idx: usize) -> bool + where Self::Searcher: ReverseSearcher<'a> { /* default*/ } } ``` A `Pattern` represents a builder for an associated type implementing a -family of `Matcher` traits (see below), and will be implemented by all types that +family of `Searcher` traits (see below), and will be implemented by all types that represent string patterns, which includes: -- `char` and `&str` -- Everything implementing `CharEq` -- Additional types like `&Regex` or `Ascii` +- `&str` +- `char`, and everything else implementing `CharEq` +- Third party types like `&Regex` or `Ascii` +- Alternative algorithm wrappers like `struct BoyerMoore(&str)` ```rust impl<'a> Pattern<'a> for char { /* ... */ } @@ -112,51 +112,62 @@ The lifetime parameter on `Pattern` exists in order to allow threading the lifet of the haystack (the string to be searched through) through the API, and is a workaround for not having associated higher kinded types yet. 
-Consumers of this API can then call `into_matcher()` on the pattern to convert it into -a type implementing a family of `Matcher` traits: +Consumers of this API can then call `into_searcher()` on the pattern to convert it into +a type implementing a family of `Searcher` traits: ```rust -unsafe trait Matcher<'a> { - fn haystack(&self) -> &'a str - fn next_match(&mut self) -> Option<(uint, uint)>; +pub enum SearchStep { + Match(usize, usize), + Reject(usize, usize), + Done } +pub unsafe trait Searcher<'a> { + fn haystack(&self) -> &'a str; + fn next(&mut self) -> SearchStep; -unsafe trait ReverseMatcher<'a>: Matcher<'a> { - fn next_match_back(&mut self) -> Option<(uint, uint)>; + fn next_match(&mut self) -> Option<(usize, usize)> { /* default*/ } + fn next_reject(&mut self) -> Option<(usize, usize)> { /* default*/ } } +pub unsafe trait ReverseSearcher<'a>: Searcher<'a> { + fn next_back(&mut self) -> SearchStep; -trait DoubleEndedMatcher<'a>: ReverseMatcher<'a> {} + fn next_match_back(&mut self) -> Option<(usize, usize)> { /* default*/ } + fn next_reject_back(&mut self) -> Option<(usize, usize)> { /* default*/ } +} +pub trait DoubleEndedSearcher<'a>: ReverseSearcher<'a> {} ``` -The basic idea of a `Matcher` is to expose a `Iterator`-like interface for -iterating through all matches of a pattern in the given haystack. +The basic idea of a `Searcher` is to expose an interface for +iterating through all connected string fragments of the haystack while classifying them as either a match or a reject. -Similar to iterators, depending on the concrete implementation a matcher can have +This happens in the form of the returned enum value. A `Match` needs to contain the start and end indices of a complete non-overlapping match, while a `Reject` may be emitted for arbitrary non-overlapping rejected parts of the string, as long as the start and end indices lie on valid utf8 boundaries. 
+ +Similar to iterators, depending on the concrete implementation a searcher can have additional capabilities that build on each other, which is why they will be defined in terms of a three-tier hierarchy: -- `Matcher<'a>` is the basic trait that all matchers need to implement. - It contains a `next_match()` method that returns the `start` and `end` indices of - the next non-overlapping match in the haystack, with the search beginning at the front +- `Searcher<'a>` is the basic trait that all searchers need to implement. + It contains a `next()` method that returns the `start` and `end` indices of + the next match or reject in the haystack, with the search beginning at the front (left) of the string. It also contains a `haystack()` getter for returning the actual haystack, which is the source of the `'a` lifetime on the hierarchy. The reason for this getter being made part of the trait is twofold: - - Every matcher needs to store some reference to the haystack anyway. + - Every searcher needs to store some reference to the haystack anyway. - Users of this trait will need access to the haystack in order for the individual match results to be useful. -- `ReverseMatcher<'a>` adds an `next_match_back` method, for also allowing to efficiently - search for matches in reverse (starting from the right). +- `ReverseSearcher<'a>` adds an `next_back()` method, for also allowing to efficiently + search in reverse (starting from the right). However, the results are not required to be equal to the results of - `next_match` in reverse, (as would be the case for the `DoubleEndedIterator` trait) - as that can not be efficiently guaranteed for all matchers. 
(For an example, see further below) -- Instead `DoubleEndedMatcher<'a>` is provided as an marker trait for expressing - that guarantee - If a matcher implements this trait, all results found from the + `next()` in reverse, (as would be the case for the `DoubleEndedIterator` trait) + because that can not be efficiently guaranteed for all searchers. (For an example, see further below) +- Instead `DoubleEndedSearcher<'a>` is provided as a marker trait for expressing + that guarantee - If a searcher implements this trait, all results found from the left need to be equal to all results found from the right in reverse order. As an important last detail, both -`Matcher` and `ReverseMatcher` are marked as `unsafe` traits, even though the actual methods +`Searcher` and `ReverseSearcher` are marked as `unsafe` traits, even though the actual methods aren't. This is because every implementation of these traits need to ensure that all -indices returned by `next_match` and `next_match_back` lay on valid utf8 boundaries +indices returned by `next()` and `next_back()` lie on valid utf8 boundaries in the haystack. Without that guarantee, every single match returned by a matcher would need to be @@ -171,6 +182,15 @@ Given that most implementations of these traits will likely live in the std library anyway, and are thoroughly tested, marking these traits `unsafe` doesn't seem like a huge burden to bear for good, optimizable performance. +### The role of the additional default methods + +`Pattern`, `Searcher` and `ReverseSearcher` each offer a few additional +default methods that give better optimization opportunities. + +Most consumers of the pattern API will use them to more narrowly constrain +how they are looking for a pattern, which given an optimized implementation, +should lead to mostly optimal code being generated. + ### Example for the issue with double-ended searching Let the haystack be the string `"fooaaaaabar"`, and let the pattern be the string `"aa"`. 
@@ -190,10 +210,11 @@ be considered a different operation than "matching from the back". ### Why `(uint, uint)` instead of `&str` -It would be possible to define `next_match` and `next_match_back` to return an `&str` -to the match instead of `(uint, uint)`. +> Note: This section is a bit outdated now -A concrete matcher impl could then make use of unsafe code to construct such an slice cheaply, +It would be possible to define `next` and `next_back` to return `&str`s instead of `(uint, uint)` tuples. + +A concrete searcher impl could then make use of unsafe code to construct such an slice cheaply, and by its very nature it is guaranteed to lie on utf8 boundaries, which would also allow not marking the traits as unsafe. @@ -224,7 +245,7 @@ as the "simple" default design. ## New methods on `StrExt` -With the `Pattern` and `Matcher` traits defined and implemented, the actual `str` +With the `Pattern` and `Searcher` traits defined and implemented, the actual `str` methods will be changed to make use of them: ```rust @@ -245,17 +266,17 @@ pub trait StrExt for ?Sized { fn starts_with<'a, P>(&'a self, pat: P) -> bool where P: Pattern<'a>; fn ends_with<'a, P>(&'a self, pat: P) -> bool where P: Pattern<'a>, - P::MatcherImpl: ReverseMatcher<'a>; + P::Searcher: ReverseSearcher<'a>; fn trim_matches<'a, P>(&'a self, pat: P) -> &'a str where P: Pattern<'a>, - P::MatcherImpl: ReverseMatcher<'a>; + P::Searcher: DoubleEndedSearcher<'a>; fn trim_left_matches<'a, P>(&'a self, pat: P) -> &'a str where P: Pattern<'a>; fn trim_right_matches<'a, P>(&'a self, pat: P) -> &'a str where P: Pattern<'a>, - P::MatcherImpl: ReverseMatcher<'a>; + P::Searcher: ReverseSearcher<'a>; fn find<'a, P>(&'a self, pat: P) -> Option where P: Pattern<'a>; fn rfind<'a, P>(&'a self, pat: P) -> Option where P: Pattern<'a>, - P::MatcherImpl: ReverseMatcher<'a>; + P::Searcher: ReverseSearcher<'a>; // ... } @@ -278,7 +299,7 @@ changed to uniformly use the new pattern API. 
The main differences are: to behave like a double ended queues where you just pop elements from both sides. _However_, all iterators will still implement `DoubleEndedIterator` if the underlying -matcher implements `DoubleEndedMatcher`, to keep the ability to do things like `foo.split('a').rev()`. +matcher implements `DoubleEndedSearcher`, to keep the ability to do things like `foo.split('a').rev()`. ## Transition and deprecation plans @@ -288,7 +309,7 @@ methods will still compile, or give deprecation warning. It would even be possible to generically implement `Pattern` for all `CharEq` types, making the transition more painless. -Long-term, post 1.0, it would be possible to define new sets of `Pattern` and `Matcher` +Long-term, post 1.0, it would be possible to define new sets of `Pattern` and `Searcher` without a lifetime parameter by making use of higher kinded types in order to simplify the string APIs. Eg, instead of `fn starts_with<'a, P>(&'a self, pat: P) -> bool where P: Pattern<'a>;` you'd have `fn starts_with

(&self, pat: P) -> bool where P: Pattern;`. @@ -298,30 +319,30 @@ forward to the old traits, which would roughly look like this: ```rust unsafe trait NewPattern { - type MatcherImpl<'a> where MatcherImpl: NewMatcher; + type Searcher<'a> where Searcher: NewSearcher; - fn into_matcher<'a>(self, s: &'a str) -> Self::MatcherImpl<'a>; + fn into_matcher<'a>(self, s: &'a str) -> Self::Searcher<'a>; } unsafe impl<'a, P> Pattern<'a> for P where P: NewPattern { - type MatcherImpl = ::MatcherImpl<'a>; + type Searcher = ::Searcher<'a>; - fn into_matcher(self, haystack: &'a str) -> Self::MatcherImpl { + fn into_matcher(self, haystack: &'a str) -> Self::Searcher { ::into_matcher(self, haystack) } } -unsafe trait NewMatcher for Self<'_> { +unsafe trait NewSearcher for Self<'_> { fn haystack<'a>(self: &Self<'a>) -> &'a str; fn next_match<'a>(self: &mut Self<'a>) -> Option<(uint, uint)>; } -unsafe impl<'a, M> Matcher<'a> for M<'a> where M: NewMatcher { +unsafe impl<'a, M> Searcher<'a> for M<'a> where M: NewSearcher { fn haystack(&self) -> &'a str { - ::haystack(self) + ::haystack(self) } fn next_match(&mut self) -> Option<(uint, uint)> { - ::next_match(self) + ::next_match(self) } } ``` @@ -346,6 +367,8 @@ the `prelude` (which would be unneeded anyway). # Alternatives +> Note: This section is not updated to the new naming scheme + In general: - Keep status quo, with all issues listed at the beginning. @@ -371,8 +394,8 @@ some negative trade-offs: for immediate results. - Extend `Pattern` into `Pattern` and `ReversePattern`, starting the forward-reverse split at the level of patterns directly. 
The two would still be in a inherits-from relationship like - `Matcher` and `ReverseMatcher`, and be interchangeable if the later also implement `DoubleEndedMatcher`, - but on the `str` API where clauses like `where P: Pattern<'a>, P::MatcherImpl: ReverseMatcher<'a>` + `Matcher` and `ReverseSearcher`, and be interchangeable if the later also implement `DoubleEndedSearcher`, + but on the `str` API where clauses like `where P: Pattern<'a>, P::Searcher: ReverseSearcher<'a>` would turn into `where P: ReversePattern<'a>`. Lastly, there are alternatives that don't seem very favorable, but are listed for completeness sake: @@ -400,7 +423,7 @@ Lastly, there are alternatives that don't seem very favorable, but are listed fo - Should the API split in regard to forward-reverse matching be as symmetrical as possible, or as minimal as possible? In the first case, iterators like `Matches` and `RMatches` could both implement `DoubleEndedIterator` if a - `DoubleEndedMatcher` exists, in the latter only `Matches` would, with `RMatches` only providing the + `DoubleEndedSearcher` exists, in the latter only `Matches` would, with `RMatches` only providing the minimum to support reverse operation. A ruling in favor of symmetry would also speak for the `ReversePattern` alternative. From db61898019d25f32cb91f5ab2bb5992973ee2990 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 17 Feb 2015 14:21:06 -0800 Subject: [PATCH 0107/1195] Minor updates and tweaks --- text/0000-hash-simplification.md | 40 ++++++++++++++++++-------------- 1 file changed, 23 insertions(+), 17 deletions(-) diff --git a/text/0000-hash-simplification.md b/text/0000-hash-simplification.md index c3c301c71c8..114ca0b45dc 100644 --- a/text/0000-hash-simplification.md +++ b/text/0000-hash-simplification.md @@ -5,10 +5,10 @@ # Summary -Pare back the `std::hash` module's API to match more closely what other -languages such as Java and C++ have. 
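To make the `Match`/`Reject`/`Done` protocol of the updated traits concrete, here is a minimal sketch of a forward searcher for a single `char`. It makes several assumptions: the traits are redeclared locally as stand-ins (they are not imported from the standard library), the `unsafe` marker is dropped so the sketch contains no unsafe code, and `CharSearcher`, `steps` and `first_match` are hypothetical names.

```rust
// Local stand-in for the step type described in the RFC.
#[derive(Debug, PartialEq, Clone, Copy)]
enum SearchStep {
    Match(usize, usize),
    Reject(usize, usize),
    Done,
}

// Local stand-in for the forward searcher trait (safe here by assumption).
trait Searcher<'a> {
    fn haystack(&self) -> &'a str;
    fn next(&mut self) -> SearchStep;

    // Default method: skip rejects until a match or the end is reached.
    fn next_match(&mut self) -> Option<(usize, usize)> {
        loop {
            match self.next() {
                SearchStep::Match(a, b) => return Some((a, b)),
                SearchStep::Reject(..) => continue,
                SearchStep::Done => return None,
            }
        }
    }
}

// Hypothetical searcher for a single `char` pattern.
struct CharSearcher<'a> {
    haystack: &'a str,
    needle: char,
    pos: usize, // byte offset of the next unvisited char
}

impl<'a> Searcher<'a> for CharSearcher<'a> {
    fn haystack(&self) -> &'a str {
        self.haystack
    }

    // Classifies one char per call; indices always fall on char boundaries.
    fn next(&mut self) -> SearchStep {
        match self.haystack[self.pos..].chars().next() {
            None => SearchStep::Done,
            Some(c) => {
                let start = self.pos;
                self.pos += c.len_utf8();
                if c == self.needle {
                    SearchStep::Match(start, self.pos)
                } else {
                    SearchStep::Reject(start, self.pos)
                }
            }
        }
    }
}

// Collects every step, including the final `Done`.
fn steps(haystack: &str, needle: char) -> Vec<SearchStep> {
    let mut s = CharSearcher { haystack, needle, pos: 0 };
    let mut out = Vec::new();
    loop {
        let step = s.next();
        out.push(step);
        if step == SearchStep::Done {
            return out;
        }
    }
}

// Uses the default `next_match` plus `haystack()` to recover the slice.
fn first_match(haystack: &str, needle: char) -> Option<&str> {
    let mut s = CharSearcher { haystack, needle, pos: 0 };
    let (start, end) = s.next_match()?;
    Some(&s.haystack()[start..end])
}
```

For the haystack `"a,b"` and the pattern `','`, the searcher yields a reject for `a`, a match for `,`, a reject for `b`, and then `Done`. A real implementation would be free to merge adjacent rejects into larger fragments.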
Consequently, this alteration in default -hashing strategy alters the story for DoS protection with the standard library's -`HashMap` implementation. +Pare back the `std::hash` module's API to improve ergonomics of usage and +definitions. While an alternative scheme more in line with what Java and C++ +have is considered, the current `std::hash` module will remain largely as-is +with modifications to its core traits. # Motivation @@ -156,12 +156,17 @@ The new API of `std::hash` would be: ```rust trait Hash { fn hash(&self, h: &mut H); + + fn hash_slice(data: &[Self], h: &mut H) { + for piece in data { + piece.hash(h); + } + } } trait Hasher { - type Output; fn write(&mut self, data: &[u8]); - fn finish(&self) -> Self::Output; + fn finish(&self) -> u64; fn write_u8(&mut self, i: u8) { ... } fn write_i8(&mut self, i: i8) { ... } @@ -190,8 +195,13 @@ This API is quite similar to today's API, but has a few tweaks: implies that the trait is no longer object-safe, but it is much more ergonomic to operate over generically. -> **Note**: A possible tweak would be to remove the `Output` associated type in -> favor of just always returning `usize` (or `u64`). +* The `Hash` trait now has a `hash_slice` method to hash a number of instances + of `Self` at once. This will allow optimization of the `Hash` implementation + of `&[u8]` to translate to a raw `write` as well as other various slices of + primitives. + +* The `Output` associated type was removed in favor of an explicit `u64` return + from `finish`. The purpose of this API is to continue to allow APIs to be generic over the hashing algorithm used. This would allow `HashMap` to continue to use a randomly @@ -200,12 +210,11 @@ protection, more information on this below). 
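To make the shape of the proposed `Hasher` trait concrete (a byte-oriented `write` plus a `finish` that returns `u64`), here is a minimal sketch written against the `std::hash::Hasher` trait as it exists in current Rust. The `Fnv1a` type and the `fnv_hash` helper are hypothetical names introduced only for this illustration; the RFC does not propose them, and FNV-1a is used simply because it is short, not because it offers any DoS protection.

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical 64-bit FNV-1a hasher, used only for illustration.
struct Fnv1a(u64);

impl Fnv1a {
    fn new() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    // Every other `write_*` method defaults to forwarding here.
    fn write(&mut self, data: &[u8]) {
        for &b in data {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV-1a 64-bit prime
        }
    }

    // The result is always a `u64`; there is no associated output type.
    fn finish(&self) -> u64 {
        self.0
    }
}

/// Hash any `Hash` value with the illustrative hasher (hypothetical helper).
fn fnv_hash<T: Hash + ?Sized>(value: &T) -> u64 {
    let mut h = Fnv1a::new();
    value.hash(&mut h);
    h.finish()
}
```

Because `Hash::hash` is generic over the hasher, the same `Hash` implementations work unchanged with this hasher and with the default one, which is the genericity the text above aims to preserve.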
An example encoding of the alternative API (proposed below) would look like: ```rust -impl Hasher for usize { - type Output = usize; +impl Hasher for u64 { fn write(&mut self, data: &[u8]) { for b in data.iter() { self.write_u8(*b); } } - fn finish(&self) -> usize { *self } + fn finish(&self) -> u64 { *self } fn write_u8(&mut self, i: u8) { *self = combine(*self, i); } // and so on... @@ -231,13 +240,9 @@ impl u64> HashMap { fn with_hasher(hasher: H) -> HashMap; } -fn global_siphash_keys() -> (u64, u64) { - // ... -} - impl Fn(&K) -> u64 for DefaultHasher { fn call(&self, arg: &K) -> u64 { - let (k1, k2) = global_siphash_keys(); + let (k1, k2) = self.siphash_keys(); let mut s = SipHasher::new_with_keys(k1, k2); arg.hash(&mut s); s.finish() @@ -267,7 +272,8 @@ trait for calculating hashes. * The API of `Hasher` is approaching the realm of serialization/reflection and it's unclear whether its API should grow over time to support more basic Rust - types. + types. It would be unfortunate if the `Hasher` trait approached a full-blown + `Encoder` trait (as `rustc-serialize` has). # Alternatives From 88a3079ebf5707ddfc8f447690c71db825a00969 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 17 Feb 2015 15:38:19 -0800 Subject: [PATCH 0108/1195] Add another alternative drawback --- text/0000-hash-simplification.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/text/0000-hash-simplification.md b/text/0000-hash-simplification.md index 114ca0b45dc..cdb24fc7978 100644 --- a/text/0000-hash-simplification.md +++ b/text/0000-hash-simplification.md @@ -419,6 +419,10 @@ may not be able to provide any form of DoS protection guarantee at all. randomly seed each individual instance but may at best have one global seed. This consequently elevates the risk of a DoS attack on a `HashMap` instance. +* The method of combining hashes together is not proven among other languages + and is not guaranteed to provide the guarantees we want. 
This departure from + the may have unknown consequences. + # Unresolved questions * To what degree should `HashMap` attempt to prevent DoS attacks? Is it the From c626c0d586b3873c4619c68ff75b349feca5261f Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 17 Feb 2015 15:45:14 -0800 Subject: [PATCH 0109/1195] RFC 823 is: Simplify std::hash --- ...0-hash-simplification.md => 0823-hash-simplification.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-hash-simplification.md => 0823-hash-simplification.md} (99%) diff --git a/text/0000-hash-simplification.md b/text/0823-hash-simplification.md similarity index 99% rename from text/0000-hash-simplification.md rename to text/0823-hash-simplification.md index cdb24fc7978..7d51aa03ac1 100644 --- a/text/0000-hash-simplification.md +++ b/text/0823-hash-simplification.md @@ -1,7 +1,7 @@ - Feature Name: hash -- Start Date: (fill me in with today's date, YYYY-MM-DD) -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- Start Date: 2015-02-17 +- RFC PR: https://github.com/rust-lang/rfcs/pull/823 +- Rust Issue: https://github.com/rust-lang/rust/issues/22467 # Summary From cd368cd3b9210be5f1841c3debc76f981b2ad22f Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Mon, 16 Feb 2015 13:10:31 -0700 Subject: [PATCH 0110/1195] Allow macros in types --- text/0000-type-macros.md | 412 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 412 insertions(+) create mode 100644 text/0000-type-macros.md diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md new file mode 100644 index 00000000000..290c2265b9d --- /dev/null +++ b/text/0000-type-macros.md @@ -0,0 +1,412 @@ +- Feature Name: Macros in type positions +- Start Date: 2015-02-16 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Allow macros in type positions + +# Motivation + +Macros are currently allowed in syntax fragments for expressions, +items, and patterns, but not for types. 
This RFC proposes to lift that +restriction for the following reasons: + +1. Increase generality of the macro system - in the absence of a + concrete reason for disallowing macros in types, the limitation + should be removed in order to promote generality and to enable use + cases which would otherwise require resorting either to compiler + plugins or to more elaborate item-level macros. + +2. Enable more programming patterns - macros in type positions provide + a means to express **recursion** and **choice** within types in a + fashion that is still legible. Associated types alone can accomplish + the former (recursion/choice) but not the latter (legibility). + +# Detailed design + +## Implementation + +The proposed feature has been implemented at +[this branch](https://github.com/freebroccolo/rust/commits/feature/type_macros). There +is no real novelty to the design as it is simply an extension of the +existing macro machinery to handle the additional case of macro +expansion in types. The biggest change is the addition of a +[`TyMac`](https://github.com/freebroccolo/rust/blob/f8f8dbb6d332c364ecf26b248ce5f872a7a67019/src/libsyntax/ast.rs#L1274-L1275) +to the `Ty_` enum so that the parser can indicate a macro invocation +in a type position. In other words, `TyMac` is added to the AST and +handled analogously to `ExprMac`, `ItemMac`, and `PatMac`. + +## Examples + +### Heterogeneous Lists + +Heterogeneous lists are one example where the ability to express +recursion via type macros is very useful. They can be used as an +alternative to (or in combination with) tuples. Their recursive +structure provides a means to abstract over arity and to manipulate +arbitrary products of types with operations like appending, taking +length, adding/removing items, computing permutations, etc. 
+ +Heterogeneous lists are straightforward to define: + +```rust +struct Nil; // empty HList +struct Cons(H, T); // cons cell of HList + +// trait to classify valid HLists +trait HList {} +impl HList for Nil {} +impl HList for Cons {} +``` + +However, writing them in code is not so convenient: + +```rust +let xs = Cons("foo", Cons(false, Cons(vec![0u64], Nil))); +``` + +At the term-level, this is easy enough to fix with a macro: + +```rust +// term-level macro for HLists +macro_rules! hlist { + {} => { Nil }; + { $head:expr } => { Cons($head, Nil) }; + { $head:expr, $($tail:expr),* } => { Cons($head, hlist!($($tail),*)) }; +} + +let xs = hlist!["foo", false, vec![0u64]]; +``` + +Unfortunately, this is an incomplete solution. HList terms are more +convenient to write but HList types are not: + +```rust +let xs: Cons<&str, Cons, Nil>>> = hlist!["foo", false, vec![0u64]]; +``` + +Under this proposal—allowing macros in types—we would be able to use a +macro to improve writing the HList type as well. The complete example +follows: + +```rust +// term-level macro for HLists +macro_rules! hlist { + {} => { Nil }; + { $head:expr } => { Cons($head, Nil) }; + { $head:expr, $($tail:expr),* } => { Cons($head, hlist!($($tail),*)) }; +} + +// type-level macro for HLists +macro_rules! HList { + {} => { Nil }; + { $head:ty } => { Cons<$head, Nil> }; + { $head:ty, $($tail:ty),* } => { Cons<$head, HList!($($tail),*)> }; +} + +let xs: HList![&str, bool, Vec] = hlist!["foo", false, vec![0u64]]; +``` + +Operations on HLists can be defined by recursion, using traits with +associated type outputs at the type-level and implementation methods +at the term-level. + +HList append is provided as an example of such an operation. 
Macros in +types are used to make writing append at the type level more +convenient, e.g., with `Expr!`: + +```rust +use std::ops; + +// nil case for HList append +impl ops::Add for Nil { + type Output = Ys; + + #[inline] + fn add(self, rhs: Ys) -> Ys { + rhs + } +} + +// cons case for HList append +impl ops::Add for Cons where + Xs: ops::Add, +{ + type Output = Cons; + + #[inline] + fn add(self, rhs: Ys) -> Cons { + Cons(self.0, self.1 + rhs) + } +} + +// type macro Expr allows us to expand the + operator appropriately +macro_rules! Expr { + { $A:ty } => { $A }; + { $LHS:tt + $RHS:tt } => { >::Output }; +} + +// test demonstrating term level `xs + ys` and type level `Expr!(Xs + Ys)` +#[test] +fn test_append() { + fn aux(xs: Xs, ys: Ys) -> Expr!(Xs + Ys) where + Xs: ops::Add + { + xs + ys + } + let xs: HList![&str, bool, Vec] = hlist!["foo", false, vec![]]; + let ys: HList![u64, [u8; 3], ()] = hlist![0, [0, 1, 2], ()]; + + // parentheses around compound types due to limitations in macro parsing; + // real implementation could use a plugin to avoid this + let zs: Expr!((HList![&str, bool, Vec]) + + (HList![u64, [u8; 3], ()])) + = aux(xs, ys); + assert_eq!(zs, hlist!["foo", false, vec![], 0, [0, 1, 2], ()]) +} +``` + +### Additional Examples ### + +#### Type-level numbers + +Another example where type macros can be useful is in the encoding of +numbers as types. Binary natural numbers can be represented as +follows: + +```rust +struct _0; // 0 bit +struct _1; // 1 bit + +// classify valid bits +trait Bit {} +impl Bit for _0 {} +impl Bit for _1 {} + +// classify positive binary naturals +trait Pos {} +impl Pos for _1 {} +impl Pos for (P, B) {} + +// classify binary naturals with 0 +trait Nat {} +impl Nat for _0 {} +impl Nat for _1 {} +impl Nat for (P, B) {} +``` + +These can be used to index into tuples or HLists generically (linear +time generally or constant time up to a fixed number of +specializations). 
They can also be used to encode "sized" or "bounded" +data, like vectors: + +```rust +struct LengthVec(Vec); +``` + +The type number can either be a phantom parameter `N` as above, or +represented concretely at the term-level (similar to list). In either +case, a length-safe API can be provided on top of types `Vec`. Because +the length is known statically, unsafe indexing would be allowable by +default. + +We could imagine an idealized API in the following fashion: + +```rust +// push, adding one to the length +fn push(x: A, xs: LengthVec) -> LengthVec; + +// pop, subtracting one from the length +fn pop(store: &mut A, xs: LengthVec) -> LengthVec; + +// append, adding the individual lengths +fn append(xs: LengthVec, ys: LengthVec) -> LengthVec; + +// produce a length respecting iterator from an indexed vector +fn iter(xs: LengthVec) -> LengthIterator; +``` + +However, in order to be able to write something close to that in Rust, +we would need macros in types: + +```rust + +// Nat! would expand integer constants to type-level binary naturals; would +// be implemented as a plugin for efficiency +Nat!(4) ==> ((_1, _0), _0) + +// Expr! would expand + to Add::Output and integer constants to Nat!; see +// the HList append earlier in the RFC for a concrete example of how this +// might be defined +Expr!(N + M) ==> >::Output + +// Now we could expand the following type to something meaningful in Rust: +LengthVec + ==> LengthVec>::Output> + ==> LengthVec>::Output> +``` + +##### Optimization of `Expr`! + +Because `Expr!` could be implemented as a plugin, the opportunity +would exist to perform various optimizations of type-level expressions +during expansion. Partial evaluation would be one approach to +this. Furthermore, expansion-time optimizations would not necessarily +be limited to simple arithmetic expressions but could be used for +other data like HLists. 
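The binary-natural encoding described in this section can be sketched in present-day Rust, where macros in type position are available. This is only an illustration under stated assumptions: the generic bounds on the tuple impls, the associated constants `BIT` and `VALUE`, the `Nat4!` macro, the `Four` alias, and the `value_of` helper are all hypothetical additions that do not appear in the RFC text, and `Nat4!` hard-codes one number rather than implementing the general `Nat!` plugin.

```rust
#![allow(non_camel_case_types, dead_code)]

struct _0; // 0 bit
struct _1; // 1 bit

// classify valid bits and expose their numeric value (assumed constant)
trait Bit { const BIT: usize; }
impl Bit for _0 { const BIT: usize = 0; }
impl Bit for _1 { const BIT: usize = 1; }

// classify positive binary naturals: a leading `_1`, then further bits
trait Pos { const VALUE: usize; }
impl Pos for _1 { const VALUE: usize = 1; }
impl<P: Pos, B: Bit> Pos for (P, B) {
    const VALUE: usize = 2 * P::VALUE + B::BIT;
}

// classify binary naturals including zero
trait Nat { const VALUE: usize; }
impl Nat for _0 { const VALUE: usize = 0; }
impl Nat for _1 { const VALUE: usize = 1; }
impl<P: Pos, B: Bit> Nat for (P, B) {
    const VALUE: usize = <(P, B) as Pos>::VALUE;
}

// A macro used in type position: expands to the encoding of 4 (binary 100).
macro_rules! Nat4 {
    () => { ((_1, _0), _0) };
}

// Type alias whose right-hand side is a macro invocation.
type Four = Nat4!();

/// Recover the runtime value of a type-level natural (hypothetical helper).
fn value_of<N: Nat>() -> usize {
    N::VALUE
}
```

Calling `value_of::<Four>()` walks the nested tuple at compile time, which shows how a type written through a macro still participates in ordinary trait resolution.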
+ +#### Conversion from HList to Tuple + +With type macros, it is possible to write macros that convert back and +forth between tuples and HLists in the following fashion: + +```rust +// type-level macro for HLists +macro_rules! HList { + {} => { Nil }; + { $head:ty } => { Cons<$head, Nil> }; + { $head:ty, $($tail:ty),* } => { Cons<$head, HList!($($tail),*)> }; +} + +// term-level macro for HLists +macro_rules! hlist { + {} => { Nil }; + { $head:expr } => { Cons($head, Nil) }; + { $head:expr, $($tail:expr),* } => { Cons($head, hlist!($($tail),*)) }; +} + +// term-level HLists in patterns +macro_rules! hlist_match { + {} => { Nil }; + { $head:ident } => { Cons($head, Nil) }; + { $head:ident, $($tail:ident),* } => { Cons($head, hlist_match!($($tail),*)) }; +} + +// iterate macro for generated comma separated sequences of idents +fn impl_for_seq_upto_expand<'cx>( + ecx: &'cx mut base::ExtCtxt, + span: codemap::Span, + args: &[ast::TokenTree], +) -> Box { + let mut parser = ecx.new_parser_from_tts(args); + + // parse the macro name + let mac = parser.parse_ident(); + + // parse a comma + parser.eat(&token::Token::Comma); + + // parse the number of iterations + let iterations = match parser.parse_lit().node { + ast::Lit_::LitInt(i, _) => i, + _ => { + ecx.span_err(span, "welp"); + return base::DummyResult::any(span); + } + }; + + // generate a token tree: A0, ..., An + let mut ctx = range(0, iterations * 2 - 1).flat_map(|k| { + if k % 2 == 0 { + token::str_to_ident(format!("A{}", (k / 2)).as_slice()) + .to_tokens(ecx) + .into_iter() + } else { + let span = codemap::DUMMY_SP; + let token = parse::token::Token::Comma; + vec![ast::TokenTree::TtToken(span, token)] + .into_iter() + } + }).collect::>(); + + // iterate over the ctx and generate impl syntax fragments + let mut items = vec![]; + let mut i = ctx.len(); + for _ in range(0, iterations) { + items.push(quote_item!(ecx, $mac!{ $ctx };).unwrap()); + i -= 2; + ctx.truncate(i); + } + + // splice the impl fragments into 
the ast + base::MacItems::new(items.into_iter()) +} + +pub struct ToHList; +pub struct ToTuple; + +// macro to implement: ToTuple(hlist![…]) => (…,) +macro_rules! impl_to_tuple_for_seq { + ($($seq:ident),*) => { + #[allow(non_snake_case)] + impl<$($seq,)*> Fn<(HList![$($seq),*],)> for ToTuple { + type Output = ($($seq,)*); + #[inline] + extern "rust-call" fn call(&self, (this,): (HList![$($seq),*],)) -> ($($seq,)*) { + match this { + hlist_match![$($seq),*] => ($($seq,)*) + } + } + } + } +} + +// macro to implement: ToHList((…,)) => hlist![…] +macro_rules! impl_to_hlist_for_seq { + ($($seq:ident),*) => { + #[allow(non_snake_case)] + impl<$($seq,)*> Fn<(($($seq,)*),)> for ToHList { + type Output = HList![$($seq),*]; + #[inline] + extern "rust-call" fn call(&self, (this,): (($($seq,)*),)) -> HList![$($seq),*] { + match this { + ($($seq,)*) => hlist![$($seq),*] + } + } + } + } +} + +// generate implementations up to length 32 +impl_for_seq_upto!{ impl_to_tuple_for_seq, 32 } +impl_for_seq_upto!{ impl_to_hlist_for_seq, 32 } +``` + +# Drawbacks + +There seem to be few drawbacks to implementing this feature as an +extension of the existing macro machinery. Parsing macro invocations +in types adds a very small amount of additional complexity to the +parser (basically looking for `!`). Having an extra case for macro +invocation in types slightly complicates conversion. As with all +feature proposals, it is possible that designs for future extensions +to the macro system or type system might somehow interfere with this +functionality. + +# Alternatives + +There are no direct alternatives to my knowledge. Extensions to the +type system like data kinds, singletons, and various more elaborate +forms of staged programming (so-called CTFE) could conceivably cover +some cases where macros in types might otherwise be used. It is +unlikely they would provide the same level of functionality as macros, +particularly where plugins are concerned. 
Instead, such features would +probably benefit from type macros too. + +Not implementing this feature would mean disallowing some useful +programming patterns. There are some discussions in the community +regarding more extensive changes to the type system to address some of +these patterns. However, type macros along with associated types can +already accomplish many of the same things without the significant +engineering cost in terms of changes to the type system. Either way, +type macros would not prevent additional extensions. + +# Unresolved questions + +There is a question as to whether macros in types should allow `<` and +`>` as delimiters for invocations, e.g. `Foo!`. However, this would +raise a number of additional complications and is not necessary to +consider for this RFC. If deemed desirable by the community, this +functionality can be proposed separately. From 333aae101f2bf00b9892f6428526be4f5fe8940e Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 17 Feb 2015 16:24:21 -0800 Subject: [PATCH 0111/1195] RFC 592 is: CStr, the dereferenced complement to CString --- text/0000-c-str-deref.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-c-str-deref.md b/text/0000-c-str-deref.md index 0a9113c26cd..9c803126e34 100644 --- a/text/0000-c-str-deref.md +++ b/text/0000-c-str-deref.md @@ -1,6 +1,6 @@ - Start Date: 2015-01-17 -- RFC PR: -- Rust Issue: +- RFC PR: https://github.com/rust-lang/rfcs/pull/592 +- Rust Issue: https://github.com/rust-lang/rust/issues/22469 # Summary From 10588f8847842edf81c342f51474453b223cc4ce Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 17 Feb 2015 16:29:44 -0800 Subject: [PATCH 0112/1195] RFC 840 is: Remove panic from CString --- text/{0000-c-str-deref.md => 0592-c-str-deref.md} | 0 ...-panic-in-c-string.md => 0840-no-panic-in-c-string.md} | 8 ++++---- 2 files changed, 4 insertions(+), 4 deletions(-) rename text/{0000-c-str-deref.md => 0592-c-str-deref.md} (100%) rename 
text/{0000-no-panic-in-c-string.md => 0840-no-panic-in-c-string.md} (96%) diff --git a/text/0000-c-str-deref.md b/text/0592-c-str-deref.md similarity index 100% rename from text/0000-c-str-deref.md rename to text/0592-c-str-deref.md diff --git a/text/0000-no-panic-in-c-string.md b/text/0840-no-panic-in-c-string.md similarity index 96% rename from text/0000-no-panic-in-c-string.md rename to text/0840-no-panic-in-c-string.md index f05ed669d01..f7f319190e3 100644 --- a/text/0000-no-panic-in-c-string.md +++ b/text/0840-no-panic-in-c-string.md @@ -1,7 +1,7 @@ - Feature Name: non_panicky_cstring - Start Date: 2015-02-13 -- RFC PR: -- Rust Issue: +- RFC PR: https://github.com/rust-lang/rfcs/pull/840 +- Rust Issue: https://github.com/rust-lang/rust/issues/22470 # Summary @@ -15,7 +15,7 @@ these functions return `Result` instead. > those nameless blights of outer voids whose faint daemon scratchings we > sometimes hear on the farthest rim of space, yet from which our own finite > vision has given us a merciful immunity. -> +> > — H. P. Lovecraft, The Lurking Fear Currently the functions that produce `std::ffi::CString` out of Rust byte @@ -87,7 +87,7 @@ people from using the safer functions. If the panicky behavior is preserved, plentiful possibilities for DoS attacks and other unforeseen failures in the field may be introduced by code oblivious -to the input constraints. +to the input constraints. # Unresolved questions From 4ba33ee94a14cb40411c283f9ad4bff7bf8ec0e1 Mon Sep 17 00:00:00 2001 From: Sean Patrick Santos Date: Tue, 17 Feb 2015 19:57:13 -0700 Subject: [PATCH 0113/1195] Add historical note to RFC 195. 
--- text/0195-associated-items.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/text/0195-associated-items.md b/text/0195-associated-items.md index 1b48e859247..718b63b9fea 100644 --- a/text/0195-associated-items.md +++ b/text/0195-associated-items.md @@ -21,6 +21,11 @@ This RFC also provides a mechanism for *multidispatch* traits, where the `impl` is selected based on multiple types. The connection to associated items will become clear in the detailed text below. +*Note: This RFC was originally accepted before RFC 246 added consts and changed +the definition of statics. The text has been updated to clarify that both consts +and statics can be associated with a trait. Other than that modification, the +proposal has not been changed to reflect newer Rust features or syntax.* + # Motivation A typical example where associated items are helpful is data structures like From d452ce145bf6b73f1fd14bee8a1493993a301a4b Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 17 Feb 2015 20:23:32 -0800 Subject: [PATCH 0114/1195] RFC 528 is: Add a generic string pattern matching API --- text/{0000-string-patterns.md => 0528-string-patterns.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-string-patterns.md => 0528-string-patterns.md} (99%) diff --git a/text/0000-string-patterns.md b/text/0528-string-patterns.md similarity index 99% rename from text/0000-string-patterns.md rename to text/0528-string-patterns.md index e74fac67c81..a31dcd85dc0 100644 --- a/text/0000-string-patterns.md +++ b/text/0528-string-patterns.md @@ -1,6 +1,6 @@ -- Start Date: (fill me in with today's date, YYYY-MM-DD) -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- Start Date: 2015-02-17 +- RFC PR: https://github.com/rust-lang/rfcs/pull/528 +- Rust Issue: https://github.com/rust-lang/rust/issues/22477 # Summary From dc04dbe7d5bd92da65c36b7ed84bde97d6514383 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 17 Feb 2015 20:57:55 -0800 Subject: 
[PATCH 0115/1195] RFC 580 is Rename some collections for consistency and clarity --- ...ame-collections.md => 0580-rename-collections.md} | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) rename text/{0000-rename-collections.md => 0580-rename-collections.md} (97%) diff --git a/text/0000-rename-collections.md b/text/0580-rename-collections.md similarity index 97% rename from text/0000-rename-collections.md rename to text/0580-rename-collections.md index eba49b0d471..ee4fcf76f97 100644 --- a/text/0000-rename-collections.md +++ b/text/0580-rename-collections.md @@ -1,6 +1,6 @@ - Start Date: 2015-01-13 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/580 +- Rust Issue: https://github.com/rust-lang/rust/issues/22479 # Summary @@ -13,8 +13,8 @@ In [this comment](http://www.reddit.com/r/programming/comments/2rvoha/announcing The current collection names (and their longer versions) are: * `Vec` -> `Vector` -* `BTreeMap` -* `BTreeSet` +* `BTreeMap` +* `BTreeSet` * `BinaryHeap` * `Bitv` -> `BitVec` -> `BitVector` * `BitvSet` -> `BitVecSet` -> `BitVectorSet` @@ -40,8 +40,8 @@ First some general naming rules should be established. And the new names: * `Vec` -* `BTreeMap` -* `BTreeSet` +* `BTreeMap` +* `BTreeSet` * `BinaryHeap` * `Bitv` -> `BitVec` * `BitvSet` -> `BitSet` From 24f772d5ec314e823f06fb46083ac3c30c1de67e Mon Sep 17 00:00:00 2001 From: Julian Orth Date: Wed, 18 Feb 2015 09:26:22 +0100 Subject: [PATCH 0116/1195] update detailed design --- text/0000-drain-range.md | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/text/0000-drain-range.md b/text/0000-drain-range.md index b3e3f739a34..adbd22dcc15 100644 --- a/text/0000-drain-range.md +++ b/text/0000-drain-range.md @@ -58,13 +58,14 @@ Remove `Vec::drain` and add the following method: /// /// Panics if the range is decreasing or if the upper bound is larger than the /// length of the vector. 
-pub fn drain(&mut self, range: T) -> RangeIter { - range.drain(self) -} +pub fn drain(&mut self, range: T) -> /* ... */; ``` -Where `Drainer` should be implemented for `Range`, `RangeTo`, -`RangeFrom`, `FullRange`, and `usize`. +Where `Trait` is some trait that is implemented for at least `Range`, +`RangeTo`, `RangeFrom`, `FullRange`, and `usize`. + +The precise nature of the return value is to be determined during implementation +and may or may not depend on `T`. Add `String::drain`: @@ -77,11 +78,11 @@ Add `String::drain`: /// Panics if the range is decreasing, if the upper bound is larger than the /// length of the String, or if the start and the end of the range don't lie on /// character boundaries. -pub fn drain(&mut self, range: /* ? */) -> /* ? */ { - // ? -} +pub fn drain(&mut self, range: T) -> /* ... */; ``` +Where `Trait` and the return value are as above but need not be the same. + # Drawbacks - The function signature differs from other collections. From 5843e5cc55972236d61b95ef46d386824ea3b86c Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Wed, 18 Feb 2015 10:58:00 -0500 Subject: [PATCH 0117/1195] Update 0000-embrace-extend-extinguish.md --- text/0000-embrace-extend-extinguish.md | 41 ++++++++------------------ 1 file changed, 13 insertions(+), 28 deletions(-) diff --git a/text/0000-embrace-extend-extinguish.md b/text/0000-embrace-extend-extinguish.md index c0cb6f6b762..cd80373f5f8 100644 --- a/text/0000-embrace-extend-extinguish.md +++ b/text/0000-embrace-extend-extinguish.md @@ -5,10 +5,13 @@ # Summary -Extend the Extend trait to take IntoIterator, and make all collections -`impl<'a, T: Clone> Extend<&'a T>`. This enables both `vec.extend(&[1, 2, 3])`, and -`vec.extend(&hash_set)`. This provides a more expressive replacement for -`Vec::push_all` with literally no ergonomic loss, while leveraging established APIs. +NOTE: This RFC assumes Extend is improved to take IntoIterator, as was always intended. 
+ +Make all collections `impl<'a, T: Clone> Extend<&'a T>`. + +This enables both `vec.extend(&[1, 2, 3])`, and `vec.extend(&hash_set)`. +This provides a more expressive replacement for `Vec::push_all` with +literally no ergonomic loss, while leveraging established APIs. # Motivation @@ -28,35 +31,17 @@ collection by-reference to clone the data out of it. # Detailed design -Here's a quick hack to get this working today: - -``` -/// A type growable from an `Iterator` implementation -pub trait Extend { - fn extend, I: IntoIterator> - (&mut self, iterator: I); -} -``` - -This isn't the signature we'd like longterm, but it's what works with today's -IntoIterator and where clauses. Longterm (like, tomorrow) this should work: - -``` -/// A type growable from an `Iterator` implementation -pub trait Extend { - fn extend>(&mut self, iterator: I); -} -``` +* For sequences and sets: `impl<'a, T: Clone> Extend<&'a T>` +* For maps: `impl<'a, K: Clone, V: Clone> Extend<(&'a K, &'a V)>` -And here's usage: +e.g. ``` use std::iter::IntoIterator; impl<'a, T: Clone> Extend<&'a T> for Vec { - fn extend, I: IntoIterator> - (&mut self, iterator: I){ - self.extend(iterator.into_iter().cloned()) + fn extend>(&mut self, iter: I) { + self.extend(iter.into_iter().cloned()) } } @@ -84,7 +69,7 @@ Nope. 
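The by-reference `Extend` impl proposed in the detailed design can be sketched on a small user-defined collection. `Bag` is a hypothetical wrapper introduced only for this example, so that the sketch does not depend on which bounds the standard library's own collections use; the signatures assume `Extend::extend` takes `IntoIterator`, as the note in the summary does.

```rust
/// Hypothetical wrapper collection used only for illustration.
#[derive(Debug, PartialEq)]
struct Bag<T>(Vec<T>);

// By-value extension: consumes the items it is given.
impl<T> Extend<T> for Bag<T> {
    fn extend<I: IntoIterator<Item = T>>(&mut self, iter: I) {
        for item in iter {
            self.0.push(item);
        }
    }
}

// By-reference extension: clones each borrowed item, leaving the source intact.
impl<'a, T: Clone + 'a> Extend<&'a T> for Bag<T> {
    fn extend<I: IntoIterator<Item = &'a T>>(&mut self, iter: I) {
        // Name the by-value impl explicitly so the delegation is unambiguous.
        <Self as Extend<T>>::extend(self, iter.into_iter().cloned());
    }
}
```

With both impls in place, `bag.extend(&[2, 3])` and `bag.extend(&other_vec)` compile alongside the by-value `bag.extend(vec![4])`, and the borrowed source remains usable afterwards.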
# Unresolved questions -FromIterator could also be extended in the same manner, but this is less useuful for +FromIterator could also be extended in the same manner, but this is less useful for two reasons: * FromIterator is always called by calling `collect`, and IntoIterator doesn't really From 4782cd831aaf7e69fa05444b528855d0c85ed0fd Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Wed, 18 Feb 2015 10:33:29 -0700 Subject: [PATCH 0118/1195] Link RFC for parameterizing types with constants --- text/0000-type-macros.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 290c2265b9d..14738dea32a 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -257,6 +257,17 @@ this. Furthermore, expansion-time optimizations would not necessarily be limited to simple arithmetic expressions but could be used for other data like HLists. +##### Native alternatives: types parameterized by constant values + +This example with type-level naturals is meant to illustrate the kind +of patterns macros in types enable. I am not suggesting the standard +libraries adopt _this particular_ representation as a means to address +the more general issue of lack of numeric parameterization for +types. There is +[another RFC here](https://github.com/rust-lang/rfcs/pull/884) which +does propose extending the type system to allow parameterization over +constants. 
+ #### Conversion from HList to Tuple With type macros, it is possible to write macros that convert back and From e40ea65e9bb378081a7a2cd73378f48012729ce2 Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Wed, 18 Feb 2015 10:43:12 -0700 Subject: [PATCH 0119/1195] Add tests to hlist/tuple conversion example --- text/0000-type-macros.md | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 14738dea32a..478733f5095 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -383,6 +383,20 @@ macro_rules! impl_to_hlist_for_seq { // generate implementations up to length 32 impl_for_seq_upto!{ impl_to_tuple_for_seq, 32 } impl_for_seq_upto!{ impl_to_hlist_for_seq, 32 } + +// test converting an hlist to tuple +#[test] +fn test_to_tuple() { + assert_eq!(ToTuple(hlist!["foo", true, (), vec![42u64]]), + ("foo", true, (), vec![42u64])) +} + +// test converting a tuple to hlist +#[test] +fn test_to_hlist() { + assert_eq!(ToHList(("foo", true, (), vec![42u64])), + hlist!["foo", true, (), vec![42u64]]) +} ``` # Drawbacks From a64af8414a5cedf01995f4b9fc42147709311e96 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Wed, 18 Feb 2015 13:56:10 -0500 Subject: [PATCH 0120/1195] Merge RFC #563: remove ndebug support. Also add to index and add in unresolved questions from the RFC discussion. --- README.md | 1 + ...remove-ndebug.md => 0563-remove-ndebug.md} | 24 ++++++++++++++++--- 2 files changed, 22 insertions(+), 3 deletions(-) rename text/{0000-remove-ndebug.md => 0563-remove-ndebug.md} (62%) diff --git a/README.md b/README.md index 2b545458627..50b5922e758 100644 --- a/README.md +++ b/README.md @@ -41,6 +41,7 @@ the direction the language is evolving in. 
* [0509-collections-reform-part-2.md](text/0509-collections-reform-part-2.md) * [0517-io-os-reform.md](text/0517-io-os-reform.md) * [0560-integer-overflow.md](text/0560-integer-overflow.md) +* [0563-remove-ndebug.md](text/0563-remove-ndebug.md) * [0572-rustc-attribute.md](text/0572-rustc-attribute.md) * [0702-rangefull-expression.md](text/0702-rangefull-expression.md) * [0738-variance.md](text/0738-variance.md) diff --git a/text/0000-remove-ndebug.md b/text/0563-remove-ndebug.md similarity index 62% rename from text/0000-remove-ndebug.md rename to text/0563-remove-ndebug.md index 5cf9f4a4fcb..34f29fcda36 100644 --- a/text/0000-remove-ndebug.md +++ b/text/0563-remove-ndebug.md @@ -1,6 +1,6 @@ - Start Date: (fill me in with today's date, YYYY-MM-DD) -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#563](https://github.com/rust-lang/rfcs/pull/563) +- Rust Issue: [rust-lang/rust#22492](https://github.com/rust-lang/rust/issues/22492) # Summary @@ -42,4 +42,22 @@ No real alternatives beyond different names and defaults. # Unresolved questions -None. \ No newline at end of file +From the RFC discussion there remain some unresolved details: + +* brson + [writes](https://github.com/rust-lang/rfcs/pull/563#issuecomment-72549694), + "I have a minor concern that `-C debug-assertions` might not be the + right place for this command line flag - it doesn't really affect + code generation, at least in the current codebase (also `--cfg + debug_assertions` has the same effect).". +* huonw + [writes](https://github.com/rust-lang/rfcs/pull/563#issuecomment-72550619), + "It seems like the flag could be more than just a boolean, but + rather take a list of what to enable to allow fine-grained control, + e.g. none, overflow-checks, debug_cfg,overflow-checks, all. (Where + -C debug-assertions=debug_cfg acts like --cfg debug.)". 
+* huonw + [writes](https://github.com/rust-lang/rfcs/pull/563#issuecomment-74762795), + "if we want this to apply to more than just debug_assert do we want + to use a name other than -C debug-assertions?". + From 04b9a88b2d34d3ebe3e42180d54d617345fd0b6d Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Wed, 18 Feb 2015 14:03:35 -0500 Subject: [PATCH 0121/1195] Merge RFC 505: API Comment Conventions --- README.md | 1 + ...comment-conventions.md => 0505-api-comment-conventions.md} | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) rename text/{0000-api-comment-conventions.md => 0505-api-comment-conventions.md} (97%) diff --git a/README.md b/README.md index 50b5922e758..9bd8b5a98d9 100644 --- a/README.md +++ b/README.md @@ -38,6 +38,7 @@ the direction the language is evolving in. * [0447-no-unused-impl-parameters.md](text/0447-no-unused-impl-parameters.md) * [0458-send-improvements.md](text/0458-send-improvements.md) * [0501-consistent_no_prelude_attributes.md](text/0501-consistent_no_prelude_attributes.md) +* [0505-api-comment-conventions.md](text/0505-api-comment-conventions.md) * [0509-collections-reform-part-2.md](text/0509-collections-reform-part-2.md) * [0517-io-os-reform.md](text/0517-io-os-reform.md) * [0560-integer-overflow.md](text/0560-integer-overflow.md) diff --git a/text/0000-api-comment-conventions.md b/text/0505-api-comment-conventions.md similarity index 97% rename from text/0000-api-comment-conventions.md rename to text/0505-api-comment-conventions.md index 353d585b0a1..0908db572e5 100644 --- a/text/0000-api-comment-conventions.md +++ b/text/0505-api-comment-conventions.md @@ -1,6 +1,6 @@ - Start Date: 2014-12-08 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#505](https://github.com/rust-lang/rfcs/pull/505) +- Rust Issue: N/A # Summary From 8e452d1c9ac763274b3b14b3d5ac7e20a2e1ed39 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Wed, 18 Feb 2015 14:41:13 -0500 Subject: [PATCH 0122/1195] Merge RFC 
573 which amends RFC 544 to update the integer suffixes. --- README.md | 1 + text/0544-rename-int-uint.md | 12 ++++++++++-- 2 files changed, 11 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 9bd8b5a98d9..dad28d4c55d 100644 --- a/README.md +++ b/README.md @@ -41,6 +41,7 @@ the direction the language is evolving in. * [0505-api-comment-conventions.md](text/0505-api-comment-conventions.md) * [0509-collections-reform-part-2.md](text/0509-collections-reform-part-2.md) * [0517-io-os-reform.md](text/0517-io-os-reform.md) +* [0544-rename-int-uint.md](text/0544-rename-int-uint.md) * [0560-integer-overflow.md](text/0560-integer-overflow.md) * [0563-remove-ndebug.md](text/0563-remove-ndebug.md) * [0572-rustc-attribute.md](text/0572-rustc-attribute.md) diff --git a/text/0544-rename-int-uint.md b/text/0544-rename-int-uint.md index 0ee1928d17b..4ee61964487 100644 --- a/text/0544-rename-int-uint.md +++ b/text/0544-rename-int-uint.md @@ -1,6 +1,6 @@ - Start Date: 2014-12-28 -- RFC PR #: https://github.com/rust-lang/rfcs/pull/544 -- Rust Issue #: https://github.com/rust-lang/rust/issues/20639 +- RFC PR #: [rust-lang/rfcs#544](https://github.com/rust-lang/rfcs/pull/544) +- Rust Issue #: [rust-lang/rust#20639](https://github.com/rust-lang/rust/issues/20639) # Summary @@ -240,3 +240,11 @@ There are other alternatives not covered in this RFC. Please refer to this RFC's # Unresolved questions None. Necessary decisions about Rust's general integer type policies have been made in [Restarting the `int/uint` Discussion](http://discuss.rust-lang.org/t/restarting-the-int-uint-discussion/1131). + +# History + +Amended by [RFC 573][573] to change the suffixes from `is` and `us` to +`isize` and `usize`. Tracking issue for this amendment is +[rust-lang/rust#22496](https://github.com/rust-lang/rust/issues/22496). 
+ +[573]: https://github.com/rust-lang/rfcs/pull/573 From 1e97d22107ad3e9b0d4cec5f4ef11212bd133071 Mon Sep 17 00:00:00 2001 From: Eduard Burtescu Date: Thu, 19 Feb 2015 05:28:44 +0200 Subject: [PATCH 0123/1195] Revert "RFC to require `impl MyStruct` to be nearby the definition of `MyStruct`" --- text/0000-allow-inherent-impls-anywhere.md | 73 ++++++++++++++++++++++ 1 file changed, 73 insertions(+) create mode 100644 text/0000-allow-inherent-impls-anywhere.md diff --git a/text/0000-allow-inherent-impls-anywhere.md b/text/0000-allow-inherent-impls-anywhere.md new file mode 100644 index 00000000000..aca724e0796 --- /dev/null +++ b/text/0000-allow-inherent-impls-anywhere.md @@ -0,0 +1,73 @@ +- Start Date: 2015-02-19 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Allow inherent implementations on types outside of the module they are defined in, +effectively reverting [RFC PR 155](https://github.com/rust-lang/rfcs/pull/155). + +# Motivation + +The main motivation for disallowing such `impl` bodies was the implementation +detail of fake modules being created to allow resolving `Type::method`, which +only worked correctly for `impl Type {...}` if a `struct Type` or `enum Type` +were defined in the same module. The old mechanism was obsoleted by UFCS, +which desugars `Type::method` to `<Type>::method` and performs a type-based +method lookup instead, with path resolution having no knowledge of inherent +`impl`s - and all of that was implemented by [rust-lang/rust#22172](https://github.com/rust-lang/rust/pull/22172). + +Aside from invalidating the previous RFC's motivation, there is something to be +said about dealing with restricted inherent `impl`s: it leads to non-DRY single +use extension traits, the worst offender being `AstBuilder` in libsyntax, with +almost 300 lines of redundant method definitions. + +# Detailed design + +Remove the existing limitation, and only require that the `Self` type of the +`impl` is defined in the same crate.
This allows moving methods to other modules: +```rust +struct Player; + +mod achievements { + struct Achievement; + impl Player { + fn achieve(&mut self, _: Achievement) {} + } +} +``` + +# Drawbacks + +Consistency and ease of finding method definitions by looking at the module the +type is defined in, has been mentioned as an advantage of this limitation. +However, trait `impl`s already have that problem and single use extension traits +could arguably be worse. + +# Alternatives + +- Leave it as it is. Seems unsatisfactory given that we're no longer limited + by implementation details. + +- We could go further and allow adding inherent methods to any type that could + implement a trait outside the crate: + ```rust + struct Point { x: T, y: T } + impl (Vec>, T) { + fn foo(&mut self) -> T { ... } + } + ``` + + The implementation would reuse the same coherence rules as for trait `impl`s, + and, for looking up methods, the "type definition to impl" map would be replaced + with a map from method name to a set of `impl`s containing that method. + + *Technically*, I am not aware of any formulation that limits inherent methods + to user-defined types in the same crate, and this extra support could turn out + to have a straightforward implementation with no complications, but I'm trying + to present the whole situation to avoid issues in the future - even though I'm + not aware of backwards compatibility ones or any related to compiler internals. + +# Unresolved questions + +None. From 6afbb395257d910b2b4d6376fbc42d37b6b90ade Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Wed, 18 Feb 2015 20:06:39 -0700 Subject: [PATCH 0124/1195] Rewording, comments, etc.
--- text/0000-type-macros.md | 76 +++++++++++++++++++++++++++------------- 1 file changed, 51 insertions(+), 25 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 478733f5095..f29c7e1ecc0 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -29,14 +29,18 @@ restriction for the following reasons: ## Implementation The proposed feature has been implemented at -[this branch](https://github.com/freebroccolo/rust/commits/feature/type_macros). There -is no real novelty to the design as it is simply an extension of the -existing macro machinery to handle the additional case of macro -expansion in types. The biggest change is the addition of a +[this branch](https://github.com/freebroccolo/rust/commits/feature/type_macros). The +implementation is very simple and there is no novelty to the +design. The patches make a small modification to the existing macro +expansion functionality in order to support macro invocations in +syntax for types. No changes are made to type-checking or other phases +of the compiler. + +The biggest change introduced by this feature is a [`TyMac`](https://github.com/freebroccolo/rust/blob/f8f8dbb6d332c364ecf26b248ce5f872a7a67019/src/libsyntax/ast.rs#L1274-L1275) -to the `Ty_` enum so that the parser can indicate a macro invocation -in a type position. In other words, `TyMac` is added to the ast and -handled analogously to `ExprMac`, `ItemMac`, and `PatMac`. +case for the `Ty_` enum so that the parser can indicate a macro +invocation in a type position. In other words, `TyMac` is added to the +ast and handled analogously to `ExprMac`, `ItemMac`, and `PatMac`. ## Examples @@ -235,12 +239,14 @@ we would need macros in types: // Nat! would expand integer constants to type-level binary naturals; would // be implemented as a plugin for efficiency -Nat!(4) ==> ((_1, _0), _0) +Nat!(4) + ==> ((_1, _0), _0) // Expr! 
would expand + to Add::Output and integer constants to Nat!; see // the HList append earlier in the RFC for a concrete example of how this // might be defined -Expr!(N + M) ==> >::Output +Expr!(N + M) + ==> >::Output // Now we could expand the following type to something meaningful in Rust: LengthVec @@ -271,7 +277,12 @@ constants. #### Conversion from HList to Tuple With type macros, it is possible to write macros that convert back and -forth between tuples and HLists in the following fashion: +forth between tuples and HLists. This is very powerful because it lets +us reuse all of the operations we define for HLists (appending, taking +length, adding/removing items, computing permutations, etc.) on tuples +just by converting to HList, computing, then convert back to a tuple. + +The conversion can be implemented in the following fashion: ```rust // type-level macro for HLists @@ -295,8 +306,17 @@ macro_rules! hlist_match { { $head:ident, $($tail:ident),* } => { Cons($head, hlist_match!($($tail),*)) }; } -// iterate macro for generated comma separated sequences of idents -fn impl_for_seq_upto_expand<'cx>( +// `invoke_for_seq_upto` is a `higher-order` macro that takes the name +// of another macro and a number and iteratively invokes the named +// macro with sequences of identifiers, e.g., +// +// invoke_for_seq_upto{ my_mac, 5 } +// ==> my_mac!{ A0, A1, A2, A3, A4 }; +// my_mac!{ A0, A1, A2, A3 }; +// my_mac!{ A0, A1, A2 }; +// ... + +fn invoke_for_seq_upto_expand<'cx>( ecx: &'cx mut base::ExtCtxt, span: codemap::Span, args: &[ast::TokenTree], @@ -348,8 +368,9 @@ fn impl_for_seq_upto_expand<'cx>( pub struct ToHList; pub struct ToTuple; -// macro to implement: ToTuple(hlist![…]) => (…,) -macro_rules! impl_to_tuple_for_seq { +// macro to implement conversion from hlist to tuple, +// e.g., ToTuple(hlist![…]) ==> (…,) +macro_rules! 
impl_to_tuple { ($($seq:ident),*) => { #[allow(non_snake_case)] impl<$($seq,)*> Fn<(HList![$($seq),*],)> for ToTuple { @@ -364,8 +385,9 @@ macro_rules! impl_to_tuple_for_seq { } } -// macro to implement: ToHList((…,)) => hlist![…] -macro_rules! impl_to_hlist_for_seq { +// macro to implement conversion from tuple to hlist, +// e.g., ToHList((…,)) ==> hlist![…] +macro_rules! impl_to_hlist { ($($seq:ident),*) => { #[allow(non_snake_case)] impl<$($seq,)*> Fn<(($($seq,)*),)> for ToHList { @@ -381,8 +403,8 @@ macro_rules! impl_to_hlist_for_seq { } // generate implementations up to length 32 -impl_for_seq_upto!{ impl_to_tuple_for_seq, 32 } -impl_for_seq_upto!{ impl_to_hlist_for_seq, 32 } +invoke_for_seq_upto!{ impl_to_tuple, 32 } +invoke_for_seq_upto!{ impl_to_hlist, 32 } // test converting an hlist to tuple #[test] @@ -402,13 +424,17 @@ fn test_to_hlist() { # Drawbacks There seem to be few drawbacks to implementing this feature as an -extension of the existing macro machinery. Parsing macro invocations -in types adds a very small amount of additional complexity to the -parser (basically looking for `!`). Having an extra case for macro -invocation in types slightly complicates conversion. As with all -feature proposals, it is possible that designs for future extensions -to the macro system or type system might somehow interfere with this -functionality. +extension of the existing macro machinery. The change adds a very +small amount of additional complexity to the +[parser](https://github.com/freebroccolo/rust/blob/e09cb32bcc04029dc4c16790e2aaa9811af27f25/src/libsyntax/parse/parser.rs#L1547-L1560) +and +[conversion](https://github.com/freebroccolo/rust/blob/e4b826b7afa1b5496b41ddaa1666014046ac5704/src/librustc_typeck/astconv.rs#L1301-L1303) +but the changes are almost negligible. 
+ +As with all feature proposals, it is possible that designs for future +extensions to the macro system or type system might somehow interfere +with this functionality but it seems unlikely unless they are +significant, breaking changes. # Alternatives From f2d740e26cbbc8ab4c3af4df29630394b3ca9e56 Mon Sep 17 00:00:00 2001 From: Peter Marheine Date: Thu, 19 Feb 2015 13:00:35 -0700 Subject: [PATCH 0125/1195] Add singlethreaded (compiler-only) fences. --- text/0000-compiler-fence-intrinsics.md | 59 ++++++++++++++++++++++++++ 1 file changed, 59 insertions(+) create mode 100644 text/0000-compiler-fence-intrinsics.md diff --git a/text/0000-compiler-fence-intrinsics.md b/text/0000-compiler-fence-intrinsics.md new file mode 100644 index 00000000000..f2ec7d0fbcd --- /dev/null +++ b/text/0000-compiler-fence-intrinsics.md @@ -0,0 +1,59 @@ +- Feature Name: compiler_fence_intrinsics +- Start Date: 2015-02-19 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Add intrinsics for single-threaded memory fences. + +# Motivation + +Rust currently supports memory barriers through a set of intrinsics, +`atomic_fence` and its variants, which generate machine instructions and are +suitable as cross-processor fences. However, there is currently no compiler +support for single-threaded fences which do not emit machine instructions. + +Certain use cases require that the compiler not reorder loads or stores across a +given barrier but do not require a corresponding hardware guarantee, such as +when a thread interacts with a signal handler which will run on the same thread. +By omitting a fence instruction, relatively costly machine operations can be +avoided. + +The C++ equivalent of this feature is `std::atomic_signal_fence`. 
+ +# Detailed design + +Add four language intrinsics for single-threaded fences: + + * `atomic_compilerfence` + * `atomic_compilerfence_acq` + * `atomic_compilerfence_rel` + * `atomic_compilerfence_acqrel` + +These have the same semantics as the existing `atomic_fence` intrinsics but only +constrain memory reordering by the compiler, not by hardware. + +The existing fence intrinsics are exported in libstd with safe wrappers, but +this design does not export safe wrappers for the new intrinsics. The existing +fence functions will still perform correctly if used where a single-threaded +fence is called for, but with a slight reduction in efficiency. Not exposing +these new intrinsics through a safe wrapper reduces the possibility for +confusion on which fences are appropriate in a given situation, while still +providing the capability for users to opt in to a single-threaded fence when +appropriate. + +# Alternatives + + * Do nothing. The existing fence intrinsics support all use cases, but with a + negative impact on performance in some situations where a compiler-only fence + is appropriate. + +# Unresolved questions + +These intrinsics may be better represented with a different name, such as +`atomic_signal_fence` or `atomic_singlethread_fence`. The existing +implementation of atomic intrinsics in the compiler precludes the use of +underscores in their names and I believe it is clearer to refer to this +construct as a "compiler fence" rather than a "signal fence" because not all use +cases necessarily involve signal handlers, hence the current choice of name. 
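As a rough illustration of the use case this RFC describes (a thread coordinating with a signal handler that runs on the same thread), the sketch below uses `std::sync::atomic::compiler_fence`, the function that later Rust releases provide for compiler-only fences, rather than the `atomic_compilerfence*` intrinsic names proposed above. The statics `DATA` and `READY` and the functions `publish` and `handler` are hypothetical names chosen for this example, and `handler` is only a stand-in that is called directly instead of being installed as a real signal handler.

```rust
use std::sync::atomic::{compiler_fence, AtomicBool, AtomicUsize, Ordering};

// Hypothetical shared state between a thread and a handler on that same thread.
static DATA: AtomicUsize = AtomicUsize::new(0);
static READY: AtomicBool = AtomicBool::new(false);

/// Publishes a value for a handler that runs on this same thread.
fn publish(value: usize) {
    DATA.store(value, Ordering::Relaxed);
    // Compiler-only barrier: the compiler may not move the store to DATA
    // below this point, and no hardware fence instruction is emitted.
    compiler_fence(Ordering::Release);
    READY.store(true, Ordering::Relaxed);
}

/// Stand-in for a signal handler interrupting the publishing thread.
fn handler() -> Option<usize> {
    if READY.load(Ordering::Relaxed) {
        // Pairs with the release fence in `publish`; again compiler-only.
        compiler_fence(Ordering::Acquire);
        Some(DATA.load(Ordering::Relaxed))
    } else {
        None
    }
}
```

Because both sides run on one thread, only compiler reordering has to be ruled out, which is the saving the RFC's motivation points to. If `handler` could run on another thread, a full `std::sync::atomic::fence` would be needed instead.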
From 9c756add8fba9412903347f73b2607ed7ff277a2 Mon Sep 17 00:00:00 2001 From: Brian Anderson Date: Thu, 19 Feb 2015 21:50:36 -0800 Subject: [PATCH 0126/1195] 735 --- ...mpls-anywhere.md => 0735-allow-inherent-impls-anywhere.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-allow-inherent-impls-anywhere.md => 0735-allow-inherent-impls-anywhere.md} (94%) diff --git a/text/0000-allow-inherent-impls-anywhere.md b/text/0735-allow-inherent-impls-anywhere.md similarity index 94% rename from text/0000-allow-inherent-impls-anywhere.md rename to text/0735-allow-inherent-impls-anywhere.md index aca724e0796..8d700157884 100644 --- a/text/0000-allow-inherent-impls-anywhere.md +++ b/text/0735-allow-inherent-impls-anywhere.md @@ -1,6 +1,6 @@ - Start Date: 2015-02-19 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#735](https://github.com/rust-lang/rfcs/pull/735) +- Rust Issue: [rust-lang/rust#22563](https://github.com/rust-lang/rust/issues/22563) # Summary From 5985668d68480fa724f2301edcd2d68a96a7255f Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Fri, 20 Feb 2015 03:56:59 -0700 Subject: [PATCH 0127/1195] Cleanup invoke_for_seq_upto macro --- text/0000-type-macros.md | 68 +++++++++++++++++++++------------------- 1 file changed, 35 insertions(+), 33 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index f29c7e1ecc0..6842868d944 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -315,7 +315,6 @@ macro_rules! hlist_match { // my_mac!{ A0, A1, A2, A3 }; // my_mac!{ A0, A1, A2 }; // ... 
- fn invoke_for_seq_upto_expand<'cx>( ecx: &'cx mut base::ExtCtxt, span: codemap::Span, @@ -327,42 +326,45 @@ fn invoke_for_seq_upto_expand<'cx>( let mac = parser.parse_ident(); // parse a comma - parser.eat(&token::Token::Comma); + parser.expect(&token::Token::Comma); // parse the number of iterations - let iterations = match parser.parse_lit().node { - ast::Lit_::LitInt(i, _) => i, - _ => { - ecx.span_err(span, "welp"); - return base::DummyResult::any(span); - } - }; - - // generate a token tree: A0, ..., An - let mut ctx = range(0, iterations * 2 - 1).flat_map(|k| { - if k % 2 == 0 { - token::str_to_ident(format!("A{}", (k / 2)).as_slice()) - .to_tokens(ecx) - .into_iter() - } else { - let span = codemap::DUMMY_SP; - let token = parse::token::Token::Comma; - vec![ast::TokenTree::TtToken(span, token)] - .into_iter() + if let ast::Lit_::LitInt(lit, _) = parser.parse_lit().node { + Some(lit) + } else { + None + }.and_then(|iterations| { + + // generate a token tree: A0, ..., An + let mut ctx = range(0, iterations * 2 - 1).flat_map(|k| { + if k % 2 == 0 { + token::str_to_ident(format!("A{}", (k / 2)).as_slice()) + .to_tokens(ecx) + .into_iter() + } else { + let span = codemap::DUMMY_SP; + let token = parse::token::Token::Comma; + vec![ast::TokenTree::TtToken(span, token)] + .into_iter() + } + }).collect::>(); + + // iterate over the ctx and generate impl syntax fragments + let mut items = vec![]; + let mut i = ctx.len(); + for _ in range(0, iterations) { + items.push(quote_item!(ecx, $mac!{ $ctx };).unwrap()); + i -= 2; + ctx.truncate(i); } - }).collect::>(); - - // iterate over the ctx and generate impl syntax fragments - let mut items = vec![]; - let mut i = ctx.len(); - for _ in range(0, iterations) { - items.push(quote_item!(ecx, $mac!{ $ctx };).unwrap()); - i -= 2; - ctx.truncate(i); - } - // splice the impl fragments into the ast - base::MacItems::new(items.into_iter()) + // splice the impl fragments into the ast + Some(base::MacItems::new(items.into_iter())) 
+ + }).unwrap_or_else(|| { + ecx.span_err(span, "invoke_for_seq_upto!: expected an integer literal argument"); + base::DummyResult::any(span) + }) } pub struct ToHList; From d9b15ed4283fa98337231d492bd680b7132053cc Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Fri, 20 Feb 2015 04:03:18 -0700 Subject: [PATCH 0128/1195] Add example plugin to expand integers to type nats --- text/0000-type-macros.md | 93 +++++++++++++++++++++++++++++++++++++--- 1 file changed, 87 insertions(+), 6 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 6842868d944..214efe87fe5 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -236,24 +236,105 @@ However, in order to be able to write something close to that in Rust, we would need macros in types: ```rust - -// Nat! would expand integer constants to type-level binary naturals; would -// be implemented as a plugin for efficiency -Nat!(4) - ==> ((_1, _0), _0) - // Expr! would expand + to Add::Output and integer constants to Nat!; see // the HList append earlier in the RFC for a concrete example of how this // might be defined Expr!(N + M) ==> >::Output +// Nat! would expand integer literals to type-level binary naturals +// and be implemented as a plugin for efficiency; see the following +// section for a concrete implementation +Nat!(4) + ==> ((_1, _0), _0) + // Now we could expand the following type to something meaningful in Rust: LengthVec ==> LengthVec>::Output> ==> LengthVec>::Output> ``` +##### Implementation of `Nat!` as a plugin + +The following code demonstrates concretely how `Nat!` can be +implemented as a plugin. As with the `HList!` example, this code is +already usable with the type macros implemented in the branch +referenced earlier in this RFC. + +For efficiency, the binary representation is first constructed as a +string via iteration rather than recursively using `quote` macros. The +string is then parsed as a type, returning an ast fragment. 
+ +```rust +// convert a u64 to a string representation of a type-level binary natural, e.g., +// to_bin_nat(1024) +// ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) +#[inline] +fn to_bin_nat(mut num: u64) -> String { + let mut res = String::from_str("_"); + if num < 2 { + res.push_str(num.to_string().as_slice()); + } else { + let mut bin = vec![]; + while num > 0 { + bin.push(num % 2); + num >>= 1; + } + res = ::std::iter::repeat('(').take(bin.len() - 1).collect(); + res.push_str("_"); + res.push_str(bin.pop().unwrap().to_string().as_slice()); + for b in bin.iter().rev() { + res.push_str(", _"); + res.push_str(b.to_string().as_slice()); + res.push_str(")"); + } + } + return res; +} + +// generate a parser to convert a string representation of a type-level natural +// to an ast fragment for a type +#[inline] +pub fn bin_nat_parser<'cx>( + ecx: &'cx mut base::ExtCtxt, + num: u64, +) -> parse::parser::Parser<'cx> { + let filemap = ecx + .codemap() + .new_filemap(String::from_str(""), to_bin_nat(num)); + let reader = lexer::StringReader::new( + &ecx.parse_sess().span_diagnostic, + filemap); + parser::Parser::new( + ecx.parse_sess(), + ecx.cfg(), + Box::new(reader)) +} + +// Expand Nat!(n) to a type-level binary nat where n is an int literal, e.g., +// Nat!(1024) +// ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) +#[inline] +pub fn nat_expand<'cx>( + ecx: &'cx mut base::ExtCtxt, + span: codemap::Span, + args: &[ast::TokenTree], +) -> Box { + let mut litp = ecx.new_parser_from_tts(args); + if let ast::Lit_::LitInt(lit, _) = litp.parse_lit().node { + Some(lit) + } else { + None + }.and_then(|lit| { + let mut natp = bin_nat_parser(ecx, lit); + Some(base::MacTy::new(natp.parse_ty())) + }).unwrap_or_else(|| { + ecx.span_err(span, "Nat!: expected an integer literal argument"); + base::DummyResult::any(span) + }) +} +``` + ##### Optimization of `Expr`! 
Because `Expr!` could be implemented as a plugin, the opportunity From b9f0c277e59aee60a0408fa7f75755cbad5196c3 Mon Sep 17 00:00:00 2001 From: Peter Marheine Date: Fri, 20 Feb 2015 13:15:32 -0700 Subject: [PATCH 0129/1195] Alternative: inline assembly. --- text/0000-compiler-fence-intrinsics.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/text/0000-compiler-fence-intrinsics.md b/text/0000-compiler-fence-intrinsics.md index f2ec7d0fbcd..9b5b3b5887d 100644 --- a/text/0000-compiler-fence-intrinsics.md +++ b/text/0000-compiler-fence-intrinsics.md @@ -49,6 +49,12 @@ appropriate. negative impact on performance in some situations where a compiler-only fence is appropriate. + * Recommend inline assembly to get a similar effect, such as `asm!("" ::: + "memory" : "volatile")`. LLVM provides an IR item specifically for this case + (`fence singlethread`), so I believe taking advantage of that feature in LLVM is + most appropriate, since its semantics are more rigorously defined and less + likely to yield unexpected (but not necessarily wrong) behavior. + # Unresolved questions These intrinsics may be better represented with a different name, such as From 508ece03c7eaa12ada8b74e4b7c09356a637eba4 Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Fri, 20 Feb 2015 13:56:42 -0700 Subject: [PATCH 0130/1195] Remove unnecessary attributes from examples --- text/0000-type-macros.md | 7 ------- 1 file changed, 7 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 214efe87fe5..d5e6ef88c45 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -128,7 +128,6 @@ use std::ops; impl ops::Add for Nil { type Output = Ys; - #[inline] fn add(self, rhs: Ys) -> Ys { rhs } @@ -140,7 +139,6 @@ impl ops::Add for Cons w { type Output = Cons; - #[inline] fn add(self, rhs: Ys) -> Cons { Cons(self.0, self.1 + rhs) } @@ -269,7 +267,6 @@ string is then parsed as a type, returning an ast fragment. 
// convert a u64 to a string representation of a type-level binary natural, e.g., // to_bin_nat(1024) // ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) -#[inline] fn to_bin_nat(mut num: u64) -> String { let mut res = String::from_str("_"); if num < 2 { @@ -294,7 +291,6 @@ fn to_bin_nat(mut num: u64) -> String { // generate a parser to convert a string representation of a type-level natural // to an ast fragment for a type -#[inline] pub fn bin_nat_parser<'cx>( ecx: &'cx mut base::ExtCtxt, num: u64, @@ -314,7 +310,6 @@ pub fn bin_nat_parser<'cx>( // Expand Nat!(n) to a type-level binary nat where n is an int literal, e.g., // Nat!(1024) // ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) -#[inline] pub fn nat_expand<'cx>( ecx: &'cx mut base::ExtCtxt, span: codemap::Span, @@ -458,7 +453,6 @@ macro_rules! impl_to_tuple { #[allow(non_snake_case)] impl<$($seq,)*> Fn<(HList![$($seq),*],)> for ToTuple { type Output = ($($seq,)*); - #[inline] extern "rust-call" fn call(&self, (this,): (HList![$($seq),*],)) -> ($($seq,)*) { match this { hlist_match![$($seq),*] => ($($seq,)*) @@ -475,7 +469,6 @@ macro_rules! impl_to_hlist { #[allow(non_snake_case)] impl<$($seq,)*> Fn<(($($seq,)*),)> for ToHList { type Output = HList![$($seq),*]; - #[inline] extern "rust-call" fn call(&self, (this,): (($($seq,)*),)) -> HList![$($seq),*] { match this { ($($seq,)*) => hlist![$($seq),*] From f448514b4db4c5e48ce06447b94db40959646781 Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Fri, 20 Feb 2015 14:00:50 -0700 Subject: [PATCH 0131/1195] Clean up nat plugin example; add term-level macro --- text/0000-type-macros.md | 69 ++++++++++++++++++++++++++++++---------- 1 file changed, 52 insertions(+), 17 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index d5e6ef88c45..6208bf4d25b 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -265,10 +265,11 @@ string is then parsed as a type, returning an ast fragment. 
```rust // convert a u64 to a string representation of a type-level binary natural, e.g., -// to_bin_nat(1024) +// nat_str(1024) // ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) -fn to_bin_nat(mut num: u64) -> String { - let mut res = String::from_str("_"); +fn nat_str(mut num: u64) -> String { + let path = "bit::_"; + let mut res = String::from_str(path); if num < 2 { res.push_str(num.to_string().as_slice()); } else { @@ -278,10 +279,11 @@ fn to_bin_nat(mut num: u64) -> String { num >>= 1; } res = ::std::iter::repeat('(').take(bin.len() - 1).collect(); - res.push_str("_"); + res.push_str(path); res.push_str(bin.pop().unwrap().to_string().as_slice()); for b in bin.iter().rev() { - res.push_str(", _"); + res.push_str(", "); + res.push_str(path); res.push_str(b.to_string().as_slice()); res.push_str(")"); } @@ -289,15 +291,14 @@ fn to_bin_nat(mut num: u64) -> String { return res; } -// generate a parser to convert a string representation of a type-level natural -// to an ast fragment for a type -pub fn bin_nat_parser<'cx>( +// Generate a parser with the nat string for `num` as input +fn nat_str_parser<'cx>( ecx: &'cx mut base::ExtCtxt, num: u64, ) -> parse::parser::Parser<'cx> { let filemap = ecx .codemap() - .new_filemap(String::from_str(""), to_bin_nat(num)); + .new_filemap(String::from_str(""), nat_str(num)); let reader = lexer::StringReader::new( &ecx.parse_sess().span_diagnostic, filemap); @@ -307,27 +308,61 @@ pub fn bin_nat_parser<'cx>( Box::new(reader)) } +// Try to parse an integer literal and return a new parser for its nat +// string; this is used to create both a type-level `Nat!` with +// `nat_ty_expand` and term-level `nat!` macro with `nat_tm_expand` +pub fn nat_lit_parser<'cx>( + ecx: &'cx mut base::ExtCtxt, + args: &[ast::TokenTree], +) -> Option> { + let mut litp = ecx.new_parser_from_tts(args); + if let ast::Lit_::LitInt(lit, _) = litp.parse_lit().node { + Some(nat_str_parser(ecx, lit)) + } else { + None + } +} + // Expand Nat!(n) to 
a type-level binary nat where n is an int literal, e.g., // Nat!(1024) // ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) -pub fn nat_expand<'cx>( +pub fn nat_ty_expand<'cx>( ecx: &'cx mut base::ExtCtxt, span: codemap::Span, args: &[ast::TokenTree], ) -> Box { - let mut litp = ecx.new_parser_from_tts(args); - if let ast::Lit_::LitInt(lit, _) = litp.parse_lit().node { - Some(lit) - } else { - None - }.and_then(|lit| { - let mut natp = bin_nat_parser(ecx, lit); + { + nat_lit_parser(ecx, args) + }.and_then(|mut natp| { Some(base::MacTy::new(natp.parse_ty())) }).unwrap_or_else(|| { ecx.span_err(span, "Nat!: expected an integer literal argument"); base::DummyResult::any(span) }) } + +// Expand nat!(n) to a term-level binary nat where n is an int literal, e.g., +// nat!(1024) +// ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) +pub fn nat_tm_expand<'cx>( + ecx: &'cx mut base::ExtCtxt, + span: codemap::Span, + args: &[ast::TokenTree], +) -> Box { + { + nat_lit_parser(ecx, args) + }.and_then(|mut natp| { + Some(base::MacExpr::new(natp.parse_expr())) + }).unwrap_or_else(|| { + ecx.span_err(span, "nat!: expected an integer literal argument"); + base::DummyResult::any(span) + }) +} + +#[test] +fn nats() { + let _: Nat!(42) = nat!(42); +} ``` ##### Optimization of `Expr`! From 41673515e458eb98c64e390a536d9f0c5af7b6f8 Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Fri, 20 Feb 2015 18:22:04 -0700 Subject: [PATCH 0132/1195] More clean up; mention hygiene --- text/0000-type-macros.md | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 6208bf4d25b..4b963a76633 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -268,10 +268,9 @@ string is then parsed as a type, returning an ast fragment. 
// nat_str(1024) // ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) fn nat_str(mut num: u64) -> String { - let path = "bit::_"; - let mut res = String::from_str(path); + let mut res: String; if num < 2 { - res.push_str(num.to_string().as_slice()); + res = num.to_string(); } else { let mut bin = vec![]; while num > 0 { @@ -279,11 +278,9 @@ fn nat_str(mut num: u64) -> String { num >>= 1; } res = ::std::iter::repeat('(').take(bin.len() - 1).collect(); - res.push_str(path); res.push_str(bin.pop().unwrap().to_string().as_slice()); for b in bin.iter().rev() { res.push_str(", "); - res.push_str(path); res.push_str(b.to_string().as_slice()); res.push_str(")"); } @@ -567,8 +564,16 @@ type macros would not prevent additional extensions. # Unresolved questions +## Alternative syntax for macro invocations in types + There is a question as to whether macros in types should allow `<` and `>` as delimiters for invocations, e.g. `Foo!`. However, this would raise a number of additional complications and is not necessary to consider for this RFC. If deemed desirable by the community, this functionality can be proposed separately. + +## Hygiene and type macros + +This RFC does not address the topic of hygiene regarding macros in +types. It is not clear to me whether there are issues here or not but +it may be worth considering in further detail. From f7d65de2ce0ff70474f29057cbe814146c6550ac Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Fri, 20 Feb 2015 20:35:17 -0700 Subject: [PATCH 0133/1195] Rewording; clarification; cleanup --- text/0000-type-macros.md | 204 ++++++++++++++++++++------------------- 1 file changed, 106 insertions(+), 98 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 4b963a76633..b0249df0cce 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -13,11 +13,10 @@ Macros are currently allowed in syntax fragments for expressions, items, and patterns, but not for types. 
This RFC proposes to lift that restriction for the following reasons: -1. Increase generality of the macro system - in the absence of a - concrete reason for disallowing macros in types, the limitation - should be removed in order to promote generality and to enable use - cases which would otherwise require resorting either to compiler - plugins or to more elaborate item-level macros. +1. Increase generality of the macro system - the limitation should be + removed in order to promote generality and to enable use cases which + would otherwise require resorting to either more elaborate plugins or + macros at the item-level. 2. Enable more programming patterns - macros in type positions provide a means to express **recursion** and **choice** within types in a @@ -28,15 +27,13 @@ restriction for the following reasons: ## Implementation -The proposed feature has been implemented at +The proposed feature has been prototyped at [this branch](https://github.com/freebroccolo/rust/commits/feature/type_macros). The -implementation is very simple and there is no novelty to the -design. The patches make a small modification to the existing macro -expansion functionality in order to support macro invocations in -syntax for types. No changes are made to type-checking or other phases -of the compiler. +implementation is straightforward and the impact of the changes is +limited in scope to the macro system. Type-checking and other phases +of compilation should be unaffected. -The biggest change introduced by this feature is a +The most significant change introduced by this feature is a [`TyMac`](https://github.com/freebroccolo/rust/blob/f8f8dbb6d332c364ecf26b248ce5f872a7a67019/src/libsyntax/ast.rs#L1274-L1275) case for the `Ty_` enum so that the parser can indicate a macro invocation in a type position. In other words, `TyMac` is added to the @@ -48,12 +45,12 @@ ast and handled analogously to `ExprMac`, `ItemMac`, and `PatMac`. 
Heterogeneous lists are one example where the ability to express recursion via type macros is very useful. They can be used as an -alternative to (or in combination with) tuples. Their recursive +alternative to or in combination with tuples. Their recursive structure provide a means to abstract over arity and to manipulate arbitrary products of types with operations like appending, taking length, adding/removing items, computing permutations, etc. -Heterogeneous lists are straightforward to define: +Heterogeneous lists can be defined like so: ```rust struct Nil; // empty HList @@ -65,13 +62,13 @@ impl HList for Nil {} impl HList for Cons {} ``` -However, writing them in code is not so convenient: +However, writing HList terms in code is not very convenient: ```rust let xs = Cons("foo", Cons(false, Cons(vec![0u64], Nil))); ``` -At the term-level, this is easy enough to fix with a macro: +At the term-level, this is an easy fix using macros: ```rust // term-level macro for HLists @@ -84,16 +81,16 @@ macro_rules! hlist { let xs = hlist!["foo", false, vec![0u64]]; ``` -Unfortunately, this is an incomplete solution. HList terms are more -convenient to write but HList types are not: +Unfortunately, this solution is incomplete because we have only made +HList terms easier to write. HList types are still inconvenient: ```rust let xs: Cons<&str, Cons<bool, Cons<Vec<u64>, Nil>>> = hlist!["foo", false, vec![0u64]]; ``` -Under this proposal—allowing macros in types—we would be able to use a -macro to improve writing the HList type as well. The complete example -follows: +Allowing type macros as this RFC proposes would allow us to +use Rust's macros to improve writing the HList type as +well. The complete example follows: ```rust // term-level macro for HLists @@ -117,9 +114,9 @@ Operations on HLists can be defined by recursion, using traits with associated type outputs at the type-level and implementation methods at the term-level. 
-HList append is provided as an example of such an operation. Macros in -types are used to make writing append at the type level more -convenient, e.g., with `Expr!`: +The HList append operation is provided as an example. Type macros are +used to make writing append at the type level (see `Expr!`) more +convenient than specifying the associated type projection manually: ```rust use std::ops; @@ -172,11 +169,12 @@ fn test_append() { ### Additional Examples ### -#### Type-level numbers +#### Type-level numerics -Another example where type macros can be useful is in the encoding of -numbers as types. Binary natural numbers can be represented as -follows: +Type-level numerics are another area where type macros can be +useful. The more common unary encodings (Peano numerals) are not +efficient enough to use in practice so we present an example +demonstrating binary natural numbers instead: ```rust struct _0; // 0 bit @@ -199,29 +197,41 @@ impl Nat for _1 {} impl Nat for (P, B) {} ``` -These can be used to index into tuples or HLists generically (linear -time generally or constant time up to a fixed number of -specializations). They can also be used to encode "sized" or "bounded" -data, like vectors: +These can be used to index into tuples or HLists generically, either +by specifying the path explicitly (e.g., `(a, b, c).at::<(_1, _0)>() +==> c`) or by providing a singleton term with the appropriate type +`(a, b, c).at((_1, _0)) ==> c`. Indexing is linear time in the general +case due to recursion, but can be made constant time for a fixed +number of specialized implementations. + +Type-level numbers can also be used to define "sized" or "bounded" +data, such as a vector indexed by its length: ```rust struct LengthVec(Vec); ``` -The type number can either be a phantom parameter `N` as above, or -represented concretely at the term-level (similar to list). In either -case, a length-safe API can be provided on top of types `Vec`. 
Because -the length is known statically, unsafe indexing would be allowable by -default. +Similar to the indexing example, the parameter `N` can either serve as +phantom data, or such a struct could also include a term-level +representation of N as another field. + +In either case, a length-safe API could be defined for container types +like `Vec`. "Unsafe" indexing (without bounds checking) into the +underlying container would be safe in general because the length of +the container would be known statically and reflected in the type of +the length-indexed wrapper. We could imagine an idealized API in the following fashion: ```rust // push, adding one to the length -fn push(x: A, xs: LengthVec) -> LengthVec; +fn push(xs: LengthVec, x: A) -> LengthVec; // pop, subtracting one from the length -fn pop(store: &mut A, xs: LengthVec) -> LengthVec; +fn pop(xs: LengthVec, store: &mut A) -> LengthVec; + +// look up an element at an index +fn at(xs: LengthVec, index: M) -> A; // append, adding the individual lengths fn append(xs: LengthVec, ys: LengthVec) -> LengthVec; @@ -230,23 +240,22 @@ fn append(xs: LengthVec, ys: LengthVec) -> Length fn iter(xs: LengthVec) -> LengthIterator; ``` -However, in order to be able to write something close to that in Rust, -we would need macros in types: +We can't write code like the above directly in Rust but we could +approximate it through type-level macros: ```rust // Expr! would expand + to Add::Output and integer constants to Nat!; see -// the HList append earlier in the RFC for a concrete example of how this -// might be defined +// the HList append earlier in the RFC for a concrete example Expr!(N + M) ==> >::Output // Nat! 
would expand integer literals to type-level binary naturals // and be implemented as a plugin for efficiency; see the following -// section for a concrete implementation +// section for a concrete example Nat!(4) ==> ((_1, _0), _0) -// Now we could expand the following type to something meaningful in Rust: +// `Expr!` and `Nat!` used for the LengthVec type: LengthVec ==> LengthVec>::Output> ==> LengthVec>::Output> @@ -255,9 +264,9 @@ LengthVec ##### Implementation of `Nat!` as a plugin The following code demonstrates concretely how `Nat!` can be -implemented as a plugin. As with the `HList!` example, this code is -already usable with the type macros implemented in the branch -referenced earlier in this RFC. +implemented as a plugin. As with the `HList!` example, this code (with +some additions) compiles and is usable with the type macros prototype +in the branch referenced earlier. For efficiency, the binary representation is first constructed as a string via iteration rather than recursively using `quote` macros. The @@ -268,9 +277,11 @@ string is then parsed as a type, returning an ast fragment. // nat_str(1024) // ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) fn nat_str(mut num: u64) -> String { + let path = "_"; let mut res: String; if num < 2 { - res = num.to_string(); + res = String::from_str(path); + res.push_str(num.to_string().as_slice()); } else { let mut bin = vec![]; while num > 0 { @@ -278,9 +289,11 @@ fn nat_str(mut num: u64) -> String { num >>= 1; } res = ::std::iter::repeat('(').take(bin.len() - 1).collect(); + res.push_str(path); res.push_str(bin.pop().unwrap().to_string().as_slice()); for b in bin.iter().rev() { res.push_str(", "); + res.push_str(path); res.push_str(b.to_string().as_slice()); res.push_str(")"); } @@ -364,33 +377,32 @@ fn nats() { ##### Optimization of `Expr`! -Because `Expr!` could be implemented as a plugin, the opportunity -would exist to perform various optimizations of type-level expressions -during expansion. 
Partial evaluation would be one approach to -this. Furthermore, expansion-time optimizations would not necessarily -be limited to simple arithmetic expressions but could be used for -other data like HLists. +Defining `Expr!` as a plugin would provide an opportunity to perform +various optimizations of more complex type-level expressions during +expansion. Partial evaluation would be one way to achieve +this. Furthermore, expansion-time optimizations wouldn't be limited to +arithmetic expressions but could be used for other data like HLists. -##### Native alternatives: types parameterized by constant values +##### Builtin alternatives: types parameterized by constant values -This example with type-level naturals is meant to illustrate the kind -of patterns macros in types enable. I am not suggesting the standard -libraries adopt _this particular_ representation as a means to address -the more general issue of lack of numeric parameterization for -types. There is +The example with type-level naturals serves to illustrate some of the +patterns type macros enable. This RFC is not intended to address the +lack of constant value type parameterization and type-level numerics +specifically. There is [another RFC here](https://github.com/rust-lang/rfcs/pull/884) which -does propose extending the type system to allow parameterization over -constants. +proposes extending the type system to address those issues. #### Conversion from HList to Tuple -With type macros, it is possible to write macros that convert back and -forth between tuples and HLists. This is very powerful because it lets -us reuse all of the operations we define for HLists (appending, taking -length, adding/removing items, computing permutations, etc.) on tuples -just by converting to HList, computing, then convert back to a tuple. +With type macros, it is possible to define conversions back and forth +between tuples and HLists. 
This is very powerful because it lets us +reuse at the level of tuples all of the recursive operations we can +define for HLists (appending, taking length, adding/removing items, +computing permutations, etc.). -The conversion can be implemented in the following fashion: +Conversions can be defined using macros/plugins and function +traits. Type macros are useful in this example for the associated type +`Output` and method return type in the traits. ```rust // type-level macro for HLists @@ -443,7 +455,7 @@ fn invoke_for_seq_upto_expand<'cx>( None }.and_then(|iterations| { - // generate a token tree: A0, ..., An + // generate a token tree: A0, …, An let mut ctx = range(0, iterations * 2 - 1).flat_map(|k| { if k % 2 == 0 { token::str_to_ident(format!("A{}", (k / 2)).as_slice()) @@ -532,48 +544,44 @@ fn test_to_hlist() { # Drawbacks There seem to be few drawbacks to implementing this feature as an -extension of the existing macro machinery. The change adds a very -small amount of additional complexity to the +extension of the existing macro machinery. The change adds a small +amount of additional complexity to the [parser](https://github.com/freebroccolo/rust/blob/e09cb32bcc04029dc4c16790e2aaa9811af27f25/src/libsyntax/parse/parser.rs#L1547-L1560) and [conversion](https://github.com/freebroccolo/rust/blob/e4b826b7afa1b5496b41ddaa1666014046ac5704/src/librustc_typeck/astconv.rs#L1301-L1303) -but the changes are almost negligible. +but the changes are minimal. As with all feature proposals, it is possible that designs for future -extensions to the macro system or type system might somehow interfere -with this functionality but it seems unlikely unless they are -significant, breaking changes. +extensions to the macro system or type system might interfere with +this functionality but it seems unlikely unless they are significant, +breaking changes. # Alternatives -There are no direct alternatives to my knowledge. 
Extensions to the -type system like data kinds, singletons, and various more elaborate -forms of staged programming (so-called CTFE) could conceivably cover -some cases where macros in types might otherwise be used. It is -unlikely they would provide the same level of functionality as macros, -particularly where plugins are concerned. Instead, such features would -probably benefit from type macros too. - -Not implementing this feature would mean disallowing some useful -programming patterns. There are some discussions in the community -regarding more extensive changes to the type system to address some of -these patterns. However, type macros along with associated types can -already accomplish many of the same things without the significant -engineering cost in terms of changes to the type system. Either way, -type macros would not prevent additional extensions. +There are no _direct_ alternatives. Extensions to the type system like +data kinds, singletons, and other forms of staged programming +(so-called CTFE) might alleviate the need for type macros in some +cases, however it is unlikely that they would provide a comprehensive +replacement, particularly where plugins are concerned. + +Not implementing this feature would mean not taking some reasonably +low-effort steps toward making certain programming patterns +easier. One potential consequence of this might be more pressure to +significantly extend the type system and other aspects of the language +to compensate. # Unresolved questions ## Alternative syntax for macro invocations in types -There is a question as to whether macros in types should allow `<` and -`>` as delimiters for invocations, e.g. `Foo!`. However, this would -raise a number of additional complications and is not necessary to +There is a question as to whether type macros should allow `<` and `>` +as delimiters for invocations, e.g. `Foo!`. 
This would raise a +number of additional complications and is probably not necessary to consider for this RFC. If deemed desirable by the community, this -functionality can be proposed separately. +functionality should be proposed separately. ## Hygiene and type macros -This RFC does not address the topic of hygiene regarding macros in -types. It is not clear to me whether there are issues here or not but -it may be worth considering in further detail. +This RFC also does not address the topic of hygiene regarding macros +in types. It is not clear whether there are issues here or not but it +may be worth considering in further detail. From 0b10896a620f168fd3efd36f647873e5a88f7761 Mon Sep 17 00:00:00 2001 From: Ionoclast Laboratories Date: Mon, 23 Feb 2015 08:59:00 -0800 Subject: [PATCH 0134/1195] Minor fixups to 0231-upvar-capture-inference.md. A handful of minor grammatical and typographical fixes. --- text/0231-upvar-capture-inference.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0231-upvar-capture-inference.md b/text/0231-upvar-capture-inference.md index f4becf914cf..1326f2e0008 100644 --- a/text/0231-upvar-capture-inference.md +++ b/text/0231-upvar-capture-inference.md @@ -8,9 +8,9 @@ The `||` unboxed closure form should be split into two forms—`||` for nonescap # Motivation -Having to specify `ref` and the capture mode for each unboxed closure is inconvenient (see Rust PR rust-lang/rust#16610). It would be more convenient for the programmer if, the type of the closure and the modes of the upvars could be inferred. This also eliminates the "line-noise" syntaxes like `|&:|`, which are arguably unsightly. +Having to specify `ref` and the capture mode for each unboxed closure is inconvenient (see Rust PR rust-lang/rust#16610). It would be more convenient for the programmer if the type of the closure and the modes of the upvars could be inferred. 
This also eliminates the "line-noise" syntaxes like `|&:|`, which are arguably unsightly. -Not all knobs can be removed, however: the programmer must manually specify whether each closure is escaping or nonescaping. To see this, observe that no sensible default for the closure `|| (*x).clone()` exists: if the function is nonescaping, it's a closure that returns a copy of `x` every time but does not move `x` into it; if the function is escaping, it's a that returns a copy of `x` and takes ownership of `x`. +Not all knobs can be removed, however—the programmer must manually specify whether each closure is escaping or nonescaping. To see this, observe that no sensible default for the closure `|| (*x).clone()` exists: if the function is nonescaping, it's a closure that returns a copy of `x` every time but does not move `x` into it; if the function is escaping, it's a closure that returns a copy of `x` and takes ownership of `x`. Therefore, we need two forms: one for *nonescaping* closures and one for *escaping* closures. Nonescaping closures are the commonest, so they get the `||` syntax that we have today, and a new `move ||` syntax will be introduced for escaping closures. From 6301bcc57aa664470178f38b7ef70c5124d0ecdd Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 23 Feb 2015 15:48:37 -0800 Subject: [PATCH 0135/1195] Amend RFC 517: Add material for stdio Expand the section on stdin, stdout, and stderr while also adding a new section explaining the fate of the current print-related functions. --- text/0517-io-os-reform.md | 49 ++++++++++++++++++++++++++++++++++++--- 1 file changed, 46 insertions(+), 3 deletions(-) diff --git a/text/0517-io-os-reform.md b/text/0517-io-os-reform.md index a867e549266..8bd44018d3b 100644 --- a/text/0517-io-os-reform.md +++ b/text/0517-io-os-reform.md @@ -62,6 +62,7 @@ follow-up PRs against this RFC. 
* [Errors] * [Channel adapters] * [stdin, stdout, stderr] + * [Printing functions] * [std::env] * [std::fs] * [Free functions] @@ -72,7 +73,6 @@ follow-up PRs against this RFC. * [TCP] * [UDP] * [Addresses] - * [std::net] (stub) * [std::process] * [Command] * [Child] @@ -1155,7 +1155,49 @@ RFC recommends they remain unstable. #### `stdin`, `stdout`, `stderr` [stdin, stdout, stderr]: #stdin-stdout-stderr -> To be added in a follow-up PR. +The current `stdio` module will be removed in favor of three constructors: + +* `stdin` - returns a handle to a **globally shared** to the standard input of + the process which is buffered as well. All operations on this handle will + first require acquiring a lock to ensure access to the shared buffer is + synchronized. The handle can be explicitly locked for a critical section so + relocking is not necessary. + + The `Read` trait will be implemented directly on the returned `Stdin` handle + but the `BufRead` trait will not be (due to synchronization concerns). The + locked version of `Stdin` will provide an implementation of `BufRead`. + + The design will largely be the same as is today with the `old_io` module. + +* `stderr` - returns a **non buffered** handle to the standard error output + stream for the process. Each call to `write` will roughly translate to a + system call to output data when written to `stderr`. + +* `stdout` - returns a **locally buffered** handle to the standard output of the + current process. The amount of buffering can be decided at runtime to allow + for different situations such as being attached to a TTY or being redirected + to an output file. The `Write` trait will be implemented for this handle. + +The `stderr_raw` constructor is removed because the handle is no longer buffered +and the `stdin_raw` and `stdout_raw` handles are removed to be added at a later +date in the `std::os` modules if necessary. 
+ +#### Printing functions +[Printing functions]: #printing-functions + +The current `print`, `println`, `print_args`, and `println_args` functions will +all be "removed from the public interface" by [prefixing them with `__` and +marking `#[doc(hidden)]`][gh22607]. These are all implementation details of the +`print!` and `println!` macros and don't need to be exposed in the public +interface. + +[gh22607]: https://github.com/rust-lang/rust/issues/22607 + +The `set_stdout` and `set_stderr` functions will be moved to a new +`std::fmt::output` module and renamed to `set_print` and `set_panic`, +respectively. These new names reflect what they actually do, removing a +longstanding confusion. The current `stdio::flush` function will also move to +this module and be renamed to `flush_print`. ### `std::env` [std::env]: #stdenv @@ -1173,7 +1215,8 @@ and the signatures will be updated to follow this RFC's * `vars` (renamed from `env`): yields a vector of `(OsString, OsString)` pairs. * `var` (renamed from `getenv`): take a value bounded by `AsOsStr`, - allowing Rust strings and slices to be ergonomically passed in. Yields an `Option`. + allowing Rust strings and slices to be ergonomically passed in. Yields an + `Option`. * `var_string`: take a value bounded by `AsOsStr`, returning `Result` where `VarError` represents a non-unicode `OsString` or a "not present" value. From 0466d1836362d275fd015b44cb79aedab3b8174f Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 24 Feb 2015 21:44:08 -0800 Subject: [PATCH 0136/1195] All handles are globally locked, stdout is globally buffered Also add back text for `foo_raw` and details. --- text/0517-io-os-reform.md | 74 +++++++++++++++++++++++++++++++++------ 1 file changed, 64 insertions(+), 10 deletions(-) diff --git a/text/0517-io-os-reform.md b/text/0517-io-os-reform.md index 8bd44018d3b..bb178255b95 100644 --- a/text/0517-io-os-reform.md +++ b/text/0517-io-os-reform.md @@ -1155,9 +1155,19 @@ RFC recommends they remain unstable. 
#### `stdin`, `stdout`, `stderr` [stdin, stdout, stderr]: #stdin-stdout-stderr -The current `stdio` module will be removed in favor of three constructors: +The current `stdio` module will be removed in favor of these constructors in the +`io` module: -* `stdin` - returns a handle to a **globally shared** to the standard input of +```rust +fn stdin() -> Stdin; +fn stdout() -> Stdout; +fn stderr() -> Stderr; +fn stdin_raw() -> StdinRaw; +fn stdout_raw() -> StdoutRaw; +fn stderr_raw() -> StderrRaw; +``` + +* `stdin` - returns a handle to a **globally shared** standard input of the process which is buffered as well. All operations on this handle will first require acquiring a lock to ensure access to the shared buffer is synchronized. The handle can be explicitly locked for a critical section so @@ -1169,18 +1179,60 @@ The current `stdio` module will be removed in favor of three constructors: The design will largely be the same as is today with the `old_io` module. + ```rust + impl Stdin { + fn lock(&self) -> StdinLock; + fn read_line(&mut self, into: &mut String) -> io::Result<()>; + fn read_until(&mut self, byte: u8, into: &mut Vec) -> io::Result<()>; + } + impl Read for Stdin { ... } + impl Read for StdinLock { ... } + impl BufRead for StdinLock { ... } + ``` + * `stderr` - returns a **non buffered** handle to the standard error output stream for the process. Each call to `write` will roughly translate to a - system call to output data when written to `stderr`. + system call to output data when written to `stderr`. This handle is locked + like `stdin` to ensure, for example, that calls to `write_all` are atomic with + respect to one another. There will also be an RAII guard to lock the handle + and use the result as an instance of `Write`. -* `stdout` - returns a **locally buffered** handle to the standard output of the - current process. 
The amount of buffering can be decided at runtime to allow - for different situations such as being attached to a TTY or being redirected - to an output file. The `Write` trait will be implemented for this handle. + ```rust + impl Stderr { + fn lock(&self) -> StderrLock; + } + impl Write for Stderr { ... } + impl Write for StderrLock { ... } + ``` -The `stderr_raw` constructor is removed because the handle is no longer buffered -and the `stdin_raw` and `stdout_raw` handles are removed to be added at a later -date in the `std::os` modules if necessary. +* `stdout` - returns a **globally buffered** handle to the standard output of + the current process. The amount of buffering can be decided at runtime to + allow for different situations such as being attached to a TTY or being + redirected to an output file. The `Write` trait will be implemented for this + handle, and like `stderr` it will be possible to lock it and then use the + result as an instance of `Write` as well. + + ```rust + impl Stdout { + fn lock(&self) -> StdoutLock; + } + impl Write for Stdout { ... } + impl Write for StdoutLock { ... } + ``` + +* `*_raw` - these constructors will return references to the raw handles which + are guaranteed to not be protected with any form of lock or have any backing + buffer. Their APIs will look like: + + ```rust + impl Read for StdinRaw { ... } + impl Write for StdoutRaw { ... } + impl Write for StderrRaw { ... } + ``` + + The documentation for `stdin_raw` will indicate that extra data may be + buffered in the `stdin` handle which will not be accessible to the `stdin_raw` + handle. #### Printing functions [Printing functions]: #printing-functions @@ -1199,6 +1251,8 @@ respectively. These new names reflect what they actually do, removing a longstanding confusion. The current `stdio::flush` function will also move to this module and be renamed to `flush_print`. +The entire `std::fmt::output` module will remain `#[unstable]` for now, however. 
+ ### `std::env` [std::env]: #stdenv From 9907335c4d977210a4653bafc38e5932e5168383 Mon Sep 17 00:00:00 2001 From: llogiq Date: Wed, 25 Feb 2015 11:09:24 +0100 Subject: [PATCH 0137/1195] Update 0840-no-panic-in-c-string.md Added an alternative design which renames the currently used functions. --- text/0840-no-panic-in-c-string.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/text/0840-no-panic-in-c-string.md b/text/0840-no-panic-in-c-string.md index f7f319190e3..4b665a04d4a 100644 --- a/text/0840-no-panic-in-c-string.md +++ b/text/0840-no-panic-in-c-string.md @@ -85,6 +85,9 @@ reason for that exists; composition is preferred to adding function variants. Longer function names, together with a less convenient return value, may deter people from using the safer functions. +The panicky functions could also be renamed to `unpack_slice` and `unpack_vec`, +respectively, to highlight their conceptual proximity to `unpack`. + If the panicky behavior is preserved, plentiful possibilities for DoS attacks and other unforeseen failures in the field may be introduced by code oblivious to the input constraints. From 1e5c6030cf48a3021b59dc898c7d0c8f7ec12bcf Mon Sep 17 00:00:00 2001 From: Carl Lerche Date: Wed, 25 Feb 2015 09:40:45 -0800 Subject: [PATCH 0138/1195] Move `std::thread_local::*` into `std::thread` --- text/0000-move-thread-local-to-std-thread.md | 49 ++++++++++++++++++++ 1 file changed, 49 insertions(+) create mode 100644 text/0000-move-thread-local-to-std-thread.md diff --git a/text/0000-move-thread-local-to-std-thread.md b/text/0000-move-thread-local-to-std-thread.md new file mode 100644 index 00000000000..eb059b92871 --- /dev/null +++ b/text/0000-move-thread-local-to-std-thread.md @@ -0,0 +1,49 @@ +- Feature Name: N/A +- Start Date: 2015-02-25 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Move the contents of `std::thread_local` into `std::thread`. Fully +remove `std::thread_local` from the standard library. 
+ +# Motivation + +Thread locals are directly related to threading. Combining the modules +would reduce the number of top level modules, making browsing the docs +easier as well as reduce the number of `use` statements. + +# Detailed design + +The goal is to move the contents of `std::thread_local` into +`std::thread`. There are a few possible strategies that could be used to +achieve this. + +One option would be to move the contents as is into `std::thread`. This +would leave `Key` and `State` as is. There would be no naming conflict, +but the names would be less ideal since the containing module is not +directly related to thread locals anymore. This could be handled by +renaming the types to something like `LocalKey` and `LocalState`. + +Another option would be to move the contents into a dedicated sub module +such as `std::thread::local`. This would mean some code would still have +an extra `use` statement for pulling in thread local related types, but +it would also enable doing: + +`use std::thread::{local, Thread};` + +# Drawbacks + +It's pretty late in the 1.0 release cycle. This is a mostly bike +shedding level of a change. It may not be worth changing it at this +point and staying with two top level modules in `std`. Also, some users +may prefer to have more top level modules. + +# Alternatives + +Leaving `std::thread_local` in its own module. 
+ +# Unresolved questions + +The exact strategy for moving the contents into `std::thread` From ea2b0e79e08ae0ab84277ce155b9885641b38445 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 25 Feb 2015 13:59:49 -0800 Subject: [PATCH 0139/1195] Clarify stdio_raw and behavior on Windows vs Unix --- text/0517-io-os-reform.md | 103 +++++++++++++++++++++++++++++++------- 1 file changed, 86 insertions(+), 17 deletions(-) diff --git a/text/0517-io-os-reform.md b/text/0517-io-os-reform.md index bb178255b95..0f648c2f55c 100644 --- a/text/0517-io-os-reform.md +++ b/text/0517-io-os-reform.md @@ -1159,12 +1159,9 @@ The current `stdio` module will be removed in favor of these constructors in the `io` module: ```rust -fn stdin() -> Stdin; -fn stdout() -> Stdout; -fn stderr() -> Stderr; -fn stdin_raw() -> StdinRaw; -fn stdout_raw() -> StdoutRaw; -fn stderr_raw() -> StderrRaw; +pub fn stdin() -> Stdin; +pub fn stdout() -> Stdout; +pub fn stderr() -> Stderr; ``` * `stdin` - returns a handle to a **globally shared** standard input of @@ -1220,19 +1217,91 @@ fn stderr_raw() -> StderrRaw; impl Write for StdoutLock { ... } ``` -* `*_raw` - these constructors will return references to the raw handles which - are guaranteed to not be protected with any form of lock or have any backing - buffer. Their APIs will look like: +#### Windows and stdio +[Windows stdio]: #windows-and-stdio - ```rust - impl Read for StdinRaw { ... } - impl Write for StdoutRaw { ... } - impl Write for StderrRaw { ... } - ``` +On Windows, standard input and output handles can work with either arbitrary +`[u8]` or `[u16]` depending on the state at runtime. For example a program +attached to the console will work with arbitrary `[u16]`, but a program attached +to a pipe would work with arbitrary `[u8]`. 
+ +To handle this difference, the following behavior will be enforced for the +standard primitives listed above: + +* If attached to a pipe then no attempts at encoding or decoding will be done, + the data will be ferried through as `[u8]`. + +* If attached to a console, then `stdin` will attempt to interpret all input as + UTF-16, re-encoding into UTF-8 and returning the UTF-8 data instead. This + implies that data will be buffered internally to handle partial reads/writes. + Invalid UTF-16 will simply be discarded returning an `io::Error` explaining + why. + +* If attached to a console, then `stdout` and `stderr` will attempt to interpret + input as UTF-8, re-encoding to UTF-16. If the input is not valid UTF-8 then an + error will be returned and no data will be written. + +#### Raw stdio +[Raw stdio]: #raw-stdio + +The above standard input/output handles all involve some form of locking or +buffering (or both). This cost is not always wanted, and hence raw variants will +be provided. Due to platform differences across unix/windows, the following +structure will be supported: + +```rust +mod os { + mod unix { + mod stdio { + struct Stdio { .. } + + impl Stdio { + fn stdout() -> Stdio; + fn stderr() -> Stdio; + fn stdin() -> Stdio; + } + + impl Read for Stdio { ... } + impl Write for Stdio { ... } + } + } + + mod windows { + mod stdio { + struct Stdio { ... } + struct StdioConsole { ... } + + impl Stdio { + fn stdout() -> io::Result; + fn stderr() -> io::Result; + fn stdin() -> io::Result; + } + // same constructors StdioConsole + + impl Read for Stdio { ... } + impl Write for Stdio { ... 
} + + impl StdioConsole { + // returns slice of what was read + fn read<'a>(&self, buf: &'a mut OsString) -> io::Result<&'a OsStr>; + // returns remaining part of `buf` to be written + fn write<'a>(&self, buf: &'a OsStr) -> io::Result<&'a OsStr>; + } + } + } +} +``` + +There are some key differences from today's API: - The documentation for `stdin_raw` will indicate that extra data may be - buffered in the `stdin` handle which will not be accessible to the `stdin_raw` - handle. +* On unix, the API has not changed much except that the handles have been + consolidated into one type which implements both `Read` and `Write` (although + writing to stdin is likely to generate an error). +* On windows, there are two sets of handles representing the difference between + "console mode" and not (e.g. a pipe). When not a console the normal I/O traits + are implemented (delegating to `ReadFile` and `WriteFile`). The console mode + operations work with `OsStr`, however, to show how they work with UCS-2 under + the hood. #### Printing functions [Printing functions]: #printing-functions From 3c02ef5cc680b9e0472da57bfb69e5196a0fd312 Mon Sep 17 00:00:00 2001 From: Eduard Burtescu Date: Thu, 26 Feb 2015 00:07:09 +0200 Subject: [PATCH 0140/1195] Const functions and inherent methods. --- text/0000-const-fn.md | 150 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 150 insertions(+) create mode 100644 text/0000-const-fn.md diff --git a/text/0000-const-fn.md b/text/0000-const-fn.md new file mode 100644 index 00000000000..d491bd6cbf4 --- /dev/null +++ b/text/0000-const-fn.md @@ -0,0 +1,150 @@ +- Feature Name: const_fn +- Start Date: 2015-02-25 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Allow marking free functions and inherent methods as `const`, enabling them to be +called in constant contexts, with constant arguments.
+ +# Motivation + +As it is right now, `UnsafeCell` is a stabilization and safety hazard: the field +it is supposed to be wrapping is public. This is only done out of the necessity +to initialize static items containing atomics, mutexes, etc. - for example: +```rust +#[lang="unsafe_cell"] +struct UnsafeCell { pub value: T } +struct AtomicUsize { v: UnsafeCell } +const ATOMIC_USIZE_INIT: AtomicUsize = AtomicUsize { + v: UnsafeCell { value: 0 } +}; +``` +This approach is fragile and doesn't compose well - consider having to initialize +an `AtomicUsize` static with `usize::MAX` - you would need a `const` for each +possible value. +Also, types like `AtomicPtr` or `Cell` have no way *at all* to initialize +them in constant contexts, leading to overuse of `UnsafeCell` or `static mut`, +disregarding type safety and proper abstractions. +During implementation, the worst offender I've found was `std::thread_local`: +all the fields of `std::thread_local::imp::Key` are public, so they can be +filled in by a macro - and they're marked "stable". + +A pre-RFC for the removal of the dangerous (and oftenly misued) `static mut` +received positive feedback, but only under the condition that abstractions +could be created and used in `const` and `static` items. + +Another concern is the ability to use certain intrinsics, like `size_of`, inside +constant expressions, including fixed-length array types. Unlike keyword-based +alternatives, `const fn` provides an extensible and composable building block +for such features. 
+ +# Detailed design + +Functions and inherent methods can be marked as `const`: +```rust +const fn foo(x: T, y: U) -> Foo { + stmts; + expr +} +impl Foo { + const fn new(x: T) -> Foo { + stmts; + expr + } + + const fn transform(self, y: U) -> Foo { + stmts; + expr + } +} +``` +Traits, trait implementations and their methods cannot be `const` - this +allows us to properly design a constness/CTFE system that interacts well +with traits - for more details, see *Alternatives*. +Only simple by-value immutable bindings are allowed as arguments' patterns. +The body of the function is checked as if it were a block inside a `const`: +```rust +const FOO: Foo = { + // Currently, only item "statements" are allowed here. + stmts; + // The function's arguments and constant expressions can be freely combined. + expr +} +``` +For the purpose of rvalue promotion (to static memory), arguments are considered +potentially varying, because the function can still be called with non-constant +values at runtime. + +`const` functions and methods can be called from any constant expression: +```rust +// Standalone example. +struct Point { x: i32, y: i32 } + +impl Point { + const fn new(x: i32, y: i32) -> Point { + Point { x: x, y: y } + } + + const fn add(self, other: Point) -> Point { + Point::new(self.x + other.x, self.y + other.y) + } +} + +const ORIGIN: Point = Point::new(0, 0); + +const fn sum_test(xs: [Point; 3]) -> Point { + xs[0].add(xs[1]).add(xs[2]) +} + +const A: Point = Point::new(1, 0); +const B: Point = Point::new(0, 1); +const C: Point = A.add(B); +const D: Point = sum_test([A, B, C]); + +// Assuming the Foo::new methods used here are const. +static FLAG: AtomicBool = AtomicBool::new(true); +static COUNTDOWN: AtomicUsize = AtomicUsize::new(10); +#[thread_local] +static TLS_COUNTER: Cell = Cell::new(1); +``` + +# Drawbacks + +None that I know of. + +# Alternatives + +* Not do anything for 1.0. 
This would result in some APIs being crippled and +serious backwards compatibility issues - `UnsafeCell`'s `value` field cannot +simply be removed later. +* While not an alternative, but rather a potential extension, there is only way +I could make `const fn`s work with traits (in an untested design, that is): +qualify trait implementations and bounds with `const`. This is necessary for +meaningful interactions with overloading traits - quick example: +```rust +const fn map_vec3 T>(xs: [T; 3], f: F) -> [T; 3] { + [f([xs[0]), f([xs[1]), f([xs[2])] +} + +const fn neg_vec3(xs: [T; 3]) -> [T; 3] { + map_vec3(xs, |x| -x) +} + +const impl Add for Point { + fn add(self, other: Point) -> Point { + Point { + x: self.x + other.x, + y: self.y + other.y + } + } +} +``` +Having `const` trait methods (where all implementations are `const`) seems +useful, but is not enough of its own. + +# Unresolved questions + +Should we allow `unsafe const fn`? The implementation cost is neglible, but I +am not certain it needs to exist. From 05e10ca7d4a21cf50aedab64d17ec5c2b49baa0f Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Thu, 26 Feb 2015 18:05:31 +1300 Subject: [PATCH 0141/1195] Remove ascription on patterns. Add details about coercions. --- text/0000-type-ascription.md | 154 ++++++++++------------------------- 1 file changed, 43 insertions(+), 111 deletions(-) diff --git a/text/0000-type-ascription.md b/text/0000-type-ascription.md index 7c7e8bc80d5..a36c86cea41 100644 --- a/text/0000-type-ascription.md +++ b/text/0000-type-ascription.md @@ -4,10 +4,10 @@ # Summary -Add type ascription to expressions and patterns. +Add type ascription to expressions. (An earlier version of this RFC covered type +ascription in patterns too, that has been postponed). -Type ascription on expression has already been implemented. Type ascription on -patterns can probably wait until post-1.0. +Type ascription on expression has already been implemented. 
See also discussion on [#354](https://github.com/rust-lang/rfcs/issues/354) and [rust issue 10502](https://github.com/rust-lang/rust/issues/10502). @@ -16,12 +16,12 @@ See also discussion on [#354](https://github.com/rust-lang/rfcs/issues/354) and # Motivation Type inference is imperfect. It is often useful to help type inference by -annotating a sub-expression or sub-pattern with a type. Currently, this is only -possible by extracting the sub-expression into a variable using a `let` -statement and/or giving a type for a whole expression or pattern. This is un- -ergonomic, and sometimes impossible due to lifetime issues. Specifically, a -variable has lifetime of its enclosing scope, but a sub-expression's lifetime is -typically limited to the nearest semi-colon. +annotating a sub-expression with a type. Currently, this is only possible by +extracting the sub-expression into a variable using a `let` statement and/or +giving a type for a whole expression or pattern. This is unergonomic, and +sometimes impossible due to lifetime issues. Specifically, a variable has the +lifetime of its enclosing scope, but a sub-expression's lifetime is typically +limited to the nearest semi-colon. Typical use cases are where a function's return type is generic (e.g., collect) and where we want to force a coercion. @@ -90,18 +90,6 @@ let x: T = { let x: T foo(): U<_>; ``` -In patterns: - -``` -struct Foo { a: T, b: String } - -// Current -fn foo(Foo { a, .. }: Foo) { ... } - -// With type ascription. -fn foo(Foo { a: i32, .. }) { ... } -``` - # Detailed design @@ -122,72 +110,44 @@ At runtime, type ascription is a no-op, unless an implicit coercion was used in type checking, in which case the dynamic semantics of a type ascription expression are exactly those of the implicit coercion. -The syntax of patterns is extended to include an optional type ascription. -Old syntax: - -``` -PT ::= P: T -P ::= var | 'box' P | ... -e ::= 'let' (PT | P) = ... | ... 
-``` - -where `PT` is a pattern with optional type, `P` is a sub-pattern, `T` is a type, -and `var` is a variable name. (Formal arguments are `PT`, patterns in match arms -are `P`). +@eddyb has implemented the expressions part of this RFC, +[PR](https://github.com/rust-lang/rust/pull/21836). -New syntax: -``` -PT ::= P: T | P -P ::= var | 'box' PT | ... -e ::= 'let' PT = ... | ... -``` +### coercion and `as` vs `:` -Type ascription in patterns has the narrowest precedence, e.g., `box x: T` means -`box (x: T)`. In particular, in a struct initialiser or patter, `x : y : z` is -parsed as `x : (y: z)`, i.e., a field named `x` is initialised with a value `y` -and that value must have type `z`. If only `x: y` is given, that is considered -to be the field name and the field's contents, with no type ascription. +A downside of type ascription is the overlap with explicit coercions (aka casts, +the `as` operator). Type ascription makes implicit coercions explicit. In RFC +401, it is proposed that all valid implicit coercions are valid explicit +coercions. However, that may be too confusing for users, since there is no +reason to use type ascription rather than `as` (if there is some coercion). +Furthermore, if programmers do opt to use `as` as the default whether or not it +is required, then it loses its function as a warning sign for programmers to +beware of. -The chagnes to pattern syntax mean that in some contexts where a pattern -previously required a type annotation, it is no longer required if all variables -can be assigned types via the ascription. Examples, +To address this I propose three lints which check for: trivial casts, coercible +casts, and trivial numeric casts. Other than these lints we stick with the +proposal from #401 that unnecessary casts will no longer be an error. 
-``` -struct Foo { - a: Bar, - b: Baz, -} -fn foo(x: Foo); // Ok, type of x given by type of whole pattern -fn foo(Foo { a: x, b: y}: Foo) // Ok, types of x and y found by destructuring -fn foo(Foo { a: x: Bar, b: y: Baz}) // Ok, no type annotation, but types given as ascriptions -fn foo(Foo { a: x: Bar, _ }) // Ok, we can still deduce the type of x and the whole argument -fn foo(Foo { a: x, b: y}) // Ok, type of x and y given by Foo - -struct Qux { - a: Bar, - b: X, -} -fn foo(x: Qux); // Ok, type of x given by type of whole pattern -fn foo(Qux { a: x, b: y}: Qux) // Ok, types of x and y found by destructuring -fn foo(Qux { a: x: Bar, b: y: Baz}) // Ok, no type annotation, but types given as ascriptions -fn foo(Qux { a: x: Bar, _ }) // Error, can't find the type of the whole argument -fn foo(Qux { a: x, b: y}) // Error can't find type of y or the whole argument -``` +A trivial cast is a cast `x as T` where `x` has type `U` and `U` is a subtype of +`T` (note that subtyping includes reflexivity). -Note the above changes mean moving some errors from parsing to later in type -checking. For example, all uses of patterns have optional types, and it is a -type error if there must be a type (e.g., in function arguments) but it is not -fully specified (currently it would be a parsing error). +A coercible cast is a cast `x as T` where `x` has type `U` and `x` can be +implicitly coerced to `T`, but `U` is not a subtype of `T`. -In type checking, if an expression is matched against a pattern, when matching -a sub-pattern the matching sub-expression must have the ascribed type (again, -this check includes subtyping and implicit coercion). Types in patterns play no -role at runtime. +A trivial numeric cast is a cast `x as T` where `x` has type `U` and `x` is +implicitly coercible to `T` or `U` is a subtype of `T`, and both `U` and `T` are +numeric types. -@eddyb has implemented the expressions part of this RFC, -[PR](https://github.com/rust-lang/rust/pull/21836). 
+Like any lints, these can be customised per-crate by the programmer. The trivial +cast lint is 'deny' by default (i.e., causes an error); the coercible cast and +trivial numeric cast lints are 'warn' by default. +Although this is a somewhat complex scheme, it allows code that works today to +work with only minor adjustment, it allows for a backwards compatible path to +'promoting' type conversions from explicit casts to implicit coercions, and it +allows customisation of a contentious kind of error (especially so in the +context of cross-platform programming). # Drawbacks @@ -205,11 +165,6 @@ difficult to support the same syntax as field initialisers. We could do nothing and force programmers to use temporary variables to specify a type. However, this is less ergonomic and has problems with scopes/lifetimes. -Patterns can be given a type as a whole rather than annotating a part of the -pattern. - -We could allow type ascription in expressions but not patterns. This is a -smaller change and addresses most of the motivation. Rely on explicit coercions - the current plan [RFC 401](https://github.com/rust-lang/rfcs/blob/master/text/0401-coercions.md) is to allow explicit coercion to any valid type and to use a customisable lint @@ -221,35 +176,12 @@ which require more programmer attention. This also does not help with patterns. We could use a different symbol or keyword instead of `:`, e.g., `is`. -# Unresolved questions - -Is the suggested precedence correct? Especially for patterns. -Does type ascription on patterns have backwards compatibility issues? - -Given the potential confusion with struct literal syntax, it is perhaps worth -re-opening that discussion. But given the timing, probably not. +# Unresolved questions -Should remove integer suffixes in favour of type ascription? +Is the suggested precedence correct? -### `as` vs `:` +Should we remove integer suffixes in favour of type ascription? 
-A downside of type ascription is the overlap with explicit coercions (aka casts, -the `as` operator). Type ascription makes implicit coercions explicit. In RFC -401, it is proposed that all valid implicit coercions are valid explicit -coercions. However, that may be too confusing for users, since there is no -reason to use type ascription rather than `as` (if there is some coercion). It -might be a good idea to revisit that decision (it has not yet been implemented). -Then it is clear that the user uses `as` for explicit casts and `:` for non- -coercing ascription and implicit casts. Although there is no hard guideline for -which operations are implicit or explicit, the intuition is that if the -programmer ought to be aware of the change (i.e., the invariants of using the -type change to become less safe in any way) then coercion should be explicit, -otherwise it can be implicit. - -Alternatively we could remove `as` and require `:` for explicit coercions, but -not for implicit ones (they would keep the same rules as they currently have). -The only loss would be that `:` doesn't stand out as much as `as` and there -would be no lint for trivial coercions. Another (backwards compatible) -alternative would be to keep `as` and `:` as synonyms and recommend against -using `as`. +Style guidelines - should we recommend spacing or parentheses to make type +ascription syntax more easily recognisable? From 92a52439c80f4ac984bbedce4148ecca2c18be85 Mon Sep 17 00:00:00 2001 From: Eduard Burtescu Date: Thu, 26 Feb 2015 10:25:17 +0200 Subject: [PATCH 0142/1195] Address comments and expand the goals and definitions that were partially implied.
--- text/0000-const-fn.md | 82 ++++++++++++++++++++++++++++++++++++++----- 1 file changed, 73 insertions(+), 9 deletions(-) diff --git a/text/0000-const-fn.md b/text/0000-const-fn.md index d491bd6cbf4..8df5c9e4368 100644 --- a/text/0000-const-fn.md +++ b/text/0000-const-fn.md @@ -21,15 +21,19 @@ const ATOMIC_USIZE_INIT: AtomicUsize = AtomicUsize { v: UnsafeCell { value: 0 } }; ``` + This approach is fragile and doesn't compose well - consider having to initialize an `AtomicUsize` static with `usize::MAX` - you would need a `const` for each possible value. + Also, types like `AtomicPtr` or `Cell` have no way *at all* to initialize them in constant contexts, leading to overuse of `UnsafeCell` or `static mut`, disregarding type safety and proper abstractions. + During implementation, the worst offender I've found was `std::thread_local`: all the fields of `std::thread_local::imp::Key` are public, so they can be -filled in by a macro - and they're marked "stable". +filled in by a macro - and they're also marked "stable" (due to the lack of +stability hygiene in macros). A pre-RFC for the removal of the dangerous (and oftenly misued) `static mut` received positive feedback, but only under the condition that abstractions @@ -40,6 +44,12 @@ constant expressions, including fixed-length array types. Unlike keyword-based alternatives, `const fn` provides an extensible and composable building block for such features. +The design should be as simple as it can be, while keeping enough functionality +to solve the issues mentioned above. +The intention is to have something usable at 1.0 without limiting what we can +in the future. Compile-time pure constants (the existing `const` items) with +added parametrization over types and values (arguments) should suffice. 
+ +# Detailed design Functions and inherent methods can be marked as `const`: @@ -60,10 +70,13 @@ impl Foo { } } ``` + Traits, trait implementations and their methods cannot be `const` - this allows us to properly design a constness/CTFE system that interacts well with traits - for more details, see *Alternatives*. + Only simple by-value immutable bindings are allowed as arguments' patterns. + The body of the function is checked as if it were a block inside a `const`: ```rust const FOO: Foo = { @@ -73,6 +86,35 @@ const FOO: Foo = { expr } ``` + +As the current `const` items are not formally specified (yet), there is a need +to expand on the rules for `const` values (pure compile-time constants), instead +of leaving them implicit: +* the set of currently implemented expressions is: primitive literals, ADTs +(tuples, arrays, structs, enum variants), unary/binary operations on primitives, +casts, field accesses/indexing, capture-less closures, references and blocks +(only item statements and a tail expression) +* no side-effects (assignments, non-`const` function calls, inline assembly) +* struct/enum values are not allowed if their type implements `Drop`, but +this is not transitive, allowing the (perfectly harmless) creation of, e.g. +`None::>` (as an aside, this rule could be used to allow `[x; N]` even +for non-`Copy` types of `x`, but that is out of the scope of this RFC) +* references are truly immutable, no value with interior mutability can be placed +behind a reference, and mutable references can only be created from zero-sized +values (e.g. 
`&mut || {}`) - this allows a reference to be represented just by +its value, with no guarantees for the actual address in memory +* raw pointers can only be created from an integer, a reference or another raw +pointer, and cannot be dereferenced or cast back to an integer, which means any +constant raw pointer can be represented by either a constant integer or reference +* as a result of not having any side-effects, loops would only affect termination, +which has no practical value, thus remaining unimplemented +* although more useful than loops, conditional control flow (`if`/`else` and +`match`) also remains unimplemented and only `match` would pose a challenge +* immutable `let` bindings in blocks have the same status and implementation +difficulty as `if`/`else` and they both suffer from a lack of demand (blocks +were originally introduced to `const`/`static` for scoping items used only in +the initializer of a global). + For the purpose of rvalue promotion (to static memory), arguments are considered potentially varying, because the function can still be called with non-constant values at runtime. @@ -110,19 +152,30 @@ static COUNTDOWN: AtomicUsize = AtomicUsize::new(10); static TLS_COUNTER: Cell = Cell::new(1); ``` +Type parameters and their bounds are not restricted, though trait methods cannot +be called, as they are never `const` in this design. Accessing trait methods can +still be useful - for example, they can be turned into function pointers: +```rust +const fn arithmetic_ops() -> [fn(T, T) -> T; 4] { + [Add::add, Sub::sub, Mul::mul, Div::div] +} +``` + # Drawbacks -None that I know of. +* A design that is not conservative enough risks creating backwards compatibility +hazards that might only be uncovered when a more extensive CTFE proposal is made, +after 1.0. # Alternatives * Not do anything for 1.0. This would result in some APIs being crippled and serious backwards compatibility issues - `UnsafeCell`'s `value` field cannot simply be removed later. 
-* While not an alternative, but rather a potential extension, there is only way -I could make `const fn`s work with traits (in an untested design, that is): -qualify trait implementations and bounds with `const`. This is necessary for -meaningful interactions with overloading traits - quick example: +* While not an alternative, but rather a potential extension, I want to point +out there is only one way I could make `const fn`s work with traits (in an untested +design, that is): qualify trait implementations and bounds with `const`. +This is necessary for meaningful interactions with operator overloading traits: ```rust const fn map_vec3 T>(xs: [T; 3], f: F) -> [T; 3] { [f([xs[0]), f([xs[1]), f([xs[2])] @@ -142,9 +195,20 @@ const impl Add for Point { } ``` Having `const` trait methods (where all implementations are `const`) seems -useful, but is not enough of its own. +useful, but it would not allow the use case above on its own. +Trait implementations with `const` methods (instead of the entire `impl` +being `const`) would allow direct calls, but it's not obvious how one could +write a function generic over a type which implements a trait and requires +that a certain method of that trait is implemented as `const`. # Unresolved questions -Should we allow `unsafe const fn`? The implementation cost is neglible, but I -am not certain it needs to exist. +* Allow `unsafe const fn`? The implementation cost is negligible, but I am not +certain it needs to exist. +* Keep recursion or disallow it for now? The conservative choice of having no +recursive `const fn`s would not affect the use cases intended for this RFC. +If we do allow it, we probably need a recursion limit, and/or an evaluation +algorithm that can handle *at least* tail recursion. +Also, there is no way to actually write a recursive `const fn` at this moment, +because no control flow primitives are implemented for constants, but that +cannot be taken for granted, at least `if`/`else` should eventually work.
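As a rough illustration of the `Point` example discussed in the const-fn RFC text above, here is a minimal sketch that should compile on a current stable Rust toolchain, where `const fn` is available. The derives on `Point` are an assumption added for this sketch (the RFC's example shows none); `Copy` is what lets `sum_test` read elements out of the array by value, and `Debug`/`PartialEq` exist only so the results can be compared.

```rust
// Sketch only: the derives are an assumption, not part of the RFC's example.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

impl Point {
    // A `const` inherent constructor, usable in `const` and `static` initializers.
    const fn new(x: i32, y: i32) -> Point {
        Point { x, y }
    }

    // A `const` inherent method taking simple by-value arguments.
    const fn add(self, other: Point) -> Point {
        Point::new(self.x + other.x, self.y + other.y)
    }
}

// A free `const fn` that combines calls to other `const` functions.
const fn sum_test(xs: [Point; 3]) -> Point {
    xs[0].add(xs[1]).add(xs[2])
}

const A: Point = Point::new(1, 0);
const B: Point = Point::new(0, 1);
const C: Point = A.add(B);
const D: Point = sum_test([A, B, C]);

fn main() {
    // Values computed at compile time.
    assert_eq!(C, Point { x: 1, y: 1 });
    assert_eq!(D, Point { x: 2, y: 2 });

    // The same functions remain callable with non-constant values at runtime.
    let p = Point::new(3, 4).add(A);
    assert_eq!(p, Point { x: 4, y: 4 });
}
```

Because `add` and `sum_test` are ordinary functions as well, they can still be called with non-constant arguments at runtime, which is why the RFC treats arguments as potentially varying for the purpose of rvalue promotion.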
From 64657484db963f66becf531fb6df0d821fa25197 Mon Sep 17 00:00:00 2001 From: Eduard Burtescu Date: Thu, 26 Feb 2015 11:18:58 +0200 Subject: [PATCH 0143/1195] Simplify the limitations on arguments and add explanation. --- text/0000-const-fn.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-const-fn.md b/text/0000-const-fn.md index 8df5c9e4368..5414625abf8 100644 --- a/text/0000-const-fn.md +++ b/text/0000-const-fn.md @@ -75,7 +75,9 @@ Traits, trait implementations and their methods cannot be `const` - this allows us to properly design a constness/CTFE system that interacts well with traits - for more details, see *Alternatives*. -Only simple by-value immutable bindings are allowed as arguments' patterns. +Only simple by-value bindings are allowed in arguments, e.g. `x: T`. While +by-ref bindings and destructuring can be supported, they're not necessary +and they would only complicate the implementation. The body of the function is checked as if it were a block inside a `const`: ```rust From 5e38d1d8bb9889f963d3f88da6c290edb1001a0e Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Fri, 27 Feb 2015 08:47:35 +1300 Subject: [PATCH 0144/1195] Correct a typo and improve some text --- text/0000-type-ascription.md | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/text/0000-type-ascription.md b/text/0000-type-ascription.md index a36c86cea41..a14e8b34733 100644 --- a/text/0000-type-ascription.md +++ b/text/0000-type-ascription.md @@ -87,7 +87,7 @@ let x: T = { }; // With type ascription. -let x: T foo(): U<_>; +let x: T = foo(): U<_>; ``` @@ -117,13 +117,14 @@ expression are exactly those of the implicit coercion. ### coercion and `as` vs `:` A downside of type ascription is the overlap with explicit coercions (aka casts, -the `as` operator). Type ascription makes implicit coercions explicit. In RFC -401, it is proposed that all valid implicit coercions are valid explicit -coercions. 
However, that may be too confusing for users, since there is no -reason to use type ascription rather than `as` (if there is some coercion). -Furthermore, if programmers do opt to use `as` as the default whether or not it -is required, then it loses its function as a warning sign for programmers to -beware of. +the `as` operator). To the programmer, type ascription makes implicit coercions +explicit (however, the compiler makes no distinction between coercions due to +type ascription and other coercions). In RFC 401, it is proposed that all valid +implicit coercions are valid explicit coercions. However, that may be too +confusing for users, since there is no reason to use type ascription rather than +`as` (if there is some coercion). Furthermore, if programmers do opt to use `as` +as the default whether or not it is required, then it loses its function as a +warning sign for programmers to beware of. To address this I propose three lints which check for: trivial casts, coercible casts, and trivial numeric casts. Other than these lints we stick with the From fcee8fc6002c45689e6292ed57bd22c69c06126d Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Thu, 26 Feb 2015 13:33:11 -0700 Subject: [PATCH 0145/1195] Reword/clarify motivation --- text/0000-type-macros.md | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index b0249df0cce..cb5957e2cb1 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -11,17 +11,21 @@ Allow macros in type positions Macros are currently allowed in syntax fragments for expressions, items, and patterns, but not for types. This RFC proposes to lift that -restriction for the following reasons: +restriction. -1. 
Increase generality of the macro system - the limitation should be - removed in order to promote generality and to enable use cases which - would otherwise require resorting either more elaborate plugins or - macros at the item-level. +1. This would allow macros to be used more flexibly, avoiding the + need for more complex item-level macros or plugins in some + cases. For example, when creating trait implementations with + macros, it is sometimes useful to be able to define the + associated types using a nested type macro but this is + currently problematic. + +2. Enable more programming patterns, particularly with respect to + type-level programming. Macros in type positions provide + a convenient way to express recursion and choice. It is possible + to do the same thing purely through programming with associated + types but the resulting code can be cumbersome to read and write. -2. Enable more programming patterns - macros in type positions provide - a means to express **recursion** and **choice** within types in a - fashion that is still legible. Associated types alone can accomplish - the former (recursion/choice) but not the latter (legibility). # Detailed design From 8d8bf2ccbb6e25df775f848bbe99d3eef0a3aa8d Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Thu, 26 Feb 2015 13:44:01 -0700 Subject: [PATCH 0146/1195] Update links --- text/0000-type-macros.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index cb5957e2cb1..6ca8cbfc85f 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -550,9 +550,9 @@ fn test_to_hlist() { There seem to be few drawbacks to implementing this feature as an extension of the existing macro machinery.
The change adds a small amount of additional complexity to the -[parser](https://github.com/freebroccolo/rust/blob/e09cb32bcc04029dc4c16790e2aaa9811af27f25/src/libsyntax/parse/parser.rs#L1547-L1560) +[parser](https://github.com/freebroccolo/rust/commit/a224739e92a3aa1febb67d6371988622bd141361) and -[conversion](https://github.com/freebroccolo/rust/blob/e4b826b7afa1b5496b41ddaa1666014046ac5704/src/librustc_typeck/astconv.rs#L1301-L1303) +[conversion](https://github.com/freebroccolo/rust/commit/9341232087991dee73713dc4521acdce11a799a2) but the changes are minimal. As with all feature proposals, it is possible that designs for future From 05ef23c308b9b3d71bf5006af8d057777215fa7a Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Thu, 26 Feb 2015 22:41:53 -0500 Subject: [PATCH 0147/1195] Add Copy alternative. --- text/0000-embrace-extend-extinguish.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/text/0000-embrace-extend-extinguish.md b/text/0000-embrace-extend-extinguish.md index cd80373f5f8..b28e9d7dba0 100644 --- a/text/0000-embrace-extend-extinguish.md +++ b/text/0000-embrace-extend-extinguish.md @@ -65,7 +65,10 @@ Hidden clones? # Alternatives -Nope. +Restrict this proposal to only work for Copy types. This avoids any concern over +implicit expensive operations, and enables easily working with Plain Old Data. +The only downside is creating a larger divide between Clone and Copy, while also +being a bit needlessly inexpressive. 
# Unresolved questions From 50ff468f9b1d7bafb912374e7df2c51329112d6e Mon Sep 17 00:00:00 2001 From: Carl Lerche Date: Thu, 26 Feb 2015 21:29:21 -0800 Subject: [PATCH 0148/1195] Focus on a single `std::thread_local` renaming strategy --- text/0000-move-thread-local-to-std-thread.md | 32 +++++++++----------- 1 file changed, 15 insertions(+), 17 deletions(-) diff --git a/text/0000-move-thread-local-to-std-thread.md b/text/0000-move-thread-local-to-std-thread.md index eb059b92871..cde24abdd5b 100644 --- a/text/0000-move-thread-local-to-std-thread.md +++ b/text/0000-move-thread-local-to-std-thread.md @@ -11,27 +11,21 @@ remove `std::thread_local` from the standard library. # Motivation Thread locals are directly related to threading. Combining the modules -would reduce the number of top level modules, making browsing the docs -easier as well as reduce the number of `use` statements. +would reduce the number of top level modules, combine related concepts, +and make browsing the docs easier. It also would have the potential to +slightly reduce the number of `use` statements. # Detailed design -The goal is to move the contents of `std::thread_local` into -`std::thread`. There are a few possible strategies that could be used to -achieve this. +The `std::thread_local` module would be renamed to `std::thread::local`. +All contents of the module would remain the same. This way, all thread +related code is combined in one module. -One option would be to move the contents as is into `std::thread`. This -would leave `Key` and `State` as is. There would be no naming conflict, -but the names would be less ideal since the containing module is not -directly related to thread locals anymore. This could be handled by -renaming the types to something like `LocalKey` and `LocalState`. +It would also allow using it as such: -Another option would be to move the contents into a dedicated sub module -such as `std::thread::local`.
This would mean some code would still have -an extra `use` statement for pulling in thread local related types, but -it would also enable doing: - -`use std::thread::{local, Thread};` +```rust +use std::thread::{local, Thread}; +``` # Drawbacks @@ -42,7 +36,11 @@ may prefer to have more top level modules. # Alternatives -Leaving `std::thread_local` in its own module. +Another strategy for moving `std::thread_local` would be to move it +directly into `std::thread` without scoping it in a dedicated module. +There are no naming conflicts, but the names would not be ideal anymore. +One way to mitigate would be to rename the types to something like +`LocalKey` and `LocalState`. # Unresolved questions From 7b1994c3a899c2aa023bacf6fb62ded91420d0d5 Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Sat, 28 Feb 2015 13:52:51 -0700 Subject: [PATCH 0149/1195] Include additional details in code examples --- text/0000-type-macros.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 6ca8cbfc85f..9e342a444b2 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -57,11 +57,13 @@ length, adding/removing items, computing permutations, etc. 
Heterogeneous lists can be defined like so: ```rust +#[derive(Copy, Clone, Debug, Eq, Ord, PartialEq, PartialOrd)] struct Nil; // empty HList +#[derive(Copy, Clone, Debug, Eq, Ord, PartialEq, PartialOrd)] struct Cons(H, T); // cons cell of HList // trait to classify valid HLists -trait HList {} +trait HList: MarkerTrait {} impl HList for Nil {} impl HList for Cons {} ``` @@ -185,17 +187,17 @@ struct _0; // 0 bit struct _1; // 1 bit // classify valid bits -trait Bit {} +trait Bit: MarkerTrait {} impl Bit for _0 {} impl Bit for _1 {} // classify positive binary naturals -trait Pos {} +trait Pos: MarkerTrait {} impl Pos for _1 {} impl Pos for (P, B) {} // classify binary naturals with 0 -trait Nat {} +trait Nat: MarkerTrait {} impl Nat for _0 {} impl Nat for _1 {} impl Nat for (P, B) {} From ce82cc889e50eba54cd36bd6e023a059c7085d5a Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Sat, 28 Feb 2015 13:53:25 -0700 Subject: [PATCH 0150/1195] Modify `Expr!` to not need extra parentheses --- text/0000-type-macros.md | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 9e342a444b2..39ee64b449d 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -149,8 +149,10 @@ impl ops::Add for Cons w // type macro Expr allows us to expand the + operator appropriately macro_rules! Expr { - { $A:ty } => { $A }; - { $LHS:tt + $RHS:tt } => { >::Output }; + { ( $($LHS:tt)+ ) } => { Expr!($($LHS)+) }; + { HList ! 
[ $($LHS:tt)* ] + $($RHS:tt)+ } => { >::Output }; + { $LHS:tt + $($RHS:tt)+ } => { >::Output }; + { $LHS:ty } => { $LHS }; } // test demonstrating term level `xs + ys` and type level `Expr!(Xs + Ys)` @@ -164,10 +166,10 @@ fn test_append() { let xs: HList![&str, bool, Vec] = hlist!["foo", false, vec![]]; let ys: HList![u64, [u8; 3], ()] = hlist![0, [0, 1, 2], ()]; - // parentheses around compound types due to limitations in macro parsing; - // real implementation could use a plugin to avoid this - let zs: Expr!((HList![&str, bool, Vec]) + - (HList![u64, [u8; 3], ()])) + // demonstrate recursive expansion of Expr! + let zs: Expr!((HList![&str] + HList![bool] + HList![Vec]) + + (HList![u64] + HList![[u8; 3], ()]) + + HList![]) = aux(xs, ys); assert_eq!(zs, hlist!["foo", false, vec![], 0, [0, 1, 2], ()]) } From 27faef47ab427f0b1cb55a43450e54b525b6f5bf Mon Sep 17 00:00:00 2001 From: Alexis Date: Sun, 1 Mar 2015 10:37:39 -0500 Subject: [PATCH 0151/1195] entry API v3 --- text/0000-entry_v3.md | 123 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 123 insertions(+) create mode 100644 text/0000-entry_v3.md diff --git a/text/0000-entry_v3.md b/text/0000-entry_v3.md new file mode 100644 index 00000000000..957fee41ab0 --- /dev/null +++ b/text/0000-entry_v3.md @@ -0,0 +1,123 @@ +- Feature Name: entry_v3 +- Start Date: 2015-03-01 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Replace Entry::get with Entry::default and Entry::default_with for better ergonomics and clearer +code. + +# Motivation + +Entry::get was introduced to reduce a lot of the boiler-plate involved in simple Entry usage. 
Two +incredibly common patterns in particular stand out: + +``` +match map.entry(key) { + Entry::Vacant(entry) => { entry.insert(1); }, + Entry::Occupied(mut entry) => { *entry.get_mut() += 1; }, +} +``` + +``` +match map.entry(key) { + Entry::Vacant(entry) => { entry.insert(vec![val]); }, + Entry::Occupied(mut entry) => { entry.get_mut().push(val); }, +} +``` + +This code is noisy, and is visibly fighting the Entry API a bit, such as having to suppress +the return value of insert. It requires the `Entry` enum to be imported into scope. It requires +the user to learn a whole new API. It also introduces a "many ways to do it" stylistic ambiguity: + +``` +match map.entry(key) { + Entry::Vacant(entry) => entry.insert(vec![]), + Entry::Occupied(entry) => entry.into_mut(), +}.push(val); +``` + +Entry::get tries to address some of this by doing something similar to `Result::ok`. +It maps the Entry into a more familiar Result, while automatically converting the +Occupied case into an `&mut V`. Usage looks like: + + +``` +*map.entry(key).get().unwrap_or_else(|entry| entry.insert(0)) += 1; +``` + +``` +entry(key).get().unwrap_or_else(|entry| entry.insert(vec![])).push(val); +``` + +This is certainly *nicer*. No imports are needed, the Occupied case is handled, and we're closer +to a "only one way". However this is still fairly tedious and arcane. `get` provides little +meaning for what is done; unwrap_or_else is long and scary-sounding; and VacantEntry literally +*only* supports `insert`, so having to call it seems redundant. + +# Detailed design + +Replace `Entry::get` with the following two methods: + +``` + /// Ensures a value is in the entry by inserting the default if empty, and returns + /// a mutable reference to the value in the entry. + pub fn default(self,
default: V) -> &'a mut V { + match self { + Occupied(entry) => entry.into_mut(), + Vacant(entry) => entry.insert(default), + } + } + + #[unstable(feature = "collections", + /// Ensures a value is in the entry by inserting the result of the default function if empty, + /// and returns a mutable reference to the value in the entry. + pub fn default_with<F: FnOnce() -> V>(self, default: F) -> &'a mut V { + match self { + Occupied(entry) => entry.into_mut(), + Vacant(entry) => entry.insert(default()), + } + } +``` + +which allows the following: + + +``` +*map.entry(key).default(0) += 1; +``` + +``` +// vec![] doesn't even allocate, and is only 3 ptrs big. +entry(key).default(vec![]).push(val); +``` + +``` +let val = entry(key).default_with(|| expensive(big, data)); +``` + +Look at all that ergonomics. *Look at it*. This pushes us more into the "one right way" +territory, since this is unambiguously clearer and easier than a full `match` or abusing Result. +Novices don't really need to learn the entry API at all with this. They can just learn the +`.entry(key).default(value)` incantation to start, and work their way up to more complex +usage later. + +Oh hey look this entire RFC is already implemented with all of `rust-lang/rust`'s `entry` +usage audited and updated: https://github.com/rust-lang/rust/pull/22930 + +# Drawbacks + +Replaces the composability of just mapping to a Result with more ad hoc specialty methods. This +is hardly a drawback for the reasons stated in the RFC. Maybe someone was really leveraging +the Result-ness in an exotic way, but it was likely an abuse of the API. Regardless, the `get` +method is trivial to write as a consumer of the API. + +# Alternatives + +Settle for Result chumpsville or abandon this sugar altogether. Truly, fates worse than death. + +# Unresolved questions + +None.
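
The two methods proposed in this RFC can be tried out without touching the standard library. The following is only a sketch under stated assumptions: `EntryDefault` is a hypothetical extension trait (not part of `std`), it is implemented only for `HashMap` entries, and `count_words` and `group_by_len` are made-up helpers that exercise it.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical extension trait mirroring the two methods proposed in the RFC.
trait EntryDefault<'a, V: 'a> {
    /// Inserts `default` if the entry is vacant; returns a reference to the value.
    fn default(self, default: V) -> &'a mut V;
    /// Like `default`, but only computes the value when the entry is vacant.
    fn default_with<F: FnOnce() -> V>(self, default: F) -> &'a mut V;
}

impl<'a, K: Eq + Hash + 'a, V: 'a> EntryDefault<'a, V> for Entry<'a, K, V> {
    fn default(self, default: V) -> &'a mut V {
        match self {
            Entry::Occupied(entry) => entry.into_mut(),
            Entry::Vacant(entry) => entry.insert(default),
        }
    }

    fn default_with<F: FnOnce() -> V>(self, default: F) -> &'a mut V {
        match self {
            Entry::Occupied(entry) => entry.into_mut(),
            Entry::Vacant(entry) => entry.insert(default()),
        }
    }
}

/// Counting pattern from the Motivation section, written with `default`.
fn count_words<'w>(words: &[&'w str]) -> HashMap<&'w str, u32> {
    let mut counts: HashMap<&'w str, u32> = HashMap::new();
    for &word in words {
        *counts.entry(word).default(0) += 1;
    }
    counts
}

/// Grouping pattern from the Motivation section, written with `default_with`.
fn group_by_len<'w>(words: &[&'w str]) -> HashMap<usize, Vec<&'w str>> {
    let mut groups: HashMap<usize, Vec<&'w str>> = HashMap::new();
    for &word in words {
        groups.entry(word.len()).default_with(Vec::new).push(word);
    }
    groups
}
```

Both helpers avoid importing `Entry` at the call site and never spell out the vacant/occupied `match`, which is the ergonomic gain the RFC argues for.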
+ From 89dc9fe8b8396ff464e66250257389bd3de415cb Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Sun, 1 Mar 2015 21:41:13 -0700 Subject: [PATCH 0152/1195] Renaming; fix examples to use MacEager --- text/0000-type-macros.md | 57 ++++++++++++++++++++++------------------ 1 file changed, 31 insertions(+), 26 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 39ee64b449d..ecdf229acbc 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -281,10 +281,14 @@ string via iteration rather than recursively using `quote` macros. The string is then parsed as a type, returning an ast fragment. ```rust -// convert a u64 to a string representation of a type-level binary natural, e.g., -// nat_str(1024) -// ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) -fn nat_str(mut num: u64) -> String { +// Convert a u64 to a string representation of a type-level binary natural, e.g., +// ast_as_str(1024) +// ==> "(((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0)" +fn ast_as_str<'cx>( + ecx: &'cx base::ExtCtxt, + mut num: u64, + mode: Mode, +) -> String { let path = "_"; let mut res: String; if num < 2 { @@ -306,17 +310,18 @@ fn nat_str(mut num: u64) -> String { res.push_str(")"); } } - return res; + res } -// Generate a parser with the nat string for `num` as input -fn nat_str_parser<'cx>( - ecx: &'cx mut base::ExtCtxt, +// Generate a parser which uses the nat's ast-as-string as its input +fn ast_parser<'cx>( + ecx: &'cx base::ExtCtxt, num: u64, + mode: Mode, ) -> parse::parser::Parser<'cx> { let filemap = ecx .codemap() - .new_filemap(String::from_str(""), nat_str(num)); + .new_filemap(String::from_str(""), ast_as_str(ecx, num, mode)); let reader = lexer::StringReader::new( &ecx.parse_sess().span_diagnostic, filemap); @@ -326,16 +331,16 @@ fn nat_str_parser<'cx>( Box::new(reader)) } -// Try to parse an integer literal and return a new parser for its nat -// string; this is used to create both a type-level `Nat!` with 
-// `nat_ty_expand` and term-level `nat!` macro with `nat_tm_expand` -pub fn nat_lit_parser<'cx>( - ecx: &'cx mut base::ExtCtxt, +// Try to parse an integer literal and return a new parser which uses +// the nat's ast-as-string as its input +pub fn lit_parser<'cx>( + ecx: &'cx base::ExtCtxt, args: &[ast::TokenTree], + mode: Mode, ) -> Option> { - let mut litp = ecx.new_parser_from_tts(args); - if let ast::Lit_::LitInt(lit, _) = litp.parse_lit().node { - Some(nat_str_parser(ecx, lit)) + let mut lit_parser = ecx.new_parser_from_tts(args); + if let ast::Lit_::LitInt(lit, _) = lit_parser.parse_lit().node { + Some(ast_parser(ecx, lit, mode)) } else { None } @@ -344,15 +349,15 @@ pub fn nat_lit_parser<'cx>( // Expand Nat!(n) to a type-level binary nat where n is an int literal, e.g., // Nat!(1024) // ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) -pub fn nat_ty_expand<'cx>( +pub fn expand_ty<'cx>( ecx: &'cx mut base::ExtCtxt, span: codemap::Span, args: &[ast::TokenTree], ) -> Box { { - nat_lit_parser(ecx, args) - }.and_then(|mut natp| { - Some(base::MacTy::new(natp.parse_ty())) + lit_parser(ecx, args, Mode::Ty) + }.and_then(|mut ast_parser| { + Some(base::MacEager::ty(ast_parser.parse_ty())) }).unwrap_or_else(|| { ecx.span_err(span, "Nat!: expected an integer literal argument"); base::DummyResult::any(span) @@ -362,15 +367,15 @@ pub fn nat_ty_expand<'cx>( // Expand nat!(n) to a term-level binary nat where n is an int literal, e.g., // nat!(1024) // ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) -pub fn nat_tm_expand<'cx>( +pub fn expand_tm<'cx>( ecx: &'cx mut base::ExtCtxt, span: codemap::Span, args: &[ast::TokenTree], ) -> Box { { - nat_lit_parser(ecx, args) - }.and_then(|mut natp| { - Some(base::MacExpr::new(natp.parse_expr())) + lit_parser(ecx, args, Mode::Tm) + }.and_then(|mut ast_parser| { + Some(base::MacEager::expr(ast_parser.parse_expr())) }).unwrap_or_else(|| { ecx.span_err(span, "nat!: expected an integer literal argument"); 
base::DummyResult::any(span) @@ -487,7 +492,7 @@ fn invoke_for_seq_upto_expand<'cx>( } // splice the impl fragments into the ast - Some(base::MacItems::new(items.into_iter())) + Some(base::MacEager::items(SmallVector::many(items))) }).unwrap_or_else(|| { ecx.span_err(span, "invoke_for_seq_upto!: expected an integer literal argument"); From 729ab73f1d43f3daec6f2d805673c285bb5df437 Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Sun, 1 Mar 2015 23:25:31 -0700 Subject: [PATCH 0153/1195] Modify macro example to match patterns --- text/0000-type-macros.md | 31 ++++++++++++++++++++++++------- 1 file changed, 24 insertions(+), 7 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index ecdf229acbc..3b588cb174e 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -80,8 +80,16 @@ At the term-level, this is an easy fix using macros: // term-level macro for HLists macro_rules! hlist { {} => { Nil }; - { $head:expr } => { Cons($head, Nil) }; + {=> $($elem:tt),+ } => { hlist_pat!($($elem),+) }; { $head:expr, $($tail:expr),* } => { Cons($head, hlist!($($tail),*)) }; + { $head:expr } => { Cons($head, Nil) }; +} + +// term-level HLists in patterns +macro_rules! hlist_pat { + {} => { Nil }; + { $head:pat, $($tail:tt),* } => { Cons($head, hlist_pat!($($tail),*)) }; + { $head:pat } => { Cons($head, Nil) }; } let xs = hlist!["foo", false, vec![0u64]]; @@ -102,8 +110,16 @@ well. The complete example follows: // term-level macro for HLists macro_rules! hlist { {} => { Nil }; - { $head:expr } => { Cons($head, Nil) }; + {=> $($elem:tt),+ } => { hlist_pat!($($elem),+) }; { $head:expr, $($tail:expr),* } => { Cons($head, hlist!($($tail),*)) }; + { $head:expr } => { Cons($head, Nil) }; +} + +// term-level HLists in patterns +macro_rules! 
hlist_pat { + {} => { Nil }; + { $head:pat, $($tail:tt),* } => { Cons($head, hlist_pat!($($tail),*)) }; + { $head:pat } => { Cons($head, Nil) }; } // type-level macro for HLists @@ -428,15 +444,16 @@ macro_rules! HList { // term-level macro for HLists macro_rules! hlist { {} => { Nil }; - { $head:expr } => { Cons($head, Nil) }; + {=> $($elem:tt),+ } => { hlist_pat!($($elem),+) }; { $head:expr, $($tail:expr),* } => { Cons($head, hlist!($($tail),*)) }; + { $head:expr } => { Cons($head, Nil) }; } // term-level HLists in patterns -macro_rules! hlist_match { +macro_rules! hlist_pat { {} => { Nil }; - { $head:ident } => { Cons($head, Nil) }; - { $head:ident, $($tail:ident),* } => { Cons($head, hlist_match!($($tail),*)) }; + { $head:pat, $($tail:tt),* } => { Cons($head, hlist_pat!($($tail),*)) }; + { $head:pat } => { Cons($head, Nil) }; } // `invoke_for_seq_upto` is a `higher-order` macro that takes the name @@ -512,7 +529,7 @@ macro_rules! impl_to_tuple { type Output = ($($seq,)*); extern "rust-call" fn call(&self, (this,): (HList![$($seq),*],)) -> ($($seq,)*) { match this { - hlist_match![$($seq),*] => ($($seq,)*) + hlist![=> $($seq),*] => ($($seq,)*) } } } From 431d6601dd458339e56196b38432bebe8be0812d Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Mon, 2 Mar 2015 07:51:15 -0500 Subject: [PATCH 0154/1195] fixup + bikeshed --- text/0000-entry_v3.md | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/text/0000-entry_v3.md b/text/0000-entry_v3.md index 957fee41ab0..5e1acb41904 100644 --- a/text/0000-entry_v3.md +++ b/text/0000-entry_v3.md @@ -48,7 +48,7 @@ Occupied case into an `&mut V`. Usage looks like: ``` ``` -entry(key).get().unwrap_or_else(|entry| entry.insert(vec![])).push(val); +map.entry(key).get().unwrap_or_else(|entry| entry.insert(vec![])).push(val); ``` This is certainly *nicer*. 
No imports are needed, the Occupied case is handled, and we're closer @@ -70,7 +70,6 @@ Replace `Entry::get` with the following two methods: } } - #[unstable(feature = "collections", /// Ensures a value is in the entry by inserting the result of the default function if empty, /// and returns a mutable reference to the value in the entry. pub fn default_with V>(self. default: F) -> &'a mut V { @@ -90,11 +89,11 @@ which allows the following: ``` // vec![] doesn't even allocate, and is only 3 ptrs big. -entry(key).default(vec![]).push(val); +map.entry(key).default(vec![]).push(val); ``` ``` -let val = entry(key).default_with(|| expensive(big, data)); +let val = map.entry(key).default_with(|| expensive(big, data)); ``` Look at all that ergonomics. *Look at it*. This pushes us more into the "one right way" @@ -119,5 +118,11 @@ Settle for Result chumpsville or abandon this sugar altogether. Truly, fates wor # Unresolved questions -None. +`default` and `default_with` are universally reviled as *names*. Need a better name. Some candidates. + +* set_default +* or_insert +* insert_default +* insert_if_vacant +* with_default From a17aa3563a07fdedd301faa783a92c9212831eb2 Mon Sep 17 00:00:00 2001 From: Carl Lerche Date: Mon, 2 Mar 2015 11:26:34 -0800 Subject: [PATCH 0155/1195] Rename to InetAddr, add any* functions * Rename SocketAddr -> InetAddr * Add various proxy fns to IpAddr * Introduce `any*` functions to create unspecified inet addresses --- text/0517-io-os-reform.md | 111 +++++++++++++++++++++++++++++++++----- 1 file changed, 98 insertions(+), 13 deletions(-) diff --git a/text/0517-io-os-reform.md b/text/0517-io-os-reform.md index a867e549266..db19e061d60 100644 --- a/text/0517-io-os-reform.md +++ b/text/0517-io-os-reform.md @@ -1370,6 +1370,91 @@ The contents of `std::io::net` submodules `tcp`, `udp`, `ip` and the other modules are being moved or removed and are described elsewhere. +#### InetAddr + +The composition of an `IpAddr` and a port. 
It has the following interface: + +```rust +impl InetAddr { + /// Returns a new InetAddr composed of an unspecified v4 IP and a 0 + /// port + fn any_v4() -> InetAddr; + + /// Returns a new InetAddr composed of an unspecified v6 IP and a 0 + /// port + fn any_v6() -> InetAddr; + + fn ip(&self) -> IpAddr; + fn port(&self) -> u16; + + /// Returns true if the IpAddr is unspecified and port == 0 + fn is_unspecified(&self) -> bool; +} +``` + +#### IpAddr + +Represents an IP address. It has the following interface: + +```rust +impl IpAddr { + fn new_v4(a: u8, b: u8, c: u8, d: u8) -> IpAddr; + fn any_v4() -> IpAddr; + + fn new_v6(a: u16, b: u16, c: u16, d: u16, e: u16, f: u16, g: u16, h: u16) -> IpAddr; + fn any_v6() -> IpAddr; + + // The following functions proxy to the versioned IP address value + fn is_unspecified(&self) -> bool; + fn is_loopback(&self) -> bool; + fn is_global(&self) -> bool; + fn is_private(&self) -> bool; + fn is_multicast(&self) -> bool; +} +``` + +#### Ipv4Addr + +Represents a version 4 IP address. It has the following interface: + +```rust +impl Ipv4Addr { + fn new(a: u8, b: u8, c: u8, d: u8) -> Ipv4Addr; + fn any() -> Ipv4Addr; + fn octets(&self) -> [u8; 4]; + fn is_unspecified(&self) -> bool; + fn is_loopback(&self) -> bool; + fn is_private(&self) -> bool; + fn is_link_local(&self) -> bool; + fn is_global(&self) -> bool; + fn is_multicast(&self) -> bool; + fn to_ipv6_compatible(&self) -> Ipv6Addr; + fn to_ipv6_mapped(&self) -> Ipv6Addr; +} +``` + +#### Ipv6Addr + +Represents a version 6 IP address. 
It has the following interface: + +```rust +impl Ipv6Addr { + fn new(a: u16, b: u16, c: u16, d: u16, e: u16, f: u16, g: u16, h: u16) -> Ipv6Addr; + fn any() -> Ipv6Addr; + fn segments(&self) -> [u16; 8] + fn is_unspecified(&self) -> bool; + fn is_loopback(&self) -> bool; + fn is_global(&self) -> bool; + fn is_unique_local(&self) -> bool; + fn is_unicast_link_local(&self) -> bool; + fn is_unicast_site_local(&self) -> bool; + fn is_unicast_global(&self) -> bool; + fn multicast_scope(&self) -> Option; + fn is_multicast(&self) -> bool; + fn to_ipv4(&self) -> Option; +} +``` + #### TCP [TCP]: #tcp @@ -1380,9 +1465,9 @@ following interface: // TcpStream, which contains both a reader and a writer impl TcpStream { - fn connect(addr: &A) -> io::Result; - fn peer_addr(&self) -> io::Result; - fn socket_addr(&self) -> io::Result; + fn connect(addr: &A) -> io::Result; + fn peer_addr(&self) -> io::Result; + fn inet_addr(&self) -> io::Result; fn shutdown(&self, how: Shutdown) -> io::Result<()>; fn duplicate(&self) -> io::Result; } @@ -1420,10 +1505,10 @@ into the `TcpListener` structure. Specifically, this will be the resulting API: ```rust impl TcpListener { - fn bind(addr: &A) -> io::Result; - fn socket_addr(&self) -> io::Result; + fn bind(addr: &A) -> io::Result; + fn inet_addr(&self) -> io::Result; fn duplicate(&self) -> io::Result; - fn accept(&self) -> io::Result<(TcpStream, SocketAddr)>; + fn accept(&self) -> io::Result<(TcpStream, InetAddr)>; fn incoming(&self) -> Incoming; } @@ -1447,10 +1532,10 @@ Some major changes from today's API include: date with a more robust interface. * The `set_timeout` functionality has also been removed in favor of returning at a later date in a more robust fashion with `select`. -* The `accept` function no longer takes `&mut self` and returns `SocketAddr`. +* The `accept` function no longer takes `&mut self` and returns `InetAddr`. The change in mutability is done to express that multiple `accept` calls can happen concurrently. 
-* For convenience the iterator does not yield the `SocketAddr` from `accept`. +* For convenience the iterator does not yield the `InetAddr` from `accept`. The `TcpListener` type will also adhere to `Send` and `Sync`. @@ -1462,10 +1547,10 @@ infrastructure will: ```rust impl UdpSocket { - fn bind(addr: &A) -> io::Result; - fn recv_from(&self, buf: &mut [u8]) -> io::Result<(usize, SocketAddr)>; - fn send_to(&self, buf: &[u8], addr: &A) -> io::Result; - fn socket_addr(&self) -> io::Result; + fn bind(addr: &A) -> io::Result; + fn recv_from(&self, buf: &mut [u8]) -> io::Result<(usize, InetAddr)>; + fn send_to(&self, buf: &[u8], addr: &A) -> io::Result; + fn inet_addr(&self) -> io::Result; fn duplicate(&self) -> io::Result; } @@ -1514,7 +1599,7 @@ For the current `addrinfo` module: For the current `ip` module: -* The `ToSocketAddr` trait should become `ToSocketAddrs` +* The `ToInetAddr` trait should become `ToInetAddrs` * The default `to_socket_addr_all` method should be removed. The actual address structures could use some scrutiny, but any From 71d5471a37c2a7fe590e69308dbd482b48c2701d Mon Sep 17 00:00:00 2001 From: Chris Wong Date: Wed, 4 Mar 2015 20:35:51 +1300 Subject: [PATCH 0156/1195] Hyphens considered harmful --- text/0000-hyphens-considered-harmful.md | 138 ++++++++++++++++++++++++ 1 file changed, 138 insertions(+) create mode 100644 text/0000-hyphens-considered-harmful.md diff --git a/text/0000-hyphens-considered-harmful.md b/text/0000-hyphens-considered-harmful.md new file mode 100644 index 00000000000..a45fbdcecfc --- /dev/null +++ b/text/0000-hyphens-considered-harmful.md @@ -0,0 +1,138 @@ +- Feature Name: `hyphens_considered_harmful` +- Start Date: 2015-03-05 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Disallow hyphens in package and crate names. Propose a clear transition path for existing packages. + +# Motivation + +Currently, Cargo packages and Rust crates both allow hyphens in their names. 
This is not good, for two reasons: + +1. **Usability**: Since hyphens are not allowed in identifiers, anyone who uses such a crate must rename it on import: + + ```rust + extern crate "rustc-serialize" as rustc_serialize; + ``` + + This boilerplate confers no additional meaning, and is a common source of confusion for beginners. + +2. **Consistency**: Nowhere else do we allow hyphens in names, so having them in crates is inconsistent with the rest of the language. + +For these reasons, we should work to remove this feature before the beta. + +However, as of January 2015 there are 589 packages with hyphens on crates.io. It is unlikely that simply removing hyphens from the syntax will work, given all the code that depends on them. In particular, we need a plan that: + +* Is easy to implement and understand; + +* Accounts for the existing packages on crates.io; and + +* Gives as much time as possible for users to fix their code. + +# Detailed design + +1. On **crates.io**: + + + Reject all further uploads for hyphenated names. Packages with hyphenated *dependencies* will still be allowed though. + + + On the server, migrate all existing hyphenated packages to underscored names. Keep the old packages around for compatibility, but hide them from search. To keep things simple, only the `name` field will change; dependencies will stay as they are. + +2. In **Cargo**: + + + Continue allowing hyphens in package names, but treat them as having underscores internally. Warn the user when this happens. + + This applies to both the package itself and its dependencies. For example, imagine we have an `apple-fritter` package that depends on `rustc-serialize`. When Cargo builds this package, it will instead fetch `rustc_serialize` and build `apple_fritter`. + +3. In **rustc**: + + + As with Cargo, continue allowing hyphens in `extern crate`, but rewrite them to underscores in the parser. Warn the user when this happens. 
+ + + Do *not* allow hyphens in other contexts, such as the `#[crate_name]` attribute or `--crate-name` and `--extern` options. + + > Rationale: These options are usually provided by external tools, which would break in strange ways if rustc chooses a different name. + +4. Announce the change on the users forum and /r/rust. Tell users to update to the latest Cargo and rustc, and to begin transitioning their packages to the new system. Party. + +5. Some time between the beta and 1.0 release, remove support for hyphens from Cargo and rustc. + +## C dependency (`*-sys`) packages + +[RFC 403] introduced a `*-sys` convention for wrappers around C libraries. Under this proposal, we will use `*_sys` instead. + +[RFC 403]: https://github.com/rust-lang/rfcs/blob/master/text/0403-cargo-build-command.md + +# Drawbacks + +## Code churn + +While most code should not break from these changes, there will be much churn as maintainers fix their packages. However, the work should not amount to more than a simple find/replace. Also, because old packages are migrated automatically, maintainers can delay fixing their code until they need to publish a new version. + +## Loss of hyphens + +There are two advantages to keeping hyphens around: + +* Aesthetics: Hyphens do look nicer than underscores. + +* Namespacing: Hyphens are often used for pseudo-namespaces. For example in Python, the Django web framework has a wealth of addon packages, all prefixed with `django-`. + +The author believes the disadvantages of hyphens outweigh these benefits. + +# Alternatives + +## Do nothing + +As with any proposal, we can choose to do nothing. But given the reasons outlined above, the author believes it is important that we address the problem before the beta release. + +## Disallow hyphens in crates, but allow them in packages + +What we often call "crate name" is actually two separate concepts: the *package name* as seen by Cargo and crates.io, and the *crate name* used by rustc and `extern crate`. 
While the two names are usually equal, Cargo lets us set them separately. + +For example, if we have a package named `lily-valley`, we can rename the inner crate to `lily_valley` as follows: + +```toml +[package] +name = "lily-valley" # Package name +# ... + +[lib] +name = "lily_valley" # Crate name +``` + +This will let us import the crate as `extern crate lily_valley` while keeping the hyphenated name in Cargo. + +But while this solution solves the usability problem, it still leaves the package and crate names inconsistent. Given the few use cases for hyphens, it is unclear whether this solution is better than just disallowing them altogether. + +## Make `extern crate` match fuzzily + +Alternatively, we can have the compiler consider hyphens and underscores as equal while looking up a crate. In other words, the crate `flim-flam` would match both `extern crate flim_flam` and `extern crate "flim-flam" as flim_flam`. This will let us keep the hyphenated names, without having to rename them on import. + +The drawback to this solution is complexity. We will need to add this special case to the compiler, guard against conflicting packages on crates.io, and explain this behavior to newcomers. That's too much work to support a marginal use case. + +## Repurpose hyphens as namespace separators + +Alternatively, we can treat hyphens as path separators in Rust. + +For example, the crate `hoity-toity` could be imported as + +```rust +extern crate hoity::toity; +``` + +which is desugared to: + +```rust +mod hoity { + mod toity { + extern crate "hoity-toity" as krate; + pub use krate::*; + } +} +``` + +However, on prototyping this proposal, the author found it too complex and fraught with edge cases. Banning hyphens outright would be much easier to implement and understand. + +# Unresolved questions + +None so far. 
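
The Cargo step in the detailed design (keep accepting hyphenated names, treat them as underscored internally, and warn) could look roughly like the sketch below. It is an illustration only: `normalize_package_name` is a hypothetical helper, not Cargo's actual API, and the warning text is invented.

```rust
/// Hypothetical helper: maps a package name to its underscored form.
/// Returns the normalized name and, if a rewrite happened, a warning message.
fn normalize_package_name(name: &str) -> (String, Option<String>) {
    if name.contains('-') {
        let normalized = name.replace('-', "_");
        let warning = format!(
            "package name `{}` contains hyphens; treating it as `{}`",
            name, normalized
        );
        (normalized, Some(warning))
    } else {
        // Names without hyphens pass through unchanged and produce no warning.
        (name.to_string(), None)
    }
}
```

Under this sketch, `apple-fritter` would be built as `apple_fritter` and its dependency `rustc-serialize` fetched as `rustc_serialize`, matching the example in the design section.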
From f3be78b317cbd71223ec1a99527f41e096feadf5 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 4 Mar 2015 16:30:44 -0800 Subject: [PATCH 0157/1195] Clarify raw stdio will not get implemented just yet --- text/0517-io-os-reform.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/text/0517-io-os-reform.md b/text/0517-io-os-reform.md index 0f648c2f55c..c8e73839e68 100644 --- a/text/0517-io-os-reform.md +++ b/text/0517-io-os-reform.md @@ -1244,6 +1244,10 @@ standard primitives listed above: #### Raw stdio [Raw stdio]: #raw-stdio +> **Note**: This section is intended to be a sketch of possible raw stdio +> support, but it is not planned to implement or stabilize this +> implementation at this time. + The above standard input/output handles all involve some form of locking or buffering (or both). This cost is not always wanted, and hence raw variants will be provided. Due to platform differences across unix/windows, the following From 5b569fc7189bd128fb8e108cb0ce1aa9d4a511a6 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 4 Mar 2015 16:43:42 -0800 Subject: [PATCH 0158/1195] Remove `std::fmt::output` --- text/0517-io-os-reform.md | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/text/0517-io-os-reform.md b/text/0517-io-os-reform.md index c8e73839e68..0b80808821f 100644 --- a/text/0517-io-os-reform.md +++ b/text/0517-io-os-reform.md @@ -1318,13 +1318,11 @@ interface. [gh22607]: https://github.com/rust-lang/rust/issues/22607 -The `set_stdout` and `set_stderr` functions will be moved to a new -`std::fmt::output` module and renamed to `set_print` and `set_panic`, -respectively. These new names reflect what they actually do, removing a -longstanding confusion. The current `stdio::flush` function will also move to -this module and be renamed to `flush_print`. - -The entire `std::fmt::output` module will remain `#[unstable]` for now, however. 
+The `set_stdout` and `set_stderr` functions will be removed with no replacement +for now. It's unclear whether these functions should indeed control a thread +local handle instead of a global handle as well as whether they're justified in the +first place. It is a backwards-compatible extension to allow this sort of output +to be redirected and can be considered if the need arises. ### `std::env` [std::env]: #stdenv From 2baf8d335f6ce5f97c1b691d6373d438ec510464 Mon Sep 17 00:00:00 2001 From: Chris Wong Date: Thu, 5 Mar 2015 14:17:29 +1300 Subject: [PATCH 0159/1195] Amend #403: Change `*-sys` to `*_sys` --- text/0403-cargo-build-command.md | 34 ++++++++++++++++---------------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/text/0403-cargo-build-command.md b/text/0403-cargo-build-command.md index cb169b59346..9d00bd8ab02 100644 --- a/text/0403-cargo-build-command.md +++ b/text/0403-cargo-build-command.md @@ -9,11 +9,11 @@ around build commands to facilitate linking native code to Cargo packages. 1. Instead of having the `build` command be some form of script, it will be a Rust command instead -2. Establish a namespace of `foo-sys` packages which represent the native +2. Establish a namespace of `foo_sys` packages which represent the native library `foo`. These packages will have Cargo-based dependencies between - `*-sys` packages to express dependencies among C packages themselves. + `*_sys` packages to express dependencies among C packages themselves. 3. Establish a set of standard environment variables for build commands which - will instruct how `foo-sys` packages should be built in terms of dynamic or + will instruct how `foo_sys` packages should be built in terms of dynamic or static linkage, as well as providing the ability to override where a package comes from via environment variables.
@@ -101,7 +101,7 @@ Summary: * Add platform-specific dependencies to Cargo manifests * Allow pre-built libraries in the same manner as Cargo overrides * Use Rust for build scripts -* Develop a convention of `*-sys` packages +* Develop a convention of `*_sys` packages ## Modifications to `rustc` @@ -358,38 +358,38 @@ useful to interdependencies among native packages themselves. For example libssh2 depends on OpenSSL on linux, which means it needs to find the corresponding libraries and header files. The metadata keys serve as a vector through which this information can be transmitted. The maintainer of the -`openssl-sys` package (described below) would have a build script responsible +`openssl_sys` package (described below) would have a build script responsible for generating this sort of metadata so consumer packages can use it to build C libraries themselves. -## A set of `*-sys` packages +## A set of `*_sys` packages This section will discuss a *convention* by which Cargo packages providing native dependencies will be named, it is not proposed to have Cargo enforce this convention via any means. These conventions are proposed to address constraints 5 and 6 above. -Common C dependencies will be refactored into a package named `foo-sys` where -`foo` is the name of the C library that `foo-sys` will provide and link to. +Common C dependencies will be refactored into a package named `foo_sys` where +`foo` is the name of the C library that `foo_sys` will provide and link to. There are two key motivations behind this convention: -* Each `foo-sys` package will declare its own dependencies on other `foo-sys` +* Each `foo_sys` package will declare its own dependencies on other `foo_sys` based packages * Dependencies on native libraries expressed through Cargo will be subject to version management, version locking, and deduplication as usual. 
-Each `foo-sys` package is responsible for providing the following: +Each `foo_sys` package is responsible for providing the following: -* Declarations of all symbols in a library. Essentially each `foo-sys` library +* Declarations of all symbols in a library. Essentially each `foo_sys` library is *only* a header file in terms of Rust-related code. -* Ensuring that the native library `foo` is linked to the `foo-sys` crate. This +* Ensuring that the native library `foo` is linked to the `foo_sys` crate. This guarantees that all exposed symbols are indeed linked into the crate. -Dependencies making use of `*-sys` packages will not expose `extern` blocks -themselves, but rather use the symbols exposed in the `foo-sys` package -directly. Additionally, packages using `*-sys` packages should not declare a +Dependencies making use of `*_sys` packages will not expose `extern` blocks +themselves, but rather use the symbols exposed in the `foo_sys` package +directly. Additionally, packages using `*_sys` packages should not declare a `#[link]` directive to link to the native library as it's already linked to the -`*-sys` package. +`*_sys` package. ## Phasing strategy @@ -517,7 +517,7 @@ perform this configuration (be it environment or in files). * Features themselves will also likely need to be platform-specific, but this runs into a number of tricky situations and needs to be fleshed out. -[verbose]: https://github.com/alexcrichton/complicated-linkage-example/blob/master/curl-sys/Cargo.toml#L9-L17 +[verbose]: https://github.com/alexcrichton/complicated-linkage-example/blob/master/curl_sys/Cargo.toml#L9-L17 # Alternatives From 7f4a612a446777fe5a15b03b86572c17a79c6dac Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Thu, 5 Mar 2015 17:27:30 +1300 Subject: [PATCH 0160/1195] Reduce number of lints, add feature gate. 
--- text/0000-type-ascription.md | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/text/0000-type-ascription.md b/text/0000-type-ascription.md index a14e8b34733..19af72e2c8b 100644 --- a/text/0000-type-ascription.md +++ b/text/0000-type-ascription.md @@ -1,6 +1,7 @@ - Start Date: 2015-2-3 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) +- Feature: `ascription` # Summary @@ -113,6 +114,8 @@ expression are exactly those of the implicit coercion. @eddyb has implemented the expressions part of this RFC, [PR](https://github.com/rust-lang/rust/pull/21836). +This feature should land behind the `ascription` feature gate. + ### coercion and `as` vs `:` @@ -126,23 +129,19 @@ confusing for users, since there is no reason to use type ascription rather than as the default whether or not it is required, then it loses its function as a warning sign for programmers to beware of. -To address this I propose three lints which check for: trivial casts, coercible -casts, and trivial numeric casts. Other than these lints we stick with the -proposal from #401 that unnecessary casts will no longer be an error. - -A trivial cast is a cast `x as T` where `x` has type `U` and `U` is a subtype of -`T` (note that subtyping includes reflexivity). +To address this I propose two lints which check for: trivial casts and trivial +numeric casts. Other than these lints we stick with the proposal from #401 that +unnecessary casts will no longer be an error. -A coercible cast is a cast `x as T` where `x` has type `U` and `x` can be -implicitly coerced to `T`, but `U` is not a subtype of `T`. +A trivial cast is a cast `x as T` where `x` has type `U` and `x` can be +implicitly coerced to `T` or is already a subtype of `T`. A trivial numeric cast is a cast `x as T` where `x` has type `U` and `x` is implicitly coercible to `T` or `U` is a subtype of `T`, and both `U` and `T` are numeric types. 
-Like any lints, these can be customised per-crate by the programmer. The trivial -cast lint is 'deny' by default (i.e., causes an error); the coercible cast and -trivial numeric cast lints are 'warn' by default. +Like any lints, these can be customised per-crate by the programmer. Both lints +are 'warn' by default. Although this is a somewhat complex scheme, it allows code that works today to work with only minor adjustment, it allows for a backwards compatible path to From b94fbde46bc06679fd0c4a9f5f1500ab0e4bc6ab Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Wed, 4 Mar 2015 20:54:26 -0800 Subject: [PATCH 0161/1195] RFC 574 is Replace `Vec::drain` by a method that accepts a range parameter --- text/{0000-drain-range.md => 0574-drain-range.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-drain-range.md => 0574-drain-range.md} (95%) diff --git a/text/0000-drain-range.md b/text/0574-drain-range.md similarity index 95% rename from text/0000-drain-range.md rename to text/0574-drain-range.md index adbd22dcc15..b1982a48685 100644 --- a/text/0000-drain-range.md +++ b/text/0574-drain-range.md @@ -1,6 +1,6 @@ - Start Date: 2015-01-12 -- RFC PR #: (leave this empty) -- Rust Issue #: (leave this empty) +- RFC PR #: https://github.com/rust-lang/rfcs/pull/574 +- Rust Issue #: https://github.com/rust-lang/rust/issues/23055 # Summary From ca2e7f6fddb30cf33c95997639561f9f64b38bed Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 5 Mar 2015 11:59:05 -0800 Subject: [PATCH 0162/1195] Debug improvements is RFC 640 --- README.md | 1 + ...{0000-debug-improvements.md => 0640-debug-improvements.md} | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) rename text/{0000-debug-improvements.md => 0640-debug-improvements.md} (97%) diff --git a/README.md b/README.md index dad28d4c55d..b9affc1607f 100644 --- a/README.md +++ b/README.md @@ -45,6 +45,7 @@ the direction the language is evolving in. 
* [0560-integer-overflow.md](text/0560-integer-overflow.md) * [0563-remove-ndebug.md](text/0563-remove-ndebug.md) * [0572-rustc-attribute.md](text/0572-rustc-attribute.md) +* [0640-debug-improvements.md](0640-debug-improvements.md) * [0702-rangefull-expression.md](text/0702-rangefull-expression.md) * [0738-variance.md](text/0738-variance.md) * [0769-sound-generic-drop.md](text/0769-sound-generic-drop.md) diff --git a/text/0000-debug-improvements.md b/text/0640-debug-improvements.md similarity index 97% rename from text/0000-debug-improvements.md rename to text/0640-debug-improvements.md index 7c55ca5409b..f7cbfeb5cfa 100644 --- a/text/0000-debug-improvements.md +++ b/text/0640-debug-improvements.md @@ -1,6 +1,6 @@ - Start Date: 2015-01-20 -- RFC PR: -- Rust Issue: +- RFC PR: [rust-lang/rfcs#640](https://github.com/rust-lang/rfcs/pull/640) +- Rust Issue: [rust-lang/rust#23083](https://github.com/rust-lang/rust/issues/23083) # Summary From 8ce380a610988e46fd7d387ed5ef56960aaff6de Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 5 Mar 2015 11:59:35 -0800 Subject: [PATCH 0163/1195] Fix link for debug-improvements --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index b9affc1607f..3eb32dd799f 100644 --- a/README.md +++ b/README.md @@ -45,7 +45,7 @@ the direction the language is evolving in. 
* [0560-integer-overflow.md](text/0560-integer-overflow.md) * [0563-remove-ndebug.md](text/0563-remove-ndebug.md) * [0572-rustc-attribute.md](text/0572-rustc-attribute.md) -* [0640-debug-improvements.md](0640-debug-improvements.md) +* [0640-debug-improvements.md](text/0640-debug-improvements.md) * [0702-rangefull-expression.md](text/0702-rangefull-expression.md) * [0738-variance.md](text/0738-variance.md) * [0769-sound-generic-drop.md](text/0769-sound-generic-drop.md) From f2ab53441d970ab9c3a80bbd78354902ef37f2d7 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 5 Mar 2015 16:14:46 -0800 Subject: [PATCH 0164/1195] Tweaks here and there * Pare back the APIs * `inet_addr` methods to `local_addr` * `InetV{4,6}Addr` types * Remove `IpAddr` * Show impls of `ToInetAddrs` --- text/0517-io-os-reform.md | 97 +++++++++++++++++---------------------- 1 file changed, 43 insertions(+), 54 deletions(-) diff --git a/text/0517-io-os-reform.md b/text/0517-io-os-reform.md index db19e061d60..5560c1fbf5b 100644 --- a/text/0517-io-os-reform.md +++ b/text/0517-io-os-reform.md @@ -1372,23 +1372,27 @@ elsewhere. #### InetAddr -The composition of an `IpAddr` and a port. It has the following interface: +This structure will represent either a `sockaddr_in` or `sockaddr_in6` which is +commonly just a pairing of an `IpAddr` and a port. 
```rust impl InetAddr { - /// Returns a new InetAddr composed of an unspecified v4 IP and a 0 - /// port - fn any_v4() -> InetAddr; - - /// Returns a new InetAddr composed of an unspecified v6 IP and a 0 - /// port - fn any_v6() -> InetAddr; + fn as_v4(&self) -> Option<&InetV4Addr>; + fn as_v6(&self) -> Option<&InetV6Addr>; +} - fn ip(&self) -> IpAddr; - fn port(&self) -> u16; +impl InetV4Addr { + fn new(addr: Ipv4Addr, port: u16) -> InetV4Addr; + fn ip(&self) -> &Ipv4Addr; + fn port(&self) -> u16; +} - /// Returns true if the IpAddr is unspecified and port == 0 - fn is_unspecified(&self) -> bool; +impl InetV6Addr { + fn new(addr: Ipv6Addr, port: u16, flowinfo: u32, scope_id: u32) -> InetV6Addr; + fn ip(&self) -> &Ipv6Addr; + fn port(&self) -> u16; + fn flowinfo(&self) -> u32; + fn scope_id(&self) -> u32; } ``` @@ -1398,18 +1402,8 @@ Represents an IP address. It has the following interface: ```rust impl IpAddr { - fn new_v4(a: u8, b: u8, c: u8, d: u8) -> IpAddr; - fn any_v4() -> IpAddr; - - fn new_v6(a: u16, b: u16, c: u16, d: u16, e: u16, f: u16, g: u16, h: u16) -> IpAddr; - fn any_v6() -> IpAddr; - - // The following functions proxy to the versioned IP address value - fn is_unspecified(&self) -> bool; - fn is_loopback(&self) -> bool; - fn is_global(&self) -> bool; - fn is_private(&self) -> bool; - fn is_multicast(&self) -> bool; + fn as_v4(&self) -> Option<&Ipv4Addr>; + fn as_v6(&self) -> Option<&Ipv6Addr>; } ``` @@ -1419,17 +1413,11 @@ Represents a version 4 IP address. 
It has the following interface: ```rust impl Ipv4Addr { - fn new(a: u8, b: u8, c: u8, d: u8) -> Ipv4Addr; - fn any() -> Ipv4Addr; - fn octets(&self) -> [u8; 4]; - fn is_unspecified(&self) -> bool; - fn is_loopback(&self) -> bool; - fn is_private(&self) -> bool; - fn is_link_local(&self) -> bool; - fn is_global(&self) -> bool; - fn is_multicast(&self) -> bool; - fn to_ipv6_compatible(&self) -> Ipv6Addr; - fn to_ipv6_mapped(&self) -> Ipv6Addr; + fn new(a: u8, b: u8, c: u8, d: u8) -> Ipv4Addr; + fn any() -> Ipv4Addr; + fn octets(&self) -> [u8; 4]; + fn to_ipv6_compatible(&self) -> Ipv6Addr; + fn to_ipv6_mapped(&self) -> Ipv6Addr; } ``` @@ -1439,19 +1427,10 @@ Represents a version 6 IP address. It has the following interface: ```rust impl Ipv6Addr { - fn new(a: u16, b: u16, c: u16, d: u16, e: u16, f: u16, g: u16, h: u16) -> Ipv6Addr; - fn any() -> Ipv6Addr; - fn segments(&self) -> [u16; 8] - fn is_unspecified(&self) -> bool; - fn is_loopback(&self) -> bool; - fn is_global(&self) -> bool; - fn is_unique_local(&self) -> bool; - fn is_unicast_link_local(&self) -> bool; - fn is_unicast_site_local(&self) -> bool; - fn is_unicast_global(&self) -> bool; - fn multicast_scope(&self) -> Option; - fn is_multicast(&self) -> bool; - fn to_ipv4(&self) -> Option; + fn new(a: u16, b: u16, c: u16, d: u16, e: u16, f: u16, g: u16, h: u16) -> Ipv6Addr; + fn any() -> Ipv6Addr; + fn segments(&self) -> [u16; 8] + fn to_ipv4(&self) -> Option; } ``` @@ -1467,7 +1446,7 @@ following interface: impl TcpStream { fn connect(addr: &A) -> io::Result; fn peer_addr(&self) -> io::Result; - fn inet_addr(&self) -> io::Result; + fn local_addr(&self) -> io::Result; fn shutdown(&self, how: Shutdown) -> io::Result<()>; fn duplicate(&self) -> io::Result; } @@ -1506,7 +1485,7 @@ into the `TcpListener` structure. 
Specifically, this will be the resulting API: ```rust impl TcpListener { fn bind(addr: &A) -> io::Result; - fn inet_addr(&self) -> io::Result; + fn local_addr(&self) -> io::Result; fn duplicate(&self) -> io::Result; fn accept(&self) -> io::Result<(TcpStream, InetAddr)>; fn incoming(&self) -> Incoming; @@ -1550,7 +1529,7 @@ impl UdpSocket { fn bind(addr: &A) -> io::Result; fn recv_from(&self, buf: &mut [u8]) -> io::Result<(usize, InetAddr)>; fn send_to(&self, buf: &[u8], addr: &A) -> io::Result; - fn inet_addr(&self) -> io::Result; + fn local_addr(&self) -> io::Result; fn duplicate(&self) -> io::Result; } @@ -1599,11 +1578,21 @@ For the current `addrinfo` module: For the current `ip` module: -* The `ToInetAddr` trait should become `ToInetAddrs` +* The `ToSocketAddr` trait should become `ToInetAddrs` * The default `to_socket_addr_all` method should be removed. -The actual address structures could use some scrutiny, but any -revisions there are left as an unresolved question. +The following implementations of `ToInetAddrs` will be available: + +```rust +impl ToInetAddrs for InetAddr { ... } +impl ToInetAddrs for InetV4Addr { ... } +impl ToInetAddrs for InetV6Addr { ... } +impl ToInetAddrs for (Ipv4Addr, u16) { ... } +impl ToInetAddrs for (Ipv6Addr, u16) { ... } +impl ToInetAddrs for (&str, u16) { ... } +impl ToInetAddrs for str { ... } +impl ToInetAddrs for &T { ... } +``` ### `std::process` [std::process]: #stdprocess From 3803d1070141d110e0858727759234f2757601a5 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 5 Mar 2015 16:18:57 -0800 Subject: [PATCH 0165/1195] Remove `IpAddr` --- text/0517-io-os-reform.md | 13 +------------ 1 file changed, 1 insertion(+), 12 deletions(-) diff --git a/text/0517-io-os-reform.md b/text/0517-io-os-reform.md index 5560c1fbf5b..ec467788d57 100644 --- a/text/0517-io-os-reform.md +++ b/text/0517-io-os-reform.md @@ -1373,7 +1373,7 @@ elsewhere. 
#### InetAddr This structure will represent either a `sockaddr_in` or `sockaddr_in6` which is -commonly just a pairing of an `IpAddr` and a port. +commonly just a pairing of an IP address and a port. ```rust impl InetAddr { @@ -1396,17 +1396,6 @@ impl InetV6Addr { } ``` -#### IpAddr - -Represents an IP address. It has the following interface: - -```rust -impl IpAddr { - fn as_v4(&self) -> Option<&Ipv4Addr>; - fn as_v6(&self) -> Option<&Ipv6Addr>; -} -``` - #### Ipv4Addr Represents a version 4 IP address. It has the following interface: From dab3e4b3f087e7bb49c3314be65adee20c094889 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 6 Mar 2015 11:19:30 -0500 Subject: [PATCH 0166/1195] Accept RFC #495 --- README.md | 1 + ...array-pattern-changes.md => 0495-array-pattern-changes.md} | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) rename text/{0000-array-pattern-changes.md => 0495-array-pattern-changes.md} (96%) diff --git a/README.md b/README.md index 3eb32dd799f..20d84b6f4cf 100644 --- a/README.md +++ b/README.md @@ -37,6 +37,7 @@ the direction the language is evolving in. 
* [0401-coercions.md](text/0401-coercions.md) * [0447-no-unused-impl-parameters.md](text/0447-no-unused-impl-parameters.md) * [0458-send-improvements.md](text/0458-send-improvements.md) +* [0495-array-pattern-changes.md](text/0495-array-pattern-changes.md) * [0501-consistent_no_prelude_attributes.md](text/0501-consistent_no_prelude_attributes.md) * [0505-api-comment-conventions.md](text/0505-api-comment-conventions.md) * [0509-collections-reform-part-2.md](text/0509-collections-reform-part-2.md) diff --git a/text/0000-array-pattern-changes.md b/text/0495-array-pattern-changes.md similarity index 96% rename from text/0000-array-pattern-changes.md rename to text/0495-array-pattern-changes.md index c568e7ddb05..eb7d8c83d1b 100644 --- a/text/0000-array-pattern-changes.md +++ b/text/0495-array-pattern-changes.md @@ -1,6 +1,6 @@ - Start Date: 2014-12-03 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#495](https://github.com/rust-lang/rfcs/pull/495) +- Rust Issue: [rust-lang/rust#23121](https://github.com/rust-lang/rust/issues/23121) Summary ======= From 23689916783da973dcbd1f115f3f2045ecf27a79 Mon Sep 17 00:00:00 2001 From: Brian Anderson Date: Fri, 6 Mar 2015 11:13:12 -0800 Subject: [PATCH 0167/1195] Retire RFC 8 without implementing it. This RFC changed the design of intrinsics with the goal of eliminating the need for creating inlined wrappers around them. In the meantime, the critical intrinsics for which this have been a problem have all had their wrappers removed (afaik without solving the problem this RFC set out to solve). In the meantime, I decided I do not like the design here that unifies intrinsics and lang items - they are sufficiently different to warrant different names. The ideas here have bitrotted and when we need to solve this for real we should start over. 
--- README.md | 1 - text/0008-new-intrinsics.md | 2 ++ 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index dad28d4c55d..65c52f6adac 100644 --- a/README.md +++ b/README.md @@ -18,7 +18,6 @@ the direction the language is evolving in. ## Active RFC List [Active RFC List]: #active-rfc-list -* [0008-new-intrinsics.md](text/0008-new-intrinsics.md) * [0016-more-attributes.md](text/0016-more-attributes.md) * [0019-opt-in-builtin-traits.md](text/0019-opt-in-builtin-traits.md) * [0066-better-temporary-lifetimes.md](text/0066-better-temporary-lifetimes.md) diff --git a/text/0008-new-intrinsics.md b/text/0008-new-intrinsics.md index d6a2468c1fc..230b02964c5 100644 --- a/text/0008-new-intrinsics.md +++ b/text/0008-new-intrinsics.md @@ -2,6 +2,8 @@ - RFC PR: [rust-lang/rfcs#8](https://github.com/rust-lang/rfcs/pull/8) - Rust Issue: +** Note: this RFC was never implemented. ** + # Summary The way our intrinsics work forces them to be wrapped in order to From 8d5b133ab133cf4b9eef19f15efe77d9ecd4ce17 Mon Sep 17 00:00:00 2001 From: Jorge Aparicio Date: Sun, 8 Mar 2015 02:58:14 -0500 Subject: [PATCH 0168/1195] overloaded assignment operations `a += b` --- text/0000-op-assign.md | 74 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 74 insertions(+) create mode 100644 text/0000-op-assign.md diff --git a/text/0000-op-assign.md b/text/0000-op-assign.md new file mode 100644 index 00000000000..3ba083267f5 --- /dev/null +++ b/text/0000-op-assign.md @@ -0,0 +1,74 @@ +- Feature Name: op_assign +- Start Date: 2015-03-08 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Add the family of `[Op]Assign` traits to allow overloading assignment +operations like `a += b`. + +# Motivation + +We already let users overload the binary operations, letting them overload the +assignment version is the next logical step. Plus, this sugar is important to +make mathematical libraries more palatable. 
+ +# Detailed design + +Add the following **unstable** traits to libcore and reexport them in stdlib: + +``` +// `+=` +#[lang = "add_assign"] +trait AddAssign { + fn add_assign(&mut self, &Rhs); +} + +// the remaining traits have the same signature +// (lang items have been omitted for brevity) +trait BitAndAssign { .. } // `&=` +trait BitOrAssign { .. } // `|=` +trait BitXorAssign { .. } // `^=` +trait DivAssign { .. } // `/=` +trait MulAssign { .. } // `*=` +trait RemAssign { .. } // `%=` +trait ShlAssign { .. } // `<<=` +trait ShrAssign { .. } // `>>=` +trait SubAssign { .. } // `-=` +``` + +Implement these traits for the primitive numeric types *without* overloading, +i.e. only `impl AddAssign for i32 { .. }`. + +Add an `op_assign` feature gate. When the feature gate is enabled, the compiler +will consider these traits when typechecking `a += b`. Without the feature gate +the compiler will enforce that `a` and `b` must be primitives of the same +type/category as it does today. + +Once we feel comfortable with the implementation we'll remove the feature gate +and mark the traits as stable. This can be done after 1.0 as this change is +backwards compatible. + +# Drawbacks + +None that I can think of. + +# Alternatives + +Alternatively, we could change the traits to take the RHS by value. This makes +them more "flexible" as the user can pick by value or by reference, but makes +the use slightly unergonomic in the by ref case as the borrow must be explicit +e.g. `x += &big_float;` vs `x += big_float;`. + +# Unresolved questions + +Are there any use cases of assignment operations where the RHS has to be taken +by value for the operation to be performant (e.g. to avoid internal cloning)? + +Should we overload `ShlAssign` and `ShrAssign`, e.g. +`impl ShlAssign for i32`, since we have already overloaded the `Shl` and +`Shr` traits? + +Should we overload all the traits for references, e.g. +`impl<'a> AddAssign<&'a i32> for i32` to allow `x += &0;`? 
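As a rough illustration of the by-reference signature proposed in the op-assign RFC text above, the sketch below declares a local stand-in trait and implements it for a small type. The names `AddAssignByRef`, `add_assign_by_ref`, `Vec2`, and `accumulate` are hypothetical, and the `Rhs = Self` default on the type parameter is an assumption. This is not the compiler's lang-item machinery, and the `+=` sugar itself is not exercised.

```rust
// Hypothetical stand-in for the trait shape proposed in the RFC draft.
// Declared locally so it does not clash with any standard library trait.
// The `Rhs = Self` default is an assumption made for this sketch.
trait AddAssignByRef<Rhs = Self> {
    // The right-hand side is taken by shared reference, as the draft proposes.
    fn add_assign_by_ref(&mut self, rhs: &Rhs);
}

// Illustrative type; not part of the RFC.
#[derive(Debug, Clone, PartialEq)]
struct Vec2 {
    x: i32,
    y: i32,
}

impl AddAssignByRef for Vec2 {
    fn add_assign_by_ref(&mut self, rhs: &Vec2) {
        // Mutate the left-hand side in place; `rhs` is only read.
        self.x += rhs.x;
        self.y += rhs.y;
    }
}

// Sums a slice of vectors. Each call stands in for what `acc += step`
// might desugar to under the by-reference design.
fn accumulate(steps: &[Vec2]) -> Vec2 {
    let mut acc = Vec2 { x: 0, y: 0 };
    for step in steps {
        acc.add_assign_by_ref(step);
    }
    acc
}
```

Because the right-hand side is only borrowed, the slice passed to `accumulate` remains usable afterwards, which is the ergonomic argument the draft makes for taking the RHS by reference rather than by value.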
From b78d276d65944ccb44c0a65e7e674af694430923 Mon Sep 17 00:00:00 2001 From: Jorge Aparicio Date: Sun, 8 Mar 2015 12:20:17 -0500 Subject: [PATCH 0169/1195] add section on taking the RHS by ref vs by value --- text/0000-op-assign.md | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/text/0000-op-assign.md b/text/0000-op-assign.md index 3ba083267f5..8fb962ceaf5 100644 --- a/text/0000-op-assign.md +++ b/text/0000-op-assign.md @@ -50,6 +50,20 @@ Once we feel comfortable with the implementation we'll remove the feature gate and mark the traits as stable. This can be done after 1.0 as this change is backwards compatible. +## RHS: By ref vs by value + +This RFC proposes that the assignment operations take the RHS always by ref; +instead of by value like the "normal" binary operations (e.g. `Add`) do. The +rationale is that, as far as the author has seen in practice [1], one never +wants to mutate the RHS or consume it, or in other words an immutable view into +the RHS is enough to perform the operation. Therefore, this RFC follows in the +footsteps of the `Index` traits, where the same situation arises with the +indexing value, and by ref was chosen over by value. + +[1] It could be possible that the author is not aware of use cases where taking +RHS by value is necessary. Feedback on this matter would be appreciated. (See +the first unresolved question) + # Drawbacks None that I can think of. From bf3201d0453aaecba16e2faeb8f3f75dfc184a05 Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Mon, 9 Mar 2015 21:07:39 -0600 Subject: [PATCH 0170/1195] Remove HList/tuple conversion example --- text/0000-type-macros.md | 150 --------------------------------------- 1 file changed, 150 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 3b588cb174e..c16751ddd35 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -421,156 +421,6 @@ specifically. 
There is [another RFC here](https://github.com/rust-lang/rfcs/pull/884) which proposes extending the type system to address those issue. -#### Conversion from HList to Tuple - -With type macros, it is possible to define conversions back and forth -between tuples and HLists. This is very powerful because it lets us -reuse at the level of tuples all of the recursive operations we can -define for HLists (appending, taking length, adding/removing items, -computing permutations, etc.). - -Conversions can be defined using macros/plugins and function -traits. Type macros are useful in this example for the associated type -`Output` and method return type in the traits. - -```rust -// type-level macro for HLists -macro_rules! HList { - {} => { Nil }; - { $head:ty } => { Cons<$head, Nil> }; - { $head:ty, $($tail:ty),* } => { Cons<$head, HList!($($tail),*)> }; -} - -// term-level macro for HLists -macro_rules! hlist { - {} => { Nil }; - {=> $($elem:tt),+ } => { hlist_pat!($($elem),+) }; - { $head:expr, $($tail:expr),* } => { Cons($head, hlist!($($tail),*)) }; - { $head:expr } => { Cons($head, Nil) }; -} - -// term-level HLists in patterns -macro_rules! hlist_pat { - {} => { Nil }; - { $head:pat, $($tail:tt),* } => { Cons($head, hlist_pat!($($tail),*)) }; - { $head:pat } => { Cons($head, Nil) }; -} - -// `invoke_for_seq_upto` is a `higher-order` macro that takes the name -// of another macro and a number and iteratively invokes the named -// macro with sequences of identifiers, e.g., -// -// invoke_for_seq_upto{ my_mac, 5 } -// ==> my_mac!{ A0, A1, A2, A3, A4 }; -// my_mac!{ A0, A1, A2, A3 }; -// my_mac!{ A0, A1, A2 }; -// ... 
-fn invoke_for_seq_upto_expand<'cx>( - ecx: &'cx mut base::ExtCtxt, - span: codemap::Span, - args: &[ast::TokenTree], -) -> Box { - let mut parser = ecx.new_parser_from_tts(args); - - // parse the macro name - let mac = parser.parse_ident(); - - // parse a comma - parser.expect(&token::Token::Comma); - - // parse the number of iterations - if let ast::Lit_::LitInt(lit, _) = parser.parse_lit().node { - Some(lit) - } else { - None - }.and_then(|iterations| { - - // generate a token tree: A0, …, An - let mut ctx = range(0, iterations * 2 - 1).flat_map(|k| { - if k % 2 == 0 { - token::str_to_ident(format!("A{}", (k / 2)).as_slice()) - .to_tokens(ecx) - .into_iter() - } else { - let span = codemap::DUMMY_SP; - let token = parse::token::Token::Comma; - vec![ast::TokenTree::TtToken(span, token)] - .into_iter() - } - }).collect::>(); - - // iterate over the ctx and generate impl syntax fragments - let mut items = vec![]; - let mut i = ctx.len(); - for _ in range(0, iterations) { - items.push(quote_item!(ecx, $mac!{ $ctx };).unwrap()); - i -= 2; - ctx.truncate(i); - } - - // splice the impl fragments into the ast - Some(base::MacEager::items(SmallVector::many(items))) - - }).unwrap_or_else(|| { - ecx.span_err(span, "invoke_for_seq_upto!: expected an integer literal argument"); - base::DummyResult::any(span) - }) -} - -pub struct ToHList; -pub struct ToTuple; - -// macro to implement conversion from hlist to tuple, -// e.g., ToTuple(hlist![…]) ==> (…,) -macro_rules! impl_to_tuple { - ($($seq:ident),*) => { - #[allow(non_snake_case)] - impl<$($seq,)*> Fn<(HList![$($seq),*],)> for ToTuple { - type Output = ($($seq,)*); - extern "rust-call" fn call(&self, (this,): (HList![$($seq),*],)) -> ($($seq,)*) { - match this { - hlist![=> $($seq),*] => ($($seq,)*) - } - } - } - } -} - -// macro to implement conversion from tuple to hlist, -// e.g., ToHList((…,)) ==> hlist![…] -macro_rules! 
impl_to_hlist { - ($($seq:ident),*) => { - #[allow(non_snake_case)] - impl<$($seq,)*> Fn<(($($seq,)*),)> for ToHList { - type Output = HList![$($seq),*]; - extern "rust-call" fn call(&self, (this,): (($($seq,)*),)) -> HList![$($seq),*] { - match this { - ($($seq,)*) => hlist![$($seq),*] - } - } - } - } -} - -// generate implementations up to length 32 -invoke_for_seq_upto!{ impl_to_tuple, 32 } -invoke_for_seq_upto!{ impl_to_hlist, 32 } - -// test converting an hlist to tuple -#[test] -fn test_to_tuple() { - assert_eq(ToTuple(hlist!["foo", true, (), vec![42u64]]), - ("foo", true, (), vec![42u64])) -} - -// test converting a tuple to hlist -#[test] -fn test_to_hlist() { - assert_eq(ToHList(("foo", true, (), vec![42u64])), - hlist!["foo", true, (), vec![42u64]]) -} -``` - # Drawbacks There seem to be few drawbacks to implementing this feature as an From 48fbc1456f0e3fae29a10c217559908819e44a2a Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 10 Mar 2015 10:30:47 -0700 Subject: [PATCH 0171/1195] Clarify wording around locking --- text/0517-io-os-reform.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/text/0517-io-os-reform.md b/text/0517-io-os-reform.md index 0b80808821f..6babd82b337 100644 --- a/text/0517-io-os-reform.md +++ b/text/0517-io-os-reform.md @@ -1165,14 +1165,16 @@ pub fn stderr() -> Stderr; ``` * `stdin` - returns a handle to a **globally shared** standard input of - the process which is buffered as well. All operations on this handle will - first require acquiring a lock to ensure access to the shared buffer is - synchronized. The handle can be explicitly locked for a critical section so - relocking is not necessary. + the process which is buffered as well. Due to the globally shared nature of + this handle, all operations on `Stdin` directly will acquire a lock internally + to ensure access to the shared buffer is synchronized. 
This implementation + detail is also exposed through a `lock` method where the handle can be + explicitly locked for a period of time so relocking is not necessary. The `Read` trait will be implemented directly on the returned `Stdin` handle but the `BufRead` trait will not be (due to synchronization concerns). The - locked version of `Stdin` will provide an implementation of `BufRead`. + locked version of `Stdin` (`StdinLock`) will provide an implementation of + `BufRead`. The design will largely be the same as is today with the `old_io` module. From 82c85e426b3125a5458b0a712979daa2ae111913 Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Tue, 10 Mar 2015 19:28:48 -0600 Subject: [PATCH 0172/1195] Remove additional examples --- text/0000-type-macros.md | 236 +-------------------------------------- 1 file changed, 2 insertions(+), 234 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index c16751ddd35..6f906903695 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -43,9 +43,7 @@ case for the `Ty_` enum so that the parser can indicate a macro invocation in a type position. In other words, `TyMac` is added to the ast and handled analogously to `ExprMac`, `ItemMac`, and `PatMac`. -## Examples - -### Heterogeneous Lists +## Example: Heterogeneous Lists Heterogeneous lists are one example where the ability to express recursion via type macros is very useful. They can be used as an @@ -136,7 +134,7 @@ Operations on HLists can be defined by recursion, using traits with associated type outputs at the type-level and implementation methods at the term-level. -The HList append operation is provided as an example. type macros are +The HList append operation is provided as an example. 
Type macros are used to make writing append at the type level (see `Expr!`) more convenient than specifying the associated type projection manually: @@ -191,236 +189,6 @@ fn test_append() { } ``` -### Additional Examples ### - -#### Type-level numerics - -Type-level numerics are another area where type macros can be -useful. The more common unary encodings (Peano numerals) are not -efficient enough to use in practice so we present an example -demonstrating binary natural numbers instead: - -```rust -struct _0; // 0 bit -struct _1; // 1 bit - -// classify valid bits -trait Bit: MarkerTrait {} -impl Bit for _0 {} -impl Bit for _1 {} - -// classify positive binary naturals -trait Pos: MarkerTrait {} -impl Pos for _1 {} -impl Pos for (P, B) {} - -// classify binary naturals with 0 -trait Nat: MarkerTrait {} -impl Nat for _0 {} -impl Nat for _1 {} -impl Nat for (P, B) {} -``` - -These can be used to index into tuples or HLists generically, either -by specifying the path explicitly (e.g., `(a, b, c).at::<(_1, _0)>() -==> c`) or by providing a singleton term with the appropriate type -`(a, b, c).at((_1, _0)) ==> c`. Indexing is linear time in the general -case due to recursion, but can be made constant time for a fixed -number of specialized implementations. - -Type-level numbers can also be used to define "sized" or "bounded" -data, such as a vector indexed by its length: - -```rust -struct LengthVec(Vec); -``` - -Similar to the indexing example, the parameter `N` can either serve as -phantom data, or such a struct could also include a term-level -representation of N as another field. - -In either case, a length-safe API could be defined for container types -like `Vec`. "Unsafe" indexing (without bounds checking) into the -underlying container would be safe in general because the length of -the container would be known statically and reflected in the type of -the length-indexed wrapper. 
- -We could imagine an idealized API in the following fashion: - -```rust -// push, adding one to the length -fn push(xs: LengthVec, x: A) -> LengthVec; - -// pop, subtracting one from the length -fn pop(xs: LengthVec, store: &mut A) -> LengthVec; - -// look up an element at an index -fn at(xs: LengthVec, index: M) -> A; - -// append, adding the individual lengths -fn append(xs: LengthVec, ys: LengthVec) -> LengthVec; - -// produce a length respecting iterator from an indexed vector -fn iter(xs: LengthVec) -> LengthIterator; -``` - -We can't write code like the above directly in Rust but we could -approximate it through type-level macros: - -```rust -// Expr! would expand + to Add::Output and integer constants to Nat!; see -// the HList append earlier in the RFC for a concrete example -Expr!(N + M) - ==> >::Output - -// Nat! would expand integer literals to type-level binary naturals -// and be implemented as a plugin for efficiency; see the following -// section for a concrete example -Nat!(4) - ==> ((_1, _0), _0) - -// `Expr!` and `Nat!` used for the LengthVec type: -LengthVec - ==> LengthVec>::Output> - ==> LengthVec>::Output> -``` - -##### Implementation of `Nat!` as a plugin - -The following code demonstrates concretely how `Nat!` can be -implemented as a plugin. As with the `HList!` example, this code (with -some additions) compiles and is usable with the type macros prototype -in the branch referenced earlier. - -For efficiency, the binary representation is first constructed as a -string via iteration rather than recursively using `quote` macros. The -string is then parsed as a type, returning an ast fragment. 
- -```rust -// Convert a u64 to a string representation of a type-level binary natural, e.g., -// ast_as_str(1024) -// ==> "(((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0)" -fn ast_as_str<'cx>( - ecx: &'cx base::ExtCtxt, - mut num: u64, - mode: Mode, -) -> String { - let path = "_"; - let mut res: String; - if num < 2 { - res = String::from_str(path); - res.push_str(num.to_string().as_slice()); - } else { - let mut bin = vec![]; - while num > 0 { - bin.push(num % 2); - num >>= 1; - } - res = ::std::iter::repeat('(').take(bin.len() - 1).collect(); - res.push_str(path); - res.push_str(bin.pop().unwrap().to_string().as_slice()); - for b in bin.iter().rev() { - res.push_str(", "); - res.push_str(path); - res.push_str(b.to_string().as_slice()); - res.push_str(")"); - } - } - res -} - -// Generate a parser which uses the nat's ast-as-string as its input -fn ast_parser<'cx>( - ecx: &'cx base::ExtCtxt, - num: u64, - mode: Mode, -) -> parse::parser::Parser<'cx> { - let filemap = ecx - .codemap() - .new_filemap(String::from_str(""), ast_as_str(ecx, num, mode)); - let reader = lexer::StringReader::new( - &ecx.parse_sess().span_diagnostic, - filemap); - parser::Parser::new( - ecx.parse_sess(), - ecx.cfg(), - Box::new(reader)) -} - -// Try to parse an integer literal and return a new parser which uses -// the nat's ast-as-string as its input -pub fn lit_parser<'cx>( - ecx: &'cx base::ExtCtxt, - args: &[ast::TokenTree], - mode: Mode, -) -> Option> { - let mut lit_parser = ecx.new_parser_from_tts(args); - if let ast::Lit_::LitInt(lit, _) = lit_parser.parse_lit().node { - Some(ast_parser(ecx, lit, mode)) - } else { - None - } -} - -// Expand Nat!(n) to a type-level binary nat where n is an int literal, e.g., -// Nat!(1024) -// ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) -pub fn expand_ty<'cx>( - ecx: &'cx mut base::ExtCtxt, - span: codemap::Span, - args: &[ast::TokenTree], -) -> Box { - { - lit_parser(ecx, args, Mode::Ty) - }.and_then(|mut ast_parser| { - 
Some(base::MacEager::ty(ast_parser.parse_ty())) - }).unwrap_or_else(|| { - ecx.span_err(span, "Nat!: expected an integer literal argument"); - base::DummyResult::any(span) - }) -} - -// Expand nat!(n) to a term-level binary nat where n is an int literal, e.g., -// nat!(1024) -// ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) -pub fn expand_tm<'cx>( - ecx: &'cx mut base::ExtCtxt, - span: codemap::Span, - args: &[ast::TokenTree], -) -> Box { - { - lit_parser(ecx, args, Mode::Tm) - }.and_then(|mut ast_parser| { - Some(base::MacEager::expr(ast_parser.parse_expr())) - }).unwrap_or_else(|| { - ecx.span_err(span, "nat!: expected an integer literal argument"); - base::DummyResult::any(span) - }) -} - -#[test] -fn nats() { - let _: Nat!(42) = nat!(42); -} -``` - -##### Optimization of `Expr`! - -Defining `Expr!` as a plugin would provide an opportunity to perform -various optimizations of more complex type-level expressions during -expansion. Partial evaluation would be one way to achieve -this. Furthermore, expansion-time optimizations wouldn't be limited to -arithmetic expressions but could be used for other data like HLists. - -##### Builtin alternatives: types parameterized by constant values - -The example with type-level naturals serves to illustrate some of the -patterns type macros enable. This RFC is not intended to address the -lack of constant value type parameterization and type-level numerics -specifically. There is -[another RFC here](https://github.com/rust-lang/rfcs/pull/884) which -proposes extending the type system to address those issue. - # Drawbacks There seem to be few drawbacks to implementing this feature as an From 2698af84631b515a92b9f044b877305ba225d7cc Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Wed, 11 Mar 2015 17:00:23 -0400 Subject: [PATCH 0173/1195] Initial draft. 
--- text/0000-closure-return-type-syntax.md | 52 +++++++++++++++++++++++++ 1 file changed, 52 insertions(+) create mode 100644 text/0000-closure-return-type-syntax.md diff --git a/text/0000-closure-return-type-syntax.md b/text/0000-closure-return-type-syntax.md new file mode 100644 index 00000000000..142d094afdc --- /dev/null +++ b/text/0000-closure-return-type-syntax.md @@ -0,0 +1,52 @@ +- Feature Name: N/A +- Start Date: (fill me in with today's date, YYYY-MM-DD) +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Restrict closure return type syntax for future compatibility. + +# Motivation + +Today's closure return type syntax juxtaposes a type and an +expression. This is dangerous: if we choose to extend the type grammar +to be more permissive, we can easily break existing code. + +# Detailed design + +The current closure syntax for annotating the return type is `|Args| +-> Type Expr`, where `Type` is the return type and `Expr` is the body +of the closure. This syntax is future-hostile and relies on being able +to determine the end point of a type. If we extend the syntax for +types, we could cause parse errors in existing code. + +An example from history is that we extended the type grammar to +include things like `Fn(..)`. This would have caused the following, +previously legal, closure not to parse: `|| -> Foo (Foo)`. As a +simple fix, this RFC proposes that if a return type annotation is +supplied, the body must be enclosed in braces: `|| -> Foo { (Foo) }`. +Types are already juxtaposed with open braces in `fn` items, so this +should not be an additional danger for future evolution. + +# Drawbacks + +This design is minimally invasive but perhaps unfortunate in that it's +not obvious that braces would be required. But then, return type +annotations are very rarely used. + +# Alternatives + +I am not aware of any alternate designs. 
One possibility would be to +remove return type annotations altogether, perhaps relying on type +ascription or other annotations to force the inferencer to figure +things out, but they are useful in rare scenarios. In particular, type +ascription would not be able to handle a higher-ranked signature like +`for<'a> &'a X -> &'a Y` without improving the type checker +implementation in other ways (in particular, we don't infer +generalization over lifetimes at present, unless we can figure it out +from the expected type or explicit annotations). + +# Unresolved questions + +None. From e215e59498c87bef1a1096343c3d708750c65c49 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Thu, 12 Mar 2015 18:53:06 +1300 Subject: [PATCH 0174/1195] Added section on rvalues/temporaries --- text/0000-type-ascription.md | 41 ++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+) diff --git a/text/0000-type-ascription.md b/text/0000-type-ascription.md index 19af72e2c8b..156e351b854 100644 --- a/text/0000-type-ascription.md +++ b/text/0000-type-ascription.md @@ -149,6 +149,47 @@ work with only minor adjustment, it allows for a backwards compatible path to allows customisation of a contentious kind of error (especially so in the context of cross-platform programming). + +### Type ascription and temporaries + +There is an implementation choice between treating `x: T` as an lvalue or +rvalue. Note that when an rvalue is used in lvalue context (e.g., the subject of +a reference operation), then the compiler introduces a temporary variable. +Neither option is satisfactory: if we treat an ascription expression as an +lvalue (i.e., no new temporary), then there is potential for unsoundness: + +``` +let mut foo: S = ...; +{ + let bar = &mut (foo: T); // S <: T, no coercion required + *bar = ... : T; +} +// Whoops, foo has type T, but the compiler thinks it has type S, where potentially T ` is a type ascription +expression): + +``` +&[mut] +let ref [mut] x = +match { .. ref [mut] x .. 
=> { .. } .. } +.foo() // due to autoref +``` + +Like other rvalues, type ascription would not be allowed as the lhs of assignment. + +Note that, if type asciption is required in such a context, an lvalue can be +forced by using `{}`, e.g., write `&mut { foo: T }`, rather than `&mut (foo: T)`. + + # Drawbacks More syntax, another feature in the language. From d6c40b461a63d03233ed121495e4e4f263065ae0 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 12 Mar 2015 16:29:05 -0700 Subject: [PATCH 0175/1195] Move back to SocketAddr --- text/0517-io-os-reform.md | 60 +++++++++++++++++++-------------------- 1 file changed, 30 insertions(+), 30 deletions(-) diff --git a/text/0517-io-os-reform.md b/text/0517-io-os-reform.md index ec467788d57..93e9ca217b9 100644 --- a/text/0517-io-os-reform.md +++ b/text/0517-io-os-reform.md @@ -1370,25 +1370,25 @@ The contents of `std::io::net` submodules `tcp`, `udp`, `ip` and the other modules are being moved or removed and are described elsewhere. -#### InetAddr +#### SocketAddr This structure will represent either a `sockaddr_in` or `sockaddr_in6` which is commonly just a pairing of an IP address and a port. 
```rust -impl InetAddr { - fn as_v4(&self) -> Option<&InetV4Addr>; - fn as_v6(&self) -> Option<&InetV6Addr>; +enum SocketAddr { + V4(SocketAddrV4), + V6(SocketAddrV6), } -impl InetV4Addr { - fn new(addr: Ipv4Addr, port: u16) -> InetV4Addr; +impl SocketAddrV4 { + fn new(addr: Ipv4Addr, port: u16) -> SocketAddrV4; fn ip(&self) -> &Ipv4Addr; fn port(&self) -> u16; } -impl InetV6Addr { - fn new(addr: Ipv6Addr, port: u16, flowinfo: u32, scope_id: u32) -> InetV6Addr; +impl SocketAddrV6 { + fn new(addr: Ipv6Addr, port: u16, flowinfo: u32, scope_id: u32) -> SocketAddrV6; fn ip(&self) -> &Ipv6Addr; fn port(&self) -> u16; fn flowinfo(&self) -> u32; @@ -1433,9 +1433,9 @@ following interface: // TcpStream, which contains both a reader and a writer impl TcpStream { - fn connect(addr: &A) -> io::Result; - fn peer_addr(&self) -> io::Result; - fn local_addr(&self) -> io::Result; + fn connect(addr: &A) -> io::Result; + fn peer_addr(&self) -> io::Result; + fn local_addr(&self) -> io::Result; fn shutdown(&self, how: Shutdown) -> io::Result<()>; fn duplicate(&self) -> io::Result; } @@ -1473,10 +1473,10 @@ into the `TcpListener` structure. Specifically, this will be the resulting API: ```rust impl TcpListener { - fn bind(addr: &A) -> io::Result; - fn local_addr(&self) -> io::Result; + fn bind(addr: &A) -> io::Result; + fn local_addr(&self) -> io::Result; fn duplicate(&self) -> io::Result; - fn accept(&self) -> io::Result<(TcpStream, InetAddr)>; + fn accept(&self) -> io::Result<(TcpStream, SocketAddr)>; fn incoming(&self) -> Incoming; } @@ -1500,10 +1500,10 @@ Some major changes from today's API include: date with a more robust interface. * The `set_timeout` functionality has also been removed in favor of returning at a later date in a more robust fashion with `select`. -* The `accept` function no longer takes `&mut self` and returns `InetAddr`. +* The `accept` function no longer takes `&mut self` and returns `SocketAddr`. 
The change in mutability is done to express that multiple `accept` calls can happen concurrently. -* For convenience the iterator does not yield the `InetAddr` from `accept`. +* For convenience the iterator does not yield the `SocketAddr` from `accept`. The `TcpListener` type will also adhere to `Send` and `Sync`. @@ -1515,10 +1515,10 @@ infrastructure will: ```rust impl UdpSocket { - fn bind(addr: &A) -> io::Result; - fn recv_from(&self, buf: &mut [u8]) -> io::Result<(usize, InetAddr)>; - fn send_to(&self, buf: &[u8], addr: &A) -> io::Result; - fn local_addr(&self) -> io::Result; + fn bind(addr: &A) -> io::Result; + fn recv_from(&self, buf: &mut [u8]) -> io::Result<(usize, SocketAddr)>; + fn send_to(&self, buf: &[u8], addr: &A) -> io::Result; + fn local_addr(&self) -> io::Result; fn duplicate(&self) -> io::Result; } @@ -1567,20 +1567,20 @@ For the current `addrinfo` module: For the current `ip` module: -* The `ToSocketAddr` trait should become `ToInetAddrs` +* The `ToSocketAddr` trait should become `ToSocketAddrs` * The default `to_socket_addr_all` method should be removed. -The following implementations of `ToInetAddrs` will be available: +The following implementations of `ToSocketAddrs` will be available: ```rust -impl ToInetAddrs for InetAddr { ... } -impl ToInetAddrs for InetV4Addr { ... } -impl ToInetAddrs for InetV6Addr { ... } -impl ToInetAddrs for (Ipv4Addr, u16) { ... } -impl ToInetAddrs for (Ipv6Addr, u16) { ... } -impl ToInetAddrs for (&str, u16) { ... } -impl ToInetAddrs for str { ... } -impl ToInetAddrs for &T { ... } +impl ToSocketAddrs for SocketAddr { ... } +impl ToSocketAddrs for SocketAddrV4 { ... } +impl ToSocketAddrs for SocketAddrV6 { ... } +impl ToSocketAddrs for (Ipv4Addr, u16) { ... } +impl ToSocketAddrs for (Ipv6Addr, u16) { ... } +impl ToSocketAddrs for (&str, u16) { ... } +impl ToSocketAddrs for str { ... } +impl ToSocketAddrs for &T { ... 
} ``` ### `std::process` From 3b194da403b15ea942250d247dcaa99e2b352568 Mon Sep 17 00:00:00 2001 From: Chris Wong Date: Thu, 12 Mar 2015 21:29:10 -0400 Subject: [PATCH 0176/1195] Rewrite RFC * Allow hyphens in the package name (but not the crate) * De-quotify crate renaming syntax --- text/0000-hyphens-considered-harmful.md | 95 ++++++++----------------- 1 file changed, 31 insertions(+), 64 deletions(-) diff --git a/text/0000-hyphens-considered-harmful.md b/text/0000-hyphens-considered-harmful.md index a45fbdcecfc..9383d2545cd 100644 --- a/text/0000-hyphens-considered-harmful.md +++ b/text/0000-hyphens-considered-harmful.md @@ -5,79 +5,61 @@ # Summary -Disallow hyphens in package and crate names. Propose a clear transition path for existing packages. +Disallow hyphens in Rust crate names, but continue allowing them in Cargo packages. # Motivation -Currently, Cargo packages and Rust crates both allow hyphens in their names. This is not good, for two reasons: +This RFC aims to reconcile two conflicting points of view. -1. **Usability**: Since hyphens are not allowed in identifiers, anyone who uses such a crate must rename it on import: +First: hyphens in crate names are awkward to use, and inconsistent with the rest of the language. Anyone who uses such a crate must rename it on import: - ```rust - extern crate "rustc-serialize" as rustc_serialize; - ``` - - This boilerplate confers no additional meaning, and is a common source of confusion for beginners. - -2. **Consistency**: Nowhere else do we allow hyphens in names, so having them in crates is inconsistent with the rest of the language. - -For these reasons, we should work to remove this feature before the beta. - -However, as of January 2015 there are 589 packages with hyphens on crates.io. It is unlikely that simply removing hyphens from the syntax will work, given all the code that depends on them. 
In particular, we need a plan that: +```rust +extern crate "rustc-serialize" as rustc_serialize; +``` -* Is easy to implement and understand; +An earlier version of this RFC aimed to solve this issue by removing hyphens entirely. -* Accounts for the existing packages on crates.io; and +However, there is a large amount of precedent for keeping `-` in package names. Systems as varied as GitHub, npm, RubyGems and Debian all have an established convention of using hyphens. Disallowing them would go against this precedent, causing friction with the wider community. -* Gives as much time as possible for users to fix their code. +Fortunately, Cargo presents us with a solution. It already separates the concepts of *package name* (used by Cargo and crates.io) and *crate name* (used by rustc and `extern crate`). We can disallow hyphens in the crate name only, while still accepting them in the outer package. This solves the usability problem, while keeping with the broader convention. # Detailed design -1. On **crates.io**: - - + Reject all further uploads for hyphenated names. Packages with hyphenated *dependencies* will still be allowed though. - - + On the server, migrate all existing hyphenated packages to underscored names. Keep the old packages around for compatibility, but hide them from search. To keep things simple, only the `name` field will change; dependencies will stay as they are. - -2. In **Cargo**: +## Disallow hyphens in crates (only) - + Continue allowing hyphens in package names, but treat them as having underscores internally. Warn the user when this happens. +In **Cargo**, continue allowing hyphens in package names. But unless the `Cargo.toml` says otherwise, the inner crate name will have all hyphens replaced with underscores. - This applies to both the package itself and its dependencies. For example, imagine we have an `apple-fritter` package that depends on `rustc-serialize`. 
When Cargo builds this package, it will instead fetch `rustc_serialize` and build `apple_fritter`. +For example, if I had a package named `apple-fritter`, its crate will be named `apple_fritter` instead. -3. In **rustc**: +In **rustc**, enforce that all crate names are valid identifiers. With the changes in Cargo, existing hyphenated packages should continue to build unchanged. - + As with Cargo, continue allowing hyphens in `extern crate`, but rewrite them to underscores in the parser. Warn the user when this happens. +## Identify `-` and `_` on crates.io - + Do *not* allow hyphens in other contexts, such as the `#[crate_name]` attribute or `--crate-name` and `--extern` options. +Right now, crates.io compares package names case-insensitively. This means, for example, you cannot upload a new package named `RUSTC-SERIALIZE` because `rustc-serialize` already exists. - > Rationale: These options are usually provided by external tools, which would break in strange ways if rustc chooses a different name. +Under this proposal, we will extend this logic to identify `-` and `_` as well. -4. Announce the change on the users forum and /r/rust. Tell users to update to the latest Cargo and rustc, and to begin transitioning their packages to the new system. Party. +## Remove the quotes from `extern crate` -5. Some time between the beta and 1.0 release, remove support for hyphens from Cargo and rustc. +Change the syntax of `extern crate` so that the crate name is no longer in quotes (e.g. `extern crate photo_finish as photo;`). This is viable now that all crate names are valid identifiers. -## C dependency (`*-sys`) packages - -[RFC 403] introduced a `*-sys` convention for wrappers around C libraries. Under this proposal, we will use `*_sys` instead. - -[RFC 403]: https://github.com/rust-lang/rfcs/blob/master/text/0403-cargo-build-command.md +To ease the transition, keep the old `extern crate` syntax around, transparently mapping any hyphens to underscores. 
For example, `extern crate "silver-spoon" as spoon;` will be desugared to `extern crate silver_spoon as spoon;`. This syntax will be deprecated, and removed before 1.0. # Drawbacks -## Code churn +## Inconsistency between packages and crates -While most code should not break from these changes, there will be much churn as maintainers fix their packages. However, the work should not amount to more than a simple find/replace. Also, because old packages are migrated automatically, maintainers can delay fixing their code until they need to publish a new version. +This proposal makes package and crate names inconsistent: the former will accept hyphens while the latter will not. -## Loss of hyphens +However, this drawback may not be an issue in practice. As hinted in the motivation, most other platforms have different syntaxes for packages and crates/modules anyway. Since the package system is orthogonal to the language itself, there is no need for consistency between the two. -There are two advantages to keeping hyphens around: +## Inconsistency between `-` and `_` -* Aesthetics: Hyphens do look nicer than underscores. +Quoth @P1start: -* Namespacing: Hyphens are often used for pseudo-namespaces. For example in Python, the Django web framework has a wealth of addon packages, all prefixed with `django-`. +> ... it's also annoying to have to choose between `-` and `_` when choosing a crate name, and to remember which of `-` and `_` a particular crate uses. -The author believes the disadvantages of hyphens outweigh these benefits. +I believe, like other naming issues, this problem can be addressed by conventions. # Alternatives @@ -85,30 +67,15 @@ The author believes the disadvantages of hyphens outweigh these benefits. As with any proposal, we can choose to do nothing. But given the reasons outlined above, the author believes it is important that we address the problem before the beta release. 
-## Disallow hyphens in crates, but allow them in packages - -What we often call "crate name" is actually two separate concepts: the *package name* as seen by Cargo and crates.io, and the *crate name* used by rustc and `extern crate`. While the two names are usually equal, Cargo lets us set them separately. - -For example, if we have a package named `lily-valley`, we can rename the inner crate to `lily_valley` as follows: - -```toml -[package] -name = "lily-valley" # Package name -# ... - -[lib] -name = "lily_valley" # Crate name -``` - -This will let us import the crate as `extern crate lily_valley` while keeping the hyphenated name in Cargo. +## Disallow hyphens in package names as well -But while this solution solves the usability problem, it still leaves the package and crate names inconsistent. Given the few use cases for hyphens, it is unclear whether this solution is better than just disallowing them altogether. +An earlier version of this RFC proposed to disallow hyphens in packages as well. The drawbacks of this idea are covered in the motivation. ## Make `extern crate` match fuzzily -Alternatively, we can have the compiler consider hyphens and underscores as equal while looking up a crate. In other words, the crate `flim-flam` would match both `extern crate flim_flam` and `extern crate "flim-flam" as flim_flam`. This will let us keep the hyphenated names, without having to rename them on import. +Alternatively, we can have the compiler consider hyphens and underscores as equal while looking up a crate. In other words, the crate `flim-flam` would match both `extern crate flim_flam` and `extern crate "flim-flam" as flim_flam`. -The drawback to this solution is complexity. We will need to add this special case to the compiler, guard against conflicting packages on crates.io, and explain this behavior to newcomers. That's too much work to support a marginal use case. 
+This involves much more magic than the original proposal, and it is not clear what advantages it has over it. ## Repurpose hyphens as namespace separators @@ -131,7 +98,7 @@ mod hoity { } ``` -However, on prototyping this proposal, the author found it too complex and fraught with edge cases. Banning hyphens outright would be much easier to implement and understand. +However, on prototyping this proposal, the author found it too complex and fraught with edge cases. For these reasons the author chose not to push this solution. # Unresolved questions From cb89f1efb890f307444d5e9081661b7365584ad6 Mon Sep 17 00:00:00 2001 From: Chris Wong Date: Fri, 13 Mar 2015 03:00:33 -0400 Subject: [PATCH 0177/1195] Revert "Amend #403: Change `*-sys` to `*_sys`" This reverts commit 2baf8d335f6ce5f97c1b691d6373d438ec510464. --- text/0403-cargo-build-command.md | 34 ++++++++++++++++---------------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/text/0403-cargo-build-command.md b/text/0403-cargo-build-command.md index 9d00bd8ab02..cb169b59346 100644 --- a/text/0403-cargo-build-command.md +++ b/text/0403-cargo-build-command.md @@ -9,11 +9,11 @@ around build commands to facilitate linking native code to Cargo packages. 1. Instead of having the `build` command be some form of script, it will be a Rust command instead -2. Establish a namespace of `foo_sys` packages which represent the native +2. Establish a namespace of `foo-sys` packages which represent the native library `foo`. These packages will have Cargo-based dependencies between - `*_sys` packages to express dependencies among C packages themselves. + `*-sys` packages to express dependencies among C packages themselves. 3. 
Establish a set of standard environment variables for build commands which - will instruct how `foo_sys` packages should be built in terms of dynamic or + will instruct how `foo-sys` packages should be built in terms of dynamic or static linkage, as well as providing the ability to override where a package comes from via environment variables. @@ -101,7 +101,7 @@ Summary: * Add platform-specific dependencies to Cargo manifests * Allow pre-built libraries in the same manner as Cargo overrides * Use Rust for build scripts -* Develop a convention of `*_sys` packages +* Develop a convention of `*-sys` packages ## Modifications to `rustc` @@ -358,38 +358,38 @@ useful to interdependencies among native packages themselves. For example libssh2 depends on OpenSSL on linux, which means it needs to find the corresponding libraries and header files. The metadata keys serve as a vector through which this information can be transmitted. The maintainer of the -`openssl_sys` package (described below) would have a build script responsible +`openssl-sys` package (described below) would have a build script responsible for generating this sort of metadata so consumer packages can use it to build C libraries themselves. -## A set of `*_sys` packages +## A set of `*-sys` packages This section will discuss a *convention* by which Cargo packages providing native dependencies will be named, it is not proposed to have Cargo enforce this convention via any means. These conventions are proposed to address constraints 5 and 6 above. -Common C dependencies will be refactored into a package named `foo_sys` where -`foo` is the name of the C library that `foo_sys` will provide and link to. +Common C dependencies will be refactored into a package named `foo-sys` where +`foo` is the name of the C library that `foo-sys` will provide and link to. 
There are two key motivations behind this convention: -* Each `foo_sys` package will declare its own dependencies on other `foo_sys` +* Each `foo-sys` package will declare its own dependencies on other `foo-sys` based packages * Dependencies on native libraries expressed through Cargo will be subject to version management, version locking, and deduplication as usual. -Each `foo_sys` package is responsible for providing the following: +Each `foo-sys` package is responsible for providing the following: -* Declarations of all symbols in a library. Essentially each `foo_sys` library +* Declarations of all symbols in a library. Essentially each `foo-sys` library is *only* a header file in terms of Rust-related code. -* Ensuring that the native library `foo` is linked to the `foo_sys` crate. This +* Ensuring that the native library `foo` is linked to the `foo-sys` crate. This guarantees that all exposed symbols are indeed linked into the crate. -Dependencies making use of `*_sys` packages will not expose `extern` blocks -themselves, but rather use the symbols exposed in the `foo_sys` package -directly. Additionally, packages using `*_sys` packages should not declare a +Dependencies making use of `*-sys` packages will not expose `extern` blocks +themselves, but rather use the symbols exposed in the `foo-sys` package +directly. Additionally, packages using `*-sys` packages should not declare a `#[link]` directive to link to the native library as it's already linked to the -`*_sys` package. +`*-sys` package. ## Phasing strategy @@ -517,7 +517,7 @@ perform this configuration (be it environment or in files). * Features themselves will also likely need to be platform-specific, but this runs into a number of tricky situations and needs to be fleshed out. 
-[verbose]: https://github.com/alexcrichton/complicated-linkage-example/blob/master/curl_sys/Cargo.toml#L9-L17 +[verbose]: https://github.com/alexcrichton/complicated-linkage-example/blob/master/curl-sys/Cargo.toml#L9-L17 # Alternatives From b39532514de079a0527a1747da7dfe2b70f0edee Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 13 Mar 2015 14:47:10 -0700 Subject: [PATCH 0178/1195] Update with improved method names, From trait --- text/0000-conversion-traits.md | 103 +++++++++++++++++++++++---------- 1 file changed, 72 insertions(+), 31 deletions(-) diff --git a/text/0000-conversion-traits.md b/text/0000-conversion-traits.md index 98e3a8e8608..69061a66cee 100644 --- a/text/0000-conversion-traits.md +++ b/text/0000-conversion-traits.md @@ -110,27 +110,36 @@ might expect: we introduce a total of *four* traits: ```rust trait As for Sized? { - fn cvt_as(&self) -> &T; + fn convert_as(&self) -> &T; } trait AsMut for Sized? { - fn cvt_as_mut(&mut self) -> &mut T; + fn convert_as_mut(&mut self) -> &mut T; } trait To for Sized? { - fn cvt_to(&self) -> T; + fn convert_to(&self) -> T; } trait Into { - fn cvt_into(self) -> T; + fn convert_into(self) -> T; +} + +trait From { + fn from(T) -> Self; } ``` -These traits mirror our `as`/`to`/`into` conventions, but add a bit -more structure to them: `as`-style conversions are from references to -references, `to`-style conversions are from references to arbitrary -types, and `into`-style conversions are between arbitrary types -(consuming their argument). +The first three traits mirror our `as`/`to`/`into` conventions, but +add a bit more structure to them: `as`-style conversions are from +references to references, `to`-style conversions are from references +to arbitrary types, and `into`-style conversions are between arbitrary +types (consuming their argument). + +The final trait, `From`, mimics the `from` constructors. Unlike the +other traits, its method is not prefixed with `convert`. 
This is +because, again unlike the other traits, this trait is expected to +outright replace most custom `from` constructors. See below. **Why the reference restrictions?** @@ -140,7 +149,7 @@ would have to use generalized where clauses and explicit lifetimes even for simp ```rust // Possible alternative: trait As { - fn cvt_as(self) -> T; + fn convert_as(self) -> T; } // But then you get this: @@ -150,8 +159,8 @@ fn take_as<'a, T>(t: &'a T) where &'a T: As<&'a MyType>; fn take_as(t: &T) where T: As; ``` -What's worse, if you need a conversion that works over any lifetime, -*there's no way to specify it*: you can't write something like +If you need a conversion that works over any lifetime, you need to use +higher-ranked trait bounds: ```rust ... where for<'a> &'a T: As<&'a MyType> @@ -159,8 +168,10 @@ What's worse, if you need a conversion that works over any lifetime, This case is particularly important when you cannot name a lifetime in advance, because it will be created on the stack within the -function. While such a `where` clause can likely be added in the -future, it's a bit of a gamble to pin conversion traits on it today. +function. It might be possible to add sugar so that `where &T: +As<&MyType>` expands to the above automatically, but such an elision +might have other problems, and in any case it would preclude writing +direct bounds like `fn foo`. The proposed trait definition essentially *bakes in* the needed lifetime connection, capturing the most common mode of use for @@ -176,6 +187,14 @@ cost and consumption, and having multiple traits makes it possible to (by convention) restrict attention to e.g. "free" `as`-style conversions by bounding only by `As`. +Why have both `Into` and `From`? There are a few reasons: + +* Coherence issues: the order of the types is significant, so `From` + allows extensibility in some cases that `Into` does not. 
+ +* To match with existing conventions around conversions and + constructors (in particular, replacing many `from` constructors). + ## Blanket `impl`s Given the above trait design, there are a few straightforward blanket @@ -184,53 +203,60 @@ Given the above trait design, there are a few straightforward blanket ```rust // As implies To impl<'a, Sized? T, Sized? U> To<&'a U> for &'a T where T: As { - fn cvt_to(&self) -> &'a U { - self.cvt_as() + fn convert_to(&self) -> &'a U { + self.convert_as() } } // To implies Into impl<'a, T, U> Into for &'a T where T: To { - fn cvt_into(self) -> U { - self.cvt_to() + fn convert_into(self) -> U { + self.convert_to() } } // AsMut implies Into impl<'a, T, U> Into<&'a mut U> for &'a mut T where T: AsMut { - fn cvt_into(self) -> &'a mut U { - self.cvt_as_mut() + fn convert_into(self) -> &'a mut U { + self.convert_as_mut() } } + +// Into implies From +impl From for U where T: Into { + fn from(t: T) -> U { t.cvt_into() } +} ``` +The interaction between + ## An example Using all of the above, here are some example `impl`s and their use: ```rust impl As for String { - fn cvt_as(&self) -> &str { + fn convert_as(&self) -> &str { self.as_slice() } } impl As<[u8]> for String { - fn cvt_as(&self) -> &[u8] { + fn convert_as(&self) -> &[u8] { self.as_bytes() } } impl Into> for String { - fn cvt_into(self) -> Vec { + fn convert_into(self) -> Vec { self.into_bytes() } } fn main() { let a = format!("hello"); - let b: &[u8] = a.cvt_as(); - let c: &str = a.cvt_as(); - let d: Vec = a.cvt_into(); + let b: &[u8] = a.convert_as(); + let c: &str = a.convert_as(); + let d: Vec = a.convert_into(); } ``` @@ -242,7 +268,7 @@ impl Path { fn join_path_inner(&self, p: &Path) -> PathBuf { ... } pub fn join_path>(&self, p: &P) -> PathBuf { - self.join_path_inner(p.cvt_as()) + self.join_path_inner(p.convert_as()) } } ``` @@ -295,16 +321,19 @@ So a rough, preliminary convention would be the following: in this RFC. 
An *ad hoc conversion trait* is a trait providing an ad hoc conversion method. -* Use ad hoc conversion methods for "natural" conversions that should - have easy names and good discoverability. A conversion is "natural" - if you'd call it directly on the type in normal code; "unnatural" - conversions usually come from generic programming. +* Use ad hoc conversion methods for "natural", *outgoing* conversions + that should have easy method names and good discoverability. A + conversion is "natural" if you'd call it directly on the type in + normal code; "unnatural" conversions usually come from generic + programming. For example, `to_string` is a natural conversion for `str`, while `into_string` is not; but the latter is sometimes useful in a generic context -- and that's what the generic conversion traits can help with. +* On the other hand, favor `From` for all conversion constructors. + * Introduce ad hoc conversion *traits* if you need to provide a blanket `impl` of an ad hoc conversion method, or need special functionality. For example, `to_string` needs a trait so that every @@ -323,6 +352,18 @@ So a rough, preliminary convention would be the following: * Use the "inner function" pattern mentioned above to avoid code bloat. +## Prelude changes + +*All* of the conversion traits are added to the prelude. There are two + reasons for doing so: + +* For `As`/`To`/`Into`, the reasoning is similar to the inclusion of + `PartialEq` and friends: they are expected to appear ubiquitously as + bounds. + +* For `From`, bounds are somewhat less common but the use of the + `from` constructor is expected to be rather widespread. 
+ # Drawbacks There are a few drawbacks to the design as proposed: From 19bf9bb31a0f91f4a789347ae78302d1cd7eaec7 Mon Sep 17 00:00:00 2001 From: Chris Wong Date: Sat, 14 Mar 2015 11:51:18 +1300 Subject: [PATCH 0179/1195] Clarify interaction between Cargo and rustc --- text/0000-hyphens-considered-harmful.md | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/text/0000-hyphens-considered-harmful.md b/text/0000-hyphens-considered-harmful.md index 9383d2545cd..f81d987414f 100644 --- a/text/0000-hyphens-considered-harmful.md +++ b/text/0000-hyphens-considered-harmful.md @@ -27,11 +27,15 @@ Fortunately, Cargo presents us with a solution. It already separates the concept ## Disallow hyphens in crates (only) -In **Cargo**, continue allowing hyphens in package names. But unless the `Cargo.toml` says otherwise, the inner crate name will have all hyphens replaced with underscores. +In **rustc**, enforce that all crate names are valid identifiers. -For example, if I had a package named `apple-fritter`, its crate will be named `apple_fritter` instead. +In **Cargo**, continue allowing hyphens in package names. -In **rustc**, enforce that all crate names are valid identifiers. With the changes in Cargo, existing hyphenated packages should continue to build unchanged. +The difference will be in the crate name Cargo passes to the compiler. If the `Cargo.toml` does *not* specify an explicit crate name, then Cargo will use the package name but with all `-` replaced by `_`. + +For example, if I have a package named `apple-fritter`, Cargo will pass `--crate-name apple_fritter` to the compiler instead. + +Since most packages do not set their own crate names, this mapping will ensure that the majority of hyphenated packages continue to build unchanged. 
## Identify `-` and `_` on crates.io From 6749ea2c1c88c8867fab4f680be0c6c083f2f538 Mon Sep 17 00:00:00 2001 From: Mikhail Zabaluev Date: Fri, 6 Mar 2015 23:48:03 +0200 Subject: [PATCH 0180/1195] Clarify the behavior of into_inner The distinction between "shallow" and "deep" flush needs to be made clear in the specification. --- text/0517-io-os-reform.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/text/0517-io-os-reform.md b/text/0517-io-os-reform.md index 6babd82b337..01674f75533 100644 --- a/text/0517-io-os-reform.md +++ b/text/0517-io-os-reform.md @@ -1016,10 +1016,11 @@ strings) and is usually what you want when working with iterators. The `BufReader`, `BufWriter` and `BufStream` types stay essentially as they are today, except that for streams and writers the -`into_inner` method yields the structure back in the case of a flush error: - +`into_inner` method yields the structure back in the case of a write error, +and its behavior is clarified to writing out the buffered data without +flushing the underlying reader: ```rust -// If flushing fails, you get the unflushed data back +// If writing fails, you get the unwritten data back fn into_inner(self) -> Result>; pub struct IntoInnerError(W, Error); From 67978cd173c5504fa0b41e0244cc8701fa7d19ef Mon Sep 17 00:00:00 2001 From: Jake Goulding Date: Sun, 15 Mar 2015 14:04:05 -0400 Subject: [PATCH 0181/1195] Align the count parameter of splitn with other languages --- .../0000-align-splitn-with-other-languages.md | 153 ++++++++++++++++++ 1 file changed, 153 insertions(+) create mode 100644 text/0000-align-splitn-with-other-languages.md diff --git a/text/0000-align-splitn-with-other-languages.md b/text/0000-align-splitn-with-other-languages.md new file mode 100644 index 00000000000..8c19956f61b --- /dev/null +++ b/text/0000-align-splitn-with-other-languages.md @@ -0,0 +1,153 @@ +- Feature Name: n/a +- Start Date: 2015-03-15 +- RFC PR: +- Rust Issue: + +# Summary + +Make the `count` parameter 
of `SliceExt::splitn`, `StrExt::splitn` and
+corresponding reverse variants mean the *maximum number of items
+returned*, instead of the *maximum number of times to match the
+separator*.
+
+# Motivation
+
+The majority of other languages (see examples below) treat the `count`
+parameter as the maximum number of items to return. Rust already has
+many things newcomers need to learn; making other things similar can
+help adoption.
+
+# Detailed design
+
+Currently `splitn` uses the `count` parameter to decide how many times
+the separator should be matched:
+
+```rust
+let v: Vec<_> = "a,b,c".splitn(2, ',').collect();
+assert_eq!(v, ["a", "b", "c"]);
+```
+
+The simplest change we can make is to decrement the count in the
+constructor functions. If the count becomes zero, we mark the returned
+iterator as `finished`. See **Unresolved questions** for nicer
+transition paths.
+
+## Example usage
+
+### Strings
+
+```rust
+let input = "a,b,c";
+let v: Vec<_> = input.splitn(2, ',').collect();
+assert_eq!(v, ["a", "b,c"]);
+
+let v: Vec<_> = input.splitn(1, ',').collect();
+assert_eq!(v, ["a,b,c"]);
+
+let v: Vec<_> = input.splitn(0, ',').collect();
+assert_eq!(v, []);
+```
+
+### Slices
+
+```rust
+let input = [1, 0, 2, 0, 3];
+let v: Vec<_> = input.splitn(2, |&x| x == 0).collect();
+assert_eq!(v, [[1], [2, 0, 3]]);
+
+let v: Vec<_> = input.splitn(1, |&x| x == 0).collect();
+assert_eq!(v, [[1, 0, 2, 0, 3]]);
+
+let v: Vec<_> = input.splitn(0, |&x| x == 0).collect();
+assert_eq!(v, []);
+```
+
+## Languages where `count` is the maximum number of items returned
+
+### C# ###
+
+```c#
+"a,b,c".Split(new char[] {','}, 2)
+// ["a", "b,c"]
+```
+
+### Clojure
+
+```clojure
+(clojure.string/split "a,b,c" #"," 2)
+;; ["a" "b,c"]
+```
+
+### Go
+
+```go
+strings.SplitN("a,b,c", ",", 2)
+// [a b,c]
+```
+
+### Java
+
+```java
+"a,b,c".split(",", 2);
+// ["a", "b,c"]
+```
+
+### Ruby
+
+```ruby
+"a,b,c".split(',', 2)
+# ["a", "b,c"]
+```
+
+### Perl
+
+```perl
+split(",", "a,b,c",
2)
+# ['a', 'b,c']
+```
+
+## Languages where `count` is the maximum number of times the separator will be matched
+
+### Python
+
+```python
+"a,b,c".split(',', 2)
+# ['a', 'b', 'c']
+```
+
+### Swift
+
+```swift
+split("a,b,c", { $0 == "," }, maxSplit: 2)
+// ["a", "b", "c"]
+```
+
+# Drawbacks
+
+Changing the *meaning* of the `count` parameter without changing the
+*type* is sure to cause subtle issues. See **Unresolved questions**.
+
+The iterator can only return 2^64 values; previously we could return
+2^64 + 1. This could also be considered an upside, as we can now
+return an empty iterator.
+
+# Alternatives
+
+1. Keep the status quo. People migrating from many other languages
+will continue to be surprised.
+
+2. Add a parallel set of functions that clearly indicate that `count`
+is the maximum number of items that can be returned.
+
+# Unresolved questions
+
+Is there a nicer way to change the behavior of `count` such that users
+of `splitn` get compile-time errors when migrating?
+
+1. Add a dummy parameter, and mark the methods unstable. Remove the
+parameter and re-mark as stable near the end of the beta period.
+
+2. Move the methods from `SliceExt` and `StrExt` to a new trait that
+needs to be manually imported. After the transition, move the methods
+back and deprecate the trait. This would not break user code that
+migrated to the new semantics.
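
The "decrement the count in the constructor" idea from the Detailed design of the `splitn` RFC above can be sketched roughly as follows. This is only an illustration under stated assumptions: `SplitNItems` and `splitn_items` are hypothetical names invented here, the separator is restricted to a single `char`, and none of this is the standard library's actual implementation.

```rust
/// Hypothetical stand-in for the iterator returned by `splitn`
/// (illustrative only, not the real std type).
struct SplitNItems<'a> {
    rest: &'a str,
    sep: char,
    /// How many more times the separator may still be matched.
    splits_left: usize,
    /// Set once the final item has been yielded (or if none was requested).
    finished: bool,
}

/// Hypothetical constructor: `count` is the maximum number of items returned.
fn splitn_items(input: &str, count: usize, sep: char) -> SplitNItems<'_> {
    SplitNItems {
        rest: input,
        sep,
        // Returning `count` items needs at most `count - 1` separator matches.
        splits_left: count.saturating_sub(1),
        // A count of zero means the iterator starts out finished.
        finished: count == 0,
    }
}

impl<'a> Iterator for SplitNItems<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        if self.finished {
            return None;
        }
        if self.splits_left > 0 {
            if let Some(pos) = self.rest.find(self.sep) {
                let head = &self.rest[..pos];
                // Skip past the separator itself.
                self.rest = &self.rest[pos + self.sep.len_utf8()..];
                self.splits_left -= 1;
                return Some(head);
            }
        }
        // No splits left (or no separator found): yield the remainder once.
        self.finished = true;
        Some(self.rest)
    }
}
```

With this sketch, `splitn_items("a,b,c", 2, ',')` yields `"a"` and then `"b,c"`, while a count of `0` yields nothing, which matches the Strings examples given in the RFC.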
From 288b1b522b048f821b447ce496e38f1276615a74 Mon Sep 17 00:00:00 2001
From: David Turner
Date: Sun, 15 Mar 2015 15:52:26 -0400
Subject: [PATCH 0182/1195] RFC for read_all

---
 text/0000-read-all.md | 50 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)
 create mode 100644 text/0000-read-all.md

diff --git a/text/0000-read-all.md b/text/0000-read-all.md
new file mode 100644
index 00000000000..47c82a104a8
--- /dev/null
+++ b/text/0000-read-all.md
@@ -0,0 +1,50 @@
+- Feature Name: read_all
+- Start Date: 2015-03-15
+- RFC PR: (leave this empty)
+- Rust Issue: (leave this empty)
+
+# Summary
+
+Rust's Write trait has write_all, which attempts to write an entire
+buffer. This proposal adds read_all, which attempts to read a fixed
+number of bytes into a given buffer.
+
+# Motivation
+
+The new read_all method will allow programs to read from disk without
+having to write their own read loops. Most Rust programs which need
+to read from disk will prefer this to the plain read function. Many C
+programs have the same need, and solve it the same way (e.g. git has
+read_in_full). Here's one example of a Rust library doing this:
+https://github.com/BurntSushi/byteorder/blob/master/src/new.rs#L184
+
+# Detailed design
+
+The read_all function will take a mutable, borrowed slice of u8 to
+read into, and will attempt to fill that entire slice with data.
+
+It will loop, calling read() once per iteration and attempting to read
+the remaining amount of data. If read returns EINTR, the loop will
+retry. If there are no more bytes to read (as signalled by a return
+of Ok(0) from read()), a new error type, ErrorKind::ReadZero, will be
+returned. In the event of another error, that error will be
+returned. After a read call returns having successfully read some
+bytes, the total number of bytes read will be updated. If that
+total is equal to the size of the buffer, read will return
+successfully.
+ +# Drawbacks + +The major weakness of this API (shared with write_all) is that in the +event of an error, there is no way to return the number of bytes that +were successfully read before the error. But since that is the design +of write_all, it makes sense to mimic that design decision for read_all. + +# Alternatives + +One alternative design would return some new kind of Result which +could report the number of bytes sucessfully read before an error. +This would be inconsistent with write_all, but arguably more correct. + +Or we could leave this out, and let every Rust user write their own +read_all function -- like savages. From 72998dbe23776f4f4378459e770ca58b723eab9a Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 16 Mar 2015 14:13:21 -0400 Subject: [PATCH 0183/1195] Merge RFC 803 (Type Ascription) --- README.md | 1 + text/{0000-type-ascription.md => 0803-type-ascription.md} | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) rename text/{0000-type-ascription.md => 0803-type-ascription.md} (97%) diff --git a/README.md b/README.md index 20d84b6f4cf..76f83d571f9 100644 --- a/README.md +++ b/README.md @@ -50,6 +50,7 @@ the direction the language is evolving in. 
* [0702-rangefull-expression.md](text/0702-rangefull-expression.md) * [0738-variance.md](text/0738-variance.md) * [0769-sound-generic-drop.md](text/0769-sound-generic-drop.md) +* [0803-type-ascription.md](text/0803-type-ascription.md) * [0809-box-and-in-for-stdlib.md](text/0809-box-and-in-for-stdlib.md) ## Table of Contents diff --git a/text/0000-type-ascription.md b/text/0803-type-ascription.md similarity index 97% rename from text/0000-type-ascription.md rename to text/0803-type-ascription.md index 156e351b854..aecc5f4586c 100644 --- a/text/0000-type-ascription.md +++ b/text/0803-type-ascription.md @@ -1,6 +1,6 @@ - Start Date: 2015-2-3 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#803](https://github.com/rust-lang/rfcs/pull/803) +- Rust Issue: [rust-lang/rust#23416](https://github.com/rust-lang/rust/issues/23416) - Feature: `ascription` # Summary From a8b59657077bc8dd5681568d626cf1bdc46d4a35 Mon Sep 17 00:00:00 2001 From: Brian Anderson Date: Mon, 16 Mar 2015 13:17:19 -0700 Subject: [PATCH 0184/1195] Closure return type syntax is 968 --- ...rn-type-syntax.md => 0968-closure-return-type-syntax.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-closure-return-type-syntax.md => 0968-closure-return-type-syntax.md} (91%) diff --git a/text/0000-closure-return-type-syntax.md b/text/0968-closure-return-type-syntax.md similarity index 91% rename from text/0000-closure-return-type-syntax.md rename to text/0968-closure-return-type-syntax.md index 142d094afdc..86679f6cc54 100644 --- a/text/0000-closure-return-type-syntax.md +++ b/text/0968-closure-return-type-syntax.md @@ -1,7 +1,7 @@ - Feature Name: N/A -- Start Date: (fill me in with today's date, YYYY-MM-DD) -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- Start Date: 2015-03-16 +- RFC PR: [rust-lang/rfcs#968](https://github.com/rust-lang/rfcs/pull/968) +- Rust Issue: 
[rust-lang/rust#23420](https://github.com/rust-lang/rust/issues/23420) # Summary From 361bd5176e4981cfff488fab15bf9b981bb462b3 Mon Sep 17 00:00:00 2001 From: Brian Anderson Date: Mon, 16 Mar 2015 13:19:18 -0700 Subject: [PATCH 0185/1195] Update index --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 76f83d571f9..7e90f6975f0 100644 --- a/README.md +++ b/README.md @@ -52,6 +52,7 @@ the direction the language is evolving in. * [0769-sound-generic-drop.md](text/0769-sound-generic-drop.md) * [0803-type-ascription.md](text/0803-type-ascription.md) * [0809-box-and-in-for-stdlib.md](text/0809-box-and-in-for-stdlib.md) +* [0968-closure-return-type-syntax.md](text/0968-closure-return-type-syntax.md) ## Table of Contents [Table of Contents]: #table-of-contents From 84157a6fab9018e96d52a6b4b8ec232e0308021d Mon Sep 17 00:00:00 2001 From: Simonas Kazlauskas Date: Mon, 16 Mar 2015 23:48:40 +0200 Subject: [PATCH 0186/1195] Fix a wrong word in #640 --- text/0640-debug-improvements.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0640-debug-improvements.md b/text/0640-debug-improvements.md index f7cbfeb5cfa..b9755f8ccdd 100644 --- a/text/0640-debug-improvements.md +++ b/text/0640-debug-improvements.md @@ -4,7 +4,7 @@ # Summary -The `Debug` trait is intended to be implemented by every trait and display +The `Debug` trait is intended to be implemented by every type and display useful runtime information to help with debugging. This RFC proposes two additions to the fmt API, one of which aids implementors of `Debug`, and one which aids consumers of the output of `Debug`. 
Specifically, the `#` format From af16c663c56ae67f4ac14396703597e856d0aa31 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 17 Mar 2015 10:47:17 -0700 Subject: [PATCH 0187/1195] Remove completed RFCs from README --- README.md | 10 ---------- 1 file changed, 10 deletions(-) diff --git a/README.md b/README.md index 7e90f6975f0..eab2e0f7093 100644 --- a/README.md +++ b/README.md @@ -29,26 +29,16 @@ the direction the language is evolving in. * [0141-lifetime-elision.md](text/0141-lifetime-elision.md) * [0195-associated-items.md](text/0195-associated-items.md) * [0213-defaulted-type-params.md](text/0213-defaulted-type-params.md) -* [0235-collections-conventions.md](text/0235-collections-conventions.md) * [0320-nonzeroing-dynamic-drop.md](text/0320-nonzeroing-dynamic-drop.md) * [0339-statically-sized-literals.md](text/0339-statically-sized-literals.md) -* [0369-num-reform.md](text/0369-num-reform.md) * [0385-module-system-cleanup.md](text/0385-module-system-cleanup.md) * [0401-coercions.md](text/0401-coercions.md) * [0447-no-unused-impl-parameters.md](text/0447-no-unused-impl-parameters.md) -* [0458-send-improvements.md](text/0458-send-improvements.md) * [0495-array-pattern-changes.md](text/0495-array-pattern-changes.md) * [0501-consistent_no_prelude_attributes.md](text/0501-consistent_no_prelude_attributes.md) -* [0505-api-comment-conventions.md](text/0505-api-comment-conventions.md) * [0509-collections-reform-part-2.md](text/0509-collections-reform-part-2.md) * [0517-io-os-reform.md](text/0517-io-os-reform.md) -* [0544-rename-int-uint.md](text/0544-rename-int-uint.md) * [0560-integer-overflow.md](text/0560-integer-overflow.md) -* [0563-remove-ndebug.md](text/0563-remove-ndebug.md) -* [0572-rustc-attribute.md](text/0572-rustc-attribute.md) -* [0640-debug-improvements.md](text/0640-debug-improvements.md) -* [0702-rangefull-expression.md](text/0702-rangefull-expression.md) -* [0738-variance.md](text/0738-variance.md) * 
[0769-sound-generic-drop.md](text/0769-sound-generic-drop.md) * [0803-type-ascription.md](text/0803-type-ascription.md) * [0809-box-and-in-for-stdlib.md](text/0809-box-and-in-for-stdlib.md) From 6566a1d5a17e4e38dd2c176b7325c599d34f0d93 Mon Sep 17 00:00:00 2001 From: David Turner Date: Wed, 18 Mar 2015 00:52:52 -0400 Subject: [PATCH 0188/1195] address review comments --- text/0000-read-all.md | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/text/0000-read-all.md b/text/0000-read-all.md index 47c82a104a8..c676ec747e7 100644 --- a/text/0000-read-all.md +++ b/text/0000-read-all.md @@ -26,7 +26,7 @@ read into, and will attempt to fill that entire slice with data. It will loop, calling read() once per iteration and attempting to read the remaining amount of data. If read returns EINTR, the loop will retry. If there are no more bytes to read (as signalled by a return -of Ok(0) from read()), a new error type, ErrorKind::ReadZero, will be +of Ok(0) from read()), a new error type, ErrorKind::ShortRead, will be returned. In the event of another error, that error will be returned. After a read call returns having successfully read some bytes, the total number of bytes read will be updated. If that @@ -46,5 +46,14 @@ One alternative design would return some new kind of Result which could report the number of bytes sucessfully read before an error. This would be inconsistent with write_all, but arguably more correct. +Another would be that ErrorKind::ShortRead would be parameterized by +the number of bytes read before EOF. The downside of this is that it +bloats the size of io::Error. + +Finally, in the event of a short read, we could return Ok(number of +bytes read before EOF) instead of an error. But then every user would +have to check for this case. And it would be inconsistent with +write_all. + Or we could leave this out, and let every Rust user write their own read_all function -- like savages. 
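
The read loop that the read_all RFC describes in its Detailed design could look roughly like the sketch below. It is a free function rather than a trait method, and it makes two assumptions that are not in the RFC text: EINTR is taken to surface as `ErrorKind::Interrupted`, and `ErrorKind::UnexpectedEof` is used purely as a stand-in for the proposed `ErrorKind::ShortRead`, which this sketch does not define.

```rust
use std::io::{self, ErrorKind, Read};

/// Illustrative sketch only: fill `buf` completely or return an error.
fn read_all<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<()> {
    // Running total of bytes read so far.
    let mut total = 0;
    while total < buf.len() {
        // Each iteration asks only for the part of the buffer still unfilled.
        match reader.read(&mut buf[total..]) {
            // Ok(0) signals that there are no more bytes to read.
            // Stand-in for the proposed ErrorKind::ShortRead (assumption).
            Ok(0) => {
                return Err(io::Error::new(ErrorKind::UnexpectedEof, "short read"));
            }
            // Some bytes arrived: update the total and keep going.
            Ok(n) => total += n,
            // Interrupted reads (EINTR) are retried.
            Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
            // Any other error is returned to the caller unchanged.
            Err(e) => return Err(e),
        }
    }
    Ok(())
}
```

As the Drawbacks section of the RFC notes, a caller of a function shaped like this cannot learn how many bytes were read before a failure; that is the trade-off the alternatives in the patch above are weighing.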
From 62fde060063641657942d07922936a2556253ae2 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Wed, 18 Mar 2015 23:23:45 -0700 Subject: [PATCH 0189/1195] RFC 921 is Entry API V3 --- text/{0000-entry_v3.md => 0921-entry_v3.md} | 29 ++++++++------------- 1 file changed, 11 insertions(+), 18 deletions(-) rename text/{0000-entry_v3.md => 0921-entry_v3.md} (82%) diff --git a/text/0000-entry_v3.md b/text/0921-entry_v3.md similarity index 82% rename from text/0000-entry_v3.md rename to text/0921-entry_v3.md index 5e1acb41904..dfb7b835f2b 100644 --- a/text/0000-entry_v3.md +++ b/text/0921-entry_v3.md @@ -1,12 +1,12 @@ - Feature Name: entry_v3 - Start Date: 2015-03-01 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/921 +- Rust Issue: https://github.com/rust-lang/rust/issues/23508 # Summary -Replace Entry::get with Entry::default and Entry::default_with for better ergonomics and clearer -code. +Replace `Entry::get` with `Entry::or_insert` and +`Entry::or_insert_with` for better ergonomics and clearer code. # Motivation @@ -63,7 +63,7 @@ Replace `Entry::get` with the following two methods: ``` /// Ensures a value is in the entry by inserting the default if empty, and returns /// a mutable reference to the value in the entry. - pub fn default(self. default: V) -> &'a mut V { + pub fn or_insert(self. default: V) -> &'a mut V { match self { Occupied(entry) => entry.into_mut(), Vacant(entry) => entry.insert(default), @@ -72,7 +72,7 @@ Replace `Entry::get` with the following two methods: /// Ensures a value is in the entry by inserting the result of the default function if empty, /// and returns a mutable reference to the value in the entry. - pub fn default_with V>(self. default: F) -> &'a mut V { + pub fn or_insert_with V>(self. 
default: F) -> &'a mut V { match self { Occupied(entry) => entry.into_mut(), Vacant(entry) => entry.insert(default()), @@ -84,16 +84,16 @@ which allows the following: ``` -*map.entry(key).default(0) += 1; +*map.entry(key).or_insert(0) += 1; ``` ``` // vec![] doesn't even allocate, and is only 3 ptrs big. -map.entry(key).default(vec![]).push(val); +map.entry(key).or_insert(vec![]).push(val); ``` ``` -let val = map.entry(key).default_with(|| expensive(big, data)); +let val = map.entry(key).or_insert_with(|| expensive(big, data)); ``` Look at all that ergonomics. *Look at it*. This pushes us more into the "one right way" @@ -114,15 +114,8 @@ method is trivial to write as a consumer of the API. # Alternatives -Settle for Result chumpsville or abandon this sugar altogether. Truly, fates worse than death. +Settle for `Result` chumpsville or abandon this sugar altogether. Truly, fates worse than death. # Unresolved questions -`default` and `default_with` are universally reviled as *names*. Need a better name. Some candidates. - -* set_default -* or_insert -* insert_default -* insert_if_vacant -* with_default - +None. From 1e457ade4b98336f6d29a437ffc8856343793516 Mon Sep 17 00:00:00 2001 From: David Turner Date: Thu, 19 Mar 2015 11:55:27 -0400 Subject: [PATCH 0190/1195] Address review feedback --- text/0000-read-all.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/text/0000-read-all.md b/text/0000-read-all.md index c676ec747e7..28259dfb679 100644 --- a/text/0000-read-all.md +++ b/text/0000-read-all.md @@ -26,12 +26,12 @@ read into, and will attempt to fill that entire slice with data. It will loop, calling read() once per iteration and attempting to read the remaining amount of data. If read returns EINTR, the loop will retry. If there are no more bytes to read (as signalled by a return -of Ok(0) from read()), a new error type, ErrorKind::ShortRead, will be -returned. 
In the event of another error, that error will be +of Ok(0) from read()), a new error type, ErrorKind::ShortRead(usize), +will be returned. ShortRead includes the number of bytes successfully +read. In the event of another error, that error will be returned. After a read call returns having successfully read some -bytes, the total number of bytes read will be updated. If that -total is equal to the size of the buffer, read will return -successfully. +bytes, the total number of bytes read will be updated. If that total +is equal to the size of the buffer, read will return successfully. # Drawbacks @@ -46,9 +46,9 @@ One alternative design would return some new kind of Result which could report the number of bytes sucessfully read before an error. This would be inconsistent with write_all, but arguably more correct. -Another would be that ErrorKind::ShortRead would be parameterized by -the number of bytes read before EOF. The downside of this is that it -bloats the size of io::Error. +If we wanted io::Error to be a smaller type, ErrorKind::ShortRead +could be unparameterized. But this would reduce the information +available to calleres. Finally, in the event of a short read, we could return Ok(number of bytes read before EOF) instead of an error. 
But then every user would From 82cdd3bb76a8008b223e0264bd911a7f1197f1c8 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 19 Mar 2015 15:12:28 -0700 Subject: [PATCH 0191/1195] RFC 940 is tweaking hyphen behavior --- ...nsidered-harmful.md => 0940-hyphens-considered-harmful.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-hyphens-considered-harmful.md => 0940-hyphens-considered-harmful.md} (96%) diff --git a/text/0000-hyphens-considered-harmful.md b/text/0940-hyphens-considered-harmful.md similarity index 96% rename from text/0000-hyphens-considered-harmful.md rename to text/0940-hyphens-considered-harmful.md index f81d987414f..7c5e4dae9b1 100644 --- a/text/0000-hyphens-considered-harmful.md +++ b/text/0940-hyphens-considered-harmful.md @@ -1,7 +1,7 @@ - Feature Name: `hyphens_considered_harmful` - Start Date: 2015-03-05 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#940](https://github.com/rust-lang/rfcs/pull/940) +- Rust Issue: [rust-lang/rust#23533](https://github.com/rust-lang/rust/issues/23533) # Summary From db410afdaeb0a3321c3ee22242bafc78b9861ad1 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Thu, 19 Mar 2015 18:30:24 -0700 Subject: [PATCH 0192/1195] Fix minor mistakes --- text/0000-conversion-traits.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/text/0000-conversion-traits.md b/text/0000-conversion-traits.md index 69061a66cee..8534775df99 100644 --- a/text/0000-conversion-traits.md +++ b/text/0000-conversion-traits.md @@ -106,7 +106,7 @@ more detail below, and merits community discussion. ## Basic design The design is fairly simple, although perhaps not as simple as one -might expect: we introduce a total of *four* traits: +might expect: we introduce a total of *five* traits: ```rust trait As for Sized? 
{ @@ -228,8 +228,6 @@ impl From for U where T: Into { } ``` -The interaction between - ## An example Using all of the above, here are some example `impl`s and their use: From 1b5461a955c28733098555b6efb30eb8e5654719 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Thu, 19 Mar 2015 23:15:57 -0700 Subject: [PATCH 0193/1195] RFC 909 is Move `std::thread_local::*` into `std::thread` --- README.md | 2 +- ...> 0909-move-thread-local-to-std-thread.md} | 22 +++++++++---------- 2 files changed, 12 insertions(+), 12 deletions(-) rename text/{0000-move-thread-local-to-std-thread.md => 0909-move-thread-local-to-std-thread.md} (57%) diff --git a/README.md b/README.md index eab2e0f7093..95154240c62 100644 --- a/README.md +++ b/README.md @@ -42,6 +42,7 @@ the direction the language is evolving in. * [0769-sound-generic-drop.md](text/0769-sound-generic-drop.md) * [0803-type-ascription.md](text/0803-type-ascription.md) * [0809-box-and-in-for-stdlib.md](text/0809-box-and-in-for-stdlib.md) +* [0909-move-thread-local-to-std-thread.md](text/0909-move-thread-local-to-std-thread.md) * [0968-closure-return-type-syntax.md](text/0968-closure-return-type-syntax.md) ## Table of Contents @@ -267,4 +268,3 @@ necessary. 
[core team]: https://github.com/mozilla/rust/wiki/Note-core-team [triage process]: https://github.com/rust-lang/rust/wiki/Note-development-policy#milestone-and-priority-nomination-and-triage [weekly meeting]: https://github.com/rust-lang/meeting-minutes - diff --git a/text/0000-move-thread-local-to-std-thread.md b/text/0909-move-thread-local-to-std-thread.md similarity index 57% rename from text/0000-move-thread-local-to-std-thread.md rename to text/0909-move-thread-local-to-std-thread.md index cde24abdd5b..937c5dd608f 100644 --- a/text/0000-move-thread-local-to-std-thread.md +++ b/text/0909-move-thread-local-to-std-thread.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2015-02-25 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/909 +- Rust Issue: https://github.com/rust-lang/rust/issues/23547 # Summary @@ -17,14 +17,15 @@ slightly reduce the number of `use` statementsl # Detailed design -The `std::thread_local` module would be renamed to `std::thread::local`. -All contents of the module would remain the same. This way, all thread -related code is combined in one module. +The contents of`std::thread_local` module would be moved into to +`std::thread::local`. `Key` would be renamed to `LocalKey`, and +`scoped` would also be flattened, providing `ScopedKey`, etc. This +way, all thread related code is combined in one module. It would also allow using it as such: ```rust -use std::thread::{local, Thread}; +use std::thread::{LocalKey, Thread}; ``` # Drawbacks @@ -36,11 +37,10 @@ may prefer to have more top level modules. # Alternatives -Another strategy for moving `std::thread_local` would be to move it -directly into `std::thread` without scoping it in a dedicated module. -There are no naming conflicts, but the names would not be ideal anymore. -One way to mitigate would be to rename the types to something like -`LocalKey` and `LocalState`. 
+An alternative (as the RFC originally proposed) would be to bring +`thread_local` in as a submodule, rather than flattening. This was +decided against in an effort to keep hierarchies flat, and because of +the slim contents on the `thread_local` module. # Unresolved questions From e4a774f89d97bd6264ef3b8e4e676cafbc20059c Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 20 Mar 2015 14:27:09 -0700 Subject: [PATCH 0194/1195] RFC 529 is conversion traits --- text/{0000-conversion-traits.md => 0529-conversion-traits.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-conversion-traits.md => 0529-conversion-traits.md} (99%) diff --git a/text/0000-conversion-traits.md b/text/0529-conversion-traits.md similarity index 99% rename from text/0000-conversion-traits.md rename to text/0529-conversion-traits.md index 8534775df99..095772b2ff1 100644 --- a/text/0000-conversion-traits.md +++ b/text/0529-conversion-traits.md @@ -1,6 +1,6 @@ - Start Date: 2014-11-21 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#529](https://github.com/rust-lang/rfcs/pull/529) +- Rust Issue: [rust-lang/rust#23567](https://github.com/rust-lang/rust/issues/23567) # Summary From 7bca6c58ae234976d9f94c58cf69e32a764fa33c Mon Sep 17 00:00:00 2001 From: Jorge Aparicio Date: Fri, 20 Mar 2015 23:15:37 -0500 Subject: [PATCH 0195/1195] take RHS by value --- text/0000-op-assign.md | 41 +++++++++++++++++++++-------------------- 1 file changed, 21 insertions(+), 20 deletions(-) diff --git a/text/0000-op-assign.md b/text/0000-op-assign.md index 8fb962ceaf5..6657f0a982f 100644 --- a/text/0000-op-assign.md +++ b/text/0000-op-assign.md @@ -16,13 +16,13 @@ make mathematical libraries more palatable. 
# Detailed design -Add the following **unstable** traits to libcore and reexported them in stdlib: +Add the following **unstable** traits to libcore and reexported them in libstd: ``` // `+=` #[lang = "add_assign"] trait AddAssign { - fn add_assign(&mut self, &Rhs); + fn add_assign(&mut self, Rhs); } // the remaining traits have the same signature @@ -50,19 +50,22 @@ Once we feel comfortable with the implementation we'll remove the feature gate and mark the traits as stable. This can be done after 1.0 as this change is backwards compatible. -## RHS: By ref vs by value +## RHS: By value vs by ref -This RFC proposes that the assignment operations take the RHS always by ref; -instead of by value like the "normal" binary operations (e.g. `Add`) do. The -rationale is that, as far as the author has seen in practice [1], one never -wants to mutate the RHS or consume it, or in other words an immutable view into -the RHS is enough to perform the operation. Therefore, this RFC follows in the -footsteps of the `Index` traits, where the same situation arises with the -indexing value, and by ref was chosen over by value. +Taking the RHS by value is more flexible. The implementations allowed with +a by value RHS are a superset of the implementations allowed with a by ref RHS. +An example where taking the RHS by value is necessary would be operator sugar +for extending a collection with an iterator [1]: `vec ++= iter` where +`vec: Vec` and `iter impls Iterator`. This can't be implemented with the +by ref version as the iterator couldn't be advanced in that case. -[1] It could be possible that the author is not aware of use cases where taking -RHS by value is necessary. Feedback on this matter would be appreciated. (See -the first unresolved question) +[1] Where `++` is the "combine" operator that has been proposed [elsewhere]. 
+Note that this RFC doesn't propose adding that particular operator or adding +similar overloaded operations (`vec += iter`) to stdlib's collections, but it +leaves the door open to the possibility of adding them in the future (if +desired). + +[elsewhere]: https://github.com/rust-lang/rfcs/pull/203 # Drawbacks @@ -70,16 +73,14 @@ None that I can think of. # Alternatives -Alternatively, we could change the traits to take the RHS by value. This makes -them more "flexible" as the user can pick by value or by reference, but makes -the use slightly unergonomic in the by ref case as the borrow must be explicit -e.g. `x += &big_float;` vs `x += big_float;`. +Take the RHS by ref. This is less flexible than taking the RHS by value but, in +some instances, it can save writing `&rhs` when the RHS is owned and the +implementation demands a reference. However, this last point will be moot if we +implement auto-referencing for binary operators, as `lhs += rhs` would actually +call `add_assign(&mut lhs, &rhs)` if `Lhs impls AddAssign<&Rhs>`. # Unresolved questions -Are there any use cases of assignment operations where the RHS has to be taken -by value for the operation to be performant (e.g. to avoid internal cloning)? - Should we overload `ShlAssign` and `ShrAssign`, e.g. `impl ShlAssign for i32`, since we have already overloaded the `Shl` and `Shr` traits? From 1ad497df2877d46a85fa256c63797e3d00c24491 Mon Sep 17 00:00:00 2001 From: Sean Patrick Santos Date: Sat, 21 Mar 2015 23:09:03 -0600 Subject: [PATCH 0196/1195] Remove associated statics, and clarify limitations on associated constants pending further design work. 
--- text/0195-associated-items.md | 192 +++++++++++++++++++++++++++++++--- 1 file changed, 177 insertions(+), 15 deletions(-) diff --git a/text/0195-associated-items.md b/text/0195-associated-items.md index 718b63b9fea..0a76c63d5fc 100644 --- a/text/0195-associated-items.md +++ b/text/0195-associated-items.md @@ -10,7 +10,6 @@ set of methods, together with: * Associated functions (already present as "static" functions) * Associated consts -* Associated statics * Associated types * Associated lifetimes @@ -21,10 +20,12 @@ This RFC also provides a mechanism for *multidispatch* traits, where the `impl` is selected based on multiple types. The connection to associated items will become clear in the detailed text below. -*Note: This RFC was originally accepted before RFC 246 added consts and changed -the definition of statics. The text has been updated to clarify that both consts -and statics can be associated with a trait. Other than that modification, the -proposal has not been changed to reflect newer Rust features or syntax.* +*Note: This RFC was originally accepted before RFC 246 introduced the +distinction between const and static items. The text has been updated to clarify +that associated consts will be added rather than statics, and to provide a +summary of restrictions on the initial implementation of associated +consts. Other than that modification, the proposal has not been changed to +reflect newer Rust features or syntax.* # Motivation @@ -179,7 +180,7 @@ provide a distinct `impl` for every member of this family. Associated types, lifetimes, and functions can already be expressed in today's Rust, though it is unwieldy to do so (as argued above). -But associated _consts_ and _statics_ cannot be expressed using today's traits. +But associated _consts_ cannot be expressed using today's traits. 
For example, today's Rust includes a variety of numeric traits, including `Float`, which must currently expose constants as static functions: @@ -196,7 +197,7 @@ trait Float { } ``` -Because these functions cannot be used in static initializers, the modules for +Because these functions cannot be used in constant expressions, the modules for float types _also_ export a separate set of constants as consts, not using traits. @@ -288,15 +289,14 @@ distinction" below. ## Trait bodies: defining associated items -Trait bodies are expanded to include four new kinds of items: consts, statics, -types, and lifetimes: +Trait bodies are expanded to include three new kinds of items: consts, types, +and lifetimes: ``` TRAIT = TRAIT_HEADER '{' TRAIT_ITEM* '}' TRAIT_ITEM = ... | 'const' IDENT ':' TYPE [ '=' CONST_EXP ] ';' - | 'static' IDENT ':' TYPE [ '=' CONST_EXP ] ';' | 'type' IDENT [ ':' BOUNDS ] [ WHERE_CLAUSE ] [ '=' TYPE ] ';' | 'lifetime' LIFETIME_IDENT ';' ``` @@ -359,7 +359,7 @@ external to the trait. ### Defaults -Notice that associated consts, statics, and types permit defaults, just as trait +Notice that associated consts and types both permit defaults, just as trait methods and functions can provide defaults. Defaults are useful both as a code reuse mechanism, and as a way to expand the @@ -431,14 +431,13 @@ We deal with this in a very simple way: ## Trait implementations -Trait `impl` syntax is much the same as before, except that static, type, and +Trait `impl` syntax is much the same as before, except that const, type, and lifetime items are allowed: ``` IMPL_ITEM = ... 
| 'const' IDENT ':' TYPE '=' CONST_EXP ';' - | 'static' IDENT ':' TYPE '=' CONST_EXP ';' | 'type' IDENT' '=' 'TYPE' ';' | 'lifetime' LIFETIME_IDENT '=' LIFETIME_REFERENCE ';' ``` @@ -776,7 +775,6 @@ trait Foo { type AssocType; lifetime 'assoc_lifetime; const ASSOC_CONST: uint; - static ASSOC_STATIC: &'static [uint, ..1024]; fn assoc_fn() -> Self; // Note: 'assoc_lifetime and AssocType in scope: @@ -786,7 +784,6 @@ trait Foo { // method in scope UFCS-style, assoc_fn in scope let _ = method(self, assoc_fn()); ASSOC_CONST // in scope - ASSOC_STATIC // in scope } } @@ -1200,6 +1197,73 @@ trait Mappable While the above demonstrates the versatility of associated types and `where` clauses, it is probably too much of a hack to be viable for use in `libstd`. +### Associated consts in generic code + +There are some restrictions on uses of associated consts in generic code. These +might be loosened or removed in the future (see the related sub-sections in +"Unresolved questions" below). + + 1. Values of constant expressions in match patterns cannot depend on a type + parameter (by extension, neither can the types of such expressions). This + restriction is necessary for exhaustiveness and reachability to be checked + in generic code. + + Note that the dependence of a value on a type parameter may be indirect: + + ```rust + enum MyEnum { + Var1, + Var2, + } + trait HasVar { + const VAR: MyEnum; + } + fn do_something(x: MyEnum) { + const y: MyEnum = ::VAR; + // The following is forbidden because the value `y` depends on `T`. + match x { + y => { /* ... */ } + _ => { /* ... */ } + } + // However, this is OK because the guard is not a part of the pattern. + match x { + z if z == y => { /* ... */ } + _ => { /* ... */ } + } + } + ``` + + 2. 
Array sizes that depend on type parameters cannot be compared for equality + by type-checking, with one exception: if the expression for an array size + comprises only a single reference to a constant item (or associated item), + it will be considered equal to any other array size that refers to the same + item, even if that item itself depends on the type parameters. + + For clarification, here are some examples. Assume that `T` is a type + parameter in the outer scope, and that it is known to have an associated + const `::N` of type `usize`. + + ```rust + // This is OK (but there are limitations to how x can be used). + let x: [u8; ::N] = [0u8; ::N]; + // Equivalent to the above. + let x = [0u8; ::N]; + // Neither of the following are allowed (type checking shouldn't have to + // know anything about arithmetic). + let x: [u8; 2 * ::N] = [0u8; ::N + ::N]; + let x: [u8; ::N + 1] = [0u8; 1 + ::N]; + // Still not allowed. + let x: [u8; ::N + 1] = [0u8; ::N + 1]; + // Workaround for the expression above. + const N_PLUS_1: usize = ::N + 1; + let x: [u8; N_PLUS_1] = [0u8; N_PLUS_1]; + // Neither of the following are allowed. + const ALIAS_N_PLUS_1: usize = N_PLUS_1; + let x: [u8; N_PLUS_1] = [0u8; ALIAS_N_PLUS_1]; + const ALIAS_N: usize = ::N; + let x: [u8; ::N] = [0u8; ALIAS_N]; + ``` + # Staging Associated lifetimes are probably not necessary for the 1.0 timeframe. While we @@ -1403,3 +1467,101 @@ This seems like a potentially useful feature, and should be unproblematic for bounds, but may have implications for vtables that make it problematic for trait objects. Whether or not such trait combinations are allowed will likely depend on implementation concerns, which are not yet clear. + +## Generic associated consts in match patterns + +It seems desirable to allow constants that depend on type parameters in match +patterns, but it's not clear how to do so. 
+ +Looking at the `HasVar` example above, one possibility would be to simply treat +the first, forbidden match expression as syntactic sugar for the second, allowed +match expression that uses a pattern guard. This is simple to implement because +one can simply ignore the constant when performing exhaustiveness and +reachability checks. Unfortunately, this approach blurs the difference between +match patterns (which provide strict checks) and pattern guards (which are just +useful syntactic sugar), and it does not increase the expressiveness of the +language. + +An alternative would be to allow `where` clauses to place constraints on +associated consts. If an associated const is known to be equal/unequal to some +other value (or in the case of integers, inside/outside a given range), this can +inform exhaustiveness and reachability checks. But this requires more design and +implementation work, and more syntax. + +For now, we simply defer the question. + +## Generic associated consts in array sizes + +The above solution for type-checking array sizes is somewhat unsatisfactory. In +particular, it is counter-intuitive that neither of the following will type +check: + +```rust +// Shouldn't this be OK? +const ALIAS_N: usize = ::N; +let x: [u8; ::N] = [0u8; ALIAS_N]; +// This is likely to yield an embarrassing error message such as: +// "couldn't prove that `::N + 1` is equal to `::N + 1`" +let x: [u8; ::N + 1] = [0u8; ::N + 1]; +``` + +A function like this is especially affected: + +```rust +trait HasN { + const N: usize; +} +fn foo() -> [u8; ::N + 1] { + // Can't be verified to be correct for the return type, and can't use the + // intermediate const workaround due to scoping issues. + [0u8; ::N + 1] +} +``` + +This can be worked around with type-level naturals that use associated consts to +produce array sizes, but this is syntactically a bit inelegant. 
+ +```rust +// Assume that `TypeAdd` and `One` are from a type-level naturals or similar +// library, and that `NAsTypeNatN` provides some way of translating the `N` +// on a `HasN` to a type compatible with that library. +trait HasN { + const N: usize; + type TypeNatN; +} +fn foo() -> [u8; TypeAdd<::TypeNatN, One>::AsUsize] { + // Because the type `TypeAdd<::TypeNatN, One>` can be verified to be + // equal to itself in type checking, we know that the associated const + // `AsUsize` below must be the same item as the `AsUsize` mentioned in the + // return type above. + [0u8; TypeAdd<::NAsTypeNat, One>::AsUsize] +} +``` + +There are a variety of possible ways to address the above issues, including: + + - Implementing smarter handling of consts that are just aliases of other + constant items. + - Allowing `where` clauses to constrain some associated constants to be equal, + to other expressions, and using this information in type checking. + - Adding normalization with little or no awareness of arithmetic (e.g. allowing + expressions that are exactly the same to be considered equal, or using only + a very basic understanding of which operations are commutative and/or + associative). + - Adding new syntax and/or new capability to plugins to allow type-level + naturals to be used with more ergonomic and clear syntax. + - Implementing a dependent type system that provides built-in semantics for + integer arithmetic at the type level, rather than implementing this in an + external or standard library. + - Using a full-fledged SMT solver. + - Some other creative solutions not on this list. + +While there are many ways to improve on the current design, and many of these +approaches are not mutually exclusive, much more work is needed to investigate +and implement a self-consistent, effective, and ideally intuitive set of +solutions. 
+ +Though admittedly not very satisfying at the moment, the current approach has +the advantage of being (arguably) a good minimalist design, allowing associated +consts to be used for array sizes in generic code now, but also allowing for any +of a number of improved systems to be implemented later. From 906439df42ad86d8bb3ae6ffa27d32dd1823cd06 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Wed, 18 Mar 2015 15:10:13 +1300 Subject: [PATCH 0197/1195] Modify RFC #803 (type ascription) to make type ascription expressions lvalues. --- text/0803-type-ascription.md | 14 ++++---------- 1 file changed, 4 insertions(+), 10 deletions(-) diff --git a/text/0803-type-ascription.md b/text/0803-type-ascription.md index aecc5f4586c..a79c213572b 100644 --- a/text/0803-type-ascription.md +++ b/text/0803-type-ascription.md @@ -172,10 +172,10 @@ lvalue position), then we don't have the soundness problem, but we do get the unexpected result that `&(x: T)` is not in fact a reference to `x`, but a reference to a temporary copy of `x`. -The proposed solution is that type ascription expressions are rvalues, but -taking a reference of such an expression is forbidden. I.e., type asciption is -forbidden in the following contexts (where `` is a type ascription -expression): +The proposed solution is that type ascription expressions are lvalues, where +the type ascription expression is in reference context, then we require the +ascribed type to exactly match the type of the expression, i.e., neither +subtyping nor coercion is allowed. These contexts are: ``` &[mut] @@ -184,12 +184,6 @@ match { .. ref [mut] x .. => { .. } .. } .foo() // due to autoref ``` -Like other rvalues, type ascription would not be allowed as the lhs of assignment. - -Note that, if type asciption is required in such a context, an lvalue can be -forced by using `{}`, e.g., write `&mut { foo: T }`, rather than `&mut (foo: T)`. - - # Drawbacks More syntax, another feature in the language. 
From 18fe78ed3e1611be4ee5484b4b5e9d18e0c534b3 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Tue, 24 Mar 2015 12:35:37 +1300 Subject: [PATCH 0198/1195] Update the text about lvalues --- text/0803-type-ascription.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/text/0803-type-ascription.md b/text/0803-type-ascription.md index a79c213572b..94dc373d59c 100644 --- a/text/0803-type-ascription.md +++ b/text/0803-type-ascription.md @@ -172,16 +172,18 @@ lvalue position), then we don't have the soundness problem, but we do get the unexpected result that `&(x: T)` is not in fact a reference to `x`, but a reference to a temporary copy of `x`. -The proposed solution is that type ascription expressions are lvalues, where -the type ascription expression is in reference context, then we require the -ascribed type to exactly match the type of the expression, i.e., neither -subtyping nor coercion is allowed. These contexts are: +The proposed solution is that type ascription expressions are lvalues. If the +type ascription expression is in reference context, then we require the ascribed +type to exactly match the type of the expression, i.e., neither subtyping nor +coercion is allowed. These reference contexts are as follows (where <expr> is a +type ascription expression): ``` &[mut] <expr> let ref [mut] x = <expr> match <expr> { .. ref [mut] x .. => { .. } .. } <expr>.foo() // due to autoref + <expr> = ...; ``` # Drawbacks From 2dabd26152745c24cc03e3b4b52b712322b1e8f5 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 24 Mar 2015 17:28:18 -0700 Subject: [PATCH 0199/1195] RFC: Add std::process::exit Add a new function to the `std::process` module to exit the process immediately with a specified exit code.
--- text/0000-process-exit.md | 88 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 88 insertions(+) create mode 100644 text/0000-process-exit.md diff --git a/text/0000-process-exit.md b/text/0000-process-exit.md new file mode 100644 index 00000000000..f801669ac7c --- /dev/null +++ b/text/0000-process-exit.md @@ -0,0 +1,88 @@ +- Feature Name: exit +- Start Date: 2015-03-24 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Add a function to the `std::process` module to exit the process immediately with +a specified exit code. + +# Motivation + +Currently there is no stable method to exit a program in Rust with a nonzero +exit code without panicking. The current unstable method for doing so is by +using the `exit_status` feature with the `std::env::set_exit_status` function. + +This function has not been stabilized as it diverges from the system APIs (there +is no equivalent) and it represents an odd piece of global state for a Rust +program to have. One example of odd behavior that may arise is that if a library +calls `env::set_exit_status`, then the process is not guaranteed to exit with +that status (e.g. Rust was called from C). + +The purpose of this RFC is to provide at least one method on the path to +stabilization which will provide a method to exit a process with a nonzero exit +code. + +# Detailed design + +The following function will be added to the `std::process` module: + +```rust +/// Terminates the current process with the specified exit code. +/// +/// This function will never return and will immediately terminate the current +/// process. The exit code is passed through to the underlying OS and will be +/// available for consumption by another process. +/// +/// Note that because this function never returns, and that it terminates the +/// process, no destructors on the current stack or any other thread's stack +/// will be run. 
If a clean shutdown is needed it is recommended to only call +/// this function at a known point where there are no more destructors left +/// to run. +pub fn exit(code: i32) -> !; +``` + +Implementation-wise this will correspond to the [`exit` function][unix] on unix +and the [`ExitProcess` function][win] on windows. + +[unix]: http://pubs.opengroup.org/onlinepubs/000095399/functions/exit.html +[win]: https://msdn.microsoft.com/en-us/library/windows/desktop/ms682658%28v=vs.85%29.aspx + +This function is also not marked `unsafe`, despite the risk of leaking +allocated resources (e.g. destructor smany not be run). It is already possible +to safely create memory leaks in Rust, however, (with `Rc` + `RefCell`), so +this is not considered a strong enough threshold to mark the function as +`unsafe`. + +# Drawbacks + +* This API does not solve all use cases of exiting with a nonzero exit status. + It is sometimes more convenient to simply return a code from the `main` + function instead of having to call a separate function in the standard + library. + +# Alternatives + +* One alternative would be to stabilize `set_exit_status` as-is today. The + semantics of the function would be clearly documented to prevent against + surprises, but it would arguably not prevent all surprises from arising. Some + reasons for not pursuing this route, however, have been outlined in the + motivation. + +* The `main` function of binary programs could be altered to either require an + `i32` return value. This would greatly lessen the need to stabilize this + function as-is today as it would be possible to exit with a nonzero code by + returning a nonzero value from `main`. This is a backwards-incompatible + change, however. + +* The `main` function of binary programs could optionally be typed as `fn() -> + i32` instead of just `fn()`. This would be a backwards-compatible change, but + does somewhat add complexity. 
It may strike some as odd to be able to define + the `main` function with two different signatures in Rust. + +# Unresolved questions + +* To what degree should the documentation imply that `rt::at_exit` handlers are + run? Implementation-wise their execution is guaranteed, but we may not wish + for this to always be so. From 9b1344bc9ba08c6b2fe112f4b0e1446d43dced09 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 24 Mar 2015 17:31:27 -0700 Subject: [PATCH 0200/1195] typo --- text/0000-process-exit.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-process-exit.md b/text/0000-process-exit.md index f801669ac7c..6e6e239a933 100644 --- a/text/0000-process-exit.md +++ b/text/0000-process-exit.md @@ -70,7 +70,7 @@ this is not considered a strong enough threshold to mark the function as reasons for not pursuing this route, however, have been outlined in the motivation. -* The `main` function of binary programs could be altered to either require an +* The `main` function of binary programs could be altered to require an `i32` return value. This would greatly lessen the need to stabilize this function as-is today as it would be possible to exit with a nonzero code by returning a nonzero value from `main`. This is a backwards-incompatible From d2e1136b06347cc61108bcdf4f4ee3470286fcd9 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 24 Mar 2015 17:58:22 -0700 Subject: [PATCH 0201/1195] More typos --- text/0000-process-exit.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-process-exit.md b/text/0000-process-exit.md index 6e6e239a933..1015559ec02 100644 --- a/text/0000-process-exit.md +++ b/text/0000-process-exit.md @@ -50,7 +50,7 @@ and the [`ExitProcess` function][win] on windows. [win]: https://msdn.microsoft.com/en-us/library/windows/desktop/ms682658%28v=vs.85%29.aspx This function is also not marked `unsafe`, despite the risk of leaking -allocated resources (e.g. destructor smany not be run). 
It is already possible +allocated resources (e.g. destructors may not be run). It is already possible to safely create memory leaks in Rust, however, (with `Rc` + `RefCell`), so this is not considered a strong enough threshold to mark the function as `unsafe`. From 0e44ae6b858c81d6a8b96c69b90cccd1ca5a6711 Mon Sep 17 00:00:00 2001 From: Robin Stocker Date: Wed, 25 Mar 2015 19:13:51 +1100 Subject: [PATCH 0202/1195] Fix typo in 0921-entry_v3.md The names where changed from `default*` to `or_insert*`, but one occurrence was still using the old name. --- text/0921-entry_v3.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0921-entry_v3.md b/text/0921-entry_v3.md index dfb7b835f2b..0773ca22919 100644 --- a/text/0921-entry_v3.md +++ b/text/0921-entry_v3.md @@ -99,7 +99,7 @@ let val = map.entry(key).or_insert_with(|| expensive(big, data)); Look at all that ergonomics. *Look at it*. This pushes us more into the "one right way" territory, since this is unambiguously clearer and easier than a full `match` or abusing Result. Novices don't really need to learn the entry API at all with this. They can just learn the -`.entry(key).default(value)` incantation to start, and work their way up to more complex +`.entry(key).or_insert(value)` incantation to start, and work their way up to more complex usage later. Oh hey look this entire RFC is already implemented with all of `rust-lang/rust`'s `entry` From 88a9dbb807398594ba468540048387bd0cd36ca4 Mon Sep 17 00:00:00 2001 From: Ms2ger Date: Wed, 25 Mar 2015 13:14:22 +0100 Subject: [PATCH 0203/1195] Replace duplicate() by try_clone() in std::net. The alleged implementation of this RFC in included `try_clone` rather than `duplicate`. This commit updates the approved RFC to match the actual implementation. 
--- text/0517-io-os-reform.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/text/0517-io-os-reform.md b/text/0517-io-os-reform.md index 445e5aec7a2..a7f21e2440b 100644 --- a/text/0517-io-os-reform.md +++ b/text/0517-io-os-reform.md @@ -1608,7 +1608,7 @@ impl TcpStream { fn peer_addr(&self) -> io::Result; fn local_addr(&self) -> io::Result; fn shutdown(&self, how: Shutdown) -> io::Result<()>; - fn duplicate(&self) -> io::Result; + fn try_clone(&self) -> io::Result; } impl Read for TcpStream { ... } @@ -1619,8 +1619,8 @@ impl<'a> Write for &'a TcpStream { ... } #[cfg(windows)] impl AsRawSocket for TcpStream { ... } ``` -* `clone` has been replaced with a `duplicate` function. The implementation of - `duplicate` will map to using `dup` on Unix platforms and +* `clone` has been replaced with a `try_clone` function. The implementation of + `try_clone` will map to using `dup` on Unix platforms and `WSADuplicateSocket` on Windows platforms. The `TcpStream` itself will no longer be reference counted itself under the hood. * `close_{read,write}` are both removed in favor of binding the `shutdown` @@ -1646,7 +1646,7 @@ into the `TcpListener` structure. Specifically, this will be the resulting API: impl TcpListener { fn bind(addr: &A) -> io::Result; fn local_addr(&self) -> io::Result; - fn duplicate(&self) -> io::Result; + fn try_clone(&self) -> io::Result; fn accept(&self) -> io::Result<(TcpStream, SocketAddr)>; fn incoming(&self) -> Incoming; } @@ -1663,7 +1663,7 @@ Some major changes from today's API include: * The static distinction between `TcpAcceptor` and `TcpListener` has been removed (more on this in the [socket][Sockets] section). -* The `clone` functionality has been removed in favor of `duplicate` (same +* The `clone` functionality has been removed in favor of `try_clone` (same caveats as `TcpStream`). * The `close_accept` functionality is removed entirely. 
This is not currently implemented via `shutdown` (not supported well across platforms) and is @@ -1690,7 +1690,7 @@ impl UdpSocket { fn recv_from(&self, buf: &mut [u8]) -> io::Result<(usize, SocketAddr)>; fn send_to(&self, buf: &[u8], addr: &A) -> io::Result; fn local_addr(&self) -> io::Result; - fn duplicate(&self) -> io::Result; + fn try_clone(&self) -> io::Result; } #[cfg(unix)] impl AsRawFd for UdpSocket { ... } @@ -1705,7 +1705,7 @@ Some important points of note are: `#[unstable]` for now. * All timeout support is removed. This may come back in the form of `setsockopt` (as with TCP streams) or with a more general implementation of `select`. -* `clone` functionality has been replaced with `duplicate`. +* `clone` functionality has been replaced with `try_clone`. The `UdpSocket` type will adhere to both `Send` and `Sync`. From d33dd641e965add9a2a8fe3ac52d83dd8f5a3b87 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 25 Mar 2015 10:36:31 -0700 Subject: [PATCH 0204/1195] Note that exit() may be desired regarless of main's signature --- text/0000-process-exit.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-process-exit.md b/text/0000-process-exit.md index 1015559ec02..bdce352025e 100644 --- a/text/0000-process-exit.md +++ b/text/0000-process-exit.md @@ -79,7 +79,9 @@ this is not considered a strong enough threshold to mark the function as * The `main` function of binary programs could optionally be typed as `fn() -> i32` instead of just `fn()`. This would be a backwards-compatible change, but does somewhat add complexity. It may strike some as odd to be able to define - the `main` function with two different signatures in Rust. + the `main` function with two different signatures in Rust. Additionally, it's + likely that the `exit` functionality proposed will be desired regardless of + whether the main function can return a code or not. 
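To make the trade-off concrete, here is a minimal sketch of how a program might use the proposed `std::process::exit`. The names `run` and `exit_with_status` and the argument handling are illustrative assumptions, not part of the RFC. Keeping the real work in `run` means its local values are dropped on return, before `exit` is called, since `exit` itself runs no destructors.

```rust
use std::process;

/// Hypothetical program logic: returns the desired exit code instead of
/// exiting directly, so every local value is dropped on return.
fn run(args: &[String]) -> i32 {
    if args.is_empty() {
        eprintln!("usage: prog <name>");
        return 2; // nonzero status signals a usage error
    }
    println!("hello, {}", args[0]);
    0
}

/// Hypothetical entry point; a real program would call this from `main`.
fn exit_with_status() -> ! {
    let args: Vec<String> = std::env::args().skip(1).collect();
    let code = run(&args);
    // `args` is dropped explicitly because `exit` never returns and
    // therefore never unwinds this frame.
    drop(args);
    process::exit(code)
}
```

With this shape, only the thin wrapper around `run` ever calls `exit`, so the "no destructors are run" caveat applies to as little state as possible.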
# Unresolved questions From b9bbb38c3ea82314d6e001d9ec28bd9e0da1ec3f Mon Sep 17 00:00:00 2001 From: Peter Atashian Date: Wed, 25 Mar 2015 15:07:24 -0400 Subject: [PATCH 0205/1195] Stdout panic Signed-off-by: Peter Atashian --- 0000-stdout-existential-crisis.md | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) create mode 100644 0000-stdout-existential-crisis.md diff --git a/0000-stdout-existential-crisis.md b/0000-stdout-existential-crisis.md new file mode 100644 index 00000000000..413f679ded4 --- /dev/null +++ b/0000-stdout-existential-crisis.md @@ -0,0 +1,31 @@ +- Feature Name: stdout_existential_crisis +- Start Date: 2015-03-25 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Calling `println!` currently causes a panic if `stdout` does not exist. Change this to ignore this specific error and simply discard the output. + +# Motivation + +On Linux `stdout` almost always exists, so when people write games and turn off the terminal there is still a `stdout` that they write to. Then, when getting the code to run on Windows with the console disabled, `stdout` suddenly doesn't exist and `println!` panics. This difference in behavior is frustrating to developers trying to move to Windows. + +There is also precedent with C and C++. On both Linux and Windows, if `stdout` is closed or doesn't exist, neither platform will error when printing to the console. + +# Detailed design + +Change the internal implementation of `println!`, `print!`, `panic!`, and `assert!` to not `panic!` when `stdout` or `stderr` doesn't exist. When getting `stdout` or `stderr` through the `std::io` methods, those versions should continue to return an error if `stdout` or `stderr` doesn't exist. + +# Drawbacks + +Hides an error from the user which we may want to expose, and may lead to people missing panics occurring in threads. + +# Alternatives + +* Make `println!`, `print!`, `panic!`, and `assert!` return errors that the user has to handle.
+* Continue with the status quo and panic if `stdout` or `stderr` doesn't exist. + +# Unresolved questions + +* Should `std::io::stdout` return `Err` or `None` when there is no `stdout` instead of unconditionally returning `Stdout`? From 88c205f2798903f3784925b999720b875809dfa2 Mon Sep 17 00:00:00 2001 From: Chris Morgan Date: Thu, 26 Mar 2015 16:54:59 +1100 Subject: [PATCH 0206/1195] Update RFCs in line with the implementations. Also some spelling and grammar and whatnot. --- text/0529-conversion-traits.md | 150 +++++++++++++++++++-------------- text/0921-entry_v3.md | 10 +-- 2 files changed, 90 insertions(+), 70 deletions(-) diff --git a/text/0529-conversion-traits.md b/text/0529-conversion-traits.md index 095772b2ff1..6e2e6b878dd 100644 --- a/text/0529-conversion-traits.md +++ b/text/0529-conversion-traits.md @@ -1,3 +1,4 @@ +- Feature Name: convert - Start Date: 2014-11-21 - RFC PR: [rust-lang/rfcs#529](https://github.com/rust-lang/rfcs/pull/529) - Rust Issue: [rust-lang/rust#23567](https://github.com/rust-lang/rust/issues/23567) @@ -37,7 +38,7 @@ For example, the introduce an `AsPath` trait to make various path operations ergonomic: ```rust -pub trait AsPath for Sized? { +pub trait AsPath { fn as_path(&self) -> &Path; } @@ -106,23 +107,19 @@ more detail below, and merits community discussion. ## Basic design The design is fairly simple, although perhaps not as simple as one -might expect: we introduce a total of *five* traits: +might expect: we introduce a total of *four* traits: ```rust -trait As for Sized? { - fn convert_as(&self) -> &T; +trait AsRef { + fn as_ref(&self) -> &T; } -trait AsMut for Sized? { - fn convert_as_mut(&mut self) -> &mut T; -} - -trait To for Sized? 
{ - fn convert_to(&self) -> T; +trait AsMut { + fn as_mut(&mut self) -> &mut T; } trait Into { - fn convert_into(self) -> T; + fn into(self) -> T; } trait From { @@ -130,16 +127,16 @@ trait From { } ``` -The first three traits mirror our `as`/`to`/`into` conventions, but +The first three traits mirror our `as`/`into` conventions, but add a bit more structure to them: `as`-style conversions are from -references to references, `to`-style conversions are from references -to arbitrary types, and `into`-style conversions are between arbitrary -types (consuming their argument). +references to references and `into`-style conversions are between +arbitrary types (consuming their argument). -The final trait, `From`, mimics the `from` constructors. Unlike the -other traits, its method is not prefixed with `convert`. This is -because, again unlike the other traits, this trait is expected to -outright replace most custom `from` constructors. See below. +A `To` trait, following our `to` conventions and converting from +references to arbitrary types, is possible but is deferred for now. + +The final trait, `From`, mimics the `from` constructors. This trait is +expected to outright replace most custom `from` constructors. See below. **Why the reference restrictions?** @@ -185,7 +182,7 @@ lifetime linking explained above. In addition, however, it is a basic principle of Rust's libraries that conversions are distinguished by cost and consumption, and having multiple traits makes it possible to (by convention) restrict attention to e.g. "free" `as`-style conversions -by bounding only by `As`. +by bounding only by `AsRef`. Why have both `Into` and `From`? There are a few reasons: @@ -201,30 +198,16 @@ Given the above trait design, there are a few straightforward blanket `impl`s as one would expect: ```rust -// As implies To -impl<'a, Sized? T, Sized? 
U> To<&'a U> for &'a T where T: As { - fn convert_to(&self) -> &'a U { - self.convert_as() - } -} - -// To implies Into -impl<'a, T, U> Into for &'a T where T: To { - fn convert_into(self) -> U { - self.convert_to() - } -} - // AsMut implies Into impl<'a, T, U> Into<&'a mut U> for &'a mut T where T: AsMut { - fn convert_into(self) -> &'a mut U { - self.convert_as_mut() + fn into(self) -> &'a mut U { + self.as_mut() } } // Into implies From impl From for U where T: Into { - fn from(t: T) -> U { t.cvt_into() } + fn from(t: T) -> U { t.into() } } ``` @@ -233,28 +216,28 @@ impl From for U where T: Into { Using all of the above, here are some example `impl`s and their use: ```rust -impl As for String { - fn convert_as(&self) -> &str { +impl AsRef for String { + fn as_ref(&self) -> &str { self.as_slice() } } -impl As<[u8]> for String { - fn convert_as(&self) -> &[u8] { +impl AsRef<[u8]> for String { + fn as_ref(&self) -> &[u8] { self.as_bytes() } } impl Into> for String { - fn convert_into(self) -> Vec { + fn into(self) -> Vec { self.into_bytes() } } fn main() { let a = format!("hello"); - let b: &[u8] = a.convert_as(); - let c: &str = a.convert_as(); - let d: Vec = a.convert_into(); + let b: &[u8] = a.as_ref(); + let c: &str = a.as_ref(); + let d: Vec = a.into(); } ``` @@ -265,8 +248,8 @@ be rare, however; usually the traits are used for generic functions: impl Path { fn join_path_inner(&self, p: &Path) -> PathBuf { ... } - pub fn join_path>(&self, p: &P) -> PathBuf { - self.join_path_inner(p.convert_as()) + pub fn join_path>(&self, p: &P) -> PathBuf { + self.join_path_inner(p.as_ref()) } } ``` @@ -292,14 +275,15 @@ impl Path { ``` that would desugar into exactly the above (assuming that the `~` sigil -was restricted to `As` conversions). Such a feature is out of scope +was restricted to `AsRef` conversions). Such a feature is out of scope for this RFC, but it's a natural and highly ergonomic extension of the traits being proposed here. 
## Preliminary conventions Would *all* conversion traits be replaced by the proposed ones? -Probably not, due to the combination of two factors: +Probably not, due to the combination of two factors (using the example +of `To`, despite its being deferred for now): * You still want blanket `impl`s like `ToString` for `Show`, but: * This RFC proposes that specific conversion *methods* like @@ -355,9 +339,9 @@ So a rough, preliminary convention would be the following: *All* of the conversion traits are added to the prelude. There are two reasons for doing so: -* For `As`/`To`/`Into`, the reasoning is similar to the inclusion of - `PartialEq` and friends: they are expected to appear ubiquitously as - bounds. +* For `AsRef`/`AsMut`/`Into`, the reasoning is similar to the + inclusion of `PartialEq` and friends: they are expected to appear + ubiquitously as bounds. * For `From`, bounds are somewhat less common but the use of the `from` constructor is expected to be rather widespread. @@ -381,6 +365,14 @@ There are a few drawbacks to the design as proposed: # Alternatives +The original form of this RFC used the names `As.convert_as`, +`AsMut.convert_as_mut`, `To.convert_to` and `Into.convert_into` (though +still `From.from`). After discussion `As` was changed to `AsRef`, +removing the keyword collision of a method named `as`, and the +`convert_` prefixes were removed. + +--- + The main alternative is one that attempts to provide methods that *completely replace* ad hoc conversion methods. To make this work, a form of double dispatch is used, so that the methods are added to @@ -422,47 +414,47 @@ the author to discard this alternative design. // Immutable views -trait ShiftViewFrom for Sized? { +trait ShiftViewFrom { fn shift_view_from(&T) -> &Self; } -trait ShiftView for Sized? 
{ - fn shift_view(&self) -> &T where T: ShiftViewFrom; +trait ShiftView { + fn shift_view(&self) -> &T where T: ShiftViewFrom; } -impl ShiftView for T { - fn shift_view>(&self) -> &U { +impl ShiftView for T { + fn shift_view>(&self) -> &U { ShiftViewFrom::shift_view_from(self) } } // Mutable coercions -trait ShiftViewFromMut for Sized? { +trait ShiftViewFromMut { fn shift_view_from_mut(&mut T) -> &mut Self; } -trait ShiftViewMut for Sized? { - fn shift_view_mut(&mut self) -> &mut T where T: ShiftViewFromMut; +trait ShiftViewMut { + fn shift_view_mut(&mut self) -> &mut T where T: ShiftViewFromMut; } -impl ShiftViewMut for T { - fn shift_view_mut>(&mut self) -> &mut U { +impl ShiftViewMut for T { + fn shift_view_mut>(&mut self) -> &mut U { ShiftViewFromMut::shift_view_from_mut(self) } } // CONVERSIONS -trait ConvertFrom for Sized? { +trait ConvertFrom { fn convert_from(&T) -> Self; } -trait Convert for Sized? { +trait Convert { fn convert(&self) -> T where T: ConvertFrom; } -impl Convert for T { +impl Convert for T { fn convert(&self) -> U where U: ConvertFrom { ConvertFrom::convert_from(self) } @@ -526,3 +518,31 @@ fn main() { let b = s.shift_view::<[u8]>(); } ``` + +## Possible further work + +We could add a `To` trait. 
+ +```rust +trait To { + fn to(&self) -> T; +} +``` + +As far as blanket `impl`s are concerned, there are a few simple ones: + +```rust +// AsRef implies To +impl<'a, T: ?Sized, U: ?Sized> To<&'a U> for &'a T where T: AsRef { + fn to(&self) -> &'a U { + self.as_ref() + } +} + +// To implies Into +impl<'a, T, U> Into for &'a T where T: To { + fn into(self) -> U { + self.to() + } +} +``` diff --git a/text/0921-entry_v3.md b/text/0921-entry_v3.md index 0773ca22919..f7cdeeef245 100644 --- a/text/0921-entry_v3.md +++ b/text/0921-entry_v3.md @@ -27,7 +27,7 @@ match map.entry(key) => { } ``` -This code is noisy, and is visibly fighting the Entry API a bit, such as having to supress +This code is noisy, and is visibly fighting the Entry API a bit, such as having to suppress the return value of insert. It requires the `Entry` enum to be imported into scope. It requires the user to learn a whole new API. It also introduces a "many ways to do it" stylistic ambiguity: @@ -53,7 +53,7 @@ map.entry(key).get().unwrap_or_else(|entry| entry.insert(vec![])).push(val); This is certainly *nicer*. No imports are needed, the Occupied case is handled, and we're closer to a "only one way". However this is still fairly tedious and arcane. `get` provides little -meaning for what is done; unwrap_or_else is long and scary-sounding; and VacantEntry litterally +meaning for what is done; `unwrap_or_else` is long and scary-sounding; and VacantEntry literally *only* supports `insert`, so having to call it seems redundant. # Detailed design @@ -63,7 +63,7 @@ Replace `Entry::get` with the following two methods: ``` /// Ensures a value is in the entry by inserting the default if empty, and returns /// a mutable reference to the value in the entry. - pub fn or_insert(self. 
default: V) -> &'a mut V { + pub fn or_insert(self, default: V) -> &'a mut V { match self { Occupied(entry) => entry.into_mut(), Vacant(entry) => entry.insert(default), @@ -72,7 +72,7 @@ Replace `Entry::get` with the following two methods: /// Ensures a value is in the entry by inserting the result of the default function if empty, /// and returns a mutable reference to the value in the entry. - pub fn or_insert_with V>(self. default: F) -> &'a mut V { + pub fn or_insert_with V>(self, default: F) -> &'a mut V { match self { Occupied(entry) => entry.into_mut(), Vacant(entry) => entry.insert(default()), @@ -107,7 +107,7 @@ usage audited and updated: https://github.com/rust-lang/rust/pull/22930 # Drawbacks -Replaces the composability of just mapping to a Result with more adhoc specialty methods. This +Replaces the composability of just mapping to a Result with more ad hoc specialty methods. This is hardly a drawback for the reasons stated in the RFC. Maybe someone was really leveraging the Result-ness in an exotic way, but it was likely an abuse of the API. Regardless, the `get` method is trivial to write as a consumer of the API. From 427e3e246bbdb6fe683bbb29f58030fe298ffe28 Mon Sep 17 00:00:00 2001 From: mdinger Date: Thu, 26 Mar 2015 15:32:30 -0400 Subject: [PATCH 0207/1195] Fix typos --- text/0246-const-vs-static.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0246-const-vs-static.md b/text/0246-const-vs-static.md index f7b25e3df93..89c4bdc8339 100644 --- a/text/0246-const-vs-static.md +++ b/text/0246-const-vs-static.md @@ -25,7 +25,7 @@ times. There are number of interrelated issues: program. It is even more useful if those constant values do not have a known address, because that means the compiler is free to replicate them as it wishes. Moreover, if a constant is inlined into downstream - crates, than they must be recompiled whenever that constant changes. + crates, then they must be recompiled whenever that constant changes. 
- *Read-only memory:* Whenever possible, we'd like to place large constants into read-only memory. But this means that the data must be truly immutable, or else a segfault will result. @@ -59,7 +59,7 @@ Some concrete problems with this design are: illegal. To resolve this, there is an alternative proposal which makes access to `static mut` be considered safe if the type of the static mut meets the `Sync` trait. -- The signifiance (no pun intended) of the `#[inline(never)]` annotation +- The significance (no pun intended) of the `#[inline(never)]` annotation is not intuitive. - There is no way to have a generic type constant. From 9f2c7f0e9fe5e4ddffc598bfa3c5f84b9bfcf074 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Thu, 26 Mar 2015 16:52:15 -0400 Subject: [PATCH 0208/1195] Do not change semantics of `as`; make wrapping methods inherent methods. --- text/0560-integer-overflow.md | 86 ++++++++++++++++++++--------------- 1 file changed, 49 insertions(+), 37 deletions(-) diff --git a/text/0560-integer-overflow.md b/text/0560-integer-overflow.md index 464d76223ca..adedf371bb8 100644 --- a/text/0560-integer-overflow.md +++ b/text/0560-integer-overflow.md @@ -126,12 +126,9 @@ follows. The intention is that the defined results are the same as the defined results today. The only change is that now a panic may result. - The operations `+`, `-`, `*`, `/`, `%` can underflow and - overflow. Shift operations (`<<`, `>>`) can shift a value of width `N` by more - than `N` bits. In these cases, the result is the same as the pre-existing, - wrapping semantics. -- When truncating, the casting operation `as` can overflow if the - truncated bits contain non-zero values. If no panic occurs, the - result of such an operation is defined to be the same as wrapping. + overflow. +- Shift operations (`<<`, `>>`) can shift a value of width `N` by more + than `N` bits. 
## Enabling overflow checking @@ -146,9 +143,9 @@ The goal of this rule is to ensure that, during debugging and normal development, overflow detection is on, so that users can be alerted to potential overflow (and, in particular, for code where overflow is expected and normal, they will be immediately guided to use the -`WrappingOps` traits introduced below). However, because these checks -will be compiled out whenever an optimized build is produced, final -code wilil not pay a performance penalty. +wrapping methods introduced below). However, because these checks will +be compiled out whenever an optimized build is produced, final code +will not pay a performance penalty. In the future, we may add additional means to control when overflow is checked, such as scoped attributes or a global, independent @@ -168,15 +165,15 @@ coallesced into a single check. Another useful example might be that, when summing a vector, the final overflow check could be deferred until the summation is complete. -## `WrappingOps` trait for explicit wrapping arithmetic +## Methods for explicit wrapping arithmetic For those use cases where explicit wraparound on overflow is required, such as hash functions, we must provide operations with such -semantics. Accomplish this by providing the following trait and impls -in the `std::num` module. +semantics. Accomplish this by providing the following methods defined +in the inherent impls for the various integral types.
```rust -pub trait WrappingOps { +impl i32 { // and i8, i16, i64, isize, u8, u32, u64, usize fn wrapping_add(self, rhs: Self) -> Self; fn wrapping_sub(self, rhs: Self) -> Self; fn wrapping_mul(self, rhs: Self) -> Self; @@ -185,30 +182,7 @@ pub trait WrappingOps { fn wrapping_lshift(self, amount: u32) -> Self; fn wrapping_rshift(self, amount: u32) -> Self; - - fn wrapping_as_u8(self, rhs: Self) -> u8; - fn wrapping_as_u16(self, rhs: Self) -> u16; - fn wrapping_as_u32(self, rhs: Self) -> u32 - fn wrapping_as_u64(self, rhs: Self) -> u64; - fn wrapping_as_usize(self, rhs: Self) -> usize; - - fn wrapping_as_i8(self, rhs: Self) -> i8; - fn wrapping_as_i16(self, rhs: Self) -> i16; - fn wrapping_as_i32(self, rhs: Self) -> i32 - fn wrapping_as_i64(self, rhs: Self) -> i64; - fn wrapping_as_isize(self, rhs: Self) -> isize; } - -impl WrappingOps for isize -impl WrappingOps for usize -impl WrappingOps for i8 -impl WrappingOps for u8 -impl WrappingOps for i16 -impl WrappingOps for u16 -impl WrappingOps for i32 -impl WrappingOps for u32 -impl WrappingOps for i64 -impl WrappingOps for u64 ``` These are implemented to preserve the pre-existing, wrapping semantics @@ -448,6 +422,33 @@ Reasons this was not pursued: Wrong defaults. Doesn't enable distinguishing Reasons this was not pursued: My brain melted. :( +## Making `as` be checked + +The RFC originally specified that using `as` to convert between types +would cause checked semantics. However, we now use `as` as a primitive +type operator. This decision was discussed on the +[discuss message board][as]. + +The key points in favor of reverting `as` to its original semantics +were: + +1. `as` is already a fairly low-level operator that can be used (for + example) to convert between `*mut T` and `*mut U`. +2. 
`as` is the only way to convert types in constants, and hence it is + important that it covers all possibilities that constants might + need (eventually, [const fn][911] or other approaches may change + this, but those are not going to be stable for 1.0). +3. The [type ascription RFC][803] set the precedent that `as` is used + for "dangerous" coercions that require care. +4. Eventually, checked numeric conversions (and perhaps most or all + uses of `as`) can be ergonomically added as methods. The precise + form of this will be resolved in the future. [const fn][911] can + then allow these to be used in constant expressions. + +[as]: http://internals.rust-lang.org/t/on-casts-and-checked-overflow/1710/ +[803]: https://github.com/rust-lang/rfcs/pull/803 +[911]: https://github.com/rust-lang/rfcs/pull/911 + # Unresolved questions The C semantics of wrapping operations in some cases are undefined: @@ -480,7 +481,18 @@ overflow. [CZ22]: https://mail.mozilla.org/pipermail/rust-dev/2014-June/010483.html [JR23_2]: https://mail.mozilla.org/pipermail/rust-dev/2014-June/010527.html -## Acknowledgements and further reading +# Updates since being accepted + +Since it was accepted, the RFC has been updated as follows: + +1. The wrapping methods were moved to be inherent, since we gained the + capability for libstd to declare inherent methods on primitive + integral types. +2. `as` was changed to restore the behavior before the RFC (that is, + it truncates, as a C cast would). + + +# Acknowledgements and further reading This RFC was [initially written by Gábor Lehel][GH] and was since edited by Nicholas Matsakis into its current form. 
Although the text From 19590334a3263c29ec98bebe6f369010314313ea Mon Sep 17 00:00:00 2001 From: Yehuda Katz Date: Thu, 26 Mar 2015 15:07:36 -0700 Subject: [PATCH 0209/1195] Duration Reform RFC --- text/0000-duration-reform.md | 162 +++++++++++++++++++++++++++++++++++ 1 file changed, 162 insertions(+) create mode 100644 text/0000-duration-reform.md diff --git a/text/0000-duration-reform.md b/text/0000-duration-reform.md new file mode 100644 index 00000000000..90ee3e93d5c --- /dev/null +++ b/text/0000-duration-reform.md @@ -0,0 +1,162 @@ +- Feature Name: Duration Reform +- Start Date: 2015-03-24 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +This RFC suggests stabilizing a reduced-scope `Duration` type that is appropriate for interoperating with various system calls that require timeouts. It does not stabilize a large number of conversion methods in `Duration` that have subtle caveats, with the intent of revisiting those conversions more holistically in the future. + +# Motivation + +There are a number of different notions of "time", each of which has a different set of caveats, and each of which can be designed for optimal ergonomics for its domain. This proposal focuses on one particular one: an amount of time in high-precision units. + +Eventually, there are a number of concepts of time that deserve fleshed out APIs. Using the terminology from the popular Java time library [JodaTime][joda-time]: + +* `Duration`: an amount of time, described in terms of a high + precision unit. +* `Period`: an amount of time described in human terms ("5 minutes, + 27 seconds"), and which can only be resolved into a `Duration` + relative to a moment in time. +* `Instant`: a moment in time represented in terms of a `Duration` + since some epoch. 
+ +[joda-time]: http://www.joda.org/joda-time/ + +Human complications such as leap seconds, days in a month, and leap years, and machine complications such as NTP adjustments make these concepts and their full APIs more complicated than they would at first appear. This proposal focuses on fleshing out a design for `Duration` that is sufficient for use as a timeout, leaving the other concepts of time to a future proposal. + +--- + +For the most part, the system APIs that this type is used to communicate with either use `timespec` (`u64` seconds plus `u32` nanos) or take a timeout in milliseconds (`u32` on Windows). + +> For example, [`GetQueuedCompletionStatus`][iocp-ms-example], one of +> the primary APIs in the Windows IOCP API, takes a `dwMilliseconds` +> parameter as a [`DWORD`][msdn-dword], which is a `u32`. Some Windows +> APIs use "ticks" or 100-nanosecond units. + +[iocp-ms-example]: https://msdn.microsoft.com/en-us/library/windows/desktop/aa364986%28v=vs.85%29.aspx +[msdn-dword]: https://msdn.microsoft.com/en-us/library/cc230318.aspx + +In light of that, this proposal has two primary goals: + +* to define a type that can describe portable timeouts for cross- + platform APIs +* to describe what should happen if a large `Duration` is passed into + an API that does not accept timeouts that large + +In general, this proposal considers it acceptable to reduce the granularity of timeouts (eliminating nanosecond granularity if only milliseconds are supported) and to truncate very large timeouts. + +This proposal retains the two fields in the existing `Duration`: + +* a `u64` of seconds +* a `u32` of additional nanosecond precision + +Timeout APIs defined in terms of milliseconds will truncate `Duration`s that are more than `u32::MAX` in milliseconds, and will reduce the granularity of the nanosecond field. + +> A `u32` of milliseconds supports a timeout longer than 45 days. 
+ +Future APIs to support a broader set of [Durations][joda-duration] APIs, a [Period][joda-period] and [Instant][joda-instant] type, as well as coercions between these types, would be useful, compatible follow-ups to this RFC. + +[joda-duration]: http://www.joda.org/joda-time/key_duration.html +[joda-period]: http://www.joda.org/joda-time/key_period.html +[joda-instant]: http://www.joda.org/joda-time/key_instant.html + +# Detailed design + +A `Duration` represents an amount of time with nanosecond granularity. It has `u64` seconds and an additional `u32` nanoseconds. There is no concept of a negative `Duration`. + +> A negative `Duration` has no meaning for many APIs that may wish +> to take a `Duration`, which means that all such APIs would need +> to decide what to do when confronted with a negative `Duration`. +> As a result, this proposal focuses on the predominant use-cases for +> `Duration`, where unsigned types remove a number of caveats and +> ambiguities. + +```rust +pub struct Duration { + secs: u64, + nanos: u32 // may not be more than 1 billion +} + +impl Duration { + /// create a Duration from a number of seconds and an + /// additional nanosecond precision + pub fn new(secs: u64, nanos: u32) -> Duration; + + /// create a Duration from a number of seconds + pub fn from_secs(secs: u64) -> Duration; + + /// create a Duration from a number of milliseconds + pub fn from_millis(millis: u64) -> Duration; + + /// the number of seconds represented by the Duration + pub fn secs(self) -> u64; + + /// the number of additional nanosecond precision + pub fn nanos(self) -> u32; +} +``` + +When `Duration` is used with a system API that expects `u32` milliseconds, the nanosecond precision is dropped, and the time is truncated to `u32::MAX`. + +`Duration` implements: + +* `Add`, `Sub`, `Mul`, `Div` which follow the overflow and underflow + rules for `u64` when applied to the `secs` field.
Nanoseconds + can never exceed 1 billion or be less than 0, and carry into the + `secs` field. +* `Display`, which prints a number of seconds, milliseconds and + nanoseconds (if more than 0). +* `Debug`, `Ord` (and `PartialOrd`), `Eq` (and `PartialEq`), `Copy` + and `Clone`, which are derived. + +This proposal does not, at this time, include mechanisms for instantiating a `Duration` from `weeks`, `days`, `hours` or `minutes`, because there are caveats to each of those units. In particular, the existence of leap seconds means that it is only possible to properly understand them relative to a particular starting point. + +The Joda-Time library in Java explains the problem well [in their documentation][joda-period-confusion]: + +[joda-period-confusion]: http://www.joda.org/joda-time/key_period.html + +> A duration in Joda-Time represents a duration of time measured in milliseconds. The duration is often obtained from an interval. Durations are a very simple concept, and the implementation is also simple. They have no chronology or time zone, **and consist solely of the millisecond duration.** + +> A period in Joda-Time represents a period of time defined in terms of fields, for example, 3 years 5 months 2 days and 7 hours. This differs from a duration in that it is inexact in terms of milliseconds. **A period can only be resolved to an exact number of milliseconds by specifying the instant (including chronology and time zone) it is relative to**. + +In short, this is saying that people expect "23:50:00 + 10 minutes" to equal "00:00:00", but it's impossible to know for sure whether that's true unless you know the exact starting point so you can take leap seconds into consideration. 
+ +In order to address this confusion, Joda-Time's Duration has methods like `standardDays`/`toStandardDays` and `standardHours`/`toStandardHours`, which are meant to indicate to the user that the number of milliseconds is based on the standard number of milliseconds in an hour, rather than the colloquial notion of an "hour". + +An approach like this could work for Rust, but this RFC is intentionally limited in scope to areas without substantial tradeoffs in an attempt to allow a minimal solution to progress more quickly. + +This proposal does not include a method to get a number of milliseconds from a `Duration`, because the number of milliseconds could exceed `u64`, and we would have to decide whether to return an `Option`, panic, or wait for a standard bignum. In the interest of limiting this proposal to APIs with a straight-forward design, this proposal defers such a method. + +# Drawbacks + +The main drawback to this proposal is that it is significantly more minimal than the existing `Duration` API. However, this API is quite sufficient for timeouts, and without the caveats in the existing `Duration` API. + +# Alternatives + +We could stabilize the existing `Duration` API. However, it has a number of serious caveats: + +* The caveats described above about some of the units it supports. +* It supports converting a `Duration` into a number of microseconds or + nanoseconds. Because that cannot be done reliably, those methods + return `Option`s, and APIs that need to convert `Duration` into + nanoseconds have to re-surface the `Option` (unergonomic) or panic. +* More generally, it has a fairly large API surface area, and almost + every method has some caveat that would need to be explored in order + to stabilize it. + +--- + +We could also include a number of convenience APIs that convert from other units into `Duration`s. This proposal assumes that some of those conveniences will eventually be added. 
However, the design of each of those conveniences is ambiguous, so they are not included in this initial proposal. + +--- + +Finally, we could avoid any API for timeouts, and simply take milliseconds throughout the standard library. However, this has two drawbacks. + +First, it does not allow us to represent higher-precision timeouts on systems that could support them. + +Second, while this proposal does not yet include conveniences, it assumes that some conveniences should be added in the future once the design space is more fully explored. Starting with a simple type gives us space to grow into. + +# Unresolved questions + +* Should we implement all of the listed traits? Others? From fc47393ccd8abfe382c94a6b076700ba6091f05f Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 26 Mar 2015 15:46:34 -0700 Subject: [PATCH 0210/1195] Clarify that exit(0) is ok --- text/0000-process-exit.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-process-exit.md b/text/0000-process-exit.md index bdce352025e..2e078f302ab 100644 --- a/text/0000-process-exit.md +++ b/text/0000-process-exit.md @@ -21,8 +21,8 @@ calls `env::set_exit_status`, then the process is not guaranteed to exit with that status (e.g. Rust was called from C). The purpose of this RFC is to provide at least one method on the path to -stabilization which will provide a method to exit a process with a nonzero exit -code. +stabilization which will provide a method to exit a process with an arbitrary +exit code. # Detailed design From 5ba01a5cc2911f3bda07ad61bec4473bdc6684c5 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 27 Mar 2015 18:01:25 -0400 Subject: [PATCH 0211/1195] Initial draft. 
--- text/0000-rebalancing-coherence.md | 293 +++++++++++++++++++++++++++++ 1 file changed, 293 insertions(+) create mode 100644 text/0000-rebalancing-coherence.md diff --git a/text/0000-rebalancing-coherence.md b/text/0000-rebalancing-coherence.md new file mode 100644 index 00000000000..9f6c6b09553 --- /dev/null +++ b/text/0000-rebalancing-coherence.md @@ -0,0 +1,293 @@ +- Feature Name: fundamental_attribute +- Start Date: 2015-03-27 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +## Summary + +This RFC proposes two rule changes: + +1. Modify the orphan rules so that impls of remote traits require a + local type that is either a struct/enum/trait defined in the + current crate `LT = LocalTypeConstructor<...>` or a reference to a + local type `LT = ... | &LT | &mut LT`. +2. Restrict negative reasoning so it too obeys the orphan rules. +3. Introduce an unstable `#[fundamental]` attribute that can be used + to extend the above rules in select cases (details below). + +## Motivation + +The current orphan rules are oriented around allowing as many remote +traits as possible. As so often happens, giving power to one party (in +this case, downstream crates) turns out to be taking power away from +another (in this case, upstream crates). The problem is that due to +coherence, the ability to define impls is a zero-sum game: every impl +that is legal to add in a child crate is also an impl that a parent +crate cannot add without fear of breaking downstream crates. A +detailed look at these problems is +[presented here](https://gist.github.com/nikomatsakis/bbe6821b9e79dd3eb477); +this RFC doesn't go over the problems in detail, but will reproduce +some of the examples found in that document. + +This RFC proposes a shift that attempts to strike a balance between +the needs of downstream and upstream crates.
In particular, we wish to +preserve the ability of upstream crates to add impls to traits that +they define, while still allowing downstream crates to define the +sorts of impls they need. + +While exploring the problem, we found that in practice remote impls +almost always are tied to a local type or a reference to a local +type. For example, here are some impls from the definition of `Vec`: + +```rust +// tied to Vec<T> +impl<T> Send for Vec<T> + where T: Send + +// tied to &Vec<T> +impl<'a,T> IntoIterator for &'a Vec<T> +``` + +On this basis, we propose that we limit remote impls to require that +they include a type either defined in the current crate or a reference +to a type defined in the current crate. This is more restrictive than +the current definition, which merely requires a local type appear +*somewhere*. So, for example, under this definition `MyType` and +`&MyType` would be considered local, but `Box<MyType>`, +`Option<MyType>`, and `(MyType, i32)` would not. + +Furthermore, we limit the use of *negative reasoning* to obey the +orphan rules. That is, just as a crate cannot define an impl `Type: +Trait` unless `Type` or `Trait` is local, it cannot rely on `Type: +!Trait` holding unless `Type` or `Trait` is local. + +Together, these two changes cause very little code breakage while +retaining a lot of freedom to add impls in a backwards compatible +fashion. However, they are not quite sufficient to compile all the +most popular cargo crates (though they almost succeed). Therefore, we +propose a simple, unstable attribute `#[fundamental]` (described +below) that can be used to extend the system to accommodate some +additional patterns and types. This attribute is unstable because it +is not clear whether it will prove to be adequate or need to be +generalized; this part of the design can be considered somewhat +incomplete, and we expect to finalize it based on what we observe +after the 1.0 release.
+
+### Practical effect
+
+#### Effect on parent crates
+
+When you first define a trait, you must also decide whether that trait
+should have (a) a blanket impl for all `T` and (b) any blanket impls
+over references. These blanket impls cannot be added later without a
+major version bump, for fear of breaking downstream clients.
+
+Here are some examples of the kinds of blanket impls that must be added
+right away:
+
+```rust
+impl<T> Bar for T { }
+impl<'a,T:Bar> Bar for &'a T { }
+```
+
+#### Effect on child crates
+
+Under the base rules, child crates are limited to impls that use local
+types or references to local types. They are also prevented from
+relying on the fact that `Type: !Trait` unless either `Type` or
+`Trait` is local. This turns out to have very little impact.
+
+In compiling the libstd facade and librustc, exactly two impls were
+found to be illegal, both of which followed the same pattern:
+
+```rust
+struct LinkedListEntry<'a> {
+    data: i32,
+    next: Option<&'a LinkedListEntry<'a>>
+}
+
+impl<'a> Iterator for Option<&'a LinkedListEntry<'a>> {
+    type Item = i32;
+
+    fn next(&mut self) -> Option<i32> {
+        if let Some(ptr) = *self {
+            // `ptr.next` is already an `Option`, so assign it directly.
+            *self = ptr.next;
+            Some(ptr.data)
+        } else {
+            None
+        }
+    }
+}
+```
+
+The problem here is that `Option<&LinkedListEntry>` is no longer
+considered a local type. A similar restriction would be that one
+cannot define an impl over `Box`; but this was not
+observed in practice.
+
+Both of these restrictions can be overcome by using a new type.
+For example, the code above could be changed so that instead of writing
+the impl for `Option<&LinkedListEntry>`, we define a type `LinkedList`
+that wraps the option and implement the trait on that:
+
+```rust
+struct LinkedListEntry<'a> {
+    data: i32,
+    next: LinkedList<'a>
+}
+
+struct LinkedList<'a> {
+    data: Option<&'a LinkedListEntry<'a>>
+}
+
+impl<'a> Iterator for LinkedList<'a> {
+    type Item = i32;
+
+    fn next(&mut self) -> Option<i32> {
+        if let Some(ptr) = self.data {
+            // Advance to the wrapped option of the next entry.
+            self.data = ptr.next.data;
+            Some(ptr.data)
+        } else {
+            None
+        }
+    }
+}
+```
+
+#### Errors from cargo and the fundamental attribute
+
+We also applied our prototype to all the "Most Downloaded" cargo
+crates as well as the `iron` crate. That exercise uncovered a few
+patterns that the simple rules presented thus far can't handle.
+
+The first is that it is common to implement traits over boxed trait
+objects. For example, the `error` crate defines an impl:
+
+- `impl<E: Error> FromError<E> for Box<Error>`
+
+Here, `Error` is a local trait defined in `error`, but `FromError` is
+the trait from `libstd`. This impl would be illegal because
+`Box<Error>` is not considered local as `Box` is not local.
+
+The second is that it is common to use `FnMut` in blanket impls,
+similar to how the `Pattern` trait in `libstd` works. The `regex` crate
+in particular has the following impls:
+
+- `impl<'t> Replacer for &'t str`
+- `impl<F> Replacer for F where F: FnMut(&Captures) -> String`
+- these are in conflict because this requires that `&str: !FnMut`, and
+  neither `&str` nor `FnMut` are local to `regex`
+
+Given that overloading over closures is likely to be a common request,
+and that the `Fn` traits are well-known, core traits tied to the call
+operator, it seems reasonable to say that implementing a `Fn` trait is
+itself a breaking change.
+(This is not to suggest that there is
+something *fundamental* about the `Fn` traits that distinguishes them
+from all other traits; just that if the goal is to have rules that
+users can easily remember, saying that implementing a core operator
+trait is a breaking change may be a reasonable rule, and it enables
+useful patterns to boot -- patterns that are baked into the libstd
+APIs.)
+
+To accommodate these cases (and future cases we will no doubt
+encounter), this RFC proposes an unstable attribute
+`#[fundamental]`. `#[fundamental]` can be applied to types and traits
+with the following meaning:
+
+- A `#[fundamental]` type `Foo` is one where implementing a blanket
+  impl over `Foo` is a breaking change. As described, `&` and `&mut` are
+  fundamental. This attribute would be applied to `Box`, making `Box`
+  behave the same as `&` and `&mut` with respect to coherence.
+- A `#[fundamental]` trait `Foo` is one where adding an impl of `Foo`
+  for an existing type is a breaking change. For now, the `Fn` traits
+  and `Sized` would be marked fundamental, though we may want to
+  extend this set to all operators or some other
+  more-easily-remembered set.
+
+The `#[fundamental]` attribute is intended to be a kind of "minimal
+commitment" that still permits the most important impl patterns we see
+in the wild. Because it is unstable, it can only be used within libstd
+for now. We are eventually committed to finding some way to
+accommodate the patterns above -- which could be as simple as
+stabilizing `#[fundamental]` (or, indeed, reverting this RFC
+altogether). It could also be a more general mechanism that lets users
+specify more precisely what kind of impls are reserved for future
+expansion and which are not.
+
+## Detailed Design
+
+### Proposed orphan rules
+
+Given an impl `impl<P1...Pn> Trait<T1...Tn> for T0`, either `Trait`
+must be local to the current crate, or:
+
+1. At least one type must meet the `LT` pattern defined above. Let
+   `Ti` be the first such type.
+2.
+   No type parameters `P1...Pn` may appear in the type parameters that
+   precede `Ti` (that is, `Tj` where `j < i`).
+
+### Type locality and negative reasoning
+
+Currently the overlap check employs negative reasoning to segregate
+blanket impls from other impls. For example, the following pair of
+impls would be legal only if `MyType<U>: !Copy` for all `U` (the
+notation `Type: !Trait` is borrowed from [RFC 586][586]):
+
+```rust
+impl<T:Copy> Clone for T {..}
+impl<U> Clone for MyType<U> {..}
+```
+
+[586]: https://github.com/rust-lang/rfcs/pull/586
+
+This proposal places limits on negative reasoning based on the orphan
+rules. Specifically, we cannot conclude that a proposition like `T0:
+!Trait<T1..Tn>` holds unless `T0: Trait<T1..Tn>` meets the orphan
+rules as defined in the previous section.
+
+In practice this means that, by default, you can only assume negative
+things about traits and types defined in your current crate, since
+those are under your direct control. This permits parent crates to add
+any impls except for blanket impls over `T`, `&T`, or `&mut T`, as
+discussed before.
+
+### Effect on ABI compatibility and semver
+
+We have not yet proposed a comprehensive semver RFC (it's
+coming). However, this RFC has some effect on what that RFC would say.
+As discussed above, it is a breaking change to add a blanket impl
+for a `#[fundamental]` type. It is also a breaking change to add an
+impl of a `#[fundamental]` trait to an existing type.
+
+# Drawbacks
+
+The primary drawback is that downstream crates cannot write an impl
+over types other than references, such as `Option<LocalType>`. This
+can be overcome by defining wrapper structs (new types), but that can
+be annoying.
+
+# Alternatives
+
+- **Status quo.** In the status quo, the balance of power is heavily
+  tilted towards child crates. Parent crates basically cannot add any
+  impl for an existing trait to an existing type without potentially
+  breaking child crates.
+ +- **Take a hard line.** We could forego the `#[fundamental]` attribute, but + it would force people to forego `Box` impls as well as the + useful closure-overloading pattern. This seems + unfortunate. Moreover, it seems likely we will encounter further + examples of "reasonable cases" that `#[fundamental]` can easily + accommodate. + +- **Specializations, negative impls, and contracts.** The gist + referenced earlier includes [a section][c] covering various + alternatives that I explored which came up short. These include + specialization, explicit negative impls, and explicit contracts + between the trait definer and the trait consumer. + +# Unresolved questions + +None. + +[c]: https://gist.github.com/nikomatsakis/bbe6821b9e79dd3eb477#file-c-md From 0f89c4a86d1b9007b50c2b659b6a062fe137d4f9 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 31 Mar 2015 15:34:52 -0700 Subject: [PATCH 0212/1195] RFC 979 is Align splitn with other languages --- README.md | 1 + ...languages.md => 0979-align-splitn-with-other-languages.md} | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) rename text/{0000-align-splitn-with-other-languages.md => 0979-align-splitn-with-other-languages.md} (96%) diff --git a/README.md b/README.md index 95154240c62..c54ae381e90 100644 --- a/README.md +++ b/README.md @@ -44,6 +44,7 @@ the direction the language is evolving in. 
* [0809-box-and-in-for-stdlib.md](text/0809-box-and-in-for-stdlib.md) * [0909-move-thread-local-to-std-thread.md](text/0909-move-thread-local-to-std-thread.md) * [0968-closure-return-type-syntax.md](text/0968-closure-return-type-syntax.md) +* [0979-align-splitn-with-other-languages.md](text/0979-align-splitn-with-other-languages.md) ## Table of Contents [Table of Contents]: #table-of-contents diff --git a/text/0000-align-splitn-with-other-languages.md b/text/0979-align-splitn-with-other-languages.md similarity index 96% rename from text/0000-align-splitn-with-other-languages.md rename to text/0979-align-splitn-with-other-languages.md index 8c19956f61b..2397d833203 100644 --- a/text/0000-align-splitn-with-other-languages.md +++ b/text/0979-align-splitn-with-other-languages.md @@ -1,7 +1,7 @@ - Feature Name: n/a - Start Date: 2015-03-15 -- RFC PR: -- Rust Issue: +- RFC PR: https://github.com/rust-lang/rfcs/pull/979 +- Rust Issue: https://github.com/rust-lang/rust/issues/23911 # Summary From d714fe2c7d512faafadacb66c580c189edf1f74b Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 31 Mar 2015 15:40:09 -0700 Subject: [PATCH 0213/1195] RFC 1011 is process::exit --- README.md | 1 + text/{0000-process-exit.md => 1011-process.exit.md} | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) rename text/{0000-process-exit.md => 1011-process.exit.md} (98%) diff --git a/README.md b/README.md index c54ae381e90..7f46b6f3aa9 100644 --- a/README.md +++ b/README.md @@ -45,6 +45,7 @@ the direction the language is evolving in. 
* [0909-move-thread-local-to-std-thread.md](text/0909-move-thread-local-to-std-thread.md) * [0968-closure-return-type-syntax.md](text/0968-closure-return-type-syntax.md) * [0979-align-splitn-with-other-languages.md](text/0979-align-splitn-with-other-languages.md) +* [1011-process.exit.md](text/1011-process.exit.md) ## Table of Contents [Table of Contents]: #table-of-contents diff --git a/text/0000-process-exit.md b/text/1011-process.exit.md similarity index 98% rename from text/0000-process-exit.md rename to text/1011-process.exit.md index 2e078f302ab..e38e2bfdf90 100644 --- a/text/0000-process-exit.md +++ b/text/1011-process.exit.md @@ -1,6 +1,6 @@ - Feature Name: exit - Start Date: 2015-03-24 -- RFC PR: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1011 - Rust Issue: (leave this empty) # Summary From 8979e1ee8e7af877c24ef409ca71070df51213ba Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 31 Mar 2015 16:03:50 -0700 Subject: [PATCH 0214/1195] RFC 1023 is Rebalancing Coherence --- README.md | 1 + ...oherence.md => 1023-rebalancing-coherence.md} | 16 ++++++++-------- 2 files changed, 9 insertions(+), 8 deletions(-) rename text/{0000-rebalancing-coherence.md => 1023-rebalancing-coherence.md} (98%) diff --git a/README.md b/README.md index 7f46b6f3aa9..58c59ebd789 100644 --- a/README.md +++ b/README.md @@ -46,6 +46,7 @@ the direction the language is evolving in. 
* [0968-closure-return-type-syntax.md](text/0968-closure-return-type-syntax.md) * [0979-align-splitn-with-other-languages.md](text/0979-align-splitn-with-other-languages.md) * [1011-process.exit.md](text/1011-process.exit.md) +* [1023-rebalancing-coherence.md](text/1023-rebalancing-coherence.md) ## Table of Contents [Table of Contents]: #table-of-contents diff --git a/text/0000-rebalancing-coherence.md b/text/1023-rebalancing-coherence.md similarity index 98% rename from text/0000-rebalancing-coherence.md rename to text/1023-rebalancing-coherence.md index 9f6c6b09553..9e25997be0b 100644 --- a/text/0000-rebalancing-coherence.md +++ b/text/1023-rebalancing-coherence.md @@ -1,7 +1,7 @@ - Feature Name: fundamental_attribute - Start Date: 2015-03-27 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1023 +- Rust Issue: https://github.com/rust-lang/rust/issues/23918 ## Summary @@ -14,7 +14,7 @@ This RFC proposes two rule changes: 2. Restrict negative reasoning so it too obeys the orphan rules. 3. Introduce an unstable `#[fundamental]` attribute that can be used to extend the above rules in select cases (details below). - + ## Motivation The current orphan rules are oriented around allowing as many remote @@ -47,7 +47,7 @@ impl Send for Vec // tied to &Vec impl<'a,T> IntoIterator for &'a Vec ``` - + On this basis, we propose that we limit remote impls to require that they include a type either defined in the current crate or a reference to a type defined in the current crate. This is more restrictive than @@ -142,7 +142,7 @@ struct LinkedList<'a> { impl<'a> Iterator for LinkedList<'a> { type Item = i32; - + fn next(&mut self) -> Option { if let Some(ptr) = self.data { *self = Some(ptr.next); @@ -225,7 +225,7 @@ must be local to the current crate, or: `Ti` be the first such type. 2. No type parameters `P1...Pn` may appear in the type parameters that precede `Ti` (that is, `Tj` where `j < i`). 
- + ### Type locality and negative reasoning Currently the overlap check employs negative reasoning to segregate @@ -279,13 +279,13 @@ be annoying. unfortunate. Moreover, it seems likely we will encounter further examples of "reasonable cases" that `#[fundamental]` can easily accommodate. - + - **Specializations, negative impls, and contracts.** The gist referenced earlier includes [a section][c] covering various alternatives that I explored which came up short. These include specialization, explicit negative impls, and explicit contracts between the trait definer and the trait consumer. - + # Unresolved questions None. From e6715f0828b1da705675823a3d311607fe7d2ab0 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 1 Apr 2015 11:19:05 -0700 Subject: [PATCH 0215/1195] Update tracking issue for RFC 1023 --- text/1023-rebalancing-coherence.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/1023-rebalancing-coherence.md b/text/1023-rebalancing-coherence.md index 9e25997be0b..fc3ff424fe2 100644 --- a/text/1023-rebalancing-coherence.md +++ b/text/1023-rebalancing-coherence.md @@ -1,7 +1,7 @@ -- Feature Name: fundamental_attribute +- Feature Name: `fundamental_attribute` - Start Date: 2015-03-27 -- RFC PR: https://github.com/rust-lang/rfcs/pull/1023 -- Rust Issue: https://github.com/rust-lang/rust/issues/23918 +- RFC PR: [rust-lang/rfcs#1023](https://github.com/rust-lang/rfcs/pull/1023) +- Rust Issue: [rust-lang/rust#23086](https://github.com/rust-lang/rust/issues/23086) ## Summary From 8e1f46bbb43c665cbe5ee99ac133399001db970a Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 3 Apr 2015 10:55:08 -0700 Subject: [PATCH 0216/1195] RFC for pre-1.0 prelude additions --- text/0000-prelude-additions.md | 56 ++++++++++++++++++++++++++++++++++ 1 file changed, 56 insertions(+) create mode 100644 text/0000-prelude-additions.md diff --git a/text/0000-prelude-additions.md b/text/0000-prelude-additions.md new file mode 100644 index 
00000000000..33caaa9a0a8
--- /dev/null
+++ b/text/0000-prelude-additions.md
@@ -0,0 +1,56 @@
+- Feature Name: NA
+- Start Date: 2015-04-03
+- RFC PR: (leave this empty)
+- Rust Issue: (leave this empty)
+
+# Summary
+
+Add the `Default`, `IntoIterator` and `ToOwned` traits to the prelude.
+
+# Motivation
+
+Each trait has a distinct motivation:
+
+* For `Default`, the ergonomics have vastly improved now that you can
+  write `MyType::default()` (thanks to UFCS). Thanks to this
+  improvement, it now makes more sense to promote widespread use of
+  the trait.
+
+* For `IntoIterator`, promoting to the prelude will make it feasible
+  to deprecate the inherent `into_iter` methods and directly-exported
+  iterator types, in favor of the trait (which is currently redundant).
+
+* For `ToOwned`, promoting to the prelude would add a uniform,
+  idiomatic way to acquire an owned copy of data (including going from
+  `str` to `String`, for which `Clone` does not work).
+
+# Detailed design
+
+* Add the `Default`, `IntoIterator` and `ToOwned` traits to the prelude.
+
+* Deprecate inherent `into_iter` methods.
+
+* Ultimately deprecate module-level `IntoIter` types (e.g. in `vec`);
+  this may want to wait until you can write `Vec<T>::IntoIter` rather
+  than `<Vec<T> as IntoIterator>::IntoIter`.
+
+# Drawbacks
+
+The main downside is that prelude entries eat up some amount of
+namespace (particularly, method namespace). However, these are all
+important, core traits in `std`, meaning that the method names are
+already quite unlikely to be used.
+
+Strictly speaking, a prelude addition is a breaking change, but as
+above, this is highly unlikely to cause actual breakage. In any case,
+it can be landed prior to 1.0.
+
+# Alternatives
+
+None.
+
+# Unresolved questions
+
+The exact timeline of deprecation for `IntoIter` types.
+
+Are there other traits or types that should be promoted before 1.0?
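As an illustration of what the prelude additions above buy in practice, the following sketch calls all three traits without any `use` line. It assumes a toolchain where `Default`, `IntoIterator` and `ToOwned` are already in the prelude; the names `Config`, `default_config`, `owned_name` and `total` are hypothetical.

```rust
// No `use std::default::Default;` and no `use std::borrow::ToOwned;`:
// the traits are assumed to be reachable through the prelude.

#[derive(Debug, Default, PartialEq)]
pub struct Config {
    pub name: String,
    pub retries: u32,
}

// `Default`: the `MyType::default()` call style mentioned in the motivation.
pub fn default_config() -> Config {
    Config::default()
}

// `ToOwned`: going from `str` to `String`, which `Clone` cannot do.
pub fn owned_name(name: &str) -> String {
    name.to_owned()
}

// `IntoIterator`: accept anything that can be turned into an iterator of `i32`.
pub fn total<I>(values: I) -> i32
where
    I: IntoIterator<Item = i32>,
{
    values.into_iter().sum()
}
```

Because `total` is bounded on the trait rather than on a concrete iterator type, it accepts a `Vec<i32>`, an array, or a range alike, which is the kind of use the deprecation of inherent `into_iter` methods is meant to encourage.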
From 38024b408dcee646be63b0a0cced408ba2901d2b Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 6 Apr 2015 10:24:56 -0400 Subject: [PATCH 0217/1195] Update language as requested by @brson, add a history section, and add links into README.md --- README.md | 1 + text/{0000-const-fn.md => 0911-const-fn.md} | 24 +++++++++++++++------ 2 files changed, 19 insertions(+), 6 deletions(-) rename text/{0000-const-fn.md => 0911-const-fn.md} (91%) diff --git a/README.md b/README.md index 58c59ebd789..53eedd22a4c 100644 --- a/README.md +++ b/README.md @@ -43,6 +43,7 @@ the direction the language is evolving in. * [0803-type-ascription.md](text/0803-type-ascription.md) * [0809-box-and-in-for-stdlib.md](text/0809-box-and-in-for-stdlib.md) * [0909-move-thread-local-to-std-thread.md](text/0909-move-thread-local-to-std-thread.md) +* [0911-const-fn.md](text/0911-const-fn.md) * [0968-closure-return-type-syntax.md](text/0968-closure-return-type-syntax.md) * [0979-align-splitn-with-other-languages.md](text/0979-align-splitn-with-other-languages.md) * [1011-process.exit.md](text/1011-process.exit.md) diff --git a/text/0000-const-fn.md b/text/0911-const-fn.md similarity index 91% rename from text/0000-const-fn.md rename to text/0911-const-fn.md index 5414625abf8..34fa656696c 100644 --- a/text/0000-const-fn.md +++ b/text/0911-const-fn.md @@ -13,6 +13,7 @@ called in constants contexts, with constant arguments. As it is right now, `UnsafeCell` is a stabilization and safety hazard: the field it is supposed to be wrapping is public. This is only done out of the necessity to initialize static items containing atomics, mutexes, etc. - for example: + ```rust #[lang="unsafe_cell"] struct UnsafeCell { pub value: T } @@ -46,9 +47,17 @@ for such features. The design should be as simple as it can be, while keeping enough functionality to solve the issues mentioned above. -The intention is to have something usable at 1.0 without limiting what we can -in the future. 
Compile-time pure constants (the existing `const` items) with -added parametrization over types and values (arguments) should suffice. + +The intention of this RFC is to introduce a minimal change that +enables safe abstraction resembling the kind of code that one writes +outside of a constant. Compile-time pure constants (the existing +`const` items) with added parametrization over types and values +(arguments) should suffice. + +This RFC explicitly does not introduce a general CTFE mechanism. In +particular, conditional branching and virtual dispatch are still not +supported in constant expressions, which imposes a severe limitation +on what one can express. # Detailed design @@ -171,9 +180,6 @@ after 1.0. # Alternatives -* Not do anything for 1.0. This would result in some APIs being crippled and -serious backwards compatibility issues - `UnsafeCell`'s `value` field cannot -simply be removed later. * While not an alternative, but rather a potential extension, I want to point out there is only way I could make `const fn`s work with traits (in an untested design, that is): qualify trait implementations and bounds with `const`. @@ -214,3 +220,9 @@ algorithm that can handle *at least* tail recursion. Also, there is no way to actually write a recursive `const fn` at this moment, because no control flow primitives are implemented for constants, but that cannot be taken for granted, at least `if`/`else` should eventually work. + +# History + +- This RFC was accepted on 2015-04-06. The primary concerns raised in + the discussion concerned CTFE, and whether the `const fn` strategy + locks us into an undesirable plan there. From b9225751ebe57143fd0d628ac3585a7569f83a31 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 6 Apr 2015 10:26:11 -0400 Subject: [PATCH 0218/1195] Add links. 
--- text/0911-const-fn.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0911-const-fn.md b/text/0911-const-fn.md index 34fa656696c..38dc58809ee 100644 --- a/text/0911-const-fn.md +++ b/text/0911-const-fn.md @@ -1,7 +1,7 @@ - Feature Name: const_fn - Start Date: 2015-02-25 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#911](https://github.com/rust-lang/rfcs/pull/911) +- Rust Issue: [rust-lang/rust#24111](https://github.com/rust-lang/rust/issues/24111) # Summary From fe16700f941b7776236e50255e715768e8dc2b04 Mon Sep 17 00:00:00 2001 From: Brian Anderson Date: Mon, 6 Apr 2015 10:43:44 -0700 Subject: [PATCH 0219/1195] Accepted ompiler fences RFC --- README.md | 1 + ...-fence-intrinsics.md => 0888-compiler-fence-intrinsics.md} | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) rename text/{0000-compiler-fence-intrinsics.md => 0888-compiler-fence-intrinsics.md} (94%) diff --git a/README.md b/README.md index 53eedd22a4c..54a40128530 100644 --- a/README.md +++ b/README.md @@ -42,6 +42,7 @@ the direction the language is evolving in. 
* [0769-sound-generic-drop.md](text/0769-sound-generic-drop.md) * [0803-type-ascription.md](text/0803-type-ascription.md) * [0809-box-and-in-for-stdlib.md](text/0809-box-and-in-for-stdlib.md) +* [0888-compiler-fences.md](text/0888-compiler-fences.md) * [0909-move-thread-local-to-std-thread.md](text/0909-move-thread-local-to-std-thread.md) * [0911-const-fn.md](text/0911-const-fn.md) * [0968-closure-return-type-syntax.md](text/0968-closure-return-type-syntax.md) diff --git a/text/0000-compiler-fence-intrinsics.md b/text/0888-compiler-fence-intrinsics.md similarity index 94% rename from text/0000-compiler-fence-intrinsics.md rename to text/0888-compiler-fence-intrinsics.md index 9b5b3b5887d..9cb399c576f 100644 --- a/text/0000-compiler-fence-intrinsics.md +++ b/text/0888-compiler-fence-intrinsics.md @@ -1,7 +1,7 @@ - Feature Name: compiler_fence_intrinsics - Start Date: 2015-02-19 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#888](https://github.com/rust-lang/rfcs/pull/888) +- Rust Issue: [rust-lang/rust#24118](https://github.com/rust-lang/rust/issues/24118) # Summary From d46a957a7605238fe91455efdb68a8534b24cdb6 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 6 Apr 2015 10:55:46 -0700 Subject: [PATCH 0220/1195] Move small-base-lexing out of 0000 --- text/{0000-small-base-lexing.md => 0879-small-base-lexing.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-small-base-lexing.md => 0879-small-base-lexing.md} (94%) diff --git a/text/0000-small-base-lexing.md b/text/0879-small-base-lexing.md similarity index 94% rename from text/0000-small-base-lexing.md rename to text/0879-small-base-lexing.md index 2936b7e6b79..347047d603e 100644 --- a/text/0000-small-base-lexing.md +++ b/text/0879-small-base-lexing.md @@ -1,7 +1,7 @@ - Feature Name: stable, it only restricts the language - Start Date: 2015-02-17 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: 
[rust-lang/rfcs#879](https://github.com/rust-lang/rfcs/pull/879) +- Rust Issue: [rust-lang/rust#23872](https://github.com/rust-lang/rust/pull/23872) # Summary From f3153e669c807e844f82bf361fdb6dd106eb8d49 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 6 Apr 2015 10:53:52 -0700 Subject: [PATCH 0221/1195] RFC: Expand the scope of `std::fs` Expand the scope of the `std::fs` module by enhancing existing functionality, exposing lower level representations, and adding a few more new functions. --- text/0000-io-fs-2.1.md | 394 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 394 insertions(+) create mode 100644 text/0000-io-fs-2.1.md diff --git a/text/0000-io-fs-2.1.md b/text/0000-io-fs-2.1.md new file mode 100644 index 00000000000..34e50f37031 --- /dev/null +++ b/text/0000-io-fs-2.1.md @@ -0,0 +1,394 @@ +- Feature Name: `fs2` +- Start Date: 2015-04-04 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Expand the scope of the `std::fs` module by enhancing existing functionality, +exposing lower level representations, and adding a few more new functions. + +# Motivation + +The current `std::fs` module serves many of the basic needs of interacting with +a filesystem, but it falls short of binding a good bit more of useful +functionality. For example, none of these operations are possible in stable Rust +today: + +* Inspecting a file's modification/access times +* Reading low-level information like that contained in `libc::stat` +* Inspecting the unix permission bits on a file +* Blanket setting the unix permission bits on a file +* Leveraging `DirEntry` for the extra metadata it might contain +* Testing whether two paths are equivalent (point to the same file) +* Reading the metadata of a soft link (not what it points at) +* Resolving all soft links in a path + +There is some more functionality listed in the [RFC issue][issue], but this RFC +will not attempt to solve the entirety of that issue at this time. 
+This RFC
+strives to expose APIs for much of the functionality listed above that is on the
+track to becoming `#[stable]` soon.
+
+[issue]: https://github.com/rust-lang/rfcs/issues/939
+
+## Non-goals of this RFC
+
+There are a few areas of the `std::fs` API surface which are **not** considered
+goals for this RFC. It will be left for future RFCs to add new APIs for these
+areas:
+
+* Enhancing `copy` to copy directories recursively or configuring how copying
+  happens.
+* Enhancing or stabilizing `walk` and its functionality.
+* Temporary files or directories
+
+# Detailed design
+
+## Lowering `Metadata`
+
+Currently the `Metadata` structure exposes very few pieces of information about
+a file. Some of this is because the information is not available across all
+platforms, but some of it is also because the standard library does not have the
+appropriate abstraction to return at this time (e.g. time stamps). The raw
+contents of `Metadata`, however, should be accessible no matter what.
+
+The following trait hierarchy and new structures will be added to the standard
+library.
+
+```rust
+mod os::windows::fs {
+    pub trait MetadataExt {
+        fn file_attributes(&self) -> u32; // `dwFileAttributes` field
+        fn creation_time(&self) -> u64; // `ftCreationTime` field
+        fn last_access_time(&self) -> u64; // `ftLastAccessTime` field
+        fn last_write_time(&self) -> u64; // `ftLastWriteTime` field
+        fn file_size(&self) -> u64; // `nFileSizeHigh`/`nFileSizeLow` fields
+    }
+    impl MetadataExt for fs::Metadata { ... }
+}
+
+mod os::unix::fs {
+    pub trait MetadataExt {
+        fn as_raw(&self) -> &RawMetadata;
+    }
+    impl MetadataExt for fs::Metadata { ...
+    }
+
+    pub struct RawMetadata(libc::stat);
+    impl RawMetadata {
+        // Accessors for fields available in `libc::stat` for *all* platforms
+        fn dev(&self) -> libc::dev_t; // st_dev field
+        fn ino(&self) -> libc::ino_t; // st_ino field
+        fn raw_mode(&self) -> libc::mode_t; // st_mode field
+        fn nlink(&self) -> libc::nlink_t; // st_nlink field
+        fn uid(&self) -> libc::uid_t; // st_uid field
+        fn gid(&self) -> libc::gid_t; // st_gid field
+        fn rdev(&self) -> libc::dev_t; // st_rdev field
+        fn size(&self) -> libc::off_t; // st_size field
+        fn blksize(&self) -> libc::blksize_t; // st_blksize field
+        fn blocks(&self) -> libc::blkcnt_t; // st_blocks field
+        fn atime(&self) -> (i64, i32); // st_atime field, (sec, nsec)
+        fn mtime(&self) -> (i64, i32); // st_mtime field, (sec, nsec)
+        fn ctime(&self) -> (i64, i32); // st_ctime field, (sec, nsec)
+    }
+}
+
+// st_flags, st_gen, st_lspare, st_birthtim, st_qspare
+mod os::{linux, macos, freebsd, ...}::fs {
+    pub struct stat { /* same public fields as libc::stat */ }
+    pub trait MetadataExt {
+        fn as_raw_stat(&self) -> &stat;
+    }
+    impl MetadataExt for os::unix::fs::RawMetadata { ... }
+    impl MetadataExt for fs::Metadata { ... }
+}
+```
+
+The goal of this hierarchy is to expose all of the information in the OS-level
+metadata in as cross-platform of a method as possible while adhering to the
+design principles of the standard library.
+
+The interesting part about working in a "cross platform" manner here is that the
+makeup of `libc::stat` on unix platforms can vary quite a bit between platforms.
+For example, some platforms have a `st_birthtim` field while others do not.
+To enable as much ergonomic usage as possible, the `os::unix` module will expose
+the *intersection* of metadata available in `libc::stat` across all unix
+platforms. The information is still exposed in a raw fashion (in terms of the
+values returned), but methods are required as the raw structure is not exposed.
+The unix platforms then leverage the more fine-grained modules in `std::os`
+(e.g. `linux` and `macos`) to return the raw `libc::stat` structure. This will
+allow full access to the information in `libc::stat` in all platforms with clear
+opt-in to when you're using platform-specific information.
+
+One of the major goals of the `os::unix::fs` design is to enable as much
+functionality as possible when programming against "unix in general" while still
+allowing applications to choose to only program against macos, for example.
+
+### Fate of `Metadata::{accessed, modified}`
+
+At this time there is no suitable type in the standard library to represent the
+return type of these two functions. The type would either have to be some form
+of time stamp or moment in time, both of which are difficult abstractions to add
+lightly.
+
+Consequently, both of these functions will be **deprecated** in favor of
+requiring platform-specific code to access the modification/access time of
+files. This information is all available via the `MetadataExt` traits listed
+above.
+
+## Lowering and setting `Permissions`
+
+> **Note**: this section only describes behavior on unix.
+
+Currently there is no stable method of inspecting the permission bits on a file,
+and it is unclear whether the current unstable methods of doing so,
+`PermissionsExt::mode`, should be stabilized. The main question around this
+piece of functionality is whether to provide a higher level abstraction (e.g.
+similar to the `bitflags` crate) for the permission bits on unix.
+
+This RFC proposes renaming `mode` and `set_mode` on `PermissionsExt` and
+`OpenOptionsExt` to `raw_mode` and `set_raw_mode` in order to enable an addition of
+a higher-level `Mode` abstraction in the future. This is also the rationale for
+naming the accessor of `st_mode` on `RawMetadata` as `raw_mode`.
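For orientation, the raw permission bits discussed here can be read and set roughly as follows. This is a unix-only sketch written against the standard library as it ships today, where the accessors are named `mode` and `from_mode` rather than the `raw_mode` naming proposed in this section; the helper names `permission_bits` and `set_permission_bits` are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Read the permission bits of `path`, masking off the file-type bits
// that `st_mode` also carries.
#[cfg(unix)]
pub fn permission_bits(path: &Path) -> io::Result<u32> {
    use std::os::unix::fs::PermissionsExt;
    let mode = fs::metadata(path)?.permissions().mode();
    Ok(mode & 0o7777)
}

// Blanket-set the permission bits of `path` via `fs::set_permissions`.
#[cfg(unix)]
pub fn set_permission_bits(path: &Path, bits: u32) -> io::Result<()> {
    use std::os::unix::fs::PermissionsExt;
    fs::set_permissions(path, fs::Permissions::from_mode(bits))
}
```

Keeping the unix-specific trait import inside each function makes the platform dependence explicit at the point of use, in the spirit of the opt-in design described above.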
+
+Finally, the `set_permissions` function of the `std::fs` module is also proposed
+to be marked `#[stable]` soon as a method of blanket setting permissions for a
+file.
+
+## Constructing `Permissions`
+
+Currently there is no method to construct an instance of `Permissions` in a
+cross-platform manner. This RFC proposes adding the following APIs:
+
+```rust
+impl Permissions {
+    /// Creates a new set of permissions appropriate for being placed on a file.
+    ///
+    /// On unix platforms this corresponds to the permission bits `0o666`.
+    pub fn new() -> Permissions;
+
+    /// Creates a new set of permissions which when applied to a file will make
+    /// it read-only for all users.
+    ///
+    /// On unix platforms this corresponds to the permission bits `0o444`.
+    pub fn new_readonly() -> Permissions;
+}
+
+mod os::unix::fs {
+    pub trait PermissionsExt {
+        fn from_raw_mode(mode: i32) -> Self;
+    }
+    impl PermissionsExt for Permissions { ... }
+}
+```
+
+## Creating directories with permissions
+
+Currently the standard library does not expose an API which allows setting the
+permission bits on unix or security attributes on Windows. This RFC proposes
+adding the following API to `std::fs`:
+
+```rust
+pub struct CreateDirOptions { ... }
+
+impl CreateDirOptions {
+    /// Creates a new set of options with default mode/security settings for all
+    /// platforms and also non-recursive.
+    pub fn new() -> Self;
+
+    /// Indicate that directories should be created recursively, creating
+    /// all parent directories if they do not exist with the same security and
+    /// permissions settings.
+    pub fn recursive(&mut self, recursive: bool) -> &mut Self;
+
+    /// Use the specified directory as a "template" for permissions and security
+    /// settings of the new directories to be created.
+    ///
+    /// On unix this will issue a `stat` of the specified directory and new
+    /// directories will be created with the same permission bits.
On Windows + /// this will trigger the use of the `CreateDirectoryEx` function. + pub fn template<P: AsRef<Path>>(&mut self, path: P) -> &mut Self; + + /// Create the specified directory with the options configured in this + /// builder. + pub fn create<P: AsRef<Path>>(&self, path: P) -> io::Result<()>; +} + +mod os::unix::fs { + pub trait CreateDirOptionsExt { + fn raw_mode(&mut self, mode: i32) -> &mut Self; + } + impl CreateDirOptionsExt for CreateDirOptions { ... } +} + +mod os::windows::fs { + // once a `SECURITY_ATTRIBUTES` abstraction exists, this will be added + pub trait CreateDirOptionsExt { + fn security_attributes(&mut self, ...) -> &mut Self; + } + impl CreateDirOptionsExt for CreateDirOptions { ... } +} + +``` + +## Adding `fs::equivalent` + +A new function `equivalent` will be added to the `fs` module along the lines of +[C++'s equivalent function][cpp-equivalent]: + +[cpp-equivalent]: http://en.cppreference.com/w/cpp/experimental/fs/equivalent + +```rust +/// Test whether the two paths provided are equivalent references to the same +/// file or directory. +/// +/// This function will ensure that the two paths have the same status and refer +/// to the same file system entity (e.g. at the same physical location). +pub fn equivalent<P: AsRef<Path>, Q: AsRef<Path>>(p: P, q: Q) -> bool; +``` + +## Enhancing soft link support + +Currently the `std::fs` module provides a `soft_link` and `read_link` function, +but there is no method of doing other soft link related tasks such as: + +* Testing whether a file is a soft link +* Reading the metadata of a soft link, not what it points to + +The following APIs will be added to `std::fs`: + +```rust +/// Returns the metadata of the file pointed to by `p`, and this function, +/// unlike `metadata` will **not** follow soft links. +pub fn soft_link_metadata<P: AsRef<Path>>(p: P) -> io::Result<Metadata>; + +impl Metadata { + /// Tests whether this metadata is for a soft link or not.
+ pub fn is_soft_link(&self) -> bool; +} +``` + +## Binding `realpath` + +There's a [long-standing issue][realpath] that the unix function `realpath` is +not bound, and this RFC proposes adding the following API to the `fs` module: + +[realpath]: https://github.com/rust-lang/rust/issues/11857 + +```rust +/// Canonicalizes the given file name to an absolute path with all `..`, `.`, +/// and soft link components resolved. +/// +/// On unix this function corresponds to the return value of the `realpath` +/// function, and on Windows this corresponds to the `GetFullPathName` function. +/// +/// Note that relative paths given to this function will use the current working +/// directory as a base, and the current working directory is not managed in a +/// thread-local fashion, so this function may need to be synchronized with +/// other calls to `env::change_dir`. +pub fn canonicalize<P: AsRef<Path>>(p: P) -> io::Result<PathBuf>; +``` + +## Tweaking `PathExt` + +Currently the `PathExt` trait is unstable, yet it is quite convenient! The main +motivation for its `#[unstable]` tag is that it is unclear how much +functionality should be on `PathExt` versus the `std::fs` module itself. +Currently a small subset of functionality is offered, but it is unclear what the +guiding principles for the contents of this trait are. + +This RFC proposes a few guiding principles for this trait: + +* Only read-only operations in `std::fs` will be exposed on `PathExt`. All + operations which require modifications to the filesystem will require calling + methods through `std::fs` itself. + +* Some inspection methods on `Metadata` will be exposed on `PathExt`, but only + those where it logically makes sense for `Path` to be the `self` receiver. For + example `PathExt::len` will not exist (size of the file), but + `PathExt::is_dir` will exist.
+ +Concretely, the `PathExt` trait will be expanded to: + +```rust +pub trait PathExt { + fn exists(&self) -> bool; + fn is_dir(&self) -> bool; + fn is_file(&self) -> bool; + fn is_soft_link(&self) -> bool; + fn metadata(&self) -> io::Result<Metadata>; + fn soft_link_metadata(&self) -> io::Result<Metadata>; + fn canonicalize(&self) -> io::Result<PathBuf>; + fn read_link(&self) -> io::Result<PathBuf>; + fn read_dir(&self) -> io::Result<ReadDir>; + fn equivalent<P: AsRef<Path>>(&self, p: P) -> bool; +} + +impl PathExt for Path { ... } +``` + +## Expanding `DirEntry` + +Currently the `DirEntry` API is quite minimalistic, exposing very few of the +underlying attributes. Platforms like Windows actually contain an entire +`Metadata` inside of a `DirEntry`, enabling much more efficient walking of +directories in some situations. + +The following APIs will be added to `DirEntry`: + +```rust +impl DirEntry { + /// This function will return the filesystem metadata for this directory + /// entry. This is equivalent to calling `fs::soft_link_metadata` on the + /// path returned. + /// + /// On Windows this function will always return `Ok` and will not issue a + /// system call, but on unix this will always issue a call to `stat` to + /// return metadata. + pub fn metadata(&self) -> io::Result<Metadata>; + + /// Accessors for testing what file type this `DirEntry` contains. + /// + /// On some platforms this may not require reading the metadata of the + /// underlying file from the filesystem, but on other platforms it may be + /// required to do so. + pub fn is_dir(&self) -> bool; + pub fn is_file(&self) -> bool; + pub fn is_soft_link(&self) -> bool; + // ... + + /// Returns the file name for this directory entry. + pub fn file_name(&self) -> OsString; +} +``` + +# Drawbacks + +* This is quite a bit of surface area being added to the `std::fs` API, and it + may perhaps be best to scale it back and add it in a more incremental fashion + instead of all at once.
Most of it, however, is fairly straightforward, so it + seems prudent to schedule many of these features for the 1.1 release. + +* Exposing raw information such as `libc::stat` or `WIN32_FILE_ATTRIBUTE_DATA` + possibly can hamstring altering the implementation in the future. At this + point, however, it seems unlikely that the exposed pieces of information will + be changing much. + +# Alternatives + +* Instead of exposing accessor methods in `MetadataExt` on Windows, the raw + `WIN32_FILE_ATTRIBUTE_DATA` could be returned. We may change, however, to + using `BY_HANDLE_FILE_INFORMATION` one day which would make the return value + from this function more difficult to implement. + +* A `std::os::MetadataExt` trait could be added to access truly common + information such as modification/access times across all platforms. The return + value would likely be a `u64` "something" and would be clearly documented as + being a lossy abstraction and also only having a platform-specific meaning. + +* The `PathExt` trait could perhaps be implemented on `DirEntry`, but it doesn't + necessarily seem appropriate for all the methods and using inherent methods + also seems more logical. + +# Unresolved questions + +None yet. From b3d1ad6b9cb521d088fd8731d1ab6eb433e54378 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Mon, 6 Apr 2015 14:02:21 -0700 Subject: [PATCH 0222/1195] Add vision for std::os, a few other tweaks --- text/0000-io-fs-2.1.md | 101 +++++++++++++++++++++++++++++++++++++---- 1 file changed, 91 insertions(+), 10 deletions(-) diff --git a/text/0000-io-fs-2.1.md b/text/0000-io-fs-2.1.md index 34e50f37031..48cb6017610 100644 --- a/text/0000-io-fs-2.1.md +++ b/text/0000-io-fs-2.1.md @@ -6,14 +6,13 @@ # Summary Expand the scope of the `std::fs` module by enhancing existing functionality, -exposing lower level representations, and adding a few more new functions. +exposing lower-level representations, and adding a few new functions. 
# Motivation The current `std::fs` module serves many of the basic needs of interacting with -a filesystem, but it falls short of binding a good bit more of useful -functionality. For example, none of these operations are possible in stable Rust -today: +a filesystem, but is missing a lot of useful functionality. For example, none of +these operations are possible in stable Rust today: * Inspecting a file's modification/access times * Reading low-level information like that contained in `libc::stat` @@ -44,13 +43,91 @@ areas: # Detailed design -## Lowering `Metadata` +## Lowering APIs + +### The vision for the `os` module + +One of the principles of [IO reform][io-reform-vision] was to: + +> Provide hooks for integrating with low-level and/or platform-specific APIs. + +The original RFC went into some amount of detail for how this would look, in +particular by use of the `os` module. Part of the goal of this RFC is to flesh +out that vision in more detail. + +Ultimately, the organization of `os` is planned to look something like the +following: + +``` +os + unix applicable to all cfg(unix) platforms; high- and low-level APIs + io extensions to std::io + fs extensions to std::fs + net extensions to std::net + env extensions to std::env + process extensions to std::process + ... + linux applicable to linux only + io, fs, net, env, process, ... + macos ... + windows ... +``` + +APIs whose behavior is platform-specific are provided only within the `std::os` +hierarchy, making it easy to audit for usage of such APIs. Organizing the +platform modules internally in the same way as `std` makes it easy to find +relevant extensions when working with `std`. + +It is emphatically *not* the goal of the `std::os::*` modules to provide +bindings to *all* system APIs for each platform; this work is left to external +crates. The goals are rather to: + +1. Facilitate interop between abstract types like `File` that `std` provides and + the underlying system. 
This is done via "lowering": extension traits like + [`AsRawFd`][AsRawFd] allow you to extract low-level, platform-specific + representations out of `std` types like `File` and `TcpStream`. + +2. Provide high-level but platform-specific APIs that feel like those in the + rest of `std`. Just as with the rest of `std`, the goal here is not to + include all possible functionality, but rather the most commonly-used or + fundamental. + +Lowering makes it possible for external crates to provide APIs that work +"seamlessly" with `std` abstractions. For example, a crate for Linux might +provide an `epoll` facility that can work directly with `std::fs::File` and +`std::net::TcpStream` values, completely hiding the internal use of file +descriptors. Eventually, such a crate could even be merged into `std::os::unix`, +with minimal disruption -- there is little distinction between `std` and other +crates in this regard. + +Concretely, lowering has two ingredients: + +1. Introducing one or more "raw" types that are generally direct aliases for C + types. + +2. Providing an extension trait that makes it possible to extract a raw type + from a `std` type. In some cases, it's possible to go the other way around as + well. The conversion can be by reference or by value, where the latter is + used mainly to avoid the destructor associated with a `std` type (e.g. to + extract a file descriptor from a `File` and eliminate the `File` object, + without closing the file). + +While we do not seek to exhaustively bind types or APIs from the underlying +system, it *is* a goal to provide lowering operations for every high-level type +to a system-level data type, whenever applicable. This RFC proposes several such +lowerings that are currently missing from `std::fs`. 
+ +[io-reform-vision]: https://github.com/rust-lang/rfcs/blob/master/text/0517-io-os-reform.md#vision-for-io +[AsRawFd]: http://static.rust-lang.org/doc/master/std/os/unix/io/trait.AsRawFd.html + +### Lowering `Metadata` (all platforms) Currently the `Metadata` structure exposes very few pieces of information about a file. Some of this is because the information is not available across all platforms, but some of it is also because the standard library does not have the appropriate abstraction to return at this time (e.g. time stamps). The raw -contents of `Metadata`, however, should be accessible no matter what. +contents of `Metadata` (a `stat` on Unix), however, should be accessible via +lowering no matter what. The following trait hierarchy and new structures will be added to the standard library. @@ -109,7 +186,7 @@ design principles of the standard library. The interesting part about working in a "cross platform" manner here is that the makeup of `libc::stat` on unix platforms can vary quite a bit between platforms. -For example not some platforms have a `st_birthtim` field while others do not. +For example some platforms have a `st_birthtim` field while others do not. To enable as much ergonomic usage as possible, the `os::unix` module will expose the *intersection* of metadata available in `libc::stat` across all unix platforms. The information is still exposed in a raw fashion (in terms of the @@ -123,7 +200,7 @@ One of the major goals of the `os::unix::fs` design is to enable as much functionality as possible when programming against "unix in general" while still allowing applications to choose to only program against macos, for example. -### Fate of `Metadata::{accesed, modified}` +#### Fate of `Metadata::{accesed, modified}` At this time there is no suitable type in the standard library to represent the return type of these two functions. 
The type would either have to be some form @@ -135,7 +212,10 @@ requiring platform-specific code to access the modification/access time of files. This information is all available via the `MetadataExt` traits listed above. -## Lowering and setting `Permissions` +Eventually, once a `std` type for cross-platform timestamps is available, these +methods will be re-instated as returning that type. + +### Lowering and setting `Permissions` (Unix) > **Note**: this section only describes behavior on unix. @@ -391,4 +471,5 @@ impl DirEntry { # Unresolved questions -None yet. +* What is the ultimate role of crates like `liblibc`, and how do we draw the + line between them and `std::os` definitions? From 91c0df599a817082b86edbc4628d7688a031e85c Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 7 Apr 2015 16:02:40 -0700 Subject: [PATCH 0223/1195] More feedback and tweaks --- text/0000-io-fs-2.1.md | 140 ++++++++++++++++++++++++++++++----------- 1 file changed, 103 insertions(+), 37 deletions(-) diff --git a/text/0000-io-fs-2.1.md b/text/0000-io-fs-2.1.md index 48cb6017610..49a28fc9ed0 100644 --- a/text/0000-io-fs-2.1.md +++ b/text/0000-io-fs-2.1.md @@ -43,6 +43,12 @@ areas: # Detailed design +First, a vision for how lowering APIs in general will be presented, and then a +number of specific APIs will each be proposed. Many of the proposed APIs are +independent from one another and this RFC may not be implemented all-in-one-go +but instead piecemeal over time, allowing the designs to evolve slightly in the +meantime. + ## Lowering APIs ### The vision for the `os` module @@ -103,7 +109,7 @@ crates in this regard. Concretely, lowering has two ingredients: 1. Introducing one or more "raw" types that are generally direct aliases for C - types. + types (more on this in the next section). 2. Providing an extension trait that makes it possible to extract a raw type from a `std` type. 
In some cases, it's possible to go the other way around as @@ -120,6 +126,35 @@ lowerings that are currently missing from `std::fs`. [io-reform-vision]: https://github.com/rust-lang/rfcs/blob/master/text/0517-io-os-reform.md#vision-for-io [AsRawFd]: http://static.rust-lang.org/doc/master/std/os/unix/io/trait.AsRawFd.html +#### `std::os::platform::raw` + +Each of the primitives in the standard library will expose the ability to be +lowered into its component abstraction, facilitating the need to define these +abstractions and organize them in the platform-specific modules. This RFC +proposes the following guidelines for doing so: + +* Each platform will have a `raw` module inside of `std::os` which houses all of + its platform specific definitions. +* Only type definitions will be contained in `raw` modules, no function + bindings, methods, or trait implementations. +* Cross-platform types (e.g. those shared on all `unix` platforms) will be + located in the respective cross-platform module. Types which only differ in + the width of an integer type are considered to be cross-platform. +* Platform-specific types will exist only in the `raw` module for that platform. + A platform-specific type may have different field names, components, or just + not exist on other platforms. + +Differences in integer widths are not considered to be enough of a platform +difference to define in each separate platform's module, meaning that it will be +possible to write code that uses `os::unix` but doesn't compile on all Unix +platforms. It is believed that most consumers of these types will continue to +store the same type (e.g. not assume it's an `i32`) throughout the application +or immediately cast it to a known type. + +To reiterate, it is not planned for each `raw` module to provide *exhaustive* +bindings to each platform. Only those abstractions which the standard library is +lowering into will be defined in each `raw` module. 
+ ### Lowering `Metadata` (all platforms) Currently the `Metadata` structure exposes very few pieces of information about @@ -146,23 +181,23 @@ mod os::windows::fs { mod os::unix::fs { pub trait MetadataExt { - fn as_raw(&self) -> &raw::Metadata; + fn as_raw(&self) -> &Metadata; } impl MetadataExt for fs::Metadata { ... } - pub struct RawMetadata(libc::stat); - impl RawMetadata { - // Accessors for fields available in `libc::stat` for *all* platforms - fn dev(&self) -> libc::dev_t; // st_dev field - fn ino(&self) -> libc::ino_t; // st_ino field - fn raw_mode(&self) -> libc::mode_t; // st_mode field - fn nlink(&self) -> libc::nlink_t; // st_nlink field - fn uid(&self) -> libc::uid_t; // st_uid field - fn gid(&self) -> libc::gid_t; // st_gid field - fn rdev(&self) -> libc::dev_t; // st_rdev field - fn size(&self) -> libc::off_t; // st_size field - fn blksize(&self) -> libc::blksize_t; // st_blksize field - fn blocks(&self) -> libc::blkcnt_t; // st_blocks field + pub struct Metadata(raw::stat); + impl Metadata { + // Accessors for fields available in `raw::stat` for *all* platforms + fn dev(&self) -> raw::dev_t; // st_dev field + fn ino(&self) -> raw::ino_t; // st_ino field + fn mode(&self) -> raw::mode_t; // st_mode field + fn nlink(&self) -> raw::nlink_t; // st_nlink field + fn uid(&self) -> raw::uid_t; // st_uid field + fn gid(&self) -> raw::gid_t; // st_gid field + fn rdev(&self) -> raw::dev_t; // st_rdev field + fn size(&self) -> raw::off_t; // st_size field + fn blksize(&self) -> raw::blksize_t; // st_blksize field + fn blocks(&self) -> raw::blkcnt_t; // st_blocks field fn atime(&self) -> (i64, i32); // st_atime field, (sec, nsec) fn mtime(&self) -> (i64, i32); // st_mtime field, (sec, nsec) fn ctime(&self) -> (i64, i32); // st_ctime field, (sec, nsec) @@ -171,9 +206,16 @@ mod os::unix::fs { // st_flags, st_gen, st_lspare, st_birthtim, st_qspare mod os::{linux, macos, freebsd, ...}::fs { - pub struct stat { /* same public fields as libc::stat */ } + pub mod raw 
{ + pub type dev_t = ...; + pub type ino_t = ...; + // ... + pub struct stat { + // ... same public fields as libc::stat + } + } pub trait MetadataExt { - fn as_raw_stat(&self) -> &stat; + fn as_raw_stat(&self) -> &raw::stat; } impl MetadataExt for os::unix::fs::RawMetadata { ... } impl MetadataExt for fs::Metadata { ... } @@ -225,10 +267,31 @@ and it is unclear whether the current unstable methods of doing so, piece of functionality is whether to provide a higher level abstractiong (e.g. similar to the `bitflags` crate) for the permission bits on unix. -This RFC proposes renaming `mode` and `set_mode` on `PermissionsExt` and -`OpenOptionsExt` to `raw_mode` and `set_raw_mode` in order enable an addition of -a higher-level `Mode` abstraction in the future. This is also the rationale for -naming the accessor of `st_mode` on `RawMetadata` as `raw_mode`. +This RFC proposes considering the methods for stabilization as-is and not +pursuing a higher level abstraction of the unix permission bits. 
To facilitate +in their inspection and manipulation, however, the following constants will be +added: + +```rust +mod os::unix::fs { + pub const USER_READ: raw::mode_t; + pub const USER_WRITE: raw::mode_t; + pub const USER_EXECUTE: raw::mode_t; + pub const USER_RWX: raw::mode_t; + pub const OTHER_READ: raw::mode_t; + pub const OTHER_WRITE: raw::mode_t; + pub const OTHER_EXECUTE: raw::mode_t; + pub const OTHER_RWX: raw::mode_t; + pub const GROUP_READ: raw::mode_t; + pub const GROUP_WRITE: raw::mode_t; + pub const GROUP_EXECUTE: raw::mode_t; + pub const GROUP_RWX: raw::mode_t; + pub const ALL_READ: raw::mode_t; + pub const ALL_WRITE: raw::mode_t; + pub const ALL_EXECUTE: raw::mode_t; + pub const ALL_RWX: raw::mode_t; +} +``` Finally, the `set_permissions` function of the `std::fs` module is also proposed to be marked `#[stable]` soon as a method of blanket setting permissions for a @@ -245,17 +308,11 @@ impl Permissions { /// /// On unix platforms this corresponds to the permission bits `0o666` pub fn new() -> Permissions; - - /// Creates a new set of permissions which when applied to a file will make - /// it read-only for all users. - /// - /// On unix platforms this corresponds to the permission bits `0o444`. - pub fn new_readonly() -> Permissions; } mod os::unix::fs { pub trait PermissionsExt { - fn from_raw_mode(mode: i32) -> Self; + fn from_mode(mode: raw::mode_t) -> Self; } impl PermissionsExt for Permissions { ... } } @@ -280,14 +337,6 @@ impl CreateDirOptions { /// permissions settings. pub fn recursive(&mut self, recursive: bool) -> &mut Self; - /// Use the specified directory as a "template" for permissions and security - /// settings of the new directories to be created. - /// - /// On unix this will issue a `stat` of the specified directory and new - /// directories will be created with the same permission bits. On Windows - /// this will trigger the use of the `CreateDirectoryEx` function. 
- pub fn template<P: AsRef<Path>>(&mut self, path: P) -> &mut Self; - /// Create the specified directory with the options configured in this /// builder. pub fn create<P: AsRef<Path>>(&self, path: P) -> io::Result<()>; @@ -295,7 +344,7 @@ impl CreateDirOptions { mod os::unix::fs { pub trait CreateDirOptionsExt { - fn raw_mode(&mut self, mode: i32) -> &mut Self; + fn mode(&mut self, mode: raw::mode_t) -> &mut Self; } impl CreateDirOptionsExt for CreateDirOptions { ... } } @@ -307,9 +356,26 @@ mod os::windows::fs { } impl CreateDirOptionsExt for CreateDirOptions { ... } } +``` + +This sort of builder is also extendable to other flavors of functions in the +future, such as [C++'s template parameter][cpp-dir-template]: +[cpp-dir-template]: http://en.cppreference.com/w/cpp/experimental/fs/create_directory + +```rust +/// Use the specified directory as a "template" for permissions and security +/// settings of the new directories to be created. +/// +/// On unix this will issue a `stat` of the specified directory and new +/// directories will be created with the same permission bits. On Windows +/// this will trigger the use of the `CreateDirectoryEx` function. +pub fn template<P: AsRef<Path>>(&mut self, path: P) -> &mut Self; ``` +At this time, however, it is not proposed to add this method to +`CreateDirOptions`. + ## Adding `fs::equivalent` A new function `equivalent` will be added to the `fs` module along the lines of From fb6d9cb937faa908d7ee8e4c037bc6cfa58202b1 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 8 Apr 2015 13:07:12 +0200 Subject: [PATCH 0224/1195] refine detail about what "old behavior" of `as` is.
--- text/0560-integer-overflow.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0560-integer-overflow.md b/text/0560-integer-overflow.md index adedf371bb8..de14896fe01 100644 --- a/text/0560-integer-overflow.md +++ b/text/0560-integer-overflow.md @@ -489,8 +489,8 @@ Since it was accepted, the RFC has been updated as follows: capability for libstd to declare inherent methods on primitive integral types. 2. `as` was changed to restore the behavior before the RFC (that is, - it truncates, as a C cast would). - + it truncates to the target bitwidth and reinterprets the highest + order bit, a.k.a. sign-bit, as necessary, as a C cast would). # Acknowledgements and further reading From e6c485492ce6bf28fea4a343d71e98b15ef67151 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 8 Apr 2015 09:58:23 -0700 Subject: [PATCH 0225/1195] Add an accessor for d_ino on unix --- text/0000-io-fs-2.1.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/text/0000-io-fs-2.1.md b/text/0000-io-fs-2.1.md index 49a28fc9ed0..e0942d2cdf5 100644 --- a/text/0000-io-fs-2.1.md +++ b/text/0000-io-fs-2.1.md @@ -505,6 +505,13 @@ impl DirEntry { /// Returns the file name for this directory entry. pub fn file_name(&self) -> OsString; } + +mod os::unix::fs { + pub trait DirEntryExt { + fn ino(&self) -> raw::ino_t; // read the d_ino field + } + impl DirEntryExt for fs::DirEntry { ... 
} +} ``` # Drawbacks From ebf55142d4ed36576f7079d7d4eb0dbc2e92b5d9 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 8 Apr 2015 10:07:45 -0700 Subject: [PATCH 0226/1195] Add a FileType structure --- text/0000-io-fs-2.1.md | 53 +++++++++++++++++++++++++++++++++--------- 1 file changed, 42 insertions(+), 11 deletions(-) diff --git a/text/0000-io-fs-2.1.md b/text/0000-io-fs-2.1.md index e0942d2cdf5..a29a7301d42 100644 --- a/text/0000-io-fs-2.1.md +++ b/text/0000-io-fs-2.1.md @@ -376,6 +376,43 @@ pub fn template<P: AsRef<Path>>(&mut self, path: P) -> &mut Self; At this time, however, it is not proposed to add this method to `CreateDirOptions`. +## Adding `FileType` + +Currently there is no enumeration or newtype representing a list of "file types" +on the local filesystem. This is partly done because the need is not so high +right now. Some situations, however, imply that it is more efficient to learn +the file type at once instead of testing for each individual file type itself. + +For example some platforms' `DirEntry` type can know the `FileType` without an +extra syscall. If code were to test a `DirEntry` separately for whether it's a +file or a directory, it may issue more syscalls than necessary compared to if it +instead learned the type and then tested whether it was a file or directory. + +The full set of file types, however, is not always known nor portable across +platforms, so this RFC proposes the following hierarchy: + +```rust +#[derive(Copy, Clone, PartialEq, Eq, Hash)] +pub struct FileType(..); + +impl FileType { + pub fn is_dir(&self) -> bool; + pub fn is_file(&self) -> bool; + pub fn is_soft_link(&self) -> bool; +} +``` + +Extension traits can be added in the future for testing for other more flavorful +kinds of files on various platforms (such as unix sockets on unix platforms). + +#### Dealing with `is_{file,dir}` and `file_type` methods + +Currently the `fs::Metadata` structure exposes stable `is_file` and `is_dir` +accessors.
The struct will also grow a `file_type` accessor for this newtype +struct being added. It is proposed that `Metadata` will retain the +`is_{file,dir}` convenience methods, but no other "file type testers" will be +added. + ## Adding `fs::equivalent` A new function `equivalent` will be added to the `fs` module along the lines of @@ -406,11 +443,6 @@ The following APIs will be added to `std::fs`: /// Returns the metadata of the file pointed to by `p`, and this function, /// unlike `metadata` will **not** follow soft links. pub fn soft_link_metadata<P: AsRef<Path>>(p: P) -> io::Result<Metadata>; - -impl Metadata { - /// Tests whether this metadata is for a soft link or not. - pub fn is_soft_link(&self) -> bool; -} ``` ## Binding `realpath` @@ -460,7 +492,6 @@ pub trait PathExt { fn exists(&self) -> bool; fn is_dir(&self) -> bool; fn is_file(&self) -> bool; - fn is_soft_link(&self) -> bool; fn metadata(&self) -> io::Result<Metadata>; fn soft_link_metadata(&self) -> io::Result<Metadata>; fn canonicalize(&self) -> io::Result<PathBuf>; @@ -492,15 +523,12 @@ impl DirEntry { /// return metadata. pub fn metadata(&self) -> io::Result<Metadata>; - /// Accessors for testing what file type this `DirEntry` contains. + /// Return what file type this `DirEntry` contains. /// /// On some platforms this may not require reading the metadata of the /// underlying file from the filesystem, but on other platforms it may be /// required to do so. - pub fn is_dir(&self) -> bool; - pub fn is_file(&self) -> bool; - pub fn is_soft_link(&self) -> bool; - // ... + pub fn file_type(&self) -> FileType; /// Returns the file name for this directory entry. pub fn file_name(&self) -> OsString; @@ -542,6 +570,9 @@ mod os::unix::fs { necessarily seem appropriate for all the methods and using inherent methods also seems more logical.
+* Instead of continuing to add `is_` accessors to `Metadata` and + `DirEntry`, there could + # Unresolved questions * What is the ultimate role of crates like `liblibc`, and how do we draw the From bd88df3f19cc5cb63f5ca3f89486a617b6c5461d Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Wed, 8 Apr 2015 12:45:59 -0700 Subject: [PATCH 0227/1195] RFC: Socket timeouts --- text/0000-socket-timeouts.md | 109 +++++++++++++++++++++++++++++++++++ 1 file changed, 109 insertions(+) create mode 100644 text/0000-socket-timeouts.md diff --git a/text/0000-socket-timeouts.md b/text/0000-socket-timeouts.md new file mode 100644 index 00000000000..8392670a146 --- /dev/null +++ b/text/0000-socket-timeouts.md @@ -0,0 +1,109 @@ +- Feature Name: socket_timeouts +- Start Date: 2015-04-08 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Add sockopt-style timeouts to `std::net` types. + +# Motivation + +Currently, operations on various socket types in `std::net` block +indefinitely (i.e., until the connection is closed or data is +transferred). But there are many contexts in which timing out a +blocking call is important. + +The [goal of the current IO system][io-reform] is to gradually expose +cross-platform, blocking APIs for IO, especially APIs that directly +correspond to the underlying system APIs. Sockets are widely available +with nearly identical system APIs across the platforms Rust targets, +and this includes support for timeouts via [sockopts][sockopt]. + +So timeouts are well-motivated and well-suited to `std::net`. + +# Detailed design + +The proposal is to *directly expose* the timeout functionality +provided by [`setsockopt`][sockopt], in much the same way we currently +expose functionality like `set_nodelay`: + +```rust +impl TcpStream { + pub fn set_read_timeout(&self, dur: Duration) -> io::Result<()> { ... } + pub fn set_write_timeout(&self, dur: Duration) -> io::Result<()> { ... 
} +} + +impl UdpSocket { + pub fn set_read_timeout(&self, dur: Duration) -> io::Result<()> { ... } + pub fn set_write_timeout(&self, dur: Duration) -> io::Result<()> { ... } +} +``` + +These methods take an amount of time in the form of a `Duration`, +which is [undergoing stabilization][duration-reform]. They are +implemented via straightforward calls to `setsockopt`. + +# Drawbacks + +One potential downside to this design is that the timeouts are set +through direct mutation of the socket state, which can lead to +composition problems. For example, a socket could be passed to another +function which needs to use it with a timeout, but setting the timeout +clobbers any previous values. This lack of composability leads to +defensive programming in the form of "callee save" resets of timeouts, +for example. An alternative design is given below. + +The advantage of binding the mutating APIs directly is that we keep a +close correspondence between the `std::net` types and their underlying +system types, and a close correspondence between Rust APIs and system +APIs. It's not clear that this kind of composability is important +enough in practice to justify a departure from the traditional API. + +# Alternatives + +A different approach would be to *wrap* socket types with a "timeout +modifier", which would be responsible for setting and resetting the +timeouts: + +```rust +struct WithTimeout<T> { + timeout: Duration, + inner: T +} + +impl<T> WithTimeout<T> { + /// Returns the wrapped object, resetting the timeout + pub fn into_inner(self) -> T { ... } +} + +impl TcpStream { + /// Wraps the stream with a timeout + pub fn with_timeout(self, timeout: Duration) -> WithTimeout<TcpStream> { ... } +} + +impl<T: Read> Read for WithTimeout<T> { ... } +impl<T: Write> Write for WithTimeout<T> { ... } +``` + +A [previous RFC][deadlines] spelled this out in more detail. + +Unfortunately, such a "wrapping" API has problems of its own.
It +creates unfortunate type incompatibilities, since you cannot store a +timeout-wrapped socket where a "normal" socket is expected. It is +difficult to be "polymorphic" over timeouts. + +Ultimately, it's not clear that the extra complexities of the type +distinction here are worth the better theoretical composability. + +# Unresolved questions + +Should we consider a preliminary version of this RFC that introduces +methods like `set_read_timeout_ms`, similar to `wait_timeout_ms` on +`Condvar`? These methods have been introduced elsewhere to provide a +stable way to use timeouts prior to `Duration` being stabilized. + +[io-reform]: https://github.com/rust-lang/rfcs/blob/master/text/0517-io-os-reform.md +[sockopt]: http://pubs.opengroup.org/onlinepubs/009695399/functions/setsockopt.html +[duration-reform]: https://github.com/rust-lang/rfcs/pull/1040 +[deadlines]: https://github.com/rust-lang/rfcs/pull/577/ From bc8759d1bcf029caa8376bf29eddef5e69a86fed Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 8 Apr 2015 15:53:02 -0700 Subject: [PATCH 0228/1195] Remove a half-worded alternative --- text/0000-io-fs-2.1.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/text/0000-io-fs-2.1.md b/text/0000-io-fs-2.1.md index a29a7301d42..03caae26eb0 100644 --- a/text/0000-io-fs-2.1.md +++ b/text/0000-io-fs-2.1.md @@ -570,9 +570,6 @@ mod os::unix::fs { necessarily seem appropriate for all the methods and using inherent methods also seems more logical. 
-* Instead of continuing to add `is_` accessors to `Metadata` and - `DirEntry`, there could - # Unresolved questions * What is the ultimate role of crates like `liblibc`, and how do we draw the From fedf7fd6c05af82639a834800ef814fd78830636 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 8 Apr 2015 15:54:25 -0700 Subject: [PATCH 0229/1195] Bind remaining unix mode constants --- text/0000-io-fs-2.1.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/text/0000-io-fs-2.1.md b/text/0000-io-fs-2.1.md index 03caae26eb0..45207395298 100644 --- a/text/0000-io-fs-2.1.md +++ b/text/0000-io-fs-2.1.md @@ -290,6 +290,9 @@ mod os::unix::fs { pub const ALL_WRITE: raw::mode_t; pub const ALL_EXECUTE: raw::mode_t; pub const ALL_RWX: raw::mode_t; + pub const SETUID: raw::mode_t; + pub const SETGID: raw::mode_t; + pub const STICKY_BIT: raw::mode_t; } ``` From 11265b41675d0e2982bcc1a0e3b40df9e0208788 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 8 Apr 2015 17:24:31 -0700 Subject: [PATCH 0230/1195] Don't add Permissions::new --- text/0000-io-fs-2.1.md | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-) diff --git a/text/0000-io-fs-2.1.md b/text/0000-io-fs-2.1.md index 45207395298..d4edfe7fc03 100644 --- a/text/0000-io-fs-2.1.md +++ b/text/0000-io-fs-2.1.md @@ -302,17 +302,10 @@ file. ## Constructing `Permissions` -Currently there is no method to construct an instance of `Permissions` in a -cross-platform manner. This RFC proposes adding the following APIs: +Currently there is no method to construct an instance of `Permissions` on any +platform. This RFC proposes adding the following APIs: ```rust -impl Permissions { - /// Creates a new set of permissions appropriate for being placed on a file. 
- /// - /// On unix platforms this corresponds to the permission bits `0o666` - pub fn new() -> Permissions; -} - mod os::unix::fs { pub trait PermissionsExt { fn from_mode(mode: raw::mode_t) -> Self; @@ -321,6 +314,10 @@ mod os::unix::fs { } ``` +This RFC does not propose yet adding a cross-platform way to construct a +`Permissions` structure due to the radical differences between how unix and +windows handle permissions. + ## Creating directories with permissions Currently the standard library does not expose an API which allows setting the From e8bd990b3d1fc89714c6ec5ad198748dec1bf668 Mon Sep 17 00:00:00 2001 From: Brian Campbell Date: Thu, 9 Apr 2015 02:11:58 -0400 Subject: [PATCH 0231/1195] RFC: rename soft_link to symlink --- text/0000-rename-soft-link-to-symlink.md | 97 ++++++++++++++++++++++++ 1 file changed, 97 insertions(+) create mode 100644 text/0000-rename-soft-link-to-symlink.md diff --git a/text/0000-rename-soft-link-to-symlink.md b/text/0000-rename-soft-link-to-symlink.md new file mode 100644 index 00000000000..478d95dbd77 --- /dev/null +++ b/text/0000-rename-soft-link-to-symlink.md @@ -0,0 +1,97 @@ +- Feature Name: rename_soft_link_to_symlink +- Start Date: 2015-04-09 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Rename `std::fs::soft_link` to `std::fs::symlink` and provide a deprecated +`std::fs::soft_link` alias. + +# Motivation + +At some point in the split up version of rust-lang/rfcs#517, +`std::fs::symlink` was renamed to `sym_link` and then to `soft_link`. + +The new name is somewhat surprising and can be difficult to find. After a +poll of a number of different platforms and languages, every one appears to +contain `symlink`, `symbolic_link`, or some camel case variant of those for +their equivalent API. Every piece of formal documentation found, for +both Windows and various Unix like platforms, used "symbolic link" exclusively +in prose. 
+ +Here are the names I found for this functionality on various platforms, +libraries, and languages: + +* [POSIX/Single Unix Specification](http://pubs.opengroup.org/onlinepubs/009695399/functions/symlink.html): `symlink` +* [Windows](https://msdn.microsoft.com/en-us/library/windows/desktop/aa365680%28v=vs.85%29.aspx): `CreateSymbolicLink` +* [Objective-C/Swift](https://developer.apple.com/library/ios/documentation/Cocoa/Reference/Foundation/Classes/NSFileManager_Class/index.html#//apple_ref/occ/instm/NSFileManager/createSymbolicLinkAtPath:withDestinationPath:error:): `createSymbolicLinkAtPath:withDestinationPath:error:` +* [Java](https://docs.oracle.com/javase/7/docs/api/java/nio/file/Files.html): `createSymbolicLink` +* [C++ (Boost/draft standard)](http://en.cppreference.com/w/cpp/experimental/fs): `create_symlink` +* [Ruby](http://ruby-doc.org/core-2.2.0/File.html): `symlink` +* [Python](https://docs.python.org/2/library/os.html#os.symlink): `symlink` +* [Perl](http://perldoc.perl.org/functions/symlink.html): `symlink` +* [PHP](https://php.net/manual/en/function.symlink.php): `symlink` +* [Delphi](http://docwiki.embarcadero.com/Libraries/XE7/en/System.SysUtils.FileCreateSymLink): `FileCreateSymLink` +* PowerShell has no official version, but several community cmdlets ([one example](http://stackoverflow.com/questions/894430/powershell-hard-and-soft-links/894651#894651), [another example](https://gallery.technet.microsoft.com/scriptcenter/New-SymLink-60d2531e)) are named `New-SymLink` + +The term "soft link", probably as a contrast with "hard link", is found +frequently in informal descriptions, but almost always in the form of a +parenthetical of an alternate phrase, such as "a symbolic link (or soft +link)". I could not find it used in any formal documentation or APIs outside +of Rust. 
+ +The name `soft_link` was chosen to be shorter than `symbolic_link`, but +without using Unix specific jargon like `symlink`, to not give undue weight to +one platform over the other. However, based on the evidence above it doesn't +have any precedent as a formal name for the concept or API. Furthermore, +symbolic links themselves are a concept that was only relatively recently +(2007, with Windows Vista) introduced to Windows, with the [main motivator +being Unix compatibility](https://msdn.microsoft.com/en-us/library/windows/desktop/aa365680%28v=vs.85%29.aspx): + +> Symbolic links are designed to aid in migration and application +> compatibility with UNIX operating systems. Microsoft has implemented its +> symbolic links to function just like UNIX links. + +If you do a Google search for "[windows symbolic link](https://www.google.com/search?q=windows+symbolic+link&ie=utf-8&oe=utf-8)" or "[windows soft link](https://www.google.com/search?q=windows+soft+link&ie=utf-8&oe=utf-8)", +many of the documents you find start using "symlink" after introducing the +concept, so it seems to be a fairly common abbreviation for the full name even +among Windows developers and users. + +# Detailed design + +Rename `std::fs::soft_link` to `std::fs::symlink`, and provide a deprecated +`std::fs::soft_link` wrapper for backwards compatibility. Update the +documentation to use "symbolic link" in prose, rather than "soft link". + +# Drawbacks + +This deprecates a stable API during the 1.0.0 beta, leaving an extra wrapper +around. + +# Alternatives + +Other choices for the name would be: + +* The status quo, `soft_link` +* The original proposal from rust-lang/rfcs#517, `sym_link` +* The full name, `symbolic_link` + +The first choice is non-obvious for people coming from either Windows or +Unix. It is a classic compromise that makes everyone unhappy. 
+ +`sym_link` is slightly more consistent with the complementary `hard_link` +function, and treating "sym link" as two separate words has some precedent in +two of the Windows-targeted APIs, Delphi and some of the PowerShell cmdlets +observed. + +The full name `symbolic_link` is a bit long and cumbersome compared to most +of the rest of the API, but is explicit and is the term used in prose to +describe the concept everywhere, so shouldn't emphasize any one platform over +the other. + +# Unresolved questions + +If we deprecate `soft_link` now, early in the beta cycle, would it be +acceptable to remove it rather than deprecate it before 1.0.0, thus avoiding a +permanently stable but deprecated API right out of the gate? From 5c2aaa0a4c4ef528959ed272c8f07a418df00186 Mon Sep 17 00:00:00 2001 From: Nathaniel Theis Date: Thu, 9 Apr 2015 19:23:21 -0700 Subject: [PATCH 0232/1195] Update 0000-std-iter-once.md --- text/0000-std-iter-once.md | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/text/0000-std-iter-once.md b/text/0000-std-iter-once.md index 5ef066e1086..0c8993ed608 100644 --- a/text/0000-std-iter-once.md +++ b/text/0000-std-iter-once.md @@ -4,11 +4,11 @@ # Summary -Add a `once` function to `std::iter` to construct an iterator yielding a given value one time. +Add a `once` function to `std::iter` to construct an iterator yielding a given value one time, and an `empty` function to construct an iterator yielding no values. # Motivation -This is a common task when working with iterators. Currently, this can be done in many ways, most of which are unergonomic, do not work for all types (e.g. requiring Copy/Clone), or both. +This is a common task when working with iterators. Currently, this can be done in many ways, most of which are unergonomic, do not work for all types (e.g. requiring Copy/Clone), or both. 
`once` and `empty` are simple to implement, simple to use, and simple to understand. # Detailed design @@ -24,7 +24,19 @@ pub fn once(x: T) -> Once { } ``` -The `Once` wrapper struct exists to allow future backwards-compatible changes, and hide the implementation. +`empty` is similar: + +```rust +pub struct Empty(std::option::IntoIter); + +pub fn empty(x: T) -> Empty { + Empty( + None.into_iter() + ) +} +``` + +These wrapper structs exist to allow future backwards-compatible changes, and hide the implementation. # Drawbacks From cf25ad89704d35e3875a177add86723b3e2240c4 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 10 Apr 2015 06:07:07 -0400 Subject: [PATCH 0233/1195] RFC 693 is merged. Update links and so forth. --- README.md | 1 + text/0000-variadic-generics.md | 46 +++++++++++++++++++ ...nsic.md => 0639-discriminant-intrinsic.md} | 21 +++++++-- 3 files changed, 65 insertions(+), 3 deletions(-) create mode 100644 text/0000-variadic-generics.md rename text/{0000-discriminant-intrinsic.md => 0639-discriminant-intrinsic.md} (93%) diff --git a/README.md b/README.md index 54a40128530..a4318023caf 100644 --- a/README.md +++ b/README.md @@ -39,6 +39,7 @@ the direction the language is evolving in. 
* [0509-collections-reform-part-2.md](text/0509-collections-reform-part-2.md) * [0517-io-os-reform.md](text/0517-io-os-reform.md) * [0560-integer-overflow.md](text/0560-integer-overflow.md) +* [0639-discriminant-intrinsic.md](text/0639-discriminant-intrinsic.md) * [0769-sound-generic-drop.md](text/0769-sound-generic-drop.md) * [0803-type-ascription.md](text/0803-type-ascription.md) * [0809-box-and-in-for-stdlib.md](text/0809-box-and-in-for-stdlib.md) diff --git a/text/0000-variadic-generics.md b/text/0000-variadic-generics.md new file mode 100644 index 00000000000..71cf4fc2b4d --- /dev/null +++ b/text/0000-variadic-generics.md @@ -0,0 +1,46 @@ +- Feature Name: variadic-generics +- Start Date: 2015-04-03 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + + +# Motivation + +Why are we doing this? What use cases does it support? What is the expected outcome? + +# Detailed design + +## Introduce the `Tuple` trait + +The `Tuple` trait is implemented for tuples of any arity. It is a lang +item. + +```rust +trait Tuple { } +``` + +## Expandable parameters + +In a `fn` signature, a `..` may appear before the type of any +argument. This type must implement the `Tuple` trait. The `..` +indicates that the single + +```rust +trait Fn: FnMut { + fn call(&self, args: ..A) -> Self::Output; +} +``` + +# Drawbacks + +Why should we *not* do this? + +# Alternatives + +What other designs have been considered? What is the impact of not doing this? + +# Unresolved questions + +What parts of the design are still TBD? 
diff --git a/text/0000-discriminant-intrinsic.md b/text/0639-discriminant-intrinsic.md similarity index 93% rename from text/0000-discriminant-intrinsic.md rename to text/0639-discriminant-intrinsic.md index 2f397b9d9f6..636410a01ec 100644 --- a/text/0000-discriminant-intrinsic.md +++ b/text/0639-discriminant-intrinsic.md @@ -1,6 +1,6 @@ - Start Date: 2015-01-21 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#639](https://github.com/rust-lang/rfcs/pull/639) +- Rust Issue: [rust-lang/rust#24263](https://github.com/rust-lang/rust/issues/24263) # Summary @@ -341,4 +341,19 @@ pub enum SqlState { IndexCorrupted, Unknown(String), } -``` \ No newline at end of file +``` + +# History + +This RFC was accepted on a provisional basis on 2015-10-04. The +intention is to implement and experiment with the proposed +intrinsic. Some concerns expressed in the RFC discussion that will +require resolution before the RFC can be fully accepted: + +- Using bounds such as `T:Reflect` to help ensure parametricity. +- Do we want to change the return type in some way? + - It may not be helpful if we expose discriminant directly in the + case of (potentially) negative discriminants. + - We might want to return something more opaque to guard against + unintended representation exposure. +- Does this intrinsic need to be unsafe? 
From ad18720aaf844c5da8487e8b5339ea787e95d7fe Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 10 Apr 2015 09:19:49 -0400 Subject: [PATCH 0234/1195] Remove accidental draft --- text/0000-variadic-generics.md | 46 ---------------------------------- 1 file changed, 46 deletions(-) delete mode 100644 text/0000-variadic-generics.md diff --git a/text/0000-variadic-generics.md b/text/0000-variadic-generics.md deleted file mode 100644 index 71cf4fc2b4d..00000000000 --- a/text/0000-variadic-generics.md +++ /dev/null @@ -1,46 +0,0 @@ -- Feature Name: variadic-generics -- Start Date: 2015-04-03 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) - -# Summary - - -# Motivation - -Why are we doing this? What use cases does it support? What is the expected outcome? - -# Detailed design - -## Introduce the `Tuple` trait - -The `Tuple` trait is implemented for tuples of any arity. It is a lang -item. - -```rust -trait Tuple { } -``` - -## Expandable parameters - -In a `fn` signature, a `..` may appear before the type of any -argument. This type must implement the `Tuple` trait. The `..` -indicates that the single - -```rust -trait Fn: FnMut { - fn call(&self, args: ..A) -> Self::Output; -} -``` - -# Drawbacks - -Why should we *not* do this? - -# Alternatives - -What other designs have been considered? What is the impact of not doing this? - -# Unresolved questions - -What parts of the design are still TBD? From 150b8670e077e0a71d1fab32c4d6b0f89e1cefc2 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 10 Apr 2015 09:30:42 -0400 Subject: [PATCH 0235/1195] Empty struct with braces --- README.md | 1 + .../0218-empty-struct-with-braces.md | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) rename active/0000-empty-structs-with-braces.md => text/0218-empty-struct-with-braces.md (98%) diff --git a/README.md b/README.md index a4318023caf..ea4adf729cf 100644 --- a/README.md +++ b/README.md @@ -29,6 +29,7 @@ the direction the language is evolving in. 
* [0141-lifetime-elision.md](text/0141-lifetime-elision.md) * [0195-associated-items.md](text/0195-associated-items.md) * [0213-defaulted-type-params.md](text/0213-defaulted-type-params.md) +* [0218-empty-struct-with-braces.md](text/0218-empty-struct-with-braces.md) * [0320-nonzeroing-dynamic-drop.md](text/0320-nonzeroing-dynamic-drop.md) * [0339-statically-sized-literals.md](text/0339-statically-sized-literals.md) * [0385-module-system-cleanup.md](text/0385-module-system-cleanup.md) diff --git a/active/0000-empty-structs-with-braces.md b/text/0218-empty-struct-with-braces.md similarity index 98% rename from active/0000-empty-structs-with-braces.md rename to text/0218-empty-struct-with-braces.md index 4e2f55d8d04..e378801ab4e 100644 --- a/active/0000-empty-structs-with-braces.md +++ b/text/0218-empty-struct-with-braces.md @@ -1,6 +1,6 @@ - Start Date: (fill me in with today's date, 2014-08-28) -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#218](https://github.com/rust-lang/rfcs/pull/218/files) +- Rust Issue: [rust-lang/rust#218](https://github.com/rust-lang/rust/issues/24266) # Summary From 2973b9976c3c9b1726e5755ba999235ae5bfda1e Mon Sep 17 00:00:00 2001 From: Simon Sapin Date: Fri, 10 Apr 2015 16:50:51 +0200 Subject: [PATCH 0236/1195] =?UTF-8?q?Rename=20or=20replace=20`str::words`?= =?UTF-8?q?=20to=20side-step=20the=20ambiguity=20of=20=E2=80=9Ca=20word?= =?UTF-8?q?=E2=80=9D.?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- text/0000-str-words.md | 67 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 67 insertions(+) create mode 100644 text/0000-str-words.md diff --git a/text/0000-str-words.md b/text/0000-str-words.md new file mode 100644 index 00000000000..f91843d1fc7 --- /dev/null +++ b/text/0000-str-words.md @@ -0,0 +1,67 @@ +- Feature Name: str-words +- Start Date: 2015-04-10 +- RFC PR: +- Rust Issue: + +# Summary + +Rename or replace `str::words` to side-step 
the ambiguity of “a word”. + + +# Motivation + +The [`str::words`](http://doc.rust-lang.org/std/primitive.str.html#method.words) method +is currently marked `#[unstable(reason = "the precise algorithm to use is unclear")]`. +Indeed, the concept of “a word” is not easy to define in precense of punctuation +or languages with various conventions, including not using spaces at all to separate words. + +[Issue #15628](https://github.com/rust-lang/rust/issues/15628) suggests +changing the algorithm to be based on [the *Word Boundaries* section of +*Unicode Standard Annex #29: Unicode Text Segmentation*](http://www.unicode.org/reports/tr29/#Word_Boundaries). + +While a Rust implemention of UAX#29 would be useful, it belong on crates.io more than in `std`: + +* It carries significant complexity that may be surprising from something that looks as simple + as a parameter-less “words” method in the standard library. + Users may not be aware of how subtle defining “a word” can be. +* It is not a definitive answer. The standard itself notes: + + > It is not possible to provide a uniform set of rules that resolves all issues across languages + > or that handles all ambiguous situations within a given language. + > The goal for the specification presented in this annex is to provide a workable default; + > tailored implementations can be more sophisticated. + + and gives many examples of such ambiguous situations. + +Therefore, `std` would be better off avoiding the question of defining word boundaries entirely. + + +# Detailed design + +Rename the `words` method to `split_whitespace`, and keep the current behavior unchanged. +(That is, return an iterator equivalent to `s.split(char::is_whitespace).filter(|s| !s.is_empty())`.) + +Rename the return type `std::str::Words` to `std::str::SplitWhitespace`. + +Optionally, keep a `words` wrapper method for a while, both `#[deprecated]` and `#[unstable]`, +with an error message that suggests `split_whitespace` or the chosen alternative. 
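The behavior that the renamed method keeps can be sketched directly from the equivalence stated in the detailed design. This is a minimal illustration rather than the standard library's implementation; the helper name `split_on_whitespace` is hypothetical and exists only for this example.

```rust
// Hypothetical helper mirroring the equivalence given in the RFC text:
// split on Unicode whitespace, then drop the empty pieces that runs of
// consecutive whitespace produce.
fn split_on_whitespace(s: &str) -> Vec<&str> {
    s.split(char::is_whitespace)
        .filter(|piece| !piece.is_empty())
        .collect()
}

fn main() {
    let input = "  Mary   had\ta little  \n lamb";
    // Leading, trailing, and repeated whitespace yield no empty items.
    println!("{:?}", split_on_whitespace(input));
}
```

Note that the filter step is what distinguishes this from a plain `split`: without it, every run of two or more whitespace characters would yield empty string slices.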
+ + +# Drawbacks + +`split_whitespace` is very similar to the existing `str::split(&self, P)` method, +and having a separate method seems like weak API design. (But see below.) + + +# Alternatives + +* Replace `str::words` with `struct Whitespace;` with a custom `Pattern` implementation, + which can be used in `str::split`. + However this requires the `Whitespace` symbol to be imported separately. +* Remove `str::words` entirely and tell users to use + `s.split(char::is_whitespace).filter(|s| !s.is_empty())` instead. + + +# Unresolved questions + +Is there a better alternative? From 885fbdaed0c485f8df44182ceb0bbea2e07e6883 Mon Sep 17 00:00:00 2001 From: Simon Sapin Date: Fri, 10 Apr 2015 17:10:27 +0200 Subject: [PATCH 0237/1195] Spelling --- text/0000-str-words.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-str-words.md b/text/0000-str-words.md index f91843d1fc7..04bc7875220 100644 --- a/text/0000-str-words.md +++ b/text/0000-str-words.md @@ -12,14 +12,14 @@ Rename or replace `str::words` to side-step the ambiguity of “a word”. The [`str::words`](http://doc.rust-lang.org/std/primitive.str.html#method.words) method is currently marked `#[unstable(reason = "the precise algorithm to use is unclear")]`. -Indeed, the concept of “a word” is not easy to define in precense of punctuation +Indeed, the concept of “a word” is not easy to define in presence of punctuation or languages with various conventions, including not using spaces at all to separate words. [Issue #15628](https://github.com/rust-lang/rust/issues/15628) suggests changing the algorithm to be based on [the *Word Boundaries* section of *Unicode Standard Annex #29: Unicode Text Segmentation*](http://www.unicode.org/reports/tr29/#Word_Boundaries). 
-While a Rust implemention of UAX#29 would be useful, it belong on crates.io more than in `std`: +While a Rust implementation of UAX#29 would be useful, it belong on crates.io more than in `std`: * It carries significant complexity that may be surprising from something that looks as simple as a parameter-less “words” method in the standard library. From 4edf62886c120dd1a72ed176a1c2134c304927b2 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 10 Apr 2015 10:14:41 -0700 Subject: [PATCH 0238/1195] Remove fs::equivalent --- text/0000-io-fs-2.1.md | 20 +------------------- 1 file changed, 1 insertion(+), 19 deletions(-) diff --git a/text/0000-io-fs-2.1.md b/text/0000-io-fs-2.1.md index d4edfe7fc03..a3d64f69cf1 100644 --- a/text/0000-io-fs-2.1.md +++ b/text/0000-io-fs-2.1.md @@ -19,7 +19,6 @@ these operations are possible in stable Rust today: * Inspecting the unix permission bits on a file * Blanket setting the unix permission bits on a file * Leveraging `DirEntry` for the extra metadata it might contain -* Testing whether two paths are equivalent (point to the same file) * Reading the metadata of a soft link (not what it points at) * Resolving all soft links in a path @@ -187,7 +186,7 @@ mod os::unix::fs { pub struct Metadata(raw::stat); impl Metadata { - // Accessors for fields available in `raw::stat` for *all* platforms + // Accessors for fields available in `raw::stat` for *all* unix platforms fn dev(&self) -> raw::dev_t; // st_dev field fn ino(&self) -> raw::ino_t; // st_ino field fn mode(&self) -> raw::mode_t; // st_mode field @@ -413,22 +412,6 @@ struct being added. It is proposed that `Metadata` will retain the `is_{file,dir}` convenience methods, but no other "file type testers" will be added. 
-## Adding `fs::equivalent` - -A new function `equivalent` will be added to the `fs` module along the lines of -[C++'s equivalent function][cpp-equivalent]: - -[cpp-equivalent]: http://en.cppreference.com/w/cpp/experimental/fs/equivalent - -```rust -/// Test whether the two paths provided are equivalent references to the same -/// file or directory. -/// -/// This function will ensure that the two paths have the same status and refer -/// to the same file system entity (e.g. at the same phyical location). -pub fn equivalent, Q: AsRef>(p: P, q: Q) -> bool; -``` - ## Enhancing soft link support Currently the `std::fs` module provides a `soft_link` and `read_link` function, @@ -497,7 +480,6 @@ pub trait PathExt { fn canonicalize(&self) -> io::Result; fn read_link(&self) -> io::Result; fn read_dir(&self) -> io::Result; - fn equivalent>(&self, p: P) -> bool; } impl PathExt for Path { ... } From 216f8c85410cec4542f89fdb7f0ecc4519898ee4 Mon Sep 17 00:00:00 2001 From: Kevin Ballard Date: Sat, 11 Apr 2015 17:20:50 -0700 Subject: [PATCH 0239/1195] RFC for adding Sync to io::Error --- text/0000-io-error-sync.md | 74 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 74 insertions(+) create mode 100644 text/0000-io-error-sync.md diff --git a/text/0000-io-error-sync.md b/text/0000-io-error-sync.md new file mode 100644 index 00000000000..bad5814e1df --- /dev/null +++ b/text/0000-io-error-sync.md @@ -0,0 +1,74 @@ +- Feature Name: `io_error_sync` +- Start Date: 2015-04-11 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Add the `Sync` bound to `io::Error` by requiring that any wrapped custom errors +also conform to `Sync` in addition to `error::Error + Send`. 
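As a rough sketch of what the `Sync` bound enables, the following shares one `io::Error` between threads through an `Arc`. It assumes the bound proposed in this RFC is in place; the helper `describe_on_other_thread` and the error message are made up for illustration.

```rust
use std::io;
use std::sync::Arc;
use std::thread;

// Hypothetical helper: moves a shared handle to the error into another
// thread and returns the message that thread observed. `Arc<io::Error>`
// can only cross the thread boundary if `io::Error` is `Send + Sync`.
fn describe_on_other_thread(err: Arc<io::Error>) -> String {
    let handle = thread::spawn(move || err.to_string());
    handle.join().unwrap()
}

fn main() {
    let err = Arc::new(io::Error::new(io::ErrorKind::Other, "disk on fire"));
    // Cloning the `Arc` stands in for cloning the error itself.
    let shared = Arc::clone(&err);
    println!("{}", describe_on_other_thread(shared));
    println!("{}", err);
}
```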
+ +# Motivation + +Adding the `Sync` bound to `io::Error` has 3 primary benefits: + +* Values that contain `io::Error`s will be able to be `Sync` +* Perhaps more importantly, `io::Error` will be able to be stored in an `Arc` +* By using the above, a cloneable wrapper can be created that shares an + `io::Error` using an `Arc` in order to simulate the old behavior of being able + to clone an `io::Error`. + +# Detailed design + +The only thing keeping `io::Error` from being `Sync` today is the wrapped custom +error type `Box`. Changing this to +`Box` and adding the `Sync` bound to `io::Error::new()` +is sufficient to make `io::Error` be `Sync`. In addition, the relevant +`convert::From` impls that convert to `Box` will be updated +to convert to `Box` instead. + +# Drawbacks + +The only downside to this change is it means any types that conform to +`error::Error` and are `Send` but not `Sync` will no longer be able to be +wrapped in an `io::Error`. It's unclear if there's any types in the standard +library that will be impacted by this. Looking through the [list of +implementors][impls] for `error::Error`, here's all of the types that may be +affected: + +* `io::IntoInnerError`: This type is only `Sync` if the underlying buffered + writer instance is `Sync`. I can't be sure, but I don't believe we have any + writers that are `Send` but not `Sync`. In addition, this type has a `From` + impl that converts it to `io::Error` even if the writer is not `Send`. +* `sync::mpsc::SendError`: This type is only `Sync` if the wrapped value `T` is + `Sync`. This is of course also true for `Send`. I'm not sure if anyone is + relying on the ability to wrap a `SendError` in an `io::Error`. +* `sync::mpsc::TrySendError`: Same situation as `SendError`. +* `sync::PoisonError`: This type is already not compatible with `io::Error` + because it wraps mutex guards (such as `sync::MutexGuard`) which are not + `Send`. +* `sync::TryLockError`: Same situation as `PoisonError`. 
+ +So the only real question is about `sync::mpsc::SendError`. If anyone is relying +on the ability to convert that into an `io::Error` a `From` impl could be +added that returns an `io::Error` that is indistinguishable from a wrapped +`SendError`. + +[impls]: http://doc.rust-lang.org/nightly/std/error/trait.Error.html + +# Alternatives + +Don't do this. Not adding the `Sync` bound to `io::Error` means `io::Error`s +cannot be stored in an `Arc` and types that contain an `io::Error` cannot be +`Sync`. + +We should also consider whether we should go a step further and change +`io::Error` to use `Arc` instead of `Box` internally. This would let us restore +the `Clone` impl for `io::Error`. + +# Unresolved questions + +Should we add the `From` impl for `SendError`? There is no code in the rust +project that relies on `SendError` being converted to `io::Error`, and I'm +inclined to think it's unlikely for anyone to be relying on that, but I don't +know if there are any third-party crates that will be affected. From 1cbabb04c2d1fe2bf10c8b2ec9e5c4ae8d11a553 Mon Sep 17 00:00:00 2001 From: Kevin Ballard Date: Sat, 11 Apr 2015 18:07:12 -0700 Subject: [PATCH 0240/1195] RFC for replacing slice::tail()/init() with new methods --- text/0000-slice-tail-redesign.md | 101 +++++++++++++++++++++++++++++++ 1 file changed, 101 insertions(+) create mode 100644 text/0000-slice-tail-redesign.md diff --git a/text/0000-slice-tail-redesign.md b/text/0000-slice-tail-redesign.md new file mode 100644 index 00000000000..56f8454a2e6 --- /dev/null +++ b/text/0000-slice-tail-redesign.md @@ -0,0 +1,101 @@ +- Feature Name: `slice_tail_redesign` +- Start Date: 2015-04-11 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Replace `slice.tail()`, `slice.init()` with new methods `slice.shift_first()`, +`slice.shift_last()`. + +# Motivation + +The `slice.tail()` and `slice.init()` methods are relics from an older version +of the slice APIs that included a `head()` method. 
`slice` no longer has +`head()`; instead it has `first()` which returns an `Option`, and `last()` also +returns an `Option`. While it's generally accepted that indexing / slicing +should panic on out-of-bounds access, `tail()`/`init()` are the only +remaining methods that panic without taking an explicit index. + +A conservative change here would be to simply change `tail()`/`init()` to return +`Option`, but I believe we can do better. These operations are actually +specializations of `split_at()` and should be replaced with methods that return +`Option<(&T, &[T])>`. This makes the common operation of processing the first/last +element and the remainder of the list more ergonomic, with very low impact on +code that only wants the remainder (such code only has to add `.1` to the +expression). This has an even more significant effect on code that uses the +mutable variants. + +# Detailed design + +The methods `tail()`, `init()`, `tail_mut()`, and `init_mut()` will be removed, +and new methods will be added: + +```rust +fn shift_first(&self) -> Option<(&T, &[T])>; +fn shift_last(&self) -> Option<(&T, &[T])>; +fn shift_first_mut(&mut self) -> Option<(&mut T, &mut [T])>; +fn shift_last_mut(&mut self) -> Option<(&mut T, &mut [T])>; +``` + +Existing code using `tail()` or `init()` could be translated as follows: + +* `slice.tail()` becomes `slice.shift_first().unwrap().1` or `&slice[1..]` +* `slice.init()` becomes `slice.shift_last().unwrap().1` or + `&slice[..slice.len()-1]` + +It is expected that a lot of code using `tail()` or `init()` is already either +testing `len()` explicitly or using `first()` / `last()` and could be refactored +to use `shift_first()` / `shift_last()` in a more ergonomic fashion. 
As an +example, the following code from typeck: + +```rust +if variant.fields.len() > 0 { + for field in variant.fields.init() { +``` + +can be rewritten as: + +```rust +if let Some((_, init_fields)) = variant.fields.shift_last() { + for field in init_fields { +``` + +And the following code from compiletest: + +```rust +let argv0 = args[0].clone(); +let args_ = args.tail(); +``` + +can be rewritten as: + +```rust +let (argv0, args_) = args.shift_first().unwrap(); +``` + +(the `clone()` ended up being unnecessary). + +# Drawbacks + +The expression `slice.shift_last().unwrap().1` is more cumbersome than +`slice.init()`. However, this is primarily due to the need for `.unwrap()` +rather than the need for `.1`, and would affect the more conservative solution +(of making the return type `Option<&[T]>`) as well. + +# Alternatives + +Only change the return type to `Option` without adding the tuple. This is the +more conservative change mentioned above. It still has the same drawback of +requiring `.unwrap()` when translating existing code. And it's unclear what the +function names should be (the current names are considered suboptimal). + +# Unresolved questions + +Is the name correct? There's precedent in this name in the form of +[`str::slice_shift_char()`][slice_shift_char]. An alternative name might be +`pop_first()`/`pop_last()`, or `shift_front()`/`shift_back()` (although the +usage of `first`/`last` was chosen to match the existing methods `first()` and +`last()`). + +[slice_shift_char]: http://doc.rust-lang.org/nightly/std/primitive.str.html#method.slice_shift_char From cea65b38c9a1d96472e9edda162c4a39d972d828 Mon Sep 17 00:00:00 2001 From: Kevin Ballard Date: Mon, 13 Apr 2015 11:17:34 -0700 Subject: [PATCH 0241/1195] Add split_first/split_last to unresolved questions Add the suggested `split_` prefix as a name option, albeit with the names `split_first()`/`split_last()` instead of the originally-suggested `split_init()`/`split_tail()`. 
Add a question about the return type of `shift_last()`. --- text/0000-slice-tail-redesign.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/text/0000-slice-tail-redesign.md b/text/0000-slice-tail-redesign.md index 56f8454a2e6..86a2ce5eb0d 100644 --- a/text/0000-slice-tail-redesign.md +++ b/text/0000-slice-tail-redesign.md @@ -96,6 +96,10 @@ Is the name correct? There's precedent in this name in the form of [`str::slice_shift_char()`][slice_shift_char]. An alternative name might be `pop_first()`/`pop_last()`, or `shift_front()`/`shift_back()` (although the usage of `first`/`last` was chosen to match the existing methods `first()` and -`last()`). +`last()`). Another option is `split_first()`/`split_last()`. + +Should `shift_last()` return `Option<(&T, &[T])>` or `Option<(&[T], &T)>`? +I believe that the former is correct with this name, but the latter might be +more suitable given the name `split_last()`. [slice_shift_char]: http://doc.rust-lang.org/nightly/std/primitive.str.html#method.slice_shift_char From aa0b8ee6d8dd1275402eb7248897c2048c3f8362 Mon Sep 17 00:00:00 2001 From: Brian Campbell Date: Tue, 14 Apr 2015 00:23:34 -0400 Subject: [PATCH 0242/1195] Move symlink to OS specific modules As described in the RFC, the differences in how creating symlinks works differs sufficiently between Unix and Windows that trying to have a cross-platform API is a compability hazard. --- text/0000-rename-soft-link-to-symlink.md | 112 +++++++++++++++++++---- 1 file changed, 95 insertions(+), 17 deletions(-) diff --git a/text/0000-rename-soft-link-to-symlink.md b/text/0000-rename-soft-link-to-symlink.md index 478d95dbd77..c6932856686 100644 --- a/text/0000-rename-soft-link-to-symlink.md +++ b/text/0000-rename-soft-link-to-symlink.md @@ -5,13 +5,51 @@ # Summary -Rename `std::fs::soft_link` to `std::fs::symlink` and provide a deprecated -`std::fs::soft_link` alias. 
+Rename `std::fs::soft_link` into platform-specific versions: +`std::os::unix::fs::symlink`, `std::os::windows::fs::symlink_file`, and +`std::os::windows::fs::symlink_dir`. # Motivation -At some point in the split up version of rust-lang/rfcs#517, -`std::fs::symlink` was renamed to `sym_link` and then to `soft_link`. +Windows Vista introduced the ability to create symbolic links, in order to +[provide compatibility with applications ported from Unix](https://msdn.microsoft.com/en-us/library/windows/desktop/aa365680%28v=vs.85%29.aspx): + +> Symbolic links are designed to aid in migration and application +> compatibility with UNIX operating systems. Microsoft has implemented its +> symbolic links to function just like UNIX links. + +However, symbolic links on Windows behave differently enough than symbolic +links on Unix family operating systems that you can't, in general, assume that +code that works on one will work on the other. On Unix family operating +systems, a symbolic link may refer to either a directory or a file, and which +one is determined when it is resolved to an actual file. On Windows, you must +specify at the time of creation whether a symbolic link refers to a file or +directory. + +In addition, an arbitrary process on Windows is not allowed to create a +symlink; you need to have [particular privileges][1] in order to be able to do +so; while on Unix, ordinary users can create symlinks, and any additional +security policy (such as [Grsecurity][2]) generally restricts +whether applications follow symlinks, not whether a user can create them. 
+ +[1]: (https://technet.microsoft.com/en-us/library/cc766301%28WS.10%29.aspx) in order to be able to do +[2]: https://en.wikibooks.org/wiki/Grsecurity/Appendix/Grsecurity_and_PaX_Configuration_Options#Linking_restrictions + +Thus, there needs to be a way to distinguish between the two operations on +Windows, but that distinction is meaningless on Unix, and any code that deals +with symlinks on Windows will need to depend on having appropriate privilege +or have some way of obtaining appropriate privilege, which is all quite +platform specific. + +These two facts mean that it is unlikely that arbitrary code dealing with +symbolic links will be portable between Windows and Unix. Rather than trying +to support both under one API, it would be better to provide platform specific +APIs, making it much more clear upon inspection where portability issues may +arise. + +In addition, the current name `soft_link` is fairly non-standard. At some +point in the split up version of rust-lang/rfcs#517, `std::fs::symlink` was +renamed to `sym_link` and then to `soft_link`. The new name is somewhat surprising and can be difficult to find. After a poll of a number of different platforms and languages, every one appears to @@ -44,14 +82,12 @@ of Rust. The name `soft_link` was chosen to be shorter than `symbolic_link`, but without using Unix specific jargon like `symlink`, to not give undue weight to one platform over the other. However, based on the evidence above it doesn't -have any precedent as a formal name for the concept or API. Furthermore, -symbolic links themselves are a conept that were only relatively recently -(2007, with Windows Vista) introduced to Windows, with the [main motivator -being Unix compatibility](https://msdn.microsoft.com/en-us/library/windows/desktop/aa365680%28v=vs.85%29.aspx): +have any precedent as a formal name for the concept or API. -> Symbolic links are designed to aid in migration and application -> compatibility with UNIX operating systems. 
Microsoft has implemented its -> symbolic links to function just like UNIX links. +Furthermore, even on Windows, the name for the [reparse point tag used][3] to +represent symbolic links is `IO_REPARSE_TAG_SYMLINK`. + +[3]: https://msdn.microsoft.com/en-us/library/windows/desktop/aa365511%28v=vs.85%29.aspx If you do a Google search for "[windows symbolic link](https://www.google.com/search?q=windows+symbolic+link&ie=utf-8&oe=utf-8)" or "[windows soft link](https://www.google.com/search?q=windows+soft+link&ie=utf-8&oe=utf-8)", many of the documents you find start using "symlink" after introducing the @@ -60,9 +96,14 @@ among Windows developers and users. # Detailed design -Rename `std::fs::soft_link` to `std::fs::symlink`, and provide a deprecated -`std::fs::soft_link` wrapper for backwards compatibility. Update the -documentaiton to use "symbolic link" in prose, rather than "soft link". +Move `std::fs::soft_link` to `std::os::unix::fs::symlink`, and create +`std::os::windows::fs::symlink_file` and `std::os::windows::fs::symlink_dir` +that call `CreateSymbolicLink` with the appropriate arguments. + +Keep a deprecated compatibility wrapper `std::fs::soft_link` which wraps +`std::os::unix::fs::symlink` or `std::os::windows::fs::symlink_file`, +depending on the platform (as that is the current behavior of +`std::fs::softlink`, to create a file symbolic link). # Drawbacks @@ -71,7 +112,27 @@ around. # Alternatives -Other choices for the name would be: +* Have a cross platform `symlink` and `symlink_dir`, that do the same thing on + Unix but differ on Windows. This has the drawback of invisible + compatibility hazards; code that works on Unix using `symlink` may fail + silently on Windows, as creating the wrong type of symlink may succeed but + it may not be interpreted properly once a destination file of the other type + is created. +* Have a cross platform `symlink` that detects the type of the destination + on Windows. 
This is not always possible as it's valid to create dangling + symbolic links. +* Have `symlink`, `symlink_dir`, and `symlink_file` all cross-platform, where + the first dispatches based on the destination file type, and the latter two + panic if called with the wrong destination file type. Again, this is not + always possible as it's valid to create dangling symbolic links. +* Rather than having two separate functions on Windows, you could have a + separate parameter on Windows to specify the type of link to create; + `symlink("a", "b", FILE_SYMLINK)` vs `symlink("a", "b", DIR_SYMLINK)`. + However, having a `symlink` that had different arity on Unix and Windows + would likely be confusing, and since there are only the two possible + choices, simply having two functions seems like a much simpler solution. + +Other choices for the naming convention would be: * The status quo, `soft_link` * The original proposal from rust-lang/rfcs#517, `sym_link` @@ -83,12 +144,29 @@ Unix. It is a classic compromise, that makes everyone unhappy. `sym_link` is slightly more consistent with the complementary `hard_link` function, and treating "sym link" as two separate words has some precedent in two of the Windows-targetted APIs, Delphi and some of the PowerShell cmdlets -observed. +observed. However, I have not found any other snake case API that uses that, +and only a couple of Windows-specific APIs that use it in camel case; most +usage prefers the single word "symlink" to the two word "sym link" as the +abbreviation. The full name `symbolic_link`, is a bit long and cumbersome compared to most of the rest of the API, but is explicit and is the term used in prose to describe the concept everywhere, so shouldn't emphasize any one platform over -the other. +the other. However, unlike all other operations for creating a file or +directory (`open`, `create`, `create_dir`, etc), it is a noun, not a verb. 
+When used as a verb, it would be called "symbolically link", but that sounds +quite odd in the context of an API: `symbolically_link("a", "b")`. "symlink", +on the other hand, can act as either a noun or a verb. + +It would be possible to prefix any of the forms above that read as a noun with +`create_`, such as `create_symlink`, `create_sym_link`, +`create_symbolic_link`. This adds further to the verbosity, though it is +consistent with `create_dir`; you would probably need to also rename +`hard_link` to `create_hard_link` for consistency, and this seems like a lot +of churn and extra verbosity for not much benefit, as `symlink` and +`hard_link` already act as verbs on their own. If you picked this, then the +Windows versions would need to be named `create_file_symlink` and +`create_dir_symlink` (or the variations with `sym_link` or `symbolic_link`). # Unresolved questions From d509d1ad426a08fb1608ef35c9cb4195c60ae03d Mon Sep 17 00:00:00 2001 From: Brian Campbell Date: Tue, 14 Apr 2015 00:34:56 -0400 Subject: [PATCH 0243/1195] Fix link formatting --- text/0000-rename-soft-link-to-symlink.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-rename-soft-link-to-symlink.md b/text/0000-rename-soft-link-to-symlink.md index c6932856686..88ffb20eebf 100644 --- a/text/0000-rename-soft-link-to-symlink.md +++ b/text/0000-rename-soft-link-to-symlink.md @@ -32,7 +32,7 @@ so; while on Unix, ordinary users can create symlinks, and any additional security policy (such as [Grsecurity][2]) generally restricts whether applications follow symlinks, not whether a user can create them.
-[1]: (https://technet.microsoft.com/en-us/library/cc766301%28WS.10%29.aspx) in order to be able to do +[1]: https://technet.microsoft.com/en-us/library/cc766301%28WS.10%29.aspx [2]: https://en.wikibooks.org/wiki/Grsecurity/Appendix/Grsecurity_and_PaX_Configuration_Options#Linking_restrictions Thus, there needs to be a way to distinguish between the two operations on From 9f1ab34b318f50bf2a25c6fbbcb8dea73e6c43f8 Mon Sep 17 00:00:00 2001 From: Brian Campbell Date: Tue, 14 Apr 2015 12:13:25 -0400 Subject: [PATCH 0244/1195] Fix nits Clarify in the summary that we are deprecating `soft_link` rather than removing it outright, and fix a reference to `softlink`. --- text/0000-rename-soft-link-to-symlink.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-rename-soft-link-to-symlink.md b/text/0000-rename-soft-link-to-symlink.md index 88ffb20eebf..082ccc5785c 100644 --- a/text/0000-rename-soft-link-to-symlink.md +++ b/text/0000-rename-soft-link-to-symlink.md @@ -5,7 +5,7 @@ # Summary -Rename `std::fs::soft_link` into platform-specific versions: +Deprecate `std::fs::soft_link` in favor of platform-specific versions: `std::os::unix::fs::symlink`, `std::os::windows::fs::symlink_file`, and `std::os::windows::fs::symlink_dir`. @@ -103,7 +103,7 @@ that call `CreateSymbolicLink` with the appropriate arguments. Keep a deprecated compatibility wrapper `std::fs::soft_link` which wraps `std::os::unix::fs::symlink` or `std::os::windows::fs::symlink_file`, depending on the platform (as that is the current behavior of -`std::fs::softlink`, to create a file symbolic link). +`std::fs::soft_link`, to create a file symbolic link). 
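A minimal sketch of what calling code might look like under this split, assuming the platform-specific functions exist under the module paths proposed above. The helper names `link_file` and `link_dir` are hypothetical and only illustrate that the caller, not the standard library, decides which kind of link to create on Windows:

```rust
use std::io;
use std::path::Path;

/// Hypothetical helper: link to a file. On Unix the kind of target
/// does not matter at creation time.
#[cfg(unix)]
fn link_file(original: &Path, link: &Path) -> io::Result<()> {
    std::os::unix::fs::symlink(original, link)
}

/// Hypothetical helper: on Windows the caller must commit to a file link.
#[cfg(windows)]
fn link_file(original: &Path, link: &Path) -> io::Result<()> {
    std::os::windows::fs::symlink_file(original, link)
}

/// Hypothetical helper: link to a directory. Same call as `link_file` on Unix.
#[cfg(unix)]
fn link_dir(original: &Path, link: &Path) -> io::Result<()> {
    std::os::unix::fs::symlink(original, link)
}

/// Hypothetical helper: on Windows a directory link is a distinct operation.
#[cfg(windows)]
fn link_dir(original: &Path, link: &Path) -> io::Result<()> {
    std::os::windows::fs::symlink_dir(original, link)
}
```

Keeping the `cfg` attributes in the caller makes the portability boundary visible on inspection, which is the point the motivation argues for. On Windows either call may still fail if the process lacks the required privilege.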
# Drawbacks From 6feacbb96743611eefa41a0f1f79c7474c08d8cc Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 14 Apr 2015 09:21:20 -0700 Subject: [PATCH 0245/1195] Use the term `symlink` instead --- text/0000-io-fs-2.1.md | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/text/0000-io-fs-2.1.md b/text/0000-io-fs-2.1.md index a3d64f69cf1..d4cf3927a31 100644 --- a/text/0000-io-fs-2.1.md +++ b/text/0000-io-fs-2.1.md @@ -19,8 +19,8 @@ these operations are possible in stable Rust today: * Inspecting the unix permission bits on a file * Blanket setting the unix permission bits on a file * Leveraging `DirEntry` for the extra metadata it might contain -* Reading the metadata of a soft link (not what it points at) -* Resolving all soft links in a path +* Reading the metadata of a symlink (not what it points at) +* Resolving all symlinks in a path There is some more functionality listed in the [RFC issue][issue], but this RFC will not attempt to solve the entirety of that issue at this time. This RFC @@ -397,7 +397,7 @@ pub struct FileType(..); impl FileType { pub fn is_dir(&self) -> bool; pub fn is_file(&self) -> bool; - pub fn is_soft_link(&self) -> bool; + pub fn is_symlink(&self) -> bool; } ``` @@ -412,20 +412,20 @@ struct being added. It is proposed that `Metadata` will retain the `is_{file,dir}` convenience methods, but no other "file type testers" will be added.
-## Enhancing soft link support +## Enhancing symlink support Currently the `std::fs` module provides a `soft_link` and `read_link` function, -but there is no method of doing other soft link related tasks such as: +but there is no method of doing other symlink related tasks such as: -* Testing whether a file is a soft link -* Reading the metadata of a soft link, not what it points to +* Testing whether a file is a symlink +* Reading the metadata of a symlink, not what it points to The following APIs will be added to `std::fs`: ```rust /// Returns the metadata of the file pointed to by `p`, and this function, -/// unlike `metadata` will **not** follow soft links. -pub fn soft_link_metadata<P: AsRef<Path>>(p: P) -> io::Result<Metadata>; +/// unlike `metadata` will **not** follow symlinks. +pub fn symlink_metadata<P: AsRef<Path>>(p: P) -> io::Result<Metadata>; ``` ## Binding `realpath` @@ -437,7 +437,7 @@ not bound, and this RFC proposes adding the following API to the `fs` module: ```rust /// Canonicalizes the given file name to an absolute path with all `..`, `.`, -/// and soft link components resolved. +/// and symlink components resolved. /// /// On unix this function corresponds to the return value of the `realpath` /// function, and on Windows this corresponds to the `GetFullPathName` function. @@ -476,7 +476,7 @@ pub trait PathExt { fn is_dir(&self) -> bool; fn is_file(&self) -> bool; fn metadata(&self) -> io::Result<Metadata>; - fn soft_link_metadata(&self) -> io::Result<Metadata>; + fn symlink_metadata(&self) -> io::Result<Metadata>; fn canonicalize(&self) -> io::Result<PathBuf>; fn read_link(&self) -> io::Result<PathBuf>; fn read_dir(&self) -> io::Result<ReadDir>; @@ -497,7 +497,7 @@ The following APIs will be added to `DirEntry`: ```rust impl DirEntry { /// This function will return the filesystem metadata for this directory - /// entry. This is equivalent to calling `fs::soft_link_metadata` on the + /// entry. This is equivalent to calling `fs::symlink_metadata` on the /// path returned.
/// On Windows this function will always return `Ok` and will not issue a From 646d41c544a69987d09c6b23c3144cf03d7564a0 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 14 Apr 2015 09:21:44 -0700 Subject: [PATCH 0246/1195] Calling `file_type` may fail --- text/0000-io-fs-2.1.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-io-fs-2.1.md b/text/0000-io-fs-2.1.md index d4cf3927a31..7062396fae9 100644 --- a/text/0000-io-fs-2.1.md +++ b/text/0000-io-fs-2.1.md @@ -510,7 +510,7 @@ impl DirEntry { /// On some platforms this may not require reading the metadata of the /// underlying file from the filesystem, but on other platforms it may be /// required to do so. - pub fn file_type(&self) -> FileType; + pub fn file_type(&self) -> io::Result<FileType>; /// Returns the file name for this directory entry. pub fn file_name(&self) -> OsString; From d8bc0fe4f99bdc11965e9b4ff5bfd9d8b34c5dda Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 14 Apr 2015 09:22:13 -0700 Subject: [PATCH 0247/1195] Rename CreateDirOptions to DirBuilder --- text/0000-io-fs-2.1.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/text/0000-io-fs-2.1.md b/text/0000-io-fs-2.1.md index 7062396fae9..920564d1d0c 100644 --- a/text/0000-io-fs-2.1.md +++ b/text/0000-io-fs-2.1.md @@ -324,9 +324,9 @@ permission bits on unix or security attributes on Windows. This RFC proposes adding the following API to `std::fs`: ```rust -pub struct CreateDirOptions { ... } +pub struct DirBuilder { ... } -impl CreateDirOptions { +impl DirBuilder { /// Creates a new set of options with default mode/security settings for all /// platforms and also non-recursive. pub fn new() -> Self; @@ -342,18 +342,18 @@ impl CreateDirOptions { } mod os::unix::fs { - pub trait CreateDirOptionsExt { + pub trait DirBuilderExt { fn mode(&mut self, mode: raw::mode_t) -> &mut Self; } - impl CreateDirOptionsExt for CreateDirOptions { ... } + impl DirBuilderExt for DirBuilder { ...
} } mod os::windows::fs { // once a `SECURITY_ATTRIBUTES` abstraction exists, this will be added - pub trait CreateDirOptionsExt { + pub trait DirBuilderExt { fn security_attributes(&mut self, ...) -> &mut Self; } - impl CreateDirOptionsExt for CreateDirOptions { ... } + impl DirBuilderExt for DirBuilder { ... } } ``` @@ -373,7 +373,7 @@ pub fn template<P: AsRef<Path>>(&mut self, path: P) -> &mut Self; ``` At this time, however, it is not proposed to add this method to -`CreateDirOptions`. +`DirBuilder`. ## Adding `FileType` From 19c073ebc2563ee24561d0bd2cf17c34994e4bcb Mon Sep 17 00:00:00 2001 From: Yehuda Katz Date: Tue, 14 Apr 2015 20:25:07 -0700 Subject: [PATCH 0248/1195] Update 0000-duration-reform.md --- text/0000-duration-reform.md | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/text/0000-duration-reform.md b/text/0000-duration-reform.md index 90ee3e93d5c..79d7221a930 100644 --- a/text/0000-duration-reform.md +++ b/text/0000-duration-reform.md @@ -80,7 +80,8 @@ pub struct Duration { impl Duration { /// create a Duration from a number of seconds and an - /// additional nanosecond precision + /// additional nanosecond precision. If nanos is one + /// billion or greater, it carries into secs. pub fn new(secs: u64, nanos: u32) -> Timeout; /// create a Duration from a number of seconds @@ -89,7 +90,7 @@ impl Duration { /// create a Duration from a number of milliseconds pub fn from_millis(millis: u64) -> Timeout; - /// the number of seconds represented by the Timeout + /// the number of seconds represented by the Duration pub fn secs(self) -> u64; /// the number of additional nanosecond precision @@ -97,16 +98,22 @@ impl Duration { } ``` -When `Duration` is used with a system API that expects `u32` milliseconds, the nanosecond precision is dropped, and the time is truncated to `u32::MAX`.
+When `Duration` is used with a system API that expects `u32` milliseconds, the `Duration`'s precision is coarsened to milliseconds, and the number is truncated to `u32::MAX`. + +In general, this RFC assumes that timeout APIs permit spurious updates (see, for example, [pthread_cond_timedwait][pthread_cond_timedwait], "Spurious wakeups from the pthread_cond_timedwait() or pthread_cond_wait() functions may occur"). + +[pthread_cond_timedwait]: http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_cond_timedwait.html `Duration` implements: * `Add`, `Sub`, `Mul`, `Div` which follow the overflow and underflow - rules for `u64` when applied to the `secs` field. Nanoseconds - can never exceed 1 billion or be less than 0, and carry into the - `secs` field. + rules for `u64` when applied to the `secs` field (in particular, + `Sub` will panic if the result would be negative). Nanoseconds + must be less than 1 billion and greater than or equal to 0, and carry + into the `secs` field. * `Display`, which prints a number of seconds, milliseconds and - nanoseconds (if more than 0). + nanoseconds (if more than 0). For example, a `Duration` would be + represented as `"15 seconds, 306 milliseconds, and 13 nanoseconds"` * `Debug`, `Ord` (and `PartialOrd`), `Eq` (and `PartialEq`), `Copy` and `Clone`, which are derived. From 98e41ae10aa9f9face9b6f177a8fb586d2d81732 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 14 Apr 2015 21:32:40 -0700 Subject: [PATCH 0249/1195] RFC 771 is std::iter::{once, empty} --- README.md | 1 + text/{0000-std-iter-once.md => 0771-std-iter-once.md} | 6 +++--- 2 files changed, 4 insertions(+), 3 deletions(-) rename text/{0000-std-iter-once.md => 0771-std-iter-once.md} (91%) diff --git a/README.md b/README.md index ea4adf729cf..3bbef0f4a30 100644 --- a/README.md +++ b/README.md @@ -42,6 +42,7 @@ the direction the language is evolving in.
* [0560-integer-overflow.md](text/0560-integer-overflow.md) * [0639-discriminant-intrinsic.md](text/0639-discriminant-intrinsic.md) * [0769-sound-generic-drop.md](text/0769-sound-generic-drop.md) +* [0771-std-iter-once.md](text/0771-std-iter-once.md) * [0803-type-ascription.md](text/0803-type-ascription.md) * [0809-box-and-in-for-stdlib.md](text/0809-box-and-in-for-stdlib.md) * [0888-compiler-fences.md](text/0888-compiler-fences.md) diff --git a/text/0000-std-iter-once.md b/text/0771-std-iter-once.md similarity index 91% rename from text/0000-std-iter-once.md rename to text/0771-std-iter-once.md index 0c8993ed608..ff205044d08 100644 --- a/text/0000-std-iter-once.md +++ b/text/0771-std-iter-once.md @@ -1,6 +1,6 @@ - Start Date: 2015-1-30 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/771 +- Rust Issue: https://github.com/rust-lang/rust/issues/24443 # Summary @@ -36,7 +36,7 @@ pub fn empty(x: T) -> Empty { } ``` -These wrapper structs exist to allow future backwards-compatible changes, and hide the implementation. +These wrapper structs exist to allow future backwards-compatible changes, and hide the implementation. # Drawbacks From 984b04cf407af37007848407239eb7580f4a8083 Mon Sep 17 00:00:00 2001 From: Peter Atashian Date: Thu, 16 Apr 2015 12:08:04 -0400 Subject: [PATCH 0250/1195] Update RFC Signed-off-by: Peter Atashian --- 0000-stdout-existential-crisis.md | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/0000-stdout-existential-crisis.md b/0000-stdout-existential-crisis.md index 413f679ded4..90bdec92c0b 100644 --- a/0000-stdout-existential-crisis.md +++ b/0000-stdout-existential-crisis.md @@ -9,23 +9,28 @@ When calling `println!` it currently causes a panic if `stdout` does not exist. # Motivation -On linux `stdout` almost always exists, so when people write games and turn off the terminal there is still an `stdout` that they write to. 
Then when getting the code to run on Windows, when the console is disabled, suddenly `stdout` doesn't exist and `println!` panicks. This behavior difference is frustrating to developers trying to move to Windows. +On Linux `stdout` almost always exists, so when people write games and turn off the terminal there is still an `stdout` that they write to. Then when getting the code to run on Windows, when the console is disabled, suddenly `stdout` doesn't exist and `println!` panics. This behavior difference is frustrating to developers trying to move to Windows. -There is also precedent with C and C++. On both Linux and Windows, if `stdout` is closed or doesn't exist, neither platform will error when printing to the console. +There is also precedent with C and C++. On both Linux and Windows, if `stdout` is closed or doesn't exist, neither platform will error when attempting to print to the console. # Detailed design -Change the internal implementation of `println!` `print!` `panic!` and `assert!` to not `panic!` when `stdout` or `stderr` doesn't exist. When getting `stdout` or `stderr` through the `std::io` methods, those versions should continue to return an error if `stdout` or `stderr` doesn't exist. +When using any of the convenience macros that write to either `stdout` or `stderr`, such as `println!` `print!` `panic!` and `assert!`, change the implementation to ignore the specific error of `stdout` or `stderr` not existing. The behavior of all other errors will be unaffected. This can be implemented by redirecting `stdout` and `stderr` to `std::io::sink` if the original handles do not exist. + +Update the methods `std::io::stdin` `std::io::stdout` and `std::io::stderr` (and any raw versions of these) to return a `Result`. If their respective handles do not exist, then return `Err`. # Drawbacks -Hides an error from the user which we may want to expose and may lead to people missing panicks occuring in threads.
+* Hides an error from the user which we may want to expose and may lead to people missing panics occurring in threads. +* Some languages, such as Ruby and Python, do throw an exception when stdout is missing. # Alternatives -* Make `println!` `print!` `panic!` `assert!` return errors that the user has to handle. +* Make `println!` `print!` `panic!` `assert!` return errors that the user has to handle. This would lose a large part of the convenience of these macros. * Continue with the status quo and panic if `stdout` or `stderr` doesn't exist. +* For `std::io::stdin` `std::io::stdout` and `std::io::stderr`, make them return the equivalent of `std::io::empty` or `std::io::sink` if their respective handles don't exist. This leaves people unable to explicitly handle the case of them not existing, but has the advantage of not breaking stable signatures. +** Or they could simply error upon attempting to write to/read from the handles. # Unresolved questions -* Should `std::io::stdout` return `Err` or `None` when there is no `stdout` instead of unconditionally returning `Stdout`? +* Which is better? Breaking the signatures of those three methods in `std::io`, making them silently redirect to `empty`/`sink`, or erroring upon attempting to write to/read from the handle? From ee559f6cfff55a7e8162dca1d8051073c35e8894 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 15 Apr 2015 16:20:20 -0700 Subject: [PATCH 0251/1195] RFC: Alter mem::forget to be safe Alter the signature of the `std::mem::forget` function to remove `unsafe` Explicitly state that it is not considered unsafe behavior to not run destructors.
--- text/0000-safe-mem-forget.md | 124 +++++++++++++++++++++++++++++++++++ 1 file changed, 124 insertions(+) create mode 100644 text/0000-safe-mem-forget.md diff --git a/text/0000-safe-mem-forget.md b/text/0000-safe-mem-forget.md new file mode 100644 index 00000000000..279705d4d1f --- /dev/null +++ b/text/0000-safe-mem-forget.md @@ -0,0 +1,124 @@ +- Feature Name: N/A +- Start Date: 2015-04-15 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Alter the signature of the `std::mem::forget` function to remove `unsafe`. +Explicitly state that it is not considered unsafe behavior to not run +destructors. + +# Motivation + +It was [recently discovered][scoped-bug] by @arielb1 that the `thread::scoped` +API was unsound. To recap, this API previously allowed spawning a child thread +sharing the parent's stack, returning an RAII guard which `join`'d the child +thread when it fell out of scope. The join-on-drop behavior here is critical to +the safety of the API to ensure that the parent does not pop the stack frames +the child is referencing. Put another way, the safety of `thread::scoped` relied +on the fact that the `Drop` implementation for `JoinGuard` was *always* run. + +[scoped-bug]: https://github.com/rust-lang/rust/issues/24292 + +The [underlying issue][forget-bug] for this safety hole was that it is possible +to write a version of `mem::forget` without using `unsafe` code (which drops a +value without running its destructor). This is done by creating a cycle of `Rc` +pointers, leaking the actual contents. It [has been pointed out][dtor-comment] +that `Rc` is not the only vector of leaking contents today as there are +[known][dtor-bug1] [bugs][dtor-bug2] where `panic!` may fail to run +destructors. Furthermore, it has [also been pointed out][drain-bug] that not +running destructors can affect the safety of APIs like `Vec::drain_range` in +addition to `thread::scoped`. 
+ +[forget-bug]: https://github.com/rust-lang/rust/issues/24456 +[dtor-comment]: https://github.com/rust-lang/rust/issues/24292#issuecomment-93505374 +[dtor-bug1]: https://github.com/rust-lang/rust/issues/14875 +[dtor-bug2]: https://github.com/rust-lang/rust/issues/16135 +[drain-bug]: https://github.com/rust-lang/rust/issues/24292#issuecomment-93513451 + +It has never been a guarantee of Rust that destructors for a type will run, and +this aspect was overlooked with the `thread::scoped` API which requires that its +destructor be run! Reconciling these two desires has led to a good deal of +discussion of possible mitigation strategies for various aspects of this +problem. The strategy proposed in this RFC aims to fit uninvasively into the +standard library to avoid large overhauls or destabilizations of APIs. + +# Detailed design + +Primarily, the `unsafe` annotation on the `mem::forget` function will be +removed, allowing it to be called from safe Rust. This transition will be made +possible by stating that destructors **may not run** in all circumstances (from +both the language and library level). The standard library and the primitives it +provides will always attempt to run destructors, but will not provide a +guarantee that destructors will be run. + +It is still likely to be a footgun to call `mem::forget` as memory leaks are +almost always undesirable, but the purpose of the `unsafe` keyword in Rust is to +indicate **memory unsafety** instead of being a general deterrent for "should be +avoided" APIs. Given the premise that types must be written assuming that their +destructor may not run, it is the fault of the type in question if `mem::forget` +would trigger memory unsafety, hence allowing `mem::forget` to be a safe +function.
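To make the premise concrete, here is a small sketch, written against a toolchain where `mem::forget` is callable from safe code as this RFC proposes. It shows two ways a destructor can be skipped without any `unsafe` block: an `Rc` cycle and a direct `mem::forget` call. The types `Noisy` and `Node` and the three functions are hypothetical names chosen for illustration only:

```rust
use std::cell::{Cell, RefCell};
use std::mem;
use std::rc::Rc;

/// Records in a shared flag whether its destructor ran.
struct Noisy {
    dropped: Rc<Cell<bool>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

/// A node that can hold a strong reference to itself.
struct Node {
    _payload: Noisy,
    next: RefCell<Option<Rc<Node>>>,
}

/// Leaks a `Noisy` through an `Rc` cycle; returns whether its destructor ran.
fn leak_via_cycle() -> bool {
    let dropped = Rc::new(Cell::new(false));
    let node = Rc::new(Node {
        _payload: Noisy { dropped: dropped.clone() },
        next: RefCell::new(None),
    });
    // The node now keeps itself alive: the strong count never reaches zero.
    *node.next.borrow_mut() = Some(node.clone());
    drop(node);
    dropped.get()
}

/// Skips the destructor directly; returns whether it ran.
fn leak_via_forget() -> bool {
    let dropped = Rc::new(Cell::new(false));
    mem::forget(Noisy { dropped: dropped.clone() });
    dropped.get()
}

/// Baseline: an ordinary drop does run the destructor.
fn dropped_normally() -> bool {
    let dropped = Rc::new(Cell::new(false));
    drop(Noisy { dropped: dropped.clone() });
    dropped.get()
}
```

Both leaking functions report that the destructor did not run, while the baseline reports that it did. Any type whose memory safety depends on `Drop` running is therefore already exposed through the cycle, independent of how `mem::forget` is marked.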
+ +Note that this modification to `mem::forget` is a breaking change due to the +signature of the function being altered, but it is expected that most code will +not break in practice and this would be an acceptable change to cherry-pick into +the 1.0 release. + +# Drawbacks + +It is clearly a very nice feature of Rust to be able to rely on the fact that a +destructor for a type is always run (e.g. the `thread::scoped` API). Admitting +that destructors may not be run can lead to difficult API decisions later on and +even accidental unsafety. This route, however, is the least invasive for the +standard library and does not require radically changing types like `Rc` or +fast-tracking bug fixes to panicking destructors. + +# Alternatives + +The main alternative to this proposal is to provide the guarantee that a destructor +for a type is always run and that it is memory unsafe to not do so. This would +require a number of pieces to work together: + +* Panicking destructors not running other locals' destructors would [need to be + fixed][dtor-bug1] +* Panics in the elements of containers would [need to be fixed][dtor-bug2] to + continue running other elements' destructors. +* The `Rc` and `Arc` types would need to be reevaluated somehow. One option would + be to statically prevent cycles, and another option would be to disallow types + that are unsafe to leak from being placed in `Rc` and `Arc` (more details + below). +* An audit would need to be performed to ensure that there are no other known + locations of leaks for types. There is likely more than one such location beyond + those listed here which would need to be addressed, and it's also likely that + there would continue to be locations where destructors were not run. + +There has been quite a bit of discussion specifically on the topic of `Rc` and +`Arc` as they may be tricky cases to fix.
Specifically, the compiler could +perform some form of analysis to forbid *all* cycles or just those that +would cause memory unsafety. Unfortunately, forbidding all cycles is likely to +be too limiting for `Rc` to be useful. Forbidding only "bad" cycles, however, is +a more plausible option. + +Another alternative, as proposed by @arielb1, would be [a `Leak` marker +trait][leak] to indicate that a type is "safe to leak". Types like `Rc` would +require that their contents are `Leak`, and the `JoinGuard` type would opt out +of it. This marker trait could work similarly to `Send` where all types are +considered leakable by default, but types could opt out of `Leak`. This +approach, however, requires `Rc` and `Arc` to have a `Leak` bound on their type +parameter which can unfortunately leak into many generic contexts (e.g. +trait objects). Another option would be to treat `Leak` more similarly to +`Sized` where all type parameters have a `Leak` bound by default. This change +may also cause confusion, however, by being unnecessarily restrictive (e.g. all +collections may want to take `T: ?Leak`). + +[leak]: https://github.com/rust-lang/rust/issues/24292#issuecomment-91646130 + +Overall the changes necessary for this strategy are more invasive than admitting +destructors may not run, so this alternative is not proposed in this RFC. + +# Unresolved questions + +Are there remaining APIs in the standard library which rely on destructors being +run for memory safety? From 45249d76c56474a927113d22bcd7a4babe8dffc0 Mon Sep 17 00:00:00 2001 From: Peter Atashian Date: Thu, 16 Apr 2015 12:32:28 -0400 Subject: [PATCH 0252/1195] Swap with alternative.
Signed-off-by: Peter Atashian --- 0000-stdout-existential-crisis.md | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/0000-stdout-existential-crisis.md b/0000-stdout-existential-crisis.md index 90bdec92c0b..3dfeda5d37b 100644 --- a/0000-stdout-existential-crisis.md +++ b/0000-stdout-existential-crisis.md @@ -17,7 +17,10 @@ There is also precedent with C and C++. On both Linux and Windows, if `stdout` i When using any of the convenience macros that write to either `stdout` or `stderr`, such as `println!` `print!` `panic!` and `assert!`, change the implementation to ignore the specific error of `stdout` or `stderr` not existing. The behavior of all other errors will be unaffected. This can be implemented by redirecting `stdout` and `stderr` to `std::io::sink` if the original handles do not exist. -Update the methods `std::io::stdin` `std::io::stdout` and `std::io::stderr` (and any raw versions of these) to return a `Result`. If their respective handles do not exist, then return `Err`. +Update the methods `std::io::stdin` `std::io::stdout` and `std::io::stderr` as follows: +* If `stdout` or `stderr` does not exist, return the equivalent of `std::io::sink`. +* If `stderr` does not exist, return the equivalent of `std::io::empty`. +* For the raw versions, return a `Result`, and if the respective handle does not exist, return an `Err`. # Drawbacks @@ -28,8 +31,8 @@ Update the methods `std::io::stdin` `std::io::stdout` and `std::io::stderr` (and * Make `println!` `print!` `panic!` `assert!` return errors that the user has to handle. This would lose a large part of the convenience of these macros. * Continue with the status quo and panic if `stdout` or `stderr` doesn't exist. -* For `std::io::stdin` `std::io::stdout` and `std::io::stderr`, make them return the equivalent of `std::io::empty` or `std::io::sink` if their respective handles don't exist. 
This leaves people unable to explicitly handle the case of them not existing, but has the advantage of not breaking stable signatures. -** Or they could simply error upon attempting to write to/read from the handles. +* For `std::io::stdin` `std::io::stdout` and `std::io::stderr`, make them return a `Result`. This would be a breaking change to the signature, so if this is desired it should be done immediately before 1.0. +** Alternatively, make the objects returned by these methods error upon attempting to write to/read from them if their respective handle doesn't exist. # Unresolved questions From 55dfb4ea1e8ab19b4be48c890250eecf9995ecb0 Mon Sep 17 00:00:00 2001 From: Peter Atashian Date: Thu, 16 Apr 2015 12:41:36 -0400 Subject: [PATCH 0253/1195] Typo Signed-off-by: Peter Atashian --- 0000-stdout-existential-crisis.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/0000-stdout-existential-crisis.md b/0000-stdout-existential-crisis.md index 3dfeda5d37b..bcc7b648531 100644 --- a/0000-stdout-existential-crisis.md +++ b/0000-stdout-existential-crisis.md @@ -19,7 +19,7 @@ When using any of the convenience macros that write to either `stdout` or `stder Update the methods `std::io::stdin` `std::io::stdout` and `std::io::stderr` as follows: * If `stdout` or `stderr` does not exist, return the equivalent of `std::io::sink`. -* If `stderr` does not exist, return the equivalent of `std::io::empty`. +* If `stdin` does not exist, return the equivalent of `std::io::empty`. * For the raw versions, return a `Result`, and if the respective handle does not exist, return an `Err`. 
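
The fallback rules in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: `writer_or_sink` and `reader_or_empty` are hypothetical helpers, and modelling a missing handle as `None` is an assumption of this sketch, not how the standard library represents it. Only `std::io::sink` and `std::io::empty` come from the proposal itself.

```rust
use std::io::{self, Read, Write};

/// Hypothetical helper: use the real writer when the handle exists,
/// otherwise fall back to `io::sink()`, which discards everything written.
fn writer_or_sink(handle: Option<Box<dyn Write>>) -> Box<dyn Write> {
    match handle {
        Some(writer) => writer,
        None => Box::new(io::sink()),
    }
}

/// Hypothetical helper: use the real reader when the handle exists,
/// otherwise fall back to `io::empty()`, which is always at end-of-file.
fn reader_or_empty(handle: Option<Box<dyn Read>>) -> Box<dyn Read> {
    match handle {
        Some(reader) => reader,
        None => Box::new(io::empty()),
    }
}

fn main() {
    // Simulate a process whose stdout handle does not exist.
    let mut out = writer_or_sink(None);
    out.write_all(b"silently discarded\n").unwrap();

    // Simulate a process whose stdin handle does not exist.
    let mut input = reader_or_empty(None);
    let mut buf = String::new();
    let n = input.read_to_string(&mut buf).unwrap();
    println!("read {} bytes from the missing stdin", n);
}
```

With this shape, callers of the non-raw functions never observe an error for a missing handle, while the raw versions can still surface it as an `Err`.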
# Drawbacks From b4dc3b9344ec45a126467afe92aa3c32210472bb Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Wed, 4 Mar 2015 20:52:08 -0800 Subject: [PATCH 0254/1195] RFC: Scaling Rust's Governance --- text/0000-rust-governance.md | 719 +++++++++++++++++++++++++++++++++++ 1 file changed, 719 insertions(+) create mode 100644 text/0000-rust-governance.md diff --git a/text/0000-rust-governance.md b/text/0000-rust-governance.md new file mode 100644 index 00000000000..fe1e988da37 --- /dev/null +++ b/text/0000-rust-governance.md @@ -0,0 +1,719 @@ +- Feature Name: not applicable +- Start Date: 2015-02-27 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +This RFC proposes to expand, and make more explicit, Rust's governance +structure. It seeks to supplement today's core team with several +*subteams* that are more narrowly focused on specific areas of +interest. + +*Thanks to Nick Cameron, Manish Goregaokar, Yehuda Katz, Niko Matsakis and Dave + Herman for many suggestions and discussions along the way.* + +# Motivation + +Rust's governance has evolved over time, perhaps most dramatically +with the introduction of the RFC system -- which has itself been +tweaked many times. RFCs have been a major boon for improving design +quality and fostering deep, productive discussion. It's something we +all take pride in. + +That said, as Rust has matured, a few growing pains have emerged. + +We'll start with a brief review of today's governance and process, +then discuss what needs to be improved. + +## Background: today's governance structure + +Rust is governed by a +[core team](https://github.com/rust-lang/rust/wiki/Note-core-team), +which is ultimately responsible for all decision-making in the +project. Specifically, the core team: + +* Sets the overall direction and vision for the project; +* Sets the priorities and release schedule; +* Makes final decisions on RFCs. 
+ +The core team currently has 8 members, including some people working +full-time on Rust, some volunteers, and some production users. + +Most technical decisions are decided through the +[RFC process](https://github.com/rust-lang/rfcs#what-the-process-is). +RFCs are submitted for essentially all changes to the language, +most changes to the standard library, and +[a few other topics](https://github.com/rust-lang/rfcs#when-you-need-to-follow-this-process). +RFCs are either closed immediately (if they are clearly not viable), +or else assigned a *shepherd* who is responsible for keeping the +discussion moving and ensuring all concerns are responded to. + +The final decision to accept or reject an RFC is made by the core +team. In many cases this decision follows after many rounds of +consensus-building among all stakeholders for the RFC. In the end, +though, most decisions are about weighting various tradeoffs, and the +job of the core team is to make the final decision about such +weightings in light of the overall direction of the language. + +## What needs improvement + +At a high level, we need to improve: + +* Process scalability. +* Stakeholder involvement. +* Clarity/transparency. +* Moderation processes. + +Below, each of these bullets is expanded into a more detailed analysis +of the problems. These are the problems this RFC is trying to +solve. The "Detailed Design" section then gives the actual proposal. + +### Scalability: RFC process + +In some ways, the RFC process is a victim of its own success: as the +volume and depth of RFCs has increased, it's harder for the entire +core team to stay educated and involved in every RFC. The +[shepherding process](https://github.com/rust-lang/rfcs#the-role-of-the-shepherd) +has helped make sure that RFCs don't fall through the cracks, but even +there it's been hard for the relatively small number of shepherds to +keep up (on top of the other work that they do). 
+ +Part of the problem, of course, is due to the current push toward 1.0, +which has both increased RFC volume and takes up a great deal of +attention from the core team. But after 1.0 is released, the community +is likely to grow significantly, and feature requests will only +increase. + +Growing the core team over time has helped, but there's a practical +limit to the number of people who are jointly making decisions and +setting direction. + +A distinct problem in the other direction has also emerged recently: we've +slowly been requiring RFCs for increasingly minor changes. While it's important +that user-facing changes and commitments be vetted, the process has started to +feel heavyweight (especially for newcomers), so a recalibration may be in order. + +We need a way to scale up the RFC process that: + +* Ensures each RFC is thoroughly reviewed by several people with + interest and expertise in the area, but with different perspectives + and concerns. + +* Ensures each RFC continues moving through the pipeline at a + reasonable pace. + +* Ensures that accepted RFCs are well-aligned with the values, goals, + and direction of the project, and with other RFCs (past, present, + and future). + +* Ensures that simple, uncontentious changes can be made quickly, without undue + process burden. + +### Scalability: areas of focus + +In addition, there are increasingly areas of important work that are +only loosely connected with decisions in the core language or APIs: +tooling, documentation, infrastructure, for example. These areas all +need leadership, but it's not clear that they require the same degree +of global coordination that more "core" areas do. + +These areas are only going to increase in number and importance, so we +should remove obstacles holding them back. + +### Stakeholder involvement + +RFC shepherds are intended to reach out to "stakeholders" in an RFC, +to solicit their feedback. 
But that is different from the stakeholders +having a direct role in decision making. + +To the extent practical, we should include a diverse range of +perspectives in both design and decision-making, and especially +include people who are most directly affected by decisions: users. + +We have taken some steps in this direction by diversifying the core +team itself, but (1) members of the core team by definition need to +take a balanced, global view of things and (2) the core team should +not grow too large. So some other way of including more stakeholders +in decisions would be preferable. + +### Clarity and transparency + +Despite many steps toward increasing the clarity and openness of +Rust's processes, there is still room for improvement: + +* The priorities and values set by the core team are not always + clearly communicated today. This in turn can make the RFC process + seem opaque, since RFCs move along at different speeds (or are even + closed as postponed) according to these priorities. + + At a large scale, there should be more systematic communication + about high-level priorities. It should be clear whether a given RFC + topic would be considered in the near term, long term, or + never. Recent blog posts about the 1.0 release and stabilization + have made a big step in this direction. After 1.0, as part of the + regular release process, we'll want to find some regular cadence for + setting and communicating priorities. + + At a smaller scale, it is still the case that RFCs fall through the + cracks or have unclear statuses (see Scalability problems + above). Clearer, public tracking of the RFC pipeline would be a + significant improvement. + +* The decision-making process can still be opaque: it's not always + clear to an RFC author exactly when and how a decision on the RFC + will be made, and how best to work with the team for a favorable + decision. 
We strive to make core team meetings as *uninteresting* as + possible (that is, all interesting debate should happen in public + online communication), but there is still room for being more + explicit and public. + +### Community norms and the Code of Conduct + +Rust's design process and community norms are closely intertwined. The +RFC process is a joint exploration of design space and tradeoffs, and +requires consensus-building. The process -- and the Rust community -- +is at its best when all participants recognize that + +> ... people have differences of opinion and that every design or +> implementation choice carries a trade-off and numerous costs. There +> is seldom a right answer. + +This and other important values and norms are recorded in the +[project code of conduct (CoC)](http://www.rust-lang.org/conduct.html), +which also includes language about harassment and marginalized groups. + +Rust's community has long upheld a high standard of conduct, and has +earned a reputation for doing so. + +However, as the community grows, as people come and go, we must +continually work to maintain this standard. Usually, it suffices to +lead by example, or to gently explain the kind of mutual respect that +Rust's community practices. Sometimes, though, that's not enough, and +explicit moderation is needed. + +One problem that has emerged with the CoC is the lack of clarity about +the mechanics of moderation: + +* Who is responsible for moderation? +* What about conflicts of interest? Are decision-makers also moderators? +* How are moderation decisions reached? When are they unilateral? +* When does moderation begin, and how quickly should it occur? +* Does moderation take into account past history? +* What venues does moderation apply to? + +Answering these questions, and generally clarifying how the CoC is viewed and +enforced, is an important step toward scaling up the Rust community. 
+ +# Detailed design + +The basic idea is to supplement the core team with several "subteams". Each +subteam is focused on a specific area, e.g., language design or libraries. Most +of the RFC review process will take place within the relevant subteam, scaling +up our ability to make decisions while involving a larger group of people in +that process. + +To ensure global coordination and a strong, coherent vision for the project as a +whole, **each subteam is led by a member of the core team**. + +## Subteams + +**The primary roles of each subteam are**: + +* Shepherding RFCs for the subteam area. As always, that means (1) ensuring that + stakeholders are aware of the RFC, (2) working to tease out various design + tradeoffs and alternatives, and (3) helping build consensus. + +* Accepting or rejecting RFCs in the subteam area. + +* Setting policy on what changes in the subteam area require RFCs, and reviewing + direct PRs for changes that do not require an RFC. + +* Delegating *reviewer rights* for the subteam area. The ability to `r+` is not + limited to team members, and in fact earning `r+` rights is a good stepping + stone toward team membership. Each team should set reviewing policy, manage + reviewing rights, and ensure that reviews take place in a timely manner. + (Thanks to Nick Cameron for this suggestion.) + +Subteams make it possible to involve a larger, more diverse group in the +decision-making process. In particular, **they should involve a mix of**: + +* Rust project leadership, in the form of at least one core team member (the + leader of the subteam). + +* Area experts: people who have a lot of interest and expertise in the subteam + area, but who may be far less engaged with other areas of the project. + +* Stakeholders: people who are strongly affected by decisions in the + subteam area, but who may not be experts in the design or + implementation of that area. 
*It is crucial that some people heavily + using Rust for applications/libraries have a seat at the table, to + make sure we are actually addressing real-world needs.* + +Members should have demonstrated a good sense for design and dealing with +tradeoffs, an ability to work within a framework of consensus, and of course +sufficient knowledge about or experience with the subteam area. Leaders should +in addition have demonstrated exceptional communication, design, and people +skills. They must be able to work with a diverse group of people and help lead +it toward consensus and execution. + +Each subteam is led by a member of the core team. **The leader is responsible for**: + +* Setting up the subteam: + + * Deciding on the initial membership of the subteam (in consultation with + the core team). + + * Working with subteam members to determine and publish subteam policies and + mechanics. + +* Communicating core team vision downward to the subteam. + +* Alerting the core team to subteam RFCs that need global, cross-cutting + attention, and to RFCs that have entered the "final comment period" (see below). + +* Ensuring that RFCs and PRs are progressing at a reasonable rate, re-assigning + shepherds/reviewers as needed. + +* Making final decisions in cases of contentious RFCs that are unable to reach + consensus otherwise (should be rare). + +The way that subteams communicate internally and externally is left to each +subteam to decide, but: + +* Technical discussion should take place as much as possible on public forums, + ideally on RFC/PR threads and tagged discuss posts. + +* Each subteam will have a dedicated discuss forum tag. + +* Subteams should have some kind of regular meeting or other way of making + decisions. + +* Subteams should regularly publish the status of RFCs, PRs, and other news + related to their area. + +## Core team + +**The core team serves as leadership for the Rust project as a whole**. 
In + particular, it: + +* **Sets the overall direction and vision for the project.** That means setting + the core values that are used when making decisions about technical + tradeoffs. It means steering the project toward specific use cases where Rust + can have a major impact. It means leading the discussion, and writing RFCs + for, *major* initiatives in the project. + +* **Sets the priorities and release schedule.** Design bandwidth is limited, and + it's dangerous to try to grow the language too quickly; the core team makes + some difficult decisions about which areas to prioritize for new design, based + on the core values and target use cases. + +* **Focuses on broad, cross-cutting concerns.** The core team is specifically + designed to take a *global* view of the project, to make sure the pieces are + fitting together in a coherent way. + +* **Spins up or shuts down subteams.** Over time, we may want to expand the set + of subteams, and it may make sense to have temporary "strike teams" that focus + on a particular, limited task. + +* **Decides whether/when to ungate a feature.** While the subteams make + decisions on RFCs, the core team is responsible for pulling the trigger that + moves a feature from nightly to stable. This provides an extra check that + features have adequately addressed cross-cutting concerns, that the + implementation quality is high enough, and that language/library commitments + are reasonable. + +The core team should include both the subteam leaders, and, over time, a diverse +set of other stakeholders that are both actively involved in the Rust community, +and can speak to the needs of major Rust constituencies, to ensure that the +project is addressing real-world needs. + +## Decision-making + +### Consensus + +Rust has long used a form of [consensus decision-making][consensus]. 
In a +nutshell the premise is that a successful outcome is not where one side of a +debate has "won", but rather where concerns from *all* sides have been addressed +in some way. **This emphatically does not entail design by committee, nor +compromised design**. Rather, it's a recognition that + +> ... every design or implementation choice carries a trade-off and numerous +> costs. There is seldom a right answer. + +Breakthrough designs sometimes end up changing the playing field by eliminating +tradeoffs altogether, but more often difficult decisions have to be made. **The +key is to have a clear vision and set of values and priorities**, which is the +core team's responsibility to set and communicate, and the subteam's +responsibility to act upon. + +Whenever possible, we seek to reach consensus through discussion and design +revision. Concretely, the steps are: + +* Initial RFC proposed, with initial analysis of tradeoffs. +* Comments reveal additional drawbacks, problems, or tradeoffs. +* RFC revised to address comments, often by improving the design. +* Repeat above until "major objections" are fully addressed, or it's clear that + there is a fundamental choice to be made. + +Consensus is reached when most people are left with only "minor" objections, +i.e., while they might choose the tradeoffs slightly differently they do not +feel a strong need to *actively block* the RFC from progressing. + +One important question is: consensus among which people, exactly? Of course, the +broader the consensus, the better. But at the very least, **consensus within the +members of the subteam should be the norm for most decisions.** If the core team +has done its job of communicating the values and priorities, it should be +possible to fit the debate about the RFC into that framework and reach a fairly +clear outcome. + +[consensus]: http://en.wikipedia.org/wiki/Consensus_decision-making + +### Lack of consensus + +In some cases, though, consensus cannot be reached. 
These cases tend to split +into two very different camps: + +* "Trivial" reasons, e.g., there is not widespread agreement about naming, but + there is consensus about the substance. + +* "Deep" reasons, e.g., the design fundamentally improves one set of concerns at + the expense of another, and people on both sides feel strongly about it. + +In either case, an alternative form of decision-making is needed. + +* For the "trivial" case, usually either the RFC shepherd or subteam leader will + make an executive decision. + +* For the "deep" case, the subteam leader is empowered to make a final decision, + but should consult with the rest of the core team before doing so. + +### How and when RFC decisions are made, and the "final comment period" + +Each RFC has a shepherd drawn from the relevant subteam. The shepherd is +responsible for driving the consensus process -- working with both the RFC +author and the broader community to dig out problems, alternatives, and improved +design, always working to reach broader consensus. + +At some point, the RFC comments will reach a kind of "steady state", where no +new tradeoffs are being discovered, and either objections have been addressed, +or it's clear that the design has fundamental downsides that need to be weighed. + +At that point, the shepherd will announce that the RFC is in a "final comment +period" (which lasts for one week). This is a kind of "last call" for strong +objections to the RFC. **The announcement of the final comment period for an RFC +should be very visible**; it should be included in the subteam's periodic +communications. + +> Note that the final comment period is in part intended to help keep RFCs +> moving. Historically, RFCs sometimes stall out at a point where discussion has +> died down but a decision isn't needed urgently. In this proposed model, the +> RFC author could ask the shepherd to move to the final comment period (and +> hence toward a decision). 
+ +After the final comment period, the subteam can make a decision on the RFC. The +role of the subteam at that point is *not* to reveal any new technical issues or +arguments; if these come up during discussion, they should be added as comments +to the RFC, and it should undergo another final comment period. + +Instead, the subteam decision is based on **weighing the already-revealed +tradeoffs against the project's priorities and values** (which the core team is +responsible for setting, globally). In the end, these decisions are about how to +weight tradeoffs. The decision should be communicated in these terms, pointing +out the tradeoffs that were raised and explaining how they were weighted, and +**never introducing new arguments**. + +## Keeping things lightweight + +In addition to the "final comment period" proposed above, this RFC proposes some +further adjustments to the RFC process to keep it lightweight. + +A key observation is that, thanks to the stability system and nightly/stable +distinction, **it's easy to experiment with features without commitment**. + +### Clarifying what needs an RFC + +Over time, we've been drifting toward requiring an RFC for essentially any +user-facing change, which sometimes means that very minor changes get stuck +awaiting an RFC decision. While subteams + final comment period should help keep +the pipeline flowing a bit better, it would also be good to allow "minor" +changes to go through without an RFC, provided there is sufficient review in +some other way. (And in the end, the core team ungates features, which ensures +at least a final review.) + +This RFC does not attempt to answer the question "What needs an RFC", because +that question will vary for each subteam. However, this RFC stipulates that each +subteam should set an explicit policy about: + +1. What requires an RFC for the subteam's area, and +2. What the non-RFC review process is. 
+ +These guidelines should try to keep the process lightweight for minor changes. + +### Clarifying the "finality" of RFCs + +While RFCs are very important, they do not represent the final state of a +design. Often new issues or improvements arise during implementation, or after +gaining some experience with a feature. **The nightly/stable distinction exists +in part to allow for such design iteration.** + +Thus RFCs do not need to be "perfect" before acceptance. If consensus is reached +on major points, the minor details can be left to implementation and revision. + +Later, if an implementation differs from the RFC in *substantial* ways, the +subteam should be alerted, and may ask for an explicit amendment RFC. Otherwise, +the changes should just be explained in the commit/PR. + +## The teams + +With all of that out of the way, what subteams should we start with? This RFC +proposes the following initial set: + +* Language design +* Libraries +* Compiler +* Tooling and infrastructure +* Moderation + +In the long run, we will likely also want teams for documentation and for +community events, but these can be spun up once there is a more clear need (and +available resources). + +### Language design team + +Focuses on the *design* of language-level features; not all team members need to +have extensive implementation experience. + +Some example RFCs that fall into this area: + +* [Associated types and multidispatch](https://github.com/rust-lang/rfcs/pull/195) +* [DST coercions](https://github.com/rust-lang/rfcs/pull/982) +* [Trait-based exception handling](https://github.com/rust-lang/rfcs/pull/243) +* [Rebalancing coherence](https://github.com/rust-lang/rfcs/pull/1023) +* [Integer overflow](https://github.com/rust-lang/rfcs/pull/560) (this has high + overlap with the library subteam) +* [Sound generic drop](https://github.com/rust-lang/rfcs/pull/769) + +### Library team + +Oversees both `std` and, ultimately, other crates in the `rust-lang` github +organization. 
The focus up to this point has been the standard library, but we +will want "official" libraries that aren't quite `std` territory but are still +vital for Rust. (The precise plan here, as well as the long-term plan for `std`, +is one of the first important areas of debate for the subteam.) Also includes +API conventions. + +Some example RFCs that fall into this area: + +* [Collections reform](https://github.com/rust-lang/rfcs/pull/235) +* [IO reform](https://github.com/rust-lang/rfcs/pull/517/) +* [Debug improvements](https://github.com/rust-lang/rfcs/pull/640) +* [Simplifying std::hash](https://github.com/rust-lang/rfcs/pull/823) +* [Conventions for ownership variants](https://github.com/rust-lang/rfcs/pull/199) + +### Compiler team + +Focuses on compiler internals, including implementation of language +features. This broad category includes work in codegen, factoring of compiler +data structures, type inference, borrowck, and so on. + +There is a more limited set of example RFCs for this subteam, in part because we +haven't generally required RFCs for this kind of internals work, but here are two: + +* [Non-zeroing dynamic drops](https://github.com/rust-lang/rfcs/pull/320) (this + has high overlap with language design) +* [Incremental compilation](https://github.com/rust-lang/rfcs/pull/594) + +### Tooling and infrastructure team + +Even more broad is the "tooling" subteam, which at inception is planned to +encompass every "official" (rust-lang managed) non-`rustc` tool: + +* rustdoc +* rustfmt +* Cargo +* crates.io +* CI infrastructure +* Debugging tools +* Profiling tools +* Editor/IDE integration +* Refactoring tools + +It's not presently clear exactly what tools will end up under this umbrella, nor +which should be prioritized. + +### Moderation team + +Finally, the moderation team is responsible for dealing with CoC violations. + +One key difference from the other subteams is that the moderation team does not +have a leader. 
Its members are chosen directly by the core team, and should be +community members who have demonstrated the highest standard of discourse and +maturity. To limit conflicts of interest, **the moderation subteam should not +include any core team members**. However, the subteam is free to consult with +the core team as it deems appropriate. + +The moderation team will have a public email address that can be used to raise +complaints about CoC violations (forwards to all active moderators). + +#### Initial plan for moderation + +What follows is an initial proposal for the mechanics of moderation. The +moderation subteam may choose to revise this proposal by drafting an RFC, which +will be approved by the core team. + +Moderation begins whenever a moderator becomes aware of a CoC problem, either +through a complaint or by observing it directly. In general, the enforcement +steps are as follows: + +> **These steps are adapted from text written by Manish Goregaokar, who helped +articulate them from experience as a Stack Exchange moderator.** + +* Except for extreme cases (see below), try first to address the problem with a + light public comment on thread, aimed to de-escalate the situation. These + comments should strive for as much empathy as possible. Moderators should + emphasize that dissenting opinions are valued, and strive to ensure that the + technical points are heard even as they work to cool things down. + + When a discussion has just gotten a bit heated, the comment can just be a + reminder to be respectful and that there is rarely a clear "right" answer. In + cases that are more clearly over the line into personal attacks, it can + directly call out a problematic comment. + +* If the problem persists on thread, or if a particular person repeatedly comes + close to or steps over the line of a CoC violation, moderators then email the + offender privately. The message should include relevant portions of the CoC + together with the offending comments. 
Again, the goal is to de-escalate, and + the email should be written in a dispassionate and empathetic way. However, + the message should also make clear that continued violations may result in a + ban. + +* If problems still persist, the moderators can ban the offender. Banning should + occur for progressively longer periods, for example starting at 1 day, then 1 + week, then permanent. The moderation subteam will determine the precise + guidelines here. + +In general, moderators can and should unilaterally take the first step, but +steps beyond that (particularly banning) should be done via consensus with the +other moderators. Permanent bans require core team approval. + +Some situations call for more immediate, drastic measures: deeply inappropriate +comments, harassment, or comments that make people feel unsafe. (See the +[code of conduct](http://www.rust-lang.org/conduct.html) for some more details +about this kind of comment). In these cases, an individual moderator is free to +take immediate, unilateral steps including redacting or removing comments, or +instituting a short-term ban until the subteam can convene to deal with the +situation. + +The moderation team is responsible for interpreting the CoC. Drastic measures +like bans should only be used in cases of clear, repeated violations. + +Moderators themselves are held to a very high standard of behavior, and should +strive for professional and impersonal interactions when dealing with a CoC +violation. They should always push to *de-escalate*. And they should recuse +themselves from moderation in threads where they are actively participating in +the technical debate or otherwise have a conflict of interest. Moderators who +fail to keep up this standard, or who abuse the moderation process, may be +removed by the core team. + +Subteam members, and especially core team members, are *also* held to a high standard of +behavior.
Part of the reason to separate the moderation subteam is to ensure +that CoC violations by Rust's leadership be addressed through the same +independent body of moderators. + +Moderation covers all rust-lang venues, which currently include github +repos, IRC channels (#rust, #rust-internals, #rustc, #rust-libs), and +the two discourse forums. (The subreddit already has its own +moderation structure, and isn't directly associated with the rust-lang +organization.) + +# Drawbacks + +One possibility is that decentralized decisions may lead to a lack of coherence +in the overall design of Rust. However, the existence of the core team -- and +the fact that subteam leaders will thus remain in close communication on +cross-cutting concerns in particular -- serves to greatly mitigate that risk. + +As with any change to governance, there is risk that this RFC would harm +processes that are working well. In particular, bringing on a large number of +new people into official decision-making roles carries a risk of culture clash +or problems with consensus-building. + +By setting up this change as a relatively slow build-out from the current core +team, some of this risk is mitigated: it's not a radical restructuring, but +rather a refinement of the current process. In particular, today core team +members routinely seek input directly from other community members who would be +likely subteam members; in some ways, this RFC just makes that process more +official. + +For the moderation subteam, there is a significant shift toward strong +enforcement of the CoC, and with that a risk of *over*-application: the goal is +to make discourse safe and productive, not to introduce fear of violating the +CoC. The moderation guidelines, careful selection of moderators, and ability to +withdraw moderators mitigate this risk. + +# Alternatives + +There are numerous other forms of open-source governance out there, far more +than we can list or detail here. 
And in any case, this RFC is intended as an +expansion of Rust's existing governance to address a few scaling problems, +rather than a complete rethink. + +[Mozilla's module system][module] was a partial inspiration for this RFC. The +proposal here can be seen as an evolution of the module system where the subteam +leaders (module owners) are integrated into an explicit core team, providing for +tighter intercommunication and a more unified sense of vision and purpose. +Alternatively, the proposal is an evolution of the current core team structure +to include subteams. + +One seemingly minor but actually important aspect is *naming*: + +* The name "subteam" (from [jQuery][jq]) felt like a better fit than "module" both +to avoid confusion (having two different kinds of modules associated with +Mozilla seems problematic) and because it emphasizes the more unified nature of +this setup. + +* The term "leader" was chosen to reflect that there is a vision for each subteam +(as part of the larger vision for Rust), which the leader is responsible for +moving the subteam toward. Notably, this is how "module owner" is actually +defined in Mozilla's module system: + + > A "module owner" is the person to whom leadership of a module's work has been + > delegated. + +* The term "team member" is just following standard parlance. It could be +replaced by something like "peer" (following the module system tradition), or +some other term that is less bland than "member". Ideally, the term would +highlight the significant stature of team membership: being part of the +decision-making group for a substantial area of the Rust project. + +[module]: https://wiki.mozilla.org/Modules +[jq]: https://jquery.org/team/ +[mom]: https://wiki.mozilla.org/Modules/Activities#Module_Ownership_System + +# Unresolved questions + +## Subteams + +This RFC purposefully leaves several subteam-level questions open: + +* What is the exact venue and cadence for subteam decision-making?
+* Do subteams have dedicated IRC channels or other forums? (This RFC stipulates + only dedicated discourse tags.) +* How large is each subteam? +* What are the policies for when RFCs are required, or when PRs may be reviewed + directly? + +These questions are left to be addressed by subteams after their formation, in +part because good answers will likely require some iterations to discover. + +## Broader questions + +There are many other questions that this RFC doesn't seek to address, and this +is largely intentional. For one, it avoids trying to set out too much structure +in advance, making it easier to iterate on the mechanics of subteams. In +addition, there is a danger of *too much* policy and process, especially given +that this RFC aims to improve the scalability of decision-making. It should +be clear that this RFC is not the last word on governance, and over time we will +probably want to grow more explicit policies in other areas -- but a +lightweight, iterative approach seems the best way to get there. From cb57978eb6c3c13cdb5fd69267ab7af7072bae2c Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 17 Apr 2015 14:03:45 -0700 Subject: [PATCH 0255/1195] Update links, clarify a few questions --- text/0000-rust-governance.md | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/text/0000-rust-governance.md b/text/0000-rust-governance.md index fe1e988da37..0757ffeacf6 100644 --- a/text/0000-rust-governance.md +++ b/text/0000-rust-governance.md @@ -29,7 +29,7 @@ then discuss what needs to be improved. ## Background: today's governance structure Rust is governed by a -[core team](https://github.com/rust-lang/rust/wiki/Note-core-team), +[core team](https://github.com/rust-lang/rust-wiki-backup/blob/master/Note-core-team.md), which is ultimately responsible for all decision-making in the project. Specifically, the core team: @@ -262,10 +262,11 @@ Each subteam is led by a member of the core team. **The leader is responsible fo
**The leader is responsible fo * Setting up the subteam: * Deciding on the initial membership of the subteam (in consultation with - the core team). + the core team). Once the subteam is up and running. * Working with subteam members to determine and publish subteam policies and - mechanics. + mechanics, including the way that subteam members join or leave the team + (which should be based on subteam consensus). * Communicating core team vision downward to the subteam. @@ -284,13 +285,15 @@ subteam to decide, but: * Technical discussion should take place as much as possible on public forums, ideally on RFC/PR threads and tagged discuss posts. -* Each subteam will have a dedicated discuss forum tag. +* Each subteam will have a dedicated + [discuss forum](http://internals.rust-lang.org/) tag. * Subteams should have some kind of regular meeting or other way of making decisions. * Subteams should regularly publish the status of RFCs, PRs, and other news - related to their area. + related to their area. Ideally, this would be done in part via a dashboard + like [the Homu queue](http://buildbot.rust-lang.org/homu/queue/rust) ## Core team From c3f9fda5f1c2f284760c7bd150801eeeb2373708 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 17 Apr 2015 14:12:34 -0700 Subject: [PATCH 0256/1195] Clarify that subteams seek out input from other stakeholders, and elaborate on how decisions are communicated. --- text/0000-rust-governance.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/text/0000-rust-governance.md b/text/0000-rust-governance.md index 0757ffeacf6..9c15d540148 100644 --- a/text/0000-rust-governance.md +++ b/text/0000-rust-governance.md @@ -288,8 +288,14 @@ subteam to decide, but: * Each subteam will have a dedicated [discuss forum](http://internals.rust-lang.org/) tag. +* Subteams should actively seek out discussion and input from stakeholders who + are not members of the team. 
+ * Subteams should have some kind of regular meeting or other way of making - decisions. + decisions. The content of this meeting should be summarized with the rationale + for each decision -- and, as explained below, decisions should generally be + about weighing a set of already-known tradeoffs, not discussing or + discovering new rationale. * Subteams should regularly publish the status of RFCs, PRs, and other news related to their area. Ideally, this would be done in part via a dashboard From d3ba6ab4902b51e5d30421edda85fecef5d5abc4 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 17 Apr 2015 14:16:13 -0700 Subject: [PATCH 0257/1195] RFC 1030 is a few prelude additions --- text/{0000-prelude-additions.md => 1030-prelude-additions.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-prelude-additions.md => 1030-prelude-additions.md} (91%) diff --git a/text/0000-prelude-additions.md b/text/1030-prelude-additions.md similarity index 91% rename from text/0000-prelude-additions.md rename to text/1030-prelude-additions.md index 33caaa9a0a8..dc05c6933c5 100644 --- a/text/0000-prelude-additions.md +++ b/text/1030-prelude-additions.md @@ -1,7 +1,7 @@ - Feature Name: NA - Start Date: 2015-04-03 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1030](https://github.com/rust-lang/rfcs/pull/1030) +- Rust Issue: [rust-lang/rust#24538](https://github.com/rust-lang/rust/issues/24538) # Summary From 537849ff6a88cb59e14014a3aa7cf2b35fc6273e Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 17 Apr 2015 16:43:54 -0700 Subject: [PATCH 0258/1195] RFC 1028 is renaming fs::soft_link --- ...nk-to-symlink.md => 1048-rename-soft-link-to-symlink.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-rename-soft-link-to-symlink.md => 1048-rename-soft-link-to-symlink.md} (97%) diff --git a/text/0000-rename-soft-link-to-symlink.md b/text/1048-rename-soft-link-to-symlink.md similarity
index 97% rename from text/0000-rename-soft-link-to-symlink.md rename to text/1048-rename-soft-link-to-symlink.md index 082ccc5785c..a05d54c4c00 100644 --- a/text/0000-rename-soft-link-to-symlink.md +++ b/text/1048-rename-soft-link-to-symlink.md @@ -1,7 +1,7 @@ -- Feature Name: rename_soft_link_to_symlink +- Feature Name: `rename_soft_link_to_symlink` - Start Date: 2015-04-09 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1048](https://github.com/rust-lang/rfcs/pull/1048) +- Rust Issue: [rust-lang/rust#24222](https://github.com/rust-lang/rust/pull/24222) # Summary From 251de1766757ce512662b0951878d31c55963182 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 17 Apr 2015 16:50:02 -0700 Subject: [PATCH 0259/1195] RFC 1054 is renaming str::words --- text/{0000-str-words.md => 1054-str-words.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-str-words.md => 1054-str-words.md} (94%) diff --git a/text/0000-str-words.md b/text/1054-str-words.md similarity index 94% rename from text/0000-str-words.md rename to text/1054-str-words.md index 04bc7875220..abfc3efee8d 100644 --- a/text/0000-str-words.md +++ b/text/1054-str-words.md @@ -1,7 +1,7 @@ - Feature Name: str-words - Start Date: 2015-04-10 -- RFC PR: -- Rust Issue: +- RFC PR: [rust-lang/rfcs#1054](https://github.com/rust-lang/rfcs/pull/1054) +- Rust Issue: [rust-lang/rust#24543](https://github.com/rust-lang/rust/issues/24543) # Summary From b75b8c781ba592c13b3785c251947acc9ed5eac0 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 17 Apr 2015 16:53:26 -0700 Subject: [PATCH 0260/1195] RFC 1057 is adding Sync to io::Error --- text/{0000-io-error-sync.md => 1057-io-error-sync.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-io-error-sync.md => 1057-io-error-sync.md} (95%) diff --git a/text/0000-io-error-sync.md b/text/1057-io-error-sync.md similarity index 95% rename from text/0000-io-error-sync.md rename to 
text/1057-io-error-sync.md index bad5814e1df..8e173b5c029 100644 --- a/text/0000-io-error-sync.md +++ b/text/1057-io-error-sync.md @@ -1,7 +1,7 @@ - Feature Name: `io_error_sync` - Start Date: 2015-04-11 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1057](https://github.com/rust-lang/rfcs/pull/1057) +- Rust Issue: [rust-lang/rust#24133](https://github.com/rust-lang/rust/pull/24133) # Summary From bbceaa3fed9d1d4df7650c8d9f1e35104386ad8d Mon Sep 17 00:00:00 2001 From: Ariel Ben-Yehuda Date: Mon, 20 Apr 2015 22:41:43 +0300 Subject: [PATCH 0261/1195] Clarify cast rules, especially regarding fat pointers. --- text/0401-coercions.md | 32 ++++++++++++++++++-------------- 1 file changed, 18 insertions(+), 14 deletions(-) diff --git a/text/0401-coercions.md b/text/0401-coercions.md index 297f36d7809..677e8cfc8e3 100755 --- a/text/0401-coercions.md +++ b/text/0401-coercions.md @@ -319,20 +319,24 @@ descriptions are equivalent. Casting is indicated by the `as` keyword. A cast `e as U` is valid if one of the following holds: -* `e` has type `T` and `T` coerces to `U`; - -* `e` has type `*T` and `U` is `*U_0` (i.e., between any raw pointers); - -* `e` has type `*T` and `U` is `uint` , or vice versa; - -* `e` has type `T` and `T` and `U` are any numeric types; - -* `e` is a C-like enum and `U` is any integer type, `bool`; - -* `e` has type `T` and `T == u8` and `U == char`; - -* `e` has type `T` and `T == &[V, ..n]` or `T == &V` and `U == *const V`, and - similarly for the mutable variants to either `*const V` or `*mut V`. 
+* `e` has type `T` and `T` coerces to `U`; *coercion-cast* +* `e` has type `*T`, `U` is `*U_0`, and either `U_0: Sized` or + unsize_kind(`T`) = unsize_kind(`U_0`); *ptr-ptr-cast* +* `e` has type `*T` and `U` is a numeric type, while `T: Sized`; *ptr-addr-cast* +* `e` has type `usize` and `U` is `*U_0`, while `U_0: Sized`; *addr-ptr-cast* +* `e` has type `T` and `T` and `U` are any numeric types; *numeric-cast* +* `e` is a C-like enum and `U` is an integer type or `bool`; *enum-cast* +* `e` has type `bool` and `U` is an integer; *bool-cast* +* `e` has type `u8` and `U` is `char`; *u8-char-cast* +* `e` has type `&.[T; n]` and `U` is `*T`, and `e` is a mutable + reference if `U` is. *array-ptr-cast* +* `e` is a function pointer type and `U` has type `*T`, + while `T: Sized`; *fptr-ptr-cast* +* `e` is a function pointer type and `U` is an integer; *fptr-addr-cast* + +where `&.T` and `*T` are references of either mutability, +and where unsize_kind(`T`) is the kind of the unsize info +in `T` - a vtable or a length (or `()` if `T: Sized`). Casting is not transitive, that is, even if `e as U1 as U2` is a valid expression, `e as U2` is not necessarily so (in fact it will only be valid if From 6feddeaef12285939634b46bdc6b83614b8b907f Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 24 Apr 2015 16:20:56 -0700 Subject: [PATCH 0262/1195] Update with getters, clarified socket options, and alternative for Option --- text/0000-socket-timeouts.md | 31 ++++++++++++++++++++++++++++--- 1 file changed, 28 insertions(+), 3 deletions(-) diff --git a/text/0000-socket-timeouts.md b/text/0000-socket-timeouts.md index 8392670a146..991e420711b 100644 --- a/text/0000-socket-timeouts.md +++ b/text/0000-socket-timeouts.md @@ -31,18 +31,27 @@ expose functionality like `set_nodelay`: ```rust impl TcpStream { pub fn set_read_timeout(&self, dur: Duration) -> io::Result<()> { ... } + pub fn read_timeout(&self) -> Duration; + pub fn set_write_timeout(&self, dur: Duration) -> io::Result<()> { ... 
} + pub fn write_timeout(&self) -> Duration; } impl UdpSocket { pub fn set_read_timeout(&self, dur: Duration) -> io::Result<()> { ... } + pub fn read_timeout(&self) -> Duration; + pub fn set_write_timeout(&self, dur: Duration) -> io::Result<()> { ... } + pub fn write_timeout(&self) -> Duration; } ``` -These methods take an amount of time in the form of a `Duration`, +The setter methods take an amount of time in the form of a `Duration`, which is [undergoing stabilization][duration-reform]. They are -implemented via straightforward calls to `setsockopt`. +implemented via straightforward calls to `setsockopt`. A `Duration` of +zero represents *no timeout*. + +The corresponding socket options are `SO_RCVTIMEO` and `SO_SNDTIMEO`. # Drawbacks @@ -62,6 +71,22 @@ enough in practice to justify a departure from the traditional API. # Alternatives +## `Option` + +It's a bit unfortunate -- and rather un-Rustic -- to special case a +zero duration as "no timeout". + +An alternative would be to use `Option`, and, on +`Some(Duration::zero())` yield an invalid input error. That would +force more clarity about intent and help guard against accidental +disabling of timeouts for arithmetic reasons. + +On the other hand, it may have the risk of *introducing* bugs for +those who expect a different semantics -- though the use of option +types will likely serve as a sufficient heads-up. 
+ +## Wrapping for compositionality + A different approach would be to *wrap* socket types with a "timeout modifier", which would be responsible for setting and resetting the timeouts: @@ -69,7 +94,7 @@ timeouts: ```rust struct WithTimeout { timeout: Duration, - innter: T + inner: T } impl WithTimeout { From 306d9399bbeb5bc71eceb3b29921b68862ed2975 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 24 Apr 2015 16:42:17 -0700 Subject: [PATCH 0263/1195] RFC 1044 is Expand the scope of std::fs --- README.md | 1 + text/{0000-io-fs-2.1.md => 1044-io-fs-2.1.md} | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) rename text/{0000-io-fs-2.1.md => 1044-io-fs-2.1.md} (99%) diff --git a/README.md b/README.md index 3bbef0f4a30..6d9f4d84a1a 100644 --- a/README.md +++ b/README.md @@ -52,6 +52,7 @@ the direction the language is evolving in. * [0979-align-splitn-with-other-languages.md](text/0979-align-splitn-with-other-languages.md) * [1011-process.exit.md](text/1011-process.exit.md) * [1023-rebalancing-coherence.md](text/1023-rebalancing-coherence.md) +* [1044-io-fs-2.1.md](text/1044-io-fs-2.1.md) ## Table of Contents [Table of Contents]: #table-of-contents diff --git a/text/0000-io-fs-2.1.md b/text/1044-io-fs-2.1.md similarity index 99% rename from text/0000-io-fs-2.1.md rename to text/1044-io-fs-2.1.md index 920564d1d0c..d7c49ea226c 100644 --- a/text/0000-io-fs-2.1.md +++ b/text/1044-io-fs-2.1.md @@ -1,7 +1,7 @@ - Feature Name: `fs2` - Start Date: 2015-04-04 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1044 +- Rust Issue: https://github.com/rust-lang/rust/issues/24796 # Summary From 5c10787dc1690e292c7fb8eaaba32055a5f360b3 Mon Sep 17 00:00:00 2001 From: Andrew Chin Date: Sat, 25 Apr 2015 18:20:26 -0400 Subject: [PATCH 0264/1195] Fixed broken link in README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 6d9f4d84a1a..d441dcf43d4 100644 
--- a/README.md +++ b/README.md @@ -45,7 +45,7 @@ the direction the language is evolving in. * [0771-std-iter-once.md](text/0771-std-iter-once.md) * [0803-type-ascription.md](text/0803-type-ascription.md) * [0809-box-and-in-for-stdlib.md](text/0809-box-and-in-for-stdlib.md) -* [0888-compiler-fences.md](text/0888-compiler-fences.md) +* [0888-compiler-fence-intrinsics.md](text/0888-compiler-fence-intrinsics.md) * [0909-move-thread-local-to-std-thread.md](text/0909-move-thread-local-to-std-thread.md) * [0911-const-fn.md](text/0911-const-fn.md) * [0968-closure-return-type-syntax.md](text/0968-closure-return-type-syntax.md) From 47d38bc668d9eb57739ff2aaa85a6b776e77be88 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Mon, 27 Apr 2015 15:33:34 -0700 Subject: [PATCH 0265/1195] RFC 1040 is: Duration Reform --- README.md | 1 + ...0000-duration-reform.md => 1040-duration-reform.md} | 10 +++++----- 2 files changed, 6 insertions(+), 5 deletions(-) rename text/{0000-duration-reform.md => 1040-duration-reform.md} (98%) diff --git a/README.md b/README.md index d441dcf43d4..eb65c3da4a5 100644 --- a/README.md +++ b/README.md @@ -52,6 +52,7 @@ the direction the language is evolving in. 
* [0979-align-splitn-with-other-languages.md](text/0979-align-splitn-with-other-languages.md) * [1011-process.exit.md](text/1011-process.exit.md) * [1023-rebalancing-coherence.md](text/1023-rebalancing-coherence.md) +* [1040-duration-reform.md](text/1040/duration-reform.md) * [1044-io-fs-2.1.md](text/1044-io-fs-2.1.md) ## Table of Contents diff --git a/text/0000-duration-reform.md b/text/1040-duration-reform.md similarity index 98% rename from text/0000-duration-reform.md rename to text/1040-duration-reform.md index 79d7221a930..a8fe0bbe7c7 100644 --- a/text/0000-duration-reform.md +++ b/text/1040-duration-reform.md @@ -1,7 +1,7 @@ -- Feature Name: Duration Reform +- Feature Name: duration - Start Date: 2015-03-24 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1040 +- Rust Issue: https://github.com/rust-lang/rust/issues/24874 # Summary @@ -109,10 +109,10 @@ In general, this RFC assumes that timeout APIs permit spurious updates (see, for * `Add`, `Sub`, `Mul`, `Div` which follow the overflow and underflow rules for `u64` when applied to the `secs` field (in particular, `Sub` will panic if the result would be negative). Nanoseconds - must be less than 1 billion and great than or equal to 0, and carry + must be less than 1 billion and great than or equal to 0, and carry into the `secs` field. * `Display`, which prints a number of seconds, milliseconds and - nanoseconds (if more than 0). For example, a `Duration` would be + nanoseconds (if more than 0). For example, a `Duration` would be represented as `"15 seconds, 306 milliseconds, and 13 nanoseconds"` * `Debug`, `Ord` (and `PartialOrd`), `Eq` (and `PartialEq`), `Copy` and `Clone`, which are derived. 
From c385c9b75f2abc085044fde577a0b17f8a53b7a7 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Tue, 28 Apr 2015 18:09:09 -0400 Subject: [PATCH 0266/1195] remove static_assert --- 0000-remove-static-assert.md | 73 ++++++++++++++++++++++++++++++++++++ 1 file changed, 73 insertions(+) create mode 100644 0000-remove-static-assert.md diff --git a/0000-remove-static-assert.md b/0000-remove-static-assert.md new file mode 100644 index 00000000000..56778da0e6d --- /dev/null +++ b/0000-remove-static-assert.md @@ -0,0 +1,73 @@ +- Feature Name: remove-static-assert +- Start Date: 2015-04-28 +- RFC PR: +- Rust Issue: https://github.com/rust-lang/rust/pull/24910 + +# Summary + +Remove the `static_assert` feature. + +# Motivation + +To recap, `static_assert` looks like this: + +```rust +#![feature(static_assert)] +#[static_assert] +static assertion: bool = true; +``` + +If `assertion` is `false` instead, this fails to compile: + +```text +error: static assertion failed +static assertion: bool = false; + ^~~~~ +``` + +If you don’t have the `feature` flag, you get another interesting error: + +```text +error: `#[static_assert]` is an experimental feature, and has a poor API +``` + +Throughout its life, `static_assert` has been... weird. Graydon suggested it +[in May of 2013][suggest], and it was +[implemented](https://github.com/rust-lang/rust/pull/6670) shortly after. +[Another issue][issue] was created to give it a ‘better interface’. Here’s why: + +> The biggest problem with it is you need a static variable with a name, that +> goes through trans and ends up in the object file. + +In other words, `assertion` above ends up as a symbol in the final output. Not +something you’d usually expect from some kind of static assertion. + +[suggest]: https://github.com/rust-lang/rust/issues/6568 +[issue]: https://github.com/rust-lang/rust/issues/6676 + +So why not improve `static_assert`?
With compile time function evaluation, the +idea of a ‘static assertion’ doesn’t need to have language semantics. Either +`const` functions or full-blown CTFE is a useful feature in its own right that +we’ve said we want in Rust. In light of it being eventually added, +`static_assert` doesn’t make sense any more. + +`static_assert` isn’t used by the compiler at all. + +# Detailed design + +Remove `static_assert`. [Implementation submitted here][here]. + +[here]: https://github.com/rust-lang/rust/pull/24910 + +# Drawbacks + +Why should we *not* do this? + +# Alternatives + +This feature is pretty binary: we either remove it, or we don’t. We could keep the feature, +but build out some sort of alternate version that’s not as weird. + +# Unresolved questions + +None with the design, only “should we do this?” From cdc00f968e467d6579fc25f071841c3407c88430 Mon Sep 17 00:00:00 2001 From: Abhishek Chanda Date: Tue, 28 Apr 2015 16:36:18 -0700 Subject: [PATCH 0267/1195] Fix a typo in URL --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index eb65c3da4a5..b101b8151d3 100644 --- a/README.md +++ b/README.md @@ -52,7 +52,7 @@ the direction the language is evolving in. 
* [0979-align-splitn-with-other-languages.md](text/0979-align-splitn-with-other-languages.md) * [1011-process.exit.md](text/1011-process.exit.md) * [1023-rebalancing-coherence.md](text/1023-rebalancing-coherence.md) -* [1040-duration-reform.md](text/1040/duration-reform.md) +* [1040-duration-reform.md](text/1040-duration-reform.md) * [1044-io-fs-2.1.md](text/1044-io-fs-2.1.md) ## Table of Contents From c7f633b78159b9c9140e2831fb83ea6fe8e2ad7d Mon Sep 17 00:00:00 2001 From: David Turner Date: Thu, 19 Mar 2015 23:25:35 -0400 Subject: [PATCH 0268/1195] fix wrong statement --- text/0000-read-all.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-read-all.md b/text/0000-read-all.md index 28259dfb679..2aebad18ca9 100644 --- a/text/0000-read-all.md +++ b/text/0000-read-all.md @@ -46,7 +46,7 @@ One alternative design would return some new kind of Result which could report the number of bytes sucessfully read before an error. This would be inconsistent with write_all, but arguably more correct. -If we wanted io::Error to be a smaller type, ErrorKind::ShortRead +If we wanted io::ErrorKind to be a smaller type, ErrorKind::ShortRead could be unparameterized. But this would reduce the information available to calleres. From e44bbba91e774e3ab63092e55574e3e9e0099db2 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Mon, 4 May 2015 10:28:19 -0700 Subject: [PATCH 0269/1195] Switch Option to main proposal --- text/0000-socket-timeouts.md | 48 +++++++++++++++++++++--------------- 1 file changed, 28 insertions(+), 20 deletions(-) diff --git a/text/0000-socket-timeouts.md b/text/0000-socket-timeouts.md index 991e420711b..1ca1580d7b1 100644 --- a/text/0000-socket-timeouts.md +++ b/text/0000-socket-timeouts.md @@ -30,26 +30,29 @@ expose functionality like `set_nodelay`: ```rust impl TcpStream { - pub fn set_read_timeout(&self, dur: Duration) -> io::Result<()> { ... 
} - pub fn read_timeout(&self) -> Duration; + pub fn set_read_timeout(&self, dur: Option) -> io::Result<()> { ... } + pub fn read_timeout(&self) -> Option; - pub fn set_write_timeout(&self, dur: Duration) -> io::Result<()> { ... } - pub fn write_timeout(&self) -> Duration; + pub fn set_write_timeout(&self, dur: Option) -> io::Result<()> { ... } + pub fn write_timeout(&self) -> Option; } impl UdpSocket { - pub fn set_read_timeout(&self, dur: Duration) -> io::Result<()> { ... } - pub fn read_timeout(&self) -> Duration; + pub fn set_read_timeout(&self, dur: Option) -> io::Result<()> { ... } + pub fn read_timeout(&self) -> Option; - pub fn set_write_timeout(&self, dur: Duration) -> io::Result<()> { ... } - pub fn write_timeout(&self) -> Duration; + pub fn set_write_timeout(&self, dur: Option) -> io::Result<()> { ... } + pub fn write_timeout(&self) -> Option; } ``` The setter methods take an amount of time in the form of a `Duration`, which is [undergoing stabilization][duration-reform]. They are -implemented via straightforward calls to `setsockopt`. A `Duration` of -zero represents *no timeout*. +implemented via straightforward calls to `setsockopt`. The `Option` is +used to signify no timeout (for both setting and +getting). Consequently, `Some(Duration::new(0, 0))` is a possible +argument; the setter methods will return an IO error of kind +`InvalidInput` in this case. (See Alternatives for other approaches.) The corresponding socket options are `SO_RCVTIMEO` and `SO_SNDTIMEO`. @@ -71,19 +74,24 @@ enough in practice to justify a departure from the traditional API. # Alternatives -## `Option` +## Taking `Duration` directly -It's a bit unfortunate -- and rather un-Rustic -- to special case a -zero duration as "no timeout". +Using an `Option` introduces a certain amount of complexity +-- it raises the issue of `Some(Duration::new(0, 0))`, and it's +slightly more verbose to set a timeout. 
-An alternative would be to use `Option`, and, on -`Some(Duration::zero())` yield an invalid input error. That would -force more clarity about intent and help guard against accidental -disabling of timeouts for arithmetic reasons. +An alternative would be to take a `Duration` directly, and interpret a +zero length duration as "no timeout" (which is somewhat traditional in +C APIs). That would make the API somewhat more familiar, but less +Rustic, and it becomes somewhat easier to pass in a zero value by +accident (without thinking about this possibility). -On the other hand, it may have the risk of *introducing* bugs for -those who expect a different semantics -- though the use of option -types will likely serve as a sufficient heads-up. +Note that both styles of API require code that does arithmetic on +durations to check for zero in advance. + +Aside from fitting Rust idioms better, the main proposal also gives a +somewhat stronger indication of a bug when things go wrong (rather +than simply failing to time out, for example). ## Wrapping for compositionality From e449a9c35c2aa647c2610b1da7633b6c92089632 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Mon, 4 May 2015 16:20:31 -0700 Subject: [PATCH 0270/1195] RFC: Policy on semver and API evolution --- text/0000-api-evolution.md | 748 +++++++++++++++++++++++++++++++++++++ 1 file changed, 748 insertions(+) create mode 100644 text/0000-api-evolution.md diff --git a/text/0000-api-evolution.md b/text/0000-api-evolution.md new file mode 100644 index 00000000000..accf4336923 --- /dev/null +++ b/text/0000-api-evolution.md @@ -0,0 +1,748 @@ +- Feature Name: not applicable +- Start Date: 2015-05-04 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +This RFC proposes a comprehensive set of guidelines for which changes to +*stable* APIs are considered breaking from a semver perspective, and which are +not. 
These guidelines are intended for both the standard library and for the +crates.io ecosystem. + +This does *not* mean that the standard library should be completely free to make +non-semver-breaking changes; there are sometimes still risks of ecosystem pain +that need to be taken into account. Rather, this RFC makes explicit an initial +set of changes that absolutely *cannot* be made without a semver bump. + +Along the way, it also discusses some interactions with potential language +features that can help mitigate pain for non-breaking changes. + +The RFC covers only API issues; other issues related to language features, +lints, type inference, command line arguments, Cargo, and so on are considered +out of scope. + +# Motivation + +Both Rust and its library ecosystem have adopted [semver](http://semver.org/), a +technique for versioning platforms/libraries partly in terms of the effect on +the code that uses them. In a nutshell, the versioning scheme has three components: + +1. **Major**: must be incremented for changes that break client code. +2. **Minor**: incremented for backwards-compatible feature additions. +3. **Patch**: incremented for backwards-compatible bug fixes. + +[Rust 1.0.0](http://blog.rust-lang.org/2015/02/13/Final-1.0-timeline.html) will +mark the beginning of our +[commitment to stability](http://blog.rust-lang.org/2014/10/30/Stability.html), +and from that point onward it will be important to be clear about what +constitutes a breaking change, in order for semver to play a meaningful role. As +we will see, this question is more subtle than one might think at first -- and +the simplest approach would make it effectively impossible to grow the standard +library. + +The goal of this RFC is to lay out a comprehensive policy for what *must* be +considered a breaking API change from the perspective of semver, along with some +guidance about non-semver-breaking changes. 
+ +# Detailed design + +For clarity, in the rest of the RFC, we will use the following terms: + +* **Major change**: a change that requires a major semver bump. +* **Minor change**: a change that requires only a minor semver bump. +* **Breaking change**: a change that, *strictly speaking*, can cause downstream + code to fail to compile. + +What we will see is that in Rust today, almost any change is technically a +breaking change. For example, given the way that globs currently work, *adding +any public item* to a library can break its clients (more on that later). But +not all breaking changes are equal. + +So, this RFC proposes that **all major changes are breaking, but not all breaking +changes are major.** + +## Overview + +### Principles of the policy + +The basic design of the policy is that **minor changes should require at most a +few local *annotations* to the code you are developing, and in principle no +changes to your dependencies.** + +In more detail: + +* Minor changes should require at most minor amounts of work upon upgrade. For + example, changes that may require occasional type annotations or use of UFCS + to disambiguate are not automatically "major" changes. (But in such cases, one + must evaluate how widespread these "minor" changes are). + +* In principle, it should be possible to produce a version of the code for any + dependencies that *will not break* when upgrading to a new minor + revision. This goes hand-in-hand with the above bullet; as we will see, it's + possible to save a fully "elaborated" version of upstream code that does not + require any disambiguation. The "in principle" refers to the fact that getting + there may require some additional tooling or language support, which this RFC + outlines. 
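
As a rough illustration of the "use of UFCS to disambiguate" mentioned in the first bullet, here is a minimal sketch. The names `Trait1`, `Trait2`, `Widget`, and `use_both` are hypothetical and chosen only for this example; the point is that the fully qualified call can be written *in advance*, before any conflicting method exists.

```rust
// Hypothetical traits for illustration; both end up exposing a method `foo`.
trait Trait1 {
    // Imagine this defaulted method was added in a later minor release.
    fn foo(&self) -> &'static str {
        "Trait1::foo"
    }
}

trait Trait2 {
    fn foo(&self) -> &'static str;
}

// Hypothetical type implementing both traits.
struct Widget;

impl Trait1 for Widget {}

impl Trait2 for Widget {
    fn foo(&self) -> &'static str {
        "Trait2::foo"
    }
}

// Writing `t.foo()` here would be ambiguous once both traits have `foo`.
// The fully qualified (UFCS) form names the trait explicitly, so it keeps
// compiling no matter which methods `Trait1` gains later.
fn use_both<T: Trait1 + Trait2>(t: &T) -> &'static str {
    <T as Trait2>::foo(t)
}
```

Calling `use_both(&Widget)` dispatches to the `Trait2` implementation whether or not `Trait1` also defines `foo`.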
+ +That means that any breakage in a minor release must be very "shallow": it must +always be possible to locally fix the problem through some kind of +disambiguation *that could have been done in advance* (by using more explicit +forms) or other annotation (like disabling a lint). It means that minor changes +can never leave you in a state that requires breaking changes to your own code. + +**Although this general policy allows some (very limited) breakage in minor +releases, it is not a license to make these changes blindly**. The breakage that +this RFC permits, aside from being very simple to fix, is also unlikely to occur +often in practice. The RFC will discuss measures that should be employed in the +standard library to ensure that even these minor forms of breakage do not cause +widespread pain in the ecosystem. + +### Scope of the policy + +The policy laid out by this RFC applies to *stable*, *public* APIs in the +standard library. Eventually, stability attributes will be usable in external +libraries as well (this will require some design work), but for now public APIs +in external crates should be understood as de facto stable after the library +reaches 1.0.0 (per semver). + +## Policy by language feature + +Most of the policy is simplest to lay out with reference to specific language +features and the way that APIs using them can, and cannot, evolve in a minor +release. + +**Breaking changes are assumed to be major changes unless otherwise stated**. +The RFC covers many, but not all breaking changes that are major; it covers +*all* breaking changes that are considered minor. + +### Crates + +#### Major change: introducing `#[feature]` for the first time. + +Changing a crate from working on stable Rust to *requiring* a nightly is +considered a breaking change. Crate authors should consider using Cargo +"features" for their crate to make such use opt-in. + +#### Minor change: adding/removing crate dependencies. 
+ +The author is not aware of any possible breakage in altering dependencies (which +is essentially private anyway). + +### Modules + +#### Major change: renaming/moving/removing any public items. + +Although renaming an item might seem like a minor change, according to the +general policy design this is not a permitted form of breakage: it's not +possible to annotate code in advance to avoid the breakage, nor is it possible +to prevent the breakage from affecting dependencies. + +Of course, much of the effect of renaming/moving/removing can be achieved by +instead using deprecation and `pub use`, and the standard library should not be +afraid to do so! In the long run, we should consider hiding at least some old +deprecated items from the docs, and could even consider putting out a major +version solely as a kind of "garbage collection" for long-deprecated APIs. + +#### Minor change: adding new public items. + +Note that adding new public items is currently a breaking change, due to glob +imports. For example, the following snippet of code will break if the `foo` +module introduces a public item called `bar`: + +```rust +use foo::*; +fn bar() { ... } +``` + +The problem here is that glob imports currently do not allow any of their +imports to be shadowed by an explicitly-defined item. + +There are two reasons this is considered a minor change by this RFC: + +1. The RFC also suggests permitting shadowing of a glob import by any explicit + item. This has been the intended semantics of globs, but has not been + implemented. The details are left to a future RFC, however. + +2. Even if that change were made, though, there is still the case where two glob + imports conflict without any explicit definition "covering" them. This is + permitted to produce an error under the principles of this RFC because the + glob imports could have been written as more explicit (expanded) `use` + statements. 
It is also plausible to do this expansion automatically for a + crate's dependencies, to prevent breakage in the first place. + +### Structs + +See "[Signatures in type definitions](#signatures-in-type-definitions)" for some +general remarks about changes to the actual types in a `struct` definition. + +#### Major change: adding a private field when all current fields are public. + +This change has the effect of making external struct literals impossible to +write, which can break code irreparably. + +#### Major change: adding a public field when no private field exists. + +This change retains the ability to use struct literals, but it breaks existing +uses of such literals; it likewise breaks exhaustive matches against the struct. + +#### Minor change: adding or removing private fields when at least one already exists. + +No existing code could be relying on struct literals for the struct, nor on +exhaustively matching its contents, and client code will likewise be oblivious +to the addition of further private fields. + +For tuple structs, this is only a minor change if furthermore *all* fields are +currently private. (Tuple structs with mixtures of public and private fields are +bad practice in any case.) + +#### Minor change: going from a tuple struct with all private fields (with at least one field) to a normal struct, or vice versa. + +This is technically a breaking change: + +```rust +// in some other module: +pub struct Foo(SomeType); + +// in downstream code +let Foo(_) = foo; +``` + +Changing `Foo` to a normal struct can break code that matches on it -- but there +is never any real reason to match on it in that circumstance, since you cannot +extract any fields or learn anything of interest about the struct. + +### Enums + +See "[Signatures in type definitions](#signatures-in-type-definitions)" for some +general remarks about changes to the actual types in an `enum` definition. + +#### Major change: adding new variants. 
+ +Exhaustiveness checking means that a `match` that explicitly checks all the +variants for an `enum` will break if a new variant is added. It is not currently +possible to defend against this breakage in advance. + +A [postponed RFC](https://github.com/rust-lang/rfcs/pull/757) discusses a +language feature that allows an enum to be marked as "extensible", which +modifies the way that exhaustiveness checking is done and would make it possible +to extend the enum without breakage. + +#### Major change: adding new fields to a variant. + +If the enum is public, so is the full contents of all of its variants. As per +the rules for structs, this means it is not allowed to add any new fields (which +will automatically be public). + +If you wish to allow for this kind of extensibility, consider introducing a new, +explicit struct for the variant up front. + +### Traits + +#### Major change: adding a non-defaulted item. + +Adding any item without a default will immediately break all trait implementations. + +It's possible that in the future we will allow some kind of +"[sealing](#sealed-traits)" to say that a trait can only be used as a bound, not +to provide new implementations; such a trait *would* allow arbitrary items to be +added. + +#### Major change: any non-trivial change to item signatures. + +Because traits have both implementors and consumers, any change to the signature +of e.g. a method will affect at least one of the two parties. So, for example, +abstracting a concrete method to use generics instead might work fine for +clients of the trait, but would break existing implementors. (Note, as above, +the potential for "sealed" traits to alter this dynamic.) + +#### Minor change: adding a defaulted item. 
+ +Adding a defaulted item is technically a breaking change: + +```rust +trait Trait1 {} +trait Trait2 { + fn foo(&self); +} + +fn use_both<T: Trait1 + Trait2>(t: &T) { + t.foo() +} +``` + +If a `foo` method is added to `Trait1`, even with a default, it would cause a +dispatch ambiguity in `use_both`, since the call to `foo` could be referring to +either trait. + +(Note, however, that existing *implementations* of the trait are fine.) + +According to the basic principles of this RFC, such a change is minor: it is +always possible to annotate the call `t.foo()` to be more explicit *in advance* +using UFCS: `<T as Trait2>::foo(t)`. This kind of annotation could be done +automatically for code in dependencies (see +[Elaborated source](#elaborated-source)). And it would also be possible to +mitigate this problem by allowing +[method renaming on trait import](#trait-item-renaming). + +While the scenario of adding a defaulted method to a trait may seem somewhat +obscure, the exact same hazards arise with *implementing existing traits* (see +below), which is clearly vital to allow; we apply a similar policy to both. + +All that said, it is incumbent on library authors to ensure that such "minor" +changes are in fact minor in practice: if a conflict like `t.foo()` is likely to +arise at all often in downstream code, it would be advisable to explore a +different choice of names. More guidelines for the standard library are given +later on. + +#### Minor change: adding a defaulted type parameter. + +As with "[Signatures in type definitions](#signatures-in-type-definitions)", +traits are permitted to add new type parameters as long as defaults are provided +(which is backwards compatible). + +### Trait implementations + +#### Major change: implementing any "fundamental" trait. 
+ +A [recent RFC](https://github.com/rust-lang/rfcs/pull/1023) introduced the idea +of "fundamental" traits which are so basic that *not* implementing such a trait +right off the bat is considered a promise that you will *never* implement the +trait. The `Sized` and `Fn` traits are examples. + +The coherence rules take advantage of fundamental traits in such a way that +*adding a new implementation of a fundamental trait to an existing type can +cause downstream breakage*. Thus, such impls are considered major changes. + +#### Minor change: implementing any non-fundamental trait. + +Unfortunately, implementing any existing trait can cause breakage: + +```rust +// Crate A + pub trait Trait1 { + fn foo(&self); + } + + pub struct Foo; // does not implement Trait1 + +// Crate B + use crateA::Trait1; + + trait Trait2 { + fn foo(&self); + } + + impl Trait2 for crateA::Foo { .. } + + fn use_foo(f: &crateA::Foo) { + f.foo() + } +``` + +If crate A adds an implementation of `Trait1` for `Foo`, the call to `f.foo()` +in crate B will yield a dispatch ambiguity (much like the one we saw for +defaulted items). Thus *technically implementing any existing trait is a +breaking change!* Completely prohibiting such a change is clearly a non-starter. + +However, as before, this kind of breakage is considered "minor" by the +principles of this RFC (see "Adding a defaulted item" above). + +### Inherent implementations + +#### Minor change: adding any inherent items. + +Adding an inherent item cannot lead to dispatch ambiguity, because inherent +items trump any trait items with the same name. + +However, introducing an inherent item *can* lead to breakage if the signature of +the item does not match that of an in scope, implemented trait: + +```rust +// Crate A + pub struct Foo; + +// Crate B + trait Trait { + fn foo(&self); + } + + impl Trait for crateA::Foo { .. 
} + + fn use_foo(f: &crateA::Foo) { + f.foo() + } +``` + +If crate A adds a method: + +```rust +impl Foo { + fn foo(&self, x: u8) { ... } +} +``` + +then crate B would no longer compile, since dispatch would prefer the inherent +impl, which has the wrong type. + +Once more, this is considered a minor change, since UFCS can disambiguate (see +"Adding a defaulted item" above). + +It's worth noting, however, that if the signatures *did* happen to match then +the change would no longer cause a compilation error, but might silently change +runtime behavior. The case where the same method for the same type has +meaningfully different behavior is considered unlikely enough that the RFC is +willing to permit it to be labeled as a minor change -- and otherwise, inherent +methods could never be added after the fact. + +### Other items + +Most remaining items do not have any particularly unique items: + +* For type aliases, see "[Signatures in type definitions](#signatures-in-type-definitions)". +* For free functions, see "[Signatures in functions](#signatures-in-functions)". + +## Cross-cutting concerns + +### Behavioral changes + +This RFC is largely focused on API changes which may, in particular, cause +downstream code to stop compiling. But in some sense it is even more pernicious +to make a change that allows downstream code to continue compiling, but causes +its runtime behavior to break. + +This RFC does not attempt to provide a comprehensive policy on behavioral +changes, which would be extremely difficult. In general, APIs are expected to +provide explicit contracts for their behavior via documentation, and behavior +that is not part of this contract is permitted to change in minor +revisions. (Remember: this RFC is about setting a *minimum* bar for when major +version bumps are required.) + +This policy will likely require some revision over time, to become more explicit +and perhaps lay out some best practices. 
+ +### Signatures in type definitions + +#### Major change: tightening bounds. + +Adding new constraints on existing type parameters is a breaking change, since +existing uses of the type definition can break. So the following is a major +change: + +```rust +// MAJOR CHANGE + +// Before +struct Foo<A> { .. } + +// After +struct Foo<A: Clone> { .. } +``` + +#### Minor change: loosening bounds. + +Loosening bounds, on the other hand, cannot break code because when you +reference `Foo<A>`, you *do not learn anything about the bounds on `A`*. (This +is why you have to repeat any relevant bounds in `impl` blocks for `Foo<A>`, for +example.) So the following is a minor change: + +```rust +// MINOR CHANGE + +// Before +struct Foo<A: Clone> { .. } + +// After +struct Foo<A> { .. } +``` + +#### Minor change: adding defaulted type parameters. + +All existing references to a type/trait definition continue to compile and work +correctly after a new defaulted type parameter is added. So the following is +a minor change: + +```rust +// MINOR CHANGE + +// Before +struct Foo { .. } + +// After +struct Foo<A = u8> { .. } +``` + +#### Minor change: generalizing to generics. + +A struct or enum field can change from a concrete type to a generic type +parameter, provided that the change results in an identical type for all +existing use cases. For example, the following change is permitted: + +```rust +// MINOR CHANGE + +// Before +struct Foo(pub u8); + +// After +struct Foo<T = u8>(pub T); +``` + +because existing uses of `Foo` are shorthand for `Foo<u8>` which yields the +identical field type. + +On the other hand, the following is not permitted: + +```rust +// MAJOR CHANGE + +// Before +struct Foo<T = u8>(pub T, pub u8); + +// After +struct Foo<T = u8>(pub T, pub T); +``` + +since there may be existing uses of `Foo` with a non-default type parameter +which would break as a result of the change. 
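
To make the permitted case concrete, the following is a small sketch (not taken from the RFC) of how downstream code written against the old `struct Foo(pub u8);` keeps compiling after the field is generalized behind a defaulted type parameter. The function names `first_byte` and `describe` are hypothetical.

```rust
// After the (permitted) change: `Foo` gained a type parameter whose default
// is the field's previous concrete type.
pub struct Foo<T = u8>(pub T);

// "Downstream" code written against the old `struct Foo(pub u8);`.
// In type position a bare `Foo` means `Foo<u8>`, so the field type is unchanged.
pub fn first_byte(f: Foo) -> u8 {
    f.0
}

// Newer code can opt in to other instantiations of the same type.
pub fn describe(f: &Foo<&str>) -> usize {
    f.0.len()
}
```

By contrast, in the non-permitted example a caller using a non-default instantiation would see the second field change from `u8` to the type parameter, which is why that change requires a major bump.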
+ +It's also permitted to change from a generic type to a more-generic one in a +minor revision: + +```rust +// MINOR CHANGE + +// Before +struct Foo<T>(pub T, pub T); + +// After +struct Foo<T, U = T>(pub T, pub U); +``` + +since, again, all existing uses of the type `Foo<T>` will yield the same field +types as before. + +### Signatures in functions + +All of the changes mentioned below are considered major changes in the context +of trait methods, since they can break implementors. + +#### Major change: adding new arguments. + +At the moment, Rust does not provide defaulted arguments, so any change in arity +is a breaking change. + +#### Minor change: introducing a new type parameter. + +Technically, adding a (non-defaulted) type parameter can break code: + +```rust +// MINOR CHANGE (but causes breakage) + +// Before +fn foo<T>(...) { ... } + +// After +fn foo<T, U>(...) { ... } +``` + +will break any calls like `foo::<u8>`. However, such explicit calls are rare +enough (and can usually be written in other ways) that this breakage is +considered minor. (However, one should take into account how likely it is that +the function in question is being called with explicit type arguments). + +Such changes are an important ingredient of abstracting to use generics, as described next. + +#### Minor change: generalizing to generics. + +The type of an argument to a function, or its return value, can be *generalized* +to use generics, including by introducing a new type parameter (as long as it +can be instantiated to the original type). For example, the following change is +allowed: + +```rust +// MINOR CHANGE + +// Before +fn foo(x: u8) -> u8; +fn bar<T: Iterator<Item = u8>>(t: T); + +// After +fn foo<T: Add>(x: T) -> T; +fn bar<T: IntoIterator<Item = u8>>(t: T); +``` + +because all existing uses are instantiations of the new signature. 
On the other +hand, the following isn't allowed in a minor revision: + +```rust +// MAJOR CHANGE + +// Before +fn foo(x: Vec<u8>); + +// After +fn foo<T: Copy + IntoIterator<Item = u8>>(x: T); +``` + +because the generics include a constraint not satisfied by the original type. + +Introducing generics in this way can potentially create type inference failures, +but these are considered acceptable per the principles of the RFC: they only +require local annotations that could have been inserted in advance. + +Perhaps somewhat surprisingly, generalization applies to trait objects as well, +given that every trait implements itself: + +```rust +// MINOR CHANGE + +// Before +fn foo(t: &Trait); + +// After +fn foo<T: Trait + ?Sized>(t: &T); +``` + +(The use of `?Sized` is essential; otherwise you couldn't recover the original +signature). + +### Lints + +#### Minor change: introducing new lint warnings/errors + +Lints are considered advisory, and changes that cause downstream code to receive +additional lint warnings/errors are still considered "minor" changes. + +## Mitigation for minor changes + +### The Crater tool + +@brson has been hard at work on a tool called "Crater" which can be used to +exercise changes on the entire crates.io ecosystem, looking for +regressions. This tool will be indispensable when weighing the costs of a minor +change that might cause some breakage -- we can actually gauge what the breakage +would look like in practice. + +While this would, of course, miss code not available publicly, the hope is that +code on crates.io is a broadly representative sample, good enough to turn up +problems. + +Any breaking, but minor change to the standard library must be evaluated through +Crater before being committed. + +### Nightlies + +One line of defense against a "minor" change causing significant breakage is the +nightly release channel: we can get feedback about breakage long before it even +makes it into a beta release. 
+ +### Elaborated source + +When compiling upstream dependencies, it is possible to generate an "elaborated" +version of the source code where all dispatch is resolved to explicit UFCS form, +all types are annotated, and all glob imports are replaced by explicit imports. + +This fully-elaborated form is almost entirely immune to breakage due to any of +the "minor changes" listed above. + +You could imagine Cargo storing this elaborated form for dependencies upon +compilation. That would in turn make it easy to update Rust, or some subset of +dependencies, without breaking any upstream code (even in minor ways). You would +be left only with very small, local changes to make to the code you own. + +While this RFC does not propose any such tooling change right now, the point is +mainly that there are a lot of options if minor changes turn out to cause +breakage more often than anticipated. + +### Trait item renaming + +One very useful mechanism would be the ability to import a trait while renaming +some of its items, e.g. `use some_mod::SomeTrait with {foo_method as bar}`. In +particular, when methods happen to conflict across traits defined in separate +crates, a user of the two traits could rename one of the methods out of the way. + +## Thoughts on possible language changes (unofficial) + +The following is just a quick sketch of some focused language changes that would +help our API evolution story. + +**Glob semantics** + +As already mentioned, the fact that glob imports currently allow *no* shadowing +is deeply problematic: in a technical sense, it means that the addition of *any* +public item can break downstream code arbitrarily. + +It would be much better for API evolution (and for ergonomics and intuition) if +explicitly-defined items trump glob imports. But this is left to a future RFC. + +**Globs with fine-grained control** + +Another useful tool for working with globs would be the ability to *exclude* +certain items from a glob import, e.g. 
something like: + +```rust +use some_module::{* without Foo}; +``` + +This is especially useful for the case where multiple modules being glob +imported happen to export items with the same name. + +**"Extensible" enums** + +There is already [an RFC](https://github.com/rust-lang/rfcs/pull/757) for an +`enum` annotation that would make it possible to add variants without ever +breaking downstream code. + +**Sealed traits** + +The ability to annotate a trait with some "sealed" marker, saying that no +external implementations are allowed, would be useful in certain cases where a +crate wishes to define a closed set of types that implements a particular +interface. Such an attribute would make it possible to evolve the interface +without a major version bump (since no downstream implementors can exist). + +**Defaulted parameters** + +Also known as "optional arguments" -- an +[oft-requested](https://github.com/rust-lang/rfcs/issues/323) feature. Allowing +arguments to a function to be optional makes it possible to add new arguments +after the fact without a major version bump. + +# Drawbacks and Alternatives + +The main drawback to the approach laid out here is that it makes the stability +and semver guarantees a bit fuzzier: the promise is not that code will never +break, full stop, but rather that minor release breakage is of an extremely +limited form, for which there are a variety of mitigation strategies. This +approach tries to strike a middle ground between a very hard line for stability +(which, for Rust, would rule out many forms of extension) and willy-nilly +breakage: it's an explicit, but pragmatic policy. + +An alternative would be to take a harder line and find some other way to allow +API evolution. Supposing that we resolved the issues around glob imports, the +main problems with breakage have to do with adding new inherent methods or trait +implementations -- both of which are vital forms of evolution. 
It might be +possible, in the standard library case, to provide some kind of version-based +opt in to this evolution: a crate could opt in to breaking changes for a +particular version of Rust, which might in turn be provided only through some +`cfg`-like mechanism. + +Note that these strategies are not mutually exclusive. Rust's development +processes involved a very steady, strong stream of breakage, and while we need +to be very serious about stabilization, it is possible to take an iterative +approach. The changes considered "major" by this RFC already move the bar *very +significantly* from what was permitted pre-1.0. It may turn out that even the +minor forms of breakage permitted here are, in the long run, too much to +tolerate; at that point we could revise the policies here and explore some +opt-in scheme, for example. + +# Unresolved questions + +## Behavioral issues + +- Is it permitted to change a contract from "abort" to "panic"? What about from + "panic" to "return an `Err`"? + +- Should we try to lay out more specific guidance for behavioral changes at this + point? From e233f8d1cb18bad9c1ddf25a3059818bf2d65cc1 Mon Sep 17 00:00:00 2001 From: arielb1 Date: Tue, 5 May 2015 19:26:28 +0300 Subject: [PATCH 0271/1195] should-be-final version A mix of the original RFC and facts on the ground. --- text/0401-coercions.md | 27 +++++++++++++-------------- 1 file changed, 13 insertions(+), 14 deletions(-) diff --git a/text/0401-coercions.md b/text/0401-coercions.md index 677e8cfc8e3..0bbb1e2a9cc 100755 --- a/text/0401-coercions.md +++ b/text/0401-coercions.md @@ -319,20 +319,19 @@ descriptions are equivalent. Casting is indicated by the `as` keyword. 
A cast `e as U` is valid if one of the following holds: -* `e` has type `T` and `T` coerces to `U`; *coercion-cast* -* `e` has type `*T`, `U` is `*U_0`, and either `U_0: Sized` or - unsize_kind(`T`) = unsize_kind(`U_0`); *ptr-ptr-cast* -* `e` has type `*T` and `U` is a numeric type, while `T: Sized`; *ptr-addr-cast* -* `e` has type `usize` and `U` is `*U_0`, while `U_0: Sized`; *addr-ptr-cast* -* `e` has type `T` and `T` and `U` are any numeric types; *numeric-cast* -* `e` is a C-like enum and `U` is an integer type or `bool`; *enum-cast* -* `e` has type `bool` and `U` is an integer; *bool-cast* -* `e` has type `u8` and `U` is `char`; *u8-char-cast* -* `e` has type `&.[T; n]` and `U` is `*T`, and `e` is a mutable - reference if `U` is. *array-ptr-cast* -* `e` is a function pointer type and `U` has type `*T`, - while `T: Sized`; *fptr-ptr-cast* -* `e` is a function pointer type and `U` is an integer; *fptr-addr-cast* + * `e` has type `T` and `T` coerces to `U`; *coercion-cast* + * `e` has type `*T`, `U` is `*U_0`, and either `U_0: Sized` or + unsize_kind(`T`) = unsize_kind(`U_0`); *ptr-ptr-cast* + * `e` has type `*T` and `U` is a numeric type, while `T: Sized`; *ptr-addr-cast* + * `e` is an integer and `U` is `*U_0`, while `U_0: Sized`; *addr-ptr-cast* + * `e` has type `T` and `T` and `U` are any numeric types; *numeric-cast* + * `e` is a C-like enum and `U` is an integer type; *enum-cast* + * `e` has type `bool` or `char` and `U` is an integer; *prim-int-cast* + * `e` has type `u8` and `U` is `char`; *u8-char-cast* + * `e` has type `&[T; n]` and `U` is `*const T`; *array-ptr-cast* + * `e` is a function pointer type and `U` has type `*T`, + while `T: Sized`; *fptr-ptr-cast* + * `e` is a function pointer type and `U` is an integer; *fptr-addr-cast* where `&.T` and `*T` are references of either mutability, and where unsize_kind(`T`) is the kind of the unsize info From 568ed6a75653b75dcd4f36a102afa59cd73f018f Mon Sep 17 00:00:00 2001 From: Barosl Lee Date: Sat, 2 May 
2015 19:16:18 +0900 Subject: [PATCH 0272/1195] Add a draft for rename_connect_to_join --- text/0000-rename-connect-to-join.md | 73 +++++++++++++++++++++++++++++ 1 file changed, 73 insertions(+) create mode 100644 text/0000-rename-connect-to-join.md diff --git a/text/0000-rename-connect-to-join.md b/text/0000-rename-connect-to-join.md new file mode 100644 index 00000000000..68e3782883a --- /dev/null +++ b/text/0000-rename-connect-to-join.md @@ -0,0 +1,73 @@ +- Feature Name: `rename_connect_to_join` +- Start Date: 2015-05-02 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Rename `.connect()` to `.join()` in `SliceConcatExt`. + +# Motivation + +Rust has a string concatenation method named `.connect()` in `SliceConcatExt`. +However, this does not align with the precedents in other languages. Most +languages use `.join()` for that purpose, as seen later. + +This is probably because, in the ancient Rust, `join` was a keyword to join a +task. However, `join` retired as a keyword in 2011 with the commit +rust-lang/rust@d1857d3. While `.connect()` is technically correct, the name may +not be directly inferred by the users of the mainstream languages. There was [a +question] about this on reddit. 
+ +[a question]: http://www.reddit.com/r/rust/comments/336rj3/whats_the_best_way_to_join_strings_with_a_space/ + +The languages that use the name of `join` are: + +- Python: [str.join](https://docs.python.org/3/library/stdtypes.html#str.join) +- Ruby: [Array.join](http://ruby-doc.org/core-2.2.0/Array.html#method-i-join) +- JavaScript: [Array.prototype.join](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/join) +- Go: [strings.Join](https://golang.org/pkg/strings/#Join) +- C#: [String.Join](https://msdn.microsoft.com/en-us/library/dd783876%28v=vs.110%29.aspx?f=255&MSPPError=-2147217396) +- Java: [String.join](http://docs.oracle.com/javase/8/docs/api/java/lang/String.html#join-java.lang.CharSequence-java.lang.Iterable-) +- Perl: [join](http://perldoc.perl.org/functions/join.html) + +The languages not using `join` are as follows. Interestingly, they are +all functional-ish languages. + +- Haskell: [intercalate](http://hackage.haskell.org/package/text-1.2.0.4/docs/Data-Text.html#v:intercalate) +- OCaml: [String.concat](http://caml.inria.fr/pub/docs/manual-ocaml/libref/String.html#VALconcat) +- F#: [String.concat](https://msdn.microsoft.com/en-us/library/ee353761.aspx) + +Note that Rust also has `.concat()` in `SliceConcatExt`, which is a specialized +version of `.connect()` that uses an empty string as a separator. + +# Detailed design + +While the `SliceConcatExt` trait is unstable, the `.connect()` method itself is +marked as stable. So we need to: + +1. Deprecate the `.connect()` method. +2. Add the `.join()` method. + +Or, if we are to achieve the [instability guarantee], we may remove the old +method entirely, as it's still pre-1.0. However, the author considers that this +may require even more consensus. + +[instability guarantee]: https://github.com/rust-lang/rust/issues/24928 + +# Drawbacks + +Having a deprecated method in a newborn language is not pretty. 
+ +If we do remove the `.connect()` method, the language becomes pretty again, but +it breaks the stability guarantee at the same time. + +# Alternatives + +Keep the status quo. Improving searchability in the docs will help newcomers +find the appropriate method. + +# Unresolved questions + +Are there even more clever names for the method? How about `.homura()`, or +`.madoka()`? From 6e4029d0769fe3aef84b008b6bffbb32959247b2 Mon Sep 17 00:00:00 2001 From: mdinger Date: Wed, 6 May 2015 15:33:16 -0400 Subject: [PATCH 0273/1195] typos --- text/0135-where.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0135-where.md b/text/0135-where.md index 55ad1affa4f..24e24529874 100644 --- a/text/0135-where.md +++ b/text/0135-where.md @@ -49,7 +49,7 @@ overcome with the `where` syntax: - **It does not work well with associated types.** This is because there is no space to specify the value of an associated type. Other - languages use `where` clauses (or something analagous) for this + languages use `where` clauses (or something analogous) for this purpose. - **It's just plain hard to read.** Experience has shown that as the @@ -251,7 +251,7 @@ It is unclear exactly what form associated types will have in Rust, but it is [well documented][comparison] that our current design, in which type parameters decorate traits, does not scale particularly well. (For curious readers, there are [several][part1] [blog][part2] -[posts][pnkfelix] exporing the design space of associated types with +[posts][pnkfelix] exploring the design space of associated types with respect to Rust in particular.) 
The high-level summary of associated types is that we can replace From 390b78cffd126f7b9ea213088c113fc677ce8304 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Thu, 7 May 2015 08:46:50 -0700 Subject: [PATCH 0274/1195] Added Result for reading timeout; added enum alternative --- text/0000-socket-timeouts.md | 27 +++++++++++++++++++++++---- 1 file changed, 23 insertions(+), 4 deletions(-) diff --git a/text/0000-socket-timeouts.md b/text/0000-socket-timeouts.md index 1ca1580d7b1..64b9c7ccf9e 100644 --- a/text/0000-socket-timeouts.md +++ b/text/0000-socket-timeouts.md @@ -31,18 +31,18 @@ expose functionality like `set_nodelay`: ```rust impl TcpStream { pub fn set_read_timeout(&self, dur: Option<Duration>) -> io::Result<()> { ... } - pub fn read_timeout(&self) -> Option<Duration>; + pub fn read_timeout(&self) -> io::Result<Option<Duration>>; pub fn set_write_timeout(&self, dur: Option<Duration>) -> io::Result<()> { ... } - pub fn write_timeout(&self) -> Option<Duration>; + pub fn write_timeout(&self) -> io::Result<Option<Duration>>; } impl UdpSocket { pub fn set_read_timeout(&self, dur: Option<Duration>) -> io::Result<()> { ... } - pub fn read_timeout(&self) -> Option<Duration>; + pub fn read_timeout(&self) -> io::Result<Option<Duration>>; pub fn set_write_timeout(&self, dur: Option<Duration>) -> io::Result<()> { ... } - pub fn write_timeout(&self) -> Option<Duration>; + pub fn write_timeout(&self) -> io::Result<Option<Duration>>; } ``` @@ -93,6 +93,25 @@ Aside from fitting Rust idioms better, the main proposal also gives a somewhat stronger indication of a bug when things go wrong (rather than simply failing to time out, for example). +## Combining with nonblocking support + +Another possibility would be to provide a single method that can +choose between blocking indefinitely, blocking with a timeout, and +nonblocking mode: + +```rust +enum BlockingMode { + Nonblocking, + Blocking, + Timeout(Duration) +} +``` + +This `enum` makes clear that it doesn't make sense to have both a +timeout and put the socket in nonblocking mode.
On the other hand, it +would relinquish the one-to-one correspondence between Rust +configuration APIs and underlying socket options. + ## Wrapping for compositionality A different approach would be to *wrap* socket types with a "timeout From 5e8ff5580c53906475eaf114ad2eaf6cc756276a Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 7 May 2015 10:08:19 -0700 Subject: [PATCH 0275/1195] RFC 1068 is Rust Governance --- text/{0000-rust-governance.md => 1068-rust-governance.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-rust-governance.md => 1068-rust-governance.md} (99%) diff --git a/text/0000-rust-governance.md b/text/1068-rust-governance.md similarity index 99% rename from text/0000-rust-governance.md rename to text/1068-rust-governance.md index 9c15d540148..16237eb791e 100644 --- a/text/0000-rust-governance.md +++ b/text/1068-rust-governance.md @@ -1,7 +1,7 @@ - Feature Name: not applicable - Start Date: 2015-02-27 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1068](https://github.com/rust-lang/rfcs/pull/1068) +- Rust Issue: N/A # Summary From 68d4f12468f1652cb7a84896c286b3c3daf78d69 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Thu, 7 May 2015 10:24:49 -0700 Subject: [PATCH 0276/1195] RFC 1066 is: alter mem::forget to be safe --- README.md | 1 + text/{0000-safe-mem-forget.md => 1066-safe-mem-forget.md} | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) rename text/{0000-safe-mem-forget.md => 1066-safe-mem-forget.md} (98%) diff --git a/README.md b/README.md index b101b8151d3..62e427342b8 100644 --- a/README.md +++ b/README.md @@ -54,6 +54,7 @@ the direction the language is evolving in. 
* [1023-rebalancing-coherence.md](text/1023-rebalancing-coherence.md) * [1040-duration-reform.md](text/1040-duration-reform.md) * [1044-io-fs-2.1.md](text/1044-io-fs-2.1.md) +* [1066-safe-mem-forget.md](text/1066-safe-mem-forget.md) ## Table of Contents [Table of Contents]: #table-of-contents diff --git a/text/0000-safe-mem-forget.md b/text/1066-safe-mem-forget.md similarity index 98% rename from text/0000-safe-mem-forget.md rename to text/1066-safe-mem-forget.md index 279705d4d1f..aefcbb20197 100644 --- a/text/0000-safe-mem-forget.md +++ b/text/1066-safe-mem-forget.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2015-04-15 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1066 +- Rust Issue: https://github.com/rust-lang/rust/issues/25186 # Summary From d00d7f75ae94b78c9271a29c21a8193124653227 Mon Sep 17 00:00:00 2001 From: mdinger Date: Tue, 12 May 2015 01:09:53 -0400 Subject: [PATCH 0277/1195] typos --- text/0195-associated-items.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0195-associated-items.md b/text/0195-associated-items.md index 45f97efe0fb..b195030826d 100644 --- a/text/0195-associated-items.md +++ b/text/0195-associated-items.md @@ -733,7 +733,7 @@ know the concrete type of the value returned from the function -- here, `MyNode` ## Scoping of `trait` and `impl` items Associated types are frequently referred to in the signatures of a trait's -methods and associated functions, and it is natural and convneient to refer to +methods and associated functions, and it is natural and convenient to refer to them directly. In other words, writing this: @@ -1076,7 +1076,7 @@ coherence property above, there can be at most one. On the other hand, even if there is only one applicable `impl`, type inference is *not* allowed to infer the input type parameters from it. 
This restriction -makes it possible to ensure *crate concatentation*: adding another crate may add +makes it possible to ensure *crate concatenation*: adding another crate may add `impl`s for a given trait, and if type inference depended on the absence of such `impl`s, importing a crate could break existing code. From 919ec7b896d5140db4a5bbe413530515b49b9f82 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 12 May 2015 10:09:32 -0700 Subject: [PATCH 0278/1195] Clarify major policy, cargo features, nightly, and the need for default type param RFC --- text/0000-api-evolution.md | 44 +++++++++++++++++++++++++++----------- 1 file changed, 32 insertions(+), 12 deletions(-) diff --git a/text/0000-api-evolution.md b/text/0000-api-evolution.md index accf4336923..6c7bef06085 100644 --- a/text/0000-api-evolution.md +++ b/text/0000-api-evolution.md @@ -66,9 +66,10 @@ changes are major.** ### Principles of the policy -The basic design of the policy is that **minor changes should require at most a -few local *annotations* to the code you are developing, and in principle no -changes to your dependencies.** +The basic design of the policy is that **the same code should be able to run +against different minor revisions**. Furthermore, minor changes should require +at most a few local *annotations* to the code you are developing, and in +principle no changes to your dependencies. In more detail: @@ -118,16 +119,26 @@ The RFC covers many, but not all breaking changes that are major; it covers ### Crates -#### Major change: introducing `#[feature]` for the first time. +#### Major change: going from stable to nightly -Changing a crate from working on stable Rust to *requiring* a nightly is -considered a breaking change. Crate authors should consider using Cargo -"features" for their crate to make such use opt-in. +Changing a crate from working on stable Rust to *requiring* a nightly +is considered a breaking change. 
That includes using `#[feature]` +directly, or using a dependency that does so. Crate authors should +consider using Cargo "features" for their crate to make such use +opt-in. -#### Minor change: adding/removing crate dependencies. +#### Minor change: altering the use of Cargo features -The author is not aware of any possible breakage in altering dependencies (which -is essentially private anyway). +Cargo packages can provide +[opt-in features](http://doc.crates.io/manifest.html#the-[features]-section), +which enable `#[cfg]` options. When a common dependency is compiled, it is done +so with the *union* of all features opted into by any packages using the +dependency. That means that adding or removing a feature could technically break +other, unrelated code. + +However, such breakage always represents a bug: packages are supposed to support +any combination of features, and if another client of the package depends on a +given feature, that client should specify the opt-in themselves. ### Modules @@ -140,7 +151,7 @@ to prevent the breakage from affecting dependencies. Of course, much of the effect of renaming/moving/removing can be achieved by instead using deprecation and `pub use`, and the standard library should not be -afraid to do so!In the long run, we should consider hiding at least some old +afraid to do so! In the long run, we should consider hiding at least some old deprecated items from the docs, and could even consider putting out a major version solely as a kind of "garbage collection" for long-deprecated APIs. @@ -489,7 +500,9 @@ struct Foo(pub T); ``` because existing uses of `Foo` are shorthand for `Foo` which yields the -identical field type. +identical field type. (Note: this is not actually true today, since +[default type parameters](https://github.com/rust-lang/rfcs/pull/213) are not +fully implemented. But this is the intended semantics.) 
On the other hand, the following is not permitted: @@ -688,6 +701,13 @@ use some_module::{* without Foo}; This is especially useful for the case where multiple modules being glob imported happen to export items with the same name. +**Default type parameters** + +Some of the minor changes for moving to more generic code depends on an +interplay between defaulted type parameters and type inference, which has been +[accepted as an RFC](https://github.com/rust-lang/rfcs/pull/213) but not yet +implemented. + **"Extensible" enums** There is already [an RFC](https://github.com/rust-lang/rfcs/pull/757) for an From 796abd67018c32bbed62180434df04a5f4f872e2 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 12 May 2015 10:12:20 -0700 Subject: [PATCH 0279/1195] Fix UFCS usage --- text/0000-api-evolution.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-api-evolution.md b/text/0000-api-evolution.md index 6c7bef06085..11e15152fc6 100644 --- a/text/0000-api-evolution.md +++ b/text/0000-api-evolution.md @@ -290,7 +290,7 @@ either trait. According to the basic principles of this RFC, such a change is minor: it is always possible to annotate the call `t.foo()` to be more explicit *in advance* -using UFCS: `::foo(t)`. This kind of annotation could be done +using UFCS: `Trait2::foo(t)`. This kind of annotation could be done automatically for code in dependencies (see [Elaborated source](#elaborated-source)).
And it would also be possible to mitigate this problem by allowing From c56c2d1fa33ced4c45dafbb6458a21f5c347412d Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 12 May 2015 10:12:55 -0700 Subject: [PATCH 0280/1195] Clarify major signature changes for adding/removing arguments --- text/0000-api-evolution.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-api-evolution.md b/text/0000-api-evolution.md index 11e15152fc6..1d4ae747593 100644 --- a/text/0000-api-evolution.md +++ b/text/0000-api-evolution.md @@ -540,7 +540,7 @@ types as before. All of the changes mentioned below are considered major changes in the context of trait methods, since they can break implementors. -#### Major change: adding new arguments. +#### Major change: adding/removing arguments. At the moment, Rust does not provide defaulted arguments, so any change in arity is a breaking change. From 27b5bfc155c90ab0c9e6d07b740cde1372c229eb Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 12 May 2015 10:14:15 -0700 Subject: [PATCH 0281/1195] Clarify lints --- text/0000-api-evolution.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/text/0000-api-evolution.md b/text/0000-api-evolution.md index 1d4ae747593..1f941c377cb 100644 --- a/text/0000-api-evolution.md +++ b/text/0000-api-evolution.md @@ -627,6 +627,10 @@ signature). Lints are considered advisory, and changes that cause downstream code to receive additional lint warnings/errors are still considered "minor" changes. 
+Making this work well in practice will likely require some infrastructure work +along the lines of +[this RFC issue](https://github.com/rust-lang/rfcs/issues/1029) + ## Mitigation for minor changes ### The Crater tool From c32dde42aacd1aa5f847dc38d15e45d065fac565 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 12 May 2015 10:15:03 -0700 Subject: [PATCH 0282/1195] Clarify role of beta --- text/0000-api-evolution.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/text/0000-api-evolution.md b/text/0000-api-evolution.md index 1f941c377cb..08476cffb6c 100644 --- a/text/0000-api-evolution.md +++ b/text/0000-api-evolution.md @@ -652,7 +652,8 @@ Crater before being committed. One line of defense against a "minor" change causing significant breakage is the nightly release channel: we can get feedback about breakage long before it makes -even into a beta release. +even into a beta release. And of course the beta cycle itself provides another +line of defense. ### Elaborated source From 48e3eb0adb48a7993cd5056b908b1fde35ca0ecd Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 12 May 2015 10:20:44 -0700 Subject: [PATCH 0283/1195] Clarify adding type parameters to functions --- text/0000-api-evolution.md | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/text/0000-api-evolution.md b/text/0000-api-evolution.md index 08476cffb6c..13822f4fc07 100644 --- a/text/0000-api-evolution.md +++ b/text/0000-api-evolution.md @@ -562,9 +562,12 @@ fn foo(...) { ... } will break any calls like `foo::`. However, such explicit calls are rare enough (and can usually be written in other ways) that this breakage is considered minor. (However, one should take into account how likely it is that -the function in question is being called with explicit type arguments). +the function in question is being called with explicit type arguments). 
This +RFC also suggests adding a `...` notation to explicit parameter lists to keep +them open-ended (see suggested language changes). -Such changes are an important ingredient of abstracting to use generics, as described next. +Such changes are an important ingredient of abstracting to use generics, as +described next. #### Minor change: generalizing to generics. @@ -706,6 +709,11 @@ use some_module::{* without Foo}; This is especially useful for the case where multiple modules being glob imported happen to export items with the same name. +Another possibility would be to not make it an error for two glob imports to +bring the same name into scope, but to generate the error only at the point that +the imported name was actually *used*. Then collisions could be resolved simply +by adding a single explicit, shadowing import. + **Default type parameters** Some of the minor changes for moving to more generic code depends on an @@ -734,6 +742,15 @@ Also known as "optional arguments" -- an arguments to a function to be optional makes it possible to add new arguments after the fact without a major version bump. +**Open-ended explicit type parameters** + +One hazard is that with today's explicit type parameter syntax, you must always +specify *all* type parameters: `foo::<T, U>(x, y)`. That means that adding a new +type parameter to `foo` can break code, even if a default is provided. + +This could be easily addressed by adding a notation like `...` to leave +additional parameters unspecified: `foo::<T, ...>(x, y)`.
+ # Drawbacks and Alternatives The main drawback to the approach laid out here is that it makes the stability From 72b55905bc17b4d19f58c61bb1474482b3d40c1c Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 12 May 2015 10:23:38 -0700 Subject: [PATCH 0284/1195] Fix link --- text/0000-api-evolution.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-api-evolution.md b/text/0000-api-evolution.md index 13822f4fc07..afeff503813 100644 --- a/text/0000-api-evolution.md +++ b/text/0000-api-evolution.md @@ -255,7 +255,7 @@ explicit struct for the variant up front. Adding any item without a default will immediately break all trait implementations. It's possible that in the future we will allow some kind of -"[sealing](#sealed-traits)" to say that a trait can only be used as a bound, not +"[sealing](#thoughts-on-possible-language-changes-unofficial)" to say that a trait can only be used as a bound, not to provide new implementations; such a trait *would* allow arbitrary items to be added. From d89fd02685693436772f4894abd5dd2c951baaf2 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 12 May 2015 10:26:14 -0700 Subject: [PATCH 0285/1195] Mention object safety --- text/0000-api-evolution.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/text/0000-api-evolution.md b/text/0000-api-evolution.md index afeff503813..edae6bd3839 100644 --- a/text/0000-api-evolution.md +++ b/text/0000-api-evolution.md @@ -306,6 +306,9 @@ arise at all often in downstream code, it would be advisable to explore a different choice of names. More guidelines for the standard library are given later on. +Finally, if the new item would change the trait from object safe to non-object +safe, it is considered a major change. + #### Minor change: adding a defaulted type parameter. 
As with "[Signatures in type definitions](#signatures-in-type-definitions)", From 8bfe1d8dbf531c62855a4569f156eafd18ed2056 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 12 May 2015 10:30:03 -0700 Subject: [PATCH 0286/1195] Clarify wording about dependencies --- text/0000-api-evolution.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/text/0000-api-evolution.md b/text/0000-api-evolution.md index edae6bd3839..1e51aa52e3f 100644 --- a/text/0000-api-evolution.md +++ b/text/0000-api-evolution.md @@ -78,13 +78,13 @@ In more detail: to disambiguate are not automatically "major" changes. (But in such cases, one must evaluate how widespread these "minor" changes are). -* In principle, it should be possible to produce a version of the code for any - dependencies that *will not break* when upgrading to a new minor - revision. This goes hand-in-hand with the above bullet; as we will see, it's - possible to save a fully "elaborated" version of upstream code that does not - require any disambiguation. The "in principle" refers to the fact that getting - there may require some additional tooling or language support, which this RFC - outlines. +* In principle, it should be possible to produce a version of dependency code + that *will not break* when upgrading other dependencies, or Rust itself, to a + new minor revision. This goes hand-in-hand with the above bullet; as we will + see, it's possible to save a fully "elaborated" version of upstream code that + does not require any disambiguation. The "in principle" refers to the fact + that getting there may require some additional tooling or language support, + which this RFC outlines. 
That means that any breakage in a minor release must be very "shallow": it must always be possible to locally fix the problem through some kind of From 5525db79fe706a63f0b40f6d6fdd1c7f0d66d27f Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 12 May 2015 10:30:24 -0700 Subject: [PATCH 0287/1195] Add link to Cargo features --- text/0000-api-evolution.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/text/0000-api-evolution.md b/text/0000-api-evolution.md index 1e51aa52e3f..3dffe2ac738 100644 --- a/text/0000-api-evolution.md +++ b/text/0000-api-evolution.md @@ -121,11 +121,11 @@ The RFC covers many, but not all breaking changes that are major; it covers #### Major change: going from stable to nightly -Changing a crate from working on stable Rust to *requiring* a nightly -is considered a breaking change. That includes using `#[feature]` -directly, or using a dependency that does so. Crate authors should -consider using Cargo "features" for their crate to make such use -opt-in. +Changing a crate from working on stable Rust to *requiring* a nightly is +considered a breaking change. That includes using `#[feature]` directly, or +using a dependency that does so. Crate authors should consider using Cargo +["features"](http://doc.crates.io/manifest.html#the-[features]-section) for +their crate to make such use opt-in. #### Minor change: altering the use of Cargo features From 9f87a412067f72dd96bc1ed435de02a27c2255e7 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 12 May 2015 10:30:37 -0700 Subject: [PATCH 0288/1195] Fix typo --- text/0000-api-evolution.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-api-evolution.md b/text/0000-api-evolution.md index 3dffe2ac738..3b12fe3800d 100644 --- a/text/0000-api-evolution.md +++ b/text/0000-api-evolution.md @@ -167,7 +167,7 @@ fn bar() { ... 
} ``` The problem here is that glob imports currently do not allow any of their -imports to be shadowed by an explicitly-define item. +imports to be shadowed by an explicitly-defined item. There are two reasons this is considered a minor change by this RFC: From 9532a9acc977b998a99b2f83a3b4176194558d56 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 12 May 2015 10:31:08 -0700 Subject: [PATCH 0289/1195] Clarify glob overlap text --- text/0000-api-evolution.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/text/0000-api-evolution.md b/text/0000-api-evolution.md index 3b12fe3800d..ddef7085d56 100644 --- a/text/0000-api-evolution.md +++ b/text/0000-api-evolution.md @@ -176,11 +176,11 @@ There are two reasons this is considered a minor change by this RFC: implemented. The details are left to a future RFC, however. 2. Even if that change were made, though, there is still the case where two glob - imports conflict without any explicit definition "covering" them. This is - permitted to produce an error under the principles of this RFC because the - glob imports could have been written as more explicit (expanded) `use` - statements. It is also plausible to do this expansion automatically for a - crate's dependencies, to prevent breakage in the first place. + imports conflict with each other, without any explicit definition "covering" + them. This is permitted to produce an error under the principles of this RFC + because the glob imports could have been written as more explicit (expanded) + `use` statements. It is also plausible to do this expansion automatically for + a crate's dependencies, to prevent breakage in the first place. 
### Structs From dbd5c7513015fd9854931eeba6145881e3194f76 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 12 May 2015 10:34:47 -0700 Subject: [PATCH 0290/1195] Minor clarification for private fields --- text/0000-api-evolution.md | 20 ++++++++------------ 1 file changed, 8 insertions(+), 12 deletions(-) diff --git a/text/0000-api-evolution.md b/text/0000-api-evolution.md index ddef7085d56..e94a1e86f11 100644 --- a/text/0000-api-evolution.md +++ b/text/0000-api-evolution.md @@ -169,18 +169,14 @@ fn bar() { ... } The problem here is that glob imports currently do not allow any of their imports to be shadowed by an explicitly-defined item. -There are two reasons this is considered a minor change by this RFC: +This is considered a minor change because under the principles of this RFC: the +glob imports could have been written as more explicit (expanded) `use` +statements. It is also plausible to do this expansion automatically for a +crate's dependencies, to prevent breakage in the first place. -1. The RFC also suggests permitting shadowing of a glob import by any explicit - item. This has been the intended semantics of globs, but has not been - implemented. The details are left to a future RFC, however. - -2. Even if that change were made, though, there is still the case where two glob - imports conflict with each other, without any explicit definition "covering" - them. This is permitted to produce an error under the principles of this RFC - because the glob imports could have been written as more explicit (expanded) - `use` statements. It is also plausible to do this expansion automatically for - a crate's dependencies, to prevent breakage in the first place. +(This RFC also suggests permitting shadowing of a glob import by any explicit +item. This has been the intended semantics of globs, but has not been +implemented. The details are left to a future RFC, however.) ### Structs @@ -197,7 +193,7 @@ write, which can break code irreparably. 
This change retains the ability to use struct literals, but it breaks existing uses of such literals; it likewise breaks exhaustive matches against the struct. -#### Minor change: adding or removing private fields when at least one already exists. +#### Minor change: adding or removing private fields when at least one already exists (before and after the change). No existing code could be relying on struct literals for the struct, nor on exhaustively matching its contents, and client code will likewise be oblivious From 384e749f5d3bd64c0e9789de1e8fb6a2d82dbbf3 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 12 May 2015 10:40:44 -0700 Subject: [PATCH 0291/1195] Talk about default associated types --- text/0000-api-evolution.md | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/text/0000-api-evolution.md b/text/0000-api-evolution.md index e94a1e86f11..54c465f3acb 100644 --- a/text/0000-api-evolution.md +++ b/text/0000-api-evolution.md @@ -302,8 +302,17 @@ arise at all often in downstream code, it would be advisable to explore a different choice of names. More guidelines for the standard library are given later on. -Finally, if the new item would change the trait from object safe to non-object -safe, it is considered a major change. +There are two circumstances when adding a defaulted item is still a major change: + +* The new item would change the trait from object safe to non-object safe. +* The trait has a defaulted associated type and the item being added is a + defaulted function/method. In this case, existing impls that override the + associated type will break, since the function/method default will not + apply. (See + [the associated item RFC](https://github.com/rust-lang/rfcs/blob/master/text/0195-associated-items.md#defaults)). 
+* Adding a default to an existing associated type is likewise a major change if + the trait has defaulted methods, since it will invalidate use of those + defaults for the methods in existing trait impls. #### Minor change: adding a defaulted type parameter. From 77c2975aabe25eacd1d78cfa2fe2d1754a8fb0f8 Mon Sep 17 00:00:00 2001 From: John Hodge Date: Wed, 13 May 2015 21:10:07 +0800 Subject: [PATCH 0292/1195] Quick draft "Result::expect" rfc --- text/0000-result-expect.md | 40 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) create mode 100644 text/0000-result-expect.md diff --git a/text/0000-result-expect.md b/text/0000-result-expect.md new file mode 100644 index 00000000000..8546b6eb354 --- /dev/null +++ b/text/0000-result-expect.md @@ -0,0 +1,40 @@ +- Feature Name: `result_expect` +- Start Date: 2015-05-13 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Add an `expect` method to the `Result` type, bounded to `E: Debug`. + +# Motivation + +While `Result::unwrap` exists, it does not allow annotating the panic message with the operation +attempted (e.g. what file was being opened). This is at odds with `Option`, which includes both +`unwrap` and `expect` (with the latter taking an arbitrary failure message). + +# Detailed design + +Add a new method to the same `impl` block as `Result::unwrap` that takes a `&str` message and +returns `T` if the `Result` was `Ok`. If the `Result` was `Err`, it panics with both the provided +message and the error value. + +The format of the error message is left undefined in the documentation, but will most likely be +the following +``` +panic!("{}: {:?}", msg, e) +``` + +# Drawbacks + +- It involves adding a new method to a core Rust type. +- The panic message format is less obvious than it is with `Option::expect` (where the panic message is the message passed) + +# Alternatives + +- We are perfectly free to not do this.
+- A macro could be introduced to fill the same role (which would allow arbitrary formatting of the panic message). + +# Unresolved questions + +Are there any issues with the proposed format of the panic string? From a9ae74e3e1ecb80cc029d5c2f7cb965418eb18e4 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Thu, 7 May 2015 11:57:30 -0400 Subject: [PATCH 0293/1195] First version. --- text/0000-contingency-plan.md | 425 ++++++++++++++++++++++++++++++++++ text/0000-language-semver.md | 425 ++++++++++++++++++++++++++++++++++ 2 files changed, 850 insertions(+) create mode 100644 text/0000-contingency-plan.md create mode 100644 text/0000-language-semver.md diff --git a/text/0000-contingency-plan.md b/text/0000-contingency-plan.md new file mode 100644 index 00000000000..fb90724c44a --- /dev/null +++ b/text/0000-contingency-plan.md @@ -0,0 +1,425 @@ +- Feature Name: N/A +- Start Date: 2015-05-07 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +This RFC has two main goals: + +- define what precisely constitutes a breaking change for the Rust language itself; +- define a language versioning mechanism that extends the sorts of + changes we can make without causing compilation failures (for + example, adding new keywords). + +# Motivation + +With the release of 1.0, we need to establish clear policy on what +precisely constitutes a "minor" vs "major" change to the Rust language +itself (as opposed to libraries, which are covered by [RFC 1105]). +**This RFC proposes limiting breaking changes to changes with +soundness implications**: this includes both bug fixes in the compiler +itself, as well as changes to the type system or RFCs that are +necessary to close flaws uncovered later. + +However, simply landing all breaking changes immediately could be very +disruptive to the ecosystem. Therefore, **the RFC also proposes +specific measures to mitigate the impact of breaking changes**, and +some criteria when those measures might be appropriate. 
+ +Furthermore, there are other kinds of changes that we may want to make +which feel like they *ought* to be possible, but which are in fact +breaking changes. The simplest example is adding a new keyword to the +language -- despite being a purely additive change, a new keyword can +of course conflict with existing identifiers. Therefore, **the RFC +proposes a simple annotation that allows crates to designate the +version of the language they were written for**. This effectively +permits some amount of breaking changes by making them "opt-in" +through the version attribute. + +However, even though the version attribute can be used to make +breaking changes "opt-in" (and hence not really breaking), this is +still a tool to be used with great caution. Therefore, **the RFC also +proposes guidelines on when it is appropriate to include an "opt-in" +breaking change and when it is not**. + +This RFC is focused specifically on the question of what kinds of +changes we can make in a minor release (as well as some limited +mechanisms that lay the groundwork for certain kinds of anticipated +changes). It intentionally does not address the question of a release +schedule for Rust 2.0, nor does it propose any new features +itself. These topics are complex enough to be worth considering in +separate RFCs. + +# Detailed design + +The detailed design is broken into two major section: how to address +soundness changes, and how to address other, opt-in style changes. We +do not discuss non-breaking changes here, since obviously those are +safe. + +### Soundness changes + +When compiler bugs or soundness problems are encountered in the +language itself (as opposed to in a library), clearly they ought to be +fixed. However, it is important to fix them in such a way as to +minimize the impact on the ecosystem. + +The first step then is to evaluate the impact of the fix on the crates +found in the `crates.io` website (using e.g. the crater tool). 
If +impact is found to be "small" (which this RFC does not attempt to +precisely define), then the fix can simply be landed. As today, the +commit message of any breaking change should include the term +`[breaking-change]` along with a description of how to resolve the +problem, which helps those people who are affected to migrate their +code. A description of the problem should also appear in the relevant +subteam report. + +In cases where the impact seems larger, the following steps can be +taken to ease the transition: + +1. Identify important crates (such as those with many dependencies) + and work with the crate author to correct the code as quickly as + possible, ideally before the fix even lands. +2. Work hard to ensure that the error message identifies the problem + clearly and suggests the appropriate solution. +3. Provide an annotation that allows for a scoped "opt out" of the + newer rules, as described below. While the change is still + breaking, this at least makes it easy for crates to update and get + back to compiling status quickly. +4. Begin with a deprecation or other warning before issuing a hard + error. In extreme cases, it might be nice to begin by issuing a + deprecation warning for the unsound behavior, and only make the + behavior a hard error after the deprecation has had time to + circulate. This gives people more time to update their crates. + However, this option may frequently not be available, because the + source of a compilation error is often hard to pin down with + precision. + +Some of the factors that should be taken into consideration when +deciding whether and how to minimize the impact of a fix: + +- How many crates on `crates.io` are affected? + - This is a general proxy for the overall impact (since of course + there will always be private crates that are not part of + crates.io). +- Were particularly vital or widely used crates affected? 
+ - This could indicate that the impact will be wider than the raw + number would suggest. +- Does the change silently change the result of running the program, + or simply cause additional compilation failures? + - The latter, while frustrating, are easier to diagnose. +- What changes are needed to get code compiling again? Are those + changes obvious from the error message? + - The more cryptic the error, the more frustrating it is when + compilation fails. + +#### Opting out + +In some cases, it may be useful to permit users to opt out of new type +rules. The intention is that this "opt out" is used as a temporary +crutch to make it easy to get the code up and running. Depending on +the severity of the soundness fix, the "opt out" may be permanently +available, or it could be removed in a later release. In either case, +use of the "opt out" API would trigger the deprecation lint. + +#### Changes that alter dynamic semantics versus typing rules + +In some cases, fixing a bug may not cause crates to stop compiling, +but rather will cause them to silently start doing something different +than they were doing before. In cases like these, the same principle +of using mitigation measures to lessen the impact (and ease the +transition) applies, but the precise strategy to be used will have to +be worked out on a more case-by-case basis. This is particularly +relevant to the underspecified areas of the language described in the +next section. + +Our approach to handling [dynamic drop][RFC 320] is a good +example. Because we expect that moving to the complete non-zeroing +dynamic drop semantics will break code, we've made an intermediate +change that +[altered the compiler to fill with a non-zero value](https://github.com/rust-lang/rust/pull/23535), +which helps to expose code that was implicitly relying on the current +behavior (much of which has since been restructured in a more +future-proof way).
+ +#### Underspecified language semantics + +There are a number of areas where the precise language semantics are +currently somewhat underspecified. Over time, we expect to be fully +defining the semantics of all of these areas. This may cause some +existing code -- and in particular existing unsafe code -- to break or +become invalid. Changes of this nature should be treated as soundness +changes, meaning that we should attempt to mitigate the impact and +ease the transition wherever possible. + +Known areas where change is expected include the following: + +- Destructor semantics: + - We plan to stop zeroing data and instead use marker flags on the stack, + as specified in [RFC 320]. This may affect destructors that rely on overwriting + memory or using the `unsafe_no_drop_flag` attribute. + - Currently, panicking in a destructor can cause unintentional memory + leaks and other poor behavior (see [#14875], [#16135]). We are + likely to make panic in a destructor simply abort, but the precise + mechanism is not yet decided. + - Order of destructor execution within a data structure is somewhat + inconsistent (see [#744]). +- The legal aliasing rules between unsafe pointers are not fully settled (see [#19733]). +- The interplay of associated types and lifetimes is not fully settled and can lead + to unsoundness in some cases (see [#23442]). +- The trait selection algorithm is expected to be improved and made more complete over time. + It is possible that this will affect existing code. +- [Overflow semantics][RFC 560]: in particular, we may have missed some cases. +- Memory allocation in unsafe code is currently unstable. We expect to + be defining safe interfaces as part of the work on supporting + tracing garbage collectors (see [#415]). +- The treatment of hygiene in macros is uneven (see [#22462], [#24278]). In some cases, + changes here may be backwards compatible, or may be more appropriate only with explicit opt-in + (or perhaps an alternate macro system altogether).
+- The layout of data structures is expected to change over time unless they are annotated + with a `#[repr(C)]` attribute. +- Lints will evolve over time (both the lints that are enabled and the + precise cases that lints catch). We expect to introduce a + [means to limit the effect of these changes on dependencies][#1029]. +- Stack overflow is currently detected via a segmented stack check + prologue and results in an abort. We expect to experiment with a + system based on guard pages in the future. +- We currently abort the process on OOM conditions (exceeding the heap space, overflowing + the stack). We may attempt to panic in such cases instead if possible. +- Some details of type inference may change. For example, we expect to + implement the fallback mechanism described in [RFC 213], and we may + wish to make minor changes to accommodate overloaded integer + literals. In some cases, type inferences changes may be better + handled via explicit opt-in. + +(Although it is not directly covered by this RFC, it's worth noting in +passing that some of the CLI flags to the compiler may change in the +future as well. The `-Z` flags are of course explicitly unstable, but +some of the `-C`, rustdoc, and linker-specific flags are expected to +evolve over time.) + +### Opt-in changes + +For breaking changes that are not related to soundness or language +semantics, but are still deemed desirable, an opt-in strategy can be +used instead. This section describes an attribute for opting in to +newer language updates, and gives guidelines on what kinds of changes +should or should not be introduced in this fashion. + +We use the term *"opt-in changes"* to refer to changes that would be +breaking changes, but are not because of the opt-in mechanism. + +#### Rust version attribute + +The specific proposal is an attribute `#![rust_version="X.Y"]` that +can be attached to the crate; the version `X.Y` in this attribute is +called the crate's "declared version". 
Every build of the Rust +compiler will also have a version number built into it reflecting the +current release. + +When a `#![rust_version="X.Y"]` attribute is encountered, the compiler +will endeavor to produce the semantics of Rust "as it was" during +version `X.Y`. RFCs that propose opt-in changes should discuss how the +older behavior can be supported in the compiler, but this is expected +to be straightforward: if supporting older behavior is hard to do, it +may indicate that the opt-in change is too complex and should not be +accepted. + +If the crate declares a version `X.Y` that is *newer* than the +compiler itself, the compiler should simply issue a warning and +proceed as if the crate had declared the compiler's version (i.e., the +newest version the compiler knows about). + +Note that if the changes introduced by the Rust version `X.Y` affect +parsing, implementing these semantics may require some limited amount +of feedback between the parser and the tokenizer, or else a limited +"pre-parse" to scan the set of crate attributes and extract the +version. For example, if version `X.Y` adds new keywords, the +tokenizer will likely need to be configured appropriately with the +proper set of keywords. For this reason, it may make sense to require +that the `#![rust_version]` attribute appear *first* on the crate. + +#### When opt-in changes are appropriate + +Opt-in changes allow us to greatly expand the scope of the kinds of +additions we can make without breaking existing code, but they are not +applicable in all situations. A good rule of thumb is that an opt-in +change is only appropriate if the exact effect of the older code can +be easily recreated in the newer system with only surface changes to +the syntax. + +Another view is that opt-in changes are appropriate if those changes +do not affect the "abstract AST" of your Rust program.
In other words, +existing Rust syntax is just a serialization of a more idealized view +of the syntax, in which there are no conflicts between keywords and +identifiers, syntactic sugar is expanded, and so forth. Opt-in changes +might affect the translation into this abstract AST, but should not +affect the semantics of the AST itself at a deeper level. This concept +of an idealized AST is analogous to the "elaborated syntax" described +in [RFC 1105], except that it is at a conceptual level. + +So, for example, the conflict between new keywords and existing +identifiers can (generally) be trivially worked around by renaming +identifiers, though the question of public identifiers is an +interesting one (contextual keywords may suffice, or else perhaps some +kind of escaping syntax -- we defer this question to a later +RFC). + +In the previous section on breaking changes, we identified various +criteria that can be used to decide how to approach a breaking change +(i.e., how far to go in attempting to mitigate the fallout). For the +most part, those same criteria also apply when deciding whether to +accept an "opt-in" change: + +- How many crates on `crates.io` would break if they "opted-in" to the + change, and would opting in require extensive changes? +- Does the change silently change the result of running the program, + or simply cause additional compilation failures? + - Opt-in changes that silently change the result of running the + program are particularly unlikely to be accepted. +- What changes are needed to get code compiling again? Are those + changes obvious from the error message? + +# Drawbacks + +**Allowing unsafe code to continue compiling -- even with warnings -- +raises the probability that people experience crashes and other +undesirable effects while using Rust.** However, in practice, most +unsafety hazards are more theoretical than practical: consider the +problem with the `thread::scoped` API.
To actually create a data-race, +one had to place the guard into an `Rc` cycle, which would be quite +unusual. Therefore, a compromise path that warns about bad content but +provides an option for gradual migration seems preferable. + +**Deprecation implies a maintenance burden.** For library APIs, +this is relatively simple, but for type-system changes it can be quite +onerous. We may want to consider a policy for dropping older, +deprecated type-system rules after some time, as discussed in the +section on *unresolved questions*. + +## Notes on phasing + +# Alternatives + +**Rather than supporting opt-in changes, one might consider simply +issuing a new major release for every such change.** Put simply, +though, issuing a new major release just because we want to have a new +keyword feels like overkill. This seems likely to have two potential +negative effects. It may simply cause us to not make some of the +changes we would make otherwise, or to work harder to fit them within the +existing syntactic constraints. It may also serve to dilute the +meaning of issuing a new major version, since even additive changes +that do not affect existing code in any meaningful way would result in +a major release. One would then be tempted to have some *additional* +numbering scheme, PR blitz, or other means to notify people when a new +major version is coming that indicates deeper changes. + +**Rather than simply fixing soundness bugs, we could use the opt-in +mechanism to fix them conditionally.** This was initially considered +as an option, but eventually rejected for the following reasons: + +- This would effectively cause a deeper split between minor versions; + currently, opt-in is limited to "surface changes" only, but allowing + opt-in to affect the type system feels like it would be creating two + distinct languages. +- It seems likely that all users of Rust will want to know that their code + is sound and would not want to be working with unsafe constructs or bugs.
+- Users may choose not to opt-in to newer versions because they do not + need the new features introduced there or because they wish to + preserve compatibility with older compilers. It would be sad for + them to lose the benefits of bug fixes as well. +- We already have several mitigation measures, such as opt-out or + temporary deprecation, that can be used to ease the transition + around a soundness fix. Moreover, separating out new type rules so + that they can be "opted into" can be very difficult and would + complicate the compiler internally; it would also make it harder to + reason about the type system as a whole. + +**Rather than using a version number to opt-in to minor changes, one +might consider using the existing feature mechanism.** For example, +one could write `#![feature(foo)]` to opt in to the feature "foo" and +its associated keywords and type rules, rather than +`#![rust_version="1.2.3"]`. While using minimum version numbers is +more opaque than named features, they do offer several advantages: + +1. Using a version number alone makes it easy to think about what + version of Rust you are using as a conceptual unit, rather than + choosing features "a la carte". +2. Using named features, the list of features that must be attached to + Rust code will grow indefinitely, presuming your crate wants to + stay up to date. +3. Using a version attribute preserves a mental separation between + "experimental work" (feature gates) and stable, new features. +4. Named features present a combinatoric testing problem, where we + should (in principle) test for all possible combinations of + features. + +# Unresolved questions + +**Can (and should) we give a more precise definition for compiler bugs +and soundness problems?** The current text is vague on what precisely +constitutes a compiler bug and soundness change. 
It may be worth +defining more precisely, though likely this would be best done as part +of writing up a more thorough (and authoritative) Rust reference +manual. + +**Should we add a mechanism for "escaping" keywords?"** We may need a +mechanism for escaping keywords in the future. Imagine you have a +public function named `foo`, and we add a keyword `foo`. Now, if you +opt in to the newer version of Rust, your function declaration is +illegal: but if you rename the function `foo`, you are making a +breaking change for your clients, which you may not wish to do. If we +had an escaping mechanism, you would probably still want to deprecate +`foo` in favor of a new function `bar` (since typing `foo` would be +awkward), but it could still exist. + +**Should we add a mechanism for skipping over new syntax?** The +current `#[cfg]` mechanism is applied *after* parsing. This implies +that if we add new syntax, crates which employ that new syntax will +not be parsable by older compilers, even if the modules that depend on +that new syntax are disabled via `#[cfg]` directives. It may be useful +to add some mechanism for informing the parser that it should skip +over sections of the input (presumably based on token trees). One +approach to this might just be modifying the existing `#[cfg]` +directives so that they are applied during parsing rather than as a +post-pass. + +**What precisely constitutes "small" impact?** This RFC does not +attempt to define when the impact of a patch is "small" or "not +small". We will have to develop guidelines over time based on +precedent. One of the big unknowns is how indicative the breakage we +observe on `crates.io` will be of the total breakage that will occur: +it is certainly possible that all crates on `crates.io` work fine, but +the change still breaks a large body of code we do not have access to. 
+ +**Should deprecation due to unsoundness have a special lint?** We may +not want to use the same deprecation lint for unsoundness that we use +for everything else. + +**What attribute should we use to "opt out" of soundness changes?** +The section on breaking changes indicated that it may sometimes be +appropriate to include an "opt out" that people can use to temporarily +revert to older, unsound type rules, but did not specify precisely +what that opt-out should look like. Ideally, we would identify a +specific attribute in advance that will be used for such purposes. In +the past, we have simply created ad-hoc attributes (e.g., +`#[old_orphan_check]`), but because custom attributes are forbidden by +stable Rust, this has the unfortunate side-effect of meaning that code +which opts out of the newer rules cannot be compiled on older +compilers (even though it's using the older type system rules). If we +introduce an attribute in advance we will not have this problem. + +[RFC 1105]: https://github.com/rust-lang/rfcs/pull/1105 +[RFC 320]: https://github.com/rust-lang/rfcs/pull/320 +[#744]: https://github.com/rust-lang/rfcs/issues/744 +[#14875]: https://github.com/rust-lang/rust/issues/14875 +[#16135]: https://github.com/rust-lang/rust/issues/16135 +[#19733]: https://github.com/rust-lang/rust/issues/19733 +[#23442]: https://github.com/rust-lang/rust/issues/23442 +[RFC 213]: https://github.com/rust-lang/rfcs/pull/213 +[#415]: https://github.com/rust-lang/rfcs/issues/415 +[#22462]: https://github.com/rust-lang/rust/issues/22462#issuecomment-81756673 +[#24278]: https://github.com/rust-lang/rust/issues/24278 +[#1029]: https://github.com/rust-lang/rfcs/issues/1029 +[RFC 560]: https://github.com/rust-lang/rfcs/pull/560 diff --git a/text/0000-language-semver.md b/text/0000-language-semver.md new file mode 100644 index 00000000000..ef42b11377a --- /dev/null +++ b/text/0000-language-semver.md @@ -0,0 +1,425 @@ +- Feature Name: N/A +- Start Date: 2015-05-07 +- RFC PR: (leave
this empty) +- Rust Issue: (leave this empty) + +# Summary + +This RFC has two main goals: + +- define what precisely constitutes a breaking change for the Rust language itself; +- define a language versioning mechanism that extends the sorts of + changes we can make without causing compilation failures (for + example, adding new keywords). + +# Motivation + +With the release of 1.0, we need to establish clear policy on what +precisely constitutes a "minor" vs "major" change to the Rust language +itself (as opposed to libraries, which are covered by [RFC 1105]). +**This RFC proposes limiting breaking changes to changes with +soundness implications**: this includes both bug fixes in the compiler +itself, as well as changes to the type system or RFCs that are +necessary to close flaws uncovered later. + +However, simply landing all breaking changes immediately could be very +disruptive to the ecosystem. Therefore, **the RFC also proposes +specific measures to mitigate the impact of breaking changes**, and +some criteria when those measures might be appropriate. + +Furthermore, there are other kinds of changes that we may want to make +which feel like they *ought* to be possible, but which are in fact +breaking changes. The simplest example is adding a new keyword to the +language -- despite being a purely additive change, a new keyword can +of course conflict with existing identifiers. Therefore, **the RFC +proposes a simple annotation that allows crates to designate the +version of the language they were written for**. This effectively +permits some amount of breaking changes by making them "opt-in" +through the version attribute. + +However, even though the version attribute can be used to make +breaking changes "opt-in" (and hence not really breaking), this is +still a tool to be used with great caution. Therefore, **the RFC also +proposes guidelines on when it is appropriate to include an "opt-in" +breaking change and when it is not**. 
+ +This RFC is focused specifically on the question of what kinds of +changes we can make within a single major version (as well as some +limited mechanisms that lay the groundwork for certain kinds of +anticipated changes). It intentionally does not address the question +of a release schedule for Rust 2.0, nor does it propose any new +features itself. These topics are complex enough to be worth +considering in separate RFCs. + +# Detailed design + +The detailed design is broken into two major section: how to address +soundness changes, and how to address other, opt-in style changes. We +do not discuss non-breaking changes here, since obviously those are +safe. + +### Soundness changes + +When compiler bugs or soundness problems are encountered in the +language itself (as opposed to in a library), clearly they ought to be +fixed. However, it is important to fix them in such a way as to +minimize the impact on the ecosystem. + +The first step then is to evaluate the impact of the fix on the crates +found in the `crates.io` website (using e.g. the crater tool). If +impact is found to be "small" (which this RFC does not attempt to +precisely define), then the fix can simply be landed. As today, the +commit message of any breaking change should include the term +`[breaking-change]` along with a description of how to resolve the +problem, which helps those people who are affected to migrate their +code. A description of the problem should also appear in the relevant +subteam report. + +In cases where the impact seems larger, the following steps can be +taken to ease the transition: + +1. Identify important crates (such as those with many dependencies) + and work with the crate author to correct the code as quickly as + possible, ideally before the fix even lands. +2. Work hard to ensure that the error message identifies the problem + clearly and suggests the appropriate solution. +3. 
Provide an annotation that allows for a scoped "opt out" of the + newer rules, as described below. While the change is still + breaking, this at least makes it easy for crates to update and get + back to compiling status quickly. +4. Begin with a deprecation or other warning before issuing a hard + error. In extreme cases, it might be nice to begin by issuing a + deprecation warning for the unsound behavior, and only make the + behavior a hard error after the deprecation has had time to + circulate. This gives people more time to update their crates. + However, this option may frequently not be available, because the + source of a compilation error is often hard to pin down with + precision. + +Some of the factors that should be taken into consideration when +deciding whether and how to minimize the impact of a fix: + +- How many crates on `crates.io` are affected? + - This is a general proxy for the overall impact (since of course + there will always be private crates that are not part of + crates.io). +- Were particularly vital or widely used crates affected? + - This could indicate that the impact will be wider than the raw + number would suggest. +- Does the change silently change the result of running the program, + or simply cause additional compilation failures? + - The latter, while frustrating, are easier to diagnose. +- What changes are needed to get code compiling again? Are those + changes obvious from the error message? + - The more cryptic the error, the more frustrating it is when + compilation fails. + +#### Opting out + +In some cases, it may be useful to permit users to opt out of new type +rules. The intention is that this "opt out" is used as a temporary +crutch to make it easy to get the code up and running. Depending on +the severity of the soundness fix, the "opt out" may be permanently +available, or it could be removed in a later release. In either case, +use of the "opt out" API would trigger the deprecation lint. 
+ +#### Changes that alter dynamic semantics versus typing rules + +In some cases, fixing a bug may not cause crates to stop compiling, +but rather will cause them to silently start doing something different +than they were doing before. In cases like these, the same principle +of using mitigation measures to lessen the impact (and ease the +transition) applies, but the precise strategy to be used will have to +be worked out on a more case-by-case basis. This is particularly +relevant to the underspecified areas of the language described in the +next section. + +Our approach to handling [dynamic drop][RFC 320] is a good +example. Because we expect that moving to the complete non-zeroing +dynamic drop semantics will break code, we've made an intermediate +change that +[altered the compiler to fill with a non-zero value](https://github.com/rust-lang/rust/pull/23535), +which helps to expose code that was implicitly relying on the current +behavior (much of which has since been restructured in a more +future-proof way). + +#### Underspecified language semantics + +There are a number of areas where the precise language semantics are +currently somewhat underspecified. Over time, we expect to be fully +defining the semantics of all of these areas. This may cause some +existing code -- and in particular existing unsafe code -- to break or +become invalid. Changes of this nature should be treated as soundness +changes, meaning that we should attempt to mitigate the impact and +ease the transition wherever possible. + +Known areas where change is expected include the following: + +- Destructor semantics: + - We plan to stop zeroing data and instead use marker flags on the stack, + as specified in [RFC 320]. This may affect destructors that rely on overwriting + memory or using the `unsafe_no_drop_flag` attribute. + - Currently, panicking in a destructor can cause unintentional memory + leaks and other poor behavior (see [#14875], [#16135]).
We are + likely to make panic in a destructor simply abort, but the precise + mechanism is not yet decided. + - Order of destructor execution within a data structure is somewhat + inconsistent (see [#744]). +- The legal aliasing rules between unsafe pointers are not fully settled (see [#19733]). +- The interplay of associated types and lifetimes is not fully settled and can lead + to unsoundness in some cases (see [#23442]). +- The trait selection algorithm is expected to be improved and made more complete over time. + It is possible that this will affect existing code. +- [Overflow semantics][RFC 560]: in particular, we may have missed some cases. +- Memory allocation in unsafe code is currently unstable. We expect to + be defining safe interfaces as part of the work on supporting + tracing garbage collectors (see [#415]). +- The treatment of hygiene in macros is uneven (see [#22462], [#24278]). In some cases, + changes here may be backwards compatible, or may be more appropriate only with explicit opt-in + (or perhaps an alternate macro system altogether). +- The layout of data structures is expected to change over time unless they are annotated + with a `#[repr(C)]` attribute. +- Lints will evolve over time (both the lints that are enabled and the + precise cases that lints catch). We expect to introduce a + [means to limit the effect of these changes on dependencies][#1029]. +- Stack overflow is currently detected via a segmented stack check + prologue and results in an abort. We expect to experiment with a + system based on guard pages in the future. +- We currently abort the process on OOM conditions (exceeding the heap space, overflowing + the stack). We may attempt to panic in such cases instead if possible. +- Some details of type inference may change. For example, we expect to + implement the fallback mechanism described in [RFC 213], and we may + wish to make minor changes to accommodate overloaded integer + literals.
In some cases, type inferences changes may be better + handled via explicit opt-in. + +(Although it is not directly covered by this RFC, it's worth noting in +passing that some of the CLI flags to the compiler may change in the +future as well. The `-Z` flags are of course explicitly unstable, but +some of the `-C`, rustdoc, and linker-specific flags are expected to +evolve over time.) + +### Opt-in changes + +For breaking changes that are not related to soundness or language +semantics, but are still deemed desirable, an opt-in strategy can be +used instead. This section describes an attribute for opting in to +newer language updates, and gives guidelines on what kinds of changes +should or should not be introduced in this fashion. + +We use the term *"opt-in changes"* to refer to changes that would be +breaking changes, but are not because of the opt-in mechanism. + +#### Rust version attribute + +The specific proposal is an attribute `#![rust_version="X.Y"]` that +can be attached to the crate; the version `X.Y` in this attribute is +called the crate's "declared version". Every build of the Rust +compiler will also have a version number built into it reflecting the +current release. + +When a `#[rust_version="X.Y"]` attribute is encountered, the compiler +will endeavor to produce the semantics of Rust "as it was" during +version `X.Y`. RFCs that propose opt-in changes should discuss how the +older behavior can be supported in the compiler, but this is expected +to be straightforward: if supporting older behavior is hard to do, it +may indicate that the opt-in change is too complex and should not be +accepted. + +If the crate declares a version `X.Y` that is *newer* than the +compiler itself, the compiler should simply issue a warning and +proceed as if the crate had declared the compiler's version (i.e., the +newer version the compiler knows about). 
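The rule for a declared version that is newer than the compiler can be sketched roughly as follows. This is only an illustration of the fallback described above, not compiler code: the `Version` type, `Version::parse`, and `effective_version` are hypothetical names, and the warning text is made up.

```rust
/// A `major.minor` language version, as written in the proposed
/// `#![rust_version="X.Y"]` attribute. (Hypothetical type for illustration.)
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u32,
    minor: u32,
}

impl Version {
    /// Parses a string of the form "X.Y"; returns `None` for anything else.
    fn parse(s: &str) -> Option<Version> {
        let (major, minor) = s.split_once('.')?;
        Some(Version {
            major: major.parse().ok()?,
            minor: minor.parse().ok()?,
        })
    }
}

/// Picks the version whose semantics the compiler should emulate.
/// If the crate declares a version newer than the compiler, the compiler's
/// own version is used and a warning message is returned alongside it.
fn effective_version(declared: Version, compiler: Version) -> (Version, Option<String>) {
    if declared > compiler {
        let warning = format!(
            "crate declares Rust {}.{}, but this compiler is {}.{}; using the compiler's version",
            declared.major, declared.minor, compiler.major, compiler.minor
        );
        (compiler, Some(warning))
    } else {
        (declared, None)
    }
}

fn main() {
    let compiler = Version { major: 1, minor: 2 };
    let declared = Version::parse("1.5").expect("well-formed version");
    let (used, warning) = effective_version(declared, compiler);
    println!("emulating {:?}, warning: {:?}", used, warning);
}
```

The derived ordering compares `major` first and then `minor`, which matches the usual reading of `X.Y` version numbers.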
+ +Note that if the changes introduced by the Rust version `X.Y` affect +parsing, implementing these semantics may require some limited amount +of feedback between the parser and the tokenizer, or else a limited +"pre-parse" to scan the set of crate attributes and extract the +version. For example, if version `X.Y` adds new keywords, the +tokenizer will likely need to be configured appropriately with the +proper set of keywords. For this reason, it may make sense to require +that the `#![rust_version]` attribute appear *first* on the crate. + +#### When opt-in changes are appropriate + +Opt-in changes allow us to greatly expand the scope of the kinds of +additions we can make without breaking existing code, but they are not +applicable in all situations. A good rule of thumb is that an opt-in +change is only appropriate if the exact effect of the older code can +be easily recreated in the newer system with only surface changes to +the syntax. + +Another view is that opt-in changes are appropriate if those changes +do not affect the "abstract AST" of your Rust program. In other words, +existing Rust syntax is just a serialization of a more idealized view +of the syntax, in which there are no conflicts between keywords and +identifiers, syntactic sugar is expanded, and so forth. Opt-in changes +might affect the translation into this abstract AST, but should not +affect the semantics of the AST itself at a deeper level. This concept +of an idealized AST is analogous to the "elaborated syntax" described +in [RFC 1105], except that it is at a conceptual level. + +So, for example, the conflict between new keywords and existing +identifiers can (generally) be trivially worked around by renaming +identifiers, though the question of public identifiers is an +interesting one (contextual keywords may suffice, or else perhaps some +kind of escaping syntax -- we defer this question to a later +RFC).
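
The keyword conflict can be pictured with a small classifier in which the same word is an identifier under an older declared version and a keyword under a newer one. This is a sketch only: the keyword `foo`, the version 1.5 in which it is assumed to appear, and the names `KEYWORDS`, `Token`, and `classify` are all invented for illustration.

```rust
/// Hypothetical table: each keyword paired with the (major, minor)
/// version in which it is assumed to have been introduced.
const KEYWORDS: &[(&str, (u32, u32))] = &[
    ("fn", (1, 0)),
    ("struct", (1, 0)),
    ("foo", (1, 5)), // invented keyword, used only for this example
];

#[derive(Debug, PartialEq)]
enum Token<'a> {
    Keyword(&'a str),
    Ident(&'a str),
}

/// Classify a word under the crate's declared version. A word counts as
/// a keyword only if it was introduced at or before that version.
fn classify(word: &str, declared: (u32, u32)) -> Token<'_> {
    let is_keyword = KEYWORDS
        .iter()
        .any(|&(kw, since)| kw == word && since <= declared);
    if is_keyword {
        Token::Keyword(word)
    } else {
        Token::Ident(word)
    }
}
```

A crate that declares 1.4 can keep using `foo` as an identifier; only after opting in to 1.5 does the rename become necessary.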
+ +In the previous section on breaking changes, we identified various +criteria that can be used to decide how to approach a breaking change +(i.e., how far to go in attempting to mitigate the fallout). For the +most part, those same criteria also apply when deciding whether to +accept an "opt-in" change: + +- How many crates on `crates.io` would break if they "opted-in" to the + change, and would opting in require extensive changes? +- Does the change silently change the result of running the program, + or simply cause additional compilation failures? + - Opt-in changes that silently change the result of running the + program are particularly unlikely to be accepted. +- What changes are needed to get code compiling again? Are those + changes obvious from the error message? + +# Drawbacks + +**Allowing unsafe code to continue compiling -- even with warnings -- +raises the probability that people experience crashes and other +undesirable effects while using Rust.** However, in practice, most +unsafety hazards are more theoretical than practical: consider the +problem with the `thread::scoped` API. To actually create a data-race, +one had to place the guard into an `Rc` cycle, which would be quite +unusual. Therefore, a compromise path that warns about bad content but +provides an option for gradual migration seems preferable. + +**Deprecation implies a maintenance burden.** For library APIs, +this is relatively simple, but for type-system changes it can be quite +onerous. We may want to consider a policy for dropping older, +deprecated type-system rules after some time, as discussed in the +section on *unresolved questions*. + +## Notes on phasing + +# Alternatives + +**Rather than supporting opt-in changes, one might consider simply +issuing a new major release for every such change.** Put simply, +though, issuing a new major release just because we want to have a new +keyword feels like overkill. This seems likely to have two potential +negative effects.
It may simply cause us to not make some of the +changes we would make otherwise, or work harder to fit them within the +existing syntactic constraints. It may also serve to dilute the +meaning of issuing a new major version, since even additive changes +that do not affect existing code in any meaningful way would result in +a major release. One would then be tempted to have some *additional* +numbering scheme, PR blitz, or other means to notify people when a new +major version is coming that indicates deeper changes. + +**Rather than simply fixing soundness bugs, we could use the opt-in +mechanism to fix them conditionally.** This was initially considered +as an option, but eventually rejected for the following reasons: + +- This would effectively cause a deeper split between minor versions; + currently, opt-in is limited to "surface changes" only, but allowing + opt-in to affect the type system feels like it would be creating two + distinct languages. +- It seems likely that all users of Rust will want to know that their code + is sound and would not want to be working with unsafe constructs or bugs. +- Users may choose not to opt-in to newer versions because they do not + need the new features introduced there or because they wish to + preserve compatibility with older compilers. It would be sad for + them to lose the benefits of bug fixes as well. +- We already have several mitigation measures, such as opt-out or + temporary deprecation, that can be used to ease the transition + around a soundness fix. Moreover, separating out new type rules so + that they can be "opted into" can be very difficult and would + complicate the compiler internally; it would also make it harder to + reason about the type system as a whole. 
+ +**Rather than using a version number to opt-in to minor changes, one +might consider using the existing feature mechanism.** For example, +one could write `#![feature(foo)]` to opt in to the feature "foo" and +its associated keywords and type rules, rather than +`#![rust_version="1.2.3"]`. While using minimum version numbers is +more opaque than named features, they do offer several advantages: + +1. Using a version number alone makes it easy to think about what + version of Rust you are using as a conceptual unit, rather than + choosing features "a la carte". +2. Using named features, the list of features that must be attached to + Rust code will grow indefinitely, presuming your crate wants to + stay up to date. +3. Using a version attribute preserves a mental separation between + "experimental work" (feature gates) and stable, new features. +4. Named features present a combinatoric testing problem, where we + should (in principle) test for all possible combinations of + features. + +# Unresolved questions + +**Can (and should) we give a more precise definition for compiler bugs +and soundness problems?** The current text is vague on what precisely +constitutes a compiler bug and soundness change. It may be worth +defining more precisely, though likely this would be best done as part +of writing up a more thorough (and authoritative) Rust reference +manual. + +**Should we add a mechanism for "escaping" keywords?** We may need a +mechanism for escaping keywords in the future. Imagine you have a +public function named `foo`, and we add a keyword `foo`. Now, if you +opt in to the newer version of Rust, your function declaration is +illegal: but if you rename the function `foo`, you are making a +breaking change for your clients, which you may not wish to do. If we +had an escaping mechanism, you would probably still want to deprecate +`foo` in favor of a new function `bar` (since typing `foo` would be +awkward), but it could still exist.
+ +**Should we add a mechanism for skipping over new syntax?** The +current `#[cfg]` mechanism is applied *after* parsing. This implies +that if we add new syntax, crates which employ that new syntax will +not be parsable by older compilers, even if the modules that depend on +that new syntax are disabled via `#[cfg]` directives. It may be useful +to add some mechanism for informing the parser that it should skip +over sections of the input (presumably based on token trees). One +approach to this might just be modifying the existing `#[cfg]` +directives so that they are applied during parsing rather than as a +post-pass. + +**What precisely constitutes "small" impact?** This RFC does not +attempt to define when the impact of a patch is "small" or "not +small". We will have to develop guidelines over time based on +precedent. One of the big unknowns is how indicative the breakage we +observe on `crates.io` will be of the total breakage that will occur: +it is certainly possible that all crates on `crates.io` work fine, but +the change still breaks a large body of code we do not have access to. + +**Should deprecation due to unsoundness have a special lint?** We may +not want to use the same deprecation lint for unsoundness that we use +for everything else. + +**What attribute should we use to "opt out" of soundness changes?** +The section on breaking changes indicated that it may sometimes be +appropriate to include an "opt out" that people can use to temporarily +revert to older, unsound type rules, but did not specify precisely +what that opt-out should look like. Ideally, we would identify a +specific attribute in advance that will be used for such purposes.
In +the past, we have simply created ad-hoc attributes (e.g., +`#[old_orphan_check]`), but because custom attributes are forbidden by +stable Rust, this has the unfortunate side-effect of meaning that code +which opts out of the newer rules cannot be compiled on older +compilers (even though it's using the older type system rules). If we +introduce an attribute in advance we will not have this problem. + +[RFC 1105]: https://github.com/rust-lang/rfcs/pull/1105 +[RFC 320]: https://github.com/rust-lang/rfcs/pull/320 +[#744]: https://github.com/rust-lang/rfcs/issues/744 +[#14875]: https://github.com/rust-lang/rust/issues/14875 +[#16135]: https://github.com/rust-lang/rust/issues/16135 +[#19733]: https://github.com/rust-lang/rust/issues/19733 +[#23442]: https://github.com/rust-lang/rust/issues/23442 +[RFC 213]: https://github.com/rust-lang/rfcs/pull/213 +[#415]: https://github.com/rust-lang/rfcs/issues/415 +[#22462]: https://github.com/rust-lang/rust/issues/22462#issuecomment-81756673 +[#24278]: https://github.com/rust-lang/rust/issues/24278 +[#1029]: https://github.com/rust-lang/rfcs/issues/1029 +[RFC 560]: https://github.com/rust-lang/rfcs/pull/560 From 622e2524c3f29c0ac336501591bb1f92dc44d545 Mon Sep 17 00:00:00 2001 From: Ulrik Sverdrup Date: Sun, 17 May 2015 16:55:33 +0200 Subject: [PATCH 0294/1195] [str-split-at] RFC: introduce `split_at(mid: usize)` on `str` --- text/0000-str-split-at.md | 100 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 100 insertions(+) create mode 100644 text/0000-str-split-at.md diff --git a/text/0000-str-split-at.md b/text/0000-str-split-at.md new file mode 100644 index 00000000000..c47fd9fc748 --- /dev/null +++ b/text/0000-str-split-at.md @@ -0,0 +1,100 @@ +- Feature Name: str-split-at +- Start Date: 2015-05-17 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Introduce the method `split_at(&self, mid: usize) -> (&str, &str)` on `str`, +to divide a slice into two, just like we can with `[T]`.
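
As a small illustration of the proposed shape, the sketch below uses `split_at` to cut a string after a given number of characters by first converting the character count into a byte offset. It assumes a toolchain whose `str` already provides `split_at` with the signature given above; `split_after_chars` is a hypothetical helper and not part of the proposal.

```rust
/// Split `s` after its first `n` characters. If `s` has fewer than `n`
/// characters, the whole string is returned as the first half.
/// (Hypothetical helper built on top of `str::split_at`.)
fn split_after_chars(s: &str, n: usize) -> (&str, &str) {
    // `char_indices` yields the byte offset of each character, which is
    // always on a character boundary and so is a valid `mid`.
    let mid = s
        .char_indices()
        .nth(n)
        .map(|(byte_offset, _)| byte_offset)
        .unwrap_or(s.len());
    s.split_at(mid)
}
```

The slice counterpart works the same way on elements rather than bytes, for example `[1u8, 2, 3].split_at(1)`.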
+ +# Motivation + +Adding `split_at` is a measure to provide a method from `[T]` in a version that +makes sense for `str`. + +Once used to `[T]`, users might even expect that `split_at` is present on str. + +It is a simple method with an obvious implementation, but it provides +convenience while working with string segmentation manually, which we already +have ample tools for (for example the method `find` that returns the first +matching byte offset). + +Using `split_at` can lead to less repeated bounds checks, since it is easy to +use cumulatively, splitting off a piece at a time. + +This feature is requested in [rust-lang/rust#18063][freq] + +[freq]: https://github.com/rust-lang/rust/issues/18063 + +# Detailed design + +Introduce the method `split_at(&self, mid: usize) -> (&str, &str)` on `str`, to +divide a slice into two. + +`mid` will be a byte offset from the start of the string, and it must be on +a character boundary. Both `0` and `self.len()` are valid splitting points. + +`split_at` will be an inherent method on `str` where possible, and will be +available from libcore and the layers above it. + +The following is a working implementation, implemented as a trait just for +illustration and to be testable as a custom extension: + +```rust +trait SplitAt { + fn split_at(&self, mid: usize) -> (&Self, &Self); +} + +impl SplitAt for str { + /// Divide one string slice into two at an index. + /// + /// The index `mid` is a byte offset from the start of the string + /// that must be on a character boundary. + /// + /// Return slices `&self[..mid]` and `&self[mid..]`. + /// + /// # Panics + /// + /// Panics if `mid` is beyond the last character of the string, + /// or if it is not on a character boundary. 
+ /// + /// # Examples + /// ``` + /// let s = "Löwe 老虎 Léopard"; + /// let first_space = s.find(' ').unwrap_or(s.len()); + /// let (a, b) = s.split_at(first_space); + /// + /// assert_eq!(a, "Löwe"); + /// assert_eq!(b, " 老虎 Léopard"); + /// ``` + fn split_at(&self, mid: usize) -> (&str, &str) { + (&self[..mid], &self[mid..]) + } +} +``` + +`split_at` will use a byte offset (a.k.a byte index) to be consistent with +slicing and the offset used by interrogator methods such as `find` or iterators +such as `char_indices`. Byte offsets are our standard lightweight position +indicators that we use to support efficient operations on string slices. + +Implementing `split_at_mut` is not relevant for `str` at this time. + +# Drawbacks + +* Possible name confusion with other `str` methods like `.split()` +* According to our developing API evolution and semver guidelines this is a + breaking change but a (very) minor change. Adding methods is something we + expect to be able to. (See [RFC PR #1105][pr1105]). + +[pr1105]: https://github.com/rust-lang/rfcs/pull/1105 + +# Alternatives + +* Recommend other splitting methods, like the split iterators. 
+* Stick to writing `(&foo[..mid], &foo[mid..])` + +# Unresolved questions + +* *None* From 2e727ea4e22ccde9c91b2b84e1c6b071b571cf8c Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 19 May 2015 11:31:52 -0700 Subject: [PATCH 0295/1195] RFC 1047 is socket timeouts --- text/{0000-socket-timeouts.md => 1047-socket-timeouts.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-socket-timeouts.md => 1047-socket-timeouts.md} (96%) diff --git a/text/0000-socket-timeouts.md b/text/1047-socket-timeouts.md similarity index 96% rename from text/0000-socket-timeouts.md rename to text/1047-socket-timeouts.md index 64b9c7ccf9e..81d72964462 100644 --- a/text/0000-socket-timeouts.md +++ b/text/1047-socket-timeouts.md @@ -1,7 +1,7 @@ -- Feature Name: socket_timeouts +- Feature Name: `socket_timeouts` - Start Date: 2015-04-08 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1047](https://github.com/rust-lang/rfcs/pull/1047) +- Rust Issue: [rust-lang/rust#25619](https://github.com/rust-lang/rust/issues/25619) # Summary From 8f8f7ea7ebc3890e1d8541f34f59ec15177311a8 Mon Sep 17 00:00:00 2001 From: Ulrik Sverdrup Date: Wed, 20 May 2015 03:09:46 +0200 Subject: [PATCH 0296/1195] [str-split-at] Update drawbacks --- text/0000-str-split-at.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/text/0000-str-split-at.md b/text/0000-str-split-at.md index c47fd9fc748..66f849096d0 100644 --- a/text/0000-str-split-at.md +++ b/text/0000-str-split-at.md @@ -83,6 +83,8 @@ Implementing `split_at_mut` is not relevant for `str` at this time. # Drawbacks +* `split_at` panics on 1) index out of bounds 2) index not on character + boundary. * Possible name confusion with other `str` methods like `.split()` * According to our developing API evolution and semver guidelines this is a breaking change but a (very) minor change. 
Adding methods is something we From e6da4014e737da35d4b013fcc84b356de1607789 Mon Sep 17 00:00:00 2001 From: James Miller Date: Wed, 20 May 2015 13:40:04 +1200 Subject: [PATCH 0297/1195] Write RFC for adding an expect intrinsic --- text/0000-expect-intrinsic.md | 43 +++++++++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+) create mode 100644 text/0000-expect-intrinsic.md diff --git a/text/0000-expect-intrinsic.md b/text/0000-expect-intrinsic.md new file mode 100644 index 00000000000..078aba7ffa8 --- /dev/null +++ b/text/0000-expect-intrinsic.md @@ -0,0 +1,43 @@ +- Feature Name: expect_intrinsic +- Start Date: 2015-05-20 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Provide an intrinsic function for hinting the likelihood of branches being taken. + +# Motivation + +Branch prediction can have significant effects on the running time of some code, especially tight +inner loops which may be run millions of times. While in general programmers aren't able to +effectively provide hints to the compiler, there are cases where the likelihood of some branch +being taken can be known. + +For example: in arbitrary-precision arithmetic, operations are often performed in a base that is +equal to `2^word_size`. The most basic division algorithm, "Schoolbook Division", has a step that +will be taken in `2/B` cases (where `B` is the base the numbers are in), given random input. On a +32-bit processor that is approximately one in two billion cases; for 64-bit it's approximately one in nine +quintillion cases. + +# Detailed design + +Implement an `expect` intrinsic with the signature: `fn(bool, bool) -> bool`. The first argument is +the condition being tested, the second argument is the expected result. The return value is the +same as the first argument, meaning that `if foo == bar { .. }` can be simply replaced with +`if expect(foo == bar, false) { .. }`. + +The expected value is required to be a constant value.
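
The call-site shape described above can be sketched as follows. The `expect` function here is a hypothetical stand-in written as an ordinary function so that the example compiles on its own: it returns its first argument unchanged and passes no hint to the optimizer, which a real intrinsic would do. `count_rare` is likewise an invented example.

```rust
/// Hypothetical stand-in for the proposed intrinsic. It returns `cond`
/// unchanged; a real intrinsic would additionally forward `expected`
/// to the optimizer as a branch hint, which this function does not do.
#[inline(always)]
fn expect(cond: bool, _expected: bool) -> bool {
    cond
}

/// Count the elements equal to `rare`, marking the match as the branch
/// that is expected not to be taken.
fn count_rare(values: &[u32], rare: u32) -> usize {
    let mut hits = 0;
    for &v in values {
        // The second argument is a constant, as the design requires.
        if expect(v == rare, false) {
            hits += 1;
        }
    }
    hits
}
```

Because the return value equals the condition, wrapping an existing test in `expect` changes only the hint and never the result.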
+ +# Drawbacks + +The second argument is required to be a constant value, which can't be easily expressed. + +# Alternatives + +Provide a pair of intrinsics `likely` and `unlikely`, these are the same as `expect` just with +`true` and `false` substituted in for the expected value, respectively. + +# Unresolved questions + +None. \ No newline at end of file From 5d5923a4c2b8efc8578b53f56a3e43b7fd7a2390 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Thu, 21 May 2015 15:02:10 -0400 Subject: [PATCH 0298/1195] Address suggestions, remove duplicate file. --- text/0000-contingency-plan.md | 425 ---------------------------------- text/0000-language-semver.md | 81 +++++-- 2 files changed, 63 insertions(+), 443 deletions(-) delete mode 100644 text/0000-contingency-plan.md diff --git a/text/0000-contingency-plan.md b/text/0000-contingency-plan.md deleted file mode 100644 index fb90724c44a..00000000000 --- a/text/0000-contingency-plan.md +++ /dev/null @@ -1,425 +0,0 @@ -- Feature Name: N/A -- Start Date: 2015-05-07 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) - -# Summary - -This RFC has two main goals: - -- define what precisely constitutes a breaking change for the Rust language itself; -- define a language versioning mechanism that extends the sorts of - changes we can make without causing compilation failures (for - example, adding new keywords). - -# Motivation - -With the release of 1.0, we need to establish clear policy on what -precisely constitutes a "minor" vs "major" change to the Rust language -itself (as opposed to libraries, which are covered by [RFC 1105]). -**This RFC proposes limiting breaking changes to changes with -soundness implications**: this includes both bug fixes in the compiler -itself, as well as changes to the type system or RFCs that are -necessary to close flaws uncovered later. - -However, simply landing all breaking changes immediately could be very -disruptive to the ecosystem. 
Therefore, **the RFC also proposes -specific measures to mitigate the impact of breaking changes**, and -some criteria when those measures might be appropriate. - -Furthermore, there are other kinds of changes that we may want to make -which feel like they *ought* to be possible, but which are in fact -breaking changes. The simplest example is adding a new keyword to the -language -- despite being a purely additive change, a new keyword can -of course conflict with existing identifiers. Therefore, **the RFC -proposes a simple annotation that allows crates to designate the -version of the language they were written for**. This effectively -permits some amount of breaking changes by making them "opt-in" -through the version attribute. - -However, even though the version attribute can be used to make -breaking changes "opt-in" (and hence not really breaking), this is -still a tool to be used with great caution. Therefore, **the RFC also -proposes guidelines on when it is appropriate to include an "opt-in" -breaking change and when it is not**. - -This RFC is focused specifically on the question of what kinds of -changes we can make in a minor release (as well as some limited -mechanisms that lay the groundwork for certain kinds of anticipated -changes). It intentionally does not address the question of a release -schedule for Rust 2.0, nor does it propose any new features -itself. These topics are complex enough to be worth considering in -separate RFCs. - -# Detailed design - -The detailed design is broken into two major section: how to address -soundness changes, and how to address other, opt-in style changes. We -do not discuss non-breaking changes here, since obviously those are -safe. - -### Soundness changes - -When compiler bugs or soundness problems are encountered in the -language itself (as opposed to in a library), clearly they ought to be -fixed. However, it is important to fix them in such a way as to -minimize the impact on the ecosystem. 
- -The first step then is to evaluate the impact of the fix on the crates -found in the `crates.io` website (using e.g. the crater tool). If -impact is found to be "small" (which this RFC does not attempt to -precisely define), then the fix can simply be landed. As today, the -commit message of any breaking change should include the term -`[breaking-change]` along with a description of how to resolve the -problem, which helps those people who are affected to migrate their -code. A description of the problem should also appear in the relevant -subteam report. - -In cases where the impact seems larger, the following steps can be -taken to ease the transition: - -1. Identify important crates (such as those with many dependencies) - and work with the crate author to correct the code as quickly as - possible, ideally before the fix even lands. -2. Work hard to ensure that the error message identifies the problem - clearly and suggests the appropriate solution. -3. Provide an annotation that allows for a scoped "opt out" of the - newer rules, as described below. While the change is still - breaking, this at least makes it easy for crates to update and get - back to compiling status quickly. -4. Begin with a deprecation or other warning before issuing a hard - error. In extreme cases, it might be nice to begin by issuing a - deprecation warning for the unsound behavior, and only make the - behavior a hard error after the deprecation has had time to - circulate. This gives people more time to update their crates. - However, this option may frequently not be available, because the - source of a compilation error is often hard to pin down with - precision. - -Some of the factors that should be taken into consideration when -deciding whether and how to minimize the impact of a fix: - -- How many crates on `crates.io` are affected? - - This is a general proxy for the overall impact (since of course - there will always be private crates that are not part of - crates.io). 
-- Were particularly vital or widely used crates affected? - - This could indicate that the impact will be wider than the raw - number would suggest. -- Does the change silently change the result of running the program, - or simply cause additional compilation failures? - - The latter, while frustrating, are easier to diagnose. -- What changes are needed to get code compiling again? Are those - changes obvious from the error message? - - The more cryptic the error, the more frustrating it is when - compilation fails. - -#### Opting out - -In some cases, it may be useful to permit users to opt out of new type -rules. The intention is that this "opt out" is used as a temporary -crutch to make it easy to get the code up and running. Depending on -the severity of the soundness fix, the "opt out" may be permanently -available, or it could be removed in a later release. In either case, -use of the "opt out" API would trigger the deprecation lint. - -#### Changes that alter dynamic semantics versus typing rules - -In some cases, fixing a bug may not cause crates to stop compiling, -but rather will cause them to silently start doing something different -than they were doing before. In cases like these, the same principle -of using mitigation measures to lessen the impact (and ease the -transition) applies, but the precise strategy to be used will have to -be worked out on a more case-by-case basis. This is particularly -relevant to the underspecified areas of the language described in the -next section. - -Our approach to handling [dynamic drop][RFC 320] is a good -example. Because we expect that moving to the complete non-zeroing -dynamic drop semantics will break code, we've made an intermediate -change that -[altered the compiler to fill with use a non-zero value](https://github.com/rust-lang/rust/pull/23535), -which helps to expose code that was implicitly relying on the current -behavior (much of which has since been restructured in a more -future-proof way). 
- -#### Underspecified language semantics - -There are a number of areas where the precise language semantics are -currently somewhat underspecified. Over time, we expect to be fully -defining the semantics of all of these areas. This may cause some -existing code -- and in particular existing unsafe code -- to break or -become invalid. Changes of this nature should be treated as soundness -changes, meaning that we should attempt to mitigate the impact and -ease the transition wherever possible. - -Known areas where change is expected include the following: - -- Destructors semantics: - - We plan to stop zeroing data and instead use marker flags on the stack, - as specified in [RFC 320]. This may affect destructors that rely on ovewriting - memory or using the `unsafe_no_drop_flag` attribute. - - Currently, panicing in a destructor can cause unintentional memory - leaks and other poor behavior (see [#14875], [#16135]). We are - likely to make panic in a destructor simply abort, but the precise - mechanism is not yet decided. - - Order of dtor execution within a data structure is somewhat - inconsistent (see [#744]). -- The legal aliasing rules between unsafe pointers is not fully settled (see [#19733]). -- The interplay of assoc types and lifetimes is not fully settled and can lead - to unsoundness in some cases (see [#23442]). -- The trait selection algorithm is expected to be improved and made more complete over time. - It is possible that this will affect existing code. -- [Overflow semantics][RFC 560]: in particular, we may have missed some cases. -- Memory allocation in unsafe code is currently unstable. We expect to - be defining safe interfaces as part of the work on supporting - tracing garbage collectors (see [#415]). -- The treatment of hygiene in macros is uneven (see [#22462], [#24278]). In some cases, - changes here may be backwards compatible, or may be more appropriate only with explicit opt-in - (or perhaps an alternate macro system altogether). 
-- The layout of data structures is expected to change over time unless they are annotated - with a `#[repr(C)]` attribute. -- Lints will evolve over time (both the lints that are enabled and the - precise cases that lints catch). We expect to introduce a - [means to limit the effect of these changes on dependencies][#1029]. -- Stack overflow is currently detected via a segmented stack check - prologue and results in an abort. We expect to experiment with a - system based on guard pages in the future. -- We currently abort the process on OOM conditions (exceeding the heap space, overflowing - the stack). We may attempt to panic in such cases instead if possible. -- Some details of type inference may change. For example, we expect to - implement the fallback mechanism described in [RFC 213], and we may - wish to make minor changes to accommodate overloaded integer - literals. In some cases, type inferences changes may be better - handled via explicit opt-in. - -(Although it is not directly covered by this RFC, it's worth noting in -passing that some of the CLI flags to the compiler may change in the -future as well. The `-Z` flags are of course explicitly unstable, but -some of the `-C`, rustdoc, and linker-specific flags are expected to -evolve over time.) - -### Opt-in changes - -For breaking changes that are not related to soundness or language -semantics, but are still deemed desirable, an opt-in strategy can be -used instead. This section describes an attribute for opting in to -newer language updates, and gives guidelines on what kinds of changes -should or should not be introduced in this fashion. - -We use the term *"opt-in changes"* to refer to changes that would be -breaking changes, but are not because of the opt-in mechanism. - -#### Rust version attribute - -The specific proposal is an attribute `#![rust_version="X.Y"]` that -can be attached to the crate; the version `X.Y` in this attribute is -called the crate's "declared version". 
Every build of the Rust -compiler will also have a version number built into it reflecting the -current release. - -When a `#[rust_version="X.Y"]` attribute is encountered, the compiler -will endeavor to produce the semantics of Rust "as it was" during -version `X.Y`. RFCs that propose opt-in changes should discuss how the -older behavior can be supported in the compiler, but this is expected -to be straightforward: if supporting older behavior is hard to do, it -may indicate that the opt-in change is too complex and should not be -accepted. - -If the crate declares a version `X.Y` that is *newer* than the -compiler itself, the compiler should simply issue a warning and -proceed as if the crate had declared the compiler's version (i.e., the -newer version the compiler knows about). - -Note that if the changes introducing by the Rust version `X.Y` affect -parsing, implementing these semantics may require some limited amount -of feedback between the parser and the tokenizer, or else a limited -"pre-parse" to scan the set of crate attributes and extract the -version. For example, if version `X.Y` adds new keywords, the -tokenizer will likely need to be configured appropriately with the -proper set of keywords. For this reason, it may make sense to require -that the `#![rust_version]` attribute appear *first* on the crate. - -#### When opt-in changes are appropriate - -Opt-in changes allow us to greatly expand the scope of the kinds of -additions we can make without breaking existing code, but they are not -applicable in all situations. A good rule of thumb is that an opt-in -change is only appropriate if the exact effect of the older code can -be easily recreated in the newer system with only surface changes to -the syntax. - -Another view is that opt-in changes are appropriate if those changes -do not affect the "abstract AST" of your Rust program. 
In other words, -existing Rust syntax is just a serialization of a more idealized view -of the syntax, in which there are no conflicts between keywords and -identifiers, syntactic sugar is expanded, and so forth. Opt-in changes -might affect the translation into this abstract AST, but should not -affect the semantics of the AST itself at a deeper level. This concept -of an idealized AST is analagous to the "elaborated syntax" described -in [RFC 1105], except that it is at a conceptual level. - -So, for example, the conflict between new keywords and existing -identifiers can (generally) be trivially worked around by renaming -identifiers, though the question of public identifiers is an -interesting one (contextual keywords may suffice, or else perhaps some -kind of escaping syntax -- we defer this question here for a later -RFC). - -In the previous section on breaking changes, we identified various -criteria that can be used to decide how to approach a breaking change -(i.e., how far to go in attempting to mitigate the fallout). For the -most part, those same criteria also apply when deciding whether to -accept an "opt-in" change: - -- How many crates on `crates.io` would break if they "opted-in" to the - change, and would opting in require extensive changes? -- Does the change silently change the result of running the program, - or simply cause additional compilation failures? - - Opt-in changes that silently change the result of running the - program are particularly unlikely to be accepted. -- What changes are needed to get code compiling again? Are those - changes obvious from the error message? - -# Drawbacks - -**Allowing unsafe code to continue compiling -- even with warnings -- -raises the probability that people experiences crashes and other -undesirable effects while using Rust.** However, in practice, most -unsafety hazards are more theoretical than practical: consider the -problem with the `thread::scoped` API. 
To actually create a data-race, -one had to place the guard into an `Rc` cycle, which would be quite -unusual. Therefore, a compromise path that warns about bad content but -provides an option for gradual migration seems preferable. - -**Deprecation implies that a maintenance burden.** For library APIs, -this is relatively simple, but for type-system changes it can be quite -onerous. We may want to consider a policy for dropping older, -deprecated type-system rules after some time, as discussed in the -section on *unresolved questions*. - -## Notes on phasing - -# Alternatives - -**Rather than supporting opt-in changes, one might consider simply -issuing a new major release for every such change.** Put simply, -though, issuing a new major release just because we want to have a new -keyword feels like overkill. This seems like to have two potential -negative effects. It may simply cause us to not make some of the -changes we would make otherwise, or work harder to fit them within the -existing syntactic constraints. It may also serve to dilute the -meaning of issuing a new major version, since even additive changes -that do not affect existing code in any meaningful way would result in -a major release. One would then be tempted to have some *additional* -numbering scheme, PR blitz, or other means to notify people when a new -major version is coming that indicates deeper changes. - -**Rather than simply fixing soundness bugs, we could use the opt-in -mechanism to fix them conditionally.** This was initially considered -as an option, but eventually rejected for the following reasons: - -- This would effectively cause a deeper split between minor versions; - currently, opt-in is limited to "surface changes" only, but allowing - opt-in to affect the type system feels like it would be creating two - distinct languages. -- It seems likely that all users of Rust will want to know that their code - is sound and would not want to be working with unsafe constructs or bugs. 
-- Users may choose not to opt-in to newer versions because they do not - need the new features introduced there or because they wish to - preserve compatibility with older compilers. It would be sad for - them to lose the benefits of bug fixes as well. -- We already have several mitigation measures, such as opt-out or - temporary deprecation, that can be used to ease the transition - around a soundness fix. Moreover, separating out new type rules so - that they can be "opted into" can be very difficult and would - complicate the compiler internally; it would also make it harder to - reason about the type system as a whole. - -**Rather than using a version number to opt-in to minor changes, one -might consider using the existing feature mechanism.** For example, -one could write `#![feature(foo)]` to opt in to the feature "foo" and -its associated keywords and type rules, rather than -`#![rust_version="1.2.3"]`. While using minimum version numbers is -more opaque than named features, they do offer several advantages: - -1. Using a version number alone makes it easy to think about what - version of Rust you are using as a conceptual unit, rather than - choosing features "a la carte". -2. Using named features, the list of features that must be attached to - Rust code will grow indefinitely, presuming your crate wants to - stay up to date. -3. Using a version attribute preserves a mental separation between - "experimental work" (feature gates) and stable, new features. -4. Named features present a combinatoric testing problem, where we - should (in principle) test for all possible combinations of - features. - -# Unresolved questions - -**Can (and should) we give a more precise definition for compiler bugs -and soundness problems?** The current text is vague on what precisely -constitutes a compiler bug and soundness change. 
It may be worth -defining more precisely, though likely this would be best done as part -of writing up a more thorough (and authoritative) Rust reference -manual. - -**Should we add a mechanism for "escaping" keywords?"** We may need a -mechanism for escaping keywords in the future. Imagine you have a -public function named `foo`, and we add a keyword `foo`. Now, if you -opt in to the newer version of Rust, your function declaration is -illegal: but if you rename the function `foo`, you are making a -breaking change for your clients, which you may not wish to do. If we -had an escaping mechanism, you would probably still want to deprecate -`foo` in favor of a new function `bar` (since typing `foo` would be -awkward), but it could still exist. - -**Should we add a mechanism for skipping over new syntax?** The -current `#[cfg]` mechanism is applied *after* parsing. This implies -that if we add new syntax, crates which employ that new syntax will -not be parsable by older compilers, even if the modules that depend on -that new syntax are disabled via `#[cfg]` directives. It may be useful -to add some mechanism for informing the parser that it should skip -over sections of the input (presumably based on token trees). One -approach to this might just be modifying the existing `#[cfg]` -directives so that they are applied during parsing rather than as a -post-pass. - -**What precisely constitutes "small" impact?** This RFC does not -attempt to define when the impact of a patch is "small" or "not -small". We will have to develop guidelines over time based on -precedent. One of the big unknowns is how indicative the breakage we -observe on `crates.io` will be of the total breakage that will occur: -it is certainly possible that all crates on `crates.io` work fine, but -the change still breaks a large body of code we do not have access to. 
- -**Should deprecation due to unsoundness have a special lint?** We may -not want to use the same deprecation lint for unsoundness that we use -for everything else. - -**What attribute should we use to "opt out" of soundness changes?** -The section on breaking changes indicated that it may sometimes be -appropriate to includ an "opt out" that people can use to temporarily -revert to older, unsound type rules, but did not specify precisely -what that opt-out should look like. Ideally, we would identify a -specific attribute in advance that will be used for such purposes. In -the past, we have simply created ad-hoc attributes (e.g., -`#[old_orphan_check]`), but because custom attributes are forbidden by -stable Rust, this has the unfortunate side-effect of meaning that code -which opts out of the newer rules cannot be compiled on older -compilers (even though it's using the older type system rules). If we -introduce an attribute in advance we will not have this problem. - -[RFC 1105]: https://github.com/rust-lang/rfcs/pull/1105 -[RFC 320]: https://github.com/rust-lang/rfcs/pull/320 -[#774]: https://github.com/rust-lang/rfcs/issues/744 -[#14875]: https://github.com/rust-lang/rust/issues/14875 -[#16135]: https://github.com/rust-lang/rust/issues/16135 -[#19733]: https://github.com/rust-lang/rust/issues/19733 -[#23442]: https://github.com/rust-lang/rust/issues/23442 -[RFC 213]: https://github.com/rust-lang/rfcs/pull/213 -[#415]: https://github.com/rust-lang/rfcs/issues/415 -[#22462]: https://github.com/rust-lang/rust/issues/22462#issuecomment-81756673 -[#24278]: https://github.com/rust-lang/rust/issues/24278 -[#1029]: https://github.com/rust-lang/rfcs/issues/1029 -[RFC 560]: https://github.com/rust-lang/rfcs/pull/560 diff --git a/text/0000-language-semver.md b/text/0000-language-semver.md index ef42b11377a..0c976ad2377 100644 --- a/text/0000-language-semver.md +++ b/text/0000-language-semver.md @@ -17,10 +17,10 @@ This RFC has two main goals: With the release of 1.0, we 
need to establish clear policy on what precisely constitutes a "minor" vs "major" change to the Rust language itself (as opposed to libraries, which are covered by [RFC 1105]). -**This RFC proposes limiting breaking changes to changes with -soundness implications**: this includes both bug fixes in the compiler -itself, as well as changes to the type system or RFCs that are -necessary to close flaws uncovered later. +**This RFC proposes that breaking changes are only permitted within a +minor release if they are fix to restore soundness**: this includes +both bug fixes in the compiler itself, as well as changes to the type +system or RFCs that are necessary to close flaws uncovered later. However, simply landing all breaking changes immediately could be very disruptive to the ecosystem. Therefore, **the RFC also proposes @@ -60,8 +60,8 @@ safe. ### Soundness changes -When compiler bugs or soundness problems are encountered in the -language itself (as opposed to in a library), clearly they ought to be +When compiler or type-system bugs are encountered in the language +itself (as opposed to in a library), clearly they ought to be fixed. However, it is important to fix them in such a way as to minimize the impact on the ecosystem. @@ -75,14 +75,18 @@ problem, which helps those people who are affected to migrate their code. A description of the problem should also appear in the relevant subteam report. -In cases where the impact seems larger, the following steps can be -taken to ease the transition: +In cases where the impact seems larger, any effort to ease the +transition is sure to be welcome. The following are suggestions for +possible steps we could take (not all of which will be applicable to +all scenarios): -1. Identify important crates (such as those with many dependencies) +1. Identify important crates (such as those with many dependants) and work with the crate author to correct the code as quickly as possible, ideally before the fix even lands. 2. 
Work hard to ensure that the error message identifies the problem clearly and suggests the appropriate solution. + - If we develop a rustfix tool, in some cases we may be able to + extend that tool to perform the fix automatically. 3. Provide an annotation that allows for a scoped "opt out" of the newer rules, as described below. While the change is still breaking, this at least makes it easy for crates to update and get @@ -94,11 +98,18 @@ taken to ease the transition: circulate. This gives people more time to update their crates. However, this option may frequently not be available, because the source of a compilation error is often hard to pin down with - precision. + precision. Some of the factors that should be taken into consideration when deciding whether and how to minimize the impact of a fix: +- How important is the change? + - Soundness holes that can be easily exploited or which impact + running code are obviously much more concerning than minor corner + cases. There is somewhat in tension with the other factors: if + there is, for example, a widely deployed vulnerability, fixing + that vulnerability is important, but it will also cause a larger + disruption. - How many crates on `crates.io` are affected? - This is a general proxy for the overall impact (since of course there will always be private crates that are not part of @@ -114,14 +125,43 @@ deciding whether and how to minimize the impact of a fix: - The more cryptic the error, the more frustrating it is when compilation fails. +#### What is a "compiler bug" or "soundness change"? + +In the absence of a formal spec, it is hard to define precisely what +constitutes a "compiler bug" or "soundness change" (see also the +section below on underspecified parts of the language). The obvious +cases are soundness violations in a rather strict sense: + +- Cases where the user is able to produce Undefined Behavior (UB) + purely from safe code. 
+- Cases where the user is able to produce UB using standard library + APIs or other unsafe code that "should work". + +However, there are other kinds of type-system inconsistencies that +might be worth fixing, even if they cannot lead directly to UB. Bugs +in the coherence system that permit uncontrolled overlap between impls +are one example. Another example might be inference failures that +cause code to compile which should not (because ambiguities +exist). Finally, there is a list below of areas of the language which +are generally considered underspecified. + +We expect that there will be cases that fall on a grey line betwen bug +and expected behavior, and discussion will be needed to determine +where it falls. The recent conflict between `Rc` and scoped threads is +an example of such a discusison: it was clear that both APIs could not +be legal, but not clear which one was at fault. The results of these +discussions will feed into the Rust spec as it is developed. + #### Opting out In some cases, it may be useful to permit users to opt out of new type rules. The intention is that this "opt out" is used as a temporary -crutch to make it easy to get the code up and running. Depending on -the severity of the soundness fix, the "opt out" may be permanently -available, or it could be removed in a later release. In either case, -use of the "opt out" API would trigger the deprecation lint. +crutch to make it easy to get the code up and running. Typically this +opt out will thus be removed in a later release. But in some cases, +particularly those cases where the severity of the problem is +relatively small, it could be an option to leave the "opt out" +mechanism in place permanently. In either case, use of the "opt out" +API would trigger the deprecation lint. 
#### Changes that alter dynamic semantics versus typing rules @@ -177,8 +217,6 @@ Known areas where change is expected include the following: - The treatment of hygiene in macros is uneven (see [#22462], [#24278]). In some cases, changes here may be backwards compatible, or may be more appropriate only with explicit opt-in (or perhaps an alternate macro system altogether). -- The layout of data structures is expected to change over time unless they are annotated - with a `#[repr(C)]` attribute. - Lints will evolve over time (both the lints that are enabled and the precise cases that lints catch). We expect to introduce a [means to limit the effect of these changes on dependencies][#1029]. @@ -193,11 +231,18 @@ Known areas where change is expected include the following: literals. In some cases, type inferences changes may be better handled via explicit opt-in. -(Although it is not directly covered by this RFC, it's worth noting in +There are other kinds of changes that can be made in a minor version +that may break unsafe code but which are not considered breaking +changes, because the unsafe code is relying on things known to be +intentionally unspecified. One obvious example is the layout of data +structures, which is considered undefined unless they have a +`#[repr(C)]` attribute. + +Although it is not directly covered by this RFC, it's worth noting in passing that some of the CLI flags to the compiler may change in the future as well. The `-Z` flags are of course explicitly unstable, but some of the `-C`, rustdoc, and linker-specific flags are expected to -evolve over time.) +evolve over time. 
### Opt-in changes From cc19a1b219d3518b86f8446f56dc26da0c1f1e2a Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Thu, 21 May 2015 15:15:44 -0400 Subject: [PATCH 0299/1195] Switch to use `--rust-version` instead of `#[rust_version]` --- text/0000-language-semver.md | 82 +++++++++++++++++++++++------------- 1 file changed, 53 insertions(+), 29 deletions(-) diff --git a/text/0000-language-semver.md b/text/0000-language-semver.md index 0c976ad2377..5aa25b1ce7f 100644 --- a/text/0000-language-semver.md +++ b/text/0000-language-semver.md @@ -255,35 +255,51 @@ should or should not be introduced in this fashion. We use the term *"opt-in changes"* to refer to changes that would be breaking changes, but are not because of the opt-in mechanism. -#### Rust version attribute - -The specific proposal is an attribute `#![rust_version="X.Y"]` that -can be attached to the crate; the version `X.Y` in this attribute is -called the crate's "declared version". Every build of the Rust -compiler will also have a version number built into it reflecting the -current release. - -When a `#[rust_version="X.Y"]` attribute is encountered, the compiler -will endeavor to produce the semantics of Rust "as it was" during -version `X.Y`. RFCs that propose opt-in changes should discuss how the -older behavior can be supported in the compiler, but this is expected -to be straightforward: if supporting older behavior is hard to do, it -may indicate that the opt-in change is too complex and should not be -accepted. - -If the crate declares a version `X.Y` that is *newer* than the -compiler itself, the compiler should simply issue a warning and -proceed as if the crate had declared the compiler's version (i.e., the -newer version the compiler knows about). 
- -Note that if the changes introducing by the Rust version `X.Y` affect -parsing, implementing these semantics may require some limited amount -of feedback between the parser and the tokenizer, or else a limited -"pre-parse" to scan the set of crate attributes and extract the -version. For example, if version `X.Y` adds new keywords, the -tokenizer will likely need to be configured appropriately with the -proper set of keywords. For this reason, it may make sense to require -that the `#![rust_version]` attribute appear *first* on the crate. +#### Rust version option + +The specific proposal is to introduce a command-line option +`--rust-version=X.Y[.Z]` that instructs the Rust compiler to expect +source code from older versions of Rust. This option could also be +specified in a `Cargo.toml` file in a `rust-version` property. The +version applies to the crate currently being compiled and is called +the crate's "supplied version". Every build of the Rust compiler will +also have a version number built into it reflecting the current +release; if the command-line option is not supplied, the compiler +defaults to this builtin version. + +The supplied version is used by the compiler to produce the semantics +of Rust "as it was" during version `X.Y`. RFCs that propose opt-in +changes should discuss how the older behavior can be supported in the +compiler, but this is expected to be straightforward: if supporting +older behavior is hard to do, it may indicate that the opt-in change +is too complex and should not be accepted. + +Note that the supplied version may affect the parser configuration +used when parsing the initial crate, since it can affect the keywords +recognized by the tokenizer and perhaps other minor details in the +syntax. However, because the version is supplied on the command line, +this configuration is known before parsing begins. 
+ +#### Defaults and extreme cases + +If no version is supplied on the `rustc` command line, `rustc` will +default to the maximal version it recognizes. If the user supplies a +version `X.Y` that is *newer* than the compiler itself, the compiler +should simply issue a warning and proceed as if the user had supplied +the compiler's version (i.e., the newest version the compiler knows +about). + +Cargo will always invoke `rustc` with a supplied version. If there is +no version in the `Cargo.toml` file, then `1.0.0` is assumed. Whenever +a new project is created with `cargo new`, the new `Cargo.toml` will +include the most recent Rust version number by default. + +Note that the defaults for `rustc` and `cargo` differ. `rustc` prefers +the most recent verison of Rust by default, whereas `cargo` prefers +the oldest. The reason is that we expect running `rustc` in a +standalone fashion to be used primarily when experimenting with small +scripts and one-offs, and the user is most likely to want "current +Rust" in that scenario. #### When opt-in changes are appropriate @@ -347,6 +363,14 @@ section on *unresolved questions*. # Alternatives +**Use an attribute rather than command-line option.** Earlier versions +of this RFC used a `#[rust_version]` attribute to specify the Rust +version rather than a command-line parameter. This was changed to use +a command-line parameter because it (a) exposes the version int he +Cargo metadata, (b) is analogous to the approach used by most other +languages, and (c) simplifies the implementation, since the parser +does not need to be reconfigured midparse. 
+ **Rather than supporting opt-in changes, one might consider simply issuing a new major release for every such change.** Put simply, though, issuing a new major release just because we want to have a new From 23fa07e5a690c2776d3156271e5511c3042b29f2 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 22 May 2015 10:31:19 -0400 Subject: [PATCH 0300/1195] Adjust text to fix poor wording, typos, and add a few details. --- text/0000-language-semver.md | 52 ++++++++++++++++++++++++------------ 1 file changed, 35 insertions(+), 17 deletions(-) diff --git a/text/0000-language-semver.md b/text/0000-language-semver.md index 5aa25b1ce7f..fe0e5d7e402 100644 --- a/text/0000-language-semver.md +++ b/text/0000-language-semver.md @@ -17,10 +17,12 @@ This RFC has two main goals: With the release of 1.0, we need to establish clear policy on what precisely constitutes a "minor" vs "major" change to the Rust language itself (as opposed to libraries, which are covered by [RFC 1105]). -**This RFC proposes that breaking changes are only permitted within a -minor release if they are fix to restore soundness**: this includes -both bug fixes in the compiler itself, as well as changes to the type -system or RFCs that are necessary to close flaws uncovered later. +**This RFC proposes that minor releases may only contain breaking +changes that fix compiler bugs or other type-system +issues**. Primarily, this means soundness issues where "innocent" code +can cause undefined behavior (in the technical sense), but it also +covers cases like compiler bugs and tightening up the semantics of +"underspecified" parts of the language (more details below). However, simply landing all breaking changes immediately could be very disruptive to the ecosystem. Therefore, **the RFC also proposes @@ -53,7 +55,7 @@ considering in separate RFCs. 
# Detailed design -The detailed design is broken into two major section: how to address +The detailed design is broken into two major sections: how to address soundness changes, and how to address other, opt-in style changes. We do not discuss non-breaking changes here, since obviously those are safe. @@ -145,8 +147,8 @@ cause code to compile which should not (because ambiguities exist). Finally, there is a list below of areas of the language which are generally considered underspecified. -We expect that there will be cases that fall on a grey line betwen bug -and expected behavior, and discussion will be needed to determine +We expect that there will be cases that fall on a grey line between +bug and expected behavior, and discussion will be needed to determine where it falls. The recent conflict between `Rc` and scoped threads is an example of such a discusison: it was clear that both APIs could not be legal, but not clear which one was at fault. The results of these @@ -214,9 +216,10 @@ Known areas where change is expected include the following: - Memory allocation in unsafe code is currently unstable. We expect to be defining safe interfaces as part of the work on supporting tracing garbage collectors (see [#415]). -- The treatment of hygiene in macros is uneven (see [#22462], [#24278]). In some cases, - changes here may be backwards compatible, or may be more appropriate only with explicit opt-in - (or perhaps an alternate macro system altogether). +- The treatment of hygiene in macros is uneven (see [#22462], + [#24278]). In some cases, changes here may be backwards compatible, + or may be more appropriate only with explicit opt-in (or perhaps an + alternate macro system altogether, such as [this proposal][macro]). - Lints will evolve over time (both the lints that are enabled and the precise cases that lints catch). We expect to introduce a [means to limit the effect of these changes on dependencies][#1029]. 
@@ -242,7 +245,7 @@ Although it is not directly covered by this RFC, it's worth noting in passing that some of the CLI flags to the compiler may change in the future as well. The `-Z` flags are of course explicitly unstable, but some of the `-C`, rustdoc, and linker-specific flags are expected to -evolve over time. +evolve over time (see e.g. [#24451]). ### Opt-in changes @@ -271,8 +274,8 @@ The supplied version is used by the compiler to produce the semantics of Rust "as it was" during version `X.Y`. RFCs that propose opt-in changes should discuss how the older behavior can be supported in the compiler, but this is expected to be straightforward: if supporting -older behavior is hard to do, it may indicate that the opt-in change -is too complex and should not be accepted. +older behavior is hard to do, this may be an indication that the +opt-in change is too complex and should not be accepted. Note that the supplied version may affect the parser configuration used when parsing the initial crate, since it can affect the keywords @@ -290,9 +293,16 @@ the compiler's version (i.e., the newest version the compiler knows about). Cargo will always invoke `rustc` with a supplied version. If there is -no version in the `Cargo.toml` file, then `1.0.0` is assumed. Whenever -a new project is created with `cargo new`, the new `Cargo.toml` will -include the most recent Rust version number by default. +no version in the `Cargo.toml` file, then `1.0.0` is assumed. (It may +be a good idea to issue a warning in this case as well.) + +Whenever a new project is created with `cargo new`, the new +`Cargo.toml` will include the most recent Rust version number by +default. (Since Cargo and rustc are not, at least today, necessarily +released on the same schedule, we'll have to pick some sensible +definition of the "most recent" Rust version number; one option is to +query the `rustc` executable in scope. Another is to synchronize the +release schedules and use the "built-in" notion.) 
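The defaulting rules described in this section can be sketched as follows. This is only an illustrative model, not code from `rustc` or Cargo: the `Version` type and the functions `rustc_effective_version` and `cargo_supplied_version` are hypothetical names introduced here, and the sketch assumes versions compare lexicographically by major, minor, and patch component.

```rust
/// Hypothetical `major.minor.patch` version. The field order makes the
/// derived ordering compare major first, then minor, then patch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u32,
    minor: u32,
    patch: u32,
}

impl Version {
    const fn new(major: u32, minor: u32, patch: u32) -> Self {
        Version { major, minor, patch }
    }
}

/// Hypothetical model of how `rustc` treats `--rust-version`.
/// Returns the version whose semantics are used, plus a flag saying
/// whether a "supplied version is newer than the compiler" warning
/// should be issued.
fn rustc_effective_version(supplied: Option<Version>, builtin: Version) -> (Version, bool) {
    match supplied {
        // No option given: default to the newest version the compiler knows.
        None => (builtin, false),
        // Supplied version is newer than the compiler: warn, use the builtin.
        Some(v) if v > builtin => (builtin, true),
        // Otherwise emulate the requested (older or equal) version.
        Some(v) => (v, false),
    }
}

/// Hypothetical model of the version Cargo passes to `rustc`:
/// a missing entry in `Cargo.toml` is treated as `1.0.0`.
fn cargo_supplied_version(manifest: Option<Version>) -> Version {
    manifest.unwrap_or(Version::new(1, 0, 0))
}
```

Under this model, a crate built through Cargo without a declared version is compiled as `1.0.0`, while the same source passed directly to `rustc` is compiled with the compiler's built-in version.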
Note that the defaults for `rustc` and `cargo` differ. `rustc` prefers the most recent verison of Rust by default, whereas `cargo` prefers @@ -342,6 +352,12 @@ accept an "opt-in" change: - What changes are needed to get code compiling again? Are those changes obvious from the error message? +Another important criterion is the implementation complexity. In +particular, how easy will it be to maintain both the older behavior +and the newer behavior? It is important to consider not just the +complexity today, but possible complexity in the future as the +compiler changes. + # Drawbacks **Allowing unsafe code to continue compiling -- even with warnings -- @@ -481,7 +497,7 @@ introduce an attribute in advance we will not have this problem. [RFC 1105]: https://github.com/rust-lang/rfcs/pull/1105 [RFC 320]: https://github.com/rust-lang/rfcs/pull/320 -[#774]: https://github.com/rust-lang/rfcs/issues/744 +[#744]: https://github.com/rust-lang/rfcs/issues/744 [#14875]: https://github.com/rust-lang/rust/issues/14875 [#16135]: https://github.com/rust-lang/rust/issues/16135 [#19733]: https://github.com/rust-lang/rust/issues/19733 @@ -492,3 +508,5 @@ introduce an attribute in advance we will not have this problem. 
[#24278]: https://github.com/rust-lang/rust/issues/24278 [#1029]: https://github.com/rust-lang/rfcs/issues/1029 [RFC 560]: https://github.com/rust-lang/rfcs/pull/560 +[macro]: https://internals.rust-lang.org/t/pre-rfc-macro-improvements/2088 +[#24451]: https://github.com/rust-lang/rust/pull/24451 From e84d6544a5558a0328bfd07298fa262c3f6e6863 Mon Sep 17 00:00:00 2001 From: snorr Date: Wed, 27 May 2015 17:48:21 +0200 Subject: [PATCH 0301/1195] Fix typos and update syntax in RFC 164 --- text/0164-feature-gate-slice-pats.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/text/0164-feature-gate-slice-pats.md b/text/0164-feature-gate-slice-pats.md index b3e9130b63b..4712e3aa342 100644 --- a/text/0164-feature-gate-slice-pats.md +++ b/text/0164-feature-gate-slice-pats.md @@ -4,18 +4,18 @@ # Summary -Rust's support for pattern matching on sices has grown steadily and incrementally without a lot of oversight, -and we have concern that Rust is doing too much here, that the complexity is not worth it. This RFC proposes -to feature gate multiple-element slice matches in the head and middle positions (`[..xs, 0, 0]` and `[0, ..xs, 0]`. +Rust's support for pattern matching on slices has grown steadily and incrementally without a lot of oversight. +We have concern that Rust is doing too much here, and that the complexity is not worth it. This RFC proposes +to feature gate multiple-element slice matches in the head and middle positions (`[xs.., 0, 0]` and `[0, xs.., 0]`). # Motivation -Some general reasons and one specific: first, the implementation of Rust's match machinery is notoriously complex, and not well-loved. Remove features is seen as a valid way to reduce complexity. Second, slice matching in particular, is difficult to implement, while also being of only moderate utility (there are many types of collections - slices just happen to be built into the language). 
Finally, the exhaustiveness check is not correct for slice patterns - because of their complexity; it's not known that it -can be done correctly, nor whether it is worth the effort even if. +Some general reasons and one specific: first, the implementation of Rust's match machinery is notoriously complex, and not well-loved. Removing features is seen as a valid way to reduce complexity. Second, slice matching in particular, is difficult to implement, while also being of only moderate utility (there are many types of collections - slices just happen to be built into the language). Finally, the exhaustiveness check is not correct for slice patterns because of their complexity; it's not known if it +can be done correctly, nor whether it is worth the effort to do so. # Detailed design -The `advanced_slice_patterns` feature gate will be added. When the compiler encounters slice pattern matches in head or middle position it will emit a warning or error accourding to the current settings. +The `advanced_slice_patterns` feature gate will be added. When the compiler encounters slice pattern matches in head or middle position it will emit a warning or error according to the current settings. 
# Drawbacks From c16c3327c35c0d6f5ebe63b96ff2ecf5205ca797 Mon Sep 17 00:00:00 2001 From: Ariel Ben-Yehuda Date: Wed, 27 May 2015 20:10:23 +0300 Subject: [PATCH 0302/1195] Implement raw pointer comparisons --- text/0000-raw-pointer-comparisons.md | 58 ++++++++++++++++++++++++++++ 1 file changed, 58 insertions(+) create mode 100644 text/0000-raw-pointer-comparisons.md diff --git a/text/0000-raw-pointer-comparisons.md b/text/0000-raw-pointer-comparisons.md new file mode 100644 index 00000000000..5287c239951 --- /dev/null +++ b/text/0000-raw-pointer-comparisons.md @@ -0,0 +1,58 @@ +- Feature Name: raw-pointer-comparisons +- Start Date: 2015-05-27 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Allow equality, but not order, comparisons between fat raw pointers +of the same type. + +# Motivation + +Currently, fat raw pointers can't be compared via either PartialEq or +PartialOrd (currently this causes an ICE). It seems to me that a primitive +type like a fat raw pointer should implement equality in some way. + +However, there doesn't seem to be a sensible way to order raw fat pointers +unless we take vtable addresses into account, which is relatively weird. + +# Detailed design + +Implement PartialEq/Eq for fat raw pointers, defined as comparing both the +unsize-info and the address. This means that these are true: + +```Rust + &s as &fmt::Debug as *const _ == &s as &fmt::Debug as *const _ // of course + &s.first_field as &fmt::Debug as *const _ + != &s as &fmt::Debug as *const _ // these are *different* (one + // prints only the first field, + // the other prints all fields). +``` + +But +```Rust + &s.first_field as &fmt::Debug as *const _ as *const () == + &s as &fmt::Debug as *const _ as *const () // addresses are equal +``` + +# Drawbacks + +Order comparisons may be useful for putting fat raw pointers into +ordering-based data structures (e.g. BinaryTree). 
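The distinction drawn in the detailed design, between comparing a whole fat pointer and comparing only its address, can be sketched as below. This uses present-day syntax (`dyn Debug`) rather than the syntax of the RFC, and the struct `S` and both helper functions are hypothetical. `#[repr(C)]` is added as an assumption so that `first_field` is guaranteed to sit at offset zero. The lint name in the `allow` attribute exists only in recent compilers, where comparing trait-object pointers with `==` warns because vtable addresses are not guaranteed to be unique.

```rust
use std::fmt::Debug;

/// Hypothetical struct; `repr(C)` pins `first_field` to offset 0.
#[repr(C)]
#[derive(Debug)]
struct S {
    first_field: u32,
    second_field: u32,
}

/// Address-only comparison: casting to a thin pointer discards the
/// unsize info (here, the vtable pointer).
fn same_address(a: *const dyn Debug, b: *const dyn Debug) -> bool {
    a as *const () == b as *const ()
}

/// Full comparison: both the address and the unsize info must match.
#[allow(ambiguous_wide_pointer_comparisons)]
fn same_fat_pointer(a: *const dyn Debug, b: *const dyn Debug) -> bool {
    a == b
}
```

With `let s = S { first_field: 1, second_field: 2 };`, a pointer to `s` and a pointer to `s.first_field`, both coerced to `*const dyn Debug`, share an address but carry different vtables, so only `same_address` reports them as equal.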
+ +# Alternatives + +@nrc suggested to implement heterogeneous comparisons between all thin +raw pointers and all fat raw pointers. I don't like this because equality +between fat raw pointers of different traits is false most of the +time (unless one of the traits is a supertrait of the other and/or the +only difference is in free lifetimes), and anyway you can always compare +by casting both pointers to a common type. + +It is also possible to implement ordering too, either in unsize -> addr +lexicographic order or addr -> unsize lexicographic order. + +# Unresolved questions + +See Alternatives. \ No newline at end of file From 992550b1ac1869c3ef0591ac64c55ca961656f2c Mon Sep 17 00:00:00 2001 From: Sean Patrick Santos Date: Wed, 27 May 2015 18:38:16 -0600 Subject: [PATCH 0303/1195] Shrink the scope of the associated const changes to RFC 195, and mention trait objects. --- text/0195-associated-items.md | 167 +++++----------------------------- 1 file changed, 22 insertions(+), 145 deletions(-) diff --git a/text/0195-associated-items.md b/text/0195-associated-items.md index 0a76c63d5fc..c17018b580e 100644 --- a/text/0195-associated-items.md +++ b/text/0195-associated-items.md @@ -882,6 +882,7 @@ trait Foo { type Output1; type Output2; lifetime 'a; + const C: bool; ... } ``` @@ -894,6 +895,7 @@ T: Foo T: Foo T: Foo T: Foo>(t: T) // this is valid fn consume_obj(t: Box>) // this is NOT valid // but this IS valid: -fn consume_obj(t: Box::N] = [0u8; ::N + ::N]; - let x: [u8; ::N + 1] = [0u8; 1 + ::N]; - // Still not allowed. - let x: [u8; ::N + 1] = [0u8; ::N + 1]; - // Workaround for the expression above. - const N_PLUS_1: usize = ::N + 1; - let x: [u8; N_PLUS_1] = [0u8; N_PLUS_1]; - // Neither of the following are allowed. 
- const ALIAS_N_PLUS_1: usize = N_PLUS_1; - let x: [u8; N_PLUS_1] = [0u8; ALIAS_N_PLUS_1]; - const ALIAS_N: usize = ::N; - let x: [u8; ::N] = [0u8; ALIAS_N]; - ``` +If the value of an associated const depends on a type parameter (including +`Self`), it cannot be used in a constant expression. This restriction will +almost certainly be lifted in the future, but this raises questions outside the +scope of this RFC. # Staging @@ -1471,97 +1414,31 @@ on implementation concerns, which are not yet clear. ## Generic associated consts in match patterns It seems desirable to allow constants that depend on type parameters in match -patterns, but it's not clear how to do so. - -Looking at the `HasVar` example above, one possibility would be to simply treat -the first, forbidden match expression as syntactic sugar for the second, allowed -match expression that uses a pattern guard. This is simple to implement because -one can simply ignore the constant when performing exhaustiveness and -reachability checks. Unfortunately, this approach blurs the difference between -match patterns (which provide strict checks) and pattern guards (which are just -useful syntactic sugar), and it does not increase the expressiveness of the -language. - -An alternative would be to allow `where` clauses to place constraints on -associated consts. If an associated const is known to be equal/unequal to some -other value (or in the case of integers, inside/outside a given range), this can -inform exhaustiveness and reachability checks. But this requires more design and -implementation work, and more syntax. +patterns, but it's not clear how to do so while still checking exhaustiveness +and reachability of the match arms. Most likely this requires new forms of +where clause, to constrain associated constant values. For now, we simply defer the question. ## Generic associated consts in array sizes -The above solution for type-checking array sizes is somewhat unsatisfactory. 
In -particular, it is counter-intuitive that neither of the following will type -check: +It would be useful to be able to use trait-associated constants in generic code. ```rust // Shouldn't this be OK? const ALIAS_N: usize = ::N; let x: [u8; ::N] = [0u8; ALIAS_N]; -// This is likely to yield an embarrassing error message such as: -// "couldn't prove that `::N + 1` is equal to `::N + 1`" -let x: [u8; ::N + 1] = [0u8; ::N + 1]; -``` - -A function like this is especially affected: - -```rust -trait HasN { - const N: usize; -} -fn foo() -> [u8; ::N + 1] { - // Can't be verified to be correct for the return type, and can't use the - // intermediate const workaround due to scoping issues. - [0u8; ::N + 1] -} +// Or... +let x: [u8; T::N + 1] = [0u8; T::N + 1]; ``` -This can be worked around with type-level naturals that use associated consts to -produce array sizes, but this is syntactically a bit inelegant. +However, this causes some problems. What should we do with the following case in +type checking, where we need to prove that a generic is valid for any `T`? ```rust -// Assume that `TypeAdd` and `One` are from a type-level naturals or similar -// library, and that `NAsTypeNatN` provides some way of translating the `N` -// on a `HasN` to a type compatible with that library. -trait HasN { - const N: usize; - type TypeNatN; -} -fn foo() -> [u8; TypeAdd<::TypeNatN, One>::AsUsize] { - // Because the type `TypeAdd<::TypeNatN, One>` can be verified to be - // equal to itself in type checking, we know that the associated const - // `AsUsize` below must be the same item as the `AsUsize` mentioned in the - // return type above. - [0u8; TypeAdd<::NAsTypeNat, One>::AsUsize] -} +let x: [u8; T::N + T::N] = [0u8; 2 * T::N]; ``` -There are a variety of possible ways to address the above issues, including: - - - Implementing smarter handling of consts that are just aliases of other - constant items. 
- - Allowing `where` clauses to constrain some associated constants to be equal, - to other expressions, and using this information in type checking. - - Adding normalization with little or no awareness of arithmetic (e.g. allowing - expressions that are exactly the same to be considered equal, or using only - a very basic understanding of which operations are commutative and/or - associative). - - Adding new syntax and/or new capability to plugins to allow type-level - naturals to be used with more ergonomic and clear syntax. - - Implementing a dependent type system that provides built-in semantics for - integer arithmetic at the type level, rather than implementing this in an - external or standard library. - - Using a full-fledged SMT solver. - - Some other creative solutions not on this list. - -While there are many ways to improve on the current design, and many of these -approaches are not mutually exclusive, much more work is needed to investigate -and implement a self-consistent, effective, and ideally intuitive set of -solutions. - -Though admittedly not very satisfying at the moment, the current approach has -the advantage of being (arguably) a good minimalist design, allowing associated -consts to be used for array sizes in generic code now, but also allowing for any -of a number of improved systems to be implemented later. +We would like to handle at least some obvious cases (e.g. proving that +`T::N == T::N`), but without trying to prove arbitrary statements about +arithmetic. The question of how to do this is deferred. From 31d230b3f1f40797405589f107983c2052792807 Mon Sep 17 00:00:00 2001 From: James Miller Date: Thu, 28 May 2015 12:52:24 +1200 Subject: [PATCH 0304/1195] Re-write text to propose likely/unlikely instead of expect It's clear that `likely`/`unlikely` is the more popular option, so make that the main focus. Also expands on the guarantees the intrinsics provide (i.e. 
none) and mentions the prevalance of `LIKELY`/`UNLIKELY` macros in many C/C++ projects. --- text/0000-expect-intrinsic.md | 31 ++++++++++++++++++++++--------- 1 file changed, 22 insertions(+), 9 deletions(-) diff --git a/text/0000-expect-intrinsic.md b/text/0000-expect-intrinsic.md index 078aba7ffa8..8162ef161b0 100644 --- a/text/0000-expect-intrinsic.md +++ b/text/0000-expect-intrinsic.md @@ -5,7 +5,7 @@ # Summary -Provide an intrinsic function for hinting the likelyhood of branches being taken. +Provide a pair of intrinsic functions for hinting the likelyhood of branches being taken. # Motivation @@ -22,21 +22,34 @@ quintillion cases. # Detailed design -Implement an `expect` intrinsic with the signature: `fn(bool, bool) -> bool`. The first argument is -the condition being tested, the second argument is the expected result. The return value is the -same as the first argument, meaning that `if foo == bar { .. }` can be simply replaced with -`if expect(foo == bar, false) { .. }`. +Implement a pair of intrinsics `likely` and `unlikely`, both with signature `fn(bool) -> bool` +which hint at the probability of the passed value being true or false. Specifically, `likely` hints +to the compiler that the passed value is likely to be true, and `unlikely` hints that it is likely +to be false. Both functions simply return the value they are passed. -The expected value is required to be a constant value. +The primary reason for this design is that it reflects common usage of this general feature in many +C and C++ projects, most of which define simple `LIKELY` and `UNLIKELY` macros around the gcc +`__builtin_expect` intrinsic. It also provides the most flexibility, allowing branches on any +condition to be hinted at, even if the process that produced the branched-upon value is +complex. For why an equivalent to `__builtin_expect` is not being exposed, see the Alternatives +section. + +There are no observable changes in behaviour from use of these intrinsics. 
It is valid to implement +these intrinsics simply as the identity function. Though it is expected that the intrinsics provide +information to the optimizer, that information is not guaranteed to change the decisions the +optimiser makes. # Drawbacks -The second argument is required to be a constant value, which can't be easily expressed. +The intrinsics cannot be used to hint at arms in `match` expressions. However, given that hints +would need to be variants, a simple intrinsic would not be sufficient for those purposes. # Alternatives -Provide a pair of intrinsics `likely` and `unlikely`, these are the same as `expect` just with -`true` and `false` substituted in for the expected value, respectively. +Expose an `expect` intrinsic. This is what gcc/clang does with `__builtin_expect`. However there is +a restriction that the second argument be a constant value, a requirement that is not easily +expressible in Rust code. The split into `likely` and `unlikely` intrinsics reflects the strategy +we have used for similar restrictions like the ordering constraint of the atomic intrinsics. # Unresolved questions From df1efffef08d9ac593b3ff1c19a48f83d25cbe9f Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Thu, 28 May 2015 17:08:38 -0400 Subject: [PATCH 0305/1195] Pare back the RFC to just the minimum: guidelines on making breaking changes. --- text/0000-language-semver.md | 277 +++-------------------------------- 1 file changed, 24 insertions(+), 253 deletions(-) diff --git a/text/0000-language-semver.md b/text/0000-language-semver.md index fe0e5d7e402..e2ddcf10e8d 100644 --- a/text/0000-language-semver.md +++ b/text/0000-language-semver.md @@ -5,12 +5,9 @@ # Summary -This RFC has two main goals: - -- define what precisely constitutes a breaking change for the Rust language itself; -- define a language versioning mechanism that extends the sorts of - changes we can make without causing compilation failures (for - example, adding new keywords). 
+This RFC has the goal of defining what sorts of breaking changes we +will permit for the Rust language itself, and giving guidelines for +how to go about making such changes. # Motivation @@ -29,29 +26,11 @@ disruptive to the ecosystem. Therefore, **the RFC also proposes specific measures to mitigate the impact of breaking changes**, and some criteria when those measures might be appropriate. -Furthermore, there are other kinds of changes that we may want to make -which feel like they *ought* to be possible, but which are in fact -breaking changes. The simplest example is adding a new keyword to the -language -- despite being a purely additive change, a new keyword can -of course conflict with existing identifiers. Therefore, **the RFC -proposes a simple annotation that allows crates to designate the -version of the language they were written for**. This effectively -permits some amount of breaking changes by making them "opt-in" -through the version attribute. - -However, even though the version attribute can be used to make -breaking changes "opt-in" (and hence not really breaking), this is -still a tool to be used with great caution. Therefore, **the RFC also -proposes guidelines on when it is appropriate to include an "opt-in" -breaking change and when it is not**. - -This RFC is focused specifically on the question of what kinds of -changes we can make within a single major version (as well as some -limited mechanisms that lay the groundwork for certain kinds of -anticipated changes). It intentionally does not address the question -of a release schedule for Rust 2.0, nor does it propose any new -features itself. These topics are complex enough to be worth -considering in separate RFCs. +In rare cases, it may be deemed a good idea to make a breaking change +that is not a soundness problem or compiler bug, but rather correcting +a defect in design. Such cases should be rare. 
But if such a change is +deemed worthwhile, then the guidelines given here can still be used to +mitigate its impact. # Detailed design @@ -247,173 +226,29 @@ future as well. The `-Z` flags are of course explicitly unstable, but some of the `-C`, rustdoc, and linker-specific flags are expected to evolve over time (see e.g. [#24451]). -### Opt-in changes - -For breaking changes that are not related to soundness or language -semantics, but are still deemed desirable, an opt-in strategy can be -used instead. This section describes an attribute for opting in to -newer language updates, and gives guidelines on what kinds of changes -should or should not be introduced in this fashion. - -We use the term *"opt-in changes"* to refer to changes that would be -breaking changes, but are not because of the opt-in mechanism. - -#### Rust version option - -The specific proposal is to introduce a command-line option -`--rust-version=X.Y[.Z]` that instructs the Rust compiler to expect -source code from older versions of Rust. This option could also be -specified in a `Cargo.toml` file in a `rust-version` property. The -version applies to the crate currently being compiled and is called -the crate's "supplied version". Every build of the Rust compiler will -also have a version number built into it reflecting the current -release; if the command-line option is not supplied, the compiler -defaults to this builtin version. - -The supplied version is used by the compiler to produce the semantics -of Rust "as it was" during version `X.Y`. RFCs that propose opt-in -changes should discuss how the older behavior can be supported in the -compiler, but this is expected to be straightforward: if supporting -older behavior is hard to do, this may be an indication that the -opt-in change is too complex and should not be accepted. 
- -Note that the supplied version may affect the parser configuration -used when parsing the initial crate, since it can affect the keywords -recognized by the tokenizer and perhaps other minor details in the -syntax. However, because the version is supplied on the command line, -this configuration is known before parsing begins. - -#### Defaults and extreme cases - -If no version is supplied on the `rustc` command line, `rustc` will -default to the maximal version it recognizes. If the user supplies a -version `X.Y` that is *newer* than the compiler itself, the compiler -should simply issue a warning and proceed as if the user had supplied -the compiler's version (i.e., the newest version the compiler knows -about). - -Cargo will always invoke `rustc` with a supplied version. If there is -no version in the `Cargo.toml` file, then `1.0.0` is assumed. (It may -be a good idea to issue a warning in this case as well.) - -Whenever a new project is created with `cargo new`, the new -`Cargo.toml` will include the most recent Rust version number by -default. (Since Cargo and rustc are not, at least today, necessarily -released on the same schedule, we'll have to pick some sensible -definition of the "most recent" Rust version number; one option is to -query the `rustc` executable in scope. Another is to synchronize the -release schedules and use the "built-in" notion.) - -Note that the defaults for `rustc` and `cargo` differ. `rustc` prefers -the most recent verison of Rust by default, whereas `cargo` prefers -the oldest. The reason is that we expect running `rustc` in a -standalone fashion to be used primarily when experimenting with small -scripts and one-offs, and the user is most likely to want "current -Rust" in that scenario. - -#### When opt-in changes are appropriate - -Opt-in changes allow us to greatly expand the scope of the kinds of -additions we can make without breaking existing code, but they are not -applicable in all situations. 
A good rule of thumb is that an opt-in -change is only appropriate if the exact effect of the older code can -be easily recreated in the newer system with only surface changes to -the syntax. - -Another view is that opt-in changes are appropriate if those changes -do not affect the "abstract AST" of your Rust program. In other words, -existing Rust syntax is just a serialization of a more idealized view -of the syntax, in which there are no conflicts between keywords and -identifiers, syntactic sugar is expanded, and so forth. Opt-in changes -might affect the translation into this abstract AST, but should not -affect the semantics of the AST itself at a deeper level. This concept -of an idealized AST is analagous to the "elaborated syntax" described -in [RFC 1105], except that it is at a conceptual level. - -So, for example, the conflict between new keywords and existing -identifiers can (generally) be trivially worked around by renaming -identifiers, though the question of public identifiers is an -interesting one (contextual keywords may suffice, or else perhaps some -kind of escaping syntax -- we defer this question here for a later -RFC). - -In the previous section on breaking changes, we identified various -criteria that can be used to decide how to approach a breaking change -(i.e., how far to go in attempting to mitigate the fallout). For the -most part, those same criteria also apply when deciding whether to -accept an "opt-in" change: - -- How many crates on `crates.io` would break if they "opted-in" to the - change, and would opting in require extensive changes? -- Does the change silently change the result of running the program, - or simply cause additional compilation failures? - - Opt-in changes that silently change the result of running the - program are particularly unlikely to be accepted. -- What changes are needed to get code compiling again? Are those - changes obvious from the error message? 
- -Another important criterion is the implementation complexity. In -particular, how easy will it be to maintain both the older behavior -and the newer behavior? It is important to consider not just the -complexity today, but possible complexity in the future as the -compiler changes. - # Drawbacks -**Allowing unsafe code to continue compiling -- even with warnings -- -raises the probability that people experiences crashes and other -undesirable effects while using Rust.** However, in practice, most -unsafety hazards are more theoretical than practical: consider the -problem with the `thread::scoped` API. To actually create a data-race, -one had to place the guard into an `Rc` cycle, which would be quite -unusual. Therefore, a compromise path that warns about bad content but -provides an option for gradual migration seems preferable. - -**Deprecation implies that a maintenance burden.** For library APIs, -this is relatively simple, but for type-system changes it can be quite -onerous. We may want to consider a policy for dropping older, -deprecated type-system rules after some time, as discussed in the -section on *unresolved questions*. +The primary drawback is that making breaking changes are disruptive, +even when done with the best of intentions. The alternatives list some +ways that we could avoid breaking changes altogether, and the +downsides of each. ## Notes on phasing # Alternatives -**Use an attribute rather than command-line option.** Earlier versions -of this RFC used a `#[rust_version]` attribute to specify the Rust -version rather than a command-line parameter. This was changed to use -a command-line parameter because it (a) exposes the version int he -Cargo metadata, (b) is analogous to the approach used by most other -languages, and (c) simplifies the implementation, since the parser -does not need to be reconfigured midparse. 
- -**Rather than supporting opt-in changes, one might consider simply -issuing a new major release for every such change.** Put simply, -though, issuing a new major release just because we want to have a new -keyword feels like overkill. This seems like to have two potential -negative effects. It may simply cause us to not make some of the -changes we would make otherwise, or work harder to fit them within the -existing syntactic constraints. It may also serve to dilute the -meaning of issuing a new major version, since even additive changes -that do not affect existing code in any meaningful way would result in -a major release. One would then be tempted to have some *additional* -numbering scheme, PR blitz, or other means to notify people when a new -major version is coming that indicates deeper changes. - -**Rather than simply fixing soundness bugs, we could use the opt-in -mechanism to fix them conditionally.** This was initially considered -as an option, but eventually rejected for the following reasons: - -- This would effectively cause a deeper split between minor versions; - currently, opt-in is limited to "surface changes" only, but allowing - opt-in to affect the type system feels like it would be creating two - distinct languages. -- It seems likely that all users of Rust will want to know that their code - is sound and would not want to be working with unsafe constructs or bugs. -- Users may choose not to opt-in to newer versions because they do not - need the new features introduced there or because they wish to - preserve compatibility with older compilers. It would be sad for - them to lose the benefits of bug fixes as well. 
+**Rather than simply fixing soundness bugs, we could issue new major +releases, or use some sort of opt-in mechanism to fix them +conditionally.** This was initially considered as an option, but +eventually rejected for the following reasons: + +- Opting in to type system changes would cause deep splits between + minor versions; it would also create a high maintenance burden in + the compiler, since both older and newer versions would have to be + supported. +- It seems likely that all users of Rust will want to know that their + code is sound and would not want to be working with unsafe + constructs or bugs. - We already have several mitigation measures, such as opt-out or temporary deprecation, that can be used to ease the transition around a soundness fix. Moreover, separating out new type rules so @@ -421,55 +256,8 @@ as an option, but eventually rejected for the following reasons: complicate the compiler internally; it would also make it harder to reason about the type system as a whole. -**Rather than using a version number to opt-in to minor changes, one -might consider using the existing feature mechanism.** For example, -one could write `#![feature(foo)]` to opt in to the feature "foo" and -its associated keywords and type rules, rather than -`#![rust_version="1.2.3"]`. While using minimum version numbers is -more opaque than named features, they do offer several advantages: - -1. Using a version number alone makes it easy to think about what - version of Rust you are using as a conceptual unit, rather than - choosing features "a la carte". -2. Using named features, the list of features that must be attached to - Rust code will grow indefinitely, presuming your crate wants to - stay up to date. -3. Using a version attribute preserves a mental separation between - "experimental work" (feature gates) and stable, new features. -4. 
Named features present a combinatoric testing problem, where we - should (in principle) test for all possible combinations of - features. - # Unresolved questions -**Can (and should) we give a more precise definition for compiler bugs -and soundness problems?** The current text is vague on what precisely -constitutes a compiler bug and soundness change. It may be worth -defining more precisely, though likely this would be best done as part -of writing up a more thorough (and authoritative) Rust reference -manual. - -**Should we add a mechanism for "escaping" keywords?"** We may need a -mechanism for escaping keywords in the future. Imagine you have a -public function named `foo`, and we add a keyword `foo`. Now, if you -opt in to the newer version of Rust, your function declaration is -illegal: but if you rename the function `foo`, you are making a -breaking change for your clients, which you may not wish to do. If we -had an escaping mechanism, you would probably still want to deprecate -`foo` in favor of a new function `bar` (since typing `foo` would be -awkward), but it could still exist. - -**Should we add a mechanism for skipping over new syntax?** The -current `#[cfg]` mechanism is applied *after* parsing. This implies -that if we add new syntax, crates which employ that new syntax will -not be parsable by older compilers, even if the modules that depend on -that new syntax are disabled via `#[cfg]` directives. It may be useful -to add some mechanism for informing the parser that it should skip -over sections of the input (presumably based on token trees). One -approach to this might just be modifying the existing `#[cfg]` -directives so that they are applied during parsing rather than as a -post-pass. - **What precisely constitutes "small" impact?** This RFC does not attempt to define when the impact of a patch is "small" or "not small". 
We will have to develop guidelines over time based on @@ -478,23 +266,6 @@ observe on `crates.io` will be of the total breakage that will occur: it is certainly possible that all crates on `crates.io` work fine, but the change still breaks a large body of code we do not have access to. -**Should deprecation due to unsoundness have a special lint?** We may -not want to use the same deprecation lint for unsoundness that we use -for everything else. - -**What attribute should we use to "opt out" of soundness changes?** -The section on breaking changes indicated that it may sometimes be -appropriate to includ an "opt out" that people can use to temporarily -revert to older, unsound type rules, but did not specify precisely -what that opt-out should look like. Ideally, we would identify a -specific attribute in advance that will be used for such purposes. In -the past, we have simply created ad-hoc attributes (e.g., -`#[old_orphan_check]`), but because custom attributes are forbidden by -stable Rust, this has the unfortunate side-effect of meaning that code -which opts out of the newer rules cannot be compiled on older -compilers (even though it's using the older type system rules). If we -introduce an attribute in advance we will not have this problem. 
- [RFC 1105]: https://github.com/rust-lang/rfcs/pull/1105 [RFC 320]: https://github.com/rust-lang/rfcs/pull/320 [#744]: https://github.com/rust-lang/rfcs/issues/744 From 8e7f04fa29030e283e87f88da55fd7276b68a0e2 Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Mon, 1 Jun 2015 14:32:17 -0700 Subject: [PATCH 0306/1195] Clone -> Copy and address discussion --- text/0000-embrace-extend-extinguish.md | 73 +++++++++++++++++++------- 1 file changed, 54 insertions(+), 19 deletions(-) diff --git a/text/0000-embrace-extend-extinguish.md b/text/0000-embrace-extend-extinguish.md index b28e9d7dba0..428fd00e8db 100644 --- a/text/0000-embrace-extend-extinguish.md +++ b/text/0000-embrace-extend-extinguish.md @@ -5,12 +5,10 @@ # Summary -NOTE: This RFC assumes Extend is improved to take IntoIterator, as was always intended. +Make all collections `impl<'a, T: Copy> Extend<&'a T>`. -Make all collections `impl<'a, T: Clone> Extend<&'a T>`. - -This enables both `vec.extend(&[1, 2, 3])`, and `vec.extend(&hash_set)`. -This provides a more expressive replacement for `Vec::push_all` with +This enables both `vec.extend(&[1, 2, 3])`, and `vec.extend(&hash_set_of_ints)`. +This partially covers the usecase of the awkward `Vec::push_all` with literally no ergonomic loss, while leveraging established APIs. # Motivation @@ -24,22 +22,22 @@ because generic APIs and semantics are tailored for non-Copy types. Even with Extend upgraded to take IntoIterator, that won't work with &[Copy], because a slice can't be moved out of. Collections would have to take `IntoIterator<&T>`, -and clone out of the reference. So, do exactly that. +and copy out of the reference. So, do exactly that. -As a bonus, this is more expressive than `push_all`, because you can feed in any -collection by-reference to clone the data out of it. +As a bonus, this is more expressive than `push_all`, because you can feed in *any* +collection by-reference to clone the data out of it, not just slices. 
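
A small sketch of what the motivation describes: extending a `Vec` of `Copy` elements from a borrowed slice and from a borrowed `HashSet`, without an explicit `.cloned()`. The function name `gather` is hypothetical and only serves the illustration; the sketch assumes the `Extend<&'a T>` impls proposed here are available on `Vec`.

```rust
use std::collections::HashSet;

/// Builds a vector by copying elements out of a borrowed slice and then
/// out of a borrowed set. Neither source is consumed.
fn gather(slice: &[u32], set: &HashSet<u32>) -> Vec<u32> {
    let mut out: Vec<u32> = Vec::new();
    out.extend(slice); // iterates `&u32` items and copies each one
    out.extend(set);   // same call shape for a different collection
    out
}

fn main() {
    let set: HashSet<u32> = [10u32].iter().copied().collect();
    let v = gather(&[1, 2, 3], &set);
    println!("{:?}", v);

    // The sources are still usable afterwards, since they were only borrowed.
    println!("{}", set.len());
}
```

Note that elements copied from the `HashSet` arrive in the set's iteration order, which is unspecified, so only the slice-derived prefix has a predictable order.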
# Detailed design -* For sequences and sets: `impl<'a, T: Clone> Extend<&'a T>` -* For maps: `impl<'a, K: Clone, V: Clone> Extend<(&'a K, &'a V)>` +* For sequences and sets: `impl<'a, T: Copy> Extend<&'a T>` +* For maps: `impl<'a, K: Copy, V: Copy> Extend<(&'a K, &'a V)>` e.g. -``` +```rust use std::iter::IntoIterator; -impl<'a, T: Clone> Extend<&'a T> for Vec { +impl<'a, T: Copy> Extend<&'a T> for Vec { fn extend>(&mut self, iter: I) { self.extend(iter.into_iter().cloned()) } @@ -59,18 +57,49 @@ fn main() { # Drawbacks -Mo' generics, mo' magic. How you gonna discover it? +* Mo' generics, mo' magic. How you gonna discover it? + +* This creates a potentially confusing behaviour in a generic context. + +Consider the following code: + +```rust +fn feed<'a, X: Extend<&'a T>>(&'a self, buf: &mut X) { + buf.extend(self.data.iter()); +} +``` + +One would reasonably extend X to contain &T's, but with this +proposal it is possible that X now instead contains T's. It's not +clear that in "real" code that this would ever be a problem, though. +It may lead to novices accidentally by-passing ownership through +implicit copies. + +It also may make inference fail in some other cases, as Extend would +not always be sufficient to determine the type of a `vec![]`. -Hidden clones? +* This design does not fully replace the push_all, as it takes `T: Clone`. # Alternatives -Restrict this proposal to only work for Copy types. This avoids any concern over -implicit expensive operations, and enables easily working with Plain Old Data. -The only downside is creating a larger divide between Clone and Copy, while also -being a bit needlessly inexpressive. -# Unresolved questions +## The Cloneian Candidate +This proposal is artifically restricting itself to `Copy` rather than full +`Clone` as a concession to the general Rustic philosophy of Clones being +explicit. Since this proposal is largely motivated by simple shuffling of +primitives, this is sufficient. 
Also, because `Copy: Clone`, it would be +backwards compatible to upgrade to `Clone` in the future if demand is +high enough. + +## The New Method +It is theoretically plausible to add a new defaulted method to Extend called +`extend_cloned` that provides this functionality. This removes any concern of +accidental clones and makes inference totally work. However this design cannot +simultaneously support Sequences and Maps, as the signature for sequences would +mean Maps can only Copy through &(K, V), rather than (&K, &V). This would make +it impossible to copy-chain Maps through Extend. + +## Why not FromIterator? FromIterator could also be extended in the same manner, but this is less useful for two reasons: @@ -82,3 +111,9 @@ two reasons: Of course, context might disambiguate in many cases, and `let foo: Vec = [1, 2, 3].iter().collect()` might still be nicer than `let foo: Vec<_> = [1, 2, 3].iter().cloned().collect()`. + + +# Unresolved questions + +None. + From 6874dab3ddaa53ba74d4e7dc11867a851a4c555f Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Mon, 1 Jun 2015 16:26:41 -0700 Subject: [PATCH 0307/1195] fixup --- text/0000-embrace-extend-extinguish.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/text/0000-embrace-extend-extinguish.md b/text/0000-embrace-extend-extinguish.md index 428fd00e8db..6ef99f6e9a1 100644 --- a/text/0000-embrace-extend-extinguish.md +++ b/text/0000-embrace-extend-extinguish.md @@ -5,10 +5,10 @@ # Summary -Make all collections `impl<'a, T: Copy> Extend<&'a T>`. +Make all collections `impl<'a, T: Copy> Extend<&'a T>`. -This enables both `vec.extend(&[1, 2, 3])`, and `vec.extend(&hash_set_of_ints)`. -This partially covers the usecase of the awkward `Vec::push_all` with +This enables both `vec.extend(&[1, 2, 3])`, and `vec.extend(&hash_set_of_ints)`. +This partially covers the usecase of the awkward `Vec::push_all` with literally no ergonomic loss, while leveraging established APIs. 
# Motivation @@ -69,7 +69,7 @@ fn feed<'a, X: Extend<&'a T>>(&'a self, buf: &mut X) { } ``` -One would reasonably extend X to contain &T's, but with this +One would reasonably expect X to contain &T's, but with this proposal it is possible that X now instead contains T's. It's not clear that in "real" code that this would ever be a problem, though. It may lead to novices accidentally by-passing ownership through From 1f9a2ae26639dcc0b946ddd9e09cf1bc857d68a5 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 2 Jun 2015 16:31:59 -0700 Subject: [PATCH 0308/1195] RFC 839 is Extend<&T> for collections --- ...extend-extinguish.md => 0839-embrace-extend-extinguish.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-embrace-extend-extinguish.md => 0839-embrace-extend-extinguish.md} (96%) diff --git a/text/0000-embrace-extend-extinguish.md b/text/0839-embrace-extend-extinguish.md similarity index 96% rename from text/0000-embrace-extend-extinguish.md rename to text/0839-embrace-extend-extinguish.md index 6ef99f6e9a1..a23acffebf5 100644 --- a/text/0000-embrace-extend-extinguish.md +++ b/text/0839-embrace-extend-extinguish.md @@ -1,7 +1,7 @@ - Feature Name: embrace-extend-extinguish - Start Date: 2015-02-13 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#839](https://github.com/rust-lang/rfcs/pull/839) +- Rust Issue: [rust-lang/rust#25976](https://github.com/rust-lang/rust/issues/25976) # Summary From 4059db0b8694a5513c8be4dc652544c5e7e7e003 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 2 Jun 2015 16:36:45 -0700 Subject: [PATCH 0309/1195] RFC 1014 is not panicking when stdout isn't present --- .../1014-stdout-existential-crisis.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename 0000-stdout-existential-crisis.md => text/1014-stdout-existential-crisis.md (92%) diff --git a/0000-stdout-existential-crisis.md b/text/1014-stdout-existential-crisis.md similarity index 92% rename 
from 0000-stdout-existential-crisis.md rename to text/1014-stdout-existential-crisis.md index bcc7b648531..8649b6c3508 100644 --- a/0000-stdout-existential-crisis.md +++ b/text/1014-stdout-existential-crisis.md @@ -1,7 +1,7 @@ -- Feature Name: stdout_existential_crisis +- Feature Name: `stdout_existential_crisis` - Start Date: 2015-03-25 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1014](https://github.com/rust-lang/rfcs/pull/1014) +- Rust Issue: [rust-lang/rust#25977](https://github.com/rust-lang/rust/issues/25977) # Summary From 1b9d4bf2f06ddc07e1cd92a5f266d6f512716904 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Tue, 2 Jun 2015 19:41:55 -0400 Subject: [PATCH 0310/1195] Merge RFC 1096 -- remove-static-assert --- .../1096-remote-static-assert.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename 0000-remove-static-assert.md => active/1096-remote-static-assert.md (97%) diff --git a/0000-remove-static-assert.md b/active/1096-remote-static-assert.md similarity index 97% rename from 0000-remove-static-assert.md rename to active/1096-remote-static-assert.md index 56778da0e6d..60cf8e81157 100644 --- a/0000-remove-static-assert.md +++ b/active/1096-remote-static-assert.md @@ -1,6 +1,6 @@ - Feature Name: remove-static-assert - Start Date: 2015-04-28 -- RFC PR: +- RFC PR: https://github.com/rust-lang/rfcs/pull/1096 - Rust Issue: https://github.com/rust-lang/rust/pull/24910 # Summary From b802ae272482a067ab16da45c693dc30a5d0d890 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Tue, 2 Jun 2015 20:05:37 -0400 Subject: [PATCH 0311/1195] Update text --- text/0008-new-intrinsics.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0008-new-intrinsics.md b/text/0008-new-intrinsics.md index 230b02964c5..78043cf9270 100644 --- a/text/0008-new-intrinsics.md +++ b/text/0008-new-intrinsics.md @@ -2,7 +2,9 @@ - RFC PR: [rust-lang/rfcs#8](https://github.com/rust-lang/rfcs/pull/8) - Rust Issue: -** 
Note: this RFC was never implemented. ** +** Note: this RFC was never implemented and has been retired. The +design may still be useful in the future, but before implementing we +would prefer to revisit it so as to be sure it is up to date. ** # Summary From bf8d81b117fcfc223b2a3b79cfd9a6f8d59ae81d Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Tue, 17 Mar 2015 08:51:49 +1300 Subject: [PATCH 0312/1195] DST custom coercions. Custom coercions allow smart pointers to fully participate in the DST system. In particular, they allow practical use of `Rc` and `Arc` where `T` is unsized. This RFC subsumes part of [RFC 401 coercions](https://github.com/rust-lang/rfcs/blob/master/text/0401-coercions.md). --- text/0000-dst-coercion.md | 175 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 175 insertions(+) create mode 100644 text/0000-dst-coercion.md diff --git a/text/0000-dst-coercion.md b/text/0000-dst-coercion.md new file mode 100644 index 00000000000..2f9bdd1824e --- /dev/null +++ b/text/0000-dst-coercion.md @@ -0,0 +1,175 @@ +- Feature Name: dst-coercions +- Start Date: 2015-03-16 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Custom coercions allow smart pointers to fully participate in the DST system. +In particular, they allow practical use of `Rc` and `Arc` where `T` is unsized. + +This RFC subsumes part of [RFC 401 coercions](https://github.com/rust-lang/rfcs/blob/master/text/0401-coercions.md). + +# Motivation + +DST is not really finished without this, in particular there is a need for types +like reference counted trait objects (`Rc`) which are not currently well- +supported (without coercions, it is pretty much impossible to create such values +with such a type). + +# Detailed design + +There is an `Unsize` trait and lang item. This trait signals that a type can be +converted using the compiler's coercion machinery from a sized to an unsized +type. All implementations of this trait are implicit and compiler generated. 
It +is an error to implement this trait. If `&T` can be coerced to `&U` then there +will be an implementation of `Unsize` for `T`. E.g, `[i32; 42]: +Unsize<[i32]>`. Note that the existence of an `Unsize` impl does not signify a +coercion can itself can take place, it represents an internal part of the +coercion mechanism (it corresponds with `coerce_inner` from RFC 401). The trait +is defined as: + +``` +#[lang="unsize"] +trait Unsize: ::std::marker::PhantomFn {} +``` + +There are implementations for any fixed size array to the corresponding unsized +array, for any type to any trait that that type implements, for structs and +tuples where the last field can be unsized, and for any pair of traits where +`Self` is a sub-trait of `T` (see RFC 401 for more details). + +There is a `CoerceUnsized` trait which is implemented by smart pointer types to +opt-in to DST coercions. It is defined as: + +``` +#[lang="coerce_unsized"] +trait CoerceUnsized: ::std::marker::PhantomFn + Sized {} +``` + +An example implementation: + +``` +impl, U: ?Sized> CoerceUnsized> for Rc {} + +// For reference, the definition of Rc: +pub struct Rc { + _ptr: NonZero<*mut RcBox>, +} +``` + +Implementing `CoerceUnsized` indicates that the self type should be able to be +coerced to the `Target` type. E.g., the above implementation means that +`Rc<[i32; 42]>` can be coerced to `Rc<[i32]>`. + + +## Newtype coercions + +We also add a new built-in coercion for 'newtype's. If `Foo` is a tuple +struct with a single field with type `T` and `T` has at least the `?Sized` +bound, then coerce_inner(`Foo`) = `Foo` holds for any `T` and `U` where +`T` coerces to `U`. + +This coercion is not opt-in. It is best thought of as an extension to the +coercion rule for structs with an unsized field, the extension is that here the +field conversion is a proper coercion, not an application of `coerce_inner`. +Note that this coercion can be recursively applied. 
+ + +## Compiler checking + +### On encountering an implementation of `CoerceUnsized` (type collection phase) + +* The compiler checks that the `Self` type is a struct or tuple struct and that +the `Target` type is a simple substitution of type parameters from the `Self` +type (one day, with HKT, this could be a regular part of type checking, for now +it must be an ad hoc check). We might enforce that this substitution is of the +form `X/Y` where `X` and `Y` are both formal type parameters of the +implementation (I don't think this is necessary, but it makes checking coercions +easier and is satisfied for all smart pointers). +* The compiler checks each field in the `Self` type against the corresponding field +in the `Target` type. Either the field types must be subtypes or be coercible from the +`Self` field to the `Target` field (this is checked taking into account any +`Unsize` bounds in the environment which indicate that some coercion can take +place). Note that this per-field check uses only the built-in coercion +mechanics. It does not take into account `CoerceUnsized` impls (although we +might allow this in the future). +* There must be only one field that is coerced. +* We record in a side table a mapping from the impl to an adjustment. The +adjustment will contain the field which is coerced and a nested adjustment +representing that coercion. The nested adjustment will have a placeholder for +any use of the `Unsize` bound (we should require that there is exactly one such use). + +### On encountering a potential coercion + +* If we have an expression with type `E` where the type `F` is required during +type checking and `E` is not a subtype of `F`, nor is it coercible using the +built-in coercions, then we search for an implementation of `CoerceUnsized` +for `E`. A match will give us a substitution of the formal type parameters of +the impl by some actual types. +* We look up the impl in the side table described above. 
The substitution is used +with the placeholder in the recorded adjustment to create a new coercion which +will map one field of the struct being coerced. That coercion should always be +valid (if it is not, there is a compiler bug). +* We create a new adjustment for the coerced expression. This will include the +index of the field which is deeply coerced and the adjustment for the coercion +described in the previous step. +* In trans, the adjustment is used to codegen a coercion by moving the coerced +value and changing the indicated field to a new type according to the nested +adjustment. + +### Adjustment types + +We add `AdjustCustom(usize, Box)` and +`AdjustNewtype(Box)` to the `AutoAdjustment` enum. These +represent the new custom and newtype coercions, respectively. We add +`UnsizePlaceHolder(Ty, Ty)` to the `UnsizeKind` enum to represent a placeholder +adjustment due to an `Unsize` bound. + +### Example + +For the above `Rc` impl, we record the following adjustment (with some trivial +bits and pieces elided): + +``` +AdjustCustom(0, AdjustNewType( + AutoDerefRef { + autoderefs: 1, + autoref: AutoUnsafe(mut, AutoUnsize( + UnsizeStruct(UnsizePlaceholder(T, U)))) + })) +``` + +When we need to coerce `Rc<[i32; 42]>` to `Rc<[i32]>`, we look up the impl and +find `T = [i32; 42]` and `U = [i32]` (note that we automatically require that +`Unsize` is satisfied when looking up the impl). We can therefore replace the +placeholder in the above adjustment with `UnsizeLength(42)`. That gives us the +real adjustment to store for trans. + +# Drawbacks + +Not as flexible as the previous proposal. Can't handle pointer-like types like +`Option>`. + +# Alternatives + +The original [DST5 proposal](http://smallcultfollowing.com/babysteps/blog/2014/01/05/dst-take-5/) +contains a similar proposal with no opt-in trait, i.e., coercions are completely +automatic and arbitrarily deep. This is a little too magical and unpredicatable. 
+It violates some 'soft abstraction boundaries' by interefering with the deep +structure of objects, sometimes even automatically (and implicitly) allocating. + +[RFC 401](https://github.com/rust-lang/rfcs/blob/master/text/0401-coercions.md) +proposed a scheme for proposals where users write their own coercion using +intrinsics. Although more flexible, this allows for implcicit excecution of +arbitrary code. If we need the increased flexibility, I believe we can add a +manual option to the `CoerceUnsized` trait backwards compatibly. + +The proposed design could be tweaked: we could make newtype coercions opt-in +(this would complicate other parts of the proposal though). We could change the +`CoerceUnsized` trait in many ways (we experimented with an associated type to +indicate the field type which is coerced, for example). + +# Unresolved questions + +None From 4d803c14371ca7cbfd327b3db9735ed3ba58e31f Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Tue, 17 Mar 2015 12:12:36 +1300 Subject: [PATCH 0313/1195] Tweak newtype coercions, remove ?Sized requirement. --- text/0000-dst-coercion.md | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/text/0000-dst-coercion.md b/text/0000-dst-coercion.md index 2f9bdd1824e..2dea9d73a42 100644 --- a/text/0000-dst-coercion.md +++ b/text/0000-dst-coercion.md @@ -66,9 +66,8 @@ coerced to the `Target` type. E.g., the above implementation means that ## Newtype coercions We also add a new built-in coercion for 'newtype's. If `Foo` is a tuple -struct with a single field with type `T` and `T` has at least the `?Sized` -bound, then coerce_inner(`Foo`) = `Foo` holds for any `T` and `U` where -`T` coerces to `U`. +struct with a single field with type `T`, then coerce_inner(`Foo`) = `Foo` +holds for any `T` and `U` where `T` coerces to `U`. This coercion is not opt-in. 
It is best thought of as an extension to the coercion rule for structs with an unsized field, the extension is that here the @@ -121,7 +120,7 @@ adjustment. ### Adjustment types We add `AdjustCustom(usize, Box)` and -`AdjustNewtype(Box)` to the `AutoAdjustment` enum. These +`AdjustNewtype(Box)` to the `AutoAdjustment` enum. These represent the new custom and newtype coercions, respectively. We add `UnsizePlaceHolder(Ty, Ty)` to the `UnsizeKind` enum to represent a placeholder adjustment due to an `Unsize` bound. From 97452ca19a8dab989057b93c01c6e259bc86222e Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Wed, 18 Mar 2015 15:00:57 +1300 Subject: [PATCH 0314/1195] Remove newtype coercions, make CoerceUnsized coercions more general --- text/0000-dst-coercion.md | 124 ++++++++++++++++++++------------------ 1 file changed, 66 insertions(+), 58 deletions(-) diff --git a/text/0000-dst-coercion.md b/text/0000-dst-coercion.md index 2dea9d73a42..0a390ebce24 100644 --- a/text/0000-dst-coercion.md +++ b/text/0000-dst-coercion.md @@ -51,34 +51,60 @@ An example implementation: ``` impl, U: ?Sized> CoerceUnsized> for Rc {} +impl, U: ?Sized> NonZero for NonZero {} -// For reference, the definition of Rc: +// For reference, the definitions of Rc and NonZero: pub struct Rc { _ptr: NonZero<*mut RcBox>, } +pub struct NonZero(T); ``` Implementing `CoerceUnsized` indicates that the self type should be able to be coerced to the `Target` type. E.g., the above implementation means that -`Rc<[i32; 42]>` can be coerced to `Rc<[i32]>`. +`Rc<[i32; 42]>` can be coerced to `Rc<[i32]>`. There will be `CoerceUnsized` impls +for the various pointer kinds available in Rust and which allow coercions, therefore +`CoerceUnsized` when used as a bound indicates coercible types. 
E.g., +``` +fn foo, U>(x: T) -> U { + x +} +``` + +Built-in pointer impls: -## Newtype coercions +``` +impl, U: ?Sized> CoerceUnsized> for Box {} +impl, U: ?Sized, 'a> CoerceUnsized<&'a U> for Box {} +impl, U: ?Sized, 'a> CoerceUnsized<&mut 'a U> for Box {} +impl, U: ?Sized> CoerceUnsized<*const U> for Box {} +impl, U: ?Sized> CoerceUnsized<*mut U> for Box {} + +impl, U: ?Sized, 'a, 'b: 'a> CoerceUnsized<&'a U> for &mut 'b U {} +impl, U: ?Sized, 'a> CoerceUnsized<&mut 'a U> for &mut 'a U {} +impl, U: ?Sized, 'a> CoerceUnsized<*const U> for &mut 'a U {} +impl, U: ?Sized, 'a> CoerceUnsized<*mut U> for &mut 'a U {} -We also add a new built-in coercion for 'newtype's. If `Foo` is a tuple -struct with a single field with type `T`, then coerce_inner(`Foo`) = `Foo` -holds for any `T` and `U` where `T` coerces to `U`. +impl, U: ?Sized, 'a, 'b> CoerceUnsized<&'a U> for &'b U {} +impl, U: ?Sized, 'b> CoerceUnsized<*const U> for &'b U {} -This coercion is not opt-in. It is best thought of as an extension to the -coercion rule for structs with an unsized field, the extension is that here the -field conversion is a proper coercion, not an application of `coerce_inner`. -Note that this coercion can be recursively applied. +impl, U: ?Sized> CoerceUnsized<*const U> for *mut U {} +impl, U: ?Sized> CoerceUnsized<*mut U> for *mut U {} + +impl, U: ?Sized> CoerceUnsized<*const U> for *const U {} +``` + +Note that there are some coercions which are not given by `CoerceUnsized`, e.g., +from safe to unsafe function pointers, so it really is a `CoerceUnsized` trait, +not a general `Coerce` trait. ## Compiler checking ### On encountering an implementation of `CoerceUnsized` (type collection phase) +* If the impl is for a built-in pointer type, we check nothing, otherwise... 
* The compiler checks that the `Self` type is a struct or tuple struct and that the `Target` type is a simple substitution of type parameters from the `Self` type (one day, with HKT, this could be a regular part of type checking, for now @@ -87,63 +113,46 @@ form `X/Y` where `X` and `Y` are both formal type parameters of the implementation (I don't think this is necessary, but it makes checking coercions easier and is satisfied for all smart pointers). * The compiler checks each field in the `Self` type against the corresponding field -in the `Target` type. Either the field types must be subtypes or be coercible from the -`Self` field to the `Target` field (this is checked taking into account any -`Unsize` bounds in the environment which indicate that some coercion can take -place). Note that this per-field check uses only the built-in coercion -mechanics. It does not take into account `CoerceUnsized` impls (although we -might allow this in the future). +in the `Target` type. Assuming `Fs` is the type of a field in `Self` and `Ft` is +the type of the corresponding field in `Target`, then either `Ft <: Fs` or +`Fs: CoerceUnsized` (note that this includes built-in coercions). * There must be only one field that is coerced. -* We record in a side table a mapping from the impl to an adjustment. The -adjustment will contain the field which is coerced and a nested adjustment -representing that coercion. The nested adjustment will have a placeholder for -any use of the `Unsize` bound (we should require that there is exactly one such use). +* We record for each impl, the index of the field in the `Self` type which is +coerced. -### On encountering a potential coercion +### On encountering a potential coercion (type checking phase) * If we have an expression with type `E` where the type `F` is required during type checking and `E` is not a subtype of `F`, nor is it coercible using the -built-in coercions, then we search for an implementation of `CoerceUnsized` -for `E`. 
A match will give us a substitution of the formal type parameters of -the impl by some actual types. -* We look up the impl in the side table described above. The substitution is used -with the placeholder in the recorded adjustment to create a new coercion which -will map one field of the struct being coerced. That coercion should always be -valid (if it is not, there is a compiler bug). -* We create a new adjustment for the coerced expression. This will include the -index of the field which is deeply coerced and the adjustment for the coercion -described in the previous step. -* In trans, the adjustment is used to codegen a coercion by moving the coerced -value and changing the indicated field to a new type according to the nested -adjustment. +built-in coercions, then we search for a bound of `E: CoerceUnsized`. Note +that we may not at this stage find the actual impl, but finding the bound is +good enough for type checking. -### Adjustment types +* If we require a coercion in the receiver of a method call or field lookup, we +perform the same search that we currently do, except that where we currently +check for coercions, we check for built-in coercions and then for `CoerceUnsized` +bounds. We must also check for `Unsize` bounds for the case where the receiver +is auto-deref'ed, but not autoref'ed. -We add `AdjustCustom(usize, Box)` and -`AdjustNewtype(Box)` to the `AutoAdjustment` enum. These -represent the new custom and newtype coercions, respectively. We add -`UnsizePlaceHolder(Ty, Ty)` to the `UnsizeKind` enum to represent a placeholder -adjustment due to an `Unsize` bound. -### Example +### On encountering an adjustment (translation phase) -For the above `Rc` impl, we record the following adjustment (with some trivial -bits and pieces elided): +* In trans (which is post-monomorphisation) we should always be able to find an +impl for any `CoerceUnsized` bound. 
+* If the impl is for a built-in pointer type, then we use the current coercion +code for the various pointer kinds (`Box` has different behaviour than `&` and +`*` pointers). +* Otherwise, we lookup which field is coerced due to the opt-in coercion, move +the object being coerced and coerce the field in question by recursing (the +built-in pointers are the base cases). -``` -AdjustCustom(0, AdjustNewType( - AutoDerefRef { - autoderefs: 1, - autoref: AutoUnsafe(mut, AutoUnsize( - UnsizeStruct(UnsizePlaceholder(T, U)))) - })) -``` -When we need to coerce `Rc<[i32; 42]>` to `Rc<[i32]>`, we look up the impl and -find `T = [i32; 42]` and `U = [i32]` (note that we automatically require that -`Unsize` is satisfied when looking up the impl). We can therefore replace the -placeholder in the above adjustment with `UnsizeLength(42)`. That gives us the -real adjustment to store for trans. +### Adjustment types + +We add `AdjustCustom` to the `AutoAdjustment` enum as a placeholder for coercions +due to a `CoerceUnsized` bound. I don't think we need the `UnsizeKind` enum at +all now, since all checking is postponed until trans or relies on traits and impls. + # Drawbacks @@ -164,8 +173,7 @@ intrinsics. Although more flexible, this allows for implcicit excecution of arbitrary code. If we need the increased flexibility, I believe we can add a manual option to the `CoerceUnsized` trait backwards compatibly. -The proposed design could be tweaked: we could make newtype coercions opt-in -(this would complicate other parts of the proposal though). We could change the +The proposed design could be tweaked: for example, we could change the `CoerceUnsized` trait in many ways (we experimented with an associated type to indicate the field type which is coerced, for example). 
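Note on the patch above: the coercions it describes can be observed from user code. The sketch below is illustrative only and is not part of the patch. It assumes a current stable Rust toolchain (the `dyn` keyword postdates this RFC text), and the helper names `unsize_rc`, `unsize_box`, and `unsize_ref` are hypothetical. It shows an unsizing coercion through a smart pointer, through `Box`, and through a built-in reference.

```rust
use std::fmt::Debug;
use std::rc::Rc;

/// Array-to-slice unsizing behind a smart pointer: `Rc<[i32; 3]>` to `Rc<[i32]>`.
/// The return position is the coercion site.
fn unsize_rc(fixed: Rc<[i32; 3]>) -> Rc<[i32]> {
    fixed
}

/// Concrete type to trait object behind `Box`.
fn unsize_box(value: Box<i32>) -> Box<dyn Debug> {
    value
}

/// Built-in reference case: `&[i32; 3]` to `&[i32]`.
fn unsize_ref(array: &[i32; 3]) -> &[i32] {
    array
}

fn main() {
    let slice = unsize_rc(Rc::new([1, 2, 3]));
    assert_eq!(slice.len(), 3);

    let object = unsize_box(Box::new(7));
    assert_eq!(format!("{:?}", object), "7");

    let array = [4, 5, 6];
    assert_eq!(unsize_ref(&array), &[4, 5, 6][..]);
}
```

In each helper the source type is sized and the target is not, which is the relationship the `Unsize` bound in the patch is meant to capture.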
From 7468c306816c136dff320515de73e36061b16f7c Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Tue, 24 Mar 2015 13:39:12 +1300 Subject: [PATCH 0315/1195] Address some comments --- text/0000-dst-coercion.md | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/text/0000-dst-coercion.md b/text/0000-dst-coercion.md index 0a390ebce24..5c8ab6329a1 100644 --- a/text/0000-dst-coercion.md +++ b/text/0000-dst-coercion.md @@ -1,4 +1,4 @@ -- Feature Name: dst-coercions +- Feature Name: dst_coercions - Start Date: 2015-03-16 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) @@ -51,7 +51,7 @@ An example implementation: ``` impl, U: ?Sized> CoerceUnsized> for Rc {} -impl, U: ?Sized> NonZero for NonZero {} +impl, U: ?Sized> CoerceUnsized> for NonZero {} // For reference, the definitions of Rc and NonZero: pub struct Rc { @@ -107,7 +107,9 @@ not a general `Coerce` trait. * If the impl is for a built-in pointer type, we check nothing, otherwise... * The compiler checks that the `Self` type is a struct or tuple struct and that the `Target` type is a simple substitution of type parameters from the `Self` -type (one day, with HKT, this could be a regular part of type checking, for now +type (i.e., That `Self` is `Foo`, `Target` is `Foo` and that there exist +`Vs` and `Xs` (where `Xs` are all type parameters) such that `Target = [Vs/Xs]Self`. +One day, with HKT, this could be a regular part of type checking, for now it must be an ad hoc check). We might enforce that this substitution is of the form `X/Y` where `X` and `Y` are both formal type parameters of the implementation (I don't think this is necessary, but it makes checking coercions @@ -115,7 +117,8 @@ easier and is satisfied for all smart pointers). * The compiler checks each field in the `Self` type against the corresponding field in the `Target` type. 
Assuming `Fs` is the type of a field in `Self` and `Ft` is the type of the corresponding field in `Target`, then either `Ft <: Fs` or -`Fs: CoerceUnsized` (note that this includes built-in coercions). +`Fs: CoerceUnsized` (note that this includes some built-in coercions, coercions +unrelated to unsizing are excluded, these could probably be added later, if needed). * There must be only one field that is coerced. * We record for each impl, the index of the field in the `Self` type which is coerced. @@ -156,8 +159,7 @@ all now, since all checking is postponed until trans or relies on traits and imp # Drawbacks -Not as flexible as the previous proposal. Can't handle pointer-like types like -`Option>`. +Not as flexible as the previous proposal. # Alternatives From 4b99006cac67f7ad89c670bcb351d32bc0d5d33f Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Wed, 3 Jun 2015 16:55:41 +1200 Subject: [PATCH 0316/1195] eddyb's changes --- text/0000-dst-coercion.md | 28 +++++++++++----------------- 1 file changed, 11 insertions(+), 17 deletions(-) diff --git a/text/0000-dst-coercion.md b/text/0000-dst-coercion.md index 5c8ab6329a1..59e27ae0c9b 100644 --- a/text/0000-dst-coercion.md +++ b/text/0000-dst-coercion.md @@ -51,13 +51,13 @@ An example implementation: ``` impl, U: ?Sized> CoerceUnsized> for Rc {} -impl, U: ?Sized> CoerceUnsized> for NonZero {} +impl, U: Zeroable> CoerceUnsized> for NonZero {} // For reference, the definitions of Rc and NonZero: pub struct Rc { _ptr: NonZero<*mut RcBox>, } -pub struct NonZero(T); +pub struct NonZero(T); ``` Implementing `CoerceUnsized` indicates that the self type should be able to be @@ -75,24 +75,18 @@ fn foo, U>(x: T) -> U { Built-in pointer impls: ``` -impl, U: ?Sized> CoerceUnsized> for Box {} -impl, U: ?Sized, 'a> CoerceUnsized<&'a U> for Box {} -impl, U: ?Sized, 'a> CoerceUnsized<&mut 'a U> for Box {} -impl, U: ?Sized> CoerceUnsized<*const U> for Box {} -impl, U: ?Sized> CoerceUnsized<*mut U> for Box {} +impl<'a, 'b: 'aT: 
?Sized+Unsize, U: ?Sized> CoerceUnsized<&'a U> for &'b mut T {} +impl<'a, T: ?Sized+Unsize, U: ?Sized> CoerceUnsized<&'a mut U> for &'a mut T {} +impl<'a, T: ?Sized+Unsize, U: ?Sized> CoerceUnsized<*const U> for &'a mut T {} +impl<'a, T: ?Sized+Unsize, U: ?Sized> CoerceUnsized<*mut U> for &'a mut T {} -impl, U: ?Sized, 'a, 'b: 'a> CoerceUnsized<&'a U> for &mut 'b U {} -impl, U: ?Sized, 'a> CoerceUnsized<&mut 'a U> for &mut 'a U {} -impl, U: ?Sized, 'a> CoerceUnsized<*const U> for &mut 'a U {} -impl, U: ?Sized, 'a> CoerceUnsized<*mut U> for &mut 'a U {} +impl<'a, 'b: 'a, T: ?Sized+Unsize, U: ?Sized> CoerceUnsized<&'a U> for &'b T {} +impl<'b, T: ?Sized+Unsize, U: ?Sized> CoerceUnsized<*const U> for &'b T {} -impl, U: ?Sized, 'a, 'b> CoerceUnsized<&'a U> for &'b U {} -impl, U: ?Sized, 'b> CoerceUnsized<*const U> for &'b U {} +impl, U: ?Sized> CoerceUnsized<*const U> for *mut T {} +impl, U: ?Sized> CoerceUnsized<*mut U> for *mut T {} -impl, U: ?Sized> CoerceUnsized<*const U> for *mut U {} -impl, U: ?Sized> CoerceUnsized<*mut U> for *mut U {} - -impl, U: ?Sized> CoerceUnsized<*const U> for *const U {} +impl, U: ?Sized> CoerceUnsized<*const U> for *const T {} ``` Note that there are some coercions which are not given by `CoerceUnsized`, e.g., From 4424c49b165b6c983b750aefc16c3017c0a7abd9 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Wed, 3 Jun 2015 15:38:45 -0400 Subject: [PATCH 0317/1195] Merge RFC #982: DST Coercion --- README.md | 1 + text/{0000-dst-coercion.md => 0982-dst-coercion.md} | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) rename text/{0000-dst-coercion.md => 0982-dst-coercion.md} (97%) diff --git a/README.md b/README.md index c4073a55955..f199a7a9afc 100644 --- a/README.md +++ b/README.md @@ -48,6 +48,7 @@ the direction the language is evolving in. 
* [0909-move-thread-local-to-std-thread.md](text/0909-move-thread-local-to-std-thread.md) * [0911-const-fn.md](text/0911-const-fn.md) * [0968-closure-return-type-syntax.md](text/0968-closure-return-type-syntax.md) +* [0982-dst-coercion.md](text/0982-dst-coercion.md) * [0979-align-splitn-with-other-languages.md](text/0979-align-splitn-with-other-languages.md) * [1011-process.exit.md](text/1011-process.exit.md) * [1023-rebalancing-coherence.md](text/1023-rebalancing-coherence.md) diff --git a/text/0000-dst-coercion.md b/text/0982-dst-coercion.md similarity index 97% rename from text/0000-dst-coercion.md rename to text/0982-dst-coercion.md index 59e27ae0c9b..c4263fd28b8 100644 --- a/text/0000-dst-coercion.md +++ b/text/0982-dst-coercion.md @@ -1,7 +1,7 @@ - Feature Name: dst_coercions - Start Date: 2015-03-16 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#982](https://github.com/rust-lang/rfcs/pull/982) +- Rust Issue: [rust-lang/rust#18598](https://github.com/rust-lang/rust/issues/18598) # Summary From 1359ebcfa7f12cb8f7c5b8c9e261b641840564d8 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 3 Jun 2015 17:12:33 -0700 Subject: [PATCH 0318/1195] Remove active folder --- {active => text}/1096-remote-static-assert.md | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename {active => text}/1096-remote-static-assert.md (100%) diff --git a/active/1096-remote-static-assert.md b/text/1096-remote-static-assert.md similarity index 100% rename from active/1096-remote-static-assert.md rename to text/1096-remote-static-assert.md From 6fa5d0568b47a24b5cfc55d809314c6375e50340 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Thu, 4 Jun 2015 14:24:54 -0400 Subject: [PATCH 0319/1195] rename rfc to its proper location --- README.md | 1 + .../1096-remove-static-assert.md | 0 2 files changed, 1 insertion(+) rename active/1096-remote-static-assert.md => text/1096-remove-static-assert.md (100%) diff --git a/README.md b/README.md index 
f199a7a9afc..97348cd2e83 100644 --- a/README.md +++ b/README.md @@ -55,6 +55,7 @@ the direction the language is evolving in. * [1040-duration-reform.md](text/1040-duration-reform.md) * [1044-io-fs-2.1.md](text/1044-io-fs-2.1.md) * [1066-safe-mem-forget.md](text/1066-safe-mem-forget.md) +* [1096-remove-static-assert.md](text/1096-remove-static-assert.md) ## Table of Contents [Table of Contents]: #table-of-contents diff --git a/active/1096-remote-static-assert.md b/text/1096-remove-static-assert.md similarity index 100% rename from active/1096-remote-static-assert.md rename to text/1096-remove-static-assert.md From e4372c43a4624a225370e3ef303654bc595d8c2a Mon Sep 17 00:00:00 2001 From: P1start Date: Sat, 6 Jun 2015 13:58:37 +1200 Subject: [PATCH 0320/1195] =?UTF-8?q?Add=20some=20of=20`[T]`=E2=80=99s=20m?= =?UTF-8?q?ethods=20to=20`str`=20and=20vice=20versa?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- text/0000-slice-string-symmetry.md | 68 ++++++++++++++++++++++++++++++ 1 file changed, 68 insertions(+) create mode 100644 text/0000-slice-string-symmetry.md diff --git a/text/0000-slice-string-symmetry.md b/text/0000-slice-string-symmetry.md new file mode 100644 index 00000000000..e6cbf9490e5 --- /dev/null +++ b/text/0000-slice-string-symmetry.md @@ -0,0 +1,68 @@ +- Feature Name: `slice_string_symmetry` +- Start Date: 2015-06-06 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Add some methods that already exist on slices to strings and vice versa. +Specifically, the following methods should be added: + +- `str::chunks` +- `str::windows` +- `str::into_string` +- `String::into_boxed_slice` +- `<[T]>::subslice_offset` + +# Motivation + +Conceptually, strings and slices are similar types. Many methods are already +shared between the two types due to their similarity. However, not all methods +are shared between the types, even though many could be. 
This is a little +unexpected and inconsistent. Because of that, this RFC proposes to remedy this +by adding a few methods to both strings and slices to even out these two types’ +available methods. + +# Detailed design + +Add the following methods to `str`, presumably as inherent methods: + +- `chunks(&self, n: usize) -> Chunks`: Returns an iterator that yields the + *characters* (not bytes) of the string in groups of `n` at a time. Iterator + element type: `&str`. + +- `windows(&self, n: usize) -> Windows`: Returns an iterator over all contiguous + windows of character length `n`. Iterator element type: `&str`. + +- `into_string(self: Box<str>) -> String`: Returns `self` as a `String`. This is + equivalent to `[T]`’s `into_vec`. + +`split_at(&self, mid: usize) -> (&str, &str)` would also be on this list, but +there is [an existing RFC](https://github.com/rust-lang/rfcs/pull/1123) for it. + +Add the following method to `String` as an inherent method: + +- `into_boxed_slice(self) -> Box<str>`: Returns `self` as a `Box<str>`, + reallocating to cut off any excess capacity if needed. This is required to + provide a safe means of creating `Box<str>`. + +Add the following method to `[T]` (for all `T`), presumably as an inherent +method: + +- `subslice_offset(&self, inner: &[T]) -> usize`: Returns the offset (in + elements) of an inner slice relative to an outer slice. Panics if `inner` is + not contained within `self`. + +# Drawbacks + +- `str::subslice_offset` is already unstable, so creating a similar method on + `[T]` is perhaps not such a good idea. + +# Alternatives + +- Do a subset of the proposal. For example, the `Box<str>`-related methods could + be removed. + +# Unresolved questions + +None.
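A rough sketch of the character-based grouping described for `str::chunks` above, written as a free function. The name `char_chunks` and the choice to collect into a `Vec` rather than return a lazy iterator are illustrative assumptions, not part of the proposal; the point is only that grouping by *characters* means cutting at `char_indices` boundaries and never at arbitrary byte offsets.

```rust
/// Hypothetical helper (not part of the RFC): splits `s` into pieces of
/// at most `n` characters each, always cutting on a character boundary.
fn char_chunks(s: &str, n: usize) -> Vec<&str> {
    assert!(n > 0, "chunk size must be non-zero");
    let mut pieces = Vec::new();
    let mut start = 0; // byte offset where the current piece begins
    let mut count = 0; // number of characters in the current piece
    for (idx, _) in s.char_indices() {
        if count == n {
            // The current piece is full: close it at this boundary.
            pieces.push(&s[start..idx]);
            start = idx;
            count = 0;
        }
        count += 1;
    }
    // Whatever remains forms a final, possibly shorter, piece.
    if start < s.len() {
        pieces.push(&s[start..]);
    }
    pieces
}
```

With this sketch, `char_chunks("héllo", 2)` groups the string as `"hé"`, `"ll"`, `"o"`, even though `é` occupies two bytes.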
From c2146b3670972e17c57bf1499982fa1aa5ab3b99 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 9 Jun 2015 16:52:04 -0700 Subject: [PATCH 0321/1195] RFC 1123 is str::split_at --- text/{0000-str-split-at.md => 1123-str-split-at.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-str-split-at.md => 1123-str-split-at.md} (94%) diff --git a/text/0000-str-split-at.md b/text/1123-str-split-at.md similarity index 94% rename from text/0000-str-split-at.md rename to text/1123-str-split-at.md index 66f849096d0..f57e08b3458 100644 --- a/text/0000-str-split-at.md +++ b/text/1123-str-split-at.md @@ -1,7 +1,7 @@ -- Feature Name: str-split-at +- Feature Name: `str_split_at` - Start Date: 2015-05-17 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1123](https://github.com/rust-lang/rfcs/pull/1123) +- Rust Issue: [rust-lang/rust#25839](https://github.com/rust-lang/rust/pull/25839) # Summary From 642bde9c054cacc0efbb9faf9d8018ba8b83d409 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 9 Jun 2015 17:11:51 -0700 Subject: [PATCH 0322/1195] RFC 1119 is Result::expect --- text/{0000-result-expect.md => 1119-result-expect.md} | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) rename text/{0000-result-expect.md => 1119-result-expect.md} (88%) diff --git a/text/0000-result-expect.md b/text/1119-result-expect.md similarity index 88% rename from text/0000-result-expect.md rename to text/1119-result-expect.md index 8546b6eb354..59ddf9ed6a1 100644 --- a/text/0000-result-expect.md +++ b/text/1119-result-expect.md @@ -1,7 +1,7 @@ - Feature Name: `result_expect` - Start Date: 2015-05-13 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1119](https://github.com/rust-lang/rfcs/pull/1119) +- Rust Issue: [rust-lang/rust#25359](https://github.com/rust-lang/rust/pull/25359) # Summary @@ -21,6 +21,7 @@ message and the error value. 
The format of the error message is left undefined in the documentation, but will most likely be the following + ``` panic!("{}: {:?}", msg, e) ``` From d69cf9248e992b2679f444e4a591c892df3e08f6 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Wed, 10 Jun 2015 15:11:08 +1200 Subject: [PATCH 0323/1195] Clarify text about lvalues --- text/0803-type-ascription.md | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/text/0803-type-ascription.md b/text/0803-type-ascription.md index 94dc373d59c..e5e62c37ad2 100644 --- a/text/0803-type-ascription.md +++ b/text/0803-type-ascription.md @@ -153,10 +153,10 @@ context of cross-platform programming). ### Type ascription and temporaries There is an implementation choice between treating `x: T` as an lvalue or -rvalue. Note that when a rvalue is used in lvalue context (e.g., the subject of -a reference operation), then the compiler introduces a temporary variable. -Neither option is satisfactory, if we treat an ascription expression as an -lvalue (i.e., no new temporary), then there is potential for unsoundness: +rvalue. Note that when an rvalue is used in 'reference context' (e.g., the +subject of a reference operation), then the compiler introduces a temporary +variable. Neither option is satisfactory, if we treat an ascription expression +as an lvalue (i.e., no new temporary), then there is potential for unsoundness: ``` let mut foo: S = ...; @@ -172,11 +172,13 @@ lvalue position), then we don't have the soundness problem, but we do get the unexpected result that `&(x: T)` is not in fact a reference to `x`, but a reference to a temporary copy of `x`. -The proposed solution is that type ascription expressions are lvalues. If the -type ascription expression is in reference context, then we require the ascribed -type to exactly match the type of the expression, i.e., neither subtyping nor -coercion is allowed. 
These reference contexts are as follows (where <expr> is a -type ascription expression): +The proposed solution is that type ascription expressions inherit their +'lvalue-ness' from their underlying expressions. I.e., `e: T` is an lvalue if +`e` is an lvalue, and an rvalue otherwise. If the type ascription expression is +in reference context, then we require the ascribed type to exactly match the +type of the expression, i.e., neither subtyping nor coercion is allowed. These +reference contexts are as follows (where `<expr>` is a type ascription +expression): ``` &[mut] <expr> From 8980cc60c69518a86f815a4090318aff0c6ac012 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Wed, 10 Jun 2015 11:25:17 -0400 Subject: [PATCH 0324/1195] Create the various links and interconnections. --- README.md | 1 + text/{0000-expect-intrinsic.md => 1131-likely-intrinsic.md} | 6 +++--- 2 files changed, 4 insertions(+), 3 deletions(-) rename text/{0000-expect-intrinsic.md => 1131-likely-intrinsic.md} (94%) diff --git a/README.md b/README.md index 97348cd2e83..a972ab693e6 100644 --- a/README.md +++ b/README.md @@ -56,6 +56,7 @@ the direction the language is evolving in.
* [1044-io-fs-2.1.md](text/1044-io-fs-2.1.md) * [1066-safe-mem-forget.md](text/1066-safe-mem-forget.md) * [1096-remove-static-assert.md](text/1096-remove-static-assert.md) +* [1131-likely-intrinsic.md](text/1131-likely-intrinsic.md) ## Table of Contents [Table of Contents]: #table-of-contents diff --git a/text/0000-expect-intrinsic.md b/text/1131-likely-intrinsic.md similarity index 94% rename from text/0000-expect-intrinsic.md rename to text/1131-likely-intrinsic.md index 8162ef161b0..b5b894cbd70 100644 --- a/text/0000-expect-intrinsic.md +++ b/text/1131-likely-intrinsic.md @@ -1,7 +1,7 @@ - Feature Name: expect_intrinsic - Start Date: 2015-05-20 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1131](https://github.com/rust-lang/rfcs/pull/1131) +- Rust Issue: [rust-lang/rust#26179](https://github.com/rust-lang/rust/issues/26179) # Summary @@ -53,4 +53,4 @@ we have used for similar restrictions like the ordering constraint of the atomic # Unresolved questions -None. \ No newline at end of file +None. From 2dba27690e744b8e224c49af39fbade368e668f2 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Wed, 10 Jun 2015 11:40:17 -0400 Subject: [PATCH 0325/1195] Final amendments. --- text/0000-language-semver.md | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/text/0000-language-semver.md b/text/0000-language-semver.md index e2ddcf10e8d..3ac247a2937 100644 --- a/text/0000-language-semver.md +++ b/text/0000-language-semver.md @@ -144,6 +144,9 @@ relatively small, it could be an option to leave the "opt out" mechanism in place permanently. In either case, use of the "opt out" API would trigger the deprecation lint. +Note that we should make every effort to ensure that crates which +employ this opt out can be used compatibly with crates that do not. 
+ #### Changes that alter dynamic semantics versus typing rules In some cases, fixing a bug may not cause crates to stop compiling, @@ -266,6 +269,26 @@ observe on `crates.io` will be of the total breakage that will occur: it is certainly possible that all crates on `crates.io` work fine, but the change still breaks a large body of code we do not have access to. +**What attribute should we use to "opt out" of soundness changes?** +The section on breaking changes indicated that it may sometimes be +appropriate to include an "opt out" that people can use to temporarily +revert to older, unsound type rules, but did not specify precisely +what that opt-out should look like. Ideally, we would identify a +specific attribute in advance that will be used for such purposes. In +the past, we have simply created ad-hoc attributes (e.g., +`#[old_orphan_check]`), but because custom attributes are forbidden by +stable Rust, this has the unfortunate side-effect of meaning that code +which opts out of the newer rules cannot be compiled on older +compilers (even though it's using the older type system rules). If we +introduce an attribute in advance we will not have this problem. + +**Are there any other circumstances in which we might perform a +breaking change?** In particular, it may happen from time to time that +we wish to alter some detail of a stable component. If we believe that +this change will not affect anyone, such a change may be worth doing, +but we'll have to work out more precise guidelines. [RFC 1156] is an +example. + [RFC 1105]: https://github.com/rust-lang/rfcs/pull/1105 [RFC 320]: https://github.com/rust-lang/rfcs/pull/320 [#744]: https://github.com/rust-lang/rfcs/issues/744 @@ -281,3 +304,4 @@ the change still breaks a large body of code we do not have access to.
[RFC 560]: https://github.com/rust-lang/rfcs/pull/560 [macro]: https://internals.rust-lang.org/t/pre-rfc-macro-improvements/2088 [#24451]: https://github.com/rust-lang/rust/pull/24451 +[RFC 1156]: https://github.com/rust-lang/rfcs/pull/1156 From f018ef9192285c88a9cd9167679ab6a31ee6405a Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Wed, 10 Jun 2015 11:48:32 -0400 Subject: [PATCH 0326/1195] Move things into their proper places --- README.md | 1 + text/{0000-language-semver.md => 1122-language-semver.md} | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) rename text/{0000-language-semver.md => 1122-language-semver.md} (99%) diff --git a/README.md b/README.md index a972ab693e6..b59eb498ac4 100644 --- a/README.md +++ b/README.md @@ -56,6 +56,7 @@ the direction the language is evolving in. * [1044-io-fs-2.1.md](text/1044-io-fs-2.1.md) * [1066-safe-mem-forget.md](text/1066-safe-mem-forget.md) * [1096-remove-static-assert.md](text/1096-remove-static-assert.md) +* [1122-language-semver.md](text/1122-language-semver.md) * [1131-likely-intrinsic.md](text/1131-likely-intrinsic.md) ## Table of Contents diff --git a/text/0000-language-semver.md b/text/1122-language-semver.md similarity index 99% rename from text/0000-language-semver.md rename to text/1122-language-semver.md index 3ac247a2937..ed0985d606d 100644 --- a/text/0000-language-semver.md +++ b/text/1122-language-semver.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2015-05-07 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1122](https://github.com/rust-lang/rfcs/pull/1122) +- Rust Issue: N/A # Summary From 721f2d74cc4daf76f3e49d58fbc6ded55d545e45 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 10 Jun 2015 10:53:05 -0700 Subject: [PATCH 0327/1195] RFC 1105 is API evolution --- text/{0000-api-evolution.md => 1105-api-evolution.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-api-evolution.md => 1105-api-evolution.md} (99%) 
diff --git a/text/0000-api-evolution.md b/text/1105-api-evolution.md similarity index 99% rename from text/0000-api-evolution.md rename to text/1105-api-evolution.md index 54c465f3acb..b8b37f1d2dd 100644 --- a/text/0000-api-evolution.md +++ b/text/1105-api-evolution.md @@ -1,7 +1,7 @@ - Feature Name: not applicable - Start Date: 2015-05-04 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1105](https://github.com/rust-lang/rfcs/pull/1105) +- Rust Issue: N/A # Summary From 56466320ec42f3109bcc4919a7f71dcd65dcaaea Mon Sep 17 00:00:00 2001 From: arielb1 Date: Thu, 11 Jun 2015 20:19:11 +0300 Subject: [PATCH 0328/1195] Changes for raw pointer casts --- text/0401-coercions.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/text/0401-coercions.md b/text/0401-coercions.md index 0bbb1e2a9cc..a6bebbe9d6b 100755 --- a/text/0401-coercions.md +++ b/text/0401-coercions.md @@ -335,7 +335,12 @@ following holds: where `&.T` and `*T` are references of either mutability, and where unsize_kind(`T`) is the kind of the unsize info -in `T` - a vtable or a length (or `()` if `T: Sized`). +in `T` - the vtable for a trait definition (e.g. `fmt::Display` or +`Iterator`, not `Iterator`) or a length (or `()` if `T: Sized`). + +Note that lengths are not adjusted when casting raw slices - +`T: *const [u16] as *const [u8]` creates a slice that only includes +half of the original memory. Casting is not transitive, that is, even if `e as U1 as U2` is a valid expression, `e as U2` is not necessarily so (in fact it will only be valid if From 317885e0bf7d7f3f85b08c57e74e56cbaece731b Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 12 Jun 2015 14:07:29 -0400 Subject: [PATCH 0329/1195] Amend the Language Semver specification to reflect the criteria from this RFC. 
--- text/1122-language-semver.md | 93 ++++++++++++++++++++---------------- 1 file changed, 53 insertions(+), 40 deletions(-) diff --git a/text/1122-language-semver.md b/text/1122-language-semver.md index ed0985d606d..2296b763186 100644 --- a/text/1122-language-semver.md +++ b/text/1122-language-semver.md @@ -14,37 +14,29 @@ how to go about making such changes. With the release of 1.0, we need to establish clear policy on what precisely constitutes a "minor" vs "major" change to the Rust language itself (as opposed to libraries, which are covered by [RFC 1105]). -**This RFC proposes that minor releases may only contain breaking +**This RFC proposes that minor releases may contain breaking changes that fix compiler bugs or other type-system issues**. Primarily, this means soundness issues where "innocent" code can cause undefined behavior (in the technical sense), but it also covers cases like compiler bugs and tightening up the semantics of "underspecified" parts of the language (more details below). -However, simply landing all breaking changes immediately could be very +Simply landing all breaking changes immediately could be very disruptive to the ecosystem. Therefore, **the RFC also proposes specific measures to mitigate the impact of breaking changes**, and some criteria when those measures might be appropriate. -In rare cases, it may be deemed a good idea to make a breaking change +In rare cases, it may be deemed worthwhile to make a breaking change that is not a soundness problem or compiler bug, but rather correcting -a defect in design. Such cases should be rare. But if such a change is -deemed worthwhile, then the guidelines given here can still be used to -mitigate its impact. +a defect in design. 
**These changes are to be avoided, but may be +permissible if the impact is judged to be negligible (and thus there +is expected to be no breakage in practice).** This RFC also includes +criteria for how to estimate the impact of a change and advice on the +timing of such changes. # Detailed design -The detailed design is broken into two major sections: how to address -soundness changes, and how to address other, opt-in style changes. We -do not discuss non-breaking changes here, since obviously those are -safe. - -### Soundness changes - -When compiler or type-system bugs are encountered in the language -itself (as opposed to in a library), clearly they ought to be -fixed. However, it is important to fix them in such a way as to -minimize the impact on the ecosystem. +### Evaluating the impact of a change The first step then is to evaluate the impact of the fix on the crates found in the `crates.io` website (using e.g. the crater tool). If @@ -56,6 +48,8 @@ problem, which helps those people who are affected to migrate their code. A description of the problem should also appear in the relevant subteam report. +### Techniques for easing the transition + In cases where the impact seems larger, any effort to ease the transition is sure to be welcome. The following are suggestions for possible steps we could take (not all of which will be applicable to @@ -80,6 +74,8 @@ all scenarios): However, this option may frequently not be available, because the source of a compilation error is often hard to pin down with precision. + +### Other factors to consider Some of the factors that should be taken into consideration when deciding whether and how to minimize the impact of a fix: @@ -142,10 +138,19 @@ opt out will thus be removed in a later release. But in some cases, particularly those cases where the severity of the problem is relatively small, it could be an option to leave the "opt out" mechanism in place permanently. 
In either case, use of the "opt out" -API would trigger the deprecation lint. +API would trigger the deprecation lint. Note that we should make every +effort to ensure that crates which employ this opt out can be used +compatibly with crates that do not. -Note that we should make every effort to ensure that crates which -employ this opt out can be used compatibly with crates that do not. +Opt outs should be specified using the `#[legacy(foo)]` attribute. +This attribute intentionally ignores unrecognized opt-outs (such as +`foo`) to allow for forwards compatibility with opt-outs that may be +added in later compiler releases (in such cases, older compilers will +naturally perform the legacy behavior). + +Ideally, opt-outs should be constructed in as targeted a fashion as +possible. That means it is generally better, for example, to have +users opt out individual items than an entire crate at once. #### Changes that alter dynamic semantics versus typing rules @@ -229,6 +234,34 @@ future as well. The `-Z` flags are of course explicitly unstable, but some of the `-C`, rustdoc, and linker-specific flags are expected to evolve over time (see e.g. [#24451]). +#### Other kinds of breaking changes + +From time to time, we may find a flaw in a design that is neither a +soundness concern nor a bug fix, but rather simply a suboptimal +decision. In general, it is best to find ways to correct such errors +without making breaking changes, such as improved error messages or +deprecation. However, if the impact of making the change is judged to +be negligible, we can also consider fixing the problem, presuming that +the following criteria are met: + +- All data indicates that correcting this flaw will break extremely little + or no existing code (such as crates.io testing, communication with production + users of Rust or other private developers, etc). +- The feature was only recently stabilized, preferably in the previous + cycle. 
This minimizes the possibility that a large body of code has + crept up that relies on this feature. + - If and when we establish LTS releases, we should never make + changes to features marked as stable in an LTS release (except for + soundness reasons). +- There is no backwards compatible way to repair the problem. + +Naturally, all of the concerns listed above in the section "Other +Factors to Consider" also apply here. For example, we should consider +the quality of the error messages that result from the breaking +change, and evaluate whether it is possible to write code that works +both before/after the change (which enables users to span compiler +versions). + # Drawbacks The primary drawback is that making breaking changes is disruptive, @@ -269,26 +302,6 @@ observe on `crates.io` will be of the total breakage that will occur: it is certainly possible that all crates on `crates.io` work fine, but the change still breaks a large body of code we do not have access to. -**What attribute should we use to "opt out" of soundness changes?** -The section on breaking changes indicated that it may sometimes be -appropriate to include an "opt out" that people can use to temporarily -revert to older, unsound type rules, but did not specify precisely -what that opt-out should look like. Ideally, we would identify a -specific attribute in advance that will be used for such purposes. In -the past, we have simply created ad-hoc attributes (e.g., -`#[old_orphan_check]`), but because custom attributes are forbidden by -stable Rust, this has the unfortunate side-effect of meaning that code -which opts out of the newer rules cannot be compiled on older -compilers (even though it's using the older type system rules). If we -introduce an attribute in advance we will not have this problem. - -**Are there any other circumstances in which we might perform a -breaking change?** In particular, it may happen from time to time that -we wish to alter some detail of a stable component.
If we believe that -this change will not affect anyone, such a change may be worth doing, -but we'll have to work out more precise guidelines. [RFC 1156] is an -example. - [RFC 1105]: https://github.com/rust-lang/rfcs/pull/1105 [RFC 320]: https://github.com/rust-lang/rfcs/pull/320 [#744]: https://github.com/rust-lang/rfcs/issues/744 From e17464cb6436f98a19640e8a05e952a4b5b6cfd4 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Tue, 16 Jun 2015 17:45:57 -0400 Subject: [PATCH 0330/1195] Adjust policy text to include warnings in 1.2, hard error in 1.3, and remove the legacy attribute (also adjust text of RFC #1122 slightly). --- text/0000-adjust-default-object-bounds.md | 243 ++++++++++++++++++++++ text/1122-language-semver.md | 28 ++- 2 files changed, 262 insertions(+), 9 deletions(-) create mode 100644 text/0000-adjust-default-object-bounds.md diff --git a/text/0000-adjust-default-object-bounds.md b/text/0000-adjust-default-object-bounds.md new file mode 100644 index 00000000000..a85a7d4126e --- /dev/null +++ b/text/0000-adjust-default-object-bounds.md @@ -0,0 +1,243 @@ +- Feature Name: N/A +- Start Date: 2015-06-4 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Adjust the object default bound algorithm for cases like `&'x +Box<Trait>` and `&'x Arc<Trait>`. The existing algorithm would default +to `&'x Box<Trait+'x>`. The proposed change is to default to `&'x +Box<Trait+'static>`. + +Note: This is a **BREAKING CHANGE**. The change has +[been implemented][branch] and its impact has been evaluated. It was +[found][crater] to cause **no root regressions** on `crates.io`. +Nonetheless, to minimize impact, this RFC proposes phasing in the +change as follows: + +- In Rust 1.2, a warning will be issued for code which will break when the + defaults are changed. This warning can be disabled by using explicit + bounds. The warning will only be issued when explicit bounds would be required + in the future anyway. +- In Rust 1.3, the change will be made permanent.
Any code that has + not been updated by that time will break. + +# Motivation + +When we instituted default object bounds, [RFC 599] specified that +`&'x Box<Trait>` (and `&'x mut Box<Trait>`) should expand to `&'x +Box<Trait+'x>` (and `&'x mut Box<Trait+'x>`). This is in contrast to a +`Box` type that appears outside of a reference (e.g., `Box<Trait>`), +which defaults to using `'static` (`Box<Trait+'static>`). This +decision was made because it meant that a function written like so +would accept the broadest set of possible objects: + +```rust +fn foo(x: &Box<Trait>) { +} +``` + +In particular, under the current defaults, `foo` can be supplied an +object which references borrowed data. Given that `foo` is taking the +argument by reference, it seemed like a good rule. Experience has +shown otherwise (see below for some of the problems encountered). + +This RFC proposes changing the default object bound rules so that the +default is drawn from the innermost type that encloses the trait +object. If there is no such type, the default is `'static`. If the type +is a reference (e.g., `&'r Trait`), then the default is the lifetime +`'r` of that reference. Otherwise, the type must in practice be some +user-declared type, and the default is derived from the declaration: +if the type declares a lifetime bound, then this lifetime bound is +used, otherwise `'static` is used. This means that (e.g.) `&'r +Box<Trait>` would default to `&'r Box<Trait+'static>`, and `&'r +Ref<'q, Trait>` (from `RefCell`) would default to `&'r Ref<'q, +Trait+'q>`. + +### Problems with the current default. + +**Same types, different expansions.** One problem is fairly +predictable: the current default means that identical types differ in +their interpretation based on where they appear. This is something we +have striven to avoid in general.
So, as an example, this code +[will not type-check](http://is.gd/Yaak1l): + +```rust +trait Trait { } + +struct Foo { + field: Box<Trait> +} + +fn do_something(f: &mut Foo, x: &mut Box<Trait>) { + mem::swap(&mut f.field, &mut *x); +} +``` + +Even though `x` is a reference to a `Box<Trait>` and the type of +`field` is a `Box<Trait>`, the expansions differ. `x` expands to `&'x +mut Box<Trait+'x>` and the field expands to `Box<Trait+'static>`. In +general, we have tried to ensure that if the type is *typed precisely +the same* in a type definition and a fn definition, then those two +types are equal (note that fn definitions allow you to omit things +that cannot be omitted in types, so some types that you can enter in a +fn definition, like `&i32`, cannot appear in a type definition). + +Now, the same is of course true for the type `Trait` itself, which +appears identically in different contexts and is expanded in different +ways. This is not a problem here because the type `Trait` is unsized, +which means that it cannot be swapped or moved, and hence the main +sources of type mismatches are avoided. + +**Mental model.** In general the mental model of the newer rules seems +simpler: once you move a trait object into the heap (via `Box`, or +`Arc`), you must explicitly indicate whether it can contain borrowed +data or not. So long as you manipulate by reference, you don't have +to. In contrast, the current rules are more subtle, since objects in +the heap may still accept borrowed data, if you have a reference to +the box. + +**Poor interaction with the dropck rules.** When implementing the +newer dropck rules specified by [RFC 769], we found a +[rather subtle problem] that would arise with the current defaults. +The precise problem is spelled out in the appendix below, but the TL;DR is +that if you wish to pass an array of boxed objects, the current +defaults can be actively harmful, and hence force you to specify +explicit lifetimes, whereas the newer defaults do something +reasonable.
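As a hedged illustration of the "same types, different expansions" point, here is a self-contained variant of the `mem::swap` example from the motivation. It assumes a compiler that already applies the defaults proposed in this RFC and uses the later `dyn Trait` spelling; the `name` method and the `A` and `B` types are hypothetical additions that exist only so the swap can be observed.

```rust
use std::mem;

trait Trait {
    fn name(&self) -> &'static str;
}

// Hypothetical implementors, present only to make the swap observable.
struct A;
struct B;

impl Trait for A {
    fn name(&self) -> &'static str {
        "A"
    }
}

impl Trait for B {
    fn name(&self) -> &'static str {
        "B"
    }
}

struct Foo {
    // In a struct field, the object bound defaults to 'static.
    field: Box<dyn Trait>,
}

// Under the proposed defaults, the box behind `&mut` also expands to
// `Box<dyn Trait + 'static>`, so both boxes have the same type and can
// be swapped.
fn do_something(f: &mut Foo, x: &mut Box<dyn Trait>) {
    mem::swap(&mut f.field, x);
}
```

Under the older defaults the second parameter would have expanded to a box bounded by the reference's own lifetime, which is the mismatch described above.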
+ +# Detailed design + +The rules for user-defined types from RFC 599 are altered as follows +(text that is not changed is italicized): + +- *If `SomeType` contains a single where-clause like `T:'a`, where + `T` is some type parameter on `SomeType` and `'a` is some + lifetime, then the type provided as value of `T` will have a + default object bound of `'a`. An example of this is + `std::cell::Ref`: a usage like `Ref<'x, X>` would change the + default for object types appearing in `X` to be `'a`.* +- If `SomeType` contains no where-clauses of the form `T:'a`, then + the "base default" is used. The base default depends on the overall context: + - in a fn body, the base default is a fresh inference variable. + - outside of a fn body, such as in a fn signature, the base default + is `'static`. + Hence `Box<X>` would typically have a default of `'static` for `X`, + regardless of whether it appears underneath an `&` or not. + (Note that in a fn body, the inference is strong enough to adopt `'static` + if that is the necessary bound, or a looser bound if that would be helpful.) +- *If `SomeType` contains multiple where-clauses of the form `T:'a`, + then the default is cleared and explicit lifetime bounds are + required. There are no known examples of this in the standard + library as this situation arises rarely in practice.* + +# Timing and breaking change implications + +This is a breaking change, and hence it behooves us to evaluate the +impact and describe a procedure for making the change as painless as +possible. One nice property of this change is that it only affects +*defaults*, which means that it is always possible to write code that +compiles both before and after the change by avoiding defaults in +those cases where the new and old compiler disagree.
+ +The estimated impact of this change is very low, for two reasons: +- A recent test of crates.io found [no regressions][crater] caused by + this change (however, a [previous run] (from before Rust 1.0) found 8 + regressions). +- This feature was only recently stabilized as part of Rust 1.0 (and + was only added towards the end of the release cycle), so there + hasn't been time for a large body of dependent code to arise + outside of crates.io. + +Nonetheless, to minimize impact, this RFC proposes phasing in the +change as follows: + +- In Rust 1.2, a warning will be issued for code which will break when the + defaults are changed. This warning can be disabled by using explicit + bounds. The warning will only be issued when explicit bounds would be required + in the future anyway. + - Specifically, types that were written `&Box<Trait>` where the + (boxed) trait object may contain references should now be written + `&Box<Trait+'a>` to disable the warning. +- In Rust 1.3, the change will be made permanent. Any code that has + not been updated by that time will break. + +# Drawbacks + +The primary drawback is that this is a breaking change, as discussed +in the previous section. + +# Alternatives + +Keep the current design, with its known drawbacks. + +# Unresolved questions + +None. + +# Appendix: Details of the dropck problem + +This appendix goes into detail about the sticky interaction with +dropck that was uncovered. The problem arises if you have a function +that wishes to take a mutable slice of objects, like so: + +```rust +fn do_it(x: &mut [Box<FnMut()>]) { ... } +``` + +Here, `&mut [..]` is used because the objects are `FnMut` objects, and +hence require `&mut self` to call. This function in turn is expanded +to: + +```rust +fn do_it<'x>(x: &'x mut [Box<FnMut()+'x>]) { ... 
} +``` + +Now callers might try to invoke the function like so: + +```rust +do_it(&mut [Box::new(val1), Box::new(val2)]) +``` + +Unfortunately, this code fails to compile -- in fact, it cannot be +made to compile without changing the definition of `do_it`, due to a +sticky interaction between dropck and variance. The problem is that +dropck requires that all data in the box strictly outlives the +lifetime of the box's owner. This is to prevent cyclic +content. Therefore, the type of the objects must be `Box<FnMut()+'R>` +where `'R` is some region that strictly outlives the array itself (as +the array is the owner of the objects). However, the signature of +`do_it` demands that the reference to the array has the same lifetime +as the trait objects within (and because this is an `&mut` reference +and hence invariant, no approximation is permitted). This implies that +the array must live for at least the region `'R`. But we defined the +region `'R` to be some region that outlives the array, so we have a +quandary. + +The solution is to change the definition of `do_it` in one of two +ways: + +```rust +// Use explicit lifetimes to make it clear that the reference is not +// required to have the same lifetime as the objects themselves: +fn do_it1<'a,'b>(x: &'a mut [Box<FnMut()+'b>]) { ... } + +// Specifying 'static is easier, but then the closures cannot +// capture the stack: +fn do_it2<'a>(x: &'a mut [Box<FnMut()+'static>]) { ... } +``` + +Under the proposed RFC, `do_it2` would be the default. If one wanted +to use lifetimes, then one would have to use explicit lifetime +overrides as shown in `do_it1`. This is consistent with the mental +model of "once you box up an object, you must add annotations for it +to contain borrowed data".
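The `do_it1` form can be sketched as runnable code, under a few assumptions: it uses the later `dyn FnMut()` spelling rather than the bare trait syntax of this RFC, and the helper `run_with_borrowed_state` together with its `Cell` counter are hypothetical, added only to show closures that borrow stack data being passed through a mutable slice.

```rust
use std::cell::Cell;

// Explicit lifetimes: the borrow of the slice ('a) and the data captured
// by the boxed closures ('b) are allowed to differ.
fn do_it1<'a, 'b>(callbacks: &'a mut [Box<dyn FnMut() + 'b>]) {
    for callback in callbacks.iter_mut() {
        callback();
    }
}

// Hypothetical helper: builds two closures that borrow a stack-local
// counter, runs them through `do_it1`, and returns the final count.
fn run_with_borrowed_state() -> u32 {
    let counter = Cell::new(0u32);
    // `'_` asks the compiler to infer the object bound, so the boxes may
    // hold closures that borrow `counter`.
    let mut callbacks: Vec<Box<dyn FnMut() + '_>> = Vec::new();
    callbacks.push(Box::new(|| counter.set(counter.get() + 1)));
    callbacks.push(Box::new(|| counter.set(counter.get() + 10)));
    do_it1(&mut callbacks);
    counter.get()
}
```

A `do_it2`-style signature, with a `'static` object bound, would reject these closures because they borrow `counter` from the stack.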
+ +[RFC 599]: 0599-default-object-bound.md +[RFC 769]: 0769-sound-generic-drop.md +[rather subtle problem]: https://github.com/rust-lang/rust/pull/25212#issuecomment-100244929 +[crater]: https://gist.github.com/brson/085d84d43c6a9a8d4dc3 +[branch]: https://github.com/nikomatsakis/rust/tree/better-object-defaults +[previous run]: https://gist.github.com/brson/80f9b80acef2e7ab37ee +[RFC 1122]: https://github.com/rust-lang/rfcs/pull/1122 diff --git a/text/1122-language-semver.md b/text/1122-language-semver.md index 2296b763186..fe13868f47e 100644 --- a/text/1122-language-semver.md +++ b/text/1122-language-semver.md @@ -242,7 +242,8 @@ decision. In general, it is best to find ways to correct such errors without making breaking changes, such as improved error messages or deprecation. However, if the impact of making the change is judged to be negligible, we can also consider fixing the problem, presuming that -the following criteria are met: +the following criteria are met (in addition to the criteria listed +above in the section "Other Factors to Consider"): - All data indicates that correcting this flaw will break extremely little or no existing code (such as crates.io testing, communication with production @@ -255,12 +256,23 @@ the following criteria are met: soundness reasons). - There is no backwards compatible way to repair the problem. -Naturally, all of the concerns listed above in the section "Other -Factors to Consider" also apply here. For example, we should consider -the quality of the error messages that result from the breaking -change, and evaluate whether it is possible to write code that works -both before/after the change (which enables users to span compiler -versions). +When we do decide to make such a change, we should follow one of the +following ways to ease the transition, in order of preference: + +1. When possible, issue a release that gives warnings when changes will be required. 
+ - These warnings should be as targeted as possible -- if the warning + is too broad, it can easily cause more annoyance than the change + itself. + - To maximize the chance of these warnings being taken seriously, + the warning should not be a lint. The only way to disable it is to + make a change that will be forwards compatible with the new + version. + - After a suitable time has elapsed (at least one cycle, possibly + more), make the actual change. +2. If warnings are not possible, then include a `#[legacy]` attribute + that allows the old behavior to be restored. + - Ideally, this `#[legacy]` attribute will only persist for a fixed + number of cycles. # Drawbacks @@ -269,8 +281,6 @@ even when done with the best of intentions. The alternatives list some ways that we could avoid breaking changes altogether, and the downsides of each. -## Notes on phasing - # Alternatives **Rather than simply fixing soundness bugs, we could issue new major From 0b056f7a3b57c085ac73d221c844fa9aed6d252a Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Tue, 16 Jun 2015 18:17:22 -0400 Subject: [PATCH 0331/1195] Reverse changes to #1122, which I will extract into a distinct RFC for more visibility --- text/1122-language-semver.md | 107 ++++++++++++++--------------------- 1 file changed, 42 insertions(+), 65 deletions(-) diff --git a/text/1122-language-semver.md b/text/1122-language-semver.md index fe13868f47e..ed0985d606d 100644 --- a/text/1122-language-semver.md +++ b/text/1122-language-semver.md @@ -14,29 +14,37 @@ how to go about making such changes. With the release of 1.0, we need to establish clear policy on what precisely constitutes a "minor" vs "major" change to the Rust language itself (as opposed to libraries, which are covered by [RFC 1105]). -**This RFC proposes that minor releases may contain breaking +**This RFC proposes that minor releases may only contain breaking changes that fix compiler bugs or other type-system issues**. 
Primarily, this means soundness issues where "innocent" code can cause undefined behavior (in the technical sense), but it also covers cases like compiler bugs and tightening up the semantics of "underspecified" parts of the language (more details below). -Simply landing all breaking changes immediately could be very +However, simply landing all breaking changes immediately could be very disruptive to the ecosystem. Therefore, **the RFC also proposes specific measures to mitigate the impact of breaking changes**, and some criteria when those measures might be appropriate. -In rare cases, it may be deemed worthwhile to make a breaking change +In rare cases, it may be deemed a good idea to make a breaking change that is not a soundness problem or compiler bug, but rather correcting -a defect in design. **These changes are to be avoided, but may be -permissible if the impact is judged to be negligible (and thus there -is expected to be no breakage in practice).** This RFC also includes -criteria for how to estimate the impact of a change and advice on the -timing of such changes. +a defect in design. Such cases should be rare. But if such a change is +deemed worthwhile, then the guidelines given here can still be used to +mitigate its impact. # Detailed design -### Evaluating the impact of a change +The detailed design is broken into two major sections: how to address +soundness changes, and how to address other, opt-in style changes. We +do not discuss non-breaking changes here, since obviously those are +safe. + +### Soundness changes + +When compiler or type-system bugs are encountered in the language +itself (as opposed to in a library), clearly they ought to be +fixed. However, it is important to fix them in such a way as to +minimize the impact on the ecosystem. The first step then is to evaluate the impact of the fix on the crates found in the `crates.io` website (using e.g. the crater tool). 
If @@ -48,8 +56,6 @@ problem, which helps those people who are affected to migrate their code. A description of the problem should also appear in the relevant subteam report. -### Techniques for easing the transition - In cases where the impact seems larger, any effort to ease the transition is sure to be welcome. The following are suggestions for possible steps we could take (not all of which will be applicable to @@ -74,8 +80,6 @@ all scenarios): However, this option may frequently not be available, because the source of a compilation error is often hard to pin down with precision. - -### Other factors to consider Some of the factors that should be taken into consideration when deciding whether and how to minimize the impact of a fix: @@ -138,19 +142,10 @@ opt out will thus be removed in a later release. But in some cases, particularly those cases where the severity of the problem is relatively small, it could be an option to leave the "opt out" mechanism in place permanently. In either case, use of the "opt out" -API would trigger the deprecation lint. Note that we should make every -effort to ensure that crates which employ this opt out can be used -compatibly with crates that do not. - -Opt outs should be specified using the `#[legacy(foo)]` attribute. -This attribute intentionally ignores unrecognized opt-outs (such as -`foo`) to allow for forwards compatibility with opt-outs that may be -added in later compiler releases (in such cases, older compilers will -naturally perform the legacy behavior). +API would trigger the deprecation lint. -Ideally, opt-outs should be constructed in as targeted a fashion as -possible. That means it is generally better, for example, to have -users opt out individual items than an entire crate at once. +Note that we should make every effort to ensure that crates which +employ this opt out can be used compatibly with crates that do not. 
#### Changes that alter dynamic semantics versus typing rules @@ -234,46 +229,6 @@ future as well. The `-Z` flags are of course explicitly unstable, but some of the `-C`, rustdoc, and linker-specific flags are expected to evolve over time (see e.g. [#24451]). -#### Other kinds of breaking changes - -From time to time, we may find a flaw in a design that is neither a -soundness concern nor a bug fix, but rather simply a suboptimal -decision. In general, it is best to find ways to correct such errors -without making breaking changes, such as improved error messages or -deprecation. However, if the impact of making the change is judged to -be negligible, we can also consider fixing the problem, presuming that -the following criteria are met (in addition to the criteria listed -above in the section "Other Factors to Consider"): - -- All data indicates that correcting this flaw will break extremely little - or no existing code (such as crates.io testing, communication with production - users of Rust or other private developers, etc). -- The feature was only recently stabilized, preferably in the previous - cycle. This minimizes the possibility that a large body of code has - crept up that relies on this feature. - - If and when we establish LTS releases, we should never make - changes to features marked as stable in a LTS release (except for - soundness reasons). -- There is no backwards compatible way to repair the problem. - -When we do decide to make such a change, we should follow one of the -following ways to ease the transition, in order of preference: - -1. When possible, issue a release that gives warnings when changes will be required. - - These warnings should be as targeted as possible -- if the warning - is too broad, it can easily cause more annoyance than the change - itself. - - To maximize the chance of these warnings being taken seriously, - the warning should not be a lint. 
The only way to disable it is to - make a change that will be forwards compatible with the new - version. - - After a suitable time has elapsed (at least one cycle, possibly - more), make the actual change. -2. If warnings are not possible, then include a `#[legacy]` attribute - that allows the old behavior to be restored. - - Ideally, this `#[legacy]` attribute will only persist for a fixed - number of cycles. - # Drawbacks The primary drawback is that making breaking changes are disruptive, @@ -281,6 +236,8 @@ even when done with the best of intentions. The alternatives list some ways that we could avoid breaking changes altogether, and the downsides of each. +## Notes on phasing + # Alternatives **Rather than simply fixing soundness bugs, we could issue new major @@ -312,6 +269,26 @@ observe on `crates.io` will be of the total breakage that will occur: it is certainly possible that all crates on `crates.io` work fine, but the change still breaks a large body of code we do not have access to. +**What attribute should we use to "opt out" of soundness changes?** +The section on breaking changes indicated that it may sometimes be +appropriate to include an "opt out" that people can use to temporarily +revert to older, unsound type rules, but did not specify precisely +what that opt-out should look like. Ideally, we would identify a +specific attribute in advance that will be used for such purposes. In +the past, we have simply created ad-hoc attributes (e.g., +`#[old_orphan_check]`), but because custom attributes are forbidden by +stable Rust, this has the unfortunate side-effect of meaning that code +which opts out of the newer rules cannot be compiled on older +compilers (even though it's using the older type system rules). If we +introduce an attribute in advance we will not have this problem. 
+ +**Are there any other circumstances in which we might perform a +breaking change?** In particular, it may happen from time to time that +we wish to alter some detail of a stable component. If we believe that +this change will not affect anyone, such a change may be worth doing, +but we'll have to work out more precise guidelines. [RFC 1156] is an +example. + [RFC 1105]: https://github.com/rust-lang/rfcs/pull/1105 [RFC 320]: https://github.com/rust-lang/rfcs/pull/320 [#744]: https://github.com/rust-lang/rfcs/issues/744 From d5284eb212c836ddf307722d10a694cc21b112f9 Mon Sep 17 00:00:00 2001 From: David Turner Date: Wed, 17 Jun 2015 12:13:37 -0400 Subject: [PATCH 0332/1195] read_exact and read_full --- text/0000-read-all.md | 57 ++++++++++++++++++++++++------------------- 1 file changed, 32 insertions(+), 25 deletions(-) diff --git a/text/0000-read-all.md b/text/0000-read-all.md index 2aebad18ca9..606b644dee5 100644 --- a/text/0000-read-all.md +++ b/text/0000-read-all.md @@ -1,4 +1,4 @@ -- Feature Name: read_all +- Feature Name: read_exact and read_full - Start Date: 2015-03-15 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) @@ -6,54 +6,61 @@ # Summary Rust's Write trait has write_all, which attempts to write an entire -buffer. This proposal adds read_all, which attempts to read a fixed -number of bytes into a given buffer. +buffer. This proposal adds two new methods, read_full and read_exact. +read_full attempts to read a fixed number of bytes into a given +buffer, and returns Ok(n) if it succeeds or in the event of EOF. +read_exact attempts to read a fixed number of bytes into a given +buffer, and returns Ok(n) if it succeeds and Err(ErrorKind::ShortRead) +if it fails. # Motivation -The new read_all method will allow programs to read from disk without -having to write their own read loops. Most Rust programs which need -to read from disk will prefer this to the plain read function. Many C -programs have the same need, and solve it the same way (e.g. 
git has -read_in_full). Here's one example of a Rust library doing this: +The new read_exact method will allow programs to read from disk +without having to write their own read loops to handle EINTR. Most +Rust programs which need to read from disk will prefer this to the +plain read function. Many C programs have the same need, and solve it +the same way (e.g. git has read_in_full). Here's one example of a +Rust library doing this: https://github.com/BurntSushi/byteorder/blob/master/src/new.rs#L184 +The read_full method is useful in the common case of implementing +buffered reads from a file or socket. In this case, a short read due +to EOF is an expected outcome, and the caller must check the number of +bytes returned. + # Detailed design -The read_all function will take a mutable, borrowed slice of u8 to +The read_full function will take a mutable, borrowed slice of u8 to read into, and will attempt to fill that entire slice with data. It will loop, calling read() once per iteration and attempting to read the remaining amount of data. If read returns EINTR, the loop will retry. If there are no more bytes to read (as signalled by a return -of Ok(0) from read()), a new error type, ErrorKind::ShortRead(usize), -will be returned. ShortRead includes the number of bytes successfully -read. In the event of another error, that error will be +of Ok(0) from read()), the number of bytes read so far +will be returned. In the event of another error, that error will be returned. After a read call returns having successfully read some bytes, the total number of bytes read will be updated. If that total -is equal to the size of the buffer, read will return successfully. +is equal to the size of the buffer, read_full will return successfully. + +The read_exact method can be implemented in terms of read_full. 
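The relationship between the two methods might be sketched as free functions over any `Read` value, as below. The names `read_full_sketch` and `read_exact_sketch` are hypothetical, and `ErrorKind::UnexpectedEof` from today's standard library stands in for the `ShortRead` kind proposed in this revision of the text.

```rust
use std::io::{self, Cursor, ErrorKind, Read};

// Keep calling `read` until the buffer is full or EOF is reached,
// retrying on `Interrupted`. Returns the number of bytes read.
fn read_full_sketch<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break, // EOF: report what we have so far
            Ok(n) => filled += n,
            Err(ref e) if e.kind() == ErrorKind::Interrupted => {} // retry
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

// Built on top of `read_full_sketch`: a short read becomes an error.
fn read_exact_sketch<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<()> {
    if read_full_sketch(reader, buf)? == buf.len() {
        Ok(())
    } else {
        Err(io::Error::new(ErrorKind::UnexpectedEof, "failed to fill whole buffer"))
    }
}

fn main() {
    // Three bytes of input, five bytes of buffer: a short read.
    let mut buf = [0u8; 5];
    let mut src = Cursor::new(vec![1u8, 2, 3]);
    assert_eq!(read_full_sketch(&mut src, &mut buf).unwrap(), 3);

    let mut src = Cursor::new(vec![1u8, 2, 3]);
    let err = read_exact_sketch(&mut src, &mut buf).unwrap_err();
    assert_eq!(err.kind(), ErrorKind::UnexpectedEof);
}
```

The caller of the first function has to compare the returned count with the buffer length, while the second function does that comparison on the caller's behalf.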
# Drawbacks The major weakness of this API (shared with write_all) is that in the event of an error, there is no way to return the number of bytes that -were successfully read before the error. But since that is the design -of write_all, it makes sense to mimic that design decision for read_all. +were successfully read before the error. But returning that data +would require a much more complicated return type, as well as +requiring more work on the part of callers. # Alternatives One alternative design would return some new kind of Result which could report the number of bytes sucessfully read before an error. -This would be inconsistent with write_all, but arguably more correct. - -If we wanted io::ErrorKind to be a smaller type, ErrorKind::ShortRead -could be unparameterized. But this would reduce the information -available to calleres. -Finally, in the event of a short read, we could return Ok(number of -bytes read before EOF) instead of an error. But then every user would -have to check for this case. And it would be inconsistent with -write_all. +If we wanted one method instead of two, ErrorKind::ShortRead could be +parameterized with the number of bytes read before EOF. But this +would increase the size of ErrorKind. Or we could leave this out, and let every Rust user write their own -read_all function -- like savages. +read_full or read_exact function, or import a crate of stuff just for +this one function. From f65d966b3bbac8705d0962a663687478c37a2378 Mon Sep 17 00:00:00 2001 From: rkjnsn Date: Wed, 17 Jun 2015 13:14:11 -0700 Subject: [PATCH 0333/1195] Modify read_full/read_exact RFC Expand detailed design to include signatures and default definitions, and make a number of other modifications. 
--- text/0000-read-all.md | 119 +++++++++++++++++++++++++++--------------- 1 file changed, 76 insertions(+), 43 deletions(-) diff --git a/text/0000-read-all.md b/text/0000-read-all.md index 606b644dee5..e3686756335 100644 --- a/text/0000-read-all.md +++ b/text/0000-read-all.md @@ -5,62 +5,95 @@ # Summary -Rust's Write trait has write_all, which attempts to write an entire -buffer. This proposal adds two new methods, read_full and read_exact. -read_full attempts to read a fixed number of bytes into a given -buffer, and returns Ok(n) if it succeeds or in the event of EOF. -read_exact attempts to read a fixed number of bytes into a given -buffer, and returns Ok(n) if it succeeds and Err(ErrorKind::ShortRead) -if it fails. +Rust's `Write` trait has `write_all`, which is a convenience method that calls +`write` repeatedly to write an entire buffer. This proposal adds two similar +convenience methods to the `Read` trait: `read_full` and `read_exact`. +`read_full` calls `read` repeatedly until the buffer has been filled, EOF has +been reached, or an error other than `Interrupted` occurs. `read_exact` is +similar to `read_full`, except that reaching EOF before filling the buffer is +considered an error. # Motivation -The new read_exact method will allow programs to read from disk -without having to write their own read loops to handle EINTR. Most -Rust programs which need to read from disk will prefer this to the -plain read function. Many C programs have the same need, and solve it -the same way (e.g. git has read_in_full). Here's one example of a -Rust library doing this: -https://github.com/BurntSushi/byteorder/blob/master/src/new.rs#L184 - -The read_full method is useful the common case of implementing -buffered reads from a file or socket. In this case, a short read due -to EOF is an expected outcome, and the caller must check the number of -bytes returned. 
+The `read` method may return fewer bytes than requested, and may fail with an
+`Interrupted` error if a signal is received during the call. This requires
+programs wishing to fill a buffer to call `read` repeatedly in a loop. This is
+a very common need, and it would be nice if this functionality were provided in
+the standard library. Many C and Rust programs have the same need, and solve it
+in the same way. For example, Git has [`read_in_full`][git], which behaves like
+the proposed `read_full`, and the Rust byteorder crate has
+[`read_full`][byteorder], which behaves like the proposed `read_exact`.
+[git]: https://github.com/git/git/blob/16da57c7c6c1fe92b32645202dd19657a89dd67d/wrapper.c#L246
+[byteorder]: https://github.com/BurntSushi/byteorder/blob/2358ace61332e59f596c9006e1344c97295fdf72/src/new.rs#L184
 
 # Detailed design
 
-The read_full function will take a mutable, borrowed slice of u8 to
-read into, and will attempt to fill that entire slice with data.
+The following methods will be added to the `Read` trait:
+
+``` rust
+fn read_full(&mut self, buf: &mut [u8]) -> Result<usize>;
+fn read_exact(&mut self, buf: &mut [u8]) -> Result<()>;
+```
+
+Additionally, default implementations of these methods will be provided:
+
+``` rust
+fn read_full(&mut self, mut buf: &mut [u8]) -> Result<usize> {
+    let mut read = 0;
+    while buf.len() > 0 {
+        match self.read(buf) {
+            Ok(0) => break,
+            Ok(n) => { read += n; let tmp = buf; buf = &mut tmp[n..]; }
+            Err(ref e) if e.kind() == ErrorKind::Interrupted => {}
+            Err(e) => return Err(e),
+        }
+    }
+    Ok(read)
+}
 
-It will loop, calling read() once per iteration and attempting to read
-the remaining amount of data. If read returns EINTR, the loop will
-retry. If there are no more bytes to read (as signalled by a return
-of Ok(0) from read()), the number of bytes read so far
-will be returned. In the event of another error, that error will be
-returned.
After a read call returns having successfully read some -bytes, the total number of bytes read will be updated. If that total -is equal to the size of the buffer, read_full will return successfully. +fn read_exact(&mut self, buf: &mut [u8]) -> Result<()> { + if try!(self.read_full(buf)) != buf.len() { + Err(Error::new(ErrorKind::UnexpectedEOF, "failed to fill whole buffer")) + } else { + Ok(()) + } +} +``` -The read_exact method can be implemented in terms of read_full. +Finally, a new `ErrorKind::UnexpectedEOF` will be introduced, which will be +returned by `read_exact` in the event of a premature EOF. # Drawbacks -The major weakness of this API (shared with write_all) is that in the -event of an error, there is no way to return the number of bytes that -were successfully read before the error. But returning that data -would require a much more complicated return type, as well as -requiring more work on the part of callers. +Like `write_all`, these APIs are lossy: in the event of an error, there is no +way to determine the number of bytes that were successfully read before the +error. However, doing so would complicate the methods, and the caller will want +to simply fail if an error occurs the vast majority of the time. Situations +that require lower level control can still use `read` directly. + +# Unanswered Questions + +Naming. Is `read_full` the best name? Should `UnexpectedEOF` instead be +`ShortRead` or `ReadZero`? # Alternatives -One alternative design would return some new kind of Result which -could report the number of bytes sucessfully read before an error. +Use a more complicated return type to allow callers to retrieve the number of +bytes successfully read before an error occurred. As explained above, this +would complicate the use of these methods for very little gain. It's worth +noting that git's `read_in_full` is similarly lossy, and just returns an error +even if some bytes have been read. 
+ +Only provide `read_exact`, but parameterize the `UnexpectedEOF` or `ShortRead` +error kind with the number of bytes read to allow it to be used in place of +`read_full`. This would be less convenient to use in cases where EOF is not an +error. -If we wanted one method instead of two, ErrorKind::ShortRead could be -parameterized with the number of bytes read before EOF. But this -would increase the size of ErrorKind. +Only provide `read_full`. This would cover most of the convenience (callers +could avoid the read loop), but callers requiring a filled buffer would have to +manually check if all of the desired bytes were read. -Or we could leave this out, and let every Rust user write their own -read_full or read_exact function, or import a crate of stuff just for -this one function. +Finally, we could leave this out, and let every Rust user needing this +functionality continue to write their own `read_full` or `read_exact` function, +or have to track down an external crate just for one straightforward and +commonly used convenience method. From b754769889f39aad409d14965a56f3b74868e245 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 19 Jun 2015 13:35:40 -0700 Subject: [PATCH 0334/1195] RFC 1156 is Adjust default object bounds --- README.md | 1 + ...bject-bounds.md => 1156-adjust-default-object-bounds.md} | 6 +++--- 2 files changed, 4 insertions(+), 3 deletions(-) rename text/{0000-adjust-default-object-bounds.md => 1156-adjust-default-object-bounds.md} (98%) diff --git a/README.md b/README.md index b59eb498ac4..a3e2cf91fa6 100644 --- a/README.md +++ b/README.md @@ -58,6 +58,7 @@ the direction the language is evolving in. 
* [1096-remove-static-assert.md](text/1096-remove-static-assert.md) * [1122-language-semver.md](text/1122-language-semver.md) * [1131-likely-intrinsic.md](text/1131-likely-intrinsic.md) +* [1156-adjust-default-object-bounds.md](text/1156-adjust-default-object-bounds.md) ## Table of Contents [Table of Contents]: #table-of-contents diff --git a/text/0000-adjust-default-object-bounds.md b/text/1156-adjust-default-object-bounds.md similarity index 98% rename from text/0000-adjust-default-object-bounds.md rename to text/1156-adjust-default-object-bounds.md index a85a7d4126e..b600f095b5b 100644 --- a/text/0000-adjust-default-object-bounds.md +++ b/text/1156-adjust-default-object-bounds.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2015-06-4 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1156 +- Rust Issue: https://github.com/rust-lang/rust/issues/26438 # Summary @@ -131,7 +131,7 @@ The rules for user-defined types from RFC 599 are altered as follows then the default is cleared and explicit lifetiem bounds are required. There are no known examples of this in the standard library as this situation arises rarely in practice.* - + # Timing and breaking change implications This is a breaking change, and hence it behooves us to evaluate the From 2e61ada06dbe2556d5db1af74600527a77ed793c Mon Sep 17 00:00:00 2001 From: Fraser Hutchison Date: Sat, 20 Jun 2015 02:26:55 +0100 Subject: [PATCH 0335/1195] Fixed broken anchor. --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index a3e2cf91fa6..8b32f51b4dd 100644 --- a/README.md +++ b/README.md @@ -173,7 +173,7 @@ to the pull request number), at which point the RFC is 'active', or reject it by closing the pull request. 
## The role of the shepherd -[The role of the shepherd]: the-role-of-the-shepherd +[The role of the shepherd]: #the-role-of-the-shepherd During triage, every RFC will either be closed or assigned a shepherd. The role of the shepherd is to move the RFC through the process. This From 5b9a48d4ba1e314ce7d4e97d06b91c4a9d6e7865 Mon Sep 17 00:00:00 2001 From: Barosl Lee Date: Wed, 24 Jun 2015 00:28:28 +0900 Subject: [PATCH 0336/1195] Mention the usage in the standard library --- text/0000-rename-connect-to-join.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/text/0000-rename-connect-to-join.md b/text/0000-rename-connect-to-join.md index 68e3782883a..0ed07ade10c 100644 --- a/text/0000-rename-connect-to-join.md +++ b/text/0000-rename-connect-to-join.md @@ -41,6 +41,10 @@ all functional-ish languages. Note that Rust also has `.concat()` in `SliceConcatExt`, which is a specialized version of `.connect()` that uses an empty string as a separator. +Another reason is that the term "join" already has similar usage in the standard +library. There are `std::path::Path::join` and `std::env::join_paths` which are +used to join the paths. + # Detailed design While the `SliceConcatExt` trait is unstable, the `.connect()` method itself is From 9bfb567af36e26c5e17b6b16325937937c710417 Mon Sep 17 00:00:00 2001 From: Tshepang Lekhonkhobe Date: Tue, 23 Jun 2015 20:55:28 +0200 Subject: [PATCH 0337/1195] fix markup, and use actual links --- text/0505-api-comment-conventions.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/0505-api-comment-conventions.md b/text/0505-api-comment-conventions.md index 0908db572e5..9db62ad3f76 100644 --- a/text/0505-api-comment-conventions.md +++ b/text/0505-api-comment-conventions.md @@ -113,15 +113,15 @@ it's important to mark what is not Rust so your tests don't fail. References and citation should be linked 'reference style.' 
Prefer ``` -[some paper][something] +[Rust website][1] -[something]: http://www.foo.edu/something.pdf) +[1]: http://www.rust-lang.org ``` to ``` -[some paper][http://www.foo.edu/something.pdf] +[Rust website](http://www.rust-lang.org) ``` ## English From 605af898330fb8f615ef87f2cb3fa69cead52677 Mon Sep 17 00:00:00 2001 From: Ivan Petkov Date: Wed, 24 Jun 2015 11:24:19 -0700 Subject: [PATCH 0338/1195] RFC for creation of `IntoRaw{Fd, Socket, Handle}` trait to complement `AsRaw*` --- text/0000-into-raw-fd-socket-handle-traits.md | 68 +++++++++++++++++++ 1 file changed, 68 insertions(+) create mode 100644 text/0000-into-raw-fd-socket-handle-traits.md diff --git a/text/0000-into-raw-fd-socket-handle-traits.md b/text/0000-into-raw-fd-socket-handle-traits.md new file mode 100644 index 00000000000..a92e175c5dd --- /dev/null +++ b/text/0000-into-raw-fd-socket-handle-traits.md @@ -0,0 +1,68 @@ +- Feature Name: into-raw-fd-socket-handle-traits +- Start Date: 2015-06-24 +- RFC PR: +- Rust Issue: + +# Summary + +Introduce and implement `IntoRaw{Fd, Socket, Handle}` traits to complement the +existing `AsRaw{Fd, Socket, Handle}` traits already in the standard library. + +# Motivation + +The `FromRaw{Fd, Socket, Handle}` traits each take ownership of the provided +handle, however, the `AsRaw{Fd, Socket, Handle}` traits do not give up +ownership. Thus, converting from one handle wrapper to another (for example +converting an open `fs::File` to a `process::Stdio`) requires the caller to +either manually `dup` the handle, or `mem::forget` the wrapper, which +is unergonomic and can be prone to mistakes. + +Traits such as `IntoRaw{Fd, Socket, Handle}` will allow for easily transferring +ownership of OS handles, and it will allow wrappers to perform any +cleanup/setup as they find necessary. 
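The ownership hand-off motivated above can be sketched without touching real OS handles. Everything in the block below is a stand-in invented for illustration: `RawHandle` is a plain integer, `Owner` is a mock wrapper whose `Drop` plays the role of closing the handle, and `AsRawSketch` / `IntoRawSketch` are hypothetical traits that only mimic the shape of the `AsRaw*` and proposed `IntoRaw*` traits.

```rust
use std::cell::Cell;
use std::mem;
use std::rc::Rc;

// Stand-in for an OS handle type such as a file descriptor.
type RawHandle = i32;

// Mock wrapper that "closes" its handle when dropped.
struct Owner {
    raw: RawHandle,
    closed: Rc<Cell<bool>>,
}

impl Drop for Owner {
    fn drop(&mut self) {
        self.closed.set(true);
    }
}

// Borrowing access: the wrapper keeps ownership and still closes on drop.
trait AsRawSketch {
    fn as_raw(&self) -> RawHandle;
}

// Consuming access: ownership of the handle moves to the caller.
trait IntoRawSketch {
    fn into_raw(self) -> RawHandle;
}

impl AsRawSketch for Owner {
    fn as_raw(&self) -> RawHandle {
        self.raw
    }
}

impl IntoRawSketch for Owner {
    fn into_raw(self) -> RawHandle {
        let raw = self.raw;
        // Skip `Drop` so the handle is not closed behind the caller's back.
        mem::forget(self);
        raw
    }
}

// Hypothetical helper: reports whether the handle was closed after the
// wrapper was either consumed (`true`) or merely borrowed and dropped.
fn handle_closed_after(consume: bool) -> bool {
    let closed = Rc::new(Cell::new(false));
    let owner = Owner { raw: 3, closed: closed.clone() };
    if consume {
        let _raw = owner.into_raw();
    } else {
        let _raw = owner.as_raw();
        drop(owner);
    }
    closed.get()
}

fn main() {
    assert!(!handle_closed_after(true)); // consumed: handle stays open
    assert!(handle_closed_after(false)); // borrowed: wrapper closes it
}
```

Putting the `mem::forget` call inside the wrapper's own implementation is one way to keep that step out of caller code, which is the ergonomic point the motivation makes.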
+ +# Detailed design + +The `IntoRaw{Fd, Socket, Handle}` traits will behave exactly like their +`AsRaw{Fd, Socket, Handle}` counterparts, except they will consume the wrapper +before transferring ownership of the handle. + +Note that these traits should **not** have a blanket implementation over `T: +AsRaw{Fd, Socket, Handle}`: these traits should be opt-in so that implementors +can decide if leaking through `mem::forget` is acceptable or another course of +action is required. + +```rust +// Unix +pub trait IntoRawFd { + fn into_raw_fd(self) -> RawFd; +} + +// Windows +pub trait IntoRawSocket { + fn into_raw_socket(self) -> RawSocket; +} + +// Windows +pub trait IntoRawHandle { + fn into_raw_handle(self) -> RawHandle; +} +``` + +# Drawbacks + +This adds three new traits and methods which would have to be maintained. + +# Alternatives + +Instead of defining three new traits we could instead use the +`std::convert::Into` trait over the different OS handles. However, this +approach will not offer a duality between methods such as +`as_raw_fd()`/`into_raw_fd()`, but will instead be `as_raw_fd()`/`into()`. + +Another possibility is defining both the newly proposed traits as well as the +`Into` trait over the OS handles letting the caller choose what they prefer. + +# Unresolved questions + +None at the moment. From 566bb704942f68645d992cbaac36e4b7eb635f80 Mon Sep 17 00:00:00 2001 From: P1start Date: Sat, 27 Jun 2015 15:57:35 +1200 Subject: [PATCH 0339/1195] Remove `str::{windows,chunks}` and `<[T]>::subslice_offset` --- text/0000-slice-string-symmetry.md | 63 +++++++++++++++--------------- 1 file changed, 32 insertions(+), 31 deletions(-) diff --git a/text/0000-slice-string-symmetry.md b/text/0000-slice-string-symmetry.md index e6cbf9490e5..d7cc405275a 100644 --- a/text/0000-slice-string-symmetry.md +++ b/text/0000-slice-string-symmetry.md @@ -5,14 +5,11 @@ # Summary -Add some methods that already exist on slices to strings and vice versa. 
-Specifically, the following methods should be added: +Add some methods that already exist on slices to strings. Specifically, the +following methods should be added: -- `str::chunks` -- `str::windows` - `str::into_string` -- `String::into_boxed_slice` -- `<[T]>::subslice_offset` +- `String::into_boxed_str` # Motivation @@ -20,48 +17,52 @@ Conceptually, strings and slices are similar types. Many methods are already shared between the two types due to their similarity. However, not all methods are shared between the types, even though many could be. This is a little unexpected and inconsistent. Because of that, this RFC proposes to remedy this -by adding a few methods to both strings and slices to even out these two types’ -available methods. +by adding a few methods to strings to even out these two types’ available +methods. -# Detailed design - -Add the following methods to `str`, presumably as inherent methods: +Specifically, it is currently very difficult to construct a `Box<str>`, while it +is fairly simple to make a `Box<[T]>` by using `Vec::into_boxed_slice`. This RFC +proposes a means of creating a `Box<str>` by converting a `String`. -- `chunks(&self, n: usize) -> Chunks`: Returns an iterator that yields the - *characters* (not bytes) of the string in groups of `n` at a time. Iterator - element type: `&str`. +# Detailed design -- `windows(&self, n: usize) -> Windows`: Returns an iterator over all contiguous - windows of character length `n`. Iterator element type: `&str`. +Add the following method to `str`, presumably as an inherent method: - `into_string(self: Box<str>) -> String`: Returns `self` as a `String`. This is equivalent to `[T]`’s `into_vec`. -`split_at(&self, mid: usize) -> (&str, &str)` would also be on this list, but -there is [an existing RFC](https://github.com/rust-lang/rfcs/pull/1123) for it. 
- Add the following method to `String` as an inherent method: -- `into_boxed_slice(self) -> Box<str>`: Returns `self` as a `Box<str>`, +- `into_boxed_str(self) -> Box<str>`: Returns `self` as a `Box<str>`, reallocating to cut off any excess capacity if needed. This is required to - provide a safe means of creating `Box<str>`. - -Add the following method to `[T]` (for all `T`), presumably as an inherent -method: + provide a safe means of creating `Box<str>`. This is equivalent to `Vec`’s + `into_boxed_slice`. -- `subslice_offset(&self, inner: &[T]) -> usize`: Returns the offset (in - elements) of an inner slice relative to an outer slice. Panics of `inner` is - not contained within `self`. # Drawbacks -- `str::subslice_offset` is already unstable, so creating a similar method on - `[T]` is perhaps not such a good idea. +None, yet. # Alternatives -- Do a subset of the proposal. For example, the `Box<str>`-related methods could - be removed. +- The original version of this RFC had a few extra methods: + - `str::chunks(&self, n: usize) -> Chunks`: Returns an iterator that yields + the *characters* (not bytes) of the string in groups of `n` at a time. + Iterator element type: `&str`. + + - `str::windows(&self, n: usize) -> Windows`: Returns an iterator over all + contiguous windows of character length `n`. Iterator element type: `&str`. + + This and `str::chunks` aren’t really useful without proper treatment of + graphemes, so they were removed from the RFC. + + - `<[T]>::subslice_offset(&self, inner: &[T]) -> usize`: Returns the offset + (in elements) of an inner slice relative to an outer slice. Panics if + `inner` is not contained within `self`. + + `str::subslice_offset` isn’t yet stable and its usefulness is dubious, so + this method was removed from the RFC.
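The two conversions proposed in the detailed design mirror the existing slice conversions. As a rough sketch, assuming the methods are available under the proposed names (`String::into_boxed_str` and `str::into_string`), a round trip might look like the following. The helper names `shrink`, `round_trip`, and `slice_round_trip` are invented for illustration and are not part of the proposal.

```rust
/// Converts a `String` into a `Box<str>`, dropping any excess capacity.
/// This is the string analogue of `Vec::into_boxed_slice`.
fn shrink(s: String) -> Box<str> {
    s.into_boxed_str()
}

/// Converts a `String` to a `Box<str>` and back again.
/// `into_string` is the string analogue of `<[T]>::into_vec`.
fn round_trip(s: String) -> String {
    shrink(s).into_string()
}

/// The slice-side conversions that the proposal takes as its model.
fn slice_round_trip(v: Vec<u8>) -> Vec<u8> {
    v.into_boxed_slice().into_vec()
}
```

For example, `round_trip(String::from("hello"))` yields a `String` equal to `"hello"`, and the boxed result of `shrink` has a length equal to the string's length regardless of the capacity the `String` was created with.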
+ # Unresolved questions From cbbc7acf74bac5ea7714cd2f2bac284f020b0e1e Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Sat, 27 Jun 2015 23:50:10 -0700 Subject: [PATCH 0340/1195] RFC: Allow changing the default allocator Add support to the compiler to override the default allocator, allowing a different allocator to be used by default in Rust programs. Additionally, also switch the default allocator for dynamic libraries and static libraries to using the system malloc instead of jemalloc. --- text/0000-swap-out-jemalloc.md | 235 +++++++++++++++++++++++++++++++++ 1 file changed, 235 insertions(+) create mode 100644 text/0000-swap-out-jemalloc.md diff --git a/text/0000-swap-out-jemalloc.md b/text/0000-swap-out-jemalloc.md new file mode 100644 index 00000000000..737dd905e7c --- /dev/null +++ b/text/0000-swap-out-jemalloc.md @@ -0,0 +1,235 @@ +- Feature Name: `allocator` +- Start Date: 2015-06-27 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Add support to the compiler to override the default allocator, allowing a +different allocator to be used by default in Rust programs. Additionally, also +switch the default allocator for dynamic libraries and static libraries to using +the system malloc instead of jemalloc. + +# Motivation + +Note that this issue was [discussed quite a bit][babysteps] in the past, and +the meat of this RFC draws from Niko's post. + +[babysteps]: http://smallcultfollowing.com/babysteps/blog/2014/11/14/allocators-in-rust/ + +Currently all Rust programs by default use jemalloc for an allocator because it +is a fairly reasonable default as it is commonly much faster than the default +system allocator. This is not desirable, however, when embedding Rust code into +other runtimes. Using jemalloc implies that Rust will be using one allocator +while the host application (e.g. Ruby, Firefox, etc) will be using a separate +allocator. 
Having two allocators in one process generally hurts performance and +is not recommended, so the Rust toolchain needs to provide a method to configure +the allocator. + +In addition to using an entirely separate allocator altogether, some Rust +programs may want to simply instrument allocations or shim in additional +functionality (such as memory tracking statistics). This is currently quite +difficult to do, and would be accommodated with a custom allocation scheme. + +# Detailed design + +The high level design can be found [in this gist][gist], but this RFC intends to +expound on the idea to make it more concrete in terms of what the compiler +implementation will look like. A [sample implementation][impl] of this section +is available. + +[gist]: https://gist.github.com/alexcrichton/41c6aad500e56f49abda +[impl]: https://github.com/alexcrichton/rust/tree/less-jemalloc + +### High level design + +The design of this RFC from 10,000 feet (referred to below), which was +[previously outlined][gist], looks like: + +1. Define a set of symbols which correspond to the APIs specified in + `alloc::heap`. The `liballoc` library will call these symbols directly. + Note that this means that each of the symbols takes information like the size + of allocations and such. +2. Create two shim libraries which implement these allocation-related functions. + Each shim is shipped with the compiler in the form of a static library. One + shim will redirect to the system allocator, the other shim will bundle a + jemalloc build along with Rust shims to redirect to jemalloc. +3. Intermediate artifacts (rlibs) do not resolve this dependency, they're just + left dangling. +4. When producing a "final artifact", rustc by default links in one of two + shims: + * If we're producing a staticlib or a dylib, link the system shim. + * If we're producing an exe and all dependencies are rlibs link the + jemalloc shim.
+ +The final link step will be optional, and one could link in any compliant +allocator at that time if so desired. + +### New Attributes + +Two new **unstable** attributes will be added to the compiler: + +* `#![needs_allocator]` indicates that a library requires the "allocation + symbols" to link successfully. This attribute will be attached to `liballoc` + and no other library should need to be tagged as such. Additionally, most + crates don't need to worry about this attribute as they'll transitively link + to liballoc. +* `#![allocator]` indicates that a crate is an allocator crate. This is + currently also used for tagging FFI functions as an "allocation function" + to leverage more LLVM optimizations as well. + +All crates implementing the Rust allocation API must be tagged with +`#![allocator]` to get properly recognized and handled. + +### New Crates + +Two new **unstable** crates will be added to the standard distribution: + +* `alloc_system` is a crate that will be tagged with `#![allocator]` and will + redirect allocation requests to the system allocator. +* `alloc_jemalloc` is another allocator crate that will bundle a static copy of + jemalloc to redirect allocations to. + +Both crates will be available to link to manually, but they will not be +available in stable Rust to start out. + +### Allocation functions + +Each crate tagged `#![allocator]` is expected to provide the full suite of +allocation functions used by Rust, defined as: + +```rust +extern { + fn __rust_allocate(size: usize, align: usize) -> *mut u8; + fn __rust_deallocate(ptr: *mut u8, old_size: usize, align: usize); + fn __rust_reallocate(ptr: *mut u8, old_size: usize, size: usize, + align: usize) -> *mut u8; + fn __rust_reallocate_inplace(ptr: *mut u8, old_size: usize, size: usize, + align: usize) -> usize; + fn __rust_usable_size(size: usize, align: usize) -> usize; +} +``` + +The exact API of all these symbols is considered **unstable** (hence the +leading `__`). 
This otherwise currently maps to what `liballoc` expects today. +The compiler will not currently typecheck `#![allocator]` crates to ensure +these symbols are defined and have the correct signature. + +Also note that to define the above API in a Rust crate it would look something +like: + +```rust +#[no_mangle] +pub extern fn __rust_allocate(size: usize, align: usize) -> *mut u8 { + /* ... */ +} +``` + +### Limitations of `#![allocator]` + +Allocator crates (those tagged with `#![allocator]`) are not allowed to +transitively depend on a crate which is tagged with `#![needs_allocator]`. This +would introduce a circular dependency which is difficult to link and is highly +likely to otherwise just lead to infinite recursion. + +The compiler will also not immediately verify that crates tagged with +`#![allocator]` do indeed define an appropriate allocation API, and vice versa +if a crate defines an allocation API the compiler will not verify that it is +tagged with `#![allocator]`. This means that the only meaning `#![allocator]` +has to the compiler is to signal that the default allocator should not be +linked. + +### Default allocator specifications + +Target specifications will be extended with two keys: `lib_allocation_crate` +and `exe_allocation_crate`, describing the default allocator crate for these +two kinds of artifacts for each target. The compiler will by default have all +targets redirect to `alloc_system` for both scenarios, but `alloc_jemalloc` will +be used for binaries on OSX, Bitrig, DragonFly, FreeBSD, Linux, OpenBSD, and GNU +Windows. MSVC will notably **not** use jemalloc by default for binaries (we +don't currently build jemalloc on MSVC). + +### Injecting an allocator + +As described above, the compiler will inject an allocator if necessary into the +current compilation. 
The compiler, however, cannot blindly do so as it can +easily lead to link errors (or worse, two allocators), so it will have some +heuristics for only injecting an allocator when necessary. The steps taken by +the compiler for any particular compilation will be: + +* If no crate in the dependency graph is tagged with `#![needs_allocator]`, then + the compiler does not inject an allocator. +* If only an rlib is being produced, no allocator is injected. +* If any crate tagged with `#[allocator]` has been explicitly linked to (e.g. + via an `extern crate` statement directly or transitively) then no allocator is + injected. +* If two allocators have been linked to explicitly an error is generated. +* If only a binary is being produced, then the target's `exe_allocation_crate` + value is injected, otherwise the `lib_allocation_crate` is injected. + +The compiler will also record that the injected crate is injected, so later +compilations know that rlibs don't actually require the injected crate at +runtime (allowing it to be overridden). + +### Allocators in practice + +Most libraries written in Rust wouldn't interact with the scheme proposed in +this RFC at all as they wouldn't explicitly link with an allocator and generally +are compiled as rlibs. If a Rust dynamic library is used as a dependency, then +its original choice of allocator is propagated throughout the crate graph, but +this rarely happens (except for the compiler itself, which will continue to use +jemalloc). + +Authors of crates which are embedded into other runtimes will start using the +system allocator by default with no extra annotation needed. If they wish to +funnel Rust allocations to the same source as the host application's allocations +then a crate can be written and linked in. + +Finally, providers of allocators will simply provide a crate to do so, and then +applications and/or libraries can make explicit use of the allocator by +depending on it as usual. 
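The injection steps listed under "Injecting an allocator" can be read as a small decision procedure. The following is only a sketch of that reading, not the compiler's implementation: the types `Artifact`, `Injection`, and `Target`, and the function `choose_allocator`, are hypothetical names introduced here, and the checks are applied in the order in which the steps are listed.

```rust
/// Kind of artifact being produced (hypothetical type for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Artifact {
    Rlib,
    Exe,
    Dylib,
    Staticlib,
}

/// Outcome of the injection decision (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
enum Injection {
    /// Nothing is injected.
    None,
    /// The named allocator crate is injected.
    Inject(&'static str),
    /// The compilation is rejected with the given message.
    Error(&'static str),
}

/// The two target-specification keys described in the RFC.
struct Target {
    exe_allocation_crate: &'static str,
    lib_allocation_crate: &'static str,
}

/// Applies the listed steps in order.
/// `explicit_allocators` counts the `#![allocator]` crates that were linked
/// explicitly, either directly or transitively.
fn choose_allocator(
    needs_allocator: bool,
    outputs: &[Artifact],
    explicit_allocators: usize,
    target: &Target,
) -> Injection {
    // No crate in the graph is tagged `#![needs_allocator]`.
    if !needs_allocator {
        return Injection::None;
    }
    // Only rlibs are being produced, so the dependency is left dangling.
    if outputs.iter().all(|a| *a == Artifact::Rlib) {
        return Injection::None;
    }
    // An explicitly linked allocator wins; two of them is an error.
    match explicit_allocators {
        0 => {}
        1 => return Injection::None,
        _ => return Injection::Error("multiple allocator crates linked"),
    }
    // Only a binary is being produced: use the exe default, otherwise the lib default.
    if outputs.iter().all(|a| *a == Artifact::Exe) {
        Injection::Inject(target.exe_allocation_crate)
    } else {
        Injection::Inject(target.lib_allocation_crate)
    }
}
```

With a target whose keys are `alloc_jemalloc` for executables and `alloc_system` for libraries, a plain executable would get `alloc_jemalloc`, while a staticlib, or an executable built together with a dylib, would get `alloc_system`.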
+ +# Drawbacks + +A significant amount of API surface area is being added to the compiler and +standard distribution as part of this RFC, but it is possible for it to all +enter as `#[unstable]`, so we can take our time stabilizing it and perhaps only +stabilize a subset over time. + +The limitation of an allocator crate not being able to link to the standard +library (or libcollections) may be a somewhat significant hit to the ergonomics +of defining an allocator, but allocators are traditionally a very niche class of +library and end up defining their own data structures regardless. + +Libraries on crates.io may accidentally link to an allocator and not actually +use any specific API from it (other than the standard allocation symbols), +forcing transitive dependants to silently use that allocator. + +This RFC does not specify the ability to swap out the allocator via the command +line, which is certainly possible and sometimes more convenient than modifying +the source itself. + +It's possible to define an allocator API (e.g. define the symbols) but then +forget the `#![allocator]` annotation, causing the compiler to wind up linking +two allocators, which may cause link errors that are difficult to debug. + +# Alternatives + +The compiler's knowledge about allocators could be simplified quite a bit to the +point where a compiler flag is used to just turn injection on/off, and then it's +the responsibility of the application to define the necessary symbols if the +flag is turned off. The current implementation of this RFC, however, is not seen +as overly invasive and the benefits of "everything's just a crate" seems worth +it for the mild amount of complexity in the compiler. + +Many of the names (such as `alloc_system`) have a number of alternatives, and +the naming of attributes and functions could perhaps follow a stronger +convention. + +# Unresolved questions + +Does this enable jemalloc to be built without a prefix on Linux? 
This would +enable us to direct LLVM allocations to jemalloc, which would be quite nice! + +Should BSD-like systems use Rust's jemalloc by default? Many of them have +jemalloc as the system allocator and even the special APIs we use from jemalloc. From 2c32edae9fc10914f8483a059863c297a105e0de Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Sat, 27 Jun 2015 22:54:39 -0700 Subject: [PATCH 0341/1195] RFC: Stabilize the #![no_std] attribute Stabilize the `#![no_std]` attribute while also improving the ergonomics of using libcore by default. Additionally add a new `#![no_core]` attribute to opt out of linking to libcore. Finally, stabilize a number of language items required by libcore which the standard library defines. --- text/0000-stabilize-no_std.md | 267 ++++++++++++++++++++++++++++++++++ 1 file changed, 267 insertions(+) create mode 100644 text/0000-stabilize-no_std.md diff --git a/text/0000-stabilize-no_std.md b/text/0000-stabilize-no_std.md new file mode 100644 index 00000000000..80aea6ac0fa --- /dev/null +++ b/text/0000-stabilize-no_std.md @@ -0,0 +1,267 @@ +- Feature Name: N/A +- Start Date: 2015-06-26 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Stabilize the `#![no_std]` attribute, add a new `#![no_core]` attribute, and +start stabilizing the libcore library. + +# Motivation + +Currently all stable Rust programs must link to the standard library (libstd), +and it is impossible to opt out of this. The standard library is not appropriate +for use cases such as kernels, embedded development, or some various niche cases +in userspace. For these applications Rust itself is appropriate, but the +compiler does not provide a stable interface compiling in this mode. + +The standard distribution provides a library, libcore, which is "the essence of +Rust" as it provides many language features such as iterators, slice methods, +string methods, etc. 
The defining feature of libcore is that it has 0 +dependencies, unlike the standard library which depends on many I/O APIs, for +example. The purpose of this RFC is to provide a stable method to access +libcore. + +Applications which do not want to use libstd still want to use libcore 99% of +the time, but unfortunately the current `#![no_std]` attribute does not do a +great job in facilitating this. When moving into the realm of not using the +standard library, the compiler should make the use case as ergonomic as +possible, so this RFC proposes different behavior than today's `#![no_std]`. + +Finally, the standard library defines a number of language items which must be +defined when libstd is not used. These language items are: + +* `panic_fmt` +* `eh_personality` +* `stack_exhausted` + +To be able to usefully leverage `#![no_std]` in stable Rust these lang items +must be available in a stable fashion. + +# Detailed Design + +This RFC proposes a number of changes: + +* Stabilize the `#![no_std]` attribute after tweaking its behavior slightly. +* Stabilize a `#![no_core]` attribute. +* Stabilize the name "core" in libcore. +* Stabilize language items required by the core library. + +## `no_std` + +The `#![no_std]` attribute currently provides two pieces of functionality: + +* The compiler no longer injects `extern crate std` at the top of a crate. +* The prelude (`use std::prelude::v1::*`) is no longer injected at the top of + every module. + +This RFC proposes adding the following behavior to the `#![no_std]` attribute: + +* The compiler will inject `extern crate core` at the top of a crate. +* The libcore prelude will be injected at the top of every module. + +Most uses of `#![no_std]` already want behavior along these lines as they want +to use libcore, just not the standard library.
+ +## `no_core` + +A new attribute will be added to the compiler, `#![no_core]`, which serves two +purposes: + +* This attribute implies the `#![no_std]` attribute (no std prelude/crate + injection). +* This attribute will prevent core prelude/crate injection. + +Users of `#![no_std]` today who do *not* use libcore would migrate to using +this attribute instead of `#![no_std]`. + +## Stabilization of libcore + +This RFC does not yet propose a stabilization path for the contents of libcore, +but it proposes stabilizing the name `core` for libcore, paving the way for the +rest of the library to be stabilized. The exact method of stabilizing its +contents will be determined with a future RFC or pull requests. + +## Stabilizing lang items + +This section will describe the purpose for each lang item currently required in +addition to the interface that it will be stabilized with. Each lang item will +no longer be defined with the `#[lang = "..."]` syntax but will instead receive +a dedicated attribute (e.g. `#[panic_fmt]`) to be attached to functions to +identify an implementation. + +Like lang items each of these will only allow one implementor in any crate +dependency graph which will be verified at compile time. Also like today, none +of these lang items will be required unless a static library, dynamic library, +or executable is being produced. In other words, libraries (rlibs) do not need +to (and probably should not) define these items. + +#### `panic_fmt` + +This lang item is the definition of how to panic in Rust. The standard library +defines this by throwing an exception (in a platform-specific manner), but users +of libcore often want to define their own meaning of panicking. The signature of +this function will be: + +```rust +#[panic_fmt] +pub extern fn panic_fmt(msg: &core::fmt::Arguments) -> !; +``` + +This differs from the `panic_fmt` function today in that the file and line +number arguments are omitted.
The libcore library will continue to provide +file/line number information in panics (as it does today) by assembling a new +`core::fmt::Arguments` value which uses the old one and appends the file/line +information. + +This signature also differs from today's implementation by taking a `&Arguments` +instead of taking it by value, and the purpose of this is to ensure that the +function has a clearly defined ABI on all platforms in case that is required. + +#### `eh_personality` + +The compiler will continue to compile libcore with landing pads (e.g. cleanup to +run on panics), and a "personality function" is required by LLVM to be available +to call for each landing pad. In the current implementation of panicking, a +personality function is typically just calling a standard personality function +in libgcc (or in MSVC's CRT), but the purpose is to indicate whether an +exception should be caught or whether cleanup should be run for this particular +landing pad and exception combination. + +The exact signature of this function is quite platform-specific, but many users +of libcore will never actually call this function as exceptions will not be +thrown (many will likely compile with `-Z no-landing-pads` anyway). As a result +the signature of this lang item will not be defined, but instead it will simply +be required to be defined (as libcore will reference the symbol name +regardless). + +```rust +#[eh_personality] +pub extern fn eh_personality(...) -> ...; +``` + +The compiler will not check the signature of this function, but it will assign +it a known symbol so libcore can be successfully linked. + +#### `stack_exhausted` + +The current implementation of stack overflow in the compiler is to use LLVM's +segmented stack support, inserting a prologue to every function in an object +file to detect when a stack overflow occurred. 
When a stack overflow is +detected, LLVM emits code that will call the symbol `__morestack`, which the +Rust distribution provides an implementation of. Our implementation, however, +then in turn calls this `stack_exhausted` language item to define the +implementation of what happens on stack overflow. + +The compiler therefore needs to ensure that this lang item is present in order +for libcore to be correctly linked, so the lang item will have the following +signature: + +```rust +#[stack_exhausted] +pub extern fn stack_exhausted() -> !; +``` + +The compiler will control the symbol name and visibility of this function. + +# Drawbacks + +The current distribution provides precisely one library, the standard library, +for general consumption of Rust programs. Adding a new one (libcore) is adding +more surface area to the distribution (in addition to adding a new `#![no_core]` +attribute). This surface area is greatly desired, however. + +When using `#![no_std]` the experience of Rust programs isn't always the best as +there are some pitfalls that can be run into easily. For example, macros and +plugins sometimes hardcode `::std` paths, but most of those in the standard +distribution have been updated to use `::core` in the case that `#![no_std]` is +present. Another example is that common utilities like vectors, pointers, and +owned strings are not available without liballoc, which will remain an unstable +library. This means that users of `#![no_std]` will have to reimplement all of +this functionality themselves. + +This RFC does not yet pave a way forward for using `#![no_std]` and producing an +executable because the `#[start]` item is required, but remains feature gated. +This RFC just enables creation of Rust static or dynamic libraries which don't +depend on the standard library in addition to Rust libraries (rlibs) which do +not depend on the standard library.
+ +On the topic of lang item stabilization, it's likely expected that the +`panic_fmt` lang item must be defined, but the other two, `eh_personality` and +`stack_exhausted` are generally quite surprising. Code using `#![no_std]` is +also likely to very rarely actually make use of these functions: + +* Most no-std contexts don't throw exceptions (or don't have exceptions), so + they either have stubs that panic or just compile with `-Z no-landing-pads`, + so the `eh_personality` may not strictly be necessary to be defined in order + to link against libcore. +* Additionally, most no-std contexts don't actually set up stack overflow + detection, so the `stack_exhausted` function will either never be compiled or + the crates are compiled with `-C no-stack-check` meaning that the item may not + strictly be necessary to be defined. + +Currently, however, a binary distribution of libcore is provided which is +compiled with unwinding and stack overflow checks enabled. Consequently the +libcore library does indeed depend on these two symbols and require these items +to be defined. It is seen as not-that-large of a drawback for the following +reasons: + +* The functions `eh_personality` and `stack_exhausted` are fairly easy to + define, and are only required by end products (not Rust libraries). +* It's easy for the compiler to *stop* requiring these functions to be defined + in the future if we, for example, provide multiple binary copies of libcore in + the standard distribution. + +The final drawback of this RFC is the overall stabilization of the `#![no_std]` +attribute, meaning that the compiler will no longer be able to make assumptions +in the future about a function being defined. Put another way, the `panic_fmt`, +`eh_personality`, and `stack_exhausted` lang items are the only three that will +ever be able to be required to be defined by downstream crates. 
This is not seen +as too strong of a drawback as it's not clear that the compiler will need to +assume more functions exist. Additionally, the compiler will likely be able to +provide or emit a stub implementation for any future symbol it does need to +exist. + +# Alternatives + +Most of the strategies taken in this RFC have some minor variations on what can +happen: + +* The `#![no_std]` attribute could be stabilized as-is without adding a + `#![no_core]` attribute, requiring users to write `extern crate core` and + import the core prelude manually. The burden of adding `#![no_core]` to the + compiler, however, is seen as not-too-bad compared to the increase in + ergonomics of using `#![no_std]`. +* The language items could continue to use the same `#[lang = "..."]` syntax and + we could just stabilize a subset of the `#[lang]` items. It seems more + consistent, however, to blanket feature-gate all `#[lang]` attributes instead + of allowing three particular ones, so individual attributes are proposed. +* The `panic_fmt` lang item could retain the same signature today, but it has an + unclear ABI (passing `Arguments` by value) and we may not want to 100% commit + to always passing filename/line information on panics. +* The `eh_personality` and `stack_exhausted` lang items could not be required to + be defined, and the compiler could provide aborting stubs to be linked in if + they aren't defined anywhere else. +* The compiler could not require `eh_personality` or `stack_exhausted` if no + crate in the dependency tree has landing pads enabled or stack overflow checks + enabled. This is quite a difficult situation to get into today, however, as + the libcore distribution always has these enabled and Cargo does not easily + provide a method to configure this when compiling crates. The overhead of + defining these functions seems small and because the compiler could stop + requiring them in the future it seems plausibly ok to require them today. 
+* A `#[lang_items_abort]` attribute could be added to explicitly define the + `eh_personality` and `stack_exhausted` lang items to immediately abort. This + would avoid us having to stabilize their signatures as we could stabilize just + this attribute and not their definitions. + +# Unresolved Questions + +* How important/common are `#![no_std]` executables? Should this RFC attempt to + stabilize that as well? +* When a staticlib is emitted should the compiler *guarantee* that a + `#![no_std]` one will link by default? This precludes us from ever adding + future required language items for features like unwinding or stack exhaustion + by default. For example if a new security feature is added to LLVM and we'd + like to enable it by default, it may require that a symbol or two is defined + somewhere in the compilation. From 41ae6e6a58cb083eabd56675ebc2e8bab102feef Mon Sep 17 00:00:00 2001 From: Kevin Ballard Date: Tue, 30 Jun 2015 18:54:14 -0700 Subject: [PATCH 0342/1195] Update with review feedback --- text/0000-slice-tail-redesign.md | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/text/0000-slice-tail-redesign.md b/text/0000-slice-tail-redesign.md index 86a2ce5eb0d..b428cca7b9c 100644 --- a/text/0000-slice-tail-redesign.md +++ b/text/0000-slice-tail-redesign.md @@ -40,9 +40,9 @@ fn shift_last_mut(&mut self) -> Option<(&mut T, &mut [T])>; Existing code using `tail()` or `init()` could be translated as follows: -* `slice.tail()` becomes `slice.shift_first().unwrap().1` or `&slice[1..]` -* `slice.init()` becomes `slice.shift_last().unwrap().1` or - `&slice[..slice.len()-1]` +* `slice.tail()` becomes `&slice[1..]` +* `slice.init()` becomes `&slice[..slice.len()-1]` or + `slice.shift_last().unwrap().1` It is expected that a lot of code using `tail()` or `init()` is already either testing `len()` explicitly or using `first()` / `last()` and could be refactored @@ -78,10 +78,12 @@ let (argv0, args_) = args.shift_first().unwrap();
# Drawbacks -The expression `slice.shift_last().unwrap.1` is more cumbersome than +The expression `slice.shift_last().unwrap().1` is more cumbersome than `slice.init()`. However, this is primarily due to the need for `.unwrap()` rather than the need for `.1`, and would affect the more conservative solution -(of making the return type `Option<&[T]>`) as well. +(of making the return type `Option<&[T]>`) as well. Furthermore, the more +idiomatic translation is `&slice[..slice.len()-1]`, which can be used any time +the slice is already stored in a local variable. # Alternatives @@ -90,6 +92,10 @@ more conservative change mentioned above. It still has the same drawback of requiring `.unwrap()` when translating existing code. And it's unclear what the function names should be (the current names are considered suboptimal). +Just deprecate the current methods without adding replacements. This gets rid of +the odd methods today, but it doesn't do anything to make it easier to safely +perform these operations. + # Unresolved questions Is the name correct? 
There's precedent in this name in the form of From 03b6bc754631eff13a032745787fa2a5bb861758 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 30 Jun 2015 19:07:11 -0700 Subject: [PATCH 0343/1195] RFC 1152 is slice/string symmetry --- ...slice-string-symmetry.md => 1152-slice-string-symmetry.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-slice-string-symmetry.md => 1152-slice-string-symmetry.md} (93%) diff --git a/text/0000-slice-string-symmetry.md b/text/1152-slice-string-symmetry.md similarity index 93% rename from text/0000-slice-string-symmetry.md rename to text/1152-slice-string-symmetry.md index d7cc405275a..0a863c1e587 100644 --- a/text/0000-slice-string-symmetry.md +++ b/text/1152-slice-string-symmetry.md @@ -1,7 +1,7 @@ - Feature Name: `slice_string_symmetry` - Start Date: 2015-06-06 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1152](https://github.com/rust-lang/rfcs/pull/1152) +- Rust Issue: [rust-lang/rust#26697](https://github.com/rust-lang/rust/issues/26697) # Summary From 0d8fcae61e7cbefef6818a6a0173870490befde6 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 1 Jul 2015 10:44:17 -0700 Subject: [PATCH 0344/1195] Mention not stabilizing lang items as an alternative. --- text/0000-stabilize-no_std.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/text/0000-stabilize-no_std.md b/text/0000-stabilize-no_std.md index 80aea6ac0fa..d577ff240e7 100644 --- a/text/0000-stabilize-no_std.md +++ b/text/0000-stabilize-no_std.md @@ -254,6 +254,9 @@ happen: `eh_personality` and `stack_exhausted` lang items to immediately abort. This would avoid us having to stabilize their signatures as we could stabilize just this attribute and not their definitions. +* The various language items could not be stabilized at this time, allowing + stable libraries that leverage `#![no_std]` but not stable final artifacts + (e.g. staticlibs, dylibs, or binaries). 
# Unresolved Questions From dc1d2cf443e1f7d0ac20b616db5c5a880a55e6ee Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 1 Jul 2015 10:50:56 -0700 Subject: [PATCH 0345/1195] Add downside of #![no_std] and std interoperation --- text/0000-stabilize-no_std.md | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/text/0000-stabilize-no_std.md b/text/0000-stabilize-no_std.md index d577ff240e7..fc8ecad2a0e 100644 --- a/text/0000-stabilize-no_std.md +++ b/text/0000-stabilize-no_std.md @@ -89,7 +89,9 @@ This section will describe the purpose for each lang item currently required in addition to the interface that it will be stabilized with. Each lang item will no longer be defined with the `#[lang = "..."]` syntax but will instead receive a dedicated attribute (e.g. `#[panic_fmt]`) to be attached to functions to -identify an implementation. +identify an implementation. It should be noted that these language items are +already not quite the same as other `#[lang]` items due to the ability to rely +on them in a "weak" fashion. Like lang items each of these will only allow one implementor in any crate dependency graph which will be verified at compile time. Also like today, none @@ -213,7 +215,7 @@ reasons: in the future if we, for example, provide multiple binary copies of libcore in the standard distribution. -The final drawback of this RFC is the overall stabilization of the `#![no_std]` +Another drawback of this RFC is the overall stabilization of the `#![no_std]` attribute, meaning that the compiler will no longer be able to make assumptions in the future about a function being defined. Put another way, the `panic_fmt`, `eh_personality`, and `stack_exhausted` lang items are the only three that will @@ -223,6 +225,13 @@ assume more functions exist. Additionally, the compiler will likely be able to provide or emit a stub implementation for any future symbol it does need to exist. 
+In stabilizing the `#![no_std]` attribute it's likely that a whole ecosystem of +crates will arise which work with `#![no_std]`, but in theory all of these +crates should also interoperate with the rest of the ecosystem using `std`. +Unfortunately, however, there are known cases where this is not possible. For +example if a macro is exported from a `#![no_std]` crate which references items +from `core` it won't work by default with a `std` library. + # Alternatives Most of the strategies taken in this RFC have some minor variations on what can From aba213dae631b84560857b7358faeedbac018f35 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 2 Jul 2015 16:47:06 -0700 Subject: [PATCH 0346/1195] Introduce no_core, don't stabilize it --- text/0000-stabilize-no_std.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-stabilize-no_std.md b/text/0000-stabilize-no_std.md index fc8ecad2a0e..3eb8125aaf8 100644 --- a/text/0000-stabilize-no_std.md +++ b/text/0000-stabilize-no_std.md @@ -44,7 +44,7 @@ must be available in a stable fashion. This RFC proposes a nuber of changes: * Stabilize the `#![no_std]` attribute after tweaking its behavior slightly -* Stabilize a `#![no_core]` attribute. +* Introduce a `#![no_core]` attribute. * Stabilize the name "core" in libcore. * Stabilize required language items by the core library. From f03c50b9e86288cd1627ac134e2d908926d42eea Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Mon, 6 Jul 2015 16:25:55 +1200 Subject: [PATCH 0347/1195] Add a HIR to the compiler Add a high-level intermediate representation (HIR) to the compiler. This is basically a new (and additional) AST more suited for use by the compiler. This is purely an implementation detail of the compiler. It has no effect on the language. Note that adding a HIR does not preclude adding a MIR or LIR in the future. 
--- text/0000-hir.md | 81 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 81 insertions(+) create mode 100644 text/0000-hir.md diff --git a/text/0000-hir.md b/text/0000-hir.md new file mode 100644 index 00000000000..be8ef62c1f2 --- /dev/null +++ b/text/0000-hir.md @@ -0,0 +1,81 @@ +- Feature Name: N/A +- Start Date: 2015-07-06 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + + +# Summary + +Add a high-level intermediate representation (HIR) to the compiler. This is +basically a new (and additional) AST more suited for use by the compiler. + +This is purely an implementation detail of the compiler. It has no effect on the +language. + +Note that adding a HIR does not preclude adding a MIR or LIR in the future. + + +# Motivation + +Currently the AST is used by libsyntax for syntactic operations, by the compiler +for pretty much everything, and in syntax extensions. I propose splitting the +AST into a libsyntax version that is specialised for syntactic operation and +will eventually be stabilised for use by syntax extensions and tools, and the +HIR which is entirely internal to the compiler. + +The benefit of this split is that each AST can be specialised to its task and we +can separate the interface to the compiler (the AST) from its implementation +(the HIR). Specific changes I see that could happen are more ids and spans in +the AST, the AST adhering more closely to the surface syntax, the HIR becoming +more abstract (e.g., combining structs and enums), and using resolved names in +the HIR (i.e., performing name resolution as part of the AST->HIR lowering). + +Not using the AST in the compiler means we can work to stabilise it for syntax +extensions and tools: it will become part of the interface to the compiler. + +I also envisage all syntactic expansion of language constructs (e.g., `for` +loops, `if let`) moving to the lowering step from AST to HIR, rather than being +AST manipulations. 
That should make both error messages and tool support better +for such constructs. It would be nice to move lifetime elision to the lowering +step too, in order to make the HIR as explicit as possible. + + +# Detailed design + +Initially, the HIR will be an (almost) identical copy of the AST and the +lowering step will simply be a copy operation. Since some constructs (macros, +`for` loops, etc.) are expanded away in libsyntax, these will not be part of the +HIR. Tools such as the AST visitor will need to be duplicated. + +The compiler will be changed to use the HIR throughout (this should mostly be a +matter of change the imports). Incrementally, I expect to move expansion of +language constructs to the lowering step. Further in the future, the HIR should +get more abstract and compact, and the AST should get closer to the surface +syntax. + + +# Drawbacks + +Potentially slower compilations and higher memory use. However, this should be +offset in the long run by making improvements to the compiler easier by having a +more appropriate data structure. + + +# Alternatives + +Leave things as they are. + +Skip the HIR and lower straight to a MIR later in compilation. This has +advantages which adding a HIR does not have, however, it is a far more complex +refactoring and also misses some benefits of the HIR, notably being able to +stabilise the AST for tools and syntax extensions without locking in the +compiler. + + +# Unresolved questions + +How to deal with spans and source code. We could keep the AST around and +reference back to it from the HIR. Or we could copy span information to the HIR +(I plan on doing this initially). Possibly some other solution like keeping the +span info in a side table (note that we need less span info in the compiler than +we do in libsyntax, which is in turn less than tools want). 
From c7b72a029e28726474c7b28ed410e63b538d85d5 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Tue, 7 Jul 2015 11:20:07 -0700 Subject: [PATCH 0348/1195] Lay groundwork for SIMD. --- text/0000-simd-infrastructure.md | 411 +++++++++++++++++++++++++++++++ 1 file changed, 411 insertions(+) create mode 100644 text/0000-simd-infrastructure.md diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md new file mode 100644 index 00000000000..7876de442dd --- /dev/null +++ b/text/0000-simd-infrastructure.md @@ -0,0 +1,411 @@ +- Feature Name: simd_basics +- Start Date: 2015-06-02 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Lay the ground work for building powerful SIMD functionality. + +# Motivation + +SIMD (Single-Instruction Multiple-Data) is an important part of +performant modern applications. Most CPUs used for that sort of task +provide dedicated hardware and instructions for operating on multiple +values in a single instruction, and exposing this is an important part +of being a low-level language. + +This RFC lays the ground-work for building nice SIMD functionality, +but doesn't fill everything out. The goal here is to provide the raw +types and access to the raw instructions on each platform. + +# Detailed design + +The design comes in three parts: + +- types +- operations +- platform detection + +The general idea is to avoid bad performance cliffs, so that an +intrinsic call in Rust maps to preferably one CPU instruction, or, if +not, the "optimal" sequence required to do the given operation +anyway. This means exposing a *lot* of platform specific details, +since platforms behave very differently: both across architecture +families (x86, x86-64, ARM, MIPS, ...), and even within a family +(x86-64's Skylake, Haswell, Nehalem, ...). 
+ +There is definitely a common core of SIMD functionality shared across +many platforms, but this RFC doesn't try to extract that, it is just +building tools that can be wrapped into a more uniform API later. + +## Background: Where does this code go? + +This RFC is focused on building stable, powerful SIMD functionality in +external crates, not `std`. This makes it much easier to support +functionality only "occasionally" available with Rust's preexisting +`cfg` system. If it were in `std`, there would need to be some highly +delayed `cfg` system so that functions that only work with AVX-2 +support: + +- don't break compilation on systems that don't support it, but +- are still usable on systems that do support it. + +## Types & traits + +A type designed to be used as a SIMD vector is indicated by the +`repr(simd)` attribute. A type marked as such will be compiled to +behave like a SIMD register (as well as the target platform can +support it). + +The types/traits will be defined as follows: + +```rust +#[repr(simd)] +struct Simd2(T, T); +#[repr(simd)] +struct Simd4(T, T, T, T); +#[repr(simd)] +struct Simd8(T, T, T, T, T, T, T, T); +#[repr(simd)] +struct Simd16(T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T); +#[repr(simd)] +struct Simd32(T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, + T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T); +#[repr(simd)] +struct Simd64(T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, + T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, + T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, + T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T); + +trait SimdVector { + type Elem: SimdPrim; + type Bool: SimdVector::Bool>; +} + +impl for Simd2 { + type Elem = T; + type Bool = Simd2; +} +impl for Simd4 { + type Elem = T; + type Bool = Simd4; +} +// ... 
+impl for Simd64 { + type Elem = T; + type Bool = Simd64; +} + +#[simd_prim_trait] +trait SimdPrim { + type Bool: SimdPrim; +} + +// boolean types, see below +struct bool8i(...); +struct bool16i(...); +struct bool32i(...); +struct bool64i(...); +struct bool32f(...); +struct bool64f(...); + +// specifying what types are SIMD-able. +impl SimdPrim for u8 { type Bool = bool8i; } +impl SimdPrim for i8 { type Bool = bool8i; } +impl SimdPrim for u16 { type Bool = bool16i; } +// ... +impl SimdPrim for i64 { type Bool = bool64i; } + +impl SimdPrim for f32 { type Bool = bool32f; } +impl SimdPrim for f64 { type Bool = bool64f; } + +impl SimdPrim for bool8i { type Bool = bool8i; } +// ... +impl SimdPrim for bool64i { type Bool = bool64i; } + +impl SimdPrim for bool32f { type Bool = bool32f; } +impl SimdPrim for bool64f { type Bool = bool64f; } +``` + +It is illegal to take an internal reference to the fields of a +`repr(simd)` type. + +### `repr(simd)` + +The `simd` `repr` can be attached to a struct and will cause such a +struct to be compiled to a SIMD vector. It is required that the +monomorphised vector consist of only a single "primitive" type, +repeated some number of times. The restrictions on the element type +are exactly the same restrictions as `#[simd_primitive_trait]` traits +impose on their implementing types. + +The `repr(simd)` may not enforce that the trait bound exists/does the +right thing at the type checking level for generic `repr(simd)` +types. As such, it will be possible to get the code-generator to error +out (ala the old `transmute` size errosr), however, this shouldn't +cause problems in practice: libraries wrapping this functionality +would layer type-safety on top (i.e. the `SimdPrim` trait). + +### `simd_primitive_trait` + +Traits marked with the `simd_primitive_trait` attribute are special: +types implementing it are those that can be stored in SIMD +vectors. 
Initially, only primitives and single-field structs that +store `SimdPrim` types will be allowed to implement it. + +This is explicitly not a lang item: it is legal to have multiple +distinct traits in a compilation. The attribute just adds the +restriction and possibly tweaks type's internal representation (as +such, it's legal for a single type to implement multiple traits with +the attribute, if a bit pointless). + +### Booleans + +SIMD booleans are non-trivial. Many conventional APIs e.g. SSE, and +NEON, use "wide booleans": a large number of bits set to all-zeros +(false) or all-ones, e.g. equality between `Simd4(0_u32, 1, 2, 3)` and +`Simd4(0_u32, 0, 2, 3)` gives (on the CPU) `Simd4(!0_u32, 0, !0, +!0)`. Hence, the boolean types need to have width. It's tempting to +just use the integer types of the appropriate width, but this falls +down for two reasons: + +1. booleans aren't always this format +2. the source of the boolean matters + +The second is easiest: CPUs are complicated beasts, and the hardware +that handles floating point vector operations may be very different to +the hardware that handles integer ones: instructions use different +execution units. It can take several cycles to transfer data between +them. Encoding the provenance/execution unit of the value in the type +makes costs explicit. + +The first is much harder to solve. Some architectures/instruction sets +model booleans as single bits. For example, equality between +`Simd4(0_u32, 1, 2, 3)` and `Simd(0_u32, 0, 2, 3)` gives `1 + 4 + 8 == +0b1101`. One example is AVX-512 which essentially replaces all of the +older SSE through AVX2 boolean-returning instructions with versions +that return those. Using separate types for booleans (and restricting +their API) allows for some serious magic: `Simd4` becomes +`u4`. (This is where the reference-restriction above comes in.) 
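The two boolean encodings described above can be modelled in plain scalar Rust. This is only an illustrative sketch: `eq_wide` and `eq_bitmask` are hypothetical helper names, not intrinsics proposed by the RFC, and ordinary arrays stand in for `repr(simd)` vectors.

```rust
/// Lane-wise equality in the "wide boolean" style (SSE/NEON-like):
/// each output lane is all-ones when the inputs match, all-zeros otherwise.
/// Hypothetical helper for illustration only.
fn eq_wide(a: [u32; 4], b: [u32; 4]) -> [u32; 4] {
    let mut out = [0u32; 4];
    for i in 0..4 {
        out[i] = if a[i] == b[i] { !0 } else { 0 };
    }
    out
}

/// Lane-wise equality in the "one bit per lane" style (AVX-512-like):
/// bit `i` of the result is set when lane `i` compares equal.
/// Hypothetical helper for illustration only.
fn eq_bitmask(a: [u32; 4], b: [u32; 4]) -> u8 {
    let mut mask = 0u8;
    for i in 0..4 {
        if a[i] == b[i] {
            mask |= 1 << i;
        }
    }
    mask
}
```

Comparing `[0, 1, 2, 3]` with `[0, 0, 2, 3]` yields `[!0, 0, !0, !0]` in the first model and `0b1101` in the second, matching the values quoted in the text. A dedicated boolean type lets a library hide which of the two representations the target actually uses.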
+ +## Operations + +CPU vendors usually offer "standard" C headers for their CPU specific +operations, such as [`arm_neon.h`][armneon] and [the `...mmintrin.h` headers for +x86(-64)][x86]. + +[armneon]: http://infocenter.arm.com/help/topic/com.arm.doc.ihi0073a/IHI0073A_arm_neon_intrinsics_ref.pdf +[x86]: https://software.intel.com/sites/landingpage/IntrinsicsGuide + +All of these would be exposed as (eventually) stable intrinsics with +names very similar to those that the vendor suggests (only difference +would be some form of manual namespacing, e.g. prefixing with the CPU +target), loadable via an `extern` block with an appropriate ABI. + +```rust +extern "rust-intrinsic" { + fn x86_mm_abs_epi16(a: Simd8) -> Simd8; + // ... +} +``` + +These all use entirely concrete types, and this is the core interface +to these intrinsics: essentially it is just allowing code to exactly +specify a CPU instruction to use. These intrinsics only actually work +on a subset of the CPUs that Rust targets, and are only be available +for `extern`ing on those targets. The signatures are typechecked, but +in a "duck-typed" manner: it will just ensure that the types are SIMD +vectors with the appropriate length and element type, it will not +enforce a specific nominal type. + +There would additionally be a small set of cross-platform operations +that are either generally efficiently supported everywhere or are +extremely useful. These won't necessarily map to a single instruction, +but will be shimmed as efficiently as possible. + +- shuffles and extracting/inserting elements +- comparisons + +Lastly, arithmetic and conversions are supported via built-in operators. + +### Shuffles & element operations + +One of the most powerful features of SIMD is the ability to rearrange +data within vectors, giving super-linear speed-ups sometimes. As such, +shuffles are exposed generally: intrinsics that represent arbitrary +shuffles. 
+ +This may violate the "one instruction per intrinsic" principle +depending on the shuffle, but rearranging SIMD vectors is extremely +useful, and providing a direct intrinsic lets the compiler (a) do the +programmer's work in synthesising the optimal (short) sequence of +instructions to get a given shuffle and (b) track data through +shuffles without having to understand all the details of every +platform specific intrinsic for shuffling. + +```rust +extern "rust-intrinsic" { + fn simd_shuffle2(v: T, w: T, i0: u32, i1: u32) -> Simd2; + fn simd_shuffle4(v: T, w: T, i0: u32, i1: u32, i2: u32, i3: u32) -> Simd4; + fn simd_shuffle8(v: T, w: T, + i0: u32, i1: u32, i2: u32, i3: u32, + i4: u32, i5: u32, i6: u32, i7: u32) -> Simd8; + fn simd_shuffle16(v: T, w: T, + i0: u32, i1: u32, i2: u32, i3: u32, + i4: u32, i5: u32, i6: u32, i7: u32, + i8: u32, i9: u32, i10: u32, i11: u32, + i12: u32, i13: u32, i14: u32, i15: u32) -> Simd16; +} +``` + +This approach has some downsides: `simd_shuffle32` (e.g. `Simd32` +on AVX, and `Simd32` on AVX-512) and especially `simd_shuffle64` +(e.g. `Simd64` on AVX-512) are unwieldy. These have similar type +"safety"/code-generation errors to the vectors themselves. + +These operations are semantically: + +```rust +// vector of double length +let z = concat(v, w); + +return [z[i0], z[i1], z[i2], ...] +``` + +The indices `iN` have to be compile time constants. + +Similarly, intrinsics for inserting/extracting elements into/out of +vectors are provided, to allow modelling the SIMD vectors as actual +CPU registers as much as possible: + +```rust +extern "rust-intrinsic" { + fn simd_insert(v: T, i0: u32, elem: T::Elem) -> T; + fn simd_extract(v: T, i0: u32) -> T::Elem; +} +``` + +The `i0` indices do not have to be constant. These are equivalent to +`v[i0] = elem` and `v[i0]` respectively.
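The shuffle and insert/extract semantics can be checked against a scalar model. The sketch below is an assumption-laden illustration, not the proposed intrinsics: `shuffle4_model`, `insert_model` and `extract_model` are hypothetical names, arrays stand in for SIMD vectors, and indices are ordinary runtime values rather than compile-time constants.

```rust
/// Scalar model of a four-output shuffle: concatenate `v` and `w` into a
/// vector of double length, then pick one element per output lane.
/// Hypothetical helper for illustration only.
fn shuffle4_model(v: [u32; 4], w: [u32; 4], idx: [usize; 4]) -> [u32; 4] {
    let mut z = [0u32; 8];
    z[..4].copy_from_slice(&v);
    z[4..].copy_from_slice(&w);
    [z[idx[0]], z[idx[1]], z[idx[2]], z[idx[3]]]
}

/// Scalar model of element insertion: behaves like `v[i0] = elem`
/// and returns the updated vector.
fn insert_model(mut v: [u32; 4], i0: usize, elem: u32) -> [u32; 4] {
    v[i0] = elem;
    v
}

/// Scalar model of element extraction: behaves like `v[i0]`.
fn extract_model(v: [u32; 4], i0: usize) -> u32 {
    v[i0]
}
```

With `v = [10, 11, 12, 13]` and `w = [20, 21, 22, 23]`, the indices `[0, 4, 1, 5]` interleave the low halves of both inputs, while `[3, 2, 1, 0]` reverses `v` and ignores `w`.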
+ +### Comparisons + +Comparisons are implemented via intrinsics, because the current +comparison operator infrastructure doesn't easily lend itself to +return vectors, as required. + +A library could give signatures like: + +```rust +extern "rust-intrinsic" { + fn simd_eq(v: T, w: T) -> T::Bool; + fn simd_ne(v: T, w: T) -> T::Bool; + fn simd_lt(v: T, w: T) -> T::Bool; + fn simd_le(v: T, w: T) -> T::Bool; + fn simd_gt(v: T, w: T) -> T::Bool; + fn simd_ge(v: T, w: T) -> T::Bool; +} +``` + + +### Built-in functionality + +Any type marked `repr(simd)` automatically has the `+`, `-` and `*` +operators work. The `/` operator works for floating point, and the +`<<` and `>>` ones work for integers. + +SIMD vectors can be converted with `as`. As with intrinsics, this is +"duck-typed" it is possible to cast a vector type `V` to a type `W` if +their lengths match and their elements are castable (i.e. are +primitives), there's no enforcement of nominal types. + +All of these are never checked: explicit SIMD is essentially only +required for speed, and checking inflates one instruction to 5 or +more. + +## Platform Detection + +The availability of efficient SIMD functionality is very fine-grained, +and our current `cfg(target_arch = "...")` is not precise enough. This +RFC proposes a `target_feature` `cfg`, that would be set to the +features of the architecture that are known to be supported by the +exact target e.g. + +- a default x86-64 compilation would essentially only set + `target_feature = "sse"` and `target_feature = "sse2"` +- compiling with `-C target-feature="+sse4.2"` would set + `target_feature = "sse4.2"`, `target_feature = "sse.4.1"`, ..., + `target_feature = "sse"`. +- compiling with `-C target-cpu=native` on a modern CPU might set + `target_feature = "avx2"`, `target_feature = "avx"`, ... + +(There are other non-SIMD features that might have `target_feature`s +set too, such as `popcnt` and `rdrnd` on x86/x86-64.) 
+ +With a `cfg_if_else!` macro that expands to the first `cfg` that is +satisfied (ala [@alexcrichton's cascade][cascade]), code might look +like: + +[cascade]: https://github.com/alexcrichton/backtrace-rs/blob/03703031babfa87cbe2c723ad6752131819dc554/src/macros.rs + +```rust +cfg_if_else! { + if #[cfg(target_feature = "avx")] { + fn foo() { /* use AVX things */ } + } else if #[cfg(target_feature = "sse4.1")] { + fn foo() { /* use SSE4.1 things */ } + } else if #[cfg(target_feature = "sse2")] { + fn foo() { /* use SSE2 things */ } + } else if #[cfg(target_feature = "neon")] { + fn foo() { /* use NEON things */ } + } else { + fn foo() { /* universal fallback */ } + } +} +``` + +# Extensions + +- scatter/gather operations allow (partially) operating on a SIMD + vector of pointers. This would require extending `SimdPrim` to also + allow pointer types. +- allow (and ignore for everything but type checking) zero-sized types + in `repr(simd)` structs, to allow tagging them with markers + +# Alternatives + +- The SIMD on-route-to-stable intrinsics could have their own ABI +- Intrinsics could instead by namespaced by ABI, `extern + "x86-intrinsic"`, `extern "arm-intrinsic"`. +- There could be more syntactic support for shuffles, either with true + syntax, or with a syntax extension. The latter might look like: + `shuffle![x, y, i0, i1, i2, i3, i4, ...]`. However, this requires + that shuffles are restricted to a single type only (i.e. `Simd4` + can be shuffled to `Simd4` but nothing else), or some sort of + type synthesis. The compiler has to somehow work out the return + value: + + ```rust + let x: Simd4 = ...; + let y: Simd4 = ...; + + // reverse all the elements. + let z = shuffle![x, y, 7, 6, 5, 4, 3, 2, 1, 0]; + ``` + + Presumably `z` should be `Simd8`, but it's not obvious how the + compiler can know this. The `repr(simd)` approach means there may be + more than one SIMD-vector type with the `Simd8` shape (or, in + fact, there may be zero). 
+- Instead of platform detection, there could be feature detection + (e.g. "platform supports something equivalent to x86's `DPPS`"), but + there probably aren't enough cross-platform commonalities for this + to be worth it. (Each "feature" would essentially be a platform + specific `cfg` anyway.) +- Check vector operators in debug mode just like the scalar versions. + +# Unresolved questions + +- Should integer vectors get `/` and `%` automatically? Most CPUs + don't support them for vectors. From 8f91f8ad5377226d89b1e36c6ab57ce5d15bf7e6 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Tue, 7 Jul 2015 11:47:25 -0700 Subject: [PATCH 0349/1195] RFC for inclusive ranges with ... --- text/0000-inclusive-ranges.md | 79 +++++++++++++++++++++++++++++++++++ 1 file changed, 79 insertions(+) create mode 100644 text/0000-inclusive-ranges.md diff --git a/text/0000-inclusive-ranges.md b/text/0000-inclusive-ranges.md new file mode 100644 index 00000000000..800bdd8d191 --- /dev/null +++ b/text/0000-inclusive-ranges.md @@ -0,0 +1,79 @@ +- Feature Name: inclusive_range_syntax +- Start Date: 2015-07-07 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Allow a `x...y` expression to create an inclusive range. + +# Motivation + +There are several use-cases for inclusive ranges, that semantically +include both end-points. For example, iterating from `0_u8` up to and +including some number `n` can be done via `for _ in 0..n + 1` at the +moment, but this will fail if `n` is `255`. Furthermore, some iterable +things only have a successor operation that is sometimes sensible, +e.g., `'a'..'{'` is equivalent to the inclusive range `'a'...'z'`: +there's absolutely no reason that `{` is after `z` other than a quirk +of the representation. + +The `...` syntax mirrors the current `..` used for exclusive ranges: +more dots means more elements. 
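The overflow problem in the motivation can be demonstrated in a few lines. This is a hedged sketch: it is written with `..=`, the spelling listed under this RFC's alternatives and the one current stable compilers accept for inclusive ranges, rather than the `...` expression syntax proposed here. The function names are hypothetical.

```rust
/// Counts the values from 0 up to and including `n`.
/// An inclusive range needs no `n + 1`, so `n == 255` is fine.
fn count_up_to_inclusive(n: u8) -> usize {
    (0..=n).count()
}

/// The exclusive form `0..n + 1` needs an upper bound one past `n`;
/// for `n == 255` no such `u8` exists, so this returns `None`.
fn exclusive_upper_bound(n: u8) -> Option<u8> {
    n.checked_add(1)
}
```

The same reasoning applies to characters: the exclusive range ending at `'{'` covers the lowercase alphabet only because `'{'` happens to be the code point after `'z'`.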
+ +# Detailed design + +`std::ops` defines + +```rust +pub struct RangeInclusive<T> { + pub start: T, + pub end: T, +} +``` + +Writing `a...b` in an expression desugars to `std::ops::RangeInclusive +{ start: a, end: b }`. + +This struct implements the standard traits (`Clone`, `Debug` etc.), +but, unlike the other `Range*` types, does not implement `Iterator` +directly, since it cannot do so correctly without more internal +state. It can implement `IntoIterator` that converts it into an +iterator type that contains the necessary state. + +The use of `...` in a pattern remains as testing for inclusion +within that range, *not* a struct match. + +The author cannot foresee problems with breaking backward +compatibility. In particular, one tokenisation of syntax like `1...` +now would be `1. ..`, i.e. a floating point number on the left; fortunately, +it is actually tokenised like `1 ...`, and is hence an error. + +# Drawbacks + +There's a mismatch between pattern-`...` and expression-`...`, in that +the former doesn't undergo the same desugaring as the +latter. (Although they represent essentially the same thing +semantically.) + +The `...` vs. `..` distinction is the exact inversion of Ruby's syntax. + +Only implementing `IntoIterator` means uses of it in iterator chains +look like `(a...b).into_iter().collect()` instead of +`(a..b).collect()` as with exclusive ones (although this doesn't +affect `for` loops: `for _ in a...b` works fine). + +# Alternatives + +An alternate syntax could be used, like +`..=`. [There has been discussion][discuss], but there wasn't a clear +winner. + +[discuss]: https://internals.rust-lang.org/t/vs-for-inclusive-ranges/1539 + +This RFC doesn't propose non-double-ended syntax, like `a...`, `...b` +or `...` since it isn't clear that this is so useful. Maybe it is. + +# Unresolved questions + +None so far.
From 5fe90e256010f6e085dcaa32ce81191d9363fc18 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 7 Jul 2015 19:01:21 -0700 Subject: [PATCH 0350/1195] RFC: Prevent lint changes being a breaking change Add a new flag to the compiler, `--cap-lints`, which set the maximum possible lint level for the entire crate (and cannot be overridden). Cargo will then pass `--cap-lints allow` to all upstream dependencies when compiling code. --- text/0000-cap-lints.md | 95 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 95 insertions(+) create mode 100644 text/0000-cap-lints.md diff --git a/text/0000-cap-lints.md b/text/0000-cap-lints.md new file mode 100644 index 00000000000..2ddb767e677 --- /dev/null +++ b/text/0000-cap-lints.md @@ -0,0 +1,95 @@ +- Feature Name: N/A +- Start Date: 2015-07-07 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Add a new flag to the compiler, `--cap-lints`, which set the maximum possible +lint level for the entire crate (and cannot be overridden). Cargo will then pass +`--cap-lints allow` to all upstream dependencies when compiling code. + +# Motivation + +> Note: this RFC represents issue [#1029][issue] + +Currently any modification to a lint in the compiler is strictly speaking a +breaking change. All crates are free to place `#![deny(warnings)]` at the top of +their crate, turning any new warnings into compilation errors. This means that +if a future version of Rust starts to emit new warnings it may fail to compile +some previously written code (a breaking change). + +We would very much like to be able to modify lints, however. For example +[rust-lang/rust#26473][pr] updated the `missing_docs` lint to also look for +missing documentation on `const` items. This ended up [breaking some +crates][term-pr] in the ecosystem due to their usage of +`#![deny(missing_docs)]`. 
+ +[issue]: https://github.com/rust-lang/rfcs/issues/1029 +[pr]: https://github.com/rust-lang/rust/pull/26473 +[term-pr]: https://github.com/rust-lang/term/pull/34 + +The mechanism proposed in this RFC is aimed at providing a method to compile +upstream dependencies in a way such that they are resilient to changes in the +behavior of the standard lints in the compiler. A new lint warning or error will +never represent a memory safety issue (otherwise it'd be a real error) so it +should be safe to ignore any new instances of a warning that didn't show up +before. + +# Detailed design + +There are two primary changes proposed by this RFC, the first of which is a new +flag to the compiler: + +``` + --cap-lints LEVEL Set the maximum lint level for this compilation, cannot + be overridden by other flags or attributes. +``` + +For example, when `--cap-lints allow` is passed, all instances of `#[warn]`, +`#[deny]`, and `#[forbid]` are ignored. If, however, `--cap-lints warn` is passed, +only `deny` and `forbid` directives are ignored. + +The acceptable values for `LEVEL` will be `allow`, `warn`, `deny`, or `forbid`. + +The second change proposed is to have Cargo pass `--cap-lints allow` to all +upstream dependencies. Cargo currently passes `-A warnings` to all upstream +dependencies (allow all warnings by default), so this would just be guaranteeing +that no lints could be fired for upstream dependencies. + +With these two pieces combined together it is now possible to modify lints in +the compiler in a backwards compatible fashion. Modifications to existing lints +to emit new warnings will not get triggered, and new lints will also be entirely +suppressed **only for upstream dependencies**. + +# Drawbacks + +This RFC adds surface area to the command line of the compiler with a relatively +obscure option `--cap-lints`. The option will almost never be passed by anything +other than Cargo, so having it show up here is a little unfortunate.
+ +Some crates may inadvertently rely on memory safety through lints, or otherwise +very much not want lints to be turned off. For example if modifications to a new +lint to generate more warnings caused an upstream dependency to fail to compile, +it could represent a serious bug indicating the dependency needs to be updated. +This system would paper over this issue by forcing compilation to succeed. This +use case seems relatively rare, however, and lints are also perhaps not the best +method to ensure the safety of a crate. + +Cargo may one day grow configuration to *not* pass this flag by default (e.g. go +back to passing `-Awarnings` by default), which is yet again more expansion of +API surface area. + +# Alternatives + +* Modifications to lints or additions to lints could be considered + backwards-incompatible changes. +* The meaning of the `-A` flag could be reinterpreted as "this cannot be + overridden" +* A new "meta lint" could be introduced to represent the maximum cap, for + example `-A everything`. This is semantically different enough from `-A foo` + that it seems worth having a new flag. + +# Unresolved questions + +None yet. From 28861b9ad67d0c26750446332b949cffbd6bda5c Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 7 Jul 2015 19:15:39 -0700 Subject: [PATCH 0351/1195] Add a note about Cargo backcompat --- text/0000-cap-lints.md | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/text/0000-cap-lints.md b/text/0000-cap-lints.md index 2ddb767e677..5aca0d59764 100644 --- a/text/0000-cap-lints.md +++ b/text/0000-cap-lints.md @@ -62,6 +62,20 @@ the compiler in a backwards compatible fashion. Modifications to existing lints to emit new warnings will not get triggered, and new lints will also be entirely suppressed **only for upstream dependencies**. +## Cargo Backwards Compatibility + +This flag would be first non-1.0 flag that Cargo would be passing to the +compiler. 
This means that Cargo can no longer drive a 1.0 compiler, but only a +1.N+ compiler which has the `--cap-lints` flag. To handle this discrepancy Cargo +will detect whether `--cap-lints` is a valid flag to the compiler. + +Cargo already runs `rustc -vV` to learn about the compiler (e.g. a "unique +string" that's opaque to Cargo) and it will instead start passing +`rustc -vV --cap-lints allow` to the compiler. This will allow Cargo to +simultaneously detect whether the flag is valid and learn about the version +string. If this command fails and `rustc -vV` succeeds then Cargo will fall back +to the old behavior of passing `-A warnings`. + # Drawbacks This RFC adds surface area to the command line of the compiler with a relatively From 320ad8e97c3766d1f1889fcca713540930445d30 Mon Sep 17 00:00:00 2001 From: Andrew Paseltiner Date: Fri, 3 Jul 2015 15:31:40 -0400 Subject: [PATCH 0352/1195] create collection recovery RFC --- text/0000-collection-recovery.md | 173 +++++++++++++++++++++++++++++++ 1 file changed, 173 insertions(+) create mode 100644 text/0000-collection-recovery.md diff --git a/text/0000-collection-recovery.md b/text/0000-collection-recovery.md new file mode 100644 index 00000000000..be79017b8be --- /dev/null +++ b/text/0000-collection-recovery.md @@ -0,0 +1,173 @@ +- Feature Name: collection_recovery +- Start Date: 2015-07-08 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Add item-recovery methods to the set types in `std`. Add key-recovery methods to the map types in +`std` in order to facilitate this. + +# Motivation + +Sets are sometimes used as a cache keyed on a certain property of a type, but programs may need to +access the type's other properties for efficiency or functionality. The sets in `std` do not expose +their items (by reference or by value), making this use-case impossible.
+ +Consider the following example: + +```rust +use std::collections::HashSet; +use std::hash::{Hash, Hasher}; + +// The `Widget` type has two fields that are inseparable. +#[derive(PartialEq, Eq, Hash)] +struct Widget { + foo: Foo, + bar: Bar, +} + +#[derive(PartialEq, Eq, Hash)] +struct Foo(&'static str); + +#[derive(PartialEq, Eq, Hash)] +struct Bar(u32); + +// Widgets are normally considered equal if all their corresponding fields are equal, but we would +// also like to maintain a set of widgets keyed only on their `bar` field. To this end, we create a +// new type with custom `{PartialEq, Hash}` impls. +struct MyWidget(Widget); + +impl PartialEq for MyWidget { + fn eq(&self, other: &Self) -> bool { self.0.bar == other.0.bar } +} + +impl Eq for MyWidget {} + +impl Hash for MyWidget { + fn hash<H: Hasher>(&self, h: &mut H) { self.0.bar.hash(h); } +} + +fn main() { + // In our program, users are allowed to interactively query the set of widgets according to + // their `bar` field, as well as insert, replace, and remove widgets. + + let mut widgets = HashSet::new(); + + // Add some default widgets. + widgets.insert(MyWidget(Widget { foo: Foo("iron"), bar: Bar(1) })); + widgets.insert(MyWidget(Widget { foo: Foo("nickel"), bar: Bar(2) })); + widgets.insert(MyWidget(Widget { foo: Foo("copper"), bar: Bar(3) })); + + // At this point, the user enters commands and receives output like: + // + // ``` + // > get 1 + // Some(iron) + // > get 4 + // None + // > remove 2 + // removed nickel + // > add 2 cobalt + // added cobalt + // > add 3 zinc + // replaced copper with zinc + // ``` + // + // However, `HashSet` does not expose its items via its `{contains, insert, remove}` methods, + // instead providing only a boolean indicator of the item's presence in the set, preventing us + // from implementing the desired functionality.
+} +``` + +# Detailed design + +Add the following item-recovery methods to `std::collections::{BTreeSet, HashSet}`: + +```rust +impl<T> Set<T> { + // Like `contains`, but returns a reference to the item if the set contains it. + fn item<Q: ?Sized>(&self, item: &Q) -> Option<&T>; + + // Like `remove`, but returns the item if the set contained it. + fn remove_item<Q: ?Sized>(&mut self, item: &Q) -> Option<T>; + + // Like `insert`, but replaces the item with the given one and returns the previous item if the + // set contained it. + fn replace(&mut self, item: T) -> Option<T>; +} +``` + +In order to implement the above methods, add the following key-recovery methods to +`std::collections::{BTreeMap, HashMap}`: + +```rust +impl<K, V> Map<K, V> { + // Like `get`, but additionally returns a reference to the entry's key. + fn key_value<Q: ?Sized>(&self, key: &Q) -> Option<(&K, &V)>; + + // Like `get_mut`, but additionally returns a reference to the entry's key. + fn key_value_mut<Q: ?Sized>(&mut self, key: &Q) -> Option<(&K, &mut V)>; + + // Like `remove`, but additionally returns the entry's key. + fn remove_key_value<Q: ?Sized>(&mut self, key: &Q) -> Option<(K, V)>; + + // Like `insert`, but additionally replaces the key with the given one and returns the previous + // key and value if the map contained it. + fn replace(&mut self, key: K, value: V) -> Option<(K, V)>; +} +``` + +For completeness, add the following key-recovery methods to +`std::collections::{btree_map, hash_map}::OccupiedEntry`: + +```rust +impl<'a, K, V> OccupiedEntry<'a, K, V> { + // Like `get`, but additionally returns a reference to the entry's key. + fn key_value(&self) -> (&K, &V); + + // Like `get_mut`, but additionally returns a reference to the entry's key. + fn key_value_mut(&mut self) -> (&K, &mut V); + + // Like `into_mut`, but additionally returns a reference to the entry's key. + fn into_key_value_mut(self) -> (&'a K, &'a mut V); + + // Like `remove`, but additionally returns the entry's key.
+ fn remove_key_value(self) -> (K, V); +} +``` + +# Drawbacks + +This complicates the collection APIs. + +The distinction between `insert` and `replace` may be confusing. It would be more consistent to +call `Set::replace` `Set::insert_item` and `Map::replace` `Map::insert_key_value`, but `BTreeMap` +and `HashMap` do not replace equivalent keys in their `insert` methods, so rather than have +`insert` and `insert_key_value` behave differently in that respect, `replace` is used instead. + +# Alternatives + +Do nothing. + +# Unresolved questions + +Are these the best method names? + +Should `std::collections::{btree_map, hash_map}::VacantEntry` provide methods like + +```rust +impl<'a, K, V> VacantEntry<'a, K, V> { + /// Returns a reference to the entry's key. + fn key(&self) -> &K; + + // Like `insert`, but additionally returns a reference to the entry's key. + fn insert_key_value(self, value: V) -> (&'a K, &'a mut V); + + // Returns the entry's key without inserting it into the map. + fn into_key(self) -> K; +} +``` + +Should `{BTreeMap, HashMap}::insert` be changed to replace equivalent keys? This could break code +relying on the old behavior, and would add an additional inconsistency to `OccupiedEntry::insert`. From 272ce9b458b32c1e6a2f4e349e5a2b4458673107 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Wed, 8 Jul 2015 10:58:49 -0700 Subject: [PATCH 0353/1195] First round of changes: remove extraneous traits etc. --- text/0000-simd-infrastructure.md | 176 +++++++++---------------------- 1 file changed, 50 insertions(+), 126 deletions(-) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index 7876de442dd..6fb50f0cb5a 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -19,6 +19,24 @@ This RFC lays the ground-work for building nice SIMD functionality, but doesn't fill everything out. The goal here is to provide the raw types and access to the raw instructions on each platform. 
+## Where does this code go? Aka. why not in `std`? + +This RFC is focused on building stable, powerful SIMD functionality in +external crates, not `std`. + +This makes it much easier to support functionality only "occasionally" +available with Rust's preexisting `cfg` system. If it were in `std`, +there would need to be some highly delayed `cfg` system so that +functions that only work with (say) AVX-2 support: + +- don't break compilation on systems that don't support it, but +- are still usable on systems that do support it. + +With an external crate, we can leverage `cargo`'s existing build +infrastructure: compiling with some target features will rebuild with +those features enabled. + + # Detailed design The design comes in three parts: @@ -39,113 +57,42 @@ There is definitely a common core of SIMD functionality shared across many platforms, but this RFC doesn't try to extract that, it is just building tools that can be wrapped into a more uniform API later. -## Background: Where does this code go? - -This RFC is focused on building stable, powerful SIMD functionality in -external crates, not `std`. This makes it much easier to support -functionality only "occasionally" available with Rust's preexisting -`cfg` system. If it were in `std`, there would need to be some highly -delayed `cfg` system so that functions that only work with AVX-2 -support: - -- don't break compilation on systems that don't support it, but -- are still usable on systems that do support it. ## Types & traits -A type designed to be used as a SIMD vector is indicated by the -`repr(simd)` attribute. A type marked as such will be compiled to -behave like a SIMD register (as well as the target platform can -support it). 
- -The types/traits will be defined as follows: +There are two new attributes: `repr(simd)` and `simd_primitive_trait` ```rust #[repr(simd)] -struct Simd2<T>(T, T); -#[repr(simd)] -struct Simd4<T>(T, T, T, T); -#[repr(simd)] -struct Simd8<T>(T, T, T, T, T, T, T, T); -#[repr(simd)] -struct Simd16<T>(T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T); -#[repr(simd)] -struct Simd32<T>(T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, - T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T); -#[repr(simd)] -struct Simd64<T>(T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, - T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, - T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, - T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T); - -trait SimdVector { - type Elem: SimdPrim; - type Bool: SimdVector::Bool>; -} - -impl for Simd2 { - type Elem = T; - type Bool = Simd2; -} -impl for Simd4 { - type Elem = T; - type Bool = Simd4; -} -// ... -impl for Simd64 { - type Elem = T; - type Bool = Simd64; -} +struct f32x4(f32, f32, f32, f32); -#[simd_prim_trait] -trait SimdPrim { - type Bool: SimdPrim; -} +#[repr(simd)] +struct Simd2<T>(T, T); -// boolean types, see below -struct bool8i(...); -struct bool16i(...); -struct bool32i(...); -struct bool64i(...); -struct bool32f(...); -struct bool64f(...); - -// specifying what types are SIMD-able. -impl SimdPrim for u8 { type Bool = bool8i; } -impl SimdPrim for i8 { type Bool = bool8i; } -impl SimdPrim for u16 { type Bool = bool16i; } -// ... -impl SimdPrim for i64 { type Bool = bool64i; } - -impl SimdPrim for f32 { type Bool = bool32f; } -impl SimdPrim for f64 { type Bool = bool64f; } - -impl SimdPrim for bool8i { type Bool = bool8i; } -// ... -impl SimdPrim for bool64i { type Bool = bool64i; } - -impl SimdPrim for bool32f { type Bool = bool32f; } -impl SimdPrim for bool64f { type Bool = bool64f; } +#[simd_primitive_trait] +trait SimdPrim {} ``` -It is illegal to take an internal reference to the fields of a -`repr(simd)` type.
- ### `repr(simd)` The `simd` `repr` can be attached to a struct and will cause such a -struct to be compiled to a SIMD vector. It is required that the -monomorphised vector consist of only a single "primitive" type, -repeated some number of times. The restrictions on the element type -are exactly the same restrictions as `#[simd_primitive_trait]` traits -impose on their implementing types. +struct to be compiled to a SIMD vector. It can be generic, but it is +required that any fully monomorphised instance of the type consist of +only a single "primitive" type, repeated some number of times. The +restrictions on the element type are exactly the same restrictions as +`#[simd_primitive_trait]` traits impose on their implementing types. The `repr(simd)` may not enforce that the trait bound exists/does the right thing at the type checking level for generic `repr(simd)` types. As such, it will be possible to get the code-generator to error out (ala the old `transmute` size errosr), however, this shouldn't cause problems in practice: libraries wrapping this functionality -would layer type-safety on top (i.e. the `SimdPrim` trait). +would layer type-safety on top (i.e. generic `repr(simd)` types would +use the `SimdPrim` trait as a bound). + +It is illegal to take an internal reference to the fields of a +`repr(simd)` type, because the representation of booleans may require +to change, so that booleans are bit-packed. ### `simd_primitive_trait` @@ -160,35 +107,6 @@ restriction and possibly tweaks type's internal representation (as such, it's legal for a single type to implement multiple traits with the attribute, if a bit pointless). -### Booleans - -SIMD booleans are non-trivial. Many conventional APIs e.g. SSE, and -NEON, use "wide booleans": a large number of bits set to all-zeros -(false) or all-ones, e.g. equality between `Simd4(0_u32, 1, 2, 3)` and -`Simd4(0_u32, 0, 2, 3)` gives (on the CPU) `Simd4(!0_u32, 0, !0, -!0)`. Hence, the boolean types need to have width. 
It's tempting to -just use the integer types of the appropriate width, but this falls -down for two reasons: - -1. booleans aren't always this format -2. the source of the boolean matters - -The second is easiest: CPUs are complicated beasts, and the hardware -that handles floating point vector operations may be very different to -the hardware that handles integer ones: instructions use different -execution units. It can take several cycles to transfer data between -them. Encoding the provenance/execution unit of the value in the type -makes costs explicit. - -The first is much harder to solve. Some architectures/instruction sets -model booleans as single bits. For example, equality between -`Simd4(0_u32, 1, 2, 3)` and `Simd(0_u32, 0, 2, 3)` gives `1 + 4 + 8 == -0b1101`. One example is AVX-512 which essentially replaces all of the -older SSE through AVX2 boolean-returning instructions with versions -that return those. Using separate types for booleans (and restricting -their API) allows for some serious magic: `Simd4` becomes -`u4`. (This is where the reference-restriction above comes in.) - ## Operations CPU vendors usually offer "standard" C headers for their CPU specific @@ -295,19 +213,23 @@ Comparisons are implemented via intrinsics, because the current comparison operator infrastructure doesn't easily lend itself to return vectors, as required. 
-A library could give signatures like: +The raw signatures would look like: ```rust extern "rust-intrinsic" { - fn simd_eq(v: T, w: T) -> T::Bool; - fn simd_ne(v: T, w: T) -> T::Bool; - fn simd_lt(v: T, w: T) -> T::Bool; - fn simd_le(v: T, w: T) -> T::Bool; - fn simd_gt(v: T, w: T) -> T::Bool; - fn simd_ge(v: T, w: T) -> T::Bool; + fn simd_eq<T, U>(v: T, w: T) -> U; + fn simd_ne<T, U>(v: T, w: T) -> U; + fn simd_lt<T, U>(v: T, w: T) -> U; + fn simd_le<T, U>(v: T, w: T) -> U; + fn simd_gt<T, U>(v: T, w: T) -> U; + fn simd_ge<T, U>(v: T, w: T) -> U; } ``` +However, these will be type checked, to ensure that `T` and `U` are +the same length, and that `U` is appropriately shaped for a boolean. A +library actually importing them might use some trait bounds to get +actual type-safety. ### Built-in functionality @@ -340,8 +262,10 @@ exact target e.g. - compiling with `-C target-cpu=native` on a modern CPU might set `target_feature = "avx2"`, `target_feature = "avx"`, ... -(There are other non-SIMD features that might have `target_feature`s -set too, such as `popcnt` and `rdrnd` on x86/x86-64.) +The possible values of `target_feature` will be a selected whitelist, +not necessarily just everything LLVM understands. There are other +non-SIMD features that might have `target_feature`s set too, such as +`popcnt` and `rdrnd` on x86/x86-64. With a `cfg_if_else!` macro that expands to the first `cfg` that is satisfied (ala [@alexcrichton's cascade][cascade]), code might look From 5893b163931a7f27fe724a6cd456ef38f8cd76c0 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Wed, 8 Jul 2015 14:19:10 -0700 Subject: [PATCH 0354/1195] Second round of changes: minor tweaks.
--- text/0000-simd-infrastructure.md | 88 ++++++++++++++++++++------------ 1 file changed, 54 insertions(+), 34 deletions(-) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index 6fb50f0cb5a..a33f3b27068 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -1,4 +1,4 @@ -- Feature Name: simd_basics +- Feature Name: simd_basics, cfg_target_feature - Start Date: 2015-06-02 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) @@ -25,12 +25,12 @@ This RFC is focused on building stable, powerful SIMD functionality in external crates, not `std`. This makes it much easier to support functionality only "occasionally" -available with Rust's preexisting `cfg` system. If it were in `std`, -there would need to be some highly delayed `cfg` system so that -functions that only work with (say) AVX-2 support: - -- don't break compilation on systems that don't support it, but -- are still usable on systems that do support it. +available with Rust's preexisting `cfg` system. There's no way for +`std` to conditionally provide an API based on the target features +used for the final artifact. Building `std` in every configuration is +certainly untenable. Hence, if it were to be in `std`, there would +need to be some highly delayed `cfg` system to support that sort of +conditional API exposure. With an external crate, we can leverage `cargo`'s existing build infrastructure: compiling with some target features will rebuild with @@ -39,11 +39,11 @@ those features enabled. 
# Detailed design -The design comes in three parts: +The design comes in three parts, all on the path to stabilisation: -- types -- operations -- platform detection +- types (`feature(simd_basics)`) +- operations (`feature(simd_basics)`) +- platform detection (`feature(cfg_target_feature)`) The general idea is to avoid bad performance cliffs, so that an intrinsic call in Rust maps to preferably one CPU instruction, or, if @@ -92,7 +92,9 @@ use the `SimdPrim` trait as a bound). It is illegal to take an internal reference to the fields of a `repr(simd)` type, because the representation of booleans may require -to change, so that booleans are bit-packed. +to change, so that booleans are bit-packed. The official external +library providing SIMD support will have private fields so this will +not be generally observable. ### `simd_primitive_trait` @@ -107,6 +109,13 @@ restriction and possibly tweaks type's internal representation (as such, it's legal for a single type to implement multiple traits with the attribute, if a bit pointless). +This trait exists to allow new-type wrappers around primitives to also +be usable in a SIMD context. However, this only works in limited +scenarios (i.e. when the type wraps a single primitive) and so needs +to be an explicit part of every type's API: type authors opt-in to +being designed-for-SIMD. If it was implicit, changes to private fields +may break downstream code. + ## Operations CPU vendors usually offer "standard" C headers for their CPU specific @@ -116,10 +125,13 @@ x86(-64)][x86]. [armneon]: http://infocenter.arm.com/help/topic/com.arm.doc.ihi0073a/IHI0073A_arm_neon_intrinsics_ref.pdf [x86]: https://software.intel.com/sites/landingpage/IntrinsicsGuide -All of these would be exposed as (eventually) stable intrinsics with -names very similar to those that the vendor suggests (only difference -would be some form of manual namespacing, e.g. prefixing with the CPU -target), loadable via an `extern` block with an appropriate ABI. 
+All of these would be exposed as compiler intrinsics with names very +similar to those that the vendor suggests (only difference would be +some form of manual namespacing, e.g. prefixing with the CPU target), +loadable via an `extern` block with an appropriate ABI. This subset of +intrinsics would be on the path to stabilisation (that is, one can +"import" them with `extern` in stable code), and would not be exported +by `std`. ```rust extern "rust-intrinsic" { @@ -164,19 +176,24 @@ platform specific intrinsic for shuffling. ```rust extern "rust-intrinsic" { - fn simd_shuffle2(v: T, w: T, i0: u32, i1: u32) -> Simd2; - fn simd_shuffle4(v: T, w: T, i0: u32, i1: u32, i2: u32, i3: u32) -> Simd4; - fn simd_shuffle8(v: T, w: T, - i0: u32, i1: u32, i2: u32, i3: u32, - i4: u32, i5: u32, i6: u32, i7: u32) -> Simd8; - fn simd_shuffle16(v: T, w: T, - i0: u32, i1: u32, i2: u32, i3: u32, - i4: u32, i5: u32, i6: u32, i7: u32 - i8: u32, i9: u32, i10: u32, i11: u32, - i12: u32, i13: u32, i14: u32, i15: u32) -> Simd16; + fn simd_shuffle2<T, U>(v: T, w: T, i0: u32, i1: u32) -> Simd2<U>; + fn simd_shuffle4<T, U>(v: T, w: T, i0: u32, i1: u32, i2: u32, i3: u32) -> Simd4<U>; + fn simd_shuffle8<T, U>(v: T, w: T, + i0: u32, i1: u32, i2: u32, i3: u32, + i4: u32, i5: u32, i6: u32, i7: u32) -> Simd8<U>; + fn simd_shuffle16<T, U>(v: T, w: T, + i0: u32, i1: u32, i2: u32, i3: u32, + i4: u32, i5: u32, i6: u32, i7: u32, + i8: u32, i9: u32, i10: u32, i11: u32, + i12: u32, i13: u32, i14: u32, i15: u32) -> Simd16<U>; } ``` +The raw definitions are only checked for validity at monomorphisation +time, ensuring that `T` is a SIMD vector, `U` is the element type of `T` +etc. Libraries can use traits to ensure that these will be enforced by +the type checker too. + This approach has some downsides: `simd_shuffle32` (e.g. `Simd32` on AVX, and `Simd32` on AVX-512) and especially `simd_shuffle64` (e.g. `Simd64` on AVX-512) are unwieldy. These have similar type @@ -191,7 +208,8 @@ let z = concat(v, w); return [z[i0], z[i1], z[i2], ...]
``` -The indices `iN` have to be compile time constants. +The indices `iN` have to be compile time constants. Out of bounds +indices yield unspecified results. Similarly, intrinsics for inserting/extracting elements into/out of vectors are provided, to allow modelling the SIMD vectors as actual @@ -199,13 +217,14 @@ CPU registers as much as possible: ```rust extern "rust-intrinsic" { - fn simd_insert(v: T, i0: u32, elem: T::Elem) -> T; - fn simd_extract(v: T, i0: u32) -> T::Elem; + fn simd_insert<T, Elem>(v: T, i0: u32, elem: Elem) -> T; + fn simd_extract<T, Elem>(v: T, i0: u32) -> Elem; } ``` The `i0` indices do not have to be constant. These are equivalent to -`v[i0] = elem` and `v[i0]` respectively. +`v[i0] = elem` and `v[i0]` respectively. They are type checked +similarly to the shuffles. ### Comparisons @@ -226,10 +245,10 @@ extern "rust-intrinsic" { } ``` -However, these will be type checked, to ensure that `T` and `U` are -the same length, and that `U` is appropriately shaped for a boolean. A -library actually importing them might use some trait bounds to get -actual type-safety. +These are type checked during code-generation similarly to the +shuffles, ensuring that `T` and `U` have the same length, and that `U` +is appropriately "boolean"-y. Libraries can use traits to ensure that +these will be enforced by the type checker too. ### Built-in functionality @@ -333,3 +352,4 @@ cfg_if_else! { - Should integer vectors get `/` and `%` automatically? Most CPUs don't support them for vectors. +- How should out-of-bounds shuffle and insert/extract indices be handled?
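Editorial sketch, not part of the patch series: the SIMD patch above describes a shuffle as `z = concat(v, w)` followed by selecting lanes `z[i0], z[i1], ...`. That can be modelled on plain arrays roughly as follows. The name `shuffle4_model` is hypothetical and this is not the proposed intrinsic. Returning `None` for an out-of-bounds index is a choice made only for this sketch; the RFC text says such indices yield unspecified results.

```rust
/// Reference model of a 4-lane shuffle on plain arrays (illustrative only).
/// Concatenates `v` and `w` into 8 lanes, then selects one lane per index.
fn shuffle4_model(v: [u32; 4], w: [u32; 4], idx: [usize; 4]) -> Option<[u32; 4]> {
    // z = concat(v, w)
    let mut z = [0u32; 8];
    z[..4].copy_from_slice(&v);
    z[4..].copy_from_slice(&w);

    // out = [z[i0], z[i1], z[i2], z[i3]]; `get` returns None when an index is >= 8.
    let mut out = [0u32; 4];
    for (slot, &i) in out.iter_mut().zip(idx.iter()) {
        *slot = *z.get(i)?;
    }
    Some(out)
}
```

For instance, `shuffle4_model([10, 11, 12, 13], [20, 21, 22, 23], [0, 4, 1, 5])` interleaves the low halves of the two inputs, while any index of 8 or more makes the model return `None`.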
From d0f18993bcec30928fd43965b0883a876b55aeaf Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 8 Jul 2015 14:36:22 -0700 Subject: [PATCH 0355/1195] RFC 1102 is renaming connect to join --- ...name-connect-to-join.md => 1102-rename-connect-to-join.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-rename-connect-to-join.md => 1102-rename-connect-to-join.md} (95%) diff --git a/text/0000-rename-connect-to-join.md b/text/1102-rename-connect-to-join.md similarity index 95% rename from text/0000-rename-connect-to-join.md rename to text/1102-rename-connect-to-join.md index 0ed07ade10c..35bae6a7d5f 100644 --- a/text/0000-rename-connect-to-join.md +++ b/text/1102-rename-connect-to-join.md @@ -1,7 +1,7 @@ - Feature Name: `rename_connect_to_join` - Start Date: 2015-05-02 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1102](https://github.com/rust-lang/rfcs/pull/1102) +- Rust Issue: [rust-lang/rust#26900](https://github.com/rust-lang/rust/issues/26900) # Summary From a0d4344372279b2ff1f5f025aa65338758613ad1 Mon Sep 17 00:00:00 2001 From: Kevin Ballard Date: Wed, 8 Jul 2015 15:32:00 -0700 Subject: [PATCH 0356/1195] Update with libs team consensus --- text/0000-slice-tail-redesign.md | 44 +++++++++++--------------------- 1 file changed, 15 insertions(+), 29 deletions(-) diff --git a/text/0000-slice-tail-redesign.md b/text/0000-slice-tail-redesign.md index b428cca7b9c..4e4b9ec60ca 100644 --- a/text/0000-slice-tail-redesign.md +++ b/text/0000-slice-tail-redesign.md @@ -5,8 +5,8 @@ # Summary -Replace `slice.tail()`, `slice.init()` with new methods `slice.shift_first()`, -`slice.shift_last()`. +Replace `slice.tail()`, `slice.init()` with new methods `slice.split_first()`, +`slice.split_last()`. # Motivation @@ -20,10 +20,10 @@ remaining methods that panic without taking an explicit index. 
A conservative change here would be to simply change `head()`/`tail()` to return `Option`, but I believe we can do better. These operations are actually specializations of `split_at()` and should be replaced with methods that return -`Option<(T,&[T])>`. This makes the common operation of processing the first/last -element and the remainder of the list more ergonomic, with very low impact on -code that only wants the remainder (such code only has to add `.1` to the -expression). This has an even more significant effect on code that uses the +`Option<(&T,&[T])>`. This makes the common operation of processing the +first/last element and the remainder of the list more ergonomic, with very low +impact on code that only wants the remainder (such code only has to add `.1` to +the expression). This has an even more significant effect on code that uses the mutable variants. # Detailed design @@ -32,21 +32,21 @@ The methods `head()`, `tail()`, `head_mut()`, and `tail_mut()` will be removed, and new methods will be added: ```rust -fn shift_first(&self) -> Option<(&T, &[T])>; -fn shift_last(&self) -> Option<(&T, &[T])>; -fn shift_first_mut(&mut self) -> Option<(&mut T, &mut [T])>; -fn shift_last_mut(&mut self) -> Option<(&mut T, &mut [T])>; +fn split_first(&self) -> Option<(&T, &[T])>; +fn split_last(&self) -> Option<(&T, &[T])>; +fn split_first_mut(&mut self) -> Option<(&mut T, &mut [T])>; +fn split_last_mut(&mut self) -> Option<(&mut T, &mut [T])>; ``` Existing code using `tail()` or `init()` could be translated as follows: * `slice.tail()` becomes `&slice[1..]` * `slice.init()` becomes `&slice[..slice.len()-1]` or - `slice.shift_last().unwrap().1` + `slice.split_last().unwrap().1` It is expected that a lot of code using `tail()` or `init()` is already either testing `len()` explicitly or using `first()` / `last()` and could be refactored -to use `shift_first()` / `shift_last()` in a more ergonomic fashion. 
As an +to use `split_first()` / `split_last()` in a more ergonomic fashion. As an example, the following code from typeck: ```rust @@ -57,7 +57,7 @@ if variant.fields.len() > 0 { can be rewritten as: ```rust -if let Some((_, init_fields)) = variant.fields.shift_last() { +if let Some((_, init_fields)) = variant.fields.split_last() { for field in init_fields { ``` @@ -71,14 +71,14 @@ let args_ = args.tail(); can be rewritten as: ```rust -let (argv0, args_) = args.shift_first().unwrap(); +let (argv0, args_) = args.split_first().unwrap(); ``` (the `clone()` ended up being unnecessary). # Drawbacks -The expression `slice.shift_last().unwrap().1` is more cumbersome than +The expression `slice.split_last().unwrap().1` is more cumbersome than `slice.init()`. However, this is primarily due to the need for `.unwrap()` rather than the need for `.1`, and would affect the more conservative solution (of making the return type `Option<&[T]>`) as well. Furthermore, the more @@ -95,17 +95,3 @@ function names should be (the current names are considered suboptimal). Just deprecate the current methods without adding replacements. This gets rid of the odd methods today, but it doesn't do anything to make it easier to safely perform these operations. - -# Unresolved questions - -Is the name correct? There's precedent in this name in the form of -[`str::slice_shift_char()`][slice_shift_char]. An alternative name might be -`pop_first()`/`pop_last()`, or `shift_front()`/`shift_back()` (although the -usage of `first`/`last` was chosen to match the existing methods `first()` and -`last()`). Another option is `split_first()`/`split_last()`. - -Should `shift_last()` return `Option<(&T, &[T])>` or `Option<(&[T], &T)>`? -I believe that the former is correct with this name, but the latter might be -more suitable given the name `split_last()`. 
- -[slice_shift_char]: http://doc.rust-lang.org/nightly/std/primitive.str.html#method.slice_shift_char From d56f29dd3cf0b2c7ecbfa9ed581dab5cd5a4cc2d Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 8 Jul 2015 16:05:00 -0700 Subject: [PATCH 0357/1195] Recommend #![lang_items_abort] instead. --- text/0000-stabilize-no_std.md | 178 +++++++++------------------------- 1 file changed, 47 insertions(+), 131 deletions(-) diff --git a/text/0000-stabilize-no_std.md b/text/0000-stabilize-no_std.md index 3eb8125aaf8..aa18fa09775 100644 --- a/text/0000-stabilize-no_std.md +++ b/text/0000-stabilize-no_std.md @@ -46,7 +46,7 @@ This RFC proposes a nuber of changes: * Stabilize the `#![no_std]` attribute after tweaking its behavior slightly * Introduce a `#![no_core]` attribute. * Stabilize the name "core" in libcore. -* Stabilize required language items by the core library. +* Introduce a `#![lang_items_abort]` attribute. ## `no_std` @@ -85,87 +85,35 @@ contents will be determined with a future RFC or pull requests. ## Stabilizing lang items -This section will describe the purpose for each lang item currently required in -addition to the interface that it will be stabilized with. Each lang item will -no longer be defined with the `#[lang = "..."]` syntax but will instead receive -a dedicated attribute (e.g. `#[panic_fmt]`) to be attached to functions to -identify an implementation. It should be noted that these language items are -already not quite the same as other `#[lang]` items due to the ability to rely -on them in a "weak" fashion. - -Like lang items each of these will only allow one implementor in any crate -dependency graph which will be verified at compile time. Also like today, none -of these lang items will be required unless a static library, dynamic library, -or executable is being produced. In other words, libraries (rlibs) do not need -(and probably should not) to define these items. 
- -#### `panic_fmt` - -This lang item is the definition of how to panic in Rust. The standard library -defines this by throwing an exception (in a platform-specific manner), but users -of libcore often want to define their own meaning of panicking. The signature of -this function will be: - -```rust -#[panic_fmt] -pub extern fn panic_fmt(msg: &core::fmt::Arguments) -> !; -``` - -This differs with the `panic_fmt` function today in that the file and line -number arguments are omitted. The libcore library will continue to provide -file/line number information in panics (as it does today) by assembling a new -`core::fmt::Arguments` value which uses the old one and appends the file/line -information. - -This signature also differs from today's implementation by taking a `&Arguments` -instead of taking it by value, and the purpose of this is to ensure that the -function has a clearly defined ABI on all platforms in case that is required. - -#### `eh_personality` - -The compiler will continue to compile libcore with landing pads (e.g. cleanup to -run on panics), and a "personality function" is required by LLVM to be available -to call for each landing pad. In the current implementation of panicking, a -personality function is typically just calling a standard personality function -in libgcc (or in MSVC's CRT), but the purpose is to indicate whether an -exception should be caught or whether cleanup should be run for this particular -landing pad and exception combination. - -The exact signature of this function is quite platform-specific, but many users -of libcore will never actually call this function as exceptions will not be -thrown (many will likely compile with `-Z no-landing-pads` anyway). As a result -the signature of this lang item will not be defined, but instead it will simply -be required to be defined (as libcore will reference the symbol name -regardless). - -```rust -#[eh_personality] -pub extern fn eh_personality(...) 
-> ...; -``` - -The compiler will not check the signature of this function, but it will assign -it a known symbol so libcore can be successfully linked. - -#### `stack_exhausted` - -The current implementation of stack overflow in the compiler is to use LLVM's -segmented stack support, inserting a prologue to every function in an object -file to detect when a stack overflow occurred. When a stack overflow is -detected, LLVM emits code that will call the symbol `__morestack`, which the -Rust distribution provides an implementation of. Our implementation, however, -then in turn calls a this `stack_exhausted` language item to define the -implementation of what happens on stack overflow. - -The compiler therefore needs to ensure that this lang item is present in order -for libcore to be correctly linked, so the lang item will have the following -signature: - -```rust -#[stack_exhausted] -pub extern fn stack_exhausted() -> !; -``` - -The compiler will control the symbol name and visibility of this function. +As mentioned above, there are three separate lang items which are required by +the libcore library to link correctly. These items are: + +* `panic_fmt` +* `stack_exhausted` +* `eh_personality` + +This RFC does **not** attempt to stabilize these lang items for a number of +reasons: + +* The exact set of these lang items is somewhat nebulous and may change over + time. +* The signatures of each of these lang items can either be platform-specific or + it's just "too weird" to stabilize. +* These items are pretty obscure and it's not very widely known what they do or + how they should be implemented. + +For `#![no_std]` to be generally useful, however, these lang items *must* be +able to be defined in one form or another on stable Rust, so this RFC proposes a +new crate attribute, `lang_items_abort`, which will define these functions. 
Any +crate tagged with `#![lang_items_abort]` will cause the compiler to generate any +necessary language items to get the program to correctly link. Each lang item +generated will simply abort the program as if it called the `intrinsics::abort` +function. + +This attribute will behave the same as `#[lang]` in terms of uniqueness, two +crates declaring `#![lang_items_abort]` cannot be linked together and an +upstream crate declaring this attribute means that no downstream crate has to +worry about it. # Drawbacks @@ -189,41 +137,12 @@ This RFC just enables creation of Rust static or dynamic libraries which don't depend on the standard library in addition to Rust libraries (rlibs) which do not depend on the standard library. -On the topic of lang item stabilization, it's likely expected that the -`panic_fmt` lang item must be defined, but the other two, `eh_personality` and -`stack_exhausted` are generally quite surprising. Code using `#![no_std]` is -also likely to very rarely actually make use of these functions: - -* Most no-std contexts don't throw exceptions (or don't have exceptions), so - they either have stubs that panic or just compile with `-Z no-landing-pads`, - so the `eh_personality` may not strictly be necessary to be defined in order - to link against libcore. -* Additionally, most no-std contexts don't actually set up stack overflow - detection, so the `stack_exhausted` function will either never be compiled or - the crates are compiled with `-C no-stack-check` meaning that the item may not - strictly be necessary to be defined. - -Currently, however, a binary distribution of libcore is provided which is -compiled with unwinding and stack overflow checks enabled. Consequently the -libcore library does indeed depend on these two symbols and require these items -to be defined. 
It is seen as not-that-large of a drawback for the following -reasons: - -* The functions `eh_personality` and `stack_exhausted` are fairly easy to - define, and are only required by end products (not Rust libraries). -* It's easy for the compiler to *stop* requiring these functions to be defined - in the future if we, for example, provide multiple binary copies of libcore in - the standard distribution. - -Another drawback of this RFC is the overall stabilization of the `#![no_std]` -attribute, meaning that the compiler will no longer be able to make assumptions -in the future about a function being defined. Put another way, the `panic_fmt`, -`eh_personality`, and `stack_exhausted` lang items are the only three that will -ever be able to be required to be defined by downstream crates. This is not seen -as too strong of a drawback as it's not clear that the compiler will need to -assume more functions exist. Additionally, the compiler will likely be able to -provide or emit a stub implementation for any future symbol it does need to -exist. +On the topic of lang items, it's somewhat unfortunate that the implementation of +a panic cannot be defined on stable Rust. The `#![lang_items_abort]` attribute +unconditionally defines all lang items, including `panic_fmt`, so it's not +possible to provide a custom implementation of the `panic_fmt` lang item while +still asking the compiler to define others like `eh_personality` and +`stack_exhausted`. In stabilizing the `#![no_std]` attribute it's likely that a whole ecosystem of crates will arise which work with `#![no_std]`, but in theory all of these @@ -242,16 +161,10 @@ happen: import the core prelude manually. The burden of adding `#![no_core]` to the compiler, however, is seen as not-too-bad compared to the increase in ergonomics of using `#![no_std]`. -* The language items could continue to use the same `#[lang = "..."]` syntax and - we could just stabilize a subset of the `#[lang]` items. 
It seems more - consistent, however, to blanket feature-gate all `#[lang]` attributes instead - of allowing three particular ones, so individual attributes are proposed. -* The `panic_fmt` lang item could retain the same signature today, but it has an - unclear ABI (passing `Arguments` by value) and we may not want to 100% commit - to always passing filename/line information on panics. -* The `eh_personality` and `stack_exhausted` lang items could not be required to - be defined, and the compiler could provide aborting stubs to be linked in if - they aren't defined anywhere else. +* The lang items could not be required to be defined, and the compiler could + provide aborting stubs to be linked in if they aren't defined anywhere else. + This has the downside of perhaps silently aborting a program, however, without + an explicit opt-in. * The compiler could not require `eh_personality` or `stack_exhausted` if no crate in the dependency tree has landing pads enabled or stack overflow checks enabled. This is quite a difficult situation to get into today, however, as @@ -259,13 +172,16 @@ happen: provide a method to configure this when compiling crates. The overhead of defining these functions seems small and because the compiler could stop requiring them in the future it seems plausibly ok to require them today. -* A `#[lang_items_abort]` attribute could be added to explicitly define the the - `eh_personality` and `stack_exhausted` lang items to immediately abort. This - would avoid us having to stabilize their signatures as we could stabilize just - this attribute and not their definitions. +* The lang items could be stabilized at this time instead of providing a way to + have the compiler generate an appropriate function. The downsides of this + approach, however, were listed above. * The various language items could not be stabilized at this time, allowing stable libraries that leverage `#![no_std]` but not stable final artifacts (e.g. 
staticlibs, dylibs, or binaries). +* Another stable crate could be provided by the distribution which provides + definitions of these lang items which are all wired to abort. This has the + downside of selecting a name for this crate, however, and also inflating the + crates in our distribution again. # Unresolved Questions From 2604c80fb4b88d1ee56a4668cbb94e46a13481d9 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 8 Jul 2015 16:32:54 -0700 Subject: [PATCH 0358/1195] RFC 1058 is reworking slice::{init, tail} --- ...000-slice-tail-redesign.md => 1058-slice-tail-redesign.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-slice-tail-redesign.md => 1058-slice-tail-redesign.md} (95%) diff --git a/text/0000-slice-tail-redesign.md b/text/1058-slice-tail-redesign.md similarity index 95% rename from text/0000-slice-tail-redesign.md rename to text/1058-slice-tail-redesign.md index 4e4b9ec60ca..194073f4391 100644 --- a/text/0000-slice-tail-redesign.md +++ b/text/1058-slice-tail-redesign.md @@ -1,7 +1,7 @@ - Feature Name: `slice_tail_redesign` - Start Date: 2015-04-11 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1058](https://github.com/rust-lang/rfcs/pull/1058) +- Rust Issue: [rust-lang/rust#26906](https://github.com/rust-lang/rust/issues/26906) # Summary From f9e48d11f9f9c8c7977da2cc371103e94783c33c Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Thu, 9 Jul 2015 09:28:07 -0700 Subject: [PATCH 0359/1195] Clarify/fix typos. 
--- text/0000-simd-infrastructure.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index a33f3b27068..a132e856d6d 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -64,7 +64,7 @@ There are two new attributes: `repr(simd)` and `simd_primitive_trait` ```rust #[repr(simd)] -struct f32x4(f32, f32, f23, f23); +struct f32x4(f32, f32, f32, f32); #[repr(simd)] struct Simd2(T, T); @@ -261,9 +261,10 @@ SIMD vectors can be converted with `as`. As with intrinsics, this is their lengths match and their elements are castable (i.e. are primitives), there's no enforcement of nominal types. -All of these are never checked: explicit SIMD is essentially only -required for speed, and checking inflates one instruction to 5 or -more. +All of these operators and conversions are never checked (in the sense +of the arithmetic overflow checks of `-C debug-assertions`): explicit +SIMD is essentially only required for speed, and checking inflates one +instruction to 5 or more. ## Platform Detection From e2fc223c99bea7aadf39f0859e5e7f0244175e5b Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Thu, 9 Jul 2015 09:59:58 -0700 Subject: [PATCH 0360/1195] Note that fixed-length arrays could be repr(simd)'d. --- text/0000-simd-infrastructure.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index a132e856d6d..af5fb0b0d15 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -348,6 +348,11 @@ cfg_if_else! { to be worth it. (Each "feature" would essentially be a platform specific `cfg` anyway.) - Check vector operators in debug mode just like the scalar versions. +- Make fixed length arrays `repr(simd)`-able (via just flattening), so + that, say, `#[repr(simd)] struct u32x4([u32; 4]);` and + `#[repr(simd)] struct f64x8([f64; 4], [f64; 4]);` etc works. 
This + will be most useful if/when we allow generic-lengths, `#[repr(simd)] + struct Simd<T, n>([T; n]);` # Unresolved questions From efeafdb770728c3fe57a9e34da31421f38740943 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Thu, 9 Jul 2015 17:10:37 -0700 Subject: [PATCH 0361/1195] Remove the simd_primitive_trait attribute. Not really necessary: the type safety it offers can be provided by libraries. --- text/0000-simd-infrastructure.md | 53 ++++++++++---------------------- 1 file changed, 16 insertions(+), 37 deletions(-) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index af5fb0b0d15..a2a6e9884ed 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -58,9 +58,9 @@ many platforms, but this RFC doesn't try to extract that, it is just building tools that can be wrapped into a more uniform API later. -## Types & traits +## Types -There are two new attributes: `repr(simd)` and `simd_primitive_trait` +There is a new attribute: `repr(simd)`. ```rust #[repr(simd)] @@ -68,54 +68,30 @@ struct f32x4(f32, f32, f32, f32); #[repr(simd)] struct Simd2<T>(T, T); - -#[simd_primitive_trait] -trait SimdPrim {} ``` -### `repr(simd)` - The `simd` `repr` can be attached to a struct and will cause such a struct to be compiled to a SIMD vector. It can be generic, but it is required that any fully monomorphised instance of the type consist of -only a single "primitive" type, repeated some number of times. The -restrictions on the element type are exactly the same restrictions as -`#[simd_primitive_trait]` traits impose on their implementing types. +only a single "primitive" type, repeated some number of times. Types +are flattened, so, for `struct Bar(u64);`, `Simd2<Bar>` has the same +representation as `Simd2<u64>`. -The `repr(simd)` may not enforce that the trait bound exists/does the +The `repr(simd)` may not enforce that any trait bounds exists/does the right thing at the type checking level for generic `repr(simd)` types. 
As such, it will be possible to get the code-generator to error -out (ala the old `transmute` size errosr), however, this shouldn't +out (ala the old `transmute` size errors), however, this shouldn't cause problems in practice: libraries wrapping this functionality would layer type-safety on top (i.e. generic `repr(simd)` types would -use the `SimdPrim` trait as a bound). +use some `unsafe` trait as a bound that is designed to only be +implemented by types that will work). It is illegal to take an internal reference to the fields of a `repr(simd)` type, because the representation of booleans may require -to change, so that booleans are bit-packed. The official external +modification, so that booleans are bit-packed. The official external library providing SIMD support will have private fields so this will not be generally observable. -### `simd_primitive_trait` - -Traits marked with the `simd_primitive_trait` attribute are special: -types implementing it are those that can be stored in SIMD -vectors. Initially, only primitives and single-field structs that -store `SimdPrim` types will be allowed to implement it. - -This is explicitly not a lang item: it is legal to have multiple -distinct traits in a compilation. The attribute just adds the -restriction and possibly tweaks type's internal representation (as -such, it's legal for a single type to implement multiple traits with -the attribute, if a bit pointless). - -This trait exists to allow new-type wrappers around primitives to also -be usable in a SIMD context. However, this only works in limited -scenarios (i.e. when the type wraps a single primitive) and so needs -to be an explicit part of every type's API: type authors opt-in to -being designed-for-SIMD. If it was implicit, changes to private fields -may break downstream code. - ## Operations CPU vendors usually offer "standard" C headers for their CPU specific @@ -312,8 +288,8 @@ cfg_if_else! 
{ # Extensions - scatter/gather operations allow (partially) operating on a SIMD - vector of pointers. This would require extending `SimdPrim` to also - allow pointer types. + vector of pointers. This would require allowing + pointers(/references?) in `repr(simd)` types. - allow (and ignore for everything but type checking) zero-sized types in `repr(simd)` structs, to allow tagging them with markers @@ -353,9 +329,12 @@ cfg_if_else! { `#[repr(simd)] struct f64x8([f64; 4], [f64; 4]);` etc works. This will be most useful if/when we allow generic-lengths, `#[repr(simd)] struct Simd([T; n]);` +- have 100% guaranteed type-safety for generic `#[repr(simd)]` types + and the generic intrinsics. This would probably require a relatively + complicated set of traits (with compiler integration). # Unresolved questions - Should integer vectors get `/` and `%` automatically? Most CPUs - don't support them for vectors. + don't support them for vectors. However - How should out-of-bounds shuffle and insert/extract indices be handled? From a7c409b3291758eab122d9bef034dfc2f255fb0e Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Fri, 10 Jul 2015 10:43:14 -0700 Subject: [PATCH 0362/1195] Mention alignment changes due to repr(simd). --- text/0000-simd-infrastructure.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index a2a6e9884ed..f815bb4b9c2 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -92,6 +92,10 @@ modification, so that booleans are bit-packed. The official external library providing SIMD support will have private fields so this will not be generally observable. +Adding `repr(simd)` to a type may increase its minimum/preferred +alignment, based on platform behaviour. (E.g. x86 wants its 128-bit +SSE vectors to be 128-bit aligned.) 
+ ## Operations CPU vendors usually offer "standard" C headers for their CPU specific From f4e2ecfbe09c0798433f49785a625e8d54ac9b04 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Fri, 10 Jul 2015 10:48:55 -0700 Subject: [PATCH 0363/1195] Note pre-RFC discussion. --- text/0000-simd-infrastructure.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index f815bb4b9c2..0c6af575682 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -19,6 +19,9 @@ This RFC lays the ground-work for building nice SIMD functionality, but doesn't fill everything out. The goal here is to provide the raw types and access to the raw instructions on each platform. +(An earlier variant of this RFC was discussed as a +[pre-RFC](https://internals.rust-lang.org/t/pre-rfc-simd-groundwork/2343).) + ## Where does this code go? Aka. why not in `std`? This RFC is focused on building stable, powerful SIMD functionality in From 30bf26b23bff6515d893f4bc8267963e087b502f Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 10 Jul 2015 12:59:43 -0700 Subject: [PATCH 0364/1195] RFC: Add `cargo install` Add a new subcommand to Cargo, `install`, which will install `[[bin]]`-based packages onto the local system in a Cargo-specific directory. --- text/0000-cargo-install.md | 271 +++++++++++++++++++++++++++++++++++++ 1 file changed, 271 insertions(+) create mode 100644 text/0000-cargo-install.md diff --git a/text/0000-cargo-install.md b/text/0000-cargo-install.md new file mode 100644 index 00000000000..f6143295a85 --- /dev/null +++ b/text/0000-cargo-install.md @@ -0,0 +1,271 @@ +- Feature Name: N/A +- Start Date: 2015-07-10 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Add a new subcommand to Cargo, `install`, which will install `[[bin]]`-based +packages onto the local system in a Cargo-specific directory. 
+ +# Motivation + +There has [almost always been a desire][cargo-37] to be able to install Cargo +packages locally, but it's been somewhat unclear over time what the precise +meaning of this is. Now that we have crates.io and lots of experience with +Cargo, however, the niche that `cargo install` would fill is much clearer. + +[cargo-37]: https://github.com/rust-lang/cargo/issues/37 + +Fundamentally, however, Cargo is a ubiquitous tool among the Rust community and +implementing `cargo install` would facilitate sharing Rust code among its +developers. Simple tasks like installing a new cargo subcommand, installing an +editor plugin, etc, would be just a `cargo install` away. Cargo can manage +dependencies, versions, updates, etc, itself to make the process as seamless as +possible. + +Put another way, enabling easily sharing code is one of Cargo's fundamental +design goals, and expanding into binaries is simply an extension of Cargo's core +functionality. + +# Detailed design + +The following new subcommand will be added to Cargo: + +``` +Install a crate onto the local system + +Installing new crates: + cargo install [options] + cargo install [options] [-p CRATE | --package CRATE] [--vers VERS] + cargo install [options] --git URL [--branch BRANCH | --tag TAG | --rev SHA] + cargo install [options] --path PATH + +Managing installed crates: + cargo install [options] --list + cargo install [options] --update [SPEC | --all] + +Options: + -h, --help Print this message + -j N, --jobs N The number of jobs to run in parallel + --features FEATURES Space-separated list of features to activate + --no-default-features Do not build the `default` feature + --debug Build in debug mode instead of release mode + --bin NAME Only install the binary NAME + --example EXAMPLE Install the example EXAMPLE instead of binaries + -p, --package CRATE Install this crate from crates.io or select the + package in a repository/path to install. 
+ -v, --verbose Use verbose output + +This command manages Cargo's local set of installed binary crates. Only packages +which have [[bin]] targets can be installed, and all binaries are installed into +`$HOME/.cargo/bin` by default (or `$CARGO_HOME/bin` if you change the home +directory). + +There are multiple methods of installing a new crate onto the system. The +`cargo install` command with no arguments will install the current crate (as +specified by the current directory). Otherwise the `-p`, `--package`, `--git`, +and `--path` options all specify the source from which a crate is being +installed. The `-p` and `--package` options will download crates from crates.io. + +Crates from crates.io can optionally specify the version they wish to install +via the `--vers` flag, and similarly packages from git repositories can +optionally specify the branch, tag, or revision that should be installed. If a +crate has multiple binaries, the `--bin` argument can selectively install only +one of them, and if you'd rather install examples the `--example` argument can +be used as well. + +The `--list` option will list all installed packages (and their versions). The +`--update` option will update either the crate specified or all installed +crates. +``` + +## Installing Crates + +Cargo attempts to be as flexible as possible in terms of installing crates from +various locations and specifying what should be installed. All binaries will be +stored in the **cargo-local** directory `CARGO_HOME/bin`. This is typically +`$HOME/.cargo/bin` but the home directory can be modified via the `$CARGO_HOME` +environment variable. + +Cargo will not attempt to install binaries or crates into system directories +(e.g. `/usr`) as that responsibility is intended for system package managers. + +To use installed crates one just needs to add the binary path to their `PATH` +environment variable. 
This will be recommended when `cargo install` is run if +`PATH` does not already look like it's configured. + +#### Crate Sources + +The `cargo install` command will be able to install crates from any source that +Cargo already understands. For example it will start off being able to install +from crates.io, git repositories, and local paths. Like with normal +dependencies, downloads from crates.io can specify a version, git repositories +can specify branches, tags, or revisions. + +#### Sources with multiple crates + +Sources like git repositories and paths can have multiple crates inside them, +and Cargo needs a way to figure out which one is being installed. If there is +more than one crate in a repo (or path), then Cargo will apply the following +heuristics to select a crate, in order: + +1. If the `-p` argument is specified, use that crate. +2. If only one crate has binaries, use that crate. +3. If only one crate has examples, use that crate. +4. Print an error suggesting the `-p` flag. + +#### Multiple binaries in a crate + +Once a crate has been selected, Cargo will by default build all binaries and +install them. This behavior can be modified with the `--bin` or `--example` +flags to configure what's installed on the local system. + +#### Building a Binary + +The `cargo install` command has some standard build options found on `cargo +build` and friends, but a key difference is that `--release` is the default for +installed binaries so a `--debug` flag is present to switch this back to +debug-mode. Otherwise the `--features` flag can be specified to activate various +features of the crate being installed. + +The `--target` option is omitted as `cargo install` is not intended for creating +cross-compiled binaries to ship to other platforms. + +#### Conflicting Crates + +Cargo will not namespace the installation directory for crates, so conflicts may +arise in terms of binary names. 
For example if crates A and B both provide a +binary called `foo` they cannot both be installed at once. Cargo will reject +these situations and recommend that a binary is selected via `--bin` or the +conflicting crate is uninstalled. + +## Managing Installations + +If Cargo gives access to installing packages, it should surely provide the +ability to manage what's installed! The first part of this is just discovering +what's installed, and this is provided via `cargo install --list`. A more +interesting aspect is the `cargo install --update` command. + +#### Updating Crates + +Once a crate is installed new versions can be released or perhaps the build +configuration wants to be tweaked, so Cargo will provide the ability to update +crates in-place. By default *something* needs to be specified to the `--update` +flag, either a specific crate that's been installed or the `--all` flag to +update all crates. Because multiple crates of the same name can come from +different sources, the argument to the `--update` flag will be a package id +specification instead of just the name of a crate. + +When updating a crate, it will first attempt to update the source code for the +crate. For crates.io sources this means that it will download the most recent +version. For git sources it means the git repo will be updated, but the same +branch/tag will be used (if originally specified when installed). Git sources +installed via `--rev` won't be updated. + +After the source code has been updated, the crate will be rebuilt according to +the flags specified on the command line. This will override the flags that were +previously used to install a crate, for example activated features are not +remembered. 
+ +#### Removing Crates + +To remove an installed crate, another subcommand will be added to Cargo: + +``` +Remove a locally installed crate + +Usage: + cargo uninstall [options] SPEC + +Options: + -h, --help Print this message + --bin NAME Only uninstall the binary NAME + --example EXAMPLE Only uninstall the example EXAMPLE + -v, --verbose Use verbose output + +The argument SPEC is a package id specification (see `cargo help pkgid`) to +specify which crate should be uninstalled. By default all binaries are +uninstalled for a crate but the `--bin` and `--example` flags can be used to +only uninstall particular binaries. +``` + +Cargo won't remove the source for uninstalled crates, just the binaries that +were installed by Cargo itself. + +## Non-binary artifacts + +Cargo will not currently attempt to manage anything other than a binary artifact +of `cargo build`. For example the following items will not be available to +installed crates: + +* Dynamic native libraries built as part of `cargo build`. +* Native assets such as images not included in the binary itself. +* The source code is not guaranteed to exist, and the binary doesn't know where + the source code is. + +Additionally, Cargo will not immediately provide the ability to configure the +installation stage of a package. There is often a desire for a "pre-install +script" which runs various house-cleaning tasks. This is left as a future +extension to Cargo. + +# Drawbacks + +Beyond the standard "this is more surface area" and "this may want to +aggressively include more features initially" concerns there are no known +drawbacks at this time. + +# Alternatives + +### System Package Managers + +The primary alternative to putting effort behind `cargo install` is to instead +put effort behind system-specific package managers. 
For example the line between +a system package manager and `cargo install` is a little blurry, and the +"official" way to distribute a package should in theory be through a system +package manager. This also has the upside of benefiting those outside the Rust +community as you don't have to have Cargo installed to manage a program. This +approach is not without its downsides, however: + +* There are *many* system package managers, and it's unclear how much effort it + would be for Cargo to support building packages for all of them. +* Actually preparing a package for being packaged in a system package manager + can be quite onerous and is often associated with a high amount of overhead. +* Even once a system package is created, it must be added to an online + repository in one form or another which is often different for each + distribution. + +All in all, even if Cargo invested effort in facilitating creation of system +packages, **the threshold for distributing a Rust program is still too high**. +If everything went according to plan it's just unfortunately inherently complex +to only distribute packages through a system package manager because of the +various requirements and how diverse they are. The `cargo install` command +provides a cross-platform, easy-to-use, if Rust-specific, interface to installing +binaries. + +It is expected that all major Rust projects will still invest effort into +distribution through standard package managers, and Cargo will certainly have +room to help out with this, but it doesn't obsolete the need for +`cargo install`. + +### Installing Libraries + +Another possibility for `cargo install` is to not only be able to install +binaries, but also libraries. The meaning of this, however, is pretty nebulous +and it's not clear that it's worthwhile. For example all Cargo builds will not +have access to these libraries (as Cargo retains control over dependencies). 
It +may mean that normal invocations of `rustc` have access to these libraries (e.g. +for small one-off scripts), but it's not clear that this is worthwhile enough to +support installing libraries yet. + +Another possible interpretation of installing libraries is that a developer is +informing Cargo that the library should be available in a pre-compiled form. If +any compile ends up using the library, then it can use the precompiled form +instead of recompiling it. This job, however, seems best left to `cargo build` +as it will automatically handle when the compiler version changes, for example. +It may also be more appropriate to add the caching layer at the `cargo build` +layer instead of `cargo install`. + +# Unresolved questions + +None yet From 672f98d2f8b497334ae8a8efc2a04d0fcb82b3c5 Mon Sep 17 00:00:00 2001 From: Peter Marheine Date: Thu, 18 Jun 2015 09:42:25 -0600 Subject: [PATCH 0365/1195] Add support for naked functions. --- text/0000-naked-fns.md | 114 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 114 insertions(+) create mode 100644 text/0000-naked-fns.md diff --git a/text/0000-naked-fns.md b/text/0000-naked-fns.md new file mode 100644 index 00000000000..0a9df61f643 --- /dev/null +++ b/text/0000-naked-fns.md @@ -0,0 +1,114 @@ +- Feature Name: `naked_fns` +- Start Date: 2015-07-10 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Add support for generating naked (prologue/epilogue-free) functions via a new +function attribute. + +# Motivation + +Some systems programming tasks require that machine state not be modified at all +on function entry so it can be preserved- particularly in interrupt handlers. +For example, x86\_64 preserves only the stack pointer, flags register, and +instruction pointer on interrupt entry. To avoid corrupting program state, the +interrupt handler must save the registers which might be modified before handing +control to compiler-generated code. 
Consider a contrived interrupt handler: + +```rust +unsafe fn isr_nop() { + asm!("push %rax" + /* Additional pushes elided */ :::: "volatile"); + let n = 0u64; + asm!("pop %rax" + /* Additional pops elided */ :::: "volatile"); +} +``` + +The generated assembly for this function might resemble the following +(simplified for readability): + +```x86 +isr_nop: + sub $8, %rsp + push %rax + movq $0, 0(%rsp) + pop %rax + add $8, %rsp + retq +``` + +Here the programmer's need to save machine state conflicts with the compiler's +assumption that it has complete control over stack layout, with the result that +the saved value of `rax` is clobbered by the compiler. Given that details of +stack layout for any given function are not predictable (and may change with +compiler version or optimization settings), attempting to predict the stack +layout to sidestep this issue is infeasible. + +In other languages (particularly C), "naked" functions omit the prologue and +epilogue (represented by the modifications to `rsp` in the above example) to +allow the programmer complete control over stack layout. This makes the +availability of stack space for compiler use unpredictable, usually implying +that the body of such a function must consist entirely of inline assembly +statements (such as a jump or call to another function). + +The [LLVM language +reference](http://llvm.org/docs/LangRef.html#function-attributes) describes this +feature as having "very system-specific consequences", which the programmer must +be aware of. + +# Detailed design + +Add a new function attribute to the language, `#[naked]`, indicating the +function should have prologue/epilogue emission disabled. 
+ +For example, the following construct could be assumed not to generate extra code +on entry to `isr_caller` which might violate the programmer's assumptions, while +allowing the compiler to generate the function definition as usual: + +```rust +#[naked] +unsafe fn isr_caller() { + asm!("push %rax + call other_function + pop %rax + iretq" :::: "volatile"); + core::intrinsics::unreachable(); +} + +#[no_mangle] +pub fn other_function() { + +} +``` + +# Drawbacks + +The utility of this feature is extremely limited to most users, and it might be +misused if the implications of writing a naked function are not carefully +considered. + +# Alternatives + +Do nothing. The required functionality for the use case outlined can be +implemented outside Rust code (such as with a small amount of externally-built +assembly) and merely linked in as needed. + +Add a new calling convention (`extern "interrupt" fn ...`) which is defined to +do any necessary state saving for interrupt service routines. This permits more +efficient code to be generated for the motivating example (omitting a 'call' +instruction which is necessary for any non-trivial ISR), but may not be +appropriate for other situations that might call for a naked function. +Implementation of additional calling conventions like this in the current +`rustc` would involve significant modification to LLVM to support it (whereas +the proof-of-concept patch for `#[naked]` is less than 10 lines of code). + +# Unresolved questions + +It is easy to quietly generate wrong code in naked functions, such as by causing +the compiler to allocate stack space for temporaries where none were +anticipated. It may be desirable to allow the `#[naked]` attribute on `unsafe` +functions only, reinforcing the need for extreme care in the use of this +feature. 
From 8e0f4f277f3650bb8a3951652844a00750f50f72 Mon Sep 17 00:00:00 2001 From: Ralf Jung Date: Fri, 10 Jul 2015 14:33:07 +0200 Subject: [PATCH 0366/1195] Add RFC line_endings --- text/0000-line-endings.md | 70 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 70 insertions(+) create mode 100644 text/0000-line-endings.md diff --git a/text/0000-line-endings.md b/text/0000-line-endings.md new file mode 100644 index 00000000000..440f07ea401 --- /dev/null +++ b/text/0000-line-endings.md @@ -0,0 +1,70 @@ +- Feature Name: line_endings +- Start Date: 2015-07-10 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Change all functions dealing with reading "lines" to treat both '\n' and '\r\n' +as a valid line-ending. + +# Motivation + +The current behavior of these functions is to treat only '\n' as line-ending. +This is surprising for programmers experienced in other languages. Many +languages open files in a "text-mode" per default, which means when they iterate +over the lines, they don't have to worry about the two kinds of line-endings. +Such programmers will be surprised to learn that they have to take care of such +details themselves in Rust. Some may not even have heard of the distinction +between two styles of line-endings. + +The current design also violates the "do what I mean" principle. Both '\r\n' and +'\n' are widely used as line-separators. By talking about the concept of +"lines", it is clear that the current file (or buffer, really) is considered to +be in text format. It is thus very reasonable to expect "lines" to apply to both +kinds of encoding lines in binary format. + +In particular, if the crate is developed on Linux or Mac, the programmer will +probably have most of his input encoded with only '\n' for the line-endings. He +may use the functions talking about "lines", and they will work all right. It is +only when someone runs this crate on input that contains '\r\n' that the bug +will be uncovered. 
The editor has personally run into this issue when reading +line-by-line from stdin, with the program suddenly failing on Windows. + +# Detailed design + +The following functions will have to be changed: `BufRead::lines` and +`str::lines`. They both should treat '\r\n' as marking the end of a line. This +can be implemented, for example, by first splitting at '\n' like now and then +removing a trailing '\r' right before returning data to the caller. + +Furthermore, `str::lines_any` (the only function currently dealing with both +kinds of line-endings) is deprecated, as it is then functionally equivalent to +`str::lines`. + +# Drawbacks + +This is a semantics-breaking change, changing the behavior of released, stable +API. However, as argued above, the new behavior is much less surprising than the +old one - so one could consider this fixing a bug in the original +implementation. There are alternatives available for the case that one really +wants to split at '\n' only, namely `BufRead::split` and `str::split`. However, +`BufRead::split` does not iterate over `String`, but rather over `Vec<u8>`, so +users have to insert an additional explicit call to `String::from_utf8`. + +# Alternatives + +There's the obvious alternative of not doing anything. This leaves a gap in the +features Rust provides to deal with text files, making it hard to treat both +kinds of line-endings uniformly. + +The second alternative is to add `BufRead::lines_any` which works similarly to +`str::lines_any` in that it deals with both '\n' and '\r\n'. This provides all +the necessary functionality, but it still leaves people with the need to choose +one of the two functions - and potentially choosing the wrong one. In +particular, the functions with the shorter, nicer name (the existing ones) will +almost always *not* be the right choice. + +# Unresolved questions + +None I can think of.
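The implementation strategy suggested in the line-endings RFC above (split at '\n', then drop one trailing '\r' from each piece) might be sketched roughly as follows. `lines_crlf` is a hypothetical free function used only for illustration; it is not a standard library API, and it is not necessarily how `str::lines` or `BufRead::lines` would actually be changed.

```rust
/// Hypothetical helper (not part of std): split `text` into lines,
/// accepting both "\n" and "\r\n" as line endings.
fn lines_crlf(text: &str) -> Vec<&str> {
    // An empty input contains no lines at all.
    if text.is_empty() {
        return Vec::new();
    }
    // A final '\n' terminates the last line; it does not start a new one.
    let body = text.strip_suffix('\n').unwrap_or(text);
    body.split('\n')
        // Remove a single trailing '\r' right before handing the line out.
        .map(|line| line.strip_suffix('\r').unwrap_or(line))
        .collect()
}
```

Under these assumptions, input that mixes both conventions yields the same lines as input using '\n' only, which is the uniform behavior the RFC argues for.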
From 85f16e252a0bdfbc59a8678c6600bc9cb8a06d0f Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Sat, 11 Jul 2015 14:22:02 -0700 Subject: [PATCH 0367/1195] Fix typo --- text/0000-cargo-install.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-cargo-install.md b/text/0000-cargo-install.md index f6143295a85..8214f99647a 100644 --- a/text/0000-cargo-install.md +++ b/text/0000-cargo-install.md @@ -219,7 +219,7 @@ drawbacks at this time. ### System Package Managers -The primary alternative to putting effort behind `cargo install` it to instead +The primary alternative to putting effort behind `cargo install` is to instead put effort behind system-specific package managers. For example the line between a system package manager and `cargo install` is a little blurry, and the "official" way to distribute a package should in theory be through a system From 362090e0d19838ba5ab18b26ade6d0ef4bffe630 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Sat, 11 Jul 2015 14:22:41 -0700 Subject: [PATCH 0368/1195] Add a dollor for an env var --- text/0000-cargo-install.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-cargo-install.md b/text/0000-cargo-install.md index 8214f99647a..d4fa407a6bd 100644 --- a/text/0000-cargo-install.md +++ b/text/0000-cargo-install.md @@ -84,7 +84,7 @@ crates. Cargo attempts to be as flexible as possible in terms of installing crates from various locations and specifying what should be installed. All binaries will be -stored in the **cargo-local** directory `CARGO_HOME/bin`. This is typically +stored in the **cargo-local** directory `$CARGO_HOME/bin`. This is typically `$HOME/.cargo/bin` but the home directory can be modified via the `$CARGO_HOME` environment variable. 
From 4083ad7a0b852b800d360467624b3ce773d0a9b2 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 13 Jul 2015 10:13:31 -0700 Subject: [PATCH 0369/1195] Update FOLLOW set for `ty` tokens Accounts for rust-lang/rust#27000 --- text/0550-macro-future-proofing.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0550-macro-future-proofing.md b/text/0550-macro-future-proofing.md index d9012492808..42f0608d8de 100644 --- a/text/0550-macro-future-proofing.md +++ b/text/0550-macro-future-proofing.md @@ -106,7 +106,7 @@ The current legal fragment specifiers are: `item`, `block`, `stmt`, `pat`, - `FOLLOW(pat)` = `{FatArrow, Comma, Eq}` - `FOLLOW(expr)` = `{FatArrow, Comma, Semicolon}` -- `FOLLOW(ty)` = `{Comma, FatArrow, Colon, Eq, Gt, Ident(as)}` +- `FOLLOW(ty)` = `{Comma, FatArrow, Colon, Eq, Gt, Ident(as), Semi}` - `FOLLOW(stmt)` = `FOLLOW(expr)` - `FOLLOW(path)` = `FOLLOW(ty)` - `FOLLOW(block)` = any token From 3214bb31123381de000798b73112846fe3006a4d Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 13 Jul 2015 10:12:57 -0700 Subject: [PATCH 0370/1195] Clarify where packages are installed --- text/0000-cargo-install.md | 24 +++++++++++++++++++++--- 1 file changed, 21 insertions(+), 3 deletions(-) diff --git a/text/0000-cargo-install.md b/text/0000-cargo-install.md index d4fa407a6bd..5561b246d13 100644 --- a/text/0000-cargo-install.md +++ b/text/0000-cargo-install.md @@ -56,6 +56,7 @@ Options: -p, --package CRATE Install this crate from crates.io or select the package in a repository/path to install. -v, --verbose Use verbose output + --root Directory to install packages into This command manages Cargo's local set of install binary crates. Only packages which have [[bin]] targets can be installed, and all binaries are installed into @@ -84,9 +85,8 @@ crates. Cargo attempts to be as flexible as possible in terms of installing crates from various locations and specifying what should be installed. 
All binaries will be -stored in the **cargo-local** directory `$CARGO_HOME/bin`. This is typically -`$HOME/.cargo/bin` but the home directory can be modified via the `$CARGO_HOME` -environment variable. +stored in a **cargo-local** directory, and more details on where exactly this is +located can be found below. Cargo will not attempt to install binaries or crates into system directories (e.g. `/usr`) as that responsibility is intended for system package managers. @@ -140,6 +140,24 @@ binary called `foo` they cannot be both installed at once. Cargo will reject these situations and recommend that a binary is selected via `--bin` or the conflicting crate is uninstalled. +#### Placing output artifacts + +The `cargo install` command can be customized where it puts its output artifacts +to install packages in a custom location. The root directory of the installation +will be determined in a hierarchical fashion, choosing the first of the +following that is specified: + +1. The `--root` argument on the command line. +2. The environment variable `CARGO_INSTALL_ROOT`. +3. The `install.root` configuration option. +4. The value of `$CARGO_HOME` (also determined in an independent and + hierarchical fashion). + +Once the root directory is found, Cargo will place all binaries in the +`$INSTALL_ROOT/bin` folder. Cargo will also reserve the right to retain some +metadata in this folder in order to keep track of what's installed and what +binaries belong to which package. 
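The lookup order listed above could be sketched as follows. This is only an illustration: `RootSources`, `install_root` and `bin_dir` are hypothetical names, not Cargo internals, and the sketch assumes each source has already been read into an `Option<PathBuf>` and that `$CARGO_HOME` has been resolved separately.

```rust
use std::path::PathBuf;

/// Hypothetical bundle of the four places an install root may come from.
struct RootSources {
    cli_root: Option<PathBuf>,    // value of `--root`, if given
    env_root: Option<PathBuf>,    // value of `CARGO_INSTALL_ROOT`, if set
    config_root: Option<PathBuf>, // value of `install.root`, if configured
    cargo_home: PathBuf,          // `$CARGO_HOME`, resolved elsewhere
}

/// Pick the first source that is specified, in the order the RFC lists them.
fn install_root(s: &RootSources) -> PathBuf {
    s.cli_root
        .clone()
        .or_else(|| s.env_root.clone())
        .or_else(|| s.config_root.clone())
        .unwrap_or_else(|| s.cargo_home.clone())
}

/// Binaries go into the `bin` folder below the chosen root.
fn bin_dir(s: &RootSources) -> PathBuf {
    install_root(s).join("bin")
}
```

Chaining `Option::or_else` keeps the precedence visible in one expression, and the final `unwrap_or_else` makes `$CARGO_HOME` the fallback that always exists.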
+ ## Managing Installations If Cargo gives access to installing packages, it should surely provide the From 585b289ec2db573ce995391c2563902a9fa49af1 Mon Sep 17 00:00:00 2001 From: Cesar Eduardo Barros Date: Mon, 13 Jul 2015 16:19:30 -0300 Subject: [PATCH 0371/1195] My proposal --- text/0000-read-all.md | 196 +++++++++++++++++++++++++++++------------- 1 file changed, 138 insertions(+), 58 deletions(-) diff --git a/text/0000-read-all.md b/text/0000-read-all.md index e3686756335..6f191524fe6 100644 --- a/text/0000-read-all.md +++ b/text/0000-read-all.md @@ -1,58 +1,74 @@ -- Feature Name: read_exact and read_full +- Feature Name: read_exact and ErrorKind::UnexpectedEOF - Start Date: 2015-03-15 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) # Summary -Rust's `Write` trait has `write_all`, which is a convenience method that calls -`write` repeatedly to write an entire buffer. This proposal adds two similar -convenience methods to the `Read` trait: `read_full` and `read_exact`. -`read_full` calls `read` repeatedly until the buffer has been filled, EOF has -been reached, or an error other than `Interrupted` occurs. `read_exact` is -similar to `read_full`, except that reaching EOF before filling the buffer is -considered an error. +Rust's `Write` trait has the `write_all` method, which is a convenience +method that writes a whole buffer, failing with `ErrorKind::WriteZero` +if the buffer cannot be written in full. + +This RFC proposes adding its `Read` counterpart: a method (here called +`read_exact`) that reads a whole buffer, failing with an error (here +called `ErrorKind::UnexpectedEOF`) if the buffer cannot be read in full. # Motivation -The `read` method may return fewer bytes than requested, and may fail with an -`Interrupted` error if a signal is received during the call. This requires -programs wishing to fill a buffer to call `read` repeatedly in a loop. 
This is -a very common need, and it would be nice if this functionality were provided in -the standard library. Many C and Rust programs have the same need, and solve it -in the same way. For example, Git has [`read_in_full`][git], which behaves like -the proposed `read_full`, and the Rust byteorder crate has -[`read_full`][byteorder], which behaves like the proposed `read_exact`. -[git]: https://github.com/git/git/blob/16da57c7c6c1fe92b32645202dd19657a89dd67d/wrapper.c#L246 -[byteorder]: https://github.com/BurntSushi/byteorder/blob/2358ace61332e59f596c9006e1344c97295fdf72/src/new.rs#L184 +When dealing with serialization formats with fixed-length fields, +reading or writing less than the field's size is an error. For the +`Write` side, the `write_all` method does the job; for the `Read` side, +however, one has to call `read` in a loop until the buffer is completely +filled, or until a premature EOF is reached. + +This leads to a profusion of similar helper functions. For instance, the +`byteorder` crate has a `read_full` function, and the `postgres` crate +has a `read_all` function. However, their handling of the premature EOF +condition differs: the `byteorder` crate has its own `Error` enum, with +`UnexpectedEOF` and `Io` variants, while the `postgres` crate uses an +`io::Error` with an `io::ErrorKind::Other`. + +That can make it unnecessarily hard to mix uses of these helper +functions; for instance, if one wants to read a 20-byte tag (using one's +own helper function) followed by a big-endian integer, either the helper +function has to be written to use `byteorder::Error`, or the calling +code has to deal with two different ways to represent a premature EOF, +depending on which field encountered the EOF condition. + +Additionally, when reading from an in-memory buffer, looping is not +necessary; it can be replaced by a size comparison followed by a +`copy_memory` (similar to `write_all` for `&mut [u8]`). 
If this +non-looping implementation is `#[inline]`, and the buffer size is known +(for instance, it's a fixed-size buffer on the stack, or there was an +earlier check of the buffer size against a larger value), the compiler +could potentially turn a read from the buffer followed by an endianness +conversion into the native endianness (as can happen when using the +`byteorder` crate) into a single-instruction direct load from the buffer +into a register. # Detailed design -The following methods will be added to the `Read` trait: +First, a new variant `UnexpectedEOF` is added to the `io::ErrorKind` enum. + +The following method is added to the `Read` trait: ``` rust -fn read_full(&mut self, buf: &mut [u8]) -> Result<usize>; fn read_exact(&mut self, buf: &mut [u8]) -> Result<()>; ``` -Additionally, default implementations of these methods will be provided: +Additionally, a default implementation of this method is provided: ``` rust -fn read_full(&mut self, mut buf: &mut [u8]) -> Result<usize> { - let mut read = 0; +fn read_exact(&mut self, mut buf: &mut [u8]) -> Result<()> { while buf.len() > 0 { match self.read(buf) { Ok(0) => break, - Ok(n) => { read += n; let tmp = buf; buf = &mut tmp[n..]; } + Ok(n) => { let tmp = buf; buf = &mut tmp[n..]; } Err(ref e) if e.kind() == ErrorKind::Interrupted => {} Err(e) => return Err(e), } } - Ok(read) -} - -fn read_exact(&mut self, buf: &mut [u8]) -> Result<()> { - if try!(self.read_full(buf)) != buf.len() { + if buf.len() > 0 { Err(Error::new(ErrorKind::UnexpectedEOF, "failed to fill whole buffer")) } else { Ok(()) @@ -60,40 +76,104 @@ fn read_exact(&mut self, buf: &mut [u8]) -> Result<()> { } ``` -Finally, a new `ErrorKind::UnexpectedEOF` will be introduced, which will be -returned by `read_exact` in the event of a premature EOF.
+And an optimized implementation of this method for `&[u8]` is provided: + +```rust +#[inline] +fn read_exact(&mut self, buf: &mut [u8]) -> Result<()> { + if buf.len() > self.len() { + return Err(Error::new(ErrorKind::UnexpectedEOF, "failed to fill whole buffer")); + } + let (a, b) = self.split_at(buf.len()); + slice::bytes::copy_memory(a, buf); + *self = b; + Ok(()) +} +``` + +# Naming + +It's unfortunate that `write_all` used `WriteZero` for its `ErrorKind`; +were it named `UnexpectedEOF` (which is a much more intuitive name), the +same `ErrorKind` could be used for both functions. + +The initial proposal for this `read_exact` method called it `read_all`, +for symmetry with `write_all`. However, that name could also be +interpreted as "read as many bytes as you can that fit in this buffer, +and return what you could read" instead of "read enough bytes to fill +this buffer, and fail if you couldn't read them all". The previous +discussion led to `read_exact` for the latter meaning, and `read_full` +for the former meaning. # Drawbacks -Like `write_all`, these APIs are lossy: in the event of an error, there is no -way to determine the number of bytes that were successfully read before the -error. However, doing so would complicate the methods, and the caller will want -to simply fail if an error occurs the vast majority of the time. Situations -that require lower level control can still use `read` directly. +If this method fails, the buffer contents are undefined; the +`read_exact` method might have partially overwritten it. If the caller +requires "all-or-nothing" semantics, it must clone the buffer. In most +use cases, this is not a problem; the caller will discard or overwrite +the buffer in case of failure. -# Unanswered Questions +In the same way, if this method fails, there is no way to determine how +many bytes were read before it determined it couldn't completely fill +the buffer. -Naming. Is `read_full` the best name?
Should `UnexpectedEOF` instead be -`ShortRead` or `ReadZero`? +Situations that require lower level control can still use `read` +directly. # Alternatives -Use a more complicated return type to allow callers to retrieve the number of -bytes successfully read before an error occurred. As explained above, this -would complicate the use of these methods for very little gain. It's worth -noting that git's `read_in_full` is similarly lossy, and just returns an error -even if some bytes have been read. - -Only provide `read_exact`, but parameterize the `UnexpectedEOF` or `ShortRead` -error kind with the number of bytes read to allow it to be used in place of -`read_full`. This would be less convenient to use in cases where EOF is not an -error. - -Only provide `read_full`. This would cover most of the convenience (callers -could avoid the read loop), but callers requiring a filled buffer would have to -manually check if all of the desired bytes were read. - -Finally, we could leave this out, and let every Rust user needing this -functionality continue to write their own `read_full` or `read_exact` function, -or have to track down an external crate just for one straightforward and -commonly used convenience method. +The first alternative is to do nothing. Every Rust user needing this +functionality continues to write their own read_full or read_exact +function, or have to track down an external crate just for one +straightforward and commonly used convenience method. Additionally, +unless everybody uses the same external crate, every reimplementation of +this method will have slightly different error handling, complicating +mixing users of multiple copies of this convenience method. + +The second alternative is to just add the `ErrorKind::UnexpectedEOF` or +similar. This would lead in the long run to everybody using the same +error handling for their version of this convenience method, simplifying +mixing their uses. 
However, it's questionable to add an `ErrorKind` +variant which is never used by the standard library. + +Another alternative is to return the number of bytes read in the error +case. That makes the buffer contents defined also in the error case, at +the cost of increasing the size of the frequently-used `io::Error` +struct, for a rarely used return value. My objections to this +alternative are: + +* If the caller has a use for the partially written buffer contents, + then it's treating the "buffer partially filled" case as an + alternative success case, not as a failure case. This is not a good + match for the semantics of an `Err` return. +* Determining that the buffer cannot be completely filled can in some + cases be much faster than doing a partial copy. Many callers are not + going to be interested in an incomplete read, meaning that all the + work of filling the buffer is wasted. +* As mentioned, it increases the size of a commonly used type in all + cases, even when the code has no mention of `read_exact`. + +The final alternative is `read_full`, which returns the number of bytes +read (`Result<usize>`) instead of failing. This means that every caller +has to check the return value against the size of the passed buffer, and +some are going to forget (or misimplement) the check. It also prevents +some optimizations (like the early return in case there will never be +enough data). There are, however, valid use cases for this alternative; +for instance, reading a file in fixed-size chunks, where the last chunk +(and only the last chunk) can be shorter. I believe this should be +discussed as a separate proposal; its pros and cons are distinct enough +from this proposal to merit its own arguments. + +I believe that the case for `read_full` is weaker than that for `read_exact`, for +the following reasons: + +* While `read_exact` needs an extra variant in `ErrorKind`, `read_full` + has no new error cases.
This means that implementing it yourself is + easy, and multiple implementations have no drawbacks other than code + duplication. +* While `read_exact` can be optimized with an early return in cases + where the reader knows its total size (for instance, reading from a + compressed file where the uncompressed size was given in a header), + `read_full` has to always write to the output buffer, so there's not + much to gain over a generic looping implementation calling `read`. + From 1132ede301fae89809ce2932faa9ef8cdad103b5 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Mon, 13 Jul 2015 13:39:43 -0700 Subject: [PATCH 0372/1195] Clarify how the intrinsics' structural typing works. --- text/0000-simd-infrastructure.md | 33 +++++++++++++++++++++++++++++--- 1 file changed, 30 insertions(+), 3 deletions(-) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index 0c6af575682..4da9d3926cf 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -132,6 +132,33 @@ in a "duck-typed" manner: it will just ensure that the types are SIMD vectors with the appropriate length and element type, it will not enforce a specific nominal type. +NB. The structural typing is just for the declaration: if a SIMD intrinsic +is declared to take a type `X`, it must always be called with `X`, +even if other types are structurally equal to `X`. Also, within a +signature, SIMD types that must be structurally equal must be nominal +equal. I.e. 
if the `add_...` all refer to the same intrinsic to add a +SIMD vector of bytes, + +```rust +// (same length) +struct A(u8, u8, ..., u8); +struct B(u8, u8, ..., u8); + +extern "rust-intrinsic" { + fn add_aaa(x: A, y: A) -> A; // ok + fn add_bbb(x: B, y: B) -> B; // ok + fn add_aab(x: A, y: A) -> B; // error, expected B, found A + fn add_bab(x: B, y: A) -> B; // error, expected A, found B +} + +fn double_a(x: A) -> A { + add_aaa(x, x) +} +fn double_b(x: B) -> B { + add_aaa(x, x) // error, expected A, found B +} +``` + There would additionally be a small set of cross-platform operations that are either generally efficiently supported everywhere or are extremely useful. These won't necessarily map to a single instruction, @@ -173,9 +200,9 @@ extern "rust-intrinsic" { ``` The raw definitions are only checked for validity at monomorphisation -time, ensure that `T` is a SIMD vector, `U` is the element type of `T` -etc. Libraries can use traits to ensure that these will be enforced by -the type checker too. +time, ensure that `T` is a SIMD vector, `Elem` is the element type of +`T` etc. Libraries can use traits to ensure that these will be +enforced by the type checker too. This approach has some downsides: `simd_shuffle32` (e.g. `Simd32` on AVX, and `Simd32` on AVX-512) and especially `simd_shuffle64` From 8317ea49ce509af257722fcb0be27bf23de54377 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Mon, 13 Jul 2015 13:41:13 -0700 Subject: [PATCH 0373/1195] Add arithmetic intrinsics alternative. --- text/0000-simd-infrastructure.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index 4da9d3926cf..a93fcf3d319 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -366,6 +366,9 @@ cfg_if_else! { - have 100% guaranteed type-safety for generic `#[repr(simd)]` types and the generic intrinsics. 
This would probably require a relatively complicated set of traits (with compiler integration). +- use generic intrinsics like shuffles for the arithmetic operations, + instead of providing the operations implicitly. + # Unresolved questions From c6ed18ac09c44e01d1076c8b1baf9fbc1877068a Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Mon, 13 Jul 2015 13:53:56 -0700 Subject: [PATCH 0374/1195] Write down an answer to "why not `asm!`?". --- text/0000-simd-infrastructure.md | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index a93fcf3d319..cc3063a4494 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -276,6 +276,33 @@ of the arithmetic overflow checks of `-C debug-assertions`): explicit SIMD is essentially only required for speed, and checking inflates one instruction to 5 or more. +### Why not inline asm? + +One alternative to providing intrinsics is to instead just use +inline-asm to expose each CPU instruction. However, this approach has +essentially only one benefit (avoiding defining the intrinsics), but +several downsides, e.g. + +- assembly is generally a black-box to optimisers, inhibiting + optimisations, like algebraic simplification/transformation, +- programmers would have to manually synthesise the right sequence of + operations to achieve a given shuffle, while having a generic + shuffle intrinsic lets the compiler do it (NB. the intention is that + the programmer will still have access to the platform specific + operations for when the compiler synthesis isn't quite right), +- inline assembly is not currently stable in + Rust and there's not a strong push for it to be so in the immediate + future (although this could change). 
+ +Benefits of manual assembly writing, like instruction scheduling and +register allocation don't apply to the (generally) one-instruction +`asm!` blocks that replace the intrinsics (they need to be designed so +that the compiler has full control over register allocation, or else +the result will be strictly worse). Those possible advantages of hand +written assembly over intrinsics only come in to play when writing +longer blocks of raw assembly, i.e. some inner loop might be faster +when written as a single chunk of asm rather than as intrinsics. + ## Platform Detection The availability of efficient SIMD functionality is very fine-grained, From ba2f1fe8e0f024f36c4d2e99f8e5f681b171c844 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 13 Jul 2015 15:46:48 -0700 Subject: [PATCH 0375/1195] Fix a missing backtick --- text/0000-cap-lints.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-cap-lints.md b/text/0000-cap-lints.md index 5aca0d59764..ce71a1a9537 100644 --- a/text/0000-cap-lints.md +++ b/text/0000-cap-lints.md @@ -47,7 +47,7 @@ flag to the compiler: ``` For example when `--cap-lints allow` is passed, all instances of `#[warn]`, -`#[deny]`, and `#[forbid] are ignored. If, however `--cap-lints warn` is passed +`#[deny]`, and `#[forbid]` are ignored. If, however `--cap-lints warn` is passed only `deny` and `forbid` directives are ignored. The acceptable values for `LEVEL` will be `allow`, `warn`, `deny`, or `forbid`. 
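One possible reading of the `--cap-lints` behavior described above is that the cap acts as an upper bound on the level each lint is reported at. The sketch below models that reading only; `Level` and `capped` are hypothetical names, and treating a capped `deny` or `forbid` as a `warn` (rather than dropping the lint entirely) is an assumption made for illustration, not something the quoted text spells out.

```rust
/// Lint levels, ordered from least to most severe (hypothetical model).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Allow,
    Warn,
    Deny,
    Forbid,
}

/// Assumed rule: a lint is reported at the lower of the level requested
/// by the attribute and the level given to `--cap-lints`.
fn capped(requested: Level, cap: Level) -> Level {
    requested.min(cap)
}
```

With this model, a cap of `Forbid` changes nothing, while a cap of `Allow` silences every lint regardless of the attributes in the source.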
From 4df689974c0b40207790d238254b2107baae5254 Mon Sep 17 00:00:00 2001 From: Cesar Eduardo Barros Date: Tue, 14 Jul 2015 08:35:59 -0300 Subject: [PATCH 0376/1195] Detailed semantics, and explanations about EINTR and the read pointer --- text/0000-read-all.md | 82 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 82 insertions(+) diff --git a/text/0000-read-all.md b/text/0000-read-all.md index 6f191524fe6..83b1f4aa4b7 100644 --- a/text/0000-read-all.md +++ b/text/0000-read-all.md @@ -91,6 +91,88 @@ fn read_exact(&mut self, buf: &mut [u8]) -> Result<()> { } ``` +The detailed semantics of `read_exact` are as follows: `read_exact` +reads exactly the number of bytes needed to completely fill its `buf` +parameter. If that's not possible due to an "end of file" condition +(that is, the `read` method would return 0 even when passed a buffer +with at least one byte), it returns an `ErrorKind::UnexpectedEOF` error. + +On success, the read pointer is advanced by the number of bytes read, as +if the `read` method had been called repeatedly to fill the buffer. On +any failure (including an `ErrorKind::UnexpectedEOF`), the read pointer +might have been advanced by any number between zero and the number of +bytes requested (inclusive), and the contents of its `buf` parameter +should be treated as garbage (any part of it might or might not have +been overwritten by unspecified data). + +Even if the failure was an `ErrorKind::UnexpectedEOF`, the read pointer +might have been advanced by a number of bytes less than the number of +bytes which could be read before reaching an "end of file" condition. + +The `read_exact` method will never return an `ErrorKind::Interrupted` +error, similar to the `read_to_end` method. + +Similar to the `read` method, no guarantees are provided about the +contents of `buf` when this function is called; implementations cannot +rely on any property of the contents of `buf` being true. 
It is +recommended that implementations only write data to `buf` instead of +reading its contents. + +# About ErrorKind::Interrupted + +Whether or not `read_exact` can return an `ErrorKind::Interrupted` error +is orthogonal to its semantics. One could imagine an alternative design +where `read_exact` could return an `ErrorKind::Interrupted` error. + +The reason `read_exact` should deal with `ErrorKind::Interrupted` itself +is its non-idempotence. On failure, it might have already partially +advanced its read pointer an unknown number of bytes, which means it +can't be easily retried after an `ErrorKind::Interrupted` error. + +One could argue that it could return an `ErrorKind::Interrupted` error +if it's interrupted before the read pointer is advanced. But that +introduces a non-orthogonality in the design, where it might either +return or retry depending on whether it was interrupted at the beginning +or in the middle. Therefore, the cleanest semantics is to always retry. + +There's precedent for this choice in the `read_to_end` method. Users who +need finer control should use the `read` method directly. + +# About the read pointer + +This RFC proposes a `read_exact` function where the read pointer +(conceptually, what would be returned by `Seek::seek` if the stream was +seekable) is unspecified on failure: it might not have advanced at all, +have advanced in full, or advanced partially. + +Two possible alternatives could be considered: never advance the read +pointer on failure, or always advance the read pointer to the "point of +error" (in the case of `ErrorKind::UnexpectedEOF`, to the end of the +stream). + +Never advancing the read pointer on failure would make it impossible to +have a default implementation (which calls `read` in a loop), unless the +stream was seekable. It would also impose extra costs (like creating a +temporary buffer) to allow "seeking back" for non-seekable streams. 
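For seekable streams, a caller who does want "never advance on failure" behavior can layer it on top of `read_exact`, which illustrates why the guarantee is tied to seekability. The sketch below uses the method as it exists in today's standard library, where the error kind is spelled `ErrorKind::UnexpectedEof` rather than `UnexpectedEOF` as in this draft. `read_exact_or_rewind` is a hypothetical helper, not a proposed API.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical wrapper: behaves like `read_exact`, but on any failure it
/// seeks back to where the read started. This needs `Seek`, so it cannot
/// serve as a default implementation for every `Read`.
fn read_exact_or_rewind<R: Read + Seek>(reader: &mut R, buf: &mut [u8]) -> io::Result<()> {
    // Remember the read pointer before touching the stream.
    let start = reader.stream_position()?;
    match reader.read_exact(buf) {
        Ok(()) => Ok(()),
        Err(e) => {
            // The pointer may have advanced by an unspecified amount; undo that.
            reader.seek(SeekFrom::Start(start))?;
            // `buf` must still be treated as garbage by the caller.
            Err(e)
        }
    }
}
```

The extra `stream_position` and `seek` calls are the kind of cost the paragraph above refers to; callers who simply discard the stream on error do not need to pay it.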
+ +Always advancing the read pointer to the end on failure is possible; it +happens without any extra code in the default implementation. However, +it can introduce extra costs in optimized implementations. For instance, +the implementation given above for `&[u8]` would need a few more +instructions in the error case. Some implementations (for instance, +reading from a compressed stream) might have a larger extra cost. + +The utility of always advancing the read pointer to the end is +questionable; for non-seekable streams, there's not much that can be +done on an "end of file" condition, so most users would discard the +stream in both an "end of file" and an `ErrorKind::UnexpectedEOF` +situation. For seekable streams, it's easy to seek back, but most users +would treat an `ErrorKind::UnexpectedEOF` as a "corrupted file" and +discard the stream anyways. + +Users who need finer control should use the `read` method directly, or +when available use the `Seek` trait. + # Naming It's unfortunate that `write_all` used `WriteZero` for its `ErrorKind`; From bd7f40c1b3e7012fba289d5ceba5c7c53235698d Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Tue, 7 Jul 2015 10:26:00 -0400 Subject: [PATCH 0377/1195] Initial commit --- text/0000-mir.md | 821 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 821 insertions(+) create mode 100644 text/0000-mir.md diff --git a/text/0000-mir.md b/text/0000-mir.md new file mode 100644 index 00000000000..20bdbbf0843 --- /dev/null +++ b/text/0000-mir.md @@ -0,0 +1,821 @@ +- Feature Name: N/A +- Start Date: (fill me in with today's date, YYYY-MM-DD) +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Introduce a "mid-level IR" (MIR) into the compiler. The MIR desugars +most of Rust's surface representation, leaving a simpler form that is +well-suited to type-checking and translation. 
+
+# Motivation
+
+The current compiler uses a single AST from the initial parse all the
+way to the final generation of LLVM. While this has some advantages,
+there are also a number of distinct downsides.
+
+1. The complexity of the compiler is increased because all passes must
+   be written against the full Rust language, rather than being able
+   to consider a reduced subset. The MIR proposed here is *radically*
+   simpler than the surface Rust syntax -- for example, it contains no
+   "match" statements, and converts both `ref` bindings and `&`
+   expressions into a single form.
+
+   a. There are numerous examples of "desugaring" in Rust. In
+      principle, desugaring one language feature into another should
+      make the compiler *simpler*, but in our current implementation,
+      it tends to make things more complex, because every phase must
+      simulate the desugaring anew. The most prominent examples are
+      closure expressions (`|| ...`), which desugar to a fresh struct
+      instance, but other examples abound: `for` loops, `if let` and
+      `while let`, `box` expressions, overloaded operators (which
+      desugar to method calls), method calls (which desugar to UFCS
+      notation).
+
+   b. There are a number of features which are almost infeasible to
+      implement today but which should be much easier given a MIR
+      representation. Examples include box patterns and non-lexical
+      lifetimes.
+
+2. Reasoning about fine-grained control-flow in an AST is rather
+   difficult. The right tool for this job is a control-flow graph
+   (CFG). We currently construct a CFG that lives "on top" of the AST,
+   which allows the borrow checking code to be flow sensitive, but it
+   is awkward to work with. Worse, because this CFG is not used by
+   trans, it is not necessarily the case that the control-flow as seen
+   by the analyses corresponds to the code that will be generated.
+   The MIR is based on a CFG, resolving this situation.
+
+3.
The reliability of safety analyses is reduced because the gap + between what is being analyzed (the AST) and what is being executed + (LLVM bitcode) is very wide. The MIR is very low-level and hence the + translation to LLVM should be straightforward. + +4. The reliability of safety proofs, when we have some, would be + reduced because the formal language we are modeling is so far from + the full compiler AST. The MIR is simple enough that it should be + possible to (eventually) make safety proofs based on the MIR + itself. + +5. Rust-specific optimizations, and optimizing trans output, are very + challenging. There are numerous cases where it would be nice to be + able to do optimizations *before* translating to LLVM bitcode, or + to take advantage of Rust-specific knowledge of which LLVM is + unaware. Currently, we are forced to do these optimizations as part + of lowering to bitcode, which can get quite complex. Having an + intermediate form improves the situation because: + + a. In some cases, we can do the optimizations in the MIR itself before translation. + b. In other cases, we can do analyses on the MIR to easily determine when the optimization + would be safe. + c. In all cases, whatever we can do on the MIR will be helpful for other + targets beyond LLVM (see next bullet). + +6. Migrating away from LLVM is nearly impossible. In the future, it + may be advantageous to provide a choice of backends beyond + LLVM. Currently though this is infeasible, since so much of the + semantics of Rust itself are embedded in the `trans` step which + converts to LLVM IR. Under the MIR design, those semantics are + instead described in the translation from AST to MIR, and the LLVM + step itself simply applies optimizations. + +Given the numerous benefits of a MIR, you may wonder why we have not +taken steps in this direction earlier. In fact, we have a number of +structures in the compiler that simulate the effect of a MIR: + +1. Adjustments. 
Every expression can have various adjustments, like + autoderefs and so forth. These are computed by the type-checker + and then read by later analyses. This is a form of MIR, but not a particularly + convenient one. +2. The CFG. The CFG tries to model the flow of execution as a graph + rather than a tree, to help analyses in dealing with complex + control-flow formed by things like loops, `break`, `continue`, etc. + This CFG is however inferior to the MIR in that it is only an + approximation of control-flow and does not include all the + information one would need to actually execute the program (for + example, for an `if` expression, the CFG would indicate that two + branches are possible, but would not contain enough information to + decide which branch to take). +3. `ExprUseVisitor`. The `ExprUseVisitor` is designed to work in + conjunction with the CFG. It walks the AST and highlights actions + of interest to later analyses, such as borrows or moves. For each + such action, the analysis gets a callback indicating the point in + the CFG where the action occurred along with what + happened. Overloaded operators, method calls, and so forth are + "desugared" into their more primitive operations. This is + effectively a kind of MIR, but it is not complete enough to do + translation, since it focuses purely on borrows, moves, and other + things of interest to the safety checker. + +Each of these things were added in order to try and cope with the +complexity of working directly on the AST. The CFG for example +consolidates knowledge about control-flow into one piece of code, +producing a data structure that can be easily interpreted. Similarly, +the `ExprUseVisitor` consolidates knowledge of how to walk and +interpret the current compiler representation. + +### Goals + +It is useful to think about what "knowledge" the MIR should +encapsulate. 
Here is a listing of the kinds of things that should be
+explicit in the MIR and thus that downstream code won't have to
+re-encode in the form of repeated logic:
+
+- **Precise ordering of control-flow.** The CFG makes this very explicit,
+  and the individual statements and nodes in the MIR are very small
+  and detailed and hence nothing "interesting" happens in the middle
+  of an individual node with respect to control-flow.
+- **What needs to be dropped and when.** The set of data that needs to
+  be dropped and when is a fairly complex thing to calculate: you have
+  to know what's in scope, including temporary values and so forth.
+  In the MIR, all drops are explicit, including those that result from
+  panics and unwinding.
+- **How matches are desugared.** Reasoning about matches has been a
+  traditional source of complexity. Matches combine traversing types
+  with borrows, moves, and all sorts of other things, depending on the
+  precise patterns in use. This is all vastly simplified and explicit
+  in MIR.
+
+One thing the current MIR does not make as explicit as it could is
+when something is *moved*. For by-value uses of a value, the
+code must still consult the type of the value to decide if that is a
+move or not. This could be made more explicit in the IR.
+
+### Which analyses are well-suited to the MIR?
+
+Some analyses are better suited to the AST than to a MIR. The
+following is a list of work the compiler does that would benefit from
+using a MIR:
+
+- **liveness checking**: this is used to issue warnings about unused assignments
+  and the like. The MIR is perfect for this sort of data-flow analysis.
+- **borrow and move checking**: the borrow checker already uses a
+  combination of the CFG and `ExprUseVisitor` to try and achieve a
+  similarly low level of detail.
+- **translation to LLVM IR**: the MIR is much closer than the AST to
+  the desired end-product.
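To give a flavor of why flattened statements suit data-flow analyses such as liveness checking, here is a minimal sketch of a backward liveness pass over a single basic block. The `Assign` type and the `dead_assignments` function are hypothetical and far simpler than the MIR grammar in this RFC; a real pass would iterate over the whole control-flow graph.

```rust
use std::collections::HashSet;

/// Toy statement of the form `dst = f(uses...)` (hypothetical, not the MIR grammar).
struct Assign {
    dst: &'static str,
    uses: Vec<&'static str>,
}

/// Walk one basic block backwards and return the indices of assignments
/// whose result is never read before being overwritten or going out of use.
fn dead_assignments(block: &[Assign], live_out: &[&'static str]) -> Vec<usize> {
    // Variables that are still needed after the current statement.
    let mut live: HashSet<&'static str> = live_out.iter().copied().collect();
    let mut dead = Vec::new();
    for (i, stmt) in block.iter().enumerate().rev() {
        // The assignment kills `dst`; if it was not live, the write is unused.
        if !live.remove(stmt.dst) {
            dead.push(i);
        }
        // Everything the right-hand side reads becomes live before this point.
        live.extend(stmt.uses.iter().copied());
    }
    dead.reverse();
    dead
}
```

Because every statement names its destination and its inputs directly, the pass needs no knowledge of matches, closures, or operator overloading.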
+ +Some other passes would probably work equally well on the MIR or an +AST, but they will likely find the MIR somewhat easier to work with +than the current AST simply because it is, well, simpler: + +- **rvalue checking**, which checks that things are `Sized` which need to be. +- **reachability** and **death checking**. + +These items are likely ill-suited to the MIR as designed: + +- **privacy checking**, since it relies on explicit knowledge of paths that is not + necessarily present in the MIR. +- **lint checking**, since it is often dependent on the sort of surface details + we are seeking to obscure. + +For some passes, the impact is not entirely clear. In particular, +**match exhaustiveness checking** could easily be subsumed by the MIR +construction process, which must do a similar analysis during the +lowering process. However, once the MIR is built, the match is +completely desugared into more primitive switches and so forth, so we +will need to leave some markers in order to know where to check for +exhaustiveness and to reconstruct counter examples. + +# Detailed design + +### What is *really* being proposed here? + +The rest of this section goes into detail on a particular MIR design. +However, the true purpose of this RFC is not to nail down every detail +of the MIR -- which are expected to evolve and change over time anyway +-- but rather to establish some high-level principles which drive the +rest of the design: + +1. We should indeed lower the representation from an AST to something + else that will drive later analyses, and this representation should + be based on a CFG, not a tree. +2. This representation should be explicitly minimal and not attempt to retain + the original syntactic structure, though it should be possible to recover enough + of it to make quality error messages. +3. This representation should encode drops, panics, and other + scope-dependent items explicitly. +4. 
This representation does not have to be well-typed Rust, though it + should be possible to type-check it using a tweaked variant on the + Rust type system. + +### Prototype + +The MIR design being described here [has been prototyped][proto-crate] +and can be viewed in the `nikomatsakis` repository on github. In +particular, [the `repr` module][repr] defines the MIR representation, +and [the `build` module][build] contains the code to create a MIR +representation from an AST-like form. + +For increased flexibility, as well as to make the code simpler, the +prototype is not coded directly against the compiler's AST, but rather +against an idealized representation defined by [the `HIR` trait][hir]. +Note that this HIR trait is entirely independent from the HIR discussed by +nrc in [RFC 1191][1191] -- you can think of it as an abstract trait +that any high-level Rust IR could implement, including our current +AST. Moreover, it's just an implementation detail and not part of the +MIR being proposed here per se. Still, if you want to read the code, +you have to understand its design. + +The `HIR` trait contains a number of opaque associated types for the +various aspects of the compiler. For example, the type `H::Expr` +represents an expression. In order to find out what kind of expression +it is, the `mirror` method is called, which converts an `H::Expr` into +an `Expr` mirror. This mirror then contains embedded `ExprRef` +nodes to refer to further subexpressions; these may either be mirrors +themselves, or else they may be additional `H::Expr` nodes. This +allows the tree that is exported to differ in small ways from the +actual tree within the compiler; the primary intention is to use this +to model "adjustments" like autoderef. The code to convert from our +current AST to the HIR is not yet complete, but it can be found in the +[`tcx` module][tcx]. + +Note that the HIR mirroring system is an experiment and not really +part of the MIR itself. 
It does however present an interesting option +for (eventually) stabilizing access to the compiler's internals. + +[proto-crate]: https://github.com/nikomatsakis/rust/tree/mir/src/librustc_mir +[repr]: https://github.com/nikomatsakis/rust/blob/mir/src/librustc_mir/repr.rs +[build]: https://github.com/nikomatsakis/rust/tree/mir/src/librustc_mir/build +[hir]: https://github.com/nikomatsakis/rust/blob/mir/src/librustc_mir/hir.rs +[1191]: https://github.com/rust-lang/rfcs/pull/1191 +[tcx]: https://github.com/nikomatsakis/rust/blob/mir/src/librustc_mir/tcx/mod.rs + +### Overview of the MIR + +The proposed MIR always describes the execution of a single fn. At +the highest level it consists of a series of declarations regarding +the stack storage that will be required and then a set of basic +blocks: + + MIR = fn({TYPE}) -> TYPE { + {let [mut] B: TYPE;} // user-declared bindings and their types + {let TEMP: TYPE;} // compiler-introduced temporary + {BASIC_BLOCK} // control-flow graph + }; + +The storage declarations are broken into two categories. User-declared +bindings have a 1-to-1 relationship with the variables specified in +the program. Temporaries are introduced by the compiler in various +cases. For example, borrowing an lvalue (e.g., `&foo()`) will +introduce a temporary to store the result of `foo()`. Similarly, +discarding a value `foo();` is translated to something like `let tmp = +foo(); drop(tmp);`). Temporaries are single-assignment, but because +they can be borrowed they may be mutated after this assignment and +hence they differ somewhat from variables in a pure SSA +representation. + +The proposed MIR takes the form of a graph where each node is a *basic +block*. A basic block is a standard compiler term for a continuous +sequence of instructions with a single entry point. All interesting +control-flow happens between basic blocks. 
Each basic block has an id
+`BB` and consists of a sequence of statements and a terminator:
+
+    BASIC_BLOCK = BB: {STATEMENT} TERMINATOR
+
+A `STATEMENT` can have one of two forms:
+
+    STATEMENT = LVALUE "=" RVALUE        // assign rvalue into lvalue
+              | Drop(DROP_KIND, LVALUE)  // drop value if needed
+    DROP_KIND = SHALLOW                  // (see discussion below)
+              | DEEP
+
+The following sections dive into these kinds of statements in
+more detail.
+
+The `TERMINATOR` for a basic block describes how it connects to
+subsequent blocks:
+
+    TERMINATOR = GOTO(BB)              // normal control-flow
+               | PANIC(BB)             // initiate unwinding, branching to BB for cleanup
+               | IF(LVALUE, BB0, BB1)  // test LVALUE and branch to BB0 if true, else BB1
+               | SWITCH(LVALUE, BB...) // load discriminant from LVALUE (which must be an enum),
+                                       // and branch to BB... depending on which variant it is
+               | CALL(LVALUE0 = LVALUE1(LVALUE2...), BB0, BB1)
+                                       // call LVALUE1 with LVALUE2... as arguments. Write
+                                       // result into LVALUE0. Branch to BB0 if it returns
+                                       // normally, BB1 if it is unwinding.
+               | DIVERGE               // return to caller, unwinding
+               | RETURN                // return to caller normally
+
+Most of the terminators should be fairly obvious. The most interesting
+part is the handling of unwinding. This aligns fairly closely with how
+LLVM works: there is one terminator, PANIC, that initiates unwinding.
+It immediately branches to a handler (BB) which will perform cleanup
+and (eventually) reach a block that has a DIVERGE terminator. DIVERGE
+causes unwinding to continue up the stack.
+
+Because calls to other functions can always (or almost always) panic,
+calls are themselves a kind of terminator. If we can determine that
+some function we are calling cannot unwind, we can always modify the
+IR to make the second basic block optional. (We could also add an
+`RVALUE` to represent calls, but it's probably easiest to keep the
+call as a terminator unless the memory savings of consolidating basic
+blocks are found to be worthwhile.)
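As an illustration only, the statement and terminator grammar above could be modeled with Rust enums along these lines. All type and field names here are hypothetical (the prototype's `repr` module may differ), `Lvalue` is reduced to a name, and rvalues are kept as opaque text to keep the sketch short.

```rust
type BlockId = usize;

/// Stand-in for the LVALUE grammar: just a variable name (assumption).
#[derive(Debug, Clone, PartialEq)]
struct Lvalue(String);

#[derive(Debug, Clone, PartialEq)]
enum DropKind {
    Shallow,
    Deep,
}

#[derive(Debug, Clone, PartialEq)]
enum Statement {
    /// LVALUE "=" RVALUE, with the rvalue kept as opaque text.
    Assign(Lvalue, String),
    /// Drop(DROP_KIND, LVALUE)
    Drop(DropKind, Lvalue),
}

#[derive(Debug, Clone, PartialEq)]
enum Terminator {
    Goto(BlockId),
    Panic(BlockId),
    If(Lvalue, BlockId, BlockId),
    Switch(Lvalue, Vec<BlockId>),
    Call { dest: Lvalue, func: Lvalue, args: Vec<Lvalue>, ret: BlockId, unwind: BlockId },
    Diverge,
    Return,
}

struct BasicBlock {
    statements: Vec<Statement>,
    terminator: Terminator,
}

impl BasicBlock {
    /// All control-flow edges leave through the terminator, so the
    /// successors of a block can be read off without looking at statements.
    fn successors(&self) -> Vec<BlockId> {
        match &self.terminator {
            Terminator::Goto(bb) | Terminator::Panic(bb) => vec![*bb],
            Terminator::If(_, then_bb, else_bb) => vec![*then_bb, *else_bb],
            Terminator::Switch(_, targets) => targets.clone(),
            Terminator::Call { ret, unwind, .. } => vec![*ret, *unwind],
            Terminator::Diverge | Terminator::Return => Vec::new(),
        }
    }
}
```

Keeping calls as terminators means the unwinding edge is an ordinary successor, so graph algorithms treat the normal and panic paths uniformly.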
+
+It's worth pointing out that basic blocks are just a kind of
+compile-time and memory-use optimization; there is no semantic
+difference between a single block and two blocks joined by a GOTO
+terminator.
+
+### Assignments, values, and rvalues
+
+The primary kind of statement is an assignment:
+
+    LVALUE "=" RVALUE
+
+The semantics of this operation are to first evaluate the RVALUE and
+then store it into the LVALUE (which must represent a memory location
+of suitable type).
+
+An `LVALUE` represents a path to a memory location. This is the basic
+"unit" analyzed by the borrow checker. It is always possible to
+evaluate an `LVALUE` without triggering any side-effects (modulo
+dereferences of unsafe pointers, which naturally can trigger arbitrary
+behavior if the pointer is not valid).
+
+    LVALUE = B                   // reference to a user-declared binding
+           | TEMP                // a temporary introduced by the compiler
+           | ARG                 // a formal argument of the fn
+           | STATIC              // a reference to a static or static mut
+           | RETURN              // the return pointer of the fn
+           | LVALUE.f            // project a field or tuple field, like x.f or x.0
+           | *LVALUE             // dereference a pointer
+           | LVALUE[LVALUE]      // index into an array (see disc. below about bounds checks)
+           | (LVALUE as VARIANT) // downcast to a specific variant of an enum,
+                                 // see the section on desugaring matches below
+
+An `RVALUE` represents a computation that yields a result. This result
+must be stored in memory somewhere to be accessible. The MIR does not
+contain any kind of nested expressions: everything is flattened out,
+going through lvalues as intermediaries.
+
+    RVALUE = Use(LVALUE)                // just read an lvalue
+           | [LVALUE; LVALUE]
+           | &'REGION LVALUE
+           | &'REGION mut LVALUE
+           | LVALUE as TYPE
+           | LVALUE <BINOP> LVALUE
+           | <UNOP> LVALUE
+           | Struct { f: LVALUE0, ...
} // aggregates, see section below
+           | (LVALUE...LVALUE)
+           | [LVALUE...LVALUE]
+           | CONSTANT
+           | LEN(LVALUE)                // load length from a slice, see section below
+           | BOX                        // malloc for builtin box, see section below
+    BINOP = + | - | * | / | ...         // excluding && and ||
+    UNOP = ! | -                        // note: no `*`, as that is part of LVALUE
+
+One thing worth pointing out is that the binary and unary operators
+are only the *builtin* form, operating on scalar values. Overloaded
+operators will be desugared to trait calls. Moreover, all method calls
+are desugared into normal calls via UFCS form.
+
+### Constants
+
+Constants are a subset of rvalues that can be evaluated at compilation
+time:
+
+    CONSTANT = INT
+             | UINT
+             | FLOAT
+             | BOOL
+             | BYTES
+             | STATIC_STRING
+             | ITEM                         // reference to an item or constant etc
+             | <P0 as TRAIT<P1...Pn>>       // projection
+             | CONSTANT(CONSTANT...)        //
+             | CAST(CONSTANT, TY)           // foo as bar
+             | Struct { (f: CONSTANT)... }  // aggregates...
+             | (CONSTANT...)                //
+             | [CONSTANT...]                //
+
+### Aggregates and further lowering
+
+The set of rvalues includes "aggregate" expressions like `(x, y)` or
+`Foo { f: x, g: y }`. This is a place where the MIR (somewhat) departs
+from what will be generated at compilation time, since (often) an
+expression like `f = (x, y, z)` will wind up desugared into a series
+of piecewise assignments like:
+
+    f.0 = x;
+    f.1 = y;
+    f.2 = z;
+
+However, there are good reasons to include aggregates as first-class
+rvalues. For one thing, if we break down each aggregate into the
+specific assignments that would be used to construct the value, then
+zero-sized types are *never* assigned, since there is no data to
+actually move around at runtime. This means that the compiler couldn't
+distinguish uninitialized variables from initialized ones. That is,
+code like this:
+
+```rust
+let x: (); // note: never initialized
+use(x)
+```
+
+and this:
+
+```rust
+let x: () = ();
+use(x);
+```
+
+would desugar to the same MIR.
That is a problem, particularly with +respect to destructors: imagine that instead of the type `()`, we used +a type like `struct Foo;` where `Foo` implements `Drop`. + +Another advantage is that building aggregates in a two-step way +assures the proper execution order when unwinding occurs before the +complete value is constructed. In particular, we want to drop the +intermediate results in the order that they appear in the source, not +in the order in which the fields are specified in the struct +definition. + +A final reason to include aggregates is that, at runtime, the +representation of an aggregate may indeed fit within a single word, in +which case making a temporary and writing the fields piecemeal may in +fact not be the correct representation. + +In any case, after the move and correctness checking is done, it is +easy enough to remove these aggregate rvalues and replace them with +assignments. This could potentially be done during LLVM lowering, or +as a pre-pass that transforms MIR statements like: + + x = ...x; + y = ...y; + z = ...z; + f = (x, y, z) + +to: + + x = ...x; + y = ...y; + z = ...z; + f.0 = x; + f.1 = y; + f.2 = z; + +combined with another pass that removes temporaries that are only used +within a single assignment (and nowhere else): + + f.0 = ...x; + f.1 = ...y; + f.2 = ...z; + +Going further, once type-checking is done, it is plausible to do +further lowering within the MIR purely for optimization purposes. For +example, we could introduce intermediate references to cache the +results of common lvalue computations and so forth. This may well be +better left to LLVM (or at least to the lowering pass). + +### Bounds checking + +Because bounds checks are fallible, it's important to encode them in +the MIR whenever we do indexing. Otherwise the trans code would have +to figure out on its own how to do unwinding at that point. 
Because
+the MIR doesn't "desugar" fat pointers, we include a special rvalue
+`LEN` that extracts the length from an array value whose type matches
+`[T]` or `[T;n]` (in the latter case, it yields a constant). Using
+this, we desugar an array reference like `y = arr[x]` as follows:
+
+    let len: usize;
+    let idx: usize;
+    let lt: bool;
+
+    B0: {
+      len = len(arr);
+      idx = x;
+      lt = idx < len;
+      if lt { B1 } else { B2 }
+    }
+
+    B1: {
+      y = arr[idx]
+      ...
+    }
+
+    B2: {
+      <panic>
+    }
+
+The key point here is that we create a temporary (`idx`) capturing the
+value that we bounds checked and we ensure that there is a comparison
+against the length.
+
+### Overflow checking
+
+Similarly, since overflow checks can trigger a panic, they ought to be
+exposed in the MIR as well. This is handled by having distinct binary
+operators for "add with overflow" and so forth, analogous to the LLVM
+intrinsics. These operators yield a tuple of (result, overflow), so
+`result = left + right` might be translated like:
+
+    let tmp: (u32, bool);
+
+    B0: {
+      tmp = left + right;
+      if(tmp.1, B2, B1)
+    }
+
+    B1: {
+      result = tmp.0
+      ...
+    }
+
+    B2: {
+      <panic>
+    }
+
+### Matches
+
+One of the goals of the MIR is to desugar matches into something much
+more primitive, so that we are freed from reasoning about their
+complexity. This is primarily achieved through a combination of SWITCH
+terminators and downcasts. To get the idea, consider this simple match
+statement:
+
+```rust
+match foo() {
+    Some(ref v) => ...0,
+    None => ...1
+}
+```
+
+This would be converted into MIR as follows (leaving out the unwinding support):
+
+    BB0 {
+        call(tmp = foo(), BB1, ...);
+    }
+
+    BB1 {
+        switch(tmp, BB2, BB3) // two branches, corresponding to the Some and None variants resp.
+    }
+
+    BB2 {
+        v = &(tmp as Option::Some).0;
+        ...0
+    }
+
+    BB3 {
+        ...1
+    }
+
+There are some interesting cases that arise from matches that are
+worth examining.
+
+**Vector patterns.** Currently, (unstable) Rust supports vector
+patterns which permit borrows that would not otherwise be legal:
+
+```rust
+let mut vec = [1, 2];
+match vec {
+    [ref mut p, ref mut q] => { ... }
+}
+```
+
+If this code were written using `p = &mut vec[0], q = &mut vec[1]`,
+the borrow checker would complain. This is because it does not attempt
+to reason about indices being disjoint, even if they are constant
+(this is a limitation we may wish to consider lifting at some point in
+the future, however).
+
+To accommodate these, we plan to desugar such matches into lvalues
+using the special "constant index" form. The borrow checker would be
+able to reason that two constant indices are disjoint but it could
+consider "variable indices" to be (potentially) overlapping with all
+constant indices. This is a fairly straightforward thing to do (and in
+fact the borrow checker already includes similar logic, since the
+`ExprUseVisitor` encounters a similar dilemma trying to resolve
+borrows).
+
+### Drops
+
+The `Drop(DROP_KIND, LVALUE)` instruction is intended to represent
+"automatic" compiler-inserted drops. The semantics of a `Drop` is that
+it drops "if needed". This means that the compiler can insert it
+everywhere that a `Drop` would make sense (due to scoping), and assume
+that instrumentation will be done as needed to prevent double
+drops. Currently, this signaling is done by zeroing out memory at
+runtime, but we are in the process of introducing stack flags for this
+purpose: the MIR offers the opportunity to reify those flags if we
+wanted, and rewrite drops to be more narrow (versus leaving that work
+for LLVM).
+
+To illustrate how drop works, let's work through a simple
+example. Imagine that we have a snippet of code like:
+
+```rust
+{
+    let x = Box::new(22);
+    send(x);
+}
+```
+
+The compiler would generate a drop for `x` at the end of the block,
+but the value `x` would also be moved as part of the call to `send`.
+A later analysis could easily strip out this `Drop` since it is evident
+that the value is always used on all paths that lead to `Drop`.
+
+### Shallow drops and Box
+
+The MIR includes the distinction between "shallow" and "deep"
+drop. Deep drop is the normal thing, but shallow drop is used when
+partially initializing boxes. This is tied to the `box` keyword.
+For example, an assignment like the following:
+
+    let x = box Foo::new();
+
+would be translated to something like the following:
+
+    let tmp: Box<Foo>;
+
+    B0: {
+      tmp = BOX;
+      f = Foo::new; // constant reference
+      call(*tmp, f, B1, B2);
+    }
+
+    B1: { // successful return of the call
+      x = use(tmp); // move of tmp
+      ...
+    }
+
+    B2: { // calling Foo::new() panic'd
+      drop(Shallow, tmp);
+      diverge;
+    }
+
+The interesting part here is the block B2, which indicates the case
+that `Foo::new()` invoked unwinding. In that case, we have to free the
+box that we allocated, but we only want to free the box itself, not
+its contents (it is not yet initialized).
+
+Note that having this kind of builtin box code is a legacy thing. The
+more generalized protocol that [RFC 809][809] specifies works in
+more-or-less exactly the same way: when that is adopted uniformly, the
+need for shallow drop and the Box rvalue will go away.
+
+### Phasing
+
+Ideally, the translation to MIR would be done during type checking,
+but before "region checking". This is because we would like to
+implement non-lexical lifetimes eventually, and doing that well would
+require access to a control-flow graph. Given that we do very limited
+reasoning about regions at present, this should not be a problem.
+
+### Representing scopes
+
+Lexical scopes in Rust play a large role in terms of when destructors
+run and how the reasoning about lifetimes works. However, they are
+completely erased by the graph format. For the most part, this is not
+an issue, since drops are encoded explicitly into the control-flow
+where needed.
However, one place that we still need to reason about
+scopes (at least in the short term) is in region checking, because
+currently regions are encoded in terms of scopes, and we have to be
+able to map that to a region in the graph. The MIR therefore includes
+extra information mapping every scope to a SEME region (single-entry,
+multiple-exit). If/when we move to non-lexical lifetimes, regions
+would be defined in terms of the graph itself, and the need to retain
+scoping information should go away.
+
+### Monomorphization
+
+Currently, we do monomorphization at LLVM translation time. If we ever
+chose to do it at a MIR level, that would be fine, but one thing to be
+careful of is that we may be able to elide `Drop` nodes based on the
+specific types.
+
+### Unchecked assertions
+
+There are various bits of the MIR that are not trivially type-checked.
+In general, these are properties which are assured in Rust by
+construction in the high-level syntax, and thus we must be careful not
+to do any transformation that would endanger them after the fact.
+
+- **Bounds-checking.** We introduce explicit bounds checks into the IR
+  that guard all indexing lvalues, but there is no explicit connection
+  between this check and the later accesses.
+- **Downcasts to a specific variant.** We test variants with a SWITCH
+  opcode but there is no explicit connection between this test and
+  later downcasts.
+
+This need for unchecked operations results from trying to lower and
+simplify the representation as much as possible, as well as trying to
+represent all panics explicitly. We believe the tradeoff to be
+worthwhile, particularly since:
+
+1. the existing analyses can continue to generally assume that these
+properties hold (e.g., that all indices are in bounds and all
+downcasts are safe); and,
+2. it would be trivial to implement a static dataflow analysis
+checking that bounds and downcasts only occur downstream of a relevant
+check.
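The static dataflow analysis alluded to in the last point could look something like the following minimal sketch. It is a forward "must" analysis over a toy control-flow graph. The `Block` type and `accesses_are_guarded` function are hypothetical; the sketch tracks a single check/access pair and does not model which branch edge of the check corresponds to success, both of which a real analysis would need.

```rust
/// Toy basic block (hypothetical): does it perform the check, does it
/// perform the guarded access, and which blocks can follow it.
struct Block {
    checks: bool,
    indexes: bool,
    succs: Vec<usize>,
}

/// Returns true if every block that performs the guarded access is only
/// reachable along paths on which an earlier block already performed the check.
/// Block 0 is taken to be the entry block.
fn accesses_are_guarded(blocks: &[Block]) -> bool {
    let n = blocks.len();
    // checked_in[b]: a check has happened on *every* path from the entry
    // to the start of block b. Start optimistic and lower to a fixed point.
    let mut checked_in = vec![true; n];
    if n > 0 {
        checked_in[0] = false; // nothing has been checked on function entry
    }
    let mut changed = true;
    while changed {
        changed = false;
        for (b, block) in blocks.iter().enumerate() {
            let checked_out = checked_in[b] || block.checks;
            for &s in &block.succs {
                // Meet over predecessors is logical AND.
                if checked_in[s] && !checked_out {
                    checked_in[s] = false;
                    changed = true;
                }
            }
        }
    }
    blocks
        .iter()
        .enumerate()
        .all(|(b, block)| !block.indexes || checked_in[b])
}
```

In this sketch a block that both checks and indexes is reported as unguarded, which matches the MIR shape where the check ends one block and the access begins another.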
+
+# Drawbacks
+
+**Converting from AST to a MIR will take some compilation time.**
+Expectations are that constructing the MIR will be quite fast, and
+that follow-on code (such as trans and borrowck) will execute faster,
+because they will operate over a simpler and more compact
+representation. However, this needs to be measured.
+
+**More effort is required to make quality error messages.** Because
+the representation the compiler is working with is now quite different
+from what the user typed, we have to put in extra effort to make sure
+that we bridge this gap when reporting errors. We have some precedent
+for dealing with this, however. For example, the `ExprUseVisitor` (and
+`mem_categorization`) includes extra annotations and hints to tell the
+borrow checker when a reference was introduced as part of a closure
+versus being explicit in the source code. The current prototype
+doesn't have much in this direction, but it should be relatively
+straightforward to add. Hints like those, in addition to spans, should
+be enough to bridge the error message gap.
+
+# Alternatives
+
+**Use SSA.** In the proposed MIR, temporaries are single-assignment
+but can be borrowed, making them more analogous to allocas than SSA
+values. This is helpful to analyses like the borrow checker, because
+it means that the program operates directly on paths through memory,
+versus having the stack modeled as allocas. The current model is also
+helpful for generating debuginfo.
+
+SSA representation can be helpful for more sophisticated backend
+optimizations. However, we tend to leave those optimizations to LLVM,
+and hence it makes more sense to have the MIR be based on lvalues
+instead. There are some cases where it might make sense to do analyses
+on the MIR that would benefit from SSA, such as bounds check elision.
+In those cases, we could either quickly identify those temporaries
+that are not mutably borrowed (and which therefore act like SSA
+variables); or further lower into a LIR (which would be an SSA
+form); or else simply perform the analyses on the MIR using standard
+techniques like def-use chains. (CSE and so forth are straightforward
+both with and without SSA, honestly.)
+
+**Exclude unwinding.** Excluding unwinding from the MIR would allow us
+to elide annoying details like bounds and overflow checking. These are
+not particularly interesting to borrowck, so that is somewhat
+appealing. But that would mean that consumers of MIR would have to
+reconstruct the order of drops and so forth on unwinding paths, which
+would require them reasoning about scopes and other rather complex
+bits of information. Moreover, having all drops fully exposed in the
+MIR is likely helpful for better handling of dynamic drop and also for
+the rules collectively known as dropck, though all details there have
+not been worked out.
+
+**Expand the set of operands.** The proposed MIR forces all rvalue operands
+to be lvalues. This means that integer constants and other "simple" things
+will wind up introducing a temporary. For example, translating `x = 2+2`
+will generate code like:
+
+    tmp0 = 2
+    tmp1 = 2
+    x = tmp0 + tmp1
+
+A more common case will be calls to statically known functions like `x = foo(3)`,
+which desugars to a temporary and a constant reference:
+
+    tmp0 = foo;
+    tmp1 = 3
+    x = tmp0(tmp1)
+
+There is no particular *harm* in such constants: it would be very easy
+to optimize them away when reducing to LLVM bitcode, and if we do not
+do so, LLVM will do it. However, we could also expand the scope of
+operands to include both lvalues and some simple rvalues like
+constants. The main advantage of this is that it would reduce the
+total number of statements and hence might help with memory
+consumption.
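To make the statement-count tradeoff concrete, here is a small sketch that lowers `dst = a + b` for two integer literals under both designs. The `Operand` and `Stmt` types and the `lower_add` function are hypothetical and exist only for this comparison; they are not part of the proposed MIR.

```rust
#[derive(Debug, PartialEq)]
enum Operand {
    Lvalue(String),
    Constant(i64),
}

#[derive(Debug, PartialEq)]
enum Stmt {
    /// dst = constant
    Const(String, i64),
    /// dst = lhs + rhs
    Add(String, Operand, Operand),
}

/// Lower `dst = a + b`. When constant operands are not allowed, each
/// literal is first stored into a temporary, as in the proposed MIR.
fn lower_add(dst: &str, a: i64, b: i64, allow_constant_operands: bool) -> Vec<Stmt> {
    if allow_constant_operands {
        vec![Stmt::Add(
            dst.to_string(),
            Operand::Constant(a),
            Operand::Constant(b),
        )]
    } else {
        vec![
            Stmt::Const("tmp0".to_string(), a),
            Stmt::Const("tmp1".to_string(), b),
            Stmt::Add(
                dst.to_string(),
                Operand::Lvalue("tmp0".to_string()),
                Operand::Lvalue("tmp1".to_string()),
            ),
        ]
    }
}
```

With lvalue-only operands the addition costs three statements and two temporaries; with constant operands it is a single statement, at the price of a second operand kind that every consumer must handle.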
+ +**Totally safe MIR.** This MIR includes operations whose safety is not +trivially type-checked (see the section on *unchecked assertions* +above). We might design a higher-level MIR where those properties held +by construction, or modify the MIR to thread "evidence" of some form +that makes it easier to check that the properties hold. The former +would make downstream code accommodate more complexity. The latter +remains an option in the future but doesn't seem to offer much +practical advantage. + +# Unresolved questions + +**What additional info is needed to provide for good error messages?** +Currently the implementation only has spans on statements, not lvalues +or rvalues. We'll have to experiment here. I expect we will probably +wind up placing "debug info" on all lvalues, which includes not only a +span but also a "translation" into terms the user understands. For +example, in a closure, a reference to a by-reference upvar `foo` will +be translated to something like `*self.foo`, and we would like that to +be displayed to the user as just `foo`. + +**What additional info is needed for debuginfo?** It may be that to +generate good debuginfo we want to include additional information +about control-flow or scoping. + +**Unsafe blocks.** Should we layer unsafe in the MIR so that effect +checking can be done on the CFG? It's not the most natural way to do +it, *but* it would make it fairly easy to support (e.g.) autoderef on +unsafe pointers, since all the implicit operations are made explicit +in the MIR. My hunch is that we can improve our HIR instead. From 67f78ec728913f56c9e969b2226d854724594ad1 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Tue, 14 Jul 2015 09:57:03 -0700 Subject: [PATCH 0378/1195] point to cfg-if. 
--- text/0000-simd-infrastructure.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index cc3063a4494..41c7d8bd2e3 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -324,11 +324,11 @@ not necessarily just everything LLVM understands. There are other non-SIMD features that might have `target_feature`s set too, such as `popcnt` and `rdrnd` on x86/x86-64.) -With a `cfg_if_else!` macro that expands to the first `cfg` that is -satisfied (ala [@alexcrichton's cascade][cascade]), code might look +With a `cfg_if!` macro that expands to the first `cfg` that is +satisfied (ala [@alexcrichton's `cfg-if`][cfg-if]), code might look like: -[cascade]: https://github.com/alexcrichton/backtrace-rs/blob/03703031babfa87cbe2c723ad6752131819dc554/src/macros.rs +[cfg-if]: https://crates.io/crates/cfg-if ```rust cfg_if_else! { From e2c36eb144f21428cbffcc6fdd28a4f6decee133 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 15 Jul 2015 21:39:25 -0700 Subject: [PATCH 0379/1195] RFC 1174 is IntoRaw{Fd,Socket,Handle} --- ...dle-traits.md => 1174-into-raw-fd-socket-handle-traits.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-into-raw-fd-socket-handle-traits.md => 1174-into-raw-fd-socket-handle-traits.md} (93%) diff --git a/text/0000-into-raw-fd-socket-handle-traits.md b/text/1174-into-raw-fd-socket-handle-traits.md similarity index 93% rename from text/0000-into-raw-fd-socket-handle-traits.md rename to text/1174-into-raw-fd-socket-handle-traits.md index a92e175c5dd..38ba3b720ad 100644 --- a/text/0000-into-raw-fd-socket-handle-traits.md +++ b/text/1174-into-raw-fd-socket-handle-traits.md @@ -1,7 +1,7 @@ - Feature Name: into-raw-fd-socket-handle-traits - Start Date: 2015-06-24 -- RFC PR: -- Rust Issue: +- RFC PR: [rust-lang/rfcs#1174](https://github.com/rust-lang/rfcs/pull/1174) +- Rust Issue: 
[rust-lang/rust#27062](https://github.com/rust-lang/rust/issues/27062) # Summary From f54d26e118ca093cf383e883120d0101a43e8b89 Mon Sep 17 00:00:00 2001 From: Peter Marheine Date: Thu, 16 Jul 2015 10:24:18 -0600 Subject: [PATCH 0380/1195] Constrain calling of naked functions from Rust Also expand motivation beyond ISRs per RFC discussion. --- text/0000-naked-fns.md | 146 ++++++++++++++++++++++++++++++----------- 1 file changed, 109 insertions(+), 37 deletions(-) diff --git a/text/0000-naked-fns.md b/text/0000-naked-fns.md index 0a9df61f643..d77a1416203 100644 --- a/text/0000-naked-fns.md +++ b/text/0000-naked-fns.md @@ -10,12 +10,19 @@ function attribute. # Motivation -Some systems programming tasks require that machine state not be modified at all -on function entry so it can be preserved- particularly in interrupt handlers. -For example, x86\_64 preserves only the stack pointer, flags register, and -instruction pointer on interrupt entry. To avoid corrupting program state, the -interrupt handler must save the registers which might be modified before handing -control to compiler-generated code. Consider a contrived interrupt handler: +Some systems programming tasks require that the programmer have complete control +over function stack layout and interpretation, generally in cases where the +compiler lacks support for a specific use case. While these cases can be +addressed by building the requisite code with external tools and linking with +Rust, it is advantageous to allow the Rust compiler to drive the entire process, +particularly in that code may be generated via monomorphization or macro +expansion. + +When writing interrupt handlers for example, most systems require additional +state be saved beyond the usual ABI requirements. To avoid corrupting program +state, the interrupt handler must save the registers which might be modified +before handing control to compiler-generated code. 
Consider a contrived +interrupt handler for x86\_64: ```rust unsafe fn isr_nop() { @@ -47,14 +54,25 @@ stack layout for any given function are not predictable (and may change with compiler version or optimization settings), attempting to predict the stack layout to sidestep this issue is infeasible. -In other languages (particularly C), "naked" functions omit the prologue and -epilogue (represented by the modifications to `rsp` in the above example) to -allow the programmer complete control over stack layout. This makes the -availability of stack space for compiler use unpredictable, usually implying -that the body of such a function must consist entirely of inline assembly -statements (such as a jump or call to another function). +When interacting with FFIs that are not natively supported by the compiler, +a similar situation arises where the programmer knows the expected calling +convention and can implement a translation between the foreign ABI and one +supported by the compiler. -The [LLVM language +Support for naked functions also allows programmers to write functions that +would otherwise be unsafe, such as the following snippet which returns the +address of its caller when called with the C ABI on x86. + +``` + mov 4(%ebp), %eax + ret +``` + +--- + +Because the compiler depends on a function prologue and epilogue to maintain +storage for local variable bindings, it is generally unsafe to write anything +but inline assembly inside a naked function. The [LLVM language reference](http://llvm.org/docs/LangRef.html#function-attributes) describes this feature as having "very system-specific consequences", which the programmer must be aware of. @@ -64,23 +82,83 @@ be aware of. Add a new function attribute to the language, `#[naked]`, indicating the function should have prologue/epilogue emission disabled. 
-For example, the following construct could be assumed not to generate extra code -on entry to `isr_caller` which might violate the programmer's assumptions, while -allowing the compiler to generate the function definition as usual: +Because the calling convention of a naked function is not guaranteed to match +any calling convention the compiler is compatible with, calls to naked functions +from within Rust code are forbidden unless the function is also declared with +a well-defined ABI. + +The function `call_foo` in the following code block is an error because the +default (Rust) ABI is unspecified and as such a programmer can never write code +in `foo` which is compatible: + +```rust +#[naked] +fn foo() { } + +fn call_foo() { + foo(); +} +``` + +The following variant is not an error because the C calling convention is +well-defined and it is thus possible for the programmer to write a conforming +function: + +```rust +#[naked] +extern "C" fn foo() { } + +fn call_foo() { + foo(); +} +``` + +--- + +The current support for `extern` functions in `rustc` generates a minimum of two +basic blocks for any function declared in Rust code with a non-default calling +convention: a trampoline which translates the declared calling convention to the +Rust convention, and a Rust ABI version of the function containing the actual +implementation. Calls to the function from Rust code call the Rust ABI version +directly. + +For naked functions, it is impossible for the compiler to generate a Rust ABI +version of the function because the implementation may depend on the calling +convention. In cases where calling a naked function from Rust is permitted, the +compiler must be able to use the target calling convention directly rather than +call the same function with the Rust convention. + +--- + +The following example illustrates the possible use of a naked function for +implementation of an interrupt service routine on 32-bit x86. 
```rust +use std::intrinsics; +use std::sync::atomic::{self, AtomicUsize, Ordering}; + #[naked] -unsafe fn isr_caller() { - asm!("push %rax - call other_function - pop %rax - iretq" :::: "volatile"); - core::intrinsics::unreachable(); +#[cfg(target_arch="x86")] +unsafe fn isr_3() { + asm!("pushad + call increment_breakpoint_count + popad + iretd" :::: "volatile"); + intrinsics::unreachable(); } +static bp_count: AtomicUsize = ATOMIC_USIZE_INIT; + #[no_mangle] -pub fn other_function() { +pub fn increment_breakpoint_count() { + bp_count.fetch_add(1, Ordering::Relaxed); +} + +fn register_isr(vector: u8, handler: fn() -> ()) { /* ... */ } +fn main() { + register_isr(3, isr_3); + // ... } ``` @@ -93,22 +171,16 @@ considered. # Alternatives Do nothing. The required functionality for the use case outlined can be -implemented outside Rust code (such as with a small amount of externally-built -assembly) and merely linked in as needed. - -Add a new calling convention (`extern "interrupt" fn ...`) which is defined to -do any necessary state saving for interrupt service routines. This permits more -efficient code to be generated for the motivating example (omitting a 'call' -instruction which is necessary for any non-trivial ISR), but may not be -appropriate for other situations that might call for a naked function. -Implementation of additional calling conventions like this in the current -`rustc` would involve significant modification to LLVM to support it (whereas -the proof-of-concept patch for `#[naked]` is less than 10 lines of code). +implemented outside Rust code and linked in as needed. Support for additional +calling conventions could be added to the compiler as needed, or emulated with +external libraries such as `libffi`. # Unresolved questions It is easy to quietly generate wrong code in naked functions, such as by causing the compiler to allocate stack space for temporaries where none were -anticipated. 
It may be desirable to allow the `#[naked]` attribute on `unsafe` -functions only, reinforcing the need for extreme care in the use of this -feature. +anticipated. It may be desirable to require that all statements inside naked +functions be inside `unsafe` blocks (either by declaring the function `unsafe` +or including `unsafe { }` in the function body) to reinforce the need for +extreme care in the use of this feature. Requiring that the function always be +marked `unsafe` is not desirable because its external API may be safe. From 4732352c41a36a485d7b4e59f4dfea596e8f27b3 Mon Sep 17 00:00:00 2001 From: Cesar Eduardo Barros Date: Thu, 16 Jul 2015 20:20:26 -0300 Subject: [PATCH 0381/1195] Use .is_empty() instead of .len() Should make no difference in speed, but is slightly cleaner. --- text/0000-read-all.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-read-all.md b/text/0000-read-all.md index 83b1f4aa4b7..c242535a275 100644 --- a/text/0000-read-all.md +++ b/text/0000-read-all.md @@ -60,7 +60,7 @@ Aditionally, a default implementation of this method is provided: ``` rust fn read_exact(&mut self, mut buf: &mut [u8]) -> Result<()> { - while buf.len() > 0 { + while !buf.is_empty() { match self.read(buf) { Ok(0) => break, Ok(n) => { let tmp = buf; buf = &mut tmp[n..]; } @@ -68,7 +68,7 @@ fn read_exact(&mut self, mut buf: &mut [u8]) -> Result<()> { Err(e) => return Err(e), } } - if buf.len() > 0 { + if !buf.is_empty() { Err(Error::new(ErrorKind::UnexpectedEOF, "failed to fill whole buffer")) } else { Ok(()) From 7c18705a88ae16f83f40e1ac524d930e547df4ce Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 16 Jul 2015 17:27:17 -0700 Subject: [PATCH 0382/1195] Scale back; remove the ability to update --- text/0000-cargo-install.md | 36 +++++------------------------------- 1 file changed, 5 insertions(+), 31 deletions(-) diff --git a/text/0000-cargo-install.md b/text/0000-cargo-install.md index 5561b246d13..3580320af93 100644 --- 
a/text/0000-cargo-install.md +++ b/text/0000-cargo-install.md @@ -21,8 +21,7 @@ Fundamentally, however, Cargo is a ubiquitous tool among the Rust community and implementing `cargo install` would facilitate sharing Rust code among its developers. Simple tasks like installing a new cargo subcommand, installing an editor plugin, etc, would be just a `cargo install` away. Cargo can manage -dependencies, versions, updates, etc, itself to make the process as seamless as -possible. +dependencies and versions itself to make the process as seamless as possible. Put another way, enabling easily sharing code is one of Cargo's fundamental design goals, and expanding into binaries is simply an extension of Cargo's core @@ -43,7 +42,6 @@ Installing new crates: Managing installed crates: cargo install [options] --list - cargo install [options] --update [SPEC | --all] Options: -h, --help Print this message @@ -76,9 +74,7 @@ crate has multiple binaries, the `--bin` argument can selectively install only one of them, and if you'd rather install examples the `--example` argument can be used as well. -The `--list` option will list all installed packages (and their versions). The -`--update` option will update either the crate specified or all installed -crates. +The `--list` option will list all installed packages (and their versions). ``` ## Installing Crates @@ -162,31 +158,9 @@ binaries belong to which package. If Cargo gives access to installing packages, it should surely provide the ability to manage what's installed! The first part of this is just discovering -what's installed, and this is provided via `cargo install --list`. A more -interesting aspect is the `cargo install --update` command. - -#### Updating Crates - -Once a crate is installed new versions can be released or perhaps the build -configuration wants to be tweaked, so Cargo will provide the ability to update -crates in-place. 
By default *something* needs to be specified to the `--update` -flag, either a specific crate that's been installed or the `--all` flag to -update all crates. Because multiple crates of the same name can come from -different sources, the argument to the `--update` flag will be a package id -specification instead of just the name of a crate. - -When updating a crate, it will first attempt to update the source code for the -crate. For crates.io sources this means that it will download the most recent -version. For git sources it means the git repo will be updated, but the same -branch/tag will be used (if original specified when installed). Git sources -installed via `--rev` won't be updated. - -After the source code has been updated, the crate will be rebuilt according to -the flags specified on the command line. This will override the flags that were -previously used to install a crate, for example activated features are not -remembered. - -#### Removing Crates +what's installed, and this is provided via `cargo install --list`. + +## Removing Crates To remove an installed crate, another subcommand will be added to Cargo: From 7902a6d90854a6478770c16177bf6aa93fc9709a Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 17 Jul 2015 16:22:08 -0400 Subject: [PATCH 0383/1195] Initial version. 
--- text/0000-projections-lifetimes-and-wf.md | 956 ++++++++++++++++++++++ 1 file changed, 956 insertions(+) create mode 100644 text/0000-projections-lifetimes-and-wf.md diff --git a/text/0000-projections-lifetimes-and-wf.md b/text/0000-projections-lifetimes-and-wf.md new file mode 100644 index 00000000000..a01a9db0787 --- /dev/null +++ b/text/0000-projections-lifetimes-and-wf.md @@ -0,0 +1,956 @@ +- Feature Name: N/A +- Start Date: (fill me in with today's date, YYYY-MM-DD) +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Type system changes to address the outlives relation with respect to +projections, and to better enforce that all types are well-formed +(meaning that they respect their declared bounds). The current +implementation can be unsound ([#24662]), inconvenient +([#23442]), and surprising ([#21748], [#25692]). The changes are as follows: + +- Simplify the outlives relation to be syntactically based. +- Specify improved rules for the outlives relation and projections. +- Specify more specifically where WF bounds are enforced, covering + several cases missing from the implementation. + +The proposed changes here have been tested and found to cause only a +modest number of regressions (about two dozen root regressions were +previously found on crates.io; however, that run did not yet include +all the provisions from this RFC; updated numbers coming soon). In +order to minimize the impact on users, the plan is to introduce +the changes in two stages: + +1. Initially, warnings will be issued for cases that violate the rules + specified in this RFC. These warnings are not lints and cannot be + silenced except by correcting the code such that it type-checks + under the new rules. +2. After one release cycle, those warnings will become errors. + +Note that although the changes do cause regressions, they also cause +some code (like that in [#23442]) which currently gets errors to +compile successfully. 
+ +# Motivation + +### TL;DR + +This is a long detailed RFC that is attempting to specify in some +detail aspects of the type system that were underspecified or buggily +implemented before. This section just summarizes the effect on +existing Rust code in terms of changes that may be required. + +**Warnings first, errors later.** Although the changes described in +this RFC are necessary for soundness (and many of them are straight-up +bugfixes), there is some impact on existing code. Therefore the plan +is to first issue warnings for a release cycle and then transition to +hard errors, so as to ease the migration. + +**Associated type projections and lifetimes work more smoothly.** The +current rules for relating associated type projections (like `T::Foo`) +and lifetimes are somewhat cumbersome. The newer rules are more +flexible, so that e.g. we can deduce that `T::Foo: 'a` if `T: 'a`, and +similarly that `T::Foo` is well-formed if `T` is well-formed. As a +bonus, the new rules are also sound. ;) + +**The outlives relation is simpler.** The older definition for the +outlives relation `T: 'a` was rather subtle. The new rule basically +says that if all type/lifetime parameters appearing in the type `T` +must outlive `'a`, then `T: 'a` (though there can also be other ways +for us to decide that `T: 'a` is valid, such as in-scope where +clauses). So for example `fn(&'x X): 'a` if `'x: 'a` and `X: 'a` +(presuming that `X` is a type parameter). The older rules were based +on what kind of data was actually *reachable*, and hence accepted this +type (since no data of `&'x X` is reachable from a function pointer). 
+This change primarily affects struct declarations, since they may now +require additional outlives bounds: + +```rust +// OK now, but after this RFC requires `X: 'a`: +struct Foo<'a, X> { + f: fn(&'a X) // (because of this field) +} +``` + +**More types are sanity checked.** Generally Rust requires that if you +have a type like `SomeStruct<T>`, then whatever where clauses are +declared on `SomeStruct` must hold for `T` (this is called being +"well-formed"). For example, if `SomeStruct` is declared like so: + +```rust +struct SomeStruct<T:Eq> { .. } +``` + +then this implies that `SomeStruct<f32>` is ill-formed, since `f32` +does not implement `Eq` (just `PartialEq`). However, the current compiler +doesn't check this in associated type definitions: + +```rust +impl Iterator for SomethingElse { + type Item = SomeStruct<f32>; // OK now, not after this RFC +} +``` + +Similarly, WF checking was skipped for trait object types and fn +arguments. This means that `fn(SomeStruct<f32>)` would be considered +well-formed today, though attempting to call the function would be an +error. Under this RFC, that fn type is not well-formed (though +sometimes when there are higher-ranked regions, WF checking may still +be deferred until the point where the fn is called). + +There are a few other places where similar requirements were being +overlooked before but will now be enforced. For example, a number of +traits like the following were found in the wild: + +```rust +trait Foo { + // currently accepted, but should require that Self: Sized + fn method(&self, value: Option<Self>) { + // note: default implementation here + } +} +``` + +Because this method supplies a default implementation, it requires +that the argument types are well-formed, which in turn means that +`Self: Sized` must hold. But for some reason (of which I am actually +not entirely sure) this is not checked now. 
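As an illustration only (this block is not part of the RFC text): a minimal sketch of how a trait like the one above might be written so that it passes the stricter check, assuming the chosen fix is to state `Self: Sized` on the method itself. The names `Describe`, `describe_or`, and `Point` are hypothetical.

```rust
// Hypothetical trait mirroring the `Foo` example above.
trait Describe {
    fn name(&self) -> String;

    // The default body receives `Option<Self>` by value, so the method
    // declares the `Self: Sized` bound that well-formedness requires.
    fn describe_or(&self, fallback: Option<Self>) -> String
    where
        Self: Sized,
    {
        match fallback {
            Some(other) => format!("{} (fallback: {})", self.name(), other.name()),
            None => self.name(),
        }
    }
}

// A trivially sized implementor, used only to exercise the default method.
struct Point;

impl Describe for Point {
    fn name(&self) -> String {
        "Point".to_string()
    }
}
```

Putting the bound on the method rather than on the trait keeps the trait usable with unsized implementors; only the one method becomes unavailable for them.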
+ +### Projections and the outlives relation + +[RFC 192] introduced the outlives relation `T: 'a` and described the +rules that are used to decide when one type outlives a lifetime. In +particular, the RFC describes rules that govern how the compiler +determines what kind of borrowed data may be "hidden" by a generic +type. For example, given this function signature: + +```rust +fn foo<'a,I>(x: &'a I) + where I: Iterator +{ ... } +``` + +the compiler is able to use implied region bounds (described more +below) to automatically determine that: + +- all borrowed content in the type `I` outlives the function body; +- all borrowed content in the type `I` outlives the lifetime `'a`. + +When associated types were introduced in [RFC 195], some new rules +were required to decide when an "outlives relation" involving a +projection (e.g., `I::Item: 'a`) should hold. The initial rules were +[very conservative][#22246]. This led to the rules from [RFC 192] +being [adapted] to cover associated type projections like +`I::Item`. Unfortunately, these adapted rules are not ideal, and can +still lead to [annoying errors in some situations][#23442]. Finding a +better solution has been on the agenda for some time. + +Simultaneously, we realized in [#24662] that the compiler had a bug +that caused it to erroneously assume that every projection like `I::Item` +outlived the current function body, just as it assumes that type +parameters like `I` outlive the current function body. **This bug can +lead to unsound behavior.** Unfortunately, simply implementing the +naive fix for #24662 exacerbates the shortcomings of the current rules +for projections, causing widespread compilation failures in all sorts +of reasonable and obviously correct code. + +**This RFC describes modifications to the type system that both +restore soundness and make working with associated types more +convenient in some situations.** The changes are largely but not +completely backwards compatible. 
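As an illustration only (not part of the RFC text): a small sketch of the kind of code the relaxed projection rules are meant to accept, assuming a compiler that derives `I::Item: 'a` from `I: 'a`. The function name `box_first` is hypothetical. Note that the signature never states `I::Item: 'a`, yet the body needs it to build the trait object.

```rust
use std::fmt::Debug;

// Boxes the first item of an iterator as a trait object bounded by `'a`.
// Creating `Box<dyn Debug + 'a>` from an `I::Item` requires `I::Item: 'a`;
// the only outlives bound written here is `I: 'a`.
fn box_first<'a, I>(mut iter: I) -> Option<Box<dyn Debug + 'a>>
where
    I: Iterator + 'a,
    I::Item: Debug,
{
    iter.next().map(|item| Box::new(item) as Box<dyn Debug + 'a>)
}
```

Under more conservative rules, a caller-visible `where I::Item: 'a` clause would have to be added to every such signature.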
+ +### Well-formed types + +A type is considered *well-formed* (WF) if it meets some simple +correctness criteria. For builtin types like `&'a T` or `[T]`, these +criteria are built into the language. For user-defined types like a +struct or an enum, the criteria are declared in the form of where +clauses. In general, all types that appear in the source and elsewhere +should be well-formed. + +For example, consider this type, which combines a reference to a +hashmap and a vector of additional key/value pairs: + +```rust +struct DeltaMap<'a, K, V> where K: Hash + 'a, V: 'a { + base_map: &'a mut HashMap<K,V>, + additional_values: Vec<(K,V)> +} +``` + +Here, the WF criteria for `DeltaMap` are as follows: + +- `K: Hash`, because of the where-clause, +- `K: 'a`, because of the where-clause, +- `V: 'a`, because of the where-clause, +- `K: Sized`, because of the implicit `Sized` bound, +- `V: Sized`, because of the implicit `Sized` bound. + +Let's look at those `K:'a` bounds a bit more closely. If you leave +them out, you will find that the structure definition above does +not type-check. This is due to the requirement that the types of all +fields in a structure definition must be well-formed. In this case, +the field `base_map` has the type `&'a mut HashMap<K,V>`, and this +type is only valid if `K: 'a` and `V: 'a` hold. Since we don't know +what `K` and `V` are, we have to surface this requirement in the form +of a where-clause, so that users of the struct know that they must +maintain this relationship in order for the struct to be internally +coherent. + +#### An aside: explicit WF requirements on types + +You might wonder why you have to write `K:Hash` and `K:'a` explicitly. +After all, they are obvious from the types of the fields. 
The reason +is that we want to make it possible to check whether a type like +`DeltaMap<'foo,T,U>` is well-formed *without* having to inspect the +types of the fields -- that is, in the current design, the only +information that we need to use to decide if `DeltaMap<'foo,T,U>` is +well-formed is the set of bounds and where-clauses. + +This has real consequences on usability. It would be possible for the +compiler to infer bounds like `K:Hash` or `K:'a`, but the origin of +the bound might be quite remote. For example, we might have a series +of types like: + +```rust +struct Wrap1<'a,K>(Wrap2<'a,K>); +struct Wrap2<'a,K>(Wrap3<'a,K>); +struct Wrap3<'a,K>(DeltaMap<'a,K,K>); +``` + +Now, for `Wrap1<'foo,T>` to be well-formed, `T:'foo` and `T:Hash` must +hold, but this is not obvious from the declaration of +`Wrap1`. Instead, you must trace deeply through its fields to find out +that this obligation exists. + +#### Implied lifetime bounds + +To help avoid undue annotation, Rust relies on implied lifetime bounds +in certain contexts. Currently, this is limited to fn bodies. The idea +is that for functions, we can make callers do some portion of the WF +validation, and let the callees just assume it has been done +already. (This is in contrast to the type definition, where we +required that the struct itself declares all of its requirements up +front in the form of where-clauses.) + +To see this in action, consider a function that uses a `DeltaMap`: + +```rust +fn foo<'a,K:Hash,V>(d: DeltaMap<'a,K,V>) { ... } +``` + +You'll notice that there are no `K:'a` or `V:'a` annotations required +here. This is due to *implied lifetime bounds*. Unlike structs, a +function's caller must examine not only the explicit bounds and +where-clauses, but *also* the argument and return types. When there +are generic type/lifetime parameters involved, the caller is in charge +of ensuring that those types are well-formed. 
(This is in contrast +with type definitions, where the type is in charge of figuring out its +own requirements and listing them in one place.) + +As the name "implied lifetime bounds" suggests, we currently limit +implied bounds to region relationships. That is, we will implicitly +derive a bound like `K:'a` or `V:'a`, but not `K:Hash` -- this must +still be written manually. It might be a good idea to change this, but +that would be the topic of a separate RFC. + +Currently, implied bounds are limited to fn bodies. This RFC expands +the use of implied bounds to cover impl definitions as well, since +otherwise the annotation burden is quite painful. More on this in the +next section. + +*NB.* There is an additional problem concerning the interaction of +implied bounds and contravariance ([#25860]). To better separate the +issues, this will be addressed in a follow-up RFC that should appear +shortly. + +#### Missing WF checks + +Unfortunately, the compiler currently fails to enforce WF in several +important cases. For example, the +[following program](http://is.gd/6JXjyg) is accepted: + +```rust +struct MyType<T:Copy> { t: T } + +trait ExampleTrait { + type Output; +} + +struct ExampleType; + +impl ExampleTrait for ExampleType { + type Output = MyType<Box<i32>>; + // ~~~~~~~~~~~~~~~~ + // | + // Note that `Box<i32>` is not `Copy`! +} +``` + +However, if we simply naively add the requirement that associated +types must be well-formed, this results in a large annotation burden +(see e.g. [PR 25701](https://github.com/rust-lang/rust/pull/25701/)). +For example, in practice, many iterator implementations break due to +region relationships: + +```rust +impl<'a, T> IntoIterator for &'a LinkedList<T> { + type Item = &'a T; + ... +} +``` + +The problem here is that for `&'a T` to be well-formed, `T: 'a` must +hold, but that is not specified in the where clauses. 
This RFC +proposes using implied bounds to address this concern -- specifically, +every impl is permitted to assume that all types which appear in the +impl header (trait reference) are well-formed, and in turn each "user" +of an impl will validate this requirement whenever they project out of +a trait reference (e.g., to do a method call, or normalize an +associated type). + +# Detailed design + +This section dives into detail on the proposed type rules. + +### A little type grammar + +We extend the type grammar from [RFC 192] with projections and slice +types: + + T = scalar (i32, u32, ...) // Boring stuff + | X // Type variable + | Id<P0..Pn> // Nominal type (struct, enum) + | &r T // Reference (mut doesn't matter here) + | O0..On+r // Object type + | [T] // Slice type + | for<r..> fn(T1..Tn) -> T0 // Function pointer + | <P0 as Trait<P1..Pn>>::Id // Projection + P = r // Region name + | T // Type + O = for<r..> TraitId<P1..Pn> // Object type fragment + r = 'x // Region name + +We'll use this to describe the rules in detail. + +### Syntactic definition of the outlives relation + +The outlives relation is defined in purely syntactic terms as follows. +These are inference rules written in a primitive ASCII notation. :) As +part of defining the outlives relation, we need to track the set of +lifetimes that are bound within the type we are looking at. Let's +call that set `R=<r0..rn>`. Initially, this set `R` is empty, but it +will grow as we traverse through types like fns or objects, which can +bind region names. + +#### Simple outlives rules + +Here are the rules covering the simple cases, where no type parameters +or projections are involved: + + OutlivesScalar: + -------------------------------------------------- + R ⊢ scalar: 'a + + OutlivesNominalType: + ∀i. Pi: 'a + -------------------------------------------------- + R ⊢ Id<P0..Pn>: 'a + + OutlivesReference: + R ⊢ 'x: 'a + R ⊢ T: 'a + -------------------------------------------------- + R ⊢ &'x T: 'a + + OutlivesObject: + ∀i. 
R ⊢ Oi: 'a + R ⊢ 'x: 'a + -------------------------------------------------- + R ⊢ O0..On+'x: 'a + + OutlivesFunction: + ∀i. R,r.. ⊢ Ti: 'a + -------------------------------------------------- + R ⊢ for<r..> fn(T1..Tn) -> T0: 'a + + OutlivesTraitRef: + ∀i. R,r.. ⊢ Pi: 'a + -------------------------------------------------- + R ⊢ for<r..> TraitId<P1..Pn>: 'a + +#### Outlives for lifetimes + +The outlives relation for lifetimes depends on whether the lifetime in +question was bound within a type or not. In the usual case, we decide +the relationship between two lifetimes by consulting the environment. +Lifetimes representing scopes within the current fn have a +relationship derived from the code itself, lifetime parameters have +relationships defined by where-clauses and implied bounds: + + 'x not in R + ('x: 'a) in Env + -------------------------------------------------- + R ⊢ 'x: 'a + +For higher-ranked lifetimes, we simply ignore the relation, since the +lifetime is not yet known. This means for example that `for<'a> fn(&'a +i32): 'x` holds, even though we do not yet know what region `'a` is +(and in fact it may be instantiated many times with different values +on each call to the fn). + + 'x in R + -------------------------------------------------- + R ⊢ 'x: 'a + +#### Outlives for type parameters + +For type parameters, the only way to draw "outlives" conclusions is to +find information in the environment (which is being threaded +implicitly here, since it is never modified). In terms of a Rust +program, this means both explicit where-clauses and implied bounds +derived from the signature (discussed below). + + OutlivesTypeParameterEnv: + X: 'a in Env + -------------------------------------------------- + R ⊢ X: 'a + + +#### Outlives for projections + +Projections have the most possibilities. 
First, we may find +information in the environment, as with type parameters, but we can +also consult the trait definition to find bounds (consider an +associated type declared like `type Foo: 'static`). These rules only +apply if there are no higher-ranked lifetimes in the projection; for +simplicity's sake, we encode that by requiring an empty list of +higher-ranked lifetimes. (This is somewhat stricter than necessary, +but reflects the behavior of my prototype implementation.) + + OutlivesProjectionEnv: + <P0 as Trait<P1..Pn>>::Id: 'a in Env + -------------------------------------------------- + <> ⊢ <P0 as Trait<P1..Pn>>::Id: 'a + + OutlivesProjectionTraitDef: + WC = [Xi => Pi] WhereClauses(Trait) + <P0 as Trait<P1..Pn>>::Id: 'a in WC + -------------------------------------------------- + <> ⊢ <P0 as Trait<P1..Pn>>::Id: 'a + +All the rules covered so far already exist today. This last rule, +however, is not only new, it is the crucial insight of this RFC. It +states that if all the components in a projection's trait reference +outlive `'a`, then the projection must outlive `'a`: + + OutlivesProjectionComponents: + ∀i. R ⊢ Pi: 'a + -------------------------------------------------- + R ⊢ <P0 as Trait<P1..Pn>>::Id: 'a + +Given the importance of this rule, it's worth spending a bit of time +discussing it in more detail. The following explanation is fairly +informal. A more detailed look can be found in the appendix. + +Let's begin with a concrete example of an iterator type, like +`std::vec::Iter<'a,T>`. We are interested in the projection of +`Iterator::Item`: + + <Iter<'a,T> as Iterator>::Item + +or, in the more succinct (but potentially ambiguous) form: + + Iter<'a,T>::Item + +Since I'm going to be talking a lot about this type, let's just call +it `<PROJ>` for now. We would like to determine whether `<PROJ>: 'x` holds. + +Now, the easy way to solve `<PROJ>: 'x` would be to normalize `<PROJ>` +by looking at the relevant impl: + +```rust +impl<'b,U> Iterator for Iter<'b,U> { + type Item = &'b U; + ... 
+} +``` + +From this impl, we can conclude that `<PROJ> == &'a T`, and thus +reduce `<PROJ>: 'x` to `&'a T: 'x`, which in turn holds if `'a: 'x` +and `T: 'x` (from the rule `OutlivesReference`). + +But often we are in a situation where we can't normalize the +projection. What can we do then? The rule +`OutlivesProjectionComponents` says that if we can conclude that every +lifetime/type parameter `Pi` to the trait reference outlives `'x`, +then we know that a projection from those parameters outlives `'x`. In +our example, the trait reference is `<Iter<'a,T> as Iterator>`, so +that means that if the type `Iter<'a,T>` outlives `'x`, then the +projection `<PROJ>` outlives `'x`. Now, you can see that this +trivially reduces to the same result as the normalization, since +`Iter<'a,T>: 'x` holds if `'a: 'x` and `T: 'x` (from the rule +`OutlivesNominalType`). + +OK, so we've seen that applying the rule +`OutlivesProjectionComponents` comes up with the same result as +normalizing (at least in this case), and that's a good sign. But what +is the basis of the rule? + +The basis of the rule comes from reasoning about the impl that we used +to do normalization. Let's consider that impl again, but this time +hide the actual type that was specified: + +```rust +impl<'b,U> Iterator for Iter<'b,U> { + type Item = /* <TYPE> */; + ... +} +``` + +So when we normalize `<PROJ>`, we obtain the result by applying some +substitution `Θ` to `<TYPE>`. This substitution is a mapping from the +lifetime/type parameters on the impl to some specific values, such +that `<PROJ> == Θ <Iter<'b,U> as Iterator>::Item`. In this case, that +means `Θ` would be `['b => 'a, U => T]` (and of course `<TYPE>` would +be `&'b U`, but we're not supposed to rely on that). + +The key idea for the `OutlivesProjectionComponents` is that the only +way that `<TYPE>` can *fail* to outlive `'x` is if either: + +- it names some lifetime parameter `'p` where `'p: 'x` does not hold; or, +- it names some type parameter `X` where `X: 'x` does not hold. 
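As a rough illustration of where this reasoning leads, the sketch below shows the intended effect of `OutlivesProjectionComponents` in ordinary code: knowing only `I: 'a`, the obligation `I::Item: 'a` can be discharged without normalizing the projection. The names `assert_outlives` and `first_item` are hypothetical helpers invented for this example, and the snippet assumes a compiler that implements the rules proposed here.

```rust
// Hypothetical helper: it can only be called when `T: 'a` is provable.
fn assert_outlives<'a, T: 'a>(_value: &T) {}

// The only outlives fact in the environment is `I: 'a`. The projection
// `I::Item` cannot be normalized here because `I` is an unknown type.
fn first_item<'a, I>(iter: &mut I) -> Option<I::Item>
where
    I: Iterator + 'a,
{
    let item = iter.next()?;
    // Requires `<I as Iterator>::Item: 'a`. Under the components rule this
    // follows from `I: 'a`, the sole component of the trait reference.
    assert_outlives::<'a, I::Item>(&item);
    Some(item)
}

fn main() {
    let data = vec![10, 20, 30];
    let mut iter = data.iter();
    let first = first_item(&mut iter);
    assert_eq!(first, Some(&10));
    println!("first item: {:?}", first);
}
```

Note that the bound is checked purely from the trait reference: nothing in `first_item` depends on what the impl of `Iterator` for `I` actually chose as its `Item`.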
+ +Now, the only way that `<TYPE>` can refer to a parameter `P` is if it +is brought in by the substitution `Θ`. So, if we can just show that +all the types/lifetimes that are in the range of `Θ` outlive `'x`, then we +know that `Θ <TYPE>` outlives `'x`. + +Put yet another way: imagine that you have an impl with *no +parameters*, like: + +```rust +impl Iterator for Foo { + type Item = /* <TYPE> */; +} +``` + +Clearly, whatever `<TYPE>` is, it can only refer to the lifetime +`'static`. So clearly `<Foo as Iterator>::Item: 'static` holds. We +know this is true without ever knowing what `<TYPE>` is -- we just +need to see that the trait reference `<Foo as Iterator>` doesn't have +any lifetimes or type parameters in it, and hence the impl cannot +refer to any lifetime or type parameters. + +#### Implementation complications + +One complication for the implementation is that there are so many +potential outlives rules for projections. In particular, the rule that +says `<P0 as Trait<P1..Pn>>::Id: 'a` holds if `Pi: 'a` is not an "if and +only if" rule. So, for example, if we know that `T: 'a` and `'b: 'a`, +then we know that `<T as Trait<'b>>::Item: 'a` (for any trait and +item), but not vice versa. This complicates inference significantly, +since if variables are involved, we do not know whether to create +edges between the variables or not (put another way, the simple +dataflow model we are currently using doesn't truly suffice for these +rules). + +This complication is unfortunate, but to a large extent already exists +with where-clauses and trait matching (see e.g. [#21974]). (Moreover, +it seems to be inherent to the concept of associated types, since they +take several inputs (the parameters to the trait) which may or may not +be related to the actual type definition in question.) + +For the time being, the current implementation takes a pragmatic +approach based on heuristics. 
It tries to avoid adding edges to the +region graph in various common scenarios, and in the end falls back to +enforcing conditions that may be stricter than necessary, but which +certainly suffice. We have not yet encountered an example in practice +where the current implementation rules do not suffice. + +### The WF relation + +This section describes the "well-formed" relation. In +[previous RFCs][RFC 192], this was combined with the outlives +relation. We separate it here for reasons that shall become clear when +we discuss WF conditions on impls. + +The WF relation is really pretty simple: it just says that a type is +"self-consistent". Typically, this would include validating scoping +(i.e., that you don't refer to a type parameter `X` if you didn't +declare one), but we'll take those basic conditions for granted. + + WfScalar: + -------------------------------------------------- + R ⊢ scalar WF + + WfParameter: + -------------------------------------------------- + R ⊢ X WF + + WfNominalType: + ∀i. R ⊢ Pi WF // parameters must be WF, + C = WhereClauses(Id) // and the conditions declared on Id must hold... + R ⊢ [P0..Pn] C // ...after substituting parameters, of course + -------------------------------------------------- + R ⊢ Id<P0..Pn> WF + + WfReference: + R ⊢ T WF // T must be WF + R ⊢ T: 'x // T must outlive 'x + -------------------------------------------------- + R ⊢ &'x T WF + + WfSlice: + R ⊢ T WF + R ⊢ T: Sized + -------------------------------------------------- + R ⊢ [T] WF + + WfProjection: + ∀i. R ⊢ Pi WF // all components well-formed + R ⊢ <P0: Trait<P1..Pn>> // the projection itself is valid + -------------------------------------------------- + R ⊢ <P0 as Trait<P1..Pn>>::Id WF + +#### WF checking and higher-ranked types + +There are two places in Rust where types can introduce lifetime names +into scope: fns and trait objects. These have somewhat different rules +than the rest, simply because they modify the set `R` of bound +lifetime names. 
Let's start with the rule for fn types: + + WfFn: + ∀i. R, r.. ⊢ Ti + -------------------------------------------------- + R ⊢ for<r..> fn(T1..Tn) -> T0 + +Basically, this rule says that a `fn` type is *always* WF, regardless +of what types it references. This certainly accepts a type like +`for<'a> fn(x: &'a T)`. However, it also accepts some types that it +probably shouldn't. Consider for example if we had a type like +`NoHash` that is not hashable; in that case, it'd be nice if +`fn(HashMap<NoHash, u32>)` were not considered well-formed. But these +rules would accept it, because `HashMap<NoHash,u32>` appears inside a +fn signature. + +Note that `fn` types do not require that `T0..Tn` be `Sized`. This is +intentional. The limitation that only sized values can be passed as +arguments (or returned) is enforced at the time when a fn is actually +called, as well as in actual fn definitions, but is not considered +fundamental to fn types themselves. There are several reasons for +this. For one thing, it's forwards compatible with passing DST by +value. For another, it means that non-defaulted trait methods do +not have to show that their argument types are `Sized` (this will be +checked in the implementations, where more types are known). Since the +implicit `Self` type parameter is not `Sized` by default ([RFC 546]), +requiring that argument types be `Sized` in trait definitions proves +to be an annoying annotation burden. + +The object type rule is similar, though it includes an extra clause: + + WfObject: + rᵢ = union of implied region bounds from Oi + ∀i. rᵢ: r + ∀i. R ⊢ Oi WF + -------------------------------------------------- + R ⊢ O0..On+r WF + +The first two clauses here state that the explicit lifetime bound `r` +must be an approximation for the implicit bounds `rᵢ` derived from +the trait definitions. That is, if you have a trait definition like + +```rust +trait Foo: 'static { ... 
} +``` + +and a trait object like `Foo+'x`, then we require that `'static: 'x` +(which is true, clearly, but in some cases the implicit bounds from +traits are not `'static` but rather some named lifetime). + +The next clause states that all object fragments must be WF. An object +fragment is WF if its components are WF: + + WfObjectFragment: + ∀i. R, r.. ⊢ Pi + -------------------------------------------------- + R ⊢ for<r..> TraitId<P1..Pn> + +Note that we don't check the where clauses declared on the trait +itself. These are checked when the object is created. The reason not +to check them here is because the `Self` type is not known (this is an +object, after all), and hence we can't check them in general. (But see +*unresolved questions*.) + +#### WF checking a trait reference + +In some contexts, we want to check a trait reference, such as the ones +that appear in where clauses or type parameter bounds. The rules for +this are given here: + + WfTraitReference: + ∀i. R, r.. ⊢ Pi + C = WhereClauses(Id) // and the conditions declared on Id must hold... + R, r0...rn ⊢ [P0..Pn] C // ...after substituting parameters, of course + -------------------------------------------------- + R ⊢ for<r..> P0: TraitId<P1..Pn> + +The rules are fairly straightforward. The components must be well formed, +and any where-clauses declared on the trait itself must hold. + +#### Checking conditions + +In various rules above, we have rules that declare that a where-clause +must hold, which have the form `R ⊢ WhereClause`. Here, `R` represents +the set of bound regions. It may well be that `WhereClause` does not +use any of the regions in `R`. In that case, we can ignore the +bound-regions and simply check that `WhereClause` holds. But if +`WhereClause` *does* refer to regions in `R`, then we simply consider +`R ⊢ WhereClause` to hold. Those conditions will be checked later when +the bound lifetimes are instantiated (either through a call or a +projection). 
+ +In practical terms, this means that if I have a type like: + +```rust +struct Iterator<'a, T:'a> { ... } +``` + +and a function type like `for<'a> fn(i: Iterator<'a, T>)` then this +type is considered well-formed without having to show that `T: 'a` +holds. In terms of the rules, this is because we would wind up with a +constraint like `'a ⊢ T: 'a`. + +However, if I have a type like + +```rust +struct Foo<'a, T:Eq> { .. } +``` + +and a function type like `for<'a> fn(f: Foo<'a, T>)`, I still must +show that `T: Eq` holds for that function to be well-formed. This is +because the condition which is generated will be `'a ⊢ T: Eq`, but `'a` +is not referenced there. + +#### Implied bounds + +Implied bounds can be derived from the WF and outlives relations. The +implied bounds from a type `T` are given by expanding the requirements +that `T: WF`. Since we currently limit ourselves to implied region +bounds, we are interested in extracting requirements of the form: + +- `'a:'r`, where two regions must be related; +- `X:'r`, where a type parameter `X` outlives a region; or, +- `<T as Trait<..>>::Id: 'r`, where a projection outlives a region. + +Some caution is required around projections when deriving implied +bounds. If we encounter a requirement that e.g. `X::Id: 'r`, we cannot +for example deduce that `X: 'r` must hold. This is because while `X: +'r` is *sufficient* for `X::Id: 'r` to hold, it is not *necessary* for +`X::Id: 'r` to hold. So we can only conclude that `X::Id: 'r` holds, +and not `X: 'r`. + +#### When should we check the WF relation and under what conditions? + +Currently the compiler performs WF checking in a somewhat haphazard +way: in some cases (such as impls), it omits checking WF, but in +others (such as fn bodies), it checks WF when it should not have +to. Partly that is due to the fact that the compiler currently +connects the WF and outlives relationship into one thing, rather than +separating them as described here. 
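To make the implied bounds described above a little more concrete, here is a small sketch of how they show up in fn signatures, which matters for the environment under which fn items are checked. The names `assert_outlives`, `by_reference`, and `by_projection` are hypothetical and exist only for this illustration; the snippet assumes the implied-bounds behavior described in this RFC.

```rust
// Hypothetical helper: it can only be called when `T: 'a` is provable.
fn assert_outlives<'a, T: 'a>(_value: &T) {}

// No explicit `T: 'a` is written. The argument type `&'a T` is assumed WF,
// which yields the implied bound `T: 'a`.
fn by_reference<'a, T>(value: &'a T) -> &'a T {
    assert_outlives::<'a, T>(value);
    value
}

// Here the argument type `&'a I::Item` yields the implied bound
// `I::Item: 'a`. It does NOT yield `I: 'a`: that would be sufficient for
// the projection bound, but it is not necessary, so it is not implied.
fn by_projection<'a, I: Iterator>(item: &'a I::Item) -> &'a I::Item {
    assert_outlives::<'a, I::Item>(item);
    item
}

fn main() {
    let n = 7;
    assert_eq!(*by_reference(&n), 7);
    // `I` cannot be inferred from a projection, so it is named explicitly.
    assert_eq!(*by_projection::<std::vec::IntoIter<i32>>(&n), 7);
    println!("implied bounds were sufficient");
}
```

The callers in `main` are the "reciprocal side" of the bargain: each call site is where the argument types are checked to be WF, which is what justifies the bounds the callee assumed.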
+ +**Constants/statics.** The type of a constant or static can be checked +for WF in an empty environment. + +**Struct/enum declarations.** In a struct/enum declaration, we should +check that all field types are WF, given the bounds and where-clauses +from the struct declaration. + +**Function items.** For function items, the environment consists of +all the where-clauses from the fn, as well as implied bounds derived +from the fn's argument types. These are then used to check that the +following are well-formed: + +- argument types; +- return type; +- where clauses; +- types of local variables. + +These WF requirements are imposed at each fn or associated fn +definition (as well as within trait items). + +**Trait impls.** In a trait impl, we assume that all types appearing +in the impl header are well-formed. This means that the initial +environment for an impl consists of the impl where-clauses and implied +bounds derived from its header. Example: Given an impl like +`impl<'a,T> SomeTrait for &'a T`, the environment would be `T: Sized` +(explicit where-clause) and `T: 'a` (implied bound derived from `&'a +T`). This environment is used as the starting point for checking the +items: + +- Associated types must be WF in the trait environment. +- The types of associated constants must be WF in the trait environment. +- Associated fns are checked just like regular function items, but + with the additional implied bounds from the impl signature. + +**Inherent impls.** In an inherent impl, we can assume that the self +type is well-formed, but otherwise check the methods as if they were +normal functions. + +**Trait declarations.** Trait declarations (and defaults) are checked +in the same fashion as impls, except that there are no implied bounds +from the impl header. + +**Type aliases.** Type aliases are currently not checked for WF, since +they are considered transparent to type-checking. 
It's not clear that +this is the best policy, but it seems harmless, since the WF rules +will still be applied to the expanded version. See the *Unresolved +Questions* for some discussion on the alternatives here. + +Several points in the list above made use of *implied bounds* based on +assuming that various types were WF. We have to ensure that those +bounds are checked on the reciprocal side, as follows: + +**Fns being called.** Before calling a fn, we check that its argument +and return types are WF. This check takes place after all +higher-ranked lifetimes have been instantiated. Checking the argument +types ensures that the implied bounds due to argument types are +correct. Checking the return type ensures that the resulting type of +the call is WF. + +**Method calls, "UFCS" notation for fns and constants.** These are the +two ways to project a value out of a trait reference. A method call or +UFCS resolution will require that the trait reference is WF according +to the rules given above. + +**Normalizing associated type references.** Whenever a projection type +like `T::Foo` is normalized, we will require that the trait reference +is WF. + +# Drawbacks + +N/A + +# Alternatives + +I'm not aware of any appealing alternatives. + +# Unresolved questions + +For trait object fragments, should we check WF conditions when we can? +For example, if you have: + +```rust +trait HashSet<K:Hash> +``` + +should an object like `Box<HashSet<NotHash>>` be illegal? It seems +like that would be in line with our "best effort" approach to bound +regions, so probably yes. 
+ +[RFC 192]: https://github.com/rust-lang/rfcs/blob/master/text/0192-bounds-on-object-and-generic-types.md +[RFC 195]: https://github.com/rust-lang/rfcs/blob/master/text/0195-associated-items.md +[RFC 447]: https://github.com/rust-lang/rfcs/blob/master/text/0447-no-unused-impl-parameters.md +[#21748]: https://github.com/rust-lang/rust/issues/21748 +[#23442]: https://github.com/rust-lang/rust/issues/23442 +[#24662]: https://github.com/rust-lang/rust/issues/24622 +[#22436]: https://github.com/rust-lang/rust/pull/22436 +[#22246]: https://github.com/rust-lang/rust/issues/22246 +[#25860]: https://github.com/rust-lang/rust/issues/25860 +[#25692]: https://github.com/rust-lang/rust/issues/25692 +[adapted]: https://github.com/rust-lang/rust/issues/22246#issuecomment-74186523 +[#22077]: https://github.com/rust-lang/rust/issues/22077 +[#24461]: https://github.com/rust-lang/rust/pull/24461 +[#21974]: https://github.com/rust-lang/rust/issues/21974 +[RFC 546]: 0546-Self-not-sized-by-default.md + +# Appendix + +The informal explanation glossed over some details. This appendix +tries to be a bit more thorough with how it is that we can conclude +that a projection outlives `'a` if its inputs outlive `'a`. To start, +let's specify the projection `<PROJ>` as: + + <P0 as Trait<P1..Pn>>::Id + +where `P` can be a lifetime or type parameter as appropriate. + +Then we know that there exists some impl of the form: + +```rust +impl<X0..Xn> Trait<Q1..Qn> for Q0 { + type Id = T; +} +``` + +Here again, `X` can be a lifetime or type parameter name, and `Q` can +be any lifetime or type parameter. + +Let `Θ` be a suitable substitution `[Xi => Ri]` such that `∀i. Θ Qi == +Pi` (in other words, so that the impl applies to the projection). Then +the normalized form of `<PROJ>` is `Θ T`. Note that because trait +matching is invariant, the types must be exactly equal. + +[RFC 447] and [#24461] require that a parameter `Xi` can only appear +in `T` if it is *constrained* by the trait reference `<Q0 as Trait<Q1..Qn>>`. 
The full definition of *constrained* appears below, +but informally it means roughly that `Xi` appears in `Q0..Qn` +somewhere outside of a projection. Let's call the constrained set of +parameters `Constrained(Q0..Qn)`. + +Recall the rule `OutlivesProjectionComponents`: + + OutlivesProjectionComponents: + ∀i. R ⊢ Pi: 'a + -------------------------------------------------- + R ⊢ <P0 as Trait<P1..Pn>>::Id: 'a + +We aim to show that `∀i. R ⊢ Pi: 'a` implies `R ⊢ (Θ T): 'a`, which implies +that this rule is a sound approximation for normalization. The +argument follows from two lemmas ("proofs" for these lemmas are +sketched below): + +1. First, we show that if `R ⊢ Pi: 'a`, then every "subcomponent" `P'` + of `Pi` outlives `'a`. The idea here is that each variable `Xi` + from the impl will match against and extract some subcomponent `P'` + of `Pi`, and we wish to show that the subcomponent `P'` extracted + by `Xi` outlives `'a`. +2. Then we will show that the type `Θ T` outlives `'a` if, for each of + the in-scope parameters `Xi`, `Θ Xi: 'a`. + +**Definition 1.** `Constrained(T)` defines the set of type/lifetime +parameters that are *constrained* by a type. This set is found just by +recursing over and extracting all subcomponents *except* for those +found in a projection. This is because a type like `X::Foo` does not +constrain what type `X` can take on, rather it uses `X` as an input to +compute a result: + + Constrained(scalar) = {} + Constrained(X) = {X} + Constrained(&'x T) = {'x} | Constrained(T) + Constrained(O0..On+'x) = Union(Constrained(Oi)) | {'x} + Constrained([T]) = Constrained(T) + Constrained(for<..> fn(T1..Tn) -> T0) = Union(Constrained(Ti)) + Constrained(<P0 as Trait<P1..Pn>>::Id) = {} // empty set + +**Definition 2.** `Constrained('a) = {'a}`. In other words, a lifetime +reference just constrains itself. + +**Lemma 1:** Given `R ⊢ P: 'a`, `P = [X => P'] Q`, and `X ∈ Constrained(Q)`, +then `R ⊢ P': 'a`. Proceed by induction and by cases over the form of `P`: + +1. 
If `P` is a scalar or parameter, there are no subcomponents, so `P'=P`. +2. For nominal types, references, objects, and function types, either + `P'=P` or `P'` is some subcomponent of `P`. The appropriate "outlives" + rules all require that all subcomponents outlive `'a`, and hence + the conclusion follows by induction. +3. If `P'` is a projection, that implies that `P'=P`. + - Otherwise, `Q` must be a projection, and in that case, `Constrained(Q)` would be + the empty set. + +**Lemma 2:** Given that `FV(T) ∈ X`, `∀i. Ri: 'a`, then `[X => R] T: +'a`. In other words, if all the type/lifetime parameters that appear +in a type outlive `'a`, then the type outlives `'a`. Follows by +inspection of the outlives rules. From bcc727f9efb85a09a818e96f2d113bf22104f0ae Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 17 Jul 2015 18:18:06 -0400 Subject: [PATCH 0384/1195] Correct issue #. --- text/0000-projections-lifetimes-and-wf.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/text/0000-projections-lifetimes-and-wf.md b/text/0000-projections-lifetimes-and-wf.md index a01a9db0787..454f3e56199 100644 --- a/text/0000-projections-lifetimes-and-wf.md +++ b/text/0000-projections-lifetimes-and-wf.md @@ -8,7 +8,7 @@ Type system changes to address the outlives relation with respect to projections, and to better enforce that all types are well-formed (meaning that they respect their declared bounds). The current -implementation can be both unsound ([#24662]), inconvenient +implementation can be both unsound ([#24622]), inconvenient ([#23442]), and surprising ([#21748], [#25692]). The changes are as follows: - Simplify the outlives relation to be syntactically based. @@ -147,12 +147,12 @@ being [adapted] to cover associated type projections like still lead to [annoying errors in some situations][#23442]. Finding a better solution has been on the agenda for some time. 
-Simultaneously, we realized in [#24662] that the compiler had a bug +Simultaneously, we realized in [#24622] that the compiler had a bug that caused it erroneously assume that every projection like `I::Item` outlived the current function body, just as it assumes that type parameters like `I` outlive the current function body. **This bug can lead to unsound behavior.** Unfortunately, simply implementing the -naive fix for #24662 exacerbates the shortcomings of the current rules +naive fix for #24622 exacerbates the shortcomings of the current rules for projections, causing widespread compilation failures in all sorts of reasonable and obviously correct code. @@ -855,7 +855,7 @@ regions, so probably yes. [RFC 447]: https://github.com/rust-lang/rfcs/blob/master/text/0447-no-unused-impl-parameters.md [#21748]: https://github.com/rust-lang/rust/issues/21748 [#23442]: https://github.com/rust-lang/rust/issues/23442 -[#24662]: https://github.com/rust-lang/rust/issues/24622 +[#24622]: https://github.com/rust-lang/rust/issues/24622 [#22436]: https://github.com/rust-lang/rust/pull/22436 [#22246]: https://github.com/rust-lang/rust/issues/22246 [#25860]: https://github.com/rust-lang/rust/issues/25860 From 6b07c207ce3731eb4feada048b26d843b30716eb Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 17 Jul 2015 18:47:07 -0400 Subject: [PATCH 0385/1195] add clause that says that an object type is WF only if the trait is object safe --- text/0000-projections-lifetimes-and-wf.md | 1 + 1 file changed, 1 insertion(+) diff --git a/text/0000-projections-lifetimes-and-wf.md b/text/0000-projections-lifetimes-and-wf.md index 454f3e56199..5a7108c702a 100644 --- a/text/0000-projections-lifetimes-and-wf.md +++ b/text/0000-projections-lifetimes-and-wf.md @@ -675,6 +675,7 @@ fragment is WF if its components are WF: WfObjectFragment: ∀i. R, r.. 
⊢ Pi + TraitId is object safe -------------------------------------------------- R ⊢ for TraitId From ffeccfd14703bf8e86c0cde8c684b7f130e4a310 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 17 Jul 2015 18:53:14 -0400 Subject: [PATCH 0386/1195] discuss type aliases a bit more --- text/0000-projections-lifetimes-and-wf.md | 23 +++++++++++++++++++---- 1 file changed, 19 insertions(+), 4 deletions(-) diff --git a/text/0000-projections-lifetimes-and-wf.md b/text/0000-projections-lifetimes-and-wf.md index 5a7108c702a..d25e7ef094f 100644 --- a/text/0000-projections-lifetimes-and-wf.md +++ b/text/0000-projections-lifetimes-and-wf.md @@ -767,7 +767,7 @@ for WF in an empty environment. **Struct/enum declarations.** In a struct/enum declaration, we should check that all field types are WF, given the bounds and where-clauses -from the struct declaration. +from the struct declaration. Also check that where-clauses are well-formed. **Function items.** For function items, the environment consists of all the where-clauses from the fn, as well as implied bounds derived @@ -791,6 +791,7 @@ bounds derived from its header. Example: Given an impl like T`). This environment is used as the starting point for checking the items: +- Where-clauses declared on the trait must be WF. - Associated types must be WF in the trait environment. - The types of associated constants must be WF in the trait environment. - Associated fns are checked just like regular function items, but @@ -798,11 +799,13 @@ items: **Inherent impls.** In an inherent impl, we can assume that the self type is well-formed, but otherwise check the methods as if they were -normal functions. +normal functions. We must check that all items are well-formed, along with +the where clauses declared on the impl. **Trait declarations.** Trait declarations (and defaults) are checked in the same fashion as impls, except that there are no implied bounds -from the impl header. +from the impl header. 
We must check that all items are well-formed, +along with the where clauses declared on the trait. **Type aliases.** Type aliases are currently not checked for WF, since they are considered transparent to type-checking. It's not clear that @@ -840,7 +843,19 @@ I'm not aware of any appealing alternatives. # Unresolved questions -For trait object fragments, should we check WF conditions when we can? +**Best policy for type aliases.** The current policy is not to check +type aliases, since they are transparent to type-checking, and hence +their expansion can be checked instead. This is coherent, though +somewhat confusing in terms of the interaction with projections, since +we frequently cannot resolve projections without at least minimal +bounds (i.e., `type IteratorAndItem = (T::Item, +T)`). Still, full-checking of WF on type aliases seems to just mean +more annotation with little benefit. It might be nice to keep the +current policy and later, if/when we adopt a more full notion of +implied bounds, rationalize it by saying that the suitable bounds for +a type alias are implied by its expansion. + +**For trait object fragments, should we check WF conditions when we can?** For example, if you have: ```rust From 58baeaa1dcc97216215c5f2bcda7627a88665455 Mon Sep 17 00:00:00 2001 From: Andrew Cann Date: Sun, 19 Jul 2015 14:21:09 +0800 Subject: [PATCH 0387/1195] bang_type initial commit --- text/0000-bang-type.md | 401 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 401 insertions(+) create mode 100644 text/0000-bang-type.md diff --git a/text/0000-bang-type.md b/text/0000-bang-type.md new file mode 100644 index 00000000000..72bf837f59c --- /dev/null +++ b/text/0000-bang-type.md @@ -0,0 +1,401 @@ +- Feature Name: bang_type +- Start Date: 2015-07-19 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Promote `!` to be a full-fledged type equivalent to an `enum` with no variants. 
+ +# Motivation + +To understand the motivation for this it's necessary to understand the concept +of empty types. An empty type is a type with no inhabitants, ie. a type for +which there is nothing of that type. For example consider the type `enum Never +{}`. This type has no constructors and therefore can never be instantiated. It is empty, in the sense that there are no values of type `Never`. Note +that `Never` is not equivalent to `()` or `struct Foo {}` each of which have +exactly one inhabitant. Empty types have some interesting properties that may +be unfamiliar to programmers who have not encountered them before. + + * They never exist at runtime. + Because there is no way to create one. + + * They have no logical machine-level representation. + One way to think about this is to consider the number of bits required to + store a value of a given type. A value of type `bool` can be in two + possible states (`true` and `false`). Therefore to specify which state a + `bool` is in we need `log2(2) ==> 1` bit of information. A value of type + `()` can only be in one possible state (`()`). Therefore to specify which + state a `()` is in we need `log2(1) ==> 0` bits of information. A value of + type `Never` has no possible states it can be in. Therefore to ask which of + these states it is in is a meaningless question and we have `log2(0) ==> + undefined` (or `-∞`). Having no representation is not problematic as safe + code never has reason nor ability to handle data of an empty type (as such + data can never exist). In practice, Rust currently treats empty types as + having size 0. + + * Code that handles them never executes. + Because there is no value that it could execute with. Therefore, having a + `Never` in scope is a static guarantee that a piece of code will never be + run. + + * They represent the return type of functions that don't return. + For a function that never returns, such as `exit`, the set of all values it + may return is the empty set. 
That is to say, the type of all values it may + return is the type of no inhabitants, ie. `Never` or anything equivalent to + it. + + * They can be converted to any other type. + To specify a function `A -> B` we need to specify a return value in `B` for + every possible argument in `A`. For example, an expression that converts + `bool -> T` needs to specify a return value for both possible arguments + `true` and `false`: + + ```rust + let foo: &'static str = match x { + true => "some_value", + false => "some_other_value", + }; + ``` + + Likewise, an expression to convert `() -> T` needs to specify one value, + the value corresponding to `()`: + + ```rust + let foo: &'static str = match x { + () => "some_value", + }; + ``` + + And following this pattern, to convert `Never -> T` we need to specify a + `T` for every possible `Never`. Of which there are none: + + ```rust + let foo: &'static str = match x { + }; + ``` + + Reading this, it may be tempting to ask the question "what is the value of + `foo` then?". Remember that this depends on the value of `x`. As there are + no possible values of `x` it's a meaningless question and besides, the + fact that `x` has type `Never` gives us a static guarantee that the match + block will never be executed. + +Here's some example code that uses `Never`. This is legal rust code that you +can run today. + +```rust +use std::process::exit; + +// Our empty type +enum Never {} + +// A diverging function with an ordinary return type +fn wrap_exit() -> Never { + exit(0); +} + +// we can use a `Never` value to diverge without using unsafe code or calling +// any diverging intrinsics +fn diverge_from_never(n: Never) -> ! { + match n { + } +} + +fn main() { + let x: Never = wrap_exit(); + // `x` is in scope, everything below here is dead code. 
+ + let y: String = match x { + // no match cases as `Never` has no variants + }; + + // we can still use `y` though + println!("Our string is: {}", y); + + // we can use `x` to diverge + diverge_from_never(x) +} +``` + +This RFC proposes that we allow `!` to be used directly, as a type, rather than +using `Never` (or equivalent) in its place. Under this RFC, the above code +could more simply be written: + +```rust +use std::process::exit; + +fn main() { + let x: ! = exit(0); + // `x` is in scope, everything below here is dead code. + + let y: String = match x { + // no match cases as `!` has no variants + }; + + // we can still use `y` though + println!("Our string is: {}", y); + + // we can use `x` to diverge + x +} +``` + +So why do this? AFAICS there are 3 main reasons + + * **It removes one superfluous concept from the language and allows diverging + functions to be used in generic code.** + + Currently, Rust's functions can be divided into two kinds: those that + return a regular type and those that use the `-> !` syntax to mark + themselves as diverging. This division is unnecessary and means that + functions of the latter kind don't play well with generic code. + + For example: you want to use a diverging function where something expects a + `Fn() -> T` + + ```rust + fn foo() -> !; + fn call_a_fn<T, F: Fn() -> T>(f: F) -> T; + + call_a_fn(foo) // ERROR! + ``` + + Or maybe you want to use a diverging function to implement a trait method + that returns an associated type: + + ```rust + trait Zog { + type Output; + fn zog() -> Self::Output; + } + + impl<T> Zog for T { + type Output = !; // ERROR! + fn zog() -> ! { panic!("aaah!") } // ERROR! + } + ``` + + The workaround in these cases is to define a type like `Never` and use it + in place of `!`. You can then define functions `wrap_foo` and `unwrap_zog` + similar to the functions `wrap_exit` and `diverge_from_never` defined + earlier. It would be nice if this workaround wasn't necessary.
+ + * **It creates a standard empty type for use throughout Rust code.** + + Empty types are useful for more than just marking functions as diverging. + When used in an enum variant they prevent the variant from ever being + instantiated. One major use case for this is if a method needs to return a + `Result` to satisfy a trait but we know that the method will always + succeed. + + For example, here's a saner implementation of `FromStr` for `String` than + currently exists in `libstd`. + + ```rust + impl FromStr for String { + type Err = !; + + fn from_str(s: &str) -> Result<String, !> { + Ok(String::from(s)) + } + } + ``` + + This result can then be safely unwrapped to a `String` without using + code-smelly things like `unreachable!()` which often mask bugs in code. + + ```rust + let r: Result<String, !> = FromStr::from_str("hello"); + let s = match r { + Ok(s) => s, + Err(e) => match e {}, + }; + ``` + + Empty types can also be used when someone needs a dummy type to implement a + trait. Because `!` can be converted to any other type it has a trivial + implementation of any trait whose only associated items are non-static + methods. The impl simply matches on self for every method. + + Example: + + ```rust + trait ToSocketAddr { + fn to_socket_addr(&self) -> IoResult<SocketAddr>; + fn to_socket_addr_all(&self) -> IoResult<Vec<SocketAddr>>; + } + + impl ToSocketAddr for ! { + fn to_socket_addr(&self) -> IoResult<SocketAddr> { + match self {} + } + + fn to_socket_addr_all(&self) -> IoResult<Vec<SocketAddr>> { + match self {} + } + } + ``` + + All possible implementations of this trait for `!` are equivalent. This is + because any two functions that take a `!` argument and return the same type + are equivalent - they return the same result for the same arguments and + have the same effects (because they are uncallable). + + Suppose someone wants to call `fn foo<T: SomeTrait>(arg: Option<T>)` with + `None`. They need to choose a type for `T` so they can pass `None::<T>` as + the argument.
However there may be no sensible default type to use for `T` + or, worse, they may not have any types at their disposal that implement + `SomeTrait`. As the user in this case is only using `None`, a sensible + choice for `T` would be a type such that `Option<T>` can only be `None`, ie. + it would be nice to use `!`. If `!` has a trivial implementation of + `SomeTrait` then the choice of `T` is truly irrelevant as this means `foo` + doesn't use any associated types/lifetimes/constants or static methods of + `T` and is therefore unable to distinguish `None::<A>` from `None::<B>`. + With this RFC, the user could `impl SomeTrait for !` (if `SomeTrait`'s + author hasn't done so already) and call `foo(None::<!>)`. + + Currently, `Never` can be used for all the above purposes. It's useful + enough that @reem has written a package for it + [here](https://github.com/reem/rust-void) where it is named `Void`. I've also + invented it independently for my own projects and probably other people + have aswell. However `!` can be extended logically to cover all the above + use cases. Doing so would standardise the concept and prevent different + people reimplementing it under different names. + + * **Because it's the correct thing to do.** + + The empty type is such a fundamental concept that - given that it already + exists in the form of empty enums - it warrants having a canonical form of + it built into the language. For example, `return` and `break` expressions + should logically be typed `!` but currently seem to be typed `()`. (There + is some code in the compiler that assigns type `()` to diverging + expressions because it doesn't have a sensible type to assign to them). + This means we can write stuff like this: + + ```rust + match break { + () => ... // huh? Where did that `()` come from? + } + ``` + + But not this: + + ```rust + match break {} // whaddaya mean non-exhaustive patterns? + ``` + + This is just weird and should be fixed.
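The `Never` workaround described in the list above can be sketched in Rust as it compiles today. This is only an illustration under stated assumptions: `Name`, `parse_name` and `wrap_panic` are hypothetical names invented for the sketch, and `call_a_fn` is given a body that the RFC text does not specify.

```rust
use std::str::FromStr;

// A user-defined empty type standing in for `!`.
enum Never {}

// Hypothetical newtype, used only for this sketch.
struct Name(String);

impl FromStr for Name {
    // Parsing cannot fail, so the error type has no values.
    type Err = Never;

    fn from_str(s: &str) -> Result<Name, Never> {
        Ok(Name(String::from(s)))
    }
}

// Unwrap without `unwrap()` or `unreachable!()`: the `Err` arm is
// discharged by an empty match on the uninhabited error value.
fn parse_name(s: &str) -> Name {
    match s.parse::<Name>() {
        Ok(name) => name,
        Err(never) => match never {},
    }
}

// Generic code that expects an `Fn() -> T` (body assumed for the sketch).
fn call_a_fn<T, F: Fn() -> T>(f: F) -> T {
    f()
}

// A diverging function wrapped so that it has an ordinary return type
// and can therefore be passed to `call_a_fn`.
fn wrap_panic() -> Never {
    panic!("aaah!")
}
```

With these definitions `call_a_fn(wrap_panic)` type-checks with `T = Never`, which is the kind of wrapping the RFC would make unnecessary if `!` were usable as a type.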
+ +I suspect the reason that `!` isn't already treated as a canonical empty type +is just most people's unfamiliarity with empty types. To draw a parallel in +history: in C `void` is in essence a type like any other. However it can't be +used in all the normal positions where a type can be used. This breaks generic +code (eg. `T foo(); T val = foo()` where `T == void`) and forces one to use +workarounds such as defining `struct Void {}` and wrapping `void`-returning +functions. + +In the early days of programming having a type that contained no data probably +seemed pointless. After all, there's no point in having a `void` typed function +argument or a vector of `void`s. So `void` was treated as merely a special +syntax for denoting a function as returning no value, resulting in a language +that was more broken and complicated than it needed to be. + +Fifty years later, Rust, building on decades of experience, decides to fix C's +shortsightedness and bring `void` into the type system in the form of the empty +tuple `()`. Rust also introduces coproduct types (in the form of enums), +allowing programmers to work with uninhabited types (such as `Never`). However +Rust also introduces a special syntax for denoting a function as never +returning: `fn() -> !`. Here, `!` is in essence a type like any other. However +it can't be used in all the normal positions where a type can be used. This +breaks generic code (eg. `fn foo() -> T; let val: T = foo()` where `T == !`) and +forces one to use workarounds such as defining `enum Never {}` and wrapping +`!`-returning functions. + +To be clear, `!` has a meaning in any situation that any other type does. A `!` +function argument makes a function uncallable, a `Vec<!>` is a vector that can +never contain an element, a `!` enum variant makes the variant guaranteed never +to occur and so forth.
It might seem pointless to use a `!` function argument +or a `Vec<!>` (just as it would be pointless to use a `()` function argument or +a `Vec<()>`), but that's no reason to disallow it. And generic code sometimes +requires it. + +Rust already has empty types in the form of empty enums. Any code that could be +written with this RFC's `!` can already be written by swapping out `!` with +`Never` (sans implicit casts, see below). So if this RFC could create any +issues for the language (such as making it unsound or complicating the +compiler) then these issues would already exist for `Never`. + +# Detailed design + +Add a type `!` to Rust. `!` behaves like an empty enum except that it can be +implicitly cast to any other type, ie. the following code is acceptable: + +```rust +let r: Result<i32, !> = Ok(23); +let i = match r { + Ok(i) => i, + Err(e) => e, // e is cast to i32 +}; +``` + +Implicit casting is necessary for backwards-compatibility so that code like the +following will continue to compile: + +```rust +let i: i32 = match some_bool { + true => 23, + false => panic!("aaah!"), // an expression of type `!`, gets cast to `i32` +}; + +match break { + () => 23, // matching with a `()` forces the match argument to be cast to type `()` +} +``` + +In the compiler, remove the distinctions that treat diverging and converging +expressions as two different kinds of things (eg. stuff like `FnConverging` vs +`FnDiverging`). Use the type system to do things like reachability analysis. + +Add an implementation for `!` of any trait that it can trivially implement. Add +methods to `Result<T, !>` and `Result<!, E>` for safely extracting the inner +value. Name these methods along the lines of `unwrap_nopanic`, `safe_unwrap` or +something. + +# Drawbacks + +Someone would have to implement this. + +# Alternatives + + * Don't do this. + * Move @reem's `Void` type into `libcore`. This would create a standard empty + type and make it available for use in the standard libraries.
If we were to + do this it might be an idea to rename `Void` to something else (`Never`, + `Empty` and `Mu` have all been suggested). Although `Void` has some + precedent in languages like Haskell and Idris the name is likely to trip + up people coming from a C/Java et al. background as `Void` is *not* `void` + but it can be easy to confuse the two. + +# Unresolved questions + +Apparently, Rust used to have something similar to this but it was removed. +There are still a few references to `ty_bot` in the compiler. Why was this +taken out? Note that if there are any arguments for not having type `!` in the +language they should apply equally well to `Never`/`Void` so I assume the old +`ty_bot` was trying to be something crazier than this RFC's `!` (such as a +subtype of all types, given the name). Could someone who was around back then +clarify this? + +`!` has a unique impl of any trait whose only items are non-static methods. It +would be nice if there was a way to automate the creation of these impls. +Should `!` automatically satisfy any such trait? Alternatively we could do this +through a new trait attribute: + +```rust +#[derive_bang] +trait FromStr { + ... +} +``` + From 7cd058d07ec802983eb6c5b999b7744cc33cbeeb Mon Sep 17 00:00:00 2001 From: Cesar Eduardo Barros Date: Sun, 19 Jul 2015 23:30:45 -0300 Subject: [PATCH 0388/1195] Explanations about the buffer contents --- text/0000-read-all.md | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/text/0000-read-all.md b/text/0000-read-all.md index c242535a275..d2c841813a4 100644 --- a/text/0000-read-all.md +++ b/text/0000-read-all.md @@ -173,6 +173,30 @@ discard the stream anyways. Users who need finer control should use the `read` method directly, or when available use the `Seek` trait. +# About the buffer contents + +This RFC proposes that the contents of the output buffer be undefined on +an error return.
It might be untouched, partially overwritten, or +completely overwritten (even if less bytes could be read; for instance, +this method might in theory use it as a scratch space). + +Two possible alternatives could be considered: do not touch it on +failure, or overwrite it with valid data as much as possible. + +Never touching the output buffer on failure would make it much more +expensive for the default implementation (which calls `read` in a loop), +since it would have to read into a temporary buffer and copy to the +output buffer on success. Any implementation which cannot do an early +return for all failure cases would have similar extra costs. + +Overwriting as much as possible with valid data makes some sense; it +happens without any extra cost in the default implementation. However, +for optimized implementations this extra work is useless; since the +caller can't know how much is valid data and how much is garbage, it +can't make use of the valid data. + +Users who need finer control should use the `read` method directly. + # Naming It's unfortunate that `write_all` used `WriteZero` for its `ErrorKind`; From 4cb013799daf74889321ab382d2e1a7975c4bb07 Mon Sep 17 00:00:00 2001 From: Andrew Cann Date: Mon, 20 Jul 2015 17:53:54 +0800 Subject: [PATCH 0389/1195] s/aswell/as well/ --- text/0000-bang-type.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-bang-type.md b/text/0000-bang-type.md index 72bf837f59c..cea8750f8b6 100644 --- a/text/0000-bang-type.md +++ b/text/0000-bang-type.md @@ -259,7 +259,7 @@ So why do this? AFAICS there are 3 main reasons enough that @reem has written a package for it [here](https://github.com/reem/rust-void) where it is named `Void`. I've also invented it independently for my own projects and probably other people - have aswell. However `!` can be extended logically to cover all the above + have as well. However `!` can be extended logically to cover all the above use cases. 
Doing so would standardise the concept and prevent different people reimplementing it under different names. From fb3ef33a1db0707aefea24de7eccbf2102702296 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 20 Jul 2015 16:02:18 -0400 Subject: [PATCH 0390/1195] Apply various clarifications based on suggestions --- text/0000-projections-lifetimes-and-wf.md | 54 +++++++++++++---------- 1 file changed, 31 insertions(+), 23 deletions(-) diff --git a/text/0000-projections-lifetimes-and-wf.md b/text/0000-projections-lifetimes-and-wf.md index d25e7ef094f..feff7ba7acd 100644 --- a/text/0000-projections-lifetimes-and-wf.md +++ b/text/0000-projections-lifetimes-and-wf.md @@ -355,7 +355,7 @@ or projections are involved: R ⊢ scalar: 'a OutlivesNominalType: - ∀i. Pi: 'a + ∀i. R ⊢ Pi: 'a -------------------------------------------------- R ⊢ Id: 'a @@ -387,11 +387,11 @@ The outlives relation for lifetimes depends on whether the lifetime in question was bound within a type or not. In the usual case, we decide the relationship between two lifetimes by consulting the environment. Lifetimes representing scopes within the current fn have a -relationship derived from the code itself, lifetime parameters have -relationships defined by where-clauses and implied bounds: +relationship derived from the code itself, while lifetime parameters +have relationships defined by where-clauses and implied bounds. - 'x not in R - ('x: 'a) in Env + 'x ∉ R // not a bound region + ('x: 'a) in Env // derivable from where-clauses etc -------------------------------------------------- R ⊢ 'x: 'a @@ -401,7 +401,7 @@ i32): 'x` holds, even though we do not yet know what region `'a` is (and in fact it may be instantiated many times with different values on each call to the fn). - 'x in R + 'x ∈ R // bound region -------------------------------------------------- R ⊢ 'x: 'a @@ -422,9 +422,9 @@ derived from the signature (discussed below). 
#### Outlives for projections Projections have the most possibilities. First, we may find -information in the environment, as with type parameters, but we can -also consult the trait definition to find bounds (consider an -associated type declared like `type Foo: 'static`). These rule only +information in the in-scope where clauses, as with type parameters, +but we can also consult the trait definition to find bounds (consider +an associated type declared like `type Foo: 'static`). These rule only apply if there are no higher-ranked lifetimes in the projection; for simplicity's sake, we encode that by requiring an empty list of higher-ranked lifetimes. (This is somewhat stricter than necessary, @@ -483,7 +483,8 @@ reduce `: 'x` to `&'a T: 'x`, which in turn holds if `'a: 'x` and `T: 'x` (from the rule `OutlivesReference`). But often we are in a situation where we can't normalize the -projection. What can we do then? The rule +projection (for example, a projection like `I::Item` where we only +know that `I: Iterator`). (For example, What can we do then? The rule `OutlivesProjectionComponents` says that if we can conclude that every lifetime/type parameter `Pi` to the trait reference outlives `'x`, then we know that a projection from those parameters outlives `'x`. In @@ -538,11 +539,11 @@ impl Iterator for Foo { ``` Clearly, whatever `` is, it can only refer to the lifetime -`'static`. So clearly `::Item: 'static` holds. We -know this is true without ever knowing what `` is -- we just -need to see that the trait reference `` doesn't have -any lifetimes or type parameters in it, and hence the impl cannot -refer to any lifetime or type parameters. +`'static`. So `::Item: 'static` holds. We know this +is true without ever knowing what `` is -- we just need to see +that the trait reference `` doesn't have any +lifetimes or type parameters in it, and hence the impl cannot refer to +any lifetime or type parameters. 
#### Implementation complications @@ -588,7 +589,12 @@ declare one), but we'll take those basic conditions for granted. WfParameter: -------------------------------------------------- - R ⊢ X WF + R ⊢ X WF // where X is a type parameter + + WfTuple: + ∀i. R ⊢ Ti WF + -------------------------------------------------- + R ⊢ (T0..Tn) WF WfNominalType: ∀i. R ⊢ Pi Wf // parameters must be WF, @@ -611,7 +617,7 @@ declare one), but we'll take those basic conditions for granted. WfProjection: ∀i. R ⊢ Pi WF // all components well-formed - R ⊢ > // the projection itself is valid + R ⊢ > // the projection itself is valid -------------------------------------------------- R ⊢ >::Id WF @@ -623,9 +629,9 @@ than the rest, simply because they modify the set `R` of bound lifetime names. Let's start with the rule for fn types: WfFn: - ∀i. R, r.. ⊢ Ti + ∀i. R, r.. ⊢ Ti WF -------------------------------------------------- - R ⊢ for fn(T1..Tn) -> T0 + R ⊢ for fn(T1..Tn) -> T0 WF Basically, this rule says that a `fn` type is *always* WF, regardless of what types it references. This certainly accepts a type like @@ -670,8 +676,10 @@ and a trait object like `Foo+'x`, when we require that `'static: 'x` (which is true, clearly, but in some cases the implicit bounds from traits are not `'static` but rather some named lifetime). -The next clause states that all object fragments must be WF. An object -fragment is WF if its components are WF: +The next clause states that all object type fragments must be WF (an +"object type fragment" is part of an object type: so if you have +`Box`, `FnMut()` and `Send` are object type +fragments). An object type fragment is WF if its components are WF: WfObjectFragment: ∀i. R, r.. ⊢ Pi @@ -855,8 +863,8 @@ current policy and later, if/when we adopt a more full notion of implied bounds, rationalize it by saying that the suitable bounds for a type alias are implied by its expansion. 
-**For trait object fragments, should we check WF conditions when we can?** -For example, if you have: +**For trait object type fragments, should we check WF conditions when +we can?** For example, if you have: ```rust trait HashSet From e1a90c3a6ec693586e240b9a2a4a70dfddcd58b0 Mon Sep 17 00:00:00 2001 From: Andrew Paseltiner Date: Mon, 20 Jul 2015 16:24:01 -0400 Subject: [PATCH 0391/1195] s/item/element/ and move `VacantEntry` stuff into details section --- text/0000-collection-recovery.md | 67 ++++++++++++++++---------------- 1 file changed, 33 insertions(+), 34 deletions(-) diff --git a/text/0000-collection-recovery.md b/text/0000-collection-recovery.md index be79017b8be..3091c1ed5c5 100644 --- a/text/0000-collection-recovery.md +++ b/text/0000-collection-recovery.md @@ -5,14 +5,14 @@ # Summary -Add item-recovery methods to the set types in `std`. Add key-recovery methods to the map types in -`std` in order to facilitate this. +Add element-recovery methods to the set types in `std`. Add key-recovery methods to the map types +in `std` in order to facilitate this. # Motivation Sets are sometimes used as a cache keyed on a certain property of a type, but programs may need to access the type's other properties for efficiency or functionailty. The sets in `std` do not expose -their items (by reference or by value), making this use-case impossible. +their elements (by reference or by value), making this use-case impossible. Consider the following example: @@ -74,27 +74,27 @@ fn main() { // replaced copper with zinc // ``` // - // However, `HashSet` does not expose its items via its `{contains, insert, remove}` methods, - // instead providing only a boolean indicator of the item's presence in the set, preventing us - // from implementing the desired functionality. 
+ // However, `HashSet` does not expose its elements via its `{contains, insert, remove}` + // methods, instead providing only a boolean indicator of the elements's presence in the set, + // preventing us from implementing the desired functionality. } ``` # Detailed design -Add the following item-recovery methods to `std::collections::{BTreeSet, HashSet}`: +Add the following element-recovery methods to `std::collections::{BTreeSet, HashSet}`: ```rust impl Set { - // Like `contains`, but returns a reference to the item if the set contains it. - fn item(&self, item: &Q) -> Option<&T>; + // Like `contains`, but returns a reference to the element if the set contains it. + fn element(&self, element: &Q) -> Option<&T>; - // Like `remove`, but returns the item if the set contained it. - fn remove_item(&mut self, item: &Q) -> Option; + // Like `remove`, but returns the element if the set contained it. + fn remove_element(&mut self, element: &Q) -> Option; - // Like `insert`, but replaces the item with the given one and returns the previous item if the - // set contained it. - fn replace(&mut self, item: T) -> Option; + // Like `insert`, but replaces the element with the given one and returns the previous element + // if the set contained it. + fn replace(&mut self, element: T) -> Option; } ``` @@ -118,8 +118,7 @@ impl Map { } ``` -For completion, add the following key-recovery methods to -`std::collections::{btree_map, hash_map}::OccupiedEntry`: +Add the following key-recovery methods to `std::collections::{btree_map, hash_map}::OccupiedEntry`: ```rust impl<'a, K, V> OccupiedEntry<'a, K, V> { @@ -137,24 +136,7 @@ impl<'a, K, V> OccupiedEntry<'a, K, V> { } ``` -# Drawbacks - -This complicates the collection APIs. - -The distinction between `insert` and `replace` may be confusing. 
It would be more consistent to -call `Set::replace` `Set::insert_item` and `Map::replace` `Map::insert_key_value`, but `BTreeMap` -and `HashMap` do not replace equivalent keys in their `insert` methods, so rather than have -`insert` and `insert_key_value` behave differently in that respect, `replace` is used instead. - -# Alternatives - -Do nothing. - -# Unresolved questions - -Are these the best method names? - -Should `std::collections::{btree_map, hash_map}::VacantEntry` provide methods like +Add the following key-recovery methods to `std::collections::{btree_map, hash_map}::VacantEntry`: ```rust impl<'a, K, V> VacantEntry<'a, K, V> { @@ -169,5 +151,22 @@ impl<'a, K, V> VacantEntry<'a, K, V> { } ``` +# Drawbacks + +This complicates the collection APIs. + +The distinction between `insert` and `replace` may be confusing. It would be more consistent to +call `Set::replace` `Set::insert_element` and `Map::replace` `Map::insert_key_value`, but +`BTreeMap` and `HashMap` do not replace equivalent keys in their `insert` methods, so rather than +have `insert` and `insert_key_value` behave differently in that respect, `replace` is used instead. + +# Alternatives + +Do nothing. + +# Unresolved questions + +Are these the best method names? + Should `{BTreeMap, HashMap}::insert` be changed to replace equivalent keys? This could break code relying on the old behavior, and would add an additional inconsistency to `OccupiedEntry::insert`. From ac347d190308d48c5e0007aa4d9e87bdc0723396 Mon Sep 17 00:00:00 2001 From: Andrew Paseltiner Date: Mon, 20 Jul 2015 16:25:35 -0400 Subject: [PATCH 0392/1195] s/functionailty/functionality/ --- text/0000-collection-recovery.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-collection-recovery.md b/text/0000-collection-recovery.md index 3091c1ed5c5..5481e7143f7 100644 --- a/text/0000-collection-recovery.md +++ b/text/0000-collection-recovery.md @@ -11,7 +11,7 @@ in `std` in order to facilitate this. 
# Motivation Sets are sometimes used as a cache keyed on a certain property of a type, but programs may need to -access the type's other properties for efficiency or functionailty. The sets in `std` do not expose +access the type's other properties for efficiency or functionality. The sets in `std` do not expose their elements (by reference or by value), making this use-case impossible. Consider the following example: From b032d5db2db007ba927879371222ac00dd373ee2 Mon Sep 17 00:00:00 2001 From: Sean McArthur Date: Sun, 15 Feb 2015 20:13:51 -0800 Subject: [PATCH 0393/1195] use_group_as RFC --- text/0000-use-group-as.md | 72 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 72 insertions(+) create mode 100644 text/0000-use-group-as.md diff --git a/text/0000-use-group-as.md b/text/0000-use-group-as.md new file mode 100644 index 00000000000..de1231f1641 --- /dev/null +++ b/text/0000-use-group-as.md @@ -0,0 +1,72 @@ +- Feature Name: use_group_as +- Start Date: 2015-02-15 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Allow renaming imports when importing a group of symbols from a module. + +```rust +use std::io::{ + Error as IoError, + Result as IoResult, + Read, + Write +}; +``` + +# Motivation + +The current design requires the above example to be written like this: + +```rust +use std::io::Error as IoError; +use std::io::Result as IoResult; +use std::io::{Read, Write}; +``` + +It's unfortunate to duplicate `use std::io::` on the 3 lines, and the proposed +example feels logical, and something you reach for in this instance, without +knowing for sure if it worked. + +# Detailed design + +The current grammar for use statements is something like: + +``` + use_decl : "pub" ? "use" [ path "as" ident + | path_glob ] ; + + path_glob : ident [ "::" [ path_glob + | '*' ] ] ?
+ | '{' path_item [ ',' path_item ] * '}' ; + + path_item : ident | "self" ; +``` + +This RFC proposes changing the grammar to something like: + +``` + use_decl : "pub" ? "use" [ path [ "as" ident ] ? + | path_glob ] ; + + path_glob : ident [ "::" [ path_glob + | '*' ] ] ? + | '{' path_item [ ',' path_item ] * '}' ; + + path_item : ident [ "as" ident] ? + | "self" ; +``` + +The `"as" ident` part is optional in each location, and if omitted, it is expanded +to alias to the same name, e.g. `use foo::{bar}` expands to `use foo::{bar as bar}`. + +# Drawbacks + +# Alternatives + +# Unresolved questions + +- **Should `self` also be aliasable?** So you could write `use foo::{self as xfoo, bar}`. + From d0e72a744fc40b2fd3f6fcf22fdc460fc457e3a1 Mon Sep 17 00:00:00 2001 From: Sean McArthur Date: Wed, 22 Jul 2015 11:08:25 -0700 Subject: [PATCH 0394/1195] include renaming 'self' in a group import --- text/0000-use-group-as.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/text/0000-use-group-as.md b/text/0000-use-group-as.md index de1231f1641..1977175bbd8 100644 --- a/text/0000-use-group-as.md +++ b/text/0000-use-group-as.md @@ -56,17 +56,17 @@ This RFC proposes changing the grammar to something like: | '{' path_item [ ',' path_item ] * '}' ; path_item : ident [ "as" ident] ? - | "self" ; + | "self" [ "as" ident]; ``` The `"as" ident` part is optional in each location, and if omitted, it is expanded to alias to the same name, e.g. `use foo::{bar}` expands to `use foo::{bar as bar}`. +This includes being able to rename `self`, such as `use std::io::{self +as stdio, Result as IoResult};`. + # Drawbacks # Alternatives -# Unresolved questions - -- **Should `self` also be aliasable?** So you could write `use foo::{self as xfoo, bar}`. 
- +# Unresolved Questions From b6e4938f28500cf9e5bba6dfb70685166d4c07fc Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Mon, 16 Feb 2015 13:10:31 -0700 Subject: [PATCH 0395/1195] Allow macros in types --- text/0000-type-macros.md | 412 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 412 insertions(+) create mode 100644 text/0000-type-macros.md diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md new file mode 100644 index 00000000000..290c2265b9d --- /dev/null +++ b/text/0000-type-macros.md @@ -0,0 +1,412 @@ +- Feature Name: Macros in type positions +- Start Date: 2015-02-16 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Allow macros in type positions + +# Motivation + +Macros are currently allowed in syntax fragments for expressions, +items, and patterns, but not for types. This RFC proposes to lift that +restriction for the following reasons: + +1. Increase generality of the macro system - in the absence of a + concrete reason for disallowing macros in types, the limitation + should be removed in order to promote generality and to enable use + cases which would otherwise require resorting either to compiler + plugins or to more elaborate item-level macros. + +2. Enable more programming patterns - macros in type positions provide + a means to express **recursion** and **choice** within types in a + fashion that is still legible. Associated types alone can accomplish + the former (recursion/choice) but not the latter (legibility). + +# Detailed design + +## Implementation + +The proposed feature has been implemented at +[this branch](https://github.com/freebroccolo/rust/commits/feature/type_macros). There +is no real novelty to the design as it is simply an extension of the +existing macro machinery to handle the additional case of macro +expansion in types. 
The biggest change is the addition of a +[`TyMac`](https://github.com/freebroccolo/rust/blob/f8f8dbb6d332c364ecf26b248ce5f872a7a67019/src/libsyntax/ast.rs#L1274-L1275) +to the `Ty_` enum so that the parser can indicate a macro invocation +in a type position. In other words, `TyMac` is added to the AST and +handled analogously to `ExprMac`, `ItemMac`, and `PatMac`. + +## Examples + +### Heterogeneous Lists + +Heterogeneous lists are one example where the ability to express +recursion via type macros is very useful. They can be used as an +alternative to (or in combination with) tuples. Their recursive +structure provides a means to abstract over arity and to manipulate +arbitrary products of types with operations like appending, taking +length, adding/removing items, computing permutations, etc. + +Heterogeneous lists are straightforward to define: + +```rust +struct Nil; // empty HList +struct Cons<H, T>(H, T); // cons cell of HList + +// trait to classify valid HLists +trait HList {} +impl HList for Nil {} +impl<H, T: HList> HList for Cons<H, T> {} +``` + +However, writing them in code is not so convenient: + +```rust +let xs = Cons("foo", Cons(false, Cons(vec![0u64], Nil))); +``` + +At the term-level, this is easy enough to fix with a macro: + +```rust +// term-level macro for HLists +macro_rules! hlist { + {} => { Nil }; + { $head:expr } => { Cons($head, Nil) }; + { $head:expr, $($tail:expr),* } => { Cons($head, hlist!($($tail),*)) }; +} + +let xs = hlist!["foo", false, vec![0u64]]; +``` + +Unfortunately, this is an incomplete solution. HList terms are more +convenient to write but HList types are not: + +```rust +let xs: Cons<&str, Cons<bool, Cons<Vec<u64>, Nil>>> = hlist!["foo", false, vec![0u64]]; +``` + +Under this proposal—allowing macros in types—we would be able to use a +macro to improve writing the HList type as well. The complete example +follows: + +```rust +// term-level macro for HLists +macro_rules!
hlist { + {} => { Nil }; + { $head:expr } => { Cons($head, Nil) }; + { $head:expr, $($tail:expr),* } => { Cons($head, hlist!($($tail),*)) }; +} + +// type-level macro for HLists +macro_rules! HList { + {} => { Nil }; + { $head:ty } => { Cons<$head, Nil> }; + { $head:ty, $($tail:ty),* } => { Cons<$head, HList!($($tail),*)> }; +} + +let xs: HList![&str, bool, Vec] = hlist!["foo", false, vec![0u64]]; +``` + +Operations on HLists can be defined by recursion, using traits with +associated type outputs at the type-level and implementation methods +at the term-level. + +HList append is provided as an example of such an operation. Macros in +types are used to make writing append at the type level more +convenient, e.g., with `Expr!`: + +```rust +use std::ops; + +// nil case for HList append +impl ops::Add for Nil { + type Output = Ys; + + #[inline] + fn add(self, rhs: Ys) -> Ys { + rhs + } +} + +// cons case for HList append +impl ops::Add for Cons where + Xs: ops::Add, +{ + type Output = Cons; + + #[inline] + fn add(self, rhs: Ys) -> Cons { + Cons(self.0, self.1 + rhs) + } +} + +// type macro Expr allows us to expand the + operator appropriately +macro_rules! 
Expr { + { $A:ty } => { $A }; + { $LHS:tt + $RHS:tt } => { >::Output }; +} + +// test demonstrating term level `xs + ys` and type level `Expr!(Xs + Ys)` +#[test] +fn test_append() { + fn aux(xs: Xs, ys: Ys) -> Expr!(Xs + Ys) where + Xs: ops::Add + { + xs + ys + } + let xs: HList![&str, bool, Vec] = hlist!["foo", false, vec![]]; + let ys: HList![u64, [u8; 3], ()] = hlist![0, [0, 1, 2], ()]; + + // parentheses around compound types due to limitations in macro parsing; + // real implementation could use a plugin to avoid this + let zs: Expr!((HList![&str, bool, Vec]) + + (HList![u64, [u8; 3], ()])) + = aux(xs, ys); + assert_eq!(zs, hlist!["foo", false, vec![], 0, [0, 1, 2], ()]) +} +``` + +### Additional Examples ### + +#### Type-level numbers + +Another example where type macros can be useful is in the encoding of +numbers as types. Binary natural numbers can be represented as +follows: + +```rust +struct _0; // 0 bit +struct _1; // 1 bit + +// classify valid bits +trait Bit {} +impl Bit for _0 {} +impl Bit for _1 {} + +// classify positive binary naturals +trait Pos {} +impl Pos for _1 {} +impl Pos for (P, B) {} + +// classify binary naturals with 0 +trait Nat {} +impl Nat for _0 {} +impl Nat for _1 {} +impl Nat for (P, B) {} +``` + +These can be used to index into tuples or HLists generically (linear +time generally or constant time up to a fixed number of +specializations). They can also be used to encode "sized" or "bounded" +data, like vectors: + +```rust +struct LengthVec(Vec); +``` + +The type number can either be a phantom parameter `N` as above, or +represented concretely at the term-level (similar to list). In either +case, a length-safe API can be provided on top of types `Vec`. Because +the length is known statically, unsafe indexing would be allowable by +default. 
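To make the phantom-parameter variant concrete, here is a minimal sketch in present-day Rust (not the 2015 dialect used elsewhere in this RFC). It assumes a simplified unary encoding, with hypothetical `Zero` and `Succ<N>` types, in place of the binary naturals above, and the `new`, `push`, `pop`, and `len` signatures are illustrative assumptions rather than an API proposed by this RFC.

```rust
use std::marker::PhantomData;

// Hypothetical unary type-level naturals (a simplification of the
// binary encoding used in the RFC text).
#[allow(dead_code)]
struct Zero;
#[allow(dead_code)]
struct Succ<N>(PhantomData<N>);

// A vector whose length is tracked by the phantom parameter `N`.
struct LengthVec<A, N>(Vec<A>, PhantomData<N>);

impl<A> LengthVec<A, Zero> {
    // An empty vector has type-level length `Zero`.
    fn new() -> Self {
        LengthVec(Vec::new(), PhantomData)
    }
}

impl<A, N> LengthVec<A, N> {
    // Pushing consumes the vector and returns one with length `Succ<N>`.
    fn push(mut self, x: A) -> LengthVec<A, Succ<N>> {
        self.0.push(x);
        LengthVec(self.0, PhantomData)
    }

    // Term-level length, for comparison with the type-level one.
    fn len(&self) -> usize {
        self.0.len()
    }
}

impl<A, N> LengthVec<A, Succ<N>> {
    // `pop` only exists for non-empty vectors, so it cannot fail
    // as long as the type-level length is kept in sync.
    fn pop(mut self) -> (A, LengthVec<A, N>) {
        let x = self.0.pop().expect("length is tracked in the type");
        (x, LengthVec(self.0, PhantomData))
    }
}
```

With this encoding, calling `pop` on a `LengthVec<A, Zero>` is rejected at compile time because no such method is defined for that type. Writing the length by hand as nested `Succ<...>` quickly becomes unwieldy, which is the gap a type macro such as `Nat!` is meant to close.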
+ +We could imagine an idealized API in the following fashion: + +```rust +// push, adding one to the length +fn push(x: A, xs: LengthVec) -> LengthVec; + +// pop, subtracting one from the length +fn pop(store: &mut A, xs: LengthVec) -> LengthVec; + +// append, adding the individual lengths +fn append(xs: LengthVec, ys: LengthVec) -> LengthVec; + +// produce a length respecting iterator from an indexed vector +fn iter(xs: LengthVec) -> LengthIterator; +``` + +However, in order to be able to write something close to that in Rust, +we would need macros in types: + +```rust + +// Nat! would expand integer constants to type-level binary naturals; would +// be implemented as a plugin for efficiency +Nat!(4) ==> ((_1, _0), _0) + +// Expr! would expand + to Add::Output and integer constants to Nat!; see +// the HList append earlier in the RFC for a concrete example of how this +// might be defined +Expr!(N + M) ==> >::Output + +// Now we could expand the following type to something meaningful in Rust: +LengthVec + ==> LengthVec>::Output> + ==> LengthVec>::Output> +``` + +##### Optimization of `Expr`! + +Because `Expr!` could be implemented as a plugin, the opportunity +would exist to perform various optimizations of type-level expressions +during expansion. Partial evaluation would be one approach to +this. Furthermore, expansion-time optimizations would not necessarily +be limited to simple arithmetic expressions but could be used for +other data like HLists. + +#### Conversion from HList to Tuple + +With type macros, it is possible to write macros that convert back and +forth between tuples and HLists in the following fashion: + +```rust +// type-level macro for HLists +macro_rules! HList { + {} => { Nil }; + { $head:ty } => { Cons<$head, Nil> }; + { $head:ty, $($tail:ty),* } => { Cons<$head, HList!($($tail),*)> }; +} + +// term-level macro for HLists +macro_rules! 
hlist { + {} => { Nil }; + { $head:expr } => { Cons($head, Nil) }; + { $head:expr, $($tail:expr),* } => { Cons($head, hlist!($($tail),*)) }; +} + +// term-level HLists in patterns +macro_rules! hlist_match { + {} => { Nil }; + { $head:ident } => { Cons($head, Nil) }; + { $head:ident, $($tail:ident),* } => { Cons($head, hlist_match!($($tail),*)) }; +} + +// iterate macro for generated comma separated sequences of idents +fn impl_for_seq_upto_expand<'cx>( + ecx: &'cx mut base::ExtCtxt, + span: codemap::Span, + args: &[ast::TokenTree], +) -> Box { + let mut parser = ecx.new_parser_from_tts(args); + + // parse the macro name + let mac = parser.parse_ident(); + + // parse a comma + parser.eat(&token::Token::Comma); + + // parse the number of iterations + let iterations = match parser.parse_lit().node { + ast::Lit_::LitInt(i, _) => i, + _ => { + ecx.span_err(span, "welp"); + return base::DummyResult::any(span); + } + }; + + // generate a token tree: A0, ..., An + let mut ctx = range(0, iterations * 2 - 1).flat_map(|k| { + if k % 2 == 0 { + token::str_to_ident(format!("A{}", (k / 2)).as_slice()) + .to_tokens(ecx) + .into_iter() + } else { + let span = codemap::DUMMY_SP; + let token = parse::token::Token::Comma; + vec![ast::TokenTree::TtToken(span, token)] + .into_iter() + } + }).collect::>(); + + // iterate over the ctx and generate impl syntax fragments + let mut items = vec![]; + let mut i = ctx.len(); + for _ in range(0, iterations) { + items.push(quote_item!(ecx, $mac!{ $ctx };).unwrap()); + i -= 2; + ctx.truncate(i); + } + + // splice the impl fragments into the ast + base::MacItems::new(items.into_iter()) +} + +pub struct ToHList; +pub struct ToTuple; + +// macro to implement: ToTuple(hlist![…]) => (…,) +macro_rules! 
impl_to_tuple_for_seq { + ($($seq:ident),*) => { + #[allow(non_snake_case)] + impl<$($seq,)*> Fn<(HList![$($seq),*],)> for ToTuple { + type Output = ($($seq,)*); + #[inline] + extern "rust-call" fn call(&self, (this,): (HList![$($seq),*],)) -> ($($seq,)*) { + match this { + hlist_match![$($seq),*] => ($($seq,)*) + } + } + } + } +} + +// macro to implement: ToHList((…,)) => hlist![…] +macro_rules! impl_to_hlist_for_seq { + ($($seq:ident),*) => { + #[allow(non_snake_case)] + impl<$($seq,)*> Fn<(($($seq,)*),)> for ToHList { + type Output = HList![$($seq),*]; + #[inline] + extern "rust-call" fn call(&self, (this,): (($($seq,)*),)) -> HList![$($seq),*] { + match this { + ($($seq,)*) => hlist![$($seq),*] + } + } + } + } +} + +// generate implementations up to length 32 +impl_for_seq_upto!{ impl_to_tuple_for_seq, 32 } +impl_for_seq_upto!{ impl_to_hlist_for_seq, 32 } +``` + +# Drawbacks + +There seem to be few drawbacks to implementing this feature as an +extension of the existing macro machinery. Parsing macro invocations +in types adds a very small amount of additional complexity to the +parser (basically looking for `!`). Having an extra case for macro +invocation in types slightly complicates conversion. As with all +feature proposals, it is possible that designs for future extensions +to the macro system or type system might somehow interfere with this +functionality. + +# Alternatives + +There are no direct alternatives to my knowledge. Extensions to the +type system like data kinds, singletons, and various more elaborate +forms of staged programming (so-called CTFE) could conceivably cover +some cases where macros in types might otherwise be used. It is +unlikely they would provide the same level of functionality as macros, +particularly where plugins are concerned. Instead, such features would +probably benefit from type macros too. + +Not implementing this feature would mean disallowing some useful +programming patterns. 
There are some discussions in the community +regarding more extensive changes to the type system to address some of +these patterns. However, type macros along with associated types can +already accomplish many of the same things without the significant +engineering cost in terms of changes to the type system. Either way, +type macros would not prevent additional extensions. + +# Unresolved questions + +There is a question as to whether macros in types should allow `<` and +`>` as delimiters for invocations, e.g. `Foo!`. However, this would +raise a number of additional complications and is not necessary to +consider for this RFC. If deemed desirable by the community, this +functionality can be proposed separately. From fbbb16badd1381f968e8fa763b63213a3906e574 Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Wed, 18 Feb 2015 10:33:29 -0700 Subject: [PATCH 0396/1195] Link RFC for parameterizing types with constants --- text/0000-type-macros.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 290c2265b9d..14738dea32a 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -257,6 +257,17 @@ this. Furthermore, expansion-time optimizations would not necessarily be limited to simple arithmetic expressions but could be used for other data like HLists. +##### Native alternatives: types parameterized by constant values + +This example with type-level naturals is meant to illustrate the kind +of patterns macros in types enable. I am not suggesting the standard +libraries adopt _this particular_ representation as a means to address +the more general issue of lack of numeric parameterization for +types. There is +[another RFC here](https://github.com/rust-lang/rfcs/pull/884) which +does propose extending the type system to allow parameterization over +constants. 
+ #### Conversion from HList to Tuple With type macros, it is possible to write macros that convert back and From 6c7a4054d26654b8ebad97c55941fc8ac9d4c0b1 Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Wed, 18 Feb 2015 10:43:12 -0700 Subject: [PATCH 0397/1195] Add tests to hlist/tuple conversion example --- text/0000-type-macros.md | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 14738dea32a..478733f5095 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -383,6 +383,20 @@ macro_rules! impl_to_hlist_for_seq { // generate implementations up to length 32 impl_for_seq_upto!{ impl_to_tuple_for_seq, 32 } impl_for_seq_upto!{ impl_to_hlist_for_seq, 32 } + +// test converting an hlist to tuple +#[test] +fn test_to_tuple() { + assert_eq!(ToTuple(hlist!["foo", true, (), vec![42u64]]), + ("foo", true, (), vec![42u64])) +} + +// test converting a tuple to hlist +#[test] +fn test_to_hlist() { + assert_eq!(ToHList(("foo", true, (), vec![42u64])), + hlist!["foo", true, (), vec![42u64]]) +} ``` # Drawbacks From 53626bd77c038ec57a88c7a00509b06a3522748d Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Wed, 18 Feb 2015 20:06:39 -0700 Subject: [PATCH 0398/1195] Rewording, comments, etc. --- text/0000-type-macros.md | 76 +++++++++++++++++++++++++++------------- 1 file changed, 51 insertions(+), 25 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 478733f5095..f29c7e1ecc0 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -29,14 +29,18 @@ restriction for the following reasons: ## Implementation The proposed feature has been implemented at -[this branch](https://github.com/freebroccolo/rust/commits/feature/type_macros). There -is no real novelty to the design as it is simply an extension of the -existing macro machinery to handle the additional case of macro -expansion in types.
The biggest change is the addition of a +[this branch](https://github.com/freebroccolo/rust/commits/feature/type_macros). The +implementation is very simple and there is no novelty to the +design. The patches make a small modification to the existing macro +expansion functionality in order to support macro invocations in +syntax for types. No changes are made to type-checking or other phases +of the compiler. + +The biggest change introduced by this feature is a [`TyMac`](https://github.com/freebroccolo/rust/blob/f8f8dbb6d332c364ecf26b248ce5f872a7a67019/src/libsyntax/ast.rs#L1274-L1275) -to the `Ty_` enum so that the parser can indicate a macro invocation -in a type position. In other words, `TyMac` is added to the ast and -handled analogously to `ExprMac`, `ItemMac`, and `PatMac`. +case for the `Ty_` enum so that the parser can indicate a macro +invocation in a type position. In other words, `TyMac` is added to the +ast and handled analogously to `ExprMac`, `ItemMac`, and `PatMac`. ## Examples @@ -235,12 +239,14 @@ we would need macros in types: // Nat! would expand integer constants to type-level binary naturals; would // be implemented as a plugin for efficiency -Nat!(4) ==> ((_1, _0), _0) +Nat!(4) + ==> ((_1, _0), _0) // Expr! would expand + to Add::Output and integer constants to Nat!; see // the HList append earlier in the RFC for a concrete example of how this // might be defined -Expr!(N + M) ==> >::Output +Expr!(N + M) + ==> >::Output // Now we could expand the following type to something meaningful in Rust: LengthVec @@ -271,7 +277,12 @@ constants. #### Conversion from HList to Tuple With type macros, it is possible to write macros that convert back and -forth between tuples and HLists in the following fashion: +forth between tuples and HLists. This is very powerful because it lets +us reuse all of the operations we define for HLists (appending, taking +length, adding/removing items, computing permutations, etc.) 
on tuples +just by converting to HList, computing, then converting back to a tuple. + +The conversion can be implemented in the following fashion: ```rust // type-level macro for HLists @@ -295,8 +306,17 @@ macro_rules! hlist_match { { $head:ident, $($tail:ident),* } => { Cons($head, hlist_match!($($tail),*)) }; } -// iterate macro for generated comma separated sequences of idents -fn impl_for_seq_upto_expand<'cx>( +// `invoke_for_seq_upto` is a `higher-order` macro that takes the name +// of another macro and a number and iteratively invokes the named +// macro with sequences of identifiers, e.g., +// +// invoke_for_seq_upto!{ my_mac, 5 } +// ==> my_mac!{ A0, A1, A2, A3, A4 }; +// my_mac!{ A0, A1, A2, A3 }; +// my_mac!{ A0, A1, A2 }; +// ... + +fn invoke_for_seq_upto_expand<'cx>( ecx: &'cx mut base::ExtCtxt, span: codemap::Span, args: &[ast::TokenTree], @@ -348,8 +368,9 @@ fn impl_for_seq_upto_expand<'cx>( pub struct ToHList; pub struct ToTuple; -// macro to implement: ToTuple(hlist![…]) => (…,) -macro_rules! impl_to_tuple_for_seq { +// macro to implement conversion from hlist to tuple, +// e.g., ToTuple(hlist![…]) ==> (…,) +macro_rules! impl_to_tuple { ($($seq:ident),*) => { #[allow(non_snake_case)] impl<$($seq,)*> Fn<(HList![$($seq),*],)> for ToTuple { @@ -364,8 +385,9 @@ macro_rules! impl_to_tuple_for_seq { } } -// macro to implement: ToHList((…,)) => hlist![…] -macro_rules! impl_to_hlist_for_seq { +// macro to implement conversion from tuple to hlist, +// e.g., ToHList((…,)) ==> hlist![…] +macro_rules! impl_to_hlist { ($($seq:ident),*) => { #[allow(non_snake_case)] impl<$($seq,)*> Fn<(($($seq,)*),)> for ToHList { @@ -381,8 +403,8 @@ macro_rules!
impl_to_hlist_for_seq { } // generate implementations up to length 32 -impl_for_seq_upto!{ impl_to_tuple_for_seq, 32 } -impl_for_seq_upto!{ impl_to_hlist_for_seq, 32 } +invoke_for_seq_upto!{ impl_to_tuple, 32 } +invoke_for_seq_upto!{ impl_to_hlist, 32 } // test converting an hlist to tuple #[test] @@ -402,13 +424,17 @@ fn test_to_hlist() { # Drawbacks There seem to be few drawbacks to implementing this feature as an -extension of the existing macro machinery. Parsing macro invocations -in types adds a very small amount of additional complexity to the -parser (basically looking for `!`). Having an extra case for macro -invocation in types slightly complicates conversion. As with all -feature proposals, it is possible that designs for future extensions -to the macro system or type system might somehow interfere with this -functionality. +extension of the existing macro machinery. The change adds a very +small amount of additional complexity to the +[parser](https://github.com/freebroccolo/rust/blob/e09cb32bcc04029dc4c16790e2aaa9811af27f25/src/libsyntax/parse/parser.rs#L1547-L1560) +and +[conversion](https://github.com/freebroccolo/rust/blob/e4b826b7afa1b5496b41ddaa1666014046ac5704/src/librustc_typeck/astconv.rs#L1301-L1303) +but the changes are almost negligible. + +As with all feature proposals, it is possible that designs for future +extensions to the macro system or type system might somehow interfere +with this functionality but it seems unlikely unless they are +significant, breaking changes. 
# Alternatives From 5a835d7e240ad5e4df6ae24d90f99da00d07e9cd Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Fri, 20 Feb 2015 03:56:59 -0700 Subject: [PATCH 0399/1195] Cleanup invoke_for_seq_upto macro --- text/0000-type-macros.md | 68 +++++++++++++++++++++------------------- 1 file changed, 35 insertions(+), 33 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index f29c7e1ecc0..6842868d944 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -315,7 +315,6 @@ macro_rules! hlist_match { // my_mac!{ A0, A1, A2, A3 }; // my_mac!{ A0, A1, A2 }; // ... - fn invoke_for_seq_upto_expand<'cx>( ecx: &'cx mut base::ExtCtxt, span: codemap::Span, @@ -327,42 +326,45 @@ fn invoke_for_seq_upto_expand<'cx>( let mac = parser.parse_ident(); // parse a comma - parser.eat(&token::Token::Comma); + parser.expect(&token::Token::Comma); // parse the number of iterations - let iterations = match parser.parse_lit().node { - ast::Lit_::LitInt(i, _) => i, - _ => { - ecx.span_err(span, "welp"); - return base::DummyResult::any(span); - } - }; - - // generate a token tree: A0, ..., An - let mut ctx = range(0, iterations * 2 - 1).flat_map(|k| { - if k % 2 == 0 { - token::str_to_ident(format!("A{}", (k / 2)).as_slice()) - .to_tokens(ecx) - .into_iter() - } else { - let span = codemap::DUMMY_SP; - let token = parse::token::Token::Comma; - vec![ast::TokenTree::TtToken(span, token)] - .into_iter() + if let ast::Lit_::LitInt(lit, _) = parser.parse_lit().node { + Some(lit) + } else { + None + }.and_then(|iterations| { + + // generate a token tree: A0, ..., An + let mut ctx = range(0, iterations * 2 - 1).flat_map(|k| { + if k % 2 == 0 { + token::str_to_ident(format!("A{}", (k / 2)).as_slice()) + .to_tokens(ecx) + .into_iter() + } else { + let span = codemap::DUMMY_SP; + let token = parse::token::Token::Comma; + vec![ast::TokenTree::TtToken(span, token)] + .into_iter() + } + }).collect::>(); + + // iterate over the ctx and generate impl syntax fragments 
+ let mut items = vec![]; + let mut i = ctx.len(); + for _ in range(0, iterations) { + items.push(quote_item!(ecx, $mac!{ $ctx };).unwrap()); + i -= 2; + ctx.truncate(i); } - }).collect::>(); - - // iterate over the ctx and generate impl syntax fragments - let mut items = vec![]; - let mut i = ctx.len(); - for _ in range(0, iterations) { - items.push(quote_item!(ecx, $mac!{ $ctx };).unwrap()); - i -= 2; - ctx.truncate(i); - } - // splice the impl fragments into the ast - base::MacItems::new(items.into_iter()) + // splice the impl fragments into the ast + Some(base::MacItems::new(items.into_iter())) + + }).unwrap_or_else(|| { + ecx.span_err(span, "invoke_for_seq_upto!: expected an integer literal argument"); + base::DummyResult::any(span) + }) } pub struct ToHList; From e6f655fe05d31c13c72a54d4746146cdd864afe9 Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Fri, 20 Feb 2015 04:03:18 -0700 Subject: [PATCH 0400/1195] Add example plugin to expand integers to type nats --- text/0000-type-macros.md | 93 +++++++++++++++++++++++++++++++++++++--- 1 file changed, 87 insertions(+), 6 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 6842868d944..214efe87fe5 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -236,24 +236,105 @@ However, in order to be able to write something close to that in Rust, we would need macros in types: ```rust - -// Nat! would expand integer constants to type-level binary naturals; would -// be implemented as a plugin for efficiency -Nat!(4) - ==> ((_1, _0), _0) - // Expr! would expand + to Add::Output and integer constants to Nat!; see // the HList append earlier in the RFC for a concrete example of how this // might be defined Expr!(N + M) ==> >::Output +// Nat! 
would expand integer literals to type-level binary naturals +// and be implemented as a plugin for efficiency; see the following +// section for a concrete implementation +Nat!(4) + ==> ((_1, _0), _0) + // Now we could expand the following type to something meaningful in Rust: LengthVec ==> LengthVec>::Output> ==> LengthVec>::Output> ``` +##### Implementation of `Nat!` as a plugin + +The following code demonstrates concretely how `Nat!` can be +implemented as a plugin. As with the `HList!` example, this code is +already usable with the type macros implemented in the branch +referenced earlier in this RFC. + +For efficiency, the binary representation is first constructed as a +string via iteration rather than recursively using `quote` macros. The +string is then parsed as a type, returning an ast fragment. + +```rust +// convert a u64 to a string representation of a type-level binary natural, e.g., +// to_bin_nat(1024) +// ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) +#[inline] +fn to_bin_nat(mut num: u64) -> String { + let mut res = String::from_str("_"); + if num < 2 { + res.push_str(num.to_string().as_slice()); + } else { + let mut bin = vec![]; + while num > 0 { + bin.push(num % 2); + num >>= 1; + } + res = ::std::iter::repeat('(').take(bin.len() - 1).collect(); + res.push_str("_"); + res.push_str(bin.pop().unwrap().to_string().as_slice()); + for b in bin.iter().rev() { + res.push_str(", _"); + res.push_str(b.to_string().as_slice()); + res.push_str(")"); + } + } + return res; +} + +// generate a parser to convert a string representation of a type-level natural +// to an ast fragment for a type +#[inline] +pub fn bin_nat_parser<'cx>( + ecx: &'cx mut base::ExtCtxt, + num: u64, +) -> parse::parser::Parser<'cx> { + let filemap = ecx + .codemap() + .new_filemap(String::from_str(""), to_bin_nat(num)); + let reader = lexer::StringReader::new( + &ecx.parse_sess().span_diagnostic, + filemap); + parser::Parser::new( + ecx.parse_sess(), + ecx.cfg(), + 
Box::new(reader)) +} + +// Expand Nat!(n) to a type-level binary nat where n is an int literal, e.g., +// Nat!(1024) +// ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) +#[inline] +pub fn nat_expand<'cx>( + ecx: &'cx mut base::ExtCtxt, + span: codemap::Span, + args: &[ast::TokenTree], +) -> Box { + let mut litp = ecx.new_parser_from_tts(args); + if let ast::Lit_::LitInt(lit, _) = litp.parse_lit().node { + Some(lit) + } else { + None + }.and_then(|lit| { + let mut natp = bin_nat_parser(ecx, lit); + Some(base::MacTy::new(natp.parse_ty())) + }).unwrap_or_else(|| { + ecx.span_err(span, "Nat!: expected an integer literal argument"); + base::DummyResult::any(span) + }) +} +``` + ##### Optimization of `Expr`! Because `Expr!` could be implemented as a plugin, the opportunity From 9bf2861ceffd563078ddae88f151f14a59106368 Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Fri, 20 Feb 2015 13:56:42 -0700 Subject: [PATCH 0401/1195] Remove unnecessary attributes from examples --- text/0000-type-macros.md | 7 ------- 1 file changed, 7 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 214efe87fe5..d5e6ef88c45 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -128,7 +128,6 @@ use std::ops; impl ops::Add for Nil { type Output = Ys; - #[inline] fn add(self, rhs: Ys) -> Ys { rhs } @@ -140,7 +139,6 @@ impl ops::Add for Cons w { type Output = Cons; - #[inline] fn add(self, rhs: Ys) -> Cons { Cons(self.0, self.1 + rhs) } @@ -269,7 +267,6 @@ string is then parsed as a type, returning an ast fragment. 
// convert a u64 to a string representation of a type-level binary natural, e.g., // to_bin_nat(1024) // ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) -#[inline] fn to_bin_nat(mut num: u64) -> String { let mut res = String::from_str("_"); if num < 2 { @@ -294,7 +291,6 @@ fn to_bin_nat(mut num: u64) -> String { // generate a parser to convert a string representation of a type-level natural // to an ast fragment for a type -#[inline] pub fn bin_nat_parser<'cx>( ecx: &'cx mut base::ExtCtxt, num: u64, @@ -314,7 +310,6 @@ pub fn bin_nat_parser<'cx>( // Expand Nat!(n) to a type-level binary nat where n is an int literal, e.g., // Nat!(1024) // ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) -#[inline] pub fn nat_expand<'cx>( ecx: &'cx mut base::ExtCtxt, span: codemap::Span, @@ -458,7 +453,6 @@ macro_rules! impl_to_tuple { #[allow(non_snake_case)] impl<$($seq,)*> Fn<(HList![$($seq),*],)> for ToTuple { type Output = ($($seq,)*); - #[inline] extern "rust-call" fn call(&self, (this,): (HList![$($seq),*],)) -> ($($seq,)*) { match this { hlist_match![$($seq),*] => ($($seq,)*) @@ -475,7 +469,6 @@ macro_rules! impl_to_hlist { #[allow(non_snake_case)] impl<$($seq,)*> Fn<(($($seq,)*),)> for ToHList { type Output = HList![$($seq),*]; - #[inline] extern "rust-call" fn call(&self, (this,): (($($seq,)*),)) -> HList![$($seq),*] { match this { ($($seq,)*) => hlist![$($seq),*] From 719c3693b6d7451a747eddba214391104a19cef4 Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Fri, 20 Feb 2015 14:00:50 -0700 Subject: [PATCH 0402/1195] Clean up nat plugin example; add term-level macro --- text/0000-type-macros.md | 69 ++++++++++++++++++++++++++++++---------- 1 file changed, 52 insertions(+), 17 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index d5e6ef88c45..6208bf4d25b 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -265,10 +265,11 @@ string is then parsed as a type, returning an ast fragment. 
```rust // convert a u64 to a string representation of a type-level binary natural, e.g., -// to_bin_nat(1024) +// nat_str(1024) // ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) -fn to_bin_nat(mut num: u64) -> String { - let mut res = String::from_str("_"); +fn nat_str(mut num: u64) -> String { + let path = "bit::_"; + let mut res = String::from_str(path); if num < 2 { res.push_str(num.to_string().as_slice()); } else { @@ -278,10 +279,11 @@ fn to_bin_nat(mut num: u64) -> String { num >>= 1; } res = ::std::iter::repeat('(').take(bin.len() - 1).collect(); - res.push_str("_"); + res.push_str(path); res.push_str(bin.pop().unwrap().to_string().as_slice()); for b in bin.iter().rev() { - res.push_str(", _"); + res.push_str(", "); + res.push_str(path); res.push_str(b.to_string().as_slice()); res.push_str(")"); } @@ -289,15 +291,14 @@ fn to_bin_nat(mut num: u64) -> String { return res; } -// generate a parser to convert a string representation of a type-level natural -// to an ast fragment for a type -pub fn bin_nat_parser<'cx>( +// Generate a parser with the nat string for `num` as input +fn nat_str_parser<'cx>( ecx: &'cx mut base::ExtCtxt, num: u64, ) -> parse::parser::Parser<'cx> { let filemap = ecx .codemap() - .new_filemap(String::from_str(""), to_bin_nat(num)); + .new_filemap(String::from_str(""), nat_str(num)); let reader = lexer::StringReader::new( &ecx.parse_sess().span_diagnostic, filemap); @@ -307,27 +308,61 @@ pub fn bin_nat_parser<'cx>( Box::new(reader)) } +// Try to parse an integer literal and return a new parser for its nat +// string; this is used to create both a type-level `Nat!` with +// `nat_ty_expand` and term-level `nat!` macro with `nat_tm_expand` +pub fn nat_lit_parser<'cx>( + ecx: &'cx mut base::ExtCtxt, + args: &[ast::TokenTree], +) -> Option> { + let mut litp = ecx.new_parser_from_tts(args); + if let ast::Lit_::LitInt(lit, _) = litp.parse_lit().node { + Some(nat_str_parser(ecx, lit)) + } else { + None + } +} + // Expand Nat!(n) to 
a type-level binary nat where n is an int literal, e.g., // Nat!(1024) // ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) -pub fn nat_expand<'cx>( +pub fn nat_ty_expand<'cx>( ecx: &'cx mut base::ExtCtxt, span: codemap::Span, args: &[ast::TokenTree], ) -> Box { - let mut litp = ecx.new_parser_from_tts(args); - if let ast::Lit_::LitInt(lit, _) = litp.parse_lit().node { - Some(lit) - } else { - None - }.and_then(|lit| { - let mut natp = bin_nat_parser(ecx, lit); + { + nat_lit_parser(ecx, args) + }.and_then(|mut natp| { Some(base::MacTy::new(natp.parse_ty())) }).unwrap_or_else(|| { ecx.span_err(span, "Nat!: expected an integer literal argument"); base::DummyResult::any(span) }) } + +// Expand nat!(n) to a term-level binary nat where n is an int literal, e.g., +// nat!(1024) +// ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) +pub fn nat_tm_expand<'cx>( + ecx: &'cx mut base::ExtCtxt, + span: codemap::Span, + args: &[ast::TokenTree], +) -> Box { + { + nat_lit_parser(ecx, args) + }.and_then(|mut natp| { + Some(base::MacExpr::new(natp.parse_expr())) + }).unwrap_or_else(|| { + ecx.span_err(span, "nat!: expected an integer literal argument"); + base::DummyResult::any(span) + }) +} + +#[test] +fn nats() { + let _: Nat!(42) = nat!(42); +} ``` ##### Optimization of `Expr`! From cdac17c2af6eb7c20a8bff7fd8bffe5b63783d98 Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Fri, 20 Feb 2015 18:22:04 -0700 Subject: [PATCH 0403/1195] More clean up; mention hygiene --- text/0000-type-macros.md | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 6208bf4d25b..4b963a76633 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -268,10 +268,9 @@ string is then parsed as a type, returning an ast fragment. 
// nat_str(1024) // ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) fn nat_str(mut num: u64) -> String { - let path = "bit::_"; - let mut res = String::from_str(path); + let mut res: String; if num < 2 { - res.push_str(num.to_string().as_slice()); + res = num.to_string(); } else { let mut bin = vec![]; while num > 0 { @@ -279,11 +278,9 @@ fn nat_str(mut num: u64) -> String { num >>= 1; } res = ::std::iter::repeat('(').take(bin.len() - 1).collect(); - res.push_str(path); res.push_str(bin.pop().unwrap().to_string().as_slice()); for b in bin.iter().rev() { res.push_str(", "); - res.push_str(path); res.push_str(b.to_string().as_slice()); res.push_str(")"); } @@ -567,8 +564,16 @@ type macros would not prevent additional extensions. # Unresolved questions +## Alternative syntax for macro invocations in types + There is a question as to whether macros in types should allow `<` and `>` as delimiters for invocations, e.g. `Foo!`. However, this would raise a number of additional complications and is not necessary to consider for this RFC. If deemed desirable by the community, this functionality can be proposed separately. + +## Hygiene and type macros + +This RFC does not address the topic of hygiene regarding macros in +types. It is not clear to me whether there are issues here or not but +it may be worth considering in further detail. From 39be4b21b2a1faf787da8dabca7229eb6f909135 Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Fri, 20 Feb 2015 20:35:17 -0700 Subject: [PATCH 0404/1195] Rewording; clarification; cleanup --- text/0000-type-macros.md | 204 ++++++++++++++++++++------------------- 1 file changed, 106 insertions(+), 98 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 4b963a76633..b0249df0cce 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -13,11 +13,10 @@ Macros are currently allowed in syntax fragments for expressions, items, and patterns, but not for types. 
This RFC proposes to lift that restriction for the following reasons: -1. Increase generality of the macro system - in the absence of a - concrete reason for disallowing macros in types, the limitation - should be removed in order to promote generality and to enable use - cases which would otherwise require resorting either to compiler - plugins or to more elaborate item-level macros. +1. Increase generality of the macro system - the limitation should be + removed in order to promote generality and to enable use cases which + would otherwise require resorting either more elaborate plugins or + macros at the item-level. 2. Enable more programming patterns - macros in type positions provide a means to express **recursion** and **choice** within types in a @@ -28,15 +27,13 @@ restriction for the following reasons: ## Implementation -The proposed feature has been implemented at +The proposed feature has been prototyped at [this branch](https://github.com/freebroccolo/rust/commits/feature/type_macros). The -implementation is very simple and there is no novelty to the -design. The patches make a small modification to the existing macro -expansion functionality in order to support macro invocations in -syntax for types. No changes are made to type-checking or other phases -of the compiler. +implementation is straightforward and the impact of the changes are +limited in scope to the macro system. Type-checking and other phases +of compilation should be unaffected. -The biggest change introduced by this feature is a +The most significant change introduced by this feature is a [`TyMac`](https://github.com/freebroccolo/rust/blob/f8f8dbb6d332c364ecf26b248ce5f872a7a67019/src/libsyntax/ast.rs#L1274-L1275) case for the `Ty_` enum so that the parser can indicate a macro invocation in a type position. In other words, `TyMac` is added to the @@ -48,12 +45,12 @@ ast and handled analogously to `ExprMac`, `ItemMac`, and `PatMac`. 
Heterogeneous lists are one example where the ability to express recursion via type macros is very useful. They can be used as an -alternative to (or in combination with) tuples. Their recursive +alternative to or in combination with tuples. Their recursive structure provide a means to abstract over arity and to manipulate arbitrary products of types with operations like appending, taking length, adding/removing items, computing permutations, etc. -Heterogeneous lists are straightforward to define: +Heterogeneous lists can be defined like so: ```rust struct Nil; // empty HList @@ -65,13 +62,13 @@ impl HList for Nil {} impl HList for Cons {} ``` -However, writing them in code is not so convenient: +However, writing HList terms in code is not very convenient: ```rust let xs = Cons("foo", Cons(false, Cons(vec![0u64], Nil))); ``` -At the term-level, this is easy enough to fix with a macro: +At the term-level, this is an easy fix using macros: ```rust // term-level macro for HLists @@ -84,16 +81,16 @@ macro_rules! hlist { let xs = hlist!["foo", false, vec![0u64]]; ``` -Unfortunately, this is an incomplete solution. HList terms are more -convenient to write but HList types are not: +Unfortunately, this solution is incomplete because we have only made +HList terms easier to write. HList types are still inconvenient: ```rust let xs: Cons<&str, Cons, Nil>>> = hlist!["foo", false, vec![0u64]]; ``` -Under this proposal—allowing macros in types—we would be able to use a -macro to improve writing the HList type as well. The complete example -follows: +Allowing type macros as this RFC proposes would allows us to be +able to use Rust's macros to improve writing the HList type as +well. The complete example follows: ```rust // term-level macro for HLists @@ -117,9 +114,9 @@ Operations on HLists can be defined by recursion, using traits with associated type outputs at the type-level and implementation methods at the term-level. 
-HList append is provided as an example of such an operation. Macros in -types are used to make writing append at the type level more -convenient, e.g., with `Expr!`: +The HList append operation is provided as an example. type macros are +used to make writing append at the type level (see `Expr!`) more +convenient than specifying the associated type projection manually: ```rust use std::ops; @@ -172,11 +169,12 @@ fn test_append() { ### Additional Examples ### -#### Type-level numbers +#### Type-level numerics -Another example where type macros can be useful is in the encoding of -numbers as types. Binary natural numbers can be represented as -follows: +Type-level numerics are another area where type macros can be +useful. The more common unary encodings (Peano numerals) are not +efficient enough to use in practice so we present an example +demonstrating binary natural numbers instead: ```rust struct _0; // 0 bit @@ -199,29 +197,41 @@ impl Nat for _1 {} impl Nat for (P, B) {} ``` -These can be used to index into tuples or HLists generically (linear -time generally or constant time up to a fixed number of -specializations). They can also be used to encode "sized" or "bounded" -data, like vectors: +These can be used to index into tuples or HLists generically, either +by specifying the path explicitly (e.g., `(a, b, c).at::<(_1, _0)>() +==> c`) or by providing a singleton term with the appropriate type +`(a, b, c).at((_1, _0)) ==> c`. Indexing is linear time in the general +case due to recursion, but can be made constant time for a fixed +number of specialized implementations. + +Type-level numbers can also be used to define "sized" or "bounded" +data, such as a vector indexed by its length: ```rust struct LengthVec(Vec); ``` -The type number can either be a phantom parameter `N` as above, or -represented concretely at the term-level (similar to list). In either -case, a length-safe API can be provided on top of types `Vec`. 
Because -the length is known statically, unsafe indexing would be allowable by -default. +Similar to the indexing example, the parameter `N` can either serve as +phantom data, or such a struct could also include a term-level +representation of N as another field. + +In either case, a length-safe API could be defined for container types +like `Vec`. "Unsafe" indexing (without bounds checking) into the +underlying container would be safe in general because the length of +the container would be known statically and reflected in the type of +the length-indexed wrapper. We could imagine an idealized API in the following fashion: ```rust // push, adding one to the length -fn push(x: A, xs: LengthVec) -> LengthVec; +fn push(xs: LengthVec, x: A) -> LengthVec; // pop, subtracting one from the length -fn pop(store: &mut A, xs: LengthVec) -> LengthVec; +fn pop(xs: LengthVec, store: &mut A) -> LengthVec; + +// look up an element at an index +fn at(xs: LengthVec, index: M) -> A; // append, adding the individual lengths fn append(xs: LengthVec, ys: LengthVec) -> LengthVec; @@ -230,23 +240,22 @@ fn append(xs: LengthVec, ys: LengthVec) -> Length fn iter(xs: LengthVec) -> LengthIterator; ``` -However, in order to be able to write something close to that in Rust, -we would need macros in types: +We can't write code like the above directly in Rust but we could +approximate it through type-level macros: ```rust // Expr! would expand + to Add::Output and integer constants to Nat!; see -// the HList append earlier in the RFC for a concrete example of how this -// might be defined +// the HList append earlier in the RFC for a concrete example Expr!(N + M) ==> >::Output // Nat! 
would expand integer literals to type-level binary naturals // and be implemented as a plugin for efficiency; see the following -// section for a concrete implementation +// section for a concrete example Nat!(4) ==> ((_1, _0), _0) -// Now we could expand the following type to something meaningful in Rust: +// `Expr!` and `Nat!` used for the LengthVec type: LengthVec ==> LengthVec>::Output> ==> LengthVec>::Output> @@ -255,9 +264,9 @@ LengthVec ##### Implementation of `Nat!` as a plugin The following code demonstrates concretely how `Nat!` can be -implemented as a plugin. As with the `HList!` example, this code is -already usable with the type macros implemented in the branch -referenced earlier in this RFC. +implemented as a plugin. As with the `HList!` example, this code (with +some additions) compiles and is usable with the type macros prototype +in the branch referenced earlier. For efficiency, the binary representation is first constructed as a string via iteration rather than recursively using `quote` macros. The @@ -268,9 +277,11 @@ string is then parsed as a type, returning an ast fragment. // nat_str(1024) // ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) fn nat_str(mut num: u64) -> String { + let path = "_"; let mut res: String; if num < 2 { - res = num.to_string(); + res = String::from_str(path); + res.push_str(num.to_string().as_slice()); } else { let mut bin = vec![]; while num > 0 { @@ -278,9 +289,11 @@ fn nat_str(mut num: u64) -> String { num >>= 1; } res = ::std::iter::repeat('(').take(bin.len() - 1).collect(); + res.push_str(path); res.push_str(bin.pop().unwrap().to_string().as_slice()); for b in bin.iter().rev() { res.push_str(", "); + res.push_str(path); res.push_str(b.to_string().as_slice()); res.push_str(")"); } @@ -364,33 +377,32 @@ fn nats() { ##### Optimization of `Expr`! -Because `Expr!` could be implemented as a plugin, the opportunity -would exist to perform various optimizations of type-level expressions -during expansion. 
Partial evaluation would be one approach to -this. Furthermore, expansion-time optimizations would not necessarily -be limited to simple arithmetic expressions but could be used for -other data like HLists. +Defining `Expr!` as a plugin would provide an opportunity to perform +various optimizations of more complex type-level expressions during +expansion. Partial evaluation would be one way to achieve +this. Furthermore, expansion-time optimizations wouldn't be limited to +arithmetic expressions but could be used for other data like HLists. -##### Native alternatives: types parameterized by constant values +##### Builtin alternatives: types parameterized by constant values -This example with type-level naturals is meant to illustrate the kind -of patterns macros in types enable. I am not suggesting the standard -libraries adopt _this particular_ representation as a means to address -the more general issue of lack of numeric parameterization for -types. There is +The example with type-level naturals serves to illustrate some of the +patterns type macros enable. This RFC is not intended to address the +lack of constant value type parameterization and type-level numerics +specifically. There is [another RFC here](https://github.com/rust-lang/rfcs/pull/884) which -does propose extending the type system to allow parameterization over -constants. +proposes extending the type system to address those issue. #### Conversion from HList to Tuple -With type macros, it is possible to write macros that convert back and -forth between tuples and HLists. This is very powerful because it lets -us reuse all of the operations we define for HLists (appending, taking -length, adding/removing items, computing permutations, etc.) on tuples -just by converting to HList, computing, then convert back to a tuple. +With type macros, it is possible to define conversions back and forth +between tuples and HLists. 
This is very powerful because it lets us +reuse at the level of tuples all of the recursive operations we can +define for HLists (appending, taking length, adding/removing items, +computing permutations, etc.). -The conversion can be implemented in the following fashion: +Conversions can be defined using macros/plugins and function +traits. Type macros are useful in this example for the associated type +`Output` and method return type in the traits. ```rust // type-level macro for HLists @@ -443,7 +455,7 @@ fn invoke_for_seq_upto_expand<'cx>( None }.and_then(|iterations| { - // generate a token tree: A0, ..., An + // generate a token tree: A0, …, An let mut ctx = range(0, iterations * 2 - 1).flat_map(|k| { if k % 2 == 0 { token::str_to_ident(format!("A{}", (k / 2)).as_slice()) @@ -532,48 +544,44 @@ fn test_to_hlist() { # Drawbacks There seem to be few drawbacks to implementing this feature as an -extension of the existing macro machinery. The change adds a very -small amount of additional complexity to the +extension of the existing macro machinery. The change adds a small +amount of additional complexity to the [parser](https://github.com/freebroccolo/rust/blob/e09cb32bcc04029dc4c16790e2aaa9811af27f25/src/libsyntax/parse/parser.rs#L1547-L1560) and [conversion](https://github.com/freebroccolo/rust/blob/e4b826b7afa1b5496b41ddaa1666014046ac5704/src/librustc_typeck/astconv.rs#L1301-L1303) -but the changes are almost negligible. +but the changes are minimal. As with all feature proposals, it is possible that designs for future -extensions to the macro system or type system might somehow interfere -with this functionality but it seems unlikely unless they are -significant, breaking changes. +extensions to the macro system or type system might interfere with +this functionality but it seems unlikely unless they are significant, +breaking changes. # Alternatives -There are no direct alternatives to my knowledge. 
Extensions to the -type system like data kinds, singletons, and various more elaborate -forms of staged programming (so-called CTFE) could conceivably cover -some cases where macros in types might otherwise be used. It is -unlikely they would provide the same level of functionality as macros, -particularly where plugins are concerned. Instead, such features would -probably benefit from type macros too. - -Not implementing this feature would mean disallowing some useful -programming patterns. There are some discussions in the community -regarding more extensive changes to the type system to address some of -these patterns. However, type macros along with associated types can -already accomplish many of the same things without the significant -engineering cost in terms of changes to the type system. Either way, -type macros would not prevent additional extensions. +There are no _direct_ alternatives. Extensions to the type system like +data kinds, singletons, and other forms of staged programming +(so-called CTFE) might alleviate the need for type macros in some +cases, however it is unlikely that they would provide a comprehensive +replacement, particularly where plugins are concerned. + +Not implementing this feature would mean not taking some reasonably +low-effort steps toward making certain programming patterns +easier. One potential consequence of this might be more pressure to +significantly extend the type system and other aspects of the language +to compensate. # Unresolved questions ## Alternative syntax for macro invocations in types -There is a question as to whether macros in types should allow `<` and -`>` as delimiters for invocations, e.g. `Foo!`. However, this would -raise a number of additional complications and is not necessary to +There is a question as to whether type macros should allow `<` and `>` +as delimiters for invocations, e.g. `Foo!`. 
This would raise a +number of additional complications and is probably not necessary to consider for this RFC. If deemed desirable by the community, this -functionality can be proposed separately. +functionality should be proposed separately. ## Hygiene and type macros -This RFC does not address the topic of hygiene regarding macros in -types. It is not clear to me whether there are issues here or not but -it may be worth considering in further detail. +This RFC also does not address the topic of hygiene regarding macros +in types. It is not clear whether there are issues here or not but it +may be worth considering in further detail. From 33ef3ebe48495e2426d7ca0b7e2be129f6dffc8e Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Thu, 26 Feb 2015 13:33:11 -0700 Subject: [PATCH 0405/1195] Reword/clarify motivation --- text/0000-type-macros.md | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index b0249df0cce..cb5957e2cb1 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -11,17 +11,21 @@ Allow macros in type positions Macros are currently allowed in syntax fragments for expressions, items, and patterns, but not for types. This RFC proposes to lift that -restriction for the following reasons: +restriction. -1. Increase generality of the macro system - the limitation should be - removed in order to promote generality and to enable use cases which - would otherwise require resorting either more elaborate plugins or - macros at the item-level. +1. This would allow macros to be used more flexibly, avoiding the + need for more complex item-level macros or plugins in some + cases. For example, when creating trait implementations with + macros, it is sometimes useful to be able to define the + associated types using a nested type macro but this is + currently problematic. + +2. 
Enable more programming patterns, particularly with respect to + type level programming. Macros in type positions provide + convenient way to express recursion and choice. It is possible + to do the same thing purely through programming with associated + types but the resulting code can be cumbersome to read and write. -2. Enable more programming patterns - macros in type positions provide - a means to express **recursion** and **choice** within types in a - fashion that is still legible. Associated types alone can accomplish - the former (recursion/choice) but not the latter (legibility). # Detailed design From a09508c1210b69b8d391e64b8f1f52276c737a81 Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Thu, 26 Feb 2015 13:44:01 -0700 Subject: [PATCH 0406/1195] Update links --- text/0000-type-macros.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index cb5957e2cb1..6ca8cbfc85f 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -550,9 +550,9 @@ fn test_to_hlist() { There seem to be few drawbacks to implementing this feature as an extension of the existing macro machinery. The change adds a small amount of additional complexity to the -[parser](https://github.com/freebroccolo/rust/blob/e09cb32bcc04029dc4c16790e2aaa9811af27f25/src/libsyntax/parse/parser.rs#L1547-L1560) +[parser](https://github.com/freebroccolo/rust/commit/a224739e92a3aa1febb67d6371988622bd141361) and -[conversion](https://github.com/freebroccolo/rust/blob/e4b826b7afa1b5496b41ddaa1666014046ac5704/src/librustc_typeck/astconv.rs#L1301-L1303) +[conversion](https://github.com/freebroccolo/rust/commit/9341232087991dee73713dc4521acdce11a799a2) but the changes are minimal. 
As with all feature proposals, it is possible that designs for future From fe51add9c928b03816453d45b2e12e743af48cdd Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Sat, 28 Feb 2015 13:52:51 -0700 Subject: [PATCH 0407/1195] Include additional details in code examples --- text/0000-type-macros.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 6ca8cbfc85f..9e342a444b2 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -57,11 +57,13 @@ length, adding/removing items, computing permutations, etc. Heterogeneous lists can be defined like so: ```rust +#[derive(Copy, Clone, Debug, Eq, Ord, PartialEq, PartialOrd)] struct Nil; // empty HList +#[derive(Copy, Clone, Debug, Eq, Ord, PartialEq, PartialOrd)] struct Cons(H, T); // cons cell of HList // trait to classify valid HLists -trait HList {} +trait HList: MarkerTrait {} impl HList for Nil {} impl HList for Cons {} ``` @@ -185,17 +187,17 @@ struct _0; // 0 bit struct _1; // 1 bit // classify valid bits -trait Bit {} +trait Bit: MarkerTrait {} impl Bit for _0 {} impl Bit for _1 {} // classify positive binary naturals -trait Pos {} +trait Pos: MarkerTrait {} impl Pos for _1 {} impl Pos for (P, B) {} // classify binary naturals with 0 -trait Nat {} +trait Nat: MarkerTrait {} impl Nat for _0 {} impl Nat for _1 {} impl Nat for (P, B) {} From 3dc778bf0733dbe057a89dfcfeec49536c922bde Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Sat, 28 Feb 2015 13:53:25 -0700 Subject: [PATCH 0408/1195] Modify `Expr!` to not need extra parentheses --- text/0000-type-macros.md | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 9e342a444b2..39ee64b449d 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -149,8 +149,10 @@ impl ops::Add for Cons w // type macro Expr allows us to expand the + operator appropriately macro_rules! 
Expr { - { $A:ty } => { $A }; - { $LHS:tt + $RHS:tt } => { >::Output }; + { ( $($LHS:tt)+ ) } => { Expr!($($LHS)+) }; + { HList ! [ $($LHS:tt)* ] + $($RHS:tt)+ } => { >::Output }; + { $LHS:tt + $($RHS:tt)+ } => { >::Output }; + { $LHS:ty } => { $LHS }; } // test demonstrating term level `xs + ys` and type level `Expr!(Xs + Ys)` @@ -164,10 +166,10 @@ fn test_append() { let xs: HList![&str, bool, Vec] = hlist!["foo", false, vec![]]; let ys: HList![u64, [u8; 3], ()] = hlist![0, [0, 1, 2], ()]; - // parentheses around compound types due to limitations in macro parsing; - // real implementation could use a plugin to avoid this - let zs: Expr!((HList![&str, bool, Vec]) + - (HList![u64, [u8; 3], ()])) + // demonstrate recursive expansion of Expr! + let zs: Expr!((HList![&str] + HList![bool] + HList![Vec]) + + (HList![u64] + HList![[u8; 3], ()]) + + HList![]) = aux(xs, ys); assert_eq!(zs, hlist!["foo", false, vec![], 0, [0, 1, 2], ()]) } From 337974fa29fb156ab355f1fd7833c1a772160c4f Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Sun, 1 Mar 2015 21:41:13 -0700 Subject: [PATCH 0409/1195] Renaming; fix examples to use MacEager --- text/0000-type-macros.md | 57 ++++++++++++++++++++++------------------ 1 file changed, 31 insertions(+), 26 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 39ee64b449d..ecdf229acbc 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -281,10 +281,14 @@ string via iteration rather than recursively using `quote` macros. The string is then parsed as a type, returning an ast fragment. 
```rust -// convert a u64 to a string representation of a type-level binary natural, e.g., -// nat_str(1024) -// ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) -fn nat_str(mut num: u64) -> String { +// Convert a u64 to a string representation of a type-level binary natural, e.g., +// ast_as_str(1024) +// ==> "(((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0)" +fn ast_as_str<'cx>( + ecx: &'cx base::ExtCtxt, + mut num: u64, + mode: Mode, +) -> String { let path = "_"; let mut res: String; if num < 2 { @@ -306,17 +310,18 @@ fn nat_str(mut num: u64) -> String { res.push_str(")"); } } - return res; + res } -// Generate a parser with the nat string for `num` as input -fn nat_str_parser<'cx>( - ecx: &'cx mut base::ExtCtxt, +// Generate a parser which uses the nat's ast-as-string as its input +fn ast_parser<'cx>( + ecx: &'cx base::ExtCtxt, num: u64, + mode: Mode, ) -> parse::parser::Parser<'cx> { let filemap = ecx .codemap() - .new_filemap(String::from_str(""), nat_str(num)); + .new_filemap(String::from_str(""), ast_as_str(ecx, num, mode)); let reader = lexer::StringReader::new( &ecx.parse_sess().span_diagnostic, filemap); @@ -326,16 +331,16 @@ fn nat_str_parser<'cx>( Box::new(reader)) } -// Try to parse an integer literal and return a new parser for its nat -// string; this is used to create both a type-level `Nat!` with -// `nat_ty_expand` and term-level `nat!` macro with `nat_tm_expand` -pub fn nat_lit_parser<'cx>( - ecx: &'cx mut base::ExtCtxt, +// Try to parse an integer literal and return a new parser which uses +// the nat's ast-as-string as its input +pub fn lit_parser<'cx>( + ecx: &'cx base::ExtCtxt, args: &[ast::TokenTree], + mode: Mode, ) -> Option> { - let mut litp = ecx.new_parser_from_tts(args); - if let ast::Lit_::LitInt(lit, _) = litp.parse_lit().node { - Some(nat_str_parser(ecx, lit)) + let mut lit_parser = ecx.new_parser_from_tts(args); + if let ast::Lit_::LitInt(lit, _) = lit_parser.parse_lit().node { + Some(ast_parser(ecx, lit, 
mode)) } else { None } @@ -344,15 +349,15 @@ pub fn nat_lit_parser<'cx>( // Expand Nat!(n) to a type-level binary nat where n is an int literal, e.g., // Nat!(1024) // ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) -pub fn nat_ty_expand<'cx>( +pub fn expand_ty<'cx>( ecx: &'cx mut base::ExtCtxt, span: codemap::Span, args: &[ast::TokenTree], ) -> Box { { - nat_lit_parser(ecx, args) - }.and_then(|mut natp| { - Some(base::MacTy::new(natp.parse_ty())) + lit_parser(ecx, args, Mode::Ty) + }.and_then(|mut ast_parser| { + Some(base::MacEager::ty(ast_parser.parse_ty())) }).unwrap_or_else(|| { ecx.span_err(span, "Nat!: expected an integer literal argument"); base::DummyResult::any(span) @@ -362,15 +367,15 @@ pub fn nat_ty_expand<'cx>( // Expand nat!(n) to a term-level binary nat where n is an int literal, e.g., // nat!(1024) // ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) -pub fn nat_tm_expand<'cx>( +pub fn expand_tm<'cx>( ecx: &'cx mut base::ExtCtxt, span: codemap::Span, args: &[ast::TokenTree], ) -> Box { { - nat_lit_parser(ecx, args) - }.and_then(|mut natp| { - Some(base::MacExpr::new(natp.parse_expr())) + lit_parser(ecx, args, Mode::Tm) + }.and_then(|mut ast_parser| { + Some(base::MacEager::expr(ast_parser.parse_expr())) }).unwrap_or_else(|| { ecx.span_err(span, "nat!: expected an integer literal argument"); base::DummyResult::any(span) @@ -487,7 +492,7 @@ fn invoke_for_seq_upto_expand<'cx>( } // splice the impl fragments into the ast - Some(base::MacItems::new(items.into_iter())) + Some(base::MacEager::items(SmallVector::many(items))) }).unwrap_or_else(|| { ecx.span_err(span, "invoke_for_seq_upto!: expected an integer literal argument"); From 327716b1dff2bd77c5d51659545bd7236bf43023 Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Sun, 1 Mar 2015 23:25:31 -0700 Subject: [PATCH 0410/1195] Modify macro example to match patterns --- text/0000-type-macros.md | 31 ++++++++++++++++++++++++------- 1 file changed, 24 insertions(+), 7 deletions(-) 
diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index ecdf229acbc..3b588cb174e 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -80,8 +80,16 @@ At the term-level, this is an easy fix using macros: // term-level macro for HLists macro_rules! hlist { {} => { Nil }; - { $head:expr } => { Cons($head, Nil) }; + {=> $($elem:tt),+ } => { hlist_pat!($($elem),+) }; { $head:expr, $($tail:expr),* } => { Cons($head, hlist!($($tail),*)) }; + { $head:expr } => { Cons($head, Nil) }; +} + +// term-level HLists in patterns +macro_rules! hlist_pat { + {} => { Nil }; + { $head:pat, $($tail:tt),* } => { Cons($head, hlist_pat!($($tail),*)) }; + { $head:pat } => { Cons($head, Nil) }; } let xs = hlist!["foo", false, vec![0u64]]; @@ -102,8 +110,16 @@ well. The complete example follows: // term-level macro for HLists macro_rules! hlist { {} => { Nil }; - { $head:expr } => { Cons($head, Nil) }; + {=> $($elem:tt),+ } => { hlist_pat!($($elem),+) }; { $head:expr, $($tail:expr),* } => { Cons($head, hlist!($($tail),*)) }; + { $head:expr } => { Cons($head, Nil) }; +} + +// term-level HLists in patterns +macro_rules! hlist_pat { + {} => { Nil }; + { $head:pat, $($tail:tt),* } => { Cons($head, hlist_pat!($($tail),*)) }; + { $head:pat } => { Cons($head, Nil) }; } // type-level macro for HLists @@ -428,15 +444,16 @@ macro_rules! HList { // term-level macro for HLists macro_rules! hlist { {} => { Nil }; - { $head:expr } => { Cons($head, Nil) }; + {=> $($elem:tt),+ } => { hlist_pat!($($elem),+) }; { $head:expr, $($tail:expr),* } => { Cons($head, hlist!($($tail),*)) }; + { $head:expr } => { Cons($head, Nil) }; } // term-level HLists in patterns -macro_rules! hlist_match { +macro_rules! 
hlist_pat { {} => { Nil }; - { $head:ident } => { Cons($head, Nil) }; - { $head:ident, $($tail:ident),* } => { Cons($head, hlist_match!($($tail),*)) }; + { $head:pat, $($tail:tt),* } => { Cons($head, hlist_pat!($($tail),*)) }; + { $head:pat } => { Cons($head, Nil) }; } // `invoke_for_seq_upto` is a `higher-order` macro that takes the name @@ -512,7 +529,7 @@ macro_rules! impl_to_tuple { type Output = ($($seq,)*); extern "rust-call" fn call(&self, (this,): (HList![$($seq),*],)) -> ($($seq,)*) { match this { - hlist_match![$($seq),*] => ($($seq,)*) + hlist![=> $($seq),*] => ($($seq,)*) } } } From 6e96a76b3ca1fda6847fff8df3ae4488be3a0d8b Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Mon, 9 Mar 2015 21:07:39 -0600 Subject: [PATCH 0411/1195] Remove HList/tuple conversion example --- text/0000-type-macros.md | 150 --------------------------------------- 1 file changed, 150 deletions(-) diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md index 3b588cb174e..c16751ddd35 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -421,156 +421,6 @@ specifically. There is [another RFC here](https://github.com/rust-lang/rfcs/pull/884) which proposes extending the type system to address those issue. -#### Conversion from HList to Tuple - -With type macros, it is possible to define conversions back and forth -between tuples and HLists. This is very powerful because it lets us -reuse at the level of tuples all of the recursive operations we can -define for HLists (appending, taking length, adding/removing items, -computing permutations, etc.). - -Conversions can be defined using macros/plugins and function -traits. Type macros are useful in this example for the associated type -`Output` and method return type in the traits. - -```rust -// type-level macro for HLists -macro_rules! 
HList { - {} => { Nil }; - { $head:ty } => { Cons<$head, Nil> }; - { $head:ty, $($tail:ty),* } => { Cons<$head, HList!($($tail),*)> }; -} - -// term-level macro for HLists -macro_rules! hlist { - {} => { Nil }; - {=> $($elem:tt),+ } => { hlist_pat!($($elem),+) }; - { $head:expr, $($tail:expr),* } => { Cons($head, hlist!($($tail),*)) }; - { $head:expr } => { Cons($head, Nil) }; -} - -// term-level HLists in patterns -macro_rules! hlist_pat { - {} => { Nil }; - { $head:pat, $($tail:tt),* } => { Cons($head, hlist_pat!($($tail),*)) }; - { $head:pat } => { Cons($head, Nil) }; -} - -// `invoke_for_seq_upto` is a `higher-order` macro that takes the name -// of another macro and a number and iteratively invokes the named -// macro with sequences of identifiers, e.g., -// -// invoke_for_seq_upto{ my_mac, 5 } -// ==> my_mac!{ A0, A1, A2, A3, A4 }; -// my_mac!{ A0, A1, A2, A3 }; -// my_mac!{ A0, A1, A2 }; -// ... -fn invoke_for_seq_upto_expand<'cx>( - ecx: &'cx mut base::ExtCtxt, - span: codemap::Span, - args: &[ast::TokenTree], -) -> Box { - let mut parser = ecx.new_parser_from_tts(args); - - // parse the macro name - let mac = parser.parse_ident(); - - // parse a comma - parser.expect(&token::Token::Comma); - - // parse the number of iterations - if let ast::Lit_::LitInt(lit, _) = parser.parse_lit().node { - Some(lit) - } else { - None - }.and_then(|iterations| { - - // generate a token tree: A0, …, An - let mut ctx = range(0, iterations * 2 - 1).flat_map(|k| { - if k % 2 == 0 { - token::str_to_ident(format!("A{}", (k / 2)).as_slice()) - .to_tokens(ecx) - .into_iter() - } else { - let span = codemap::DUMMY_SP; - let token = parse::token::Token::Comma; - vec![ast::TokenTree::TtToken(span, token)] - .into_iter() - } - }).collect::>(); - - // iterate over the ctx and generate impl syntax fragments - let mut items = vec![]; - let mut i = ctx.len(); - for _ in range(0, iterations) { - items.push(quote_item!(ecx, $mac!{ $ctx };).unwrap()); - i -= 2; - ctx.truncate(i); - } - - // 
splice the impl fragments into the ast - Some(base::MacEager::items(SmallVector::many(items))) - - }).unwrap_or_else(|| { - ecx.span_err(span, "invoke_for_seq_upto!: expected an integer literal argument"); - base::DummyResult::any(span) - }) -} - -pub struct ToHList; -pub struct ToTuple; - -// macro to implement conversion from hlist to tuple, -// e.g., ToTuple(hlist![…]) ==> (…,) -macro_rules! impl_to_tuple { - ($($seq:ident),*) => { - #[allow(non_snake_case)] - impl<$($seq,)*> Fn<(HList![$($seq),*],)> for ToTuple { - type Output = ($($seq,)*); - extern "rust-call" fn call(&self, (this,): (HList![$($seq),*],)) -> ($($seq,)*) { - match this { - hlist![=> $($seq),*] => ($($seq,)*) - } - } - } - } -} - -// macro to implement conversion from tuple to hlist, -// e.g., ToHList((…,)) ==> hlist![…] -macro_rules! impl_to_hlist { - ($($seq:ident),*) => { - #[allow(non_snake_case)] - impl<$($seq,)*> Fn<(($($seq,)*),)> for ToHList { - type Output = HList![$($seq),*]; - extern "rust-call" fn call(&self, (this,): (($($seq,)*),)) -> HList![$($seq),*] { - match this { - ($($seq,)*) => hlist![$($seq),*] - } - } - } - } -} - -// generate implementations up to length 32 -invoke_for_seq_upto!{ impl_to_tuple, 32 } -invoke_for_seq_upto!{ impl_to_hlist, 32 } - -// test converting an hlist to tuple -#[test] -fn test_to_tuple() { - assert_eq(ToTuple(hlist!["foo", true, (), vec![42u64]]), - ("foo", true, (), vec![42u64])) -} - -// test converting a tuple to hlist -#[test] -fn test_to_hlist() { - assert_eq(ToHList(("foo", true, (), vec![42u64])), - hlist!["foo", true, (), vec![42u64]]) -} -``` - # Drawbacks There seem to be few drawbacks to implementing this feature as an From bc720e64fca12a1e685a1bd3617acddebfc50339 Mon Sep 17 00:00:00 2001 From: Darin Morrison Date: Tue, 10 Mar 2015 19:28:48 -0600 Subject: [PATCH 0412/1195] Remove additional examples --- text/0000-type-macros.md | 236 +-------------------------------------- 1 file changed, 2 insertions(+), 234 deletions(-) diff --git 
a/text/0000-type-macros.md b/text/0000-type-macros.md index c16751ddd35..6f906903695 100644 --- a/text/0000-type-macros.md +++ b/text/0000-type-macros.md @@ -43,9 +43,7 @@ case for the `Ty_` enum so that the parser can indicate a macro invocation in a type position. In other words, `TyMac` is added to the ast and handled analogously to `ExprMac`, `ItemMac`, and `PatMac`. -## Examples - -### Heterogeneous Lists +## Example: Heterogeneous Lists Heterogeneous lists are one example where the ability to express recursion via type macros is very useful. They can be used as an @@ -136,7 +134,7 @@ Operations on HLists can be defined by recursion, using traits with associated type outputs at the type-level and implementation methods at the term-level. -The HList append operation is provided as an example. type macros are +The HList append operation is provided as an example. Type macros are used to make writing append at the type level (see `Expr!`) more convenient than specifying the associated type projection manually: @@ -191,236 +189,6 @@ fn test_append() { } ``` -### Additional Examples ### - -#### Type-level numerics - -Type-level numerics are another area where type macros can be -useful. 
The more common unary encodings (Peano numerals) are not -efficient enough to use in practice so we present an example -demonstrating binary natural numbers instead: - -```rust -struct _0; // 0 bit -struct _1; // 1 bit - -// classify valid bits -trait Bit: MarkerTrait {} -impl Bit for _0 {} -impl Bit for _1 {} - -// classify positive binary naturals -trait Pos: MarkerTrait {} -impl Pos for _1 {} -impl Pos for (P, B) {} - -// classify binary naturals with 0 -trait Nat: MarkerTrait {} -impl Nat for _0 {} -impl Nat for _1 {} -impl Nat for (P, B) {} -``` - -These can be used to index into tuples or HLists generically, either -by specifying the path explicitly (e.g., `(a, b, c).at::<(_1, _0)>() -==> c`) or by providing a singleton term with the appropriate type -`(a, b, c).at((_1, _0)) ==> c`. Indexing is linear time in the general -case due to recursion, but can be made constant time for a fixed -number of specialized implementations. - -Type-level numbers can also be used to define "sized" or "bounded" -data, such as a vector indexed by its length: - -```rust -struct LengthVec(Vec); -``` - -Similar to the indexing example, the parameter `N` can either serve as -phantom data, or such a struct could also include a term-level -representation of N as another field. - -In either case, a length-safe API could be defined for container types -like `Vec`. "Unsafe" indexing (without bounds checking) into the -underlying container would be safe in general because the length of -the container would be known statically and reflected in the type of -the length-indexed wrapper. 
- -We could imagine an idealized API in the following fashion: - -```rust -// push, adding one to the length -fn push(xs: LengthVec, x: A) -> LengthVec; - -// pop, subtracting one from the length -fn pop(xs: LengthVec, store: &mut A) -> LengthVec; - -// look up an element at an index -fn at(xs: LengthVec, index: M) -> A; - -// append, adding the individual lengths -fn append(xs: LengthVec, ys: LengthVec) -> LengthVec; - -// produce a length respecting iterator from an indexed vector -fn iter(xs: LengthVec) -> LengthIterator; -``` - -We can't write code like the above directly in Rust but we could -approximate it through type-level macros: - -```rust -// Expr! would expand + to Add::Output and integer constants to Nat!; see -// the HList append earlier in the RFC for a concrete example -Expr!(N + M) - ==> >::Output - -// Nat! would expand integer literals to type-level binary naturals -// and be implemented as a plugin for efficiency; see the following -// section for a concrete example -Nat!(4) - ==> ((_1, _0), _0) - -// `Expr!` and `Nat!` used for the LengthVec type: -LengthVec - ==> LengthVec>::Output> - ==> LengthVec>::Output> -``` - -##### Implementation of `Nat!` as a plugin - -The following code demonstrates concretely how `Nat!` can be -implemented as a plugin. As with the `HList!` example, this code (with -some additions) compiles and is usable with the type macros prototype -in the branch referenced earlier. - -For efficiency, the binary representation is first constructed as a -string via iteration rather than recursively using `quote` macros. The -string is then parsed as a type, returning an ast fragment. 
- -```rust -// Convert a u64 to a string representation of a type-level binary natural, e.g., -// ast_as_str(1024) -// ==> "(((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0)" -fn ast_as_str<'cx>( - ecx: &'cx base::ExtCtxt, - mut num: u64, - mode: Mode, -) -> String { - let path = "_"; - let mut res: String; - if num < 2 { - res = String::from_str(path); - res.push_str(num.to_string().as_slice()); - } else { - let mut bin = vec![]; - while num > 0 { - bin.push(num % 2); - num >>= 1; - } - res = ::std::iter::repeat('(').take(bin.len() - 1).collect(); - res.push_str(path); - res.push_str(bin.pop().unwrap().to_string().as_slice()); - for b in bin.iter().rev() { - res.push_str(", "); - res.push_str(path); - res.push_str(b.to_string().as_slice()); - res.push_str(")"); - } - } - res -} - -// Generate a parser which uses the nat's ast-as-string as its input -fn ast_parser<'cx>( - ecx: &'cx base::ExtCtxt, - num: u64, - mode: Mode, -) -> parse::parser::Parser<'cx> { - let filemap = ecx - .codemap() - .new_filemap(String::from_str(""), ast_as_str(ecx, num, mode)); - let reader = lexer::StringReader::new( - &ecx.parse_sess().span_diagnostic, - filemap); - parser::Parser::new( - ecx.parse_sess(), - ecx.cfg(), - Box::new(reader)) -} - -// Try to parse an integer literal and return a new parser which uses -// the nat's ast-as-string as its input -pub fn lit_parser<'cx>( - ecx: &'cx base::ExtCtxt, - args: &[ast::TokenTree], - mode: Mode, -) -> Option> { - let mut lit_parser = ecx.new_parser_from_tts(args); - if let ast::Lit_::LitInt(lit, _) = lit_parser.parse_lit().node { - Some(ast_parser(ecx, lit, mode)) - } else { - None - } -} - -// Expand Nat!(n) to a type-level binary nat where n is an int literal, e.g., -// Nat!(1024) -// ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) -pub fn expand_ty<'cx>( - ecx: &'cx mut base::ExtCtxt, - span: codemap::Span, - args: &[ast::TokenTree], -) -> Box { - { - lit_parser(ecx, args, Mode::Ty) - }.and_then(|mut ast_parser| { - 
Some(base::MacEager::ty(ast_parser.parse_ty())) - }).unwrap_or_else(|| { - ecx.span_err(span, "Nat!: expected an integer literal argument"); - base::DummyResult::any(span) - }) -} - -// Expand nat!(n) to a term-level binary nat where n is an int literal, e.g., -// nat!(1024) -// ==> (((((((((_1, _0), _0), _0), _0), _0), _0), _0), _0), _0) -pub fn expand_tm<'cx>( - ecx: &'cx mut base::ExtCtxt, - span: codemap::Span, - args: &[ast::TokenTree], -) -> Box { - { - lit_parser(ecx, args, Mode::Tm) - }.and_then(|mut ast_parser| { - Some(base::MacEager::expr(ast_parser.parse_expr())) - }).unwrap_or_else(|| { - ecx.span_err(span, "nat!: expected an integer literal argument"); - base::DummyResult::any(span) - }) -} - -#[test] -fn nats() { - let _: Nat!(42) = nat!(42); -} -``` - -##### Optimization of `Expr`! - -Defining `Expr!` as a plugin would provide an opportunity to perform -various optimizations of more complex type-level expressions during -expansion. Partial evaluation would be one way to achieve -this. Furthermore, expansion-time optimizations wouldn't be limited to -arithmetic expressions but could be used for other data like HLists. - -##### Builtin alternatives: types parameterized by constant values - -The example with type-level naturals serves to illustrate some of the -patterns type macros enable. This RFC is not intended to address the -lack of constant value type parameterization and type-level numerics -specifically. There is -[another RFC here](https://github.com/rust-lang/rfcs/pull/884) which -proposes extending the type system to address those issue. - # Drawbacks There seem to be few drawbacks to implementing this feature as an From c80a0dd638defd35fac30c14a3e4dc839ca7716a Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Thu, 23 Jul 2015 22:22:27 +0200 Subject: [PATCH 0413/1195] rename according to PR number. 
--- text/{0000-type-macros.md => 0873-type-macros.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename text/{0000-type-macros.md => 0873-type-macros.md} (100%) diff --git a/text/0000-type-macros.md b/text/0873-type-macros.md similarity index 100% rename from text/0000-type-macros.md rename to text/0873-type-macros.md From bc046ff10bc7551b63c4d851eae8abfa9a9e2412 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Thu, 23 Jul 2015 22:22:43 +0200 Subject: [PATCH 0414/1195] add header info --- text/0873-type-macros.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/0873-type-macros.md b/text/0873-type-macros.md index 6f906903695..884ce546643 100644 --- a/text/0873-type-macros.md +++ b/text/0873-type-macros.md @@ -1,7 +1,7 @@ -- Feature Name: Macros in type positions +- Feature Name: macros_in_type_positions - Start Date: 2015-02-16 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: rust-lang/rfcs#873 +- Rust Issue: rust-lang/rust#27245 # Summary From 3ae29c6e3d494df25525d3123539c875a9d03e06 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Thu, 23 Jul 2015 22:27:44 +0200 Subject: [PATCH 0415/1195] Add entry to README for RFC #873. --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 8b32f51b4dd..83914eb588c 100644 --- a/README.md +++ b/README.md @@ -44,6 +44,7 @@ the direction the language is evolving in. * [0771-std-iter-once.md](text/0771-std-iter-once.md) * [0803-type-ascription.md](text/0803-type-ascription.md) * [0809-box-and-in-for-stdlib.md](text/0809-box-and-in-for-stdlib.md) +* [0873-type-macros.md](text/0873-type-macros.md) * [0888-compiler-fence-intrinsics.md](text/0888-compiler-fence-intrinsics.md) * [0909-move-thread-local-to-std-thread.md](text/0909-move-thread-local-to-std-thread.md) * [0911-const-fn.md](text/0911-const-fn.md) From ef4a671393f84dd23c25f536a568abcc1c342957 Mon Sep 17 00:00:00 2001 From: "Felix S. 
Klock II" Date: Thu, 23 Jul 2015 22:44:37 +0200 Subject: [PATCH 0416/1195] remove file that was artifact of my attempt to show RFC #873 as merged. --- text/0000-type-macros.md | 235 --------------------------------------- 1 file changed, 235 deletions(-) delete mode 100644 text/0000-type-macros.md diff --git a/text/0000-type-macros.md b/text/0000-type-macros.md deleted file mode 100644 index 6f906903695..00000000000 --- a/text/0000-type-macros.md +++ /dev/null @@ -1,235 +0,0 @@ -- Feature Name: Macros in type positions -- Start Date: 2015-02-16 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) - -# Summary - -Allow macros in type positions - -# Motivation - -Macros are currently allowed in syntax fragments for expressions, -items, and patterns, but not for types. This RFC proposes to lift that -restriction. - -1. This would allow macros to be used more flexibly, avoiding the - need for more complex item-level macros or plugins in some - cases. For example, when creating trait implementations with - macros, it is sometimes useful to be able to define the - associated types using a nested type macro but this is - currently problematic. - -2. Enable more programming patterns, particularly with respect to - type level programming. Macros in type positions provide - convenient way to express recursion and choice. It is possible - to do the same thing purely through programming with associated - types but the resulting code can be cumbersome to read and write. - - -# Detailed design - -## Implementation - -The proposed feature has been prototyped at -[this branch](https://github.com/freebroccolo/rust/commits/feature/type_macros). The -implementation is straightforward and the impact of the changes are -limited in scope to the macro system. Type-checking and other phases -of compilation should be unaffected. 
- -The most significant change introduced by this feature is a -[`TyMac`](https://github.com/freebroccolo/rust/blob/f8f8dbb6d332c364ecf26b248ce5f872a7a67019/src/libsyntax/ast.rs#L1274-L1275) -case for the `Ty_` enum so that the parser can indicate a macro -invocation in a type position. In other words, `TyMac` is added to the -ast and handled analogously to `ExprMac`, `ItemMac`, and `PatMac`. - -## Example: Heterogeneous Lists - -Heterogeneous lists are one example where the ability to express -recursion via type macros is very useful. They can be used as an -alternative to or in combination with tuples. Their recursive -structure provide a means to abstract over arity and to manipulate -arbitrary products of types with operations like appending, taking -length, adding/removing items, computing permutations, etc. - -Heterogeneous lists can be defined like so: - -```rust -#[derive(Copy, Clone, Debug, Eq, Ord, PartialEq, PartialOrd)] -struct Nil; // empty HList -#[derive(Copy, Clone, Debug, Eq, Ord, PartialEq, PartialOrd)] -struct Cons(H, T); // cons cell of HList - -// trait to classify valid HLists -trait HList: MarkerTrait {} -impl HList for Nil {} -impl HList for Cons {} -``` - -However, writing HList terms in code is not very convenient: - -```rust -let xs = Cons("foo", Cons(false, Cons(vec![0u64], Nil))); -``` - -At the term-level, this is an easy fix using macros: - -```rust -// term-level macro for HLists -macro_rules! hlist { - {} => { Nil }; - {=> $($elem:tt),+ } => { hlist_pat!($($elem),+) }; - { $head:expr, $($tail:expr),* } => { Cons($head, hlist!($($tail),*)) }; - { $head:expr } => { Cons($head, Nil) }; -} - -// term-level HLists in patterns -macro_rules! hlist_pat { - {} => { Nil }; - { $head:pat, $($tail:tt),* } => { Cons($head, hlist_pat!($($tail),*)) }; - { $head:pat } => { Cons($head, Nil) }; -} - -let xs = hlist!["foo", false, vec![0u64]]; -``` - -Unfortunately, this solution is incomplete because we have only made -HList terms easier to write. 
HList types are still inconvenient: - -```rust -let xs: Cons<&str, Cons, Nil>>> = hlist!["foo", false, vec![0u64]]; -``` - -Allowing type macros as this RFC proposes would allows us to be -able to use Rust's macros to improve writing the HList type as -well. The complete example follows: - -```rust -// term-level macro for HLists -macro_rules! hlist { - {} => { Nil }; - {=> $($elem:tt),+ } => { hlist_pat!($($elem),+) }; - { $head:expr, $($tail:expr),* } => { Cons($head, hlist!($($tail),*)) }; - { $head:expr } => { Cons($head, Nil) }; -} - -// term-level HLists in patterns -macro_rules! hlist_pat { - {} => { Nil }; - { $head:pat, $($tail:tt),* } => { Cons($head, hlist_pat!($($tail),*)) }; - { $head:pat } => { Cons($head, Nil) }; -} - -// type-level macro for HLists -macro_rules! HList { - {} => { Nil }; - { $head:ty } => { Cons<$head, Nil> }; - { $head:ty, $($tail:ty),* } => { Cons<$head, HList!($($tail),*)> }; -} - -let xs: HList![&str, bool, Vec] = hlist!["foo", false, vec![0u64]]; -``` - -Operations on HLists can be defined by recursion, using traits with -associated type outputs at the type-level and implementation methods -at the term-level. - -The HList append operation is provided as an example. Type macros are -used to make writing append at the type level (see `Expr!`) more -convenient than specifying the associated type projection manually: - -```rust -use std::ops; - -// nil case for HList append -impl ops::Add for Nil { - type Output = Ys; - - fn add(self, rhs: Ys) -> Ys { - rhs - } -} - -// cons case for HList append -impl ops::Add for Cons where - Xs: ops::Add, -{ - type Output = Cons; - - fn add(self, rhs: Ys) -> Cons { - Cons(self.0, self.1 + rhs) - } -} - -// type macro Expr allows us to expand the + operator appropriately -macro_rules! Expr { - { ( $($LHS:tt)+ ) } => { Expr!($($LHS)+) }; - { HList ! 
[ $($LHS:tt)* ] + $($RHS:tt)+ } => { >::Output }; - { $LHS:tt + $($RHS:tt)+ } => { >::Output }; - { $LHS:ty } => { $LHS }; -} - -// test demonstrating term level `xs + ys` and type level `Expr!(Xs + Ys)` -#[test] -fn test_append() { - fn aux(xs: Xs, ys: Ys) -> Expr!(Xs + Ys) where - Xs: ops::Add - { - xs + ys - } - let xs: HList![&str, bool, Vec] = hlist!["foo", false, vec![]]; - let ys: HList![u64, [u8; 3], ()] = hlist![0, [0, 1, 2], ()]; - - // demonstrate recursive expansion of Expr! - let zs: Expr!((HList![&str] + HList![bool] + HList![Vec]) + - (HList![u64] + HList![[u8; 3], ()]) + - HList![]) - = aux(xs, ys); - assert_eq!(zs, hlist!["foo", false, vec![], 0, [0, 1, 2], ()]) -} -``` - -# Drawbacks - -There seem to be few drawbacks to implementing this feature as an -extension of the existing macro machinery. The change adds a small -amount of additional complexity to the -[parser](https://github.com/freebroccolo/rust/commit/a224739e92a3aa1febb67d6371988622bd141361) -and -[conversion](https://github.com/freebroccolo/rust/commit/9341232087991dee73713dc4521acdce11a799a2) -but the changes are minimal. - -As with all feature proposals, it is possible that designs for future -extensions to the macro system or type system might interfere with -this functionality but it seems unlikely unless they are significant, -breaking changes. - -# Alternatives - -There are no _direct_ alternatives. Extensions to the type system like -data kinds, singletons, and other forms of staged programming -(so-called CTFE) might alleviate the need for type macros in some -cases, however it is unlikely that they would provide a comprehensive -replacement, particularly where plugins are concerned. - -Not implementing this feature would mean not taking some reasonably -low-effort steps toward making certain programming patterns -easier. One potential consequence of this might be more pressure to -significantly extend the type system and other aspects of the language -to compensate. 
- -# Unresolved questions - -## Alternative syntax for macro invocations in types - -There is a question as to whether type macros should allow `<` and `>` -as delimiters for invocations, e.g. `Foo!`. This would raise a -number of additional complications and is probably not necessary to -consider for this RFC. If deemed desirable by the community, this -functionality should be proposed separately. - -## Hygiene and type macros - -This RFC also does not address the topic of hygiene regarding macros -in types. It is not clear whether there are issues here or not but it -may be worth considering in further detail. From 9e2391f0685cb6d9c98b62436f6a6a767b6f9da7 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Thu, 23 Jul 2015 23:46:50 +0200 Subject: [PATCH 0417/1195] fix links in header for RFC #873. --- text/0873-type-macros.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0873-type-macros.md b/text/0873-type-macros.md index 884ce546643..bab40b17042 100644 --- a/text/0873-type-macros.md +++ b/text/0873-type-macros.md @@ -1,7 +1,7 @@ - Feature Name: macros_in_type_positions - Start Date: 2015-02-16 -- RFC PR: rust-lang/rfcs#873 -- Rust Issue: rust-lang/rust#27245 +- RFC PR: [rust-lang/rfcs#873](https://github.com/rust-lang/rfcs/pull/873) +- Rust Issue: [rust-lang/rust#27245](https://github.com/rust-lang/rust/issues/27245) # Summary From 64e6ee880c8663b30b579fb7c5c8713eeed9d4c7 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 24 Jul 2015 09:04:56 -0700 Subject: [PATCH 0418/1195] RFC 1193 is --cap-lints on the compiler --- text/{0000-cap-lints.md => 1193-cap-lints.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-cap-lints.md => 1193-cap-lints.md} (96%) diff --git a/text/0000-cap-lints.md b/text/1193-cap-lints.md similarity index 96% rename from text/0000-cap-lints.md rename to text/1193-cap-lints.md index ce71a1a9537..efac4c0689d 100644 --- a/text/0000-cap-lints.md +++ b/text/1193-cap-lints.md @@ 
-1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2015-07-07 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1193](https://github.com/rust-lang/rfcs/pull/1193) +- Rust Issue: [rust-lang/rust#27259](https://github.com/rust-lang/rust/issues/27259) # Summary From 756182825dd97ca2760391f6375332ed7282f122 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 24 Jul 2015 09:05:22 -0700 Subject: [PATCH 0419/1195] Remove the executable bit on an RFC --- text/0401-coercions.md | 0 1 file changed, 0 insertions(+), 0 deletions(-) mode change 100755 => 100644 text/0401-coercions.md diff --git a/text/0401-coercions.md b/text/0401-coercions.md old mode 100755 new mode 100644 From b6e22b90d9993366bea8abb946065aa6e87a62b0 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 24 Jul 2015 15:05:10 -0400 Subject: [PATCH 0420/1195] RFC #1191 is HIR --- text/{0000-hir.md => 1191-hir.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-hir.md => 1191-hir.md} (97%) diff --git a/text/0000-hir.md b/text/1191-hir.md similarity index 97% rename from text/0000-hir.md rename to text/1191-hir.md index be8ef62c1f2..1c3c2dd3f87 100644 --- a/text/0000-hir.md +++ b/text/1191-hir.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2015-07-06 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1191](https://github.com/rust-lang/rfcs/pull/1191) +- Rust Issue: N/A # Summary From f0559c9ba5b0d6f0783cdd26019742764a770d7b Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 24 Jul 2015 16:37:00 -0700 Subject: [PATCH 0421/1195] RFC: Stabilize catch_panic with no bounds Stabilize `std::thread::catch_panic` after removing the `Send` and `'static` bounds from the closure parameter. 
[Rendered](https://github.com/alexcrichton/rfcs/blob/stabilize-catch-panic/text/0000-stabilize-catch-panic.md) --- text/0000-stabilize-catch-panic.md | 365 +++++++++++++++++++++++++++++ 1 file changed, 365 insertions(+) create mode 100644 text/0000-stabilize-catch-panic.md diff --git a/text/0000-stabilize-catch-panic.md b/text/0000-stabilize-catch-panic.md new file mode 100644 index 00000000000..cfba8094c21 --- /dev/null +++ b/text/0000-stabilize-catch-panic.md @@ -0,0 +1,365 @@ +- Feature Name: `catch_panic` +- Start Date: 2015-07-24 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Stabilize `std::thread::catch_panic` after removing the `Send` and `'static` +bounds from the closure parameter. + +# Motivation + +It is currently undefined behavior for a Rust program to panic +across an FFI boundary. For example, if C calls into Rust and Rust panics, then +this is undefined behavior. The purpose of the unstable `thread::catch_panic` +function is to solve this problem by enabling you to catch a panic in Rust +before control flow is returned back over to C. As a refresher, the signature of +the function looks like: + +```rust +fn catch_panic<F, R>(f: F) -> thread::Result<R> + where F: FnOnce() -> R + Send + 'static +``` + +This function will run the closure `f` and if it panics return `Err(Box<Any + Send>)`. +If the closure doesn't panic it will return `Ok(val)` where `val` is the +returned value of the closure. Most of these aspects "pretty much make sense", +but an odd part about this signature is the `Send` and `'static` bounds on the +closure provided. At a high level, these two bounds are intended to mitigate +problems related to something many programmers call "exception safety". To +understand why, let's first briefly review exception safety in Rust.
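
The FFI use case described above can be sketched in current Rust. Note that this is an illustration under stated assumptions, not the API proposed here: `thread::catch_panic` is not available on stable Rust, so the sketch swaps in `std::panic::catch_unwind`, which has the same "run a closure, get `Err` on panic" shape. The function name `rust_callback` and the `i32::MIN` failure sentinel are hypothetical choices made only for this example.

```rust
use std::panic;

// Hypothetical callback that C code might invoke. The name and the
// "i32::MIN means failure" convention are assumptions of this sketch.
pub extern "C" fn rust_callback(divisor: i32) -> i32 {
    // `100 / divisor` panics when `divisor` is zero. Catching the panic
    // here keeps it from unwinding across the `extern "C"` boundary.
    let result = panic::catch_unwind(|| 100 / divisor);
    match result {
        Ok(value) => value,
        // The quotient can never be i32::MIN, so the sentinel is unambiguous.
        Err(_) => i32::MIN,
    }
}

fn main() {
    // Normal path: the closure's value comes back through `Ok`.
    assert_eq!(rust_callback(4), 25);
    // Panic path: the panic is turned into the sentinel instead of unwinding.
    assert_eq!(rust_callback(0), i32::MIN);
}
```

The panic message is still printed to stderr by the default panic hook; only the unwinding is stopped at the boundary.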
+ +### Exception Safety + +The problem of exception safety often plagues C++ programmers (and programmers +in other languages), and it essentially means that code needs to be ready to handle +exceptional control flow. For Rust this means that code needs to be prepared to +handle panics as any function call can cause a thread to panic. What this +largely boils down to is that a block of code has only one entry point but +possibly many exit points. For example: + +```rust +let mut foo = true; +bar(); +foo = false; +``` + +It may be intuitive to say that when this block of code returns, `foo`'s value is +always `false`. If, however, the `bar` function panics, then the block of code +will "return" (because of unwinding), but the value of `foo` is still `true`. +Let's take a look at a more harmful example to see how this can go wrong: + +```rust +use std::ptr; + +pub fn push_ten_more<T: Clone>(v: &mut Vec<T>, t: T) { + unsafe { + v.reserve(10); + let len = v.len(); + v.set_len(len + 10); + for i in 0..10 { + ptr::write(v.as_mut_ptr().offset((len + i) as isize), t.clone()); + } + } +} +``` + +While this code may look correct, it's actually not memory safe. If the type +`T`'s `clone` method panics, then this vector will point to uninitialized data. +We extended the vector's length (the call to `set_len`) before we actually wrote +the necessary data, so if a call to `clone` panics it will cause the vector's +destructor to attempt to destroy an uninitialized instance of `T`. + +The problem with this code is that it's not **exception safe**. A small +restructuring can help it become exception safe (e.g. call `set_len` after +`ptr::write`), but it's not always considered when writing code. + +### Catching Exceptions + +Dealing with exception safety is typically more involved in other languages +because of `catch` blocks. The core problem here is that shared state in the +"try" block and the "catch" block can end up getting corrupted.
Because a panic +can happen at any time, data is often not left in a consistent state when one occurs, and the +catch (or finally) block will then read this corrupt data. + +Rust has not had to deal with this problem much because there's no stable way to +catch a panic in a thread. The `catch_panic` function proposed in this RFC, +however, is exactly this. To see how this function is not making Rust memory +unsafe, let's take a look at how memory safety and exception safety interact. + +### Exception Safety and Memory Safety + +If this is the first time you've ever heard about exception safety, this may +sound pretty bad! Chances are you haven't considered how Rust code can "exit" at +many points in a function beyond just the points where you wrote down `return`. +The good news is that Rust by default **is still memory safe** in the face of +this exception safety problem. + +All safe code in Rust is guaranteed to not cause any memory unsafety due to a +panic. There is never any invalid intermediate state which can then be read due +to a destructor running on a panic. It's possible for a **logical** invariant to +be violated as a result of a panic, however. For example, if a structure +guarantees that its field `foo` is always an even integer, it may be odd +temporarily while a helper function is called and if that panics then the +logical guarantee is no longer valid. + +As we've also seen, however, it's possible to cause memory unsafety through +panics when dealing with `unsafe` code. The key part of this is that you have to +have `unsafe` somewhere to inject the memory unsafety, and you largely just need +to worry about exception safety in the confines of an unsafe block. + +### Exception Safety in Rust + +Rust does not provide many primitives today to deal with exception safety, but +it's a situation you'll see handled in many locations when browsing unsafe +collections-related code, for example.
One case where Rust does help you with +this is an aspect of Mutexes called [**poisoning**][poison]. + +[poison]: http://doc.rust-lang.org/std/sync/struct.Mutex.html#poisoning + +Poisoning is a mechanism for propagating panics among threads to ensure that +inconsistent state is not read. A mutex becomes poisoned if a thread holds the +lock and then panics. Most usage of a mutex simply `unwrap`s the result of +`lock()`, causing a panic in one thread to be propagated to all others that are +reachable. + +A key design aspect of poisoning, however, is that you can opt out of poisoning. +The `Err` variant of the [`lock` method] provides the ability to gain access to +the mutex anyway. As explained above, exception safety can only lead to memory +unsafety when intermingled with unsafe code. This means that fundamentally +poisoning a Mutex is **not** guaranteeing memory safety, and hence getting +access to a poisoned mutex is not an unsafe operation. + +[`lock` method]: http://doc.rust-lang.org/std/sync/struct.Mutex.html#method.lock + +Exception safety is rarely considered when writing code in Rust, so the standard +library strives to help out as much as possible when it can. Poisoning mutexes +is a good example of this where ignoring panics in remote threads means that +mutexes could very commonly contain corrupted data (not memory unsafe, just +logically corrupt). There's typically an opt-out to these mechanisms, but by +default the standard library provides them. + +### `Send` and `'static` on `catch_panic` + +Alright, now that we've got a bit of background, let's explore why these bounds +were originally added to the `catch_panic` function. It was thought that these +two bounds would provide basically the same level of exception safety protection +that spawning a new thread does (e.g. today this requires both of these bounds). +This in theory meant that the addition of `catch_panic` to the standard library +would not exacerbate the concerns of exception safety.
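
The thread boundary and the poisoning opt-out described above can be seen together in a small sketch. It uses only `std::thread::spawn`, `JoinHandle::join`, `Mutex::lock`, and `PoisonError::into_inner`; the helper name `poison_demo` and its return tuple are hypothetical, chosen only for this illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns a thread that panics while holding the lock, then reports
/// `(join_failed, lock_poisoned, recovered_value)`.
/// Hypothetical helper, written only for this sketch.
fn poison_demo() -> (bool, bool, i32) {
    let shared = Arc::new(Mutex::new(0));
    let worker_copy = Arc::clone(&shared);

    // `thread::spawn` requires the closure to be `Send + 'static`,
    // the same two bounds discussed above.
    let handle = thread::spawn(move || {
        let mut guard = worker_copy.lock().unwrap();
        *guard = 1;
        // Unwinding drops `guard` here, which poisons the mutex.
        panic!("worker failed while holding the lock");
    });

    // The panic is reported to the parent as `Err` from `join`.
    let join_failed = handle.join().is_err();

    // Opting out of poisoning: the `Err` variant still hands over the guard.
    let (lock_poisoned, value) = match shared.lock() {
        Ok(guard) => (false, *guard),
        Err(poisoned) => (true, *poisoned.into_inner()),
    };
    (join_failed, lock_poisoned, value)
}

fn main() {
    let (join_failed, lock_poisoned, value) = poison_demo();
    assert!(join_failed);
    assert!(lock_poisoned);
    // The write made before the panic is still visible: logically suspect
    // data, but reading it is not memory unsafe.
    assert_eq!(value, 1);
}
```

Reading the value through `into_inner` needs no `unsafe`, which matches the point that poisoning guards logical consistency rather than memory safety.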
+ +It [was discovered][cp-issue], however, that TLS can be used to bypass this +theoretical "this is the same as spawning a thread" boundary. Using TLS means +that you can share non-`Send` data across the `catch_panic` boundary, meaning +the caller of `catch_panic` may see invalid state. + +[cp-issue]: https://github.com/rust-lang/rust/issues/25662 + +As a result, these two bounds have been called into question, and this RFC is +recommending removing both bounds from the `catch_panic` function. + +### Is `catch_panic` unsafe? + +With the removal of the two bounds on this function, we can freely share state +across a "panic boundary". This means that we don't always know for sure if +arbitrary data is corrupted or not. As we've seen above, however, if we're only +dealing with safe Rust then this will not lead to memory unsafety. For memory +unsafety to happen it would require interaction with `unsafe` code at which +point the `unsafe` code is responsible for dealing with exception safety. + +The standard library has a clear definition for what functions are `unsafe`, and +it's precisely those which can lead to memory unsafety in otherwise safe Rust. +Because that is not the case for `catch_panic` it will not be declared as an +`unsafe` function. + +### What about other bounds? + +It has been discussed that there may be possible other bounds or mitigation +strategies for `catch_panic` (to help with the TLS problem described above), and +although it's somewhat unclear as to what this may precisely mean it's still the +case that the standard library will want a `catch_panic` with no bounds in +*some* form or another. + +The standard library is providing the lowest-level tools to create robust APIs, +and inevitably it should not forbid patterns that are safe. Rust itself does +this via the `unsafe` subset by allowing you to build up a safe abstraction on +unsafe underpinnings. 
Similarly any bound on `catch_panic` will eventually be +too restrictive for someone even though their usage is 100% safe. As a result +the standard library will always want (and was always going to have) a no-bounds +version of this function. + +As a result this RFC proposes not attempting to jump through hoops to find a more +restrictive, but more helpful with exception safety, set of bounds for this +function and instead stabilizing the no-bounds version. + +# Detailed design + +Stabilize `std::thread::catch_panic` after removing the `Send` and `'static` +bounds from the closure parameter, modifying the signature to be: + +```rust +fn catch_panic<F, R>(f: F) -> thread::Result<R> where F: FnOnce() -> R +``` + +# Drawbacks + +A major drawback of this RFC is that it can mitigate Rust's error handling +story. On one hand this function can be seen as adding exceptions to Rust as +it's now possible to both throw (panic) and catch (`catch_panic`). The track +record of exceptions in languages like C++, Java, and Python hasn't been great, +and a drawing point of Rust for many has been the lack of exceptions. To help +understand what's going on, let's go through a brief overview of error handling +in Rust today: + +### Result vs Panic + +There are two primary strategies for signaling that a function can fail in Rust +today: + +* `Result`s represent errors/edge-cases that the author of the library knew + about, and expects the consumer of the library to handle. +* `panic`s represent errors that the author of the library did not expect to + occur, and therefore does not expect the consumer to handle in any particular + way. + +Another way to put this division is that: + +* `Result`s represent errors that carry additional contextual information.
This + information allows them to be handled by the caller of the function producing + the error, modified with additional contextual information, and eventually + converted into an error message fit for a human consumer of the top-level + program. +* `panic`s represent errors that carry no contextual information (except, + perhaps, debug information). Because they represented an unexpected error, + they cannot be easily handled by the caller of the function or presented to a + human consumer of the top-level program (except to say "something unexpected + has gone wrong"). + +Some pros of `Result` are that it signals specific edge cases that you as a +consumer should think about handling and it allows the caller to decide +precisely how to handle the error. A con with `Result` is that defining errors +and writing down `Result` + `try!` is not always the most ergonomic. + +The pros and cons of `panic` are essentially the opposite of `Result`, being +easy to use (nothing to write down other than the panic) but difficult to +determine when a panic can happen or handle it in a custom fashion. + +### Result? Or panic? + +These divisions justify the use of `panic`s for things like out-of-bounds +indexing: such an error represents a programming mistake that (1) the author of +the library was not aware of, by definition, and (2) cannot be easily handled by +the caller, except perhaps to indicate to the human user that an unexpected +error has occurred. + +In terms of heuristics for use: + +* `panic`s should rarely if ever be used to report errors that occurred through + communication with the system or through IO. For example, if a Rust program + shells out to `rustc`, and `rustc` is not found, it might be tempting to use a + panic, because the error is unexpected and hard to recover from. 
However, a + human consumer of the program would benefit from intermediate code adding + contextual information about the in-progress operation, and the program could + report the error in terms a human can understand. While the error is rare, + **when it happens it is not a programmer error**. +* Assertions can produce `panic`s, because the programmer is saying that if the + assertion fails, it means that they have made an unexpected mistake. + +In short, if it would make sense to report an error as a context-free `500 +Internal Server Error` or a red "an unknown error has occurred" message in all cases, it's +an appropriate panic. + +Another key reason to choose `Result` over a panic is that the compiler is +likely to soon grow an option to map a panic to an abort. This is motivated by +portability, compile time, binary size, and a number of other factors, but it +fundamentally means that a library which signals errors via panics (and relies +on consumers using `catch_panic`) will not be usable in this context. + +### Will Rust have exceptions? + +After reviewing the cases for `Result` and `panic`, there's still clearly a +niche that both of these two systems are filling, so it's not the case that we +want to scrap one for the other. Rust will indeed have the ability to catch +exceptions to a greater extent than it does today with this RFC, but idiomatic +Rust will continue to follow the above rules for when to use a panic vs a result. + +It's likely that the `catch_panic` function will only be used where it's +absolutely necessary, like FFI boundaries, instead of as a general-purpose error +handling mechanism in all code. + +# Alternatives + +One alternative, which is somewhat more of an addition, is to have the standard +library entirely abandon all exception safety mitigation tactics.
As explained +in the motivation section, exception safety will not lead to memory unsafety +unless paired with unsafe code, so it is perhaps within the realm of possibility +to remove the tactics of poisoning from mutexes and simply require that +consumers deal with exception safety 100%. + +This alternative is often motivated by saying that there are holes in our +poisoning story or the problem space is too large to tackle via targeted APIs. +This section will look a little bit more in detail about what's going on here. + +For the purpose of this discussion, let's use the term *dangerous* to +refer to code that can produce problems related to exception safety. Exception +safety means we're exposing the following possibly dangerous situation: + +> Dangerous code allows code that uses interior mutability to be interrupted in +> the process of making a mutation, and then allow other code to see the +> incomplete change. + +Today, most Rust code is protected from this danger from two angles: + +* If a piece of code acquires interior mutability through &mut and a panic + occurs, that panic will propagate through the owner of the original value. + Since there can be no outstanding & references to the same value, nobody can + see the incomplete change. +* If a piece of code acquires interior mutability through Mutex and a + panic occurs, attempts by another thread to read the value through + normal means will propagate the panic. + +There are areas in Rust that are not covered by these cases: + +* RefCell (especially with destructors) allows code to get access to a value + with an incomplete change. +* Generally speaking, destructors can observe an incomplete change. +* The Mutex API provides an alternate mechanism of reading a value with an + incomplete change. +* The proposed `catch_panic` API allows the propagation of panics to a boundary + that does not have any ownership restrictions. 
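The uncovered cases listed above can be made concrete with a small sketch in which a panic boundary lets later code see a half-finished update inside a `RefCell`. It rests on several assumptions: it uses `std::panic::catch_unwind` with `AssertUnwindSafe`, the API that later shipped in the standard library, as a stand-in for the `catch_panic` function proposed here (the signatures differ), and `Pair`, `update`, and `observe_after_panic` are hypothetical names invented for the illustration.

```rust
use std::cell::RefCell;
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical type whose intended invariant is `a == b`.
struct Pair {
    a: i32,
    b: i32,
}

/// Updates both fields, optionally panicking between the two writes.
fn update(pair: &RefCell<Pair>, fail_midway: bool) {
    let mut p = pair.borrow_mut();
    p.a += 1;
    if fail_midway {
        panic!("interrupted between the two writes");
    }
    p.b += 1;
}

/// Runs `update` behind a panic boundary and reports what is visible afterwards.
fn observe_after_panic() -> (i32, i32) {
    let pair = RefCell::new(Pair { a: 0, b: 0 });

    // A closure borrowing a `RefCell` is not `UnwindSafe`, so it is wrapped
    // in `AssertUnwindSafe` to satisfy the bound on `catch_unwind`.
    let result = panic::catch_unwind(AssertUnwindSafe(|| update(&pair, true)));
    assert!(result.is_err());

    // The mutable borrow was released during unwinding and `RefCell` has no
    // poisoning, so the incomplete change is freely observable here.
    let p = pair.borrow();
    (p.a, p.b)
}

fn main() {
    let (a, b) = observe_after_panic();
    println!("a = {}, b = {} (invariant a == b holds: {})", a, b, a == b);
}
```

Nothing here is memory unsafe, but the `a == b` invariant is visibly broken after the panic is caught, which is exactly the "incomplete change" that the quoted definition of dangerous code describes.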
+ +One open question that this affects: + +* Should a theoretical `Thread::scoped` API propagate panics? + +Looking at these cases that aren't covered in Rust by default, and assuming that +`Thread::scoped` propagates panics by default (with an analogous API to +`PoisonError::into_inner`), we get a table that looks like: + +![img](https://www.evernote.com/l/AAJdvryuzOVFrakUiK6i0IBASP7wysYHN0sB/image.png) + +The main point here is that although this problem space seems sprawling, it is, +in reality, restricted to interior mutability. Enumerating the "dangerous" APIs +seems to be a tractable problem. Calling `RefCell` and `catch_panic` "dangerous" +(with the incomplete mutation problem) would not be problematic. `Mutex` or +`Thread::scoped` would not be dangerous because of the benefits associated with +detecting panics across threads, and this aligns with the table above. Note that +implementations of Drop, because they run during stack unwinding, should be +considered "dangerous" for the purposes of this summary. + +It may not be surprising that the threaded APIs ended up being protected via +APIs, because this kind of sharing is fundamental to threaded code. Making +them "dangerous" would make almost anything you would want to do with threads +"dangerous", and instead we ask users to learn about the danger only when they +try to access the possibly dangerous data. + +In contrast, both `RefCell` and `catch_panic` are more niche tools, making it +reasonable to ask users to learn about the danger when they begin using the +tools in the first place, and then making the access more ergonomic. Despite +these tools being labeled "dangerous", there are strategies to mitigate this, such as +building abstractions on top of these primitives which only use `RefCell` or +`catch_panic` as an implementation detail. These higher-level abstractions will +have fewer edge cases and risks associated with them. + +# Unresolved questions + +None currently.
From 6e97b1a2dcf12b19d7ebf08e0ebfd1f6eb8aef34 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Wed, 15 Jul 2015 15:14:03 +1200 Subject: [PATCH 0422/1195] Update the RFC process with sub-teams, amongst other things. The libs section was authored by @Gankro, see #1213 and [this discuss thread](https://internals.rust-lang.org/t/the-life-and-death-of-an-api/2087) for previous discussion. We're currently missing guideline for the tools sub-team. I just didn't have anything to hand, hopefully we can remedy that in a followup PR. --- README.md | 169 ++++++++++++++++++++++++-------------------- compiler_changes.md | 47 ++++++++++++ lang_changes.md | 36 ++++++++++ libs_changes.md | 115 ++++++++++++++++++++++++++++++ 4 files changed, 292 insertions(+), 75 deletions(-) create mode 100644 compiler_changes.md create mode 100644 lang_changes.md create mode 100644 libs_changes.md diff --git a/README.md b/README.md index 8b32f51b4dd..b49d4c1b0a2 100644 --- a/README.md +++ b/README.md @@ -8,7 +8,7 @@ implemented and reviewed via the normal GitHub pull request workflow. Some changes though are "substantial", and we ask that these be put through a bit of a design process and produce a consensus among the Rust -community and the [core team]. +community and the [sub-team]s. The "RFC" (request for comments) process is intended to provide a consistent and controlled path for new features to enter the language @@ -75,19 +75,21 @@ the direction the language is evolving in. * [RFC Postponement] * [Help this is all too informal!] + ## When you need to follow this process [When you need to follow this process]: #when-you-need-to-follow-this-process -You need to follow this process if you intend to make "substantial" -changes to Rust, Cargo, Crates.io, or the RFC process itself. What constitutes -a "substantial" change is evolving based on community norms, but may include -the following. 
+You need to follow this process if you intend to make "substantial" changes to +Rust, Cargo, Crates.io, or the RFC process itself. What constitutes a +"substantial" change is evolving based on community norms and varies depending +on what part of the ecosystem you are proposing to change, but may include the +following. - Any semantic or syntactic change to the language that is not a bugfix. - Removing language features, including those that are feature-gated. - - Changes to the interface between the compiler and libraries, -including lang items and intrinsics. - - Additions to `std` + - Changes to the interface between the compiler and libraries, including lang + items and intrinsics. + - Additions to `std`. Some changes do not require an RFC: @@ -103,6 +105,15 @@ If you submit a pull request to implement a new feature without going through the RFC process, it may be closed with a polite request to submit an RFC first. +For more details on when an RFC is required, please see the following specific +guidelines, these correspond with some of the Rust community's +[sub-teams](http://www.rust-lang.org/team.html): + +* [language changes](lang_changes.md), +* [library changes](libs_changes.md), +* [compiler changes](compiler_changes.md). + + ## Before creating an RFC [Before creating an RFC]: #before-creating-an-rfc @@ -125,12 +136,12 @@ on the [RFC issue tracker][issues], and occasionally posting review. As a rule of thumb, receiving encouraging feedback from long-standing -project developers, and particularly members of the [core team][core] +project developers, and particularly members of the relevant [sub-team] is a good indication that the RFC is worth pursuing. 
[issues]: https://github.com/rust-lang/rfcs/issues [discuss]: http://discuss.rust-lang.org/ -[core]: https://github.com/rust-lang/rust/wiki/Note-core-team + ## What the process is [What the process is]: #what-the-process-is @@ -141,49 +152,55 @@ is 'active' and may be implemented with the goal of eventual inclusion into Rust. * Fork the RFC repo http://github.com/rust-lang/rfcs -* Copy `0000-template.md` to `text/0000-my-feature.md` (where -'my-feature' is descriptive. don't assign an RFC number yet). -* Fill in the RFC. Put care into the details: RFCs that do not -present convincing motivation, demonstrate understanding of the -impact of the design, or are disingenuous about the drawbacks or -alternatives tend to be poorly-received. -* Submit a pull request. As a pull request the RFC will receive design -feedback from the larger community, and the author should be prepared -to revise it in response. -* During Rust triage, the pull request will either be closed (for RFCs -that clearly will not be accepted) or assigned a *shepherd*. The -shepherd is a trusted developer who is familiar with the process, who -will help to move the RFC forward, and ensure that the right people -see and review it. -* Build consensus and integrate feedback. RFCs that have broad support -are much more likely to make progress than those that don't receive -any comments. The shepherd assigned to your RFC should help you get -feedback from Rust developers as well. +* Copy `0000-template.md` to `text/0000-my-feature.md` (where 'my-feature' is +descriptive. don't assign an RFC number yet). +* Fill in the RFC. Put care into the details: RFCs that do not present +convincing motivation, demonstrate understanding of the impact of the design, or +are disingenuous about the drawbacks or alternatives tend to be poorly-received. +* Submit a pull request. As a pull request the RFC will receive design feedback +from the larger community, and the author should be prepared to revise it in +response. 
+* Each pull request will be labeled with the most relevant [sub-team]. +* Each sub-team triages its RFC PRs. The sub-team will either close the PR +(for RFCs that clearly will not be accepted) or assign it a *shepherd*. The +shepherd is a trusted developer who is familiar with the RFC process, who will +help to move the RFC forward, and ensure that the right people see and review +it. +* Build consensus and integrate feedback. RFCs that have broad support are much +more likely to make progress than those that don't receive any comments. The +shepherd assigned to your RFC should help you get feedback from Rust developers +as well. * The shepherd may schedule meetings with the author and/or relevant -stakeholders to discuss the issues in greater detail, and in some -cases the topic may be discussed at the larger [weekly meeting]. In -either case a summary from the meeting will be posted back to the RFC -pull request. -* Once both proponents and opponents have clarified and defended -positions and the conversation has settled, the shepherd will take it -to the [core team] for a final decision. -* Eventually, someone from the [core team] will either accept the RFC -by merging the pull request, assigning the RFC a number (corresponding -to the pull request number), at which point the RFC is 'active', or -reject it by closing the pull request. +stakeholders to discuss the issues in greater detail. +* The sub-team will discuss the RFC PR, as much as possible in the comment +thread of the PR itself. Offline discussion will be summarized on the PR comment +thread. +* Once both proponents and opponents have clarified and defended positions and +the conversation has settled, the RFC will enter its *final comment period* +(FCP). This is a final opportunity for the community to comment on the PR and is +a reminder for all members of the sub-team to be aware of the RFC. +* The FCP lasts one week. It may be extended if consensus between sub-team +members cannot be reached.
At the end of the FCP, the [sub-team] will either +accept the RFC by merging the pull request, assigning the RFC a number +(corresponding to the pull request number), at which point the RFC is 'active', +or reject it by closing the pull request. How exactly the sub-team decide on an +RFC is up to the sub-team. + ## The role of the shepherd [The role of the shepherd]: #the-role-of-the-shepherd -During triage, every RFC will either be closed or assigned a shepherd. -The role of the shepherd is to move the RFC through the process. This -starts with simply reading the RFC in detail and providing initial -feedback. The shepherd should also solicit feedback from people who -are likely to have strong opinions about the RFC. Finally, when this -feedback has been incorporated and the RFC seems to be in a steady -state, the shepherd will bring it to the meeting. In general, the idea -here is to "front-load" as much of the feedback as possible before the -point where we actually reach a decision. +During triage, every RFC will either be closed or assigned a shepherd from the +relevant sub-team. The role of the shepherd is to move the RFC through the +process. This starts with simply reading the RFC in detail and providing initial +feedback. The shepherd should also solicit feedback from people who are likely +to have strong opinions about the RFC. When this feedback has been incorporated +and the RFC seems to be in a steady state, the shepherd and/or sub-team leader +will announce an FCP. In general, the idea here is to "front-load" as much of +the feedback as possible before the point where we actually reach a decision - +by the end of the FCP, the decision on whether or not to accept the RFC should +be obvious from the RFC discussion thread. + ## The RFC life-cycle [The RFC life-cycle]: #the-rfc-life-cycle @@ -205,35 +222,36 @@ through to completion: authors should not expect that other project developers will take on responsibility for implementing their accepted feature. 
-Modifications to active RFC's can be done in followup PR's. We strive +Modifications to active RFC's can be done in follow-up PR's. We strive to write each RFC in a manner that it will reflect the final design of the feature; but the nature of the process means that we cannot expect every merged RFC to actually reflect what the end result will be at -the time of the next major release; therefore we try to keep each RFC -document somewhat in sync with the language feature as planned, -tracking such changes via followup pull requests to the document. +the time of the next major release. + +In general, once accepted, RFCs should not be substantially changed. Only very +minor changes should be submitted as amendments. More substantial changes should +be new RFCs, with a note added to the original RFC. Exactly what counts as a +"very minor change" is up to the sub-team to decide. There are some more +specific guidelines in the sub-team RFC guidelines for the [language](lang_changes.md), +[libraries](libs_changes.md), and [compiler](compiler_changes.md). -An RFC that makes it through the entire process to implementation is -considered 'complete' and is moved to the 'complete' folder; an RFC -that fails after becoming active is 'inactive' and moves to the -'inactive' folder. ## Reviewing RFC's [Reviewing RFC's]: #reviewing-rfcs While the RFC PR is up, the shepherd may schedule meetings with the author and/or relevant stakeholders to discuss the issues in greater -detail, and in some cases the topic may be discussed at the larger -[weekly meeting]. In either case a summary from the meeting will be +detail, and in some cases the topic may be discussed at a sub-team +meeting. In either case a summary from the meeting will be posted back to the RFC pull request. -The core team makes final decisions about RFCs after the benefits and -drawbacks are well understood. 
These decisions can be made at any -time, but the core team will regularly issue decisions on at least a -weekly basis. When a decision is made, the RFC PR will either be -merged or closed, in either case with a comment describing the -rationale for the decision. The comment should largely be a summary of -discussion already on the comment thread. +A sub-team makes final decisions about RFCs after the benefits and drawbacks are +well understood. These decisions can be made at any time, but the sub-team will +regularly issue decisions. When a decision is made, the RFC PR will either be +merged or closed, in either case with a comment describing the rationale for the +decision. The comment should largely be a summary of discussion already on the +comment thread. + ## Implementing an RFC [Implementing an RFC]: #implementing-an-rfc @@ -243,7 +261,7 @@ implemented right away. Other accepted RFC's can represent features that can wait until some arbitrary developer feels like doing the work. Every accepted RFC has an associated issue tracking its implementation in the Rust repository; thus that associated issue can -be assigned a priority via the [triage process] that the team uses for +be assigned a priority via the triage process that the team uses for all issues in the Rust repository. The author of an RFC is not obligated to implement it. Of course, the @@ -254,15 +272,18 @@ If you are interested in working on the implementation for an 'active' RFC, but cannot determine if someone else is already working on it, feel free to ask (e.g. by leaving a comment on the associated issue). + ## RFC Postponement [RFC Postponement]: #rfc-postponement -Some RFC pull requests are tagged with the 'postponed' label when they -are closed (as part of the rejection process). 
An RFC closed with -“postponed” is marked as such because we want neither to think about -evaluating the proposal nor about implementing the described feature -until after the next major release, and we believe that we can afford -to wait until then to do so. +Some RFC pull requests are tagged with the 'postponed' label when they are +closed (as part of the rejection process). An RFC closed with “postponed” is +marked as such because we want neither to think about evaluating the proposal +nor about implementing the described feature until some time in the future, and +we believe that we can afford to wait until then to do so. Historically, +"postponed" was used to postpone features until after 1.0. Postponed PRs may be +re-opened when the time is right. We don't have any formal process for that, you +should ask members of the relevant sub-team. Usually an RFC pull request marked as “postponed” has already passed an informal first round of evaluation, namely the round of “do we @@ -280,6 +301,4 @@ present circumstances. As usual, we are trying to let the process be driven by consensus and community norms, not impose more structure than necessary. -[core team]: https://github.com/mozilla/rust/wiki/Note-core-team -[triage process]: https://github.com/rust-lang/rust/wiki/Note-development-policy#milestone-and-priority-nomination-and-triage -[weekly meeting]: https://github.com/rust-lang/meeting-minutes +[sub-team]: http://www.rust-lang.org/team.html diff --git a/compiler_changes.md b/compiler_changes.md new file mode 100644 index 00000000000..45990e6a37b --- /dev/null +++ b/compiler_changes.md @@ -0,0 +1,47 @@ +# RFC policy - the compiler + +We have not previously had an RFC system for compiler changes, so policy here is +likely to change as we get the hang of things. We don't want to slow down most +compiler development, but on the other hand we do want to do more design work +ahead of time on large additions and refactorings. 
+ +Compiler RFCs will be managed by the compiler sub-team, and tagged `T-compiler`. +The compiler sub-team will do an initial triage of new PRs within a week of +submission. The result of triage will either be that the PR is assigned to a +member of the sub-team for shepherding, the PR is closed because the sub-team +believe it should be done without an RFC, or closed because the sub-team feel it +should clearly not be done and further discussion is not necessary. We'll follow +the standard procedure for shepherding, final comment period, etc. + +Where there is significant design work for the implementation of a language +feature, the preferred workflow is to submit two RFCs - one for the language +design and one for the implementation design. The implementation RFC may be +submitted later if there is scope for large changes to the language RFC. + + +## Changes which need an RFC + +* Large refactorings or redesigns of the compiler +* Changing the API presented to syntax extensions or other compiler plugins in + non-trivial ways +* Adding, removing, or changing a stable compiler flag +* The implementation of new language features where there is significant change + or addition to the compiler +* Any other change which causes backwards incompatible changes to stable + behaviour of the compiler, language, or libraries + + +## Changes which don't need an RFC + +* Bug fixes, improved error messages, etc. 
+* Minor refactoring/tidying up +* Implementing language features which have an accepted RFC, where the + implementation does not significantly change the compiler or require + significant new design work +* Adding unstable API for tools (note that all compiler API is currently unstable) +* Adding, removing, or changing an unstable compiler flag (if the compiler flag + is widely used there should be at least some discussion on discuss, or an RFC + in some cases) + +If in doubt it is probably best to just announce the change you want to make to +the compiler sub-team on discuss or IRC, and see if anyone feels it needs an RFC. diff --git a/lang_changes.md b/lang_changes.md new file mode 100644 index 00000000000..7e7e6a732e7 --- /dev/null +++ b/lang_changes.md @@ -0,0 +1,36 @@ +# RFC policy - language design + +Pretty much every change to the language needs an RFC. + +Language RFCs are managed by the language sub-team, and tagged `T-lang`. The +language sub-team will do an initial triage of new PRs within a week of +submission. The result of triage will either be that the PR is assigned to a +member of the sub-team for shepherding, the PR is closed as postponed because +the sub-team believe it might be a good idea, but is not currently aligned with +Rust's priorities, or the PR is closed because the sub-team feel it should +clearly not be done and further discussion is not necessary. In the latter two +cases, the sub-team will give a detailed explanation. We'll follow the standard +procedure for shepherding, final comment period, etc. + + +## Amendments + +Sometimes in the implementation of an RFC, changes are required. In general +these don't require an RFC as long as they are very minor and in the spirit of +the accepted RFC (essentially bug fixes). In this case implementers should +submit an RFC PR which amends the accepted RFC with the new details.
Although +the RFC repository is not intended as a reference manual, it is preferred that +RFCs do reflect what was actually implemented. Amendment RFCs will go through +the same process as regular RFCs, but should be less controversial and thus +should move more quickly. + +When a change is more dramatic, it is better to create a new RFC. The RFC should +be standalone and reference the original, rather than modifying the existing +RFC. You should add a comment to the original RFC referencing the new RFC +as part of the PR. + +Obviously there is some scope for judgment here. As a guideline, if a change +affects more than one part of the RFC (i.e., is a non-local change), affects the +applicability of the RFC to its motivating use cases, or there are multiple +possible new solutions, then the feature is probably not 'minor' and should get +a new RFC. diff --git a/libs_changes.md b/libs_changes.md new file mode 100644 index 00000000000..eb18ed5d271 --- /dev/null +++ b/libs_changes.md @@ -0,0 +1,115 @@ +# RFC guidelines - libraries sub-team + +# Motivation + +* RFCs are heavyweight: + * RFCs generally take at minimum 2 weeks from posting to land. In + practice it can be more on the order of months for particularly + controversial changes. + * RFCs are a lot of effort to write; especially for non-native speakers or + for members of the community whose strengths are more technical than literary. + * RFCs may involve pre-RFCs and several rewrites to accommodate feedback. + * RFCs require a dedicated shepherd to herd the community and author towards + consensus. + * RFCs require review from a majority of the subteam, as well as an official + vote. + * RFCs can't be downgraded based on their complexity. Full process always applies. + Easy RFCs may certainly land faster, though. + * RFCs can be very abstract and hard to grok the consequences of (no implementation).
+ +* PRs are low *overhead* but potentially expensive nonetheless: + * Easy PRs can get insta-merged by any rust-lang contributor. + * Harder PRs can be easily escalated. You can ping subject-matter experts for second + opinions. Ping the whole team! + * Easier to grok the full consequences. Lots of tests and Crater to save the day. + * PRs can be accepted optimistically with bors, buildbot, and the trains to guard + us from major mistakes making it into stable. The size of the nightly community + at this point in time can still mean major community breakage regardless of trains, + however. + * HOWEVER: Big PRs can be a lot of work to make only to have that work rejected for + details that could have been hashed out first. *This is the motivation for + having RFCs*. + +* RFCs are *only* meaningful if a significant and diverse portion of the +community actively participates in them. The official teams are not +sufficiently diverse to establish meaningful community consensus by agreeing +amongst themselves. + +* If there are *tons* of RFCs -- especially trivial ones -- people are less +likely to engage with them. Official team members are super busy. Domain experts +and industry professionals are super busy *and* have no responsibility to engage +in RFCs. Since these are *exactly* the most important people to get involved in +the RFC process, it is important that we be maximally friendly towards their +needs. + + +# Is an RFC required? + +The overarching philosophy is: *do whatever is easiest*. If an RFC +would be less work than an implementation, that's a good sign that an RFC is +necessary. That said, if you anticipate controversy, you might want to short-circuit +straight to an RFC. For instance new APIs almost certainly merit an RFC. Especially +as `std` has become more conservative in favour of the much more agile cargoverse. 
+ +* **Submit a PR** if the change is a: + * Bugfix + * Docfix + * Obvious API hole patch, such as adding an API from one type to a symmetric type. + e.g. `Vec<T> -> Box<[T]>` clearly motivates adding `String -> Box<str>` + * Minor tweak to an unstable API (renaming, generalizing) + * Implementing an "obvious" trait like Clone/Debug/etc +* **Submit an RFC** if the change is a: + * New API + * Semantic Change to a stable API + * Generalization of a stable API (e.g. how we added Pattern or Borrow) + * Deprecation of a stable API + * Nontrivial trait impl (because all trait impls are insta-stable) +* **Do the easier thing** if uncertain. (choosing a path is not final) + + +# Non-RFC process + +* A (non-RFC) PR is likely to be **closed** if clearly not acceptable: + * Disproportionate breaking change (small inference breakage may be acceptable) + * Unsound + * Doesn't fit our general design philosophy around the problem + * Better as a crate + * Too marginal for std + * Significant implementation problems + +* A PR may also be closed because an RFC is appropriate. + +* A (non-RFC) PR may be **merged as unstable**. In this case, the feature +should have a fresh feature gate and an associated tracking issue for +stabilisation. Note that trait impls and docs are insta-stable and thus have no +tracking issue. This may imply requiring a higher level of scrutiny for such +changes. + +However, an accepted RFC is not a rubber-stamp for merging an implementation PR. +Nor must an implementation PR perfectly match the RFC text. Implementation details +may merit deviations, though obviously they should be justified. The RFC may be +amended if deviations are substantial, but amendments are not generally necessary. RFCs should +favour immutability. The RFC + Issue + PR should form a total explanation of the +current implementation. + +* Once something has been merged as unstable, a shepherd should be assigned + to promote and obtain feedback on the design.
+ +* Once the API has been unstable for at least one full cycle (6 weeks), + the shepherd (or any library sub-team member) may nominate an API for a + *final comment period* of another cycle. Feedback and other comments should be + posted to the tracking issue. This should be publicized. + +* After the final comment period, an API should ideally take one of two paths: + * **Stabilize** if the change is desired, and consensus is reached + * **Deprecate** is the change is undesired, and consensus is reached + * **Extend the FCP** is the change cannot meet consensus + * If consensus *still* can't be reached, consider requiring a new RFC or + just deprecating as "too controversial for std". + +* If any problems are found with a newly stabilized API during its beta period, + *strongly* favour reverting stability in order to prevent stabilizing a bad + API. Due to the speed of the trains, this is not a serious delay (~2-3 months + if it's not a major problem). + + From bda70c8989378958bbeafa01ed6aab81a809bc17 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 27 Jul 2015 17:28:24 -0700 Subject: [PATCH 0423/1195] Mild tweaks here and there --- text/0000-stabilize-catch-panic.md | 86 ++++++++++++++++++++---------- 1 file changed, 58 insertions(+), 28 deletions(-) diff --git a/text/0000-stabilize-catch-panic.md b/text/0000-stabilize-catch-panic.md index cfba8094c21..5ab1d68bb4c 100644 --- a/text/0000-stabilize-catch-panic.md +++ b/text/0000-stabilize-catch-panic.md @@ -10,12 +10,21 @@ bounds from the closure parameter. # Motivation -It is currently defined as undefined behavior to have a Rust program panic -across an FFI boundary. For example if C calls into Rust and Rust panics, then -this is undefined behavior. The purpose of the unstable `thread::catch_panic` -function is to solve this problem by enabling you to catch a panic in Rust -before control flow is returned back over to C. 
As a refresher, the signature of -the function looks like: +In today's Rust it's not currently possible to catch a panic by design. There +are a number of situations, however, where catching a panic is either required +for correctness or necessary for building a useful abstraction: + +* It is currently defined as undefined behavior to have a Rust program panic + across an FFI boundary. For example if C calls into Rust and Rust panics, then + this is undefined behavior. Being able to catch a panic will allow writing + robust C apis in Rust. +* Abstactions like thread pools want to catch the panics of tasks being run + instead of having the thread torn down (and having to spawn a new thread). + +The purpose of the unstable `thread::catch_panic` function is to solve these +problems by enabling you to catch a panic in Rust before control flow is +returned back over to C. As a refresher, the signature of the function looks +like: ```rust fn catch_panic<F, R>(f: F) -> thread::Result<R> @@ -34,10 +43,14 @@ understand why let's first briefly review exception safety in Rust. The problem of exception safety often plagues many C++ programmers (and other languages), and it essentially means that code needs to be ready to handle -exceptional control flow. For Rust this means that code needs to be prepared to -handle panics as any function call can cause a thread to panic. What this -largely boils down to is that a block of code having only one entry point but -possibly many exit points. For example: +exceptional control flow. This primarily matters when an invariant is +temporarily broken in a region of code which can have exceptional control flow. +What this largely boils down to is that a block of code having only one entry +point but possibly many exit points, and invariants need to be upheld on all +exit points. + +For Rust this means that code needs to be prepared to handle panics as any +unknown function call can cause a thread to panic.
For example: ```rust let mut foo = true; @@ -46,9 +59,10 @@ foo = false; ``` It may intuitive to say that this block of code returns that `foo`'s value is -always `false`. If, however, the `bar` function panics, then the block of code -will "return" (because of unwinding), but the value of `foo` is still `true`. -Let's take a look at a more harmful example to see how this can go wrong: +always `false` (e.g. a local invariant of ours). If, however, the `bar` function +panics, then the block of code will "return" (because of unwinding), but the +value of `foo` is still `true`. Let's take a look at a more harmful example to +see how this can go wrong: ``` pub fn push_ten_more<T: Clone>(v: &mut Vec<T>, t: T) { @@ -65,26 +79,42 @@ pub fn push_ten_more<T: Clone>(v: &mut Vec<T>, t: T) { While this code may look correct, it's actually not memory safe. If the type `T`'s `clone` method panics, then this vector will point to uninitialized data. -We extended the vector's length (the call to `set_len`) before we actually wrote -the necessary data, so if a call to `clone` panics it will cause the vector's -destructor to attempt to destroy an uninitialized instance of `T`. - -The problem with this code is that it's not **exception safe**. A small -restructuring can help it become exception safe (e.g. call `set_len` after -`ptr::write`), but it's not always considered when writing code. +`Vec` has an internal invariant that the first `len` elements are safe to drop +at any time, and we have broken that invariant temporarily with a call to +`set_len`. If a call to `clone` panics then we'll exit this block before +reaching the end, causing the invariant breakage to be leaked. + +The problem with this code is that it's not **exception safe**. There are a +number of common strategies to help mitigate this problem: + +* Use a "finally" block or some other equivalent mechanism to restore invariants + on all exit paths.
In Rust this typically manifests itself as a destructor on + a structure as the compiler will ensure that this is run whenever a panic + happens. +* Avoid calling code which can panic (e.g. functions with assertions or + functions with statically unknown implementations) whenever an invariant is + broken. + +In our example of `push_ten_more` we can take the second round of avoiding code +which can panic when an invariant is broken. If we call `set_len` on each +iteration of the loop with `len + i` then the vector's invariant will always bee +respected. ### Catching Exceptions -Dealing with exception safety is typically more involved in other languages -because of `catch` blocks. The core problem here is that shared state in the -"try" block and the "catch" block can end up getting corrupted. Due to a panic -possibly happening at any time, data may not often prepare for the panic and the -catch (or finally) block will then read this corrupt data. +In languages with `catch` blocks exception unsafe code can often cause problems +more frequently. The core problem here is that shared state in the "try" block +and the "catch" block can end up getting corrupted. Due to a panic possibly +happening at any time, data may not often prepare for the panic and the catch +(or finally) block will then read this corrupt data. Rust has not had to deal with this problem much because there's no stable way to -catch a panic and in a thread. The `catch_panic` function proposed in this RFC, -however, is exactly this. To see how this function is not making Rust memory -unsafe, let's take a look at how memory safety and exception safety interact. +catch a panic. One primary area this comes up is dealing with cross-thread +panics, and the standard library poisons mutexes and rwlocks by default to help +deal with this situation. The `catch_panic` function proposed in this RFC, +however, is exactly "catch for Rust". 
To see how this function is not making +Rust memory unsafe, let's take a look at how memory safety and exception safety +interact. ### Exception Safety and Memory Safety From 9ca55ad48a46051e1f2143c9eb49c261cd2bae18 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 27 Jul 2015 17:50:12 -0700 Subject: [PATCH 0424/1195] Reword exception safety and memory safety --- text/0000-stabilize-catch-panic.md | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/text/0000-stabilize-catch-panic.md b/text/0000-stabilize-catch-panic.md index 5ab1d68bb4c..46c61c93008 100644 --- a/text/0000-stabilize-catch-panic.md +++ b/text/0000-stabilize-catch-panic.md @@ -126,16 +126,17 @@ this exception safety problem. All safe code in Rust is guaranteed to not cause any memory unsafety due to a panic. There is never any invalid intermediate state which can then be read due -to a destructor running on a panic. It's possible for a **logical** invariant to -be violated as a result of a panic, however. For example if a structure -guarantees that its field `foo` is always an even integer, it may be odd -temporarily while a helper function is called and if that panics then the -logical guarantee is no longer valid. - -As we've also seen, however, it's possible to cause memory unsafety through -panics when dealing with `unsafe` code. The key part of this is that you have to -have `unsafe` somewhere to inject the memory unsafety, and you largely just need -to worry about exception safety in the confines of an unsafe block. +to a destructor running on a panic. As we've also seen, however, it's possible +to cause memory unsafety through panics when dealing with `unsafe` code. The key +part of this is that you have to have `unsafe` somewhere to inject the memory +unsafety, and you largely just need to worry about exception safety in the +context of unsafe code. 
+ +Even though mixing safe Rust and panics cannot cause undefined behavior, it's +possible for a **logical** invariant to be violated as a result of a panic. +These sorts of situations can often become serious bugs and are difficult to +audit for, so it means that exception safety in Rust is unfortunately not a +situation that can be completely sidestepped. ### Exception Safety in Rust From cfa5f9be6b586c2172ef600ea592febeb1bb68c3 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 27 Jul 2015 21:09:24 -0700 Subject: [PATCH 0425/1195] Fix typo --- text/0000-stabilize-catch-panic.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-stabilize-catch-panic.md b/text/0000-stabilize-catch-panic.md index 46c61c93008..63502a90ac1 100644 --- a/text/0000-stabilize-catch-panic.md +++ b/text/0000-stabilize-catch-panic.md @@ -58,7 +58,7 @@ bar(); foo = false; ``` -It may intuitive to say that this block of code returns that `foo`'s value is +It may be intuitive to say that this block of code returns that `foo`'s value is always `false` (e.g. a local invariant of ours). If, however, the `bar` function panics, then the block of code will "return" (because of unwinding), but the value of `foo` is still `true`. Let's take a look at a more harmful example to From a75b13a3840f6f64a7ed8c54e9e287d2bc88ae60 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Tue, 28 Jul 2015 19:56:10 +0200 Subject: [PATCH 0426/1195] RFC to replace `in PLACE { BLOCK }` with `PLACE <- EXPR`. 
--- text/0000-placement-left-arrow.md | 120 ++++++++++++++++++++++++++++++ 1 file changed, 120 insertions(+) create mode 100644 text/0000-placement-left-arrow.md diff --git a/text/0000-placement-left-arrow.md b/text/0000-placement-left-arrow.md new file mode 100644 index 00000000000..e565cd58fac --- /dev/null +++ b/text/0000-placement-left-arrow.md @@ -0,0 +1,120 @@ +- Feature Name: place_left_arrow_syntax +- Start Date: 2015-07-28 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Rather than trying to find a clever syntax for placement-new that leverages +the `in` keyword, instead use the syntax `PLACE_EXPR <- VALUE_EXPR`. + +This takes advantage of the fact that `<-` was reserved as a token via +historical accident (that for once worked out in our favor). + +# Motivation + +One sentence: the syntax `a <- b` is short, can be parsed without +ambiguity, and is strongly connotated already with assignment. + +Further text (essentially historical background): + +There is much debate about what syntax to use for placement-new. +We started with `box (PLACE_EXPR) VALUE_EXPR`, then migrated towards +leveraging the `in` keyword instead of `box`, yielding `in (PLACE_EXPR) VALUE_EXPR`. + +A lot of people disliked the `in (PLACE_EXPR) VALUE_EXPR` syntax +(see discussion from [RFC 809]). + +[RFC 809]: https://github.com/rust-lang/rfcs/pull/809 + +In response to that discussion (and also due to personal preference) +I suggested the alternative syntax `in PLACE_EXPR { BLOCK_EXPR }`, +which is what landed when [RFC 809] was merged. 
+ +However, it is worth noting that this alternative syntax actually +failed to address a number of objections (some of which also +applied to the original `in (PLACE_EXPR) VALUE_EXPR` syntax): + + * [kennytm](https://github.com/rust-lang/rfcs/pull/809#issuecomment-73071324) + + > While in (place) value is syntactically unambiguous, it looks + > completely unnatural as a statement alone, mainly because there + > are no verbs in the correct place, and also using in alone is + > usually associated with iteration (for x in y) and member + > testing (elem in set). + + * [petrochenkov](https://github.com/rust-lang/rfcs/pull/809#issuecomment-73142168) + + > As C++11 experience has shown, when it's available, it will + > become the default method of inserting elements in containers, + > since it's never performing worse than "normal insertion" and + > is often better. So it should really have as short and + > convenient syntax as possible. + + * [p1start](https://github.com/rust-lang/rfcs/pull/809#issuecomment-73837430) + + > I’m not a fan of in <place> { <stmts> }, simply because the + > requirement of a block suggests that it’s some kind of control + > flow structure, or that all the statements inside <stmts> will be + > somehow run ‘in’ the given <place> (or perhaps, as @m13253 + > seems to have interpreted it, for all box expressions to go + > into the given place). It would be our first syntactical + > construct which is basically just an operator that has to + > have a block operand. + +I believe the `PLACE_EXPR <- VALUE_EXPR` syntax addresses all of the +above concerns. + +# Detailed design + +Extend the parser to parse `EXPR <- EXPR`. + +`EXPR <- EXPR` is parsed into an AST form that is desugared in much +the same way that `in EXPR { BLOCK }` or `box (EXPR) EXPR` are +desugared (see [PR 27215]). + +Thus the static and dynamic semantics of `PLACE_EXPR <- VALUE_EXPR` +are *equivalent* to `box (PLACE_EXPR) VALUE_EXPR`. Namely, it is +still an expression form that operates by: + 1.
Evaluate the `PLACE_EXPR` to a place + 2. Evaluate `VALUE_EXPR` directly into the constructed place + 3. Return the finalized place value. + +(See protocol as documented in [RFC 809] for more details here.) + +[PR 27215]: https://github.com/rust-lang/rust/pull/27215 + +This parsing form can be separately feature-gated (this RFC was +written assuming that would be the procedure). However, since +placement-`in` landed very recently ([PR 27215]) and is still +feature-gated, we can also just fold this change in with +the pre-existing `placement_in_syntax` feature gate +(though that may be non-intuitive since the keyword `in` is +no longer part of the syntactic form). + +This feature has already been prototyped, see [place-left-syntax branch]. + +[place-left-syntax branch]: https://github.com/rust-lang/rust/compare/rust-lang:master...pnkfelix:place-left-syntax + +# Drawbacks + +The only drawback I am aware of is this [comment from nikomataskis](https://github.com/rust-lang/rfcs/pull/809#issuecomment-73903777) + +> the intent is less clear than with a devoted keyword. + +Note however that this was stated with regards to a hypothetical +overloading of the `=` operator (at least that is my understanding). + +I think the use of the `<-` operator can be considered sufficiently +"devoted" (i.e. separate) syntax to placate the above concern. + +# Alternatives + +See [different surface syntax] from the alternatives from [RFC 809]. + +[different surface syntax]: https://github.com/pnkfelix/rfcs/blob/fsk-placement-box-rfc/text/0000-placement-box.md#same-semantics-but-different-surface-syntax + +# Unresolved questions + +None + From c54de03922e9abd80d0b9282785331fcb202616a Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Tue, 28 Jul 2015 19:58:37 +0200 Subject: [PATCH 0427/1195] a note about people referencing the old RFC. 
--- text/0000-placement-left-arrow.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/text/0000-placement-left-arrow.md b/text/0000-placement-left-arrow.md index e565cd58fac..fd301fe9118 100644 --- a/text/0000-placement-left-arrow.md +++ b/text/0000-placement-left-arrow.md @@ -96,6 +96,13 @@ This feature has already been prototyped, see [place-left-syntax branch]. [place-left-syntax branch]: https://github.com/rust-lang/rust/compare/rust-lang:master...pnkfelix:place-left-syntax +Finally, it would may be good, as part of this process, to actually +amend the text [RFC 809] itself to use the `a <- b` syntax. +At least, it seems like many people use the RFC's as a reference source +even when they are later outdated. +(An easier option though may be to just add a forward reference to this +RFC from [RFC 809], if this RFC is accepted.) + # Drawbacks The only drawback I am aware of is this [comment from nikomataskis](https://github.com/rust-lang/rfcs/pull/809#issuecomment-73903777) From 2b06035a34c26c15c7148e7520437368f5842d86 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 29 Jul 2015 00:19:00 +0200 Subject: [PATCH 0428/1195] a few clarifying remarks. --- text/0000-placement-left-arrow.md | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/text/0000-placement-left-arrow.md b/text/0000-placement-left-arrow.md index fd301fe9118..b7c4cad6735 100644 --- a/text/0000-placement-left-arrow.md +++ b/text/0000-placement-left-arrow.md @@ -96,6 +96,19 @@ This feature has already been prototyped, see [place-left-syntax branch]. [place-left-syntax branch]: https://github.com/rust-lang/rust/compare/rust-lang:master...pnkfelix:place-left-syntax +Then, (after sufficient snapshot and/or time passes) remove the following syntaxes: + + * `box (PLACE_EXPR) VALUE_EXPR` + * `in PLACE_EXPR { VALUE_BLOCK }` + +That is, `PLACE_EXPR <- VALUE_EXPR` will be the "one true way" to +express placement-new. 
+ +(Note that support for `box VALUE_EXPR` will remain, and in fact, the +expression `(box ())` expression will become unambiguous and thus we +could make it legal. Because, you know, those boxes of unit have a +syntax that is really important to optimize.) + Finally, it would may be good, as part of this process, to actually amend the text [RFC 809] itself to use the `a <- b` syntax. At least, it seems like many people use the RFC's as a reference source From fc7548c4f913f3d2dd3256e3b35400b192d6ff7b Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 29 Jul 2015 00:26:38 +0200 Subject: [PATCH 0429/1195] add some examples and an alternative. --- text/0000-placement-left-arrow.md | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/text/0000-placement-left-arrow.md b/text/0000-placement-left-arrow.md index b7c4cad6735..71758e27b16 100644 --- a/text/0000-placement-left-arrow.md +++ b/text/0000-placement-left-arrow.md @@ -65,6 +65,22 @@ applied to the original `in (PLACE_EXPR) VALUE_EXPR` syntax): I believe the `PLACE_EXPR <- VALUE_EXPR` syntax addresses all of the above concerns. +Thus cases like allocating into an arena (which needs to take as input the arena itself +and a value-expression, and returns a reference or handle for the allocated entry in the arena -- i.e. *cannot* return unit) +would look like: + +```rust +let ref_1 = arena <- value_expression; +let ref_2 = arena <- value_expression; +``` + +compare the above against the way this would look under [RFC 809]: + +```rust +let ref_1 = in arena { value_expression }; +let ref_2 = in arena { value_expression }; +``` + # Detailed design Extend the parser to parse `EXPR <- EXPR`. @@ -134,6 +150,14 @@ See [different surface syntax] from the alternatives from [RFC 809]. 
[different surface syntax]: https://github.com/pnkfelix/rfcs/blob/fsk-placement-box-rfc/text/0000-placement-box.md#same-semantics-but-different-surface-syntax +Also, if we want to try to make it clear that this is not *just* +an assignment, we could combine `in` and `<-`, yielding e.g.: + +```rust +let ref_1 = in arena <- value_expression; +let ref_2 = in arena <- value_expression; +``` + # Unresolved questions None From 6c1c86967e317682de46747d747f97937c3ec343 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 29 Jul 2015 14:30:50 -0700 Subject: [PATCH 0430/1195] wip --- text/0000-stabilize-catch-panic.md | 87 +++++++++++++++++++++++++----- 1 file changed, 73 insertions(+), 14 deletions(-) diff --git a/text/0000-stabilize-catch-panic.md b/text/0000-stabilize-catch-panic.md index 63502a90ac1..66740209143 100644 --- a/text/0000-stabilize-catch-panic.md +++ b/text/0000-stabilize-catch-panic.md @@ -10,9 +10,9 @@ bounds from the closure parameter. # Motivation -In today's Rust it's not currently possible to catch a panic by design. There -are a number of situations, however, where catching a panic is either required -for correctness or necessary for building a useful abstraction: +In today's stable Rust it's not currently possible to catch a panic. There are a +number of situations, however, where catching a panic is either required for +correctness or necessary for building a useful abstraction: * It is currently defined as undefined behavior to have a Rust program panic across an FFI boundary. For example if C calls into Rust and Rust panics, then @@ -21,10 +21,8 @@ for correctness or necessary for building a useful abstraction: * Abstactions like thread pools want to catch the panics of tasks being run instead of having the thread torn down (and having to spawn a new thread). -The purpose of the unstable `thread::catch_panic` function is to solve these -problems by enabling you to catch a panic in Rust before control flow is -returned back over to C. 
As a refresher, the signature of the function looks -like: +Stabilizing the `catch_panic` function would enable these two use cases, but +let's also take a look at the current signature of the function: ```rust fn catch_panic<F, R>(f: F) -> thread::Result<R> @@ -33,13 +31,74 @@ fn catch_panic<F, R>(f: F) -> thread::Result<R> This function will run the closure `f` and if it panics return `Err(Box<Any>)`. If the closure doesn't panic it will return `Ok(val)` where `val` is the -returned value of the closure. Most of these aspects "pretty much make sense", -but an odd part about this signature is the `Send` and `'static` bounds on the -closure provided. At a high level, these two bounds are intended to mitigate -problems related to something many programmers call "exception safety". To -understand why let's first briefly review exception safety in Rust. +returned value of the closure. The closure, however, is restricted to only close +over `Send` and `'static` data. This can be overly restrictive at times and it's +also not clear what purpose the bounds are serving today, hence the desire to +remove these bounds. + +Historically Rust has purposefully avoided the foray into the situation of +catching panics, largely because of a problem typically referred to as +"exception safety". To further understand the motivation of stabilization and +relaxing the bounds, let's review what exception safety is and what it means for +Rust. + +# Background: What is exception safety? + +Languages with exceptions have the property that a function can "return" early +if an exception is thrown. This is normally not something that needs to be +worried about, but this form of control flow can often be surprising and +unexpected. If an exception ends up causing unexpected behavior or a bug then +code is said to not be **exception safe**. + +Unexpected bugs arising because of an exception typically boil down to an +invariant being broken at runtime which is then observed later on.
For example +many data structures often have a number of invariants that are dynamically +upheld for correctness, but if these invariants are broken then an observation +of the data structure may result in unexpected behavior. Routines inside these +data structures tend to temporarily break invariants as an inherent part of the +implementation, fixing up the state before a function returns, but if an +exception being thrown could cause the function to return early and expose the +broken invariant. The observation of this broken invariant can happen because +of: + +* A finally block (code run on a normal or exceptional return) may still have + access to the broken data structure. +* If an exception can be caught in the language, then the broken data structure + may still be accessible after the exception is caught. + +To be exception safe, code needs to be prepared for an exception to possibly be +thrown whenever an invariant it relies on is broken. There are a number of +tactics to do this, such as: + +* Audit code to ensure it only calls functions which are known to not throw an + exception. +* Place local "cleanup" handlers on the stack to restore invariants whenever a + function returns, either normally or exceptionally. This can be done through + finally blocks in some languages for via destructors in others. +* Catch exceptions locally to perform cleanup before possibly re-raising the + exception. + + + + + + + + + + + + + + + + + + + + + -### Exception Safety The problem of exception safety often plagues many C++ programmers (and other languages), and it essentially means that code needs to be ready to handle @@ -64,7 +123,7 @@ panics, then the block of code will "return" (because of unwinding), but the value of `foo` is still `true`. 
Let's take a look at a more harmful example to see how this can go wrong: -``` +```rust pub fn push_ten_more<T: Clone>(v: &mut Vec<T>, t: T) { unsafe { v.reserve(10); From c78deef0fa4de5492c8870ae4d18c3e9b5f7558c Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 29 Jul 2015 14:34:05 -0700 Subject: [PATCH 0431/1195] Remove all mentions of stabilizing lang items --- text/0000-stabilize-no_std.md | 53 ++++++----------------------------- 1 file changed, 9 insertions(+), 44 deletions(-) diff --git a/text/0000-stabilize-no_std.md b/text/0000-stabilize-no_std.md index aa18fa09775..9d8ee92e95d 100644 --- a/text/0000-stabilize-no_std.md +++ b/text/0000-stabilize-no_std.md @@ -5,8 +5,8 @@ # Summary -Stabilize the `#![no_std]` attribute, add a new `#![no_core]` attribute, and -start stabilizing the libcore library. +Tweak the `#![no_std]` attribute, add a new `#![no_core]` attribute, and +pave the way for stabilizing the libcore library. # Motivation @@ -43,10 +43,9 @@ must be available in a stable fashion. This RFC proposes a nuber of changes: -* Stabilize the `#![no_std]` attribute after tweaking its behavior slightly +* Tweak the `#![no_std]` attribute slightly. * Introduce a `#![no_core]` attribute. -* Stabilize the name "core" in libcore. -* Introduce a `#![lang_items_abort]` attribute. +* Pave the way to stabilize the `core` module. ## `no_std` @@ -79,9 +78,9 @@ this attribute instead of `#![no_std]`. ## Stabilization of libcore This RFC does not yet propose a stabilization path for the contents of libcore, -but it proposes stabilizing the name `core` for libcore, paving the way for the -rest of the library to be stabilized. The exact method of stabilizing its -contents will be determined with a future RFC or pull requests. +but it proposes readying to stabilize the name `core` for libcore, paving the +way for the rest of the library to be stabilized. The exact method of +stabilizing its contents will be determined with a future RFC or pull requests.
## Stabilizing lang items @@ -102,18 +101,8 @@ reasons: * These items are pretty obscure and it's not very widely known what they do or how they should be implemented. -For `#![no_std]` to be generally useful, however, these lang items *must* be -able to be defined in one form or another on stable Rust, so this RFC proposes a -new crate attribute, `lang_items_abort`, which will define these functions. Any -crate tagged with `#![lang_items_abort]` will cause the compiler to generate any -necessary language items to get the program to correctly link. Each lang item -generated will simply abort the program as if it called the `intrinsics::abort` -function. - -This attribute will behave the same as `#[lang]` in terms of uniqueness, two -crates declaring `#![lang_items_abort]` cannot be linked together and an -upstream crate declaring this attribute means that no downstream crate has to -worry about it. +Stabilization of these lang items (in any form) will be considered in a future +RFC. # Drawbacks @@ -137,13 +126,6 @@ This RFC just enables creation of Rust static or dynamic libraries which don't depend on the standard library in addition to Rust libraries (rlibs) which do not depend on the standard library. -On the topic of lang items, it's somewhat unfortunate that the implementation of -a panic cannot be defined on stable Rust. The `#![lang_items_abort]` attribute -unconditionally defines all lang items, including `panic_fmt`, so it's not -possible to provide a custom implementation of the `panic_fmt` lang item while -still asking the compiler to define others like `eh_personality` and -`stack_exhausted`. - In stabilizing the `#![no_std]` attribute it's likely that a whole ecosystem of crates will arise which work with `#![no_std]`, but in theory all of these crates should also interoperate with the rest of the ecosystem using `std`. @@ -161,23 +143,6 @@ happen: import the core prelude manually. 
The burden of adding `#![no_core]` to the compiler, however, is seen as not-too-bad compared to the increase in ergonomics of using `#![no_std]`. -* The lang items could not be required to be defined, and the compiler could - provide aborting stubs to be linked in if they aren't defined anywhere else. - This has the downside of perhaps silently aborting a program, however, without - an explicit opt-in. -* The compiler could not require `eh_personality` or `stack_exhausted` if no - crate in the dependency tree has landing pads enabled or stack overflow checks - enabled. This is quite a difficult situation to get into today, however, as - the libcore distribution always has these enabled and Cargo does not easily - provide a method to configure this when compiling crates. The overhead of - defining these functions seems small and because the compiler could stop - requiring them in the future it seems plausibly ok to require them today. -* The lang items could be stabilized at this time instead of providing a way to - have the compiler generate an appropriate function. The downsides of this - approach, however, were listed above. -* The various language items could not be stabilized at this time, allowing - stable libraries that leverage `#![no_std]` but not stable final artifacts - (e.g. staticlibs, dylibs, or binaries). * Another stable crate could be provided by the distribution which provides definitions of these lang items which are all wired to abort. 
This has the downside of selecting a name for this crate, however, and also inflating the From fd0afde8bc0484314485955c00d7ca8ce104e789 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 29 Jul 2015 14:37:54 -0700 Subject: [PATCH 0432/1195] RFC 1183 is being able to change the default allocator --- text/{0000-swap-out-jemalloc.md => 1183-swap-out-jemalloc.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-swap-out-jemalloc.md => 1183-swap-out-jemalloc.md} (98%) diff --git a/text/0000-swap-out-jemalloc.md b/text/1183-swap-out-jemalloc.md similarity index 98% rename from text/0000-swap-out-jemalloc.md rename to text/1183-swap-out-jemalloc.md index 737dd905e7c..83de6c58ac5 100644 --- a/text/0000-swap-out-jemalloc.md +++ b/text/1183-swap-out-jemalloc.md @@ -1,7 +1,7 @@ - Feature Name: `allocator` - Start Date: 2015-06-27 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1183](https://github.com/rust-lang/rfcs/pull/1183) +- Rust Issue: [rust-lang/rust#27389](https://github.com/rust-lang/rust/issues/27389) # Summary From 2c2c429ec261ec57e45b4e73c3fd9d9edc41a772 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Wed, 29 Jul 2015 18:47:23 -0700 Subject: [PATCH 0433/1195] RFC 1184 is Stabilize no_std --- README.md | 1 + text/{0000-stabilize-no_std.md => 1184-stabilize-no_std.md} | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) rename text/{0000-stabilize-no_std.md => 1184-stabilize-no_std.md} (98%) diff --git a/README.md b/README.md index 83914eb588c..60273a8fb6e 100644 --- a/README.md +++ b/README.md @@ -60,6 +60,7 @@ the direction the language is evolving in. 
* [1122-language-semver.md](text/1122-language-semver.md) * [1131-likely-intrinsic.md](text/1131-likely-intrinsic.md) * [1156-adjust-default-object-bounds.md](text/1156-adjust-default-object-bounds.md) +* [1184-stabilize-no_std.md](text/1184-stabilize-no_std.md) ## Table of Contents [Table of Contents]: #table-of-contents diff --git a/text/0000-stabilize-no_std.md b/text/1184-stabilize-no_std.md similarity index 98% rename from text/0000-stabilize-no_std.md rename to text/1184-stabilize-no_std.md index 9d8ee92e95d..6f2cbbb896c 100644 --- a/text/0000-stabilize-no_std.md +++ b/text/1184-stabilize-no_std.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2015-06-26 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1184 +- Rust Issue: https://github.com/rust-lang/rust/issues/27394 # Summary From a195994f5676d0774b9b48e3a9b2214cb5db1027 Mon Sep 17 00:00:00 2001 From: Andrew Paseltiner Date: Thu, 30 Jul 2015 08:24:37 -0400 Subject: [PATCH 0434/1195] remove question of changing `insert`'s replacement behavior It is more flexible to provide both behaviors. --- text/0000-collection-recovery.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/text/0000-collection-recovery.md b/text/0000-collection-recovery.md index 5481e7143f7..ea2846291f4 100644 --- a/text/0000-collection-recovery.md +++ b/text/0000-collection-recovery.md @@ -167,6 +167,3 @@ Do nothing. # Unresolved questions Are these the best method names? - -Should `{BTreeMap, HashMap}::insert` be changed to replace equivalent keys? This could break code -relying on the old behavior, and would add an additional inconsistency to `OccupiedEntry::insert`. 
From 1ace1e4c30046c40b23bb1743854081a0f951d6a Mon Sep 17 00:00:00 2001 From: Oliver Schneider Date: Thu, 30 Jul 2015 16:26:03 +0200 Subject: [PATCH 0435/1195] turn statically known erroneous code into a warning and an unconditional panic --- text/0000-compile-time-asserts.md | 88 +++++++++++++++++++++++++++++++ 1 file changed, 88 insertions(+) create mode 100644 text/0000-compile-time-asserts.md diff --git a/text/0000-compile-time-asserts.md b/text/0000-compile-time-asserts.md new file mode 100644 index 00000000000..7aefb2548b7 --- /dev/null +++ b/text/0000-compile-time-asserts.md @@ -0,0 +1,88 @@ +- Feature Name: compile_time_asserts +- Start Date: 2015-07-30 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +If the compiler can detect at compile-time that something will always +cause a `debug_assert` or an `assert` it should instead +insert an unconditional runtime-panic and issue a warning. + +# Motivation + +Expressions are const-evaluated even when they are not in a const environment. + +For example + +```rust +fn blub(t: T) -> T { t } +let x = 5 << blub(42); +``` + +will not cause a compiler error currently, while `5 << 42` will. +If the constant evaluator gets smart enough, it will be able to const evaluate +the `blub` function. This would be a breaking change, since the code would not +compile anymore. (this occurred in https://github.com/rust-lang/rust/pull/26848). + +GNAT (an Ada compiler) does this already: + +```ada +procedure Hello is + Var: Integer range 15 .. 20 := 21; +begin + null; +end Hello; +``` + +The anonymous subtype `Integer range 15 .. 20` only accepts values in `[15, 20]`. 
+This knowledge is used by GNAT to emit the following warning during compilation: + +``` +warning: value not in range of subtype of "Standard.Integer" defined at line 2 +warning: "Constraint_Error" will be raised at run time +``` + +I don't have a GNAT with `-emit-llvm` handy, but here's the asm with `-O0`: + +```asm +.cfi_startproc +pushq %rbp +.cfi_def_cfa_offset 16 +.cfi_offset 6, -16 +movq %rsp, %rbp +.cfi_def_cfa_register 6 +movl $2, %esi +movl $.LC0, %edi +movl $0, %eax +call __gnat_rcheck_CE_Range_Check +``` + + +# Detailed design + +The PRs https://github.com/rust-lang/rust/pull/26848 and https://github.com/rust-lang/rust/pull/25570 will be setting a precedent +for warning about such situations (WIP, not pushed yet). +All future additions to the const-evaluator need to notify the const evaluator +that when it encounters a statically known erroneous situation, the +entire expression must be replaced by a panic and a warning must be emitted. + +# Drawbacks + +None, if we don't do anything, the const evaluator cannot get much smarter. + +# Alternatives + +## allow breaking changes + +Let the compiler error on things that will unconditionally panic at runtime. + +## only warn, don't influence code generation + +This has the disadvantage, that in release-mode statically known issues like +overflow or shifting more than the number of bits available will not be +caught even at runtime. + +# Unresolved questions + +How to implement this? 
From 1a7110bd8b6ca96b579604e626d76909ef5853c2 Mon Sep 17 00:00:00 2001 From: Oliver Schneider Date: Fri, 31 Jul 2015 10:35:24 +0200 Subject: [PATCH 0436/1195] add a definition of constant evaluation context --- text/0000-compile-time-asserts.md | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/text/0000-compile-time-asserts.md b/text/0000-compile-time-asserts.md index 7aefb2548b7..b37380fd0c4 100644 --- a/text/0000-compile-time-asserts.md +++ b/text/0000-compile-time-asserts.md @@ -9,6 +9,20 @@ If the compiler can detect at compile-time that something will always cause a `debug_assert` or an `assert` it should instead insert an unconditional runtime-panic and issue a warning. +# Definition of constant evaluation context + +There are exactly three places where an expression needs to be constant. + +- the initializer of a constant `const foo: ty = EXPR` or `static foo: ty = EXPR` +- the size of an array `[T; EXPR]` +- the length of a repeat expression `[VAL; LEN_EXPR]` + +In the future the body of `const fn` might also be interpreted as a constant +evaluation context. + +Any other expression might still be constant evaluated, but it could just +as well be compiled normally and executed at runtime. + # Motivation Expressions are const-evaluated even when they are not in a const environment. 
From 3855520c722f606934e6a19502e31f490eea71ab Mon Sep 17 00:00:00 2001 From: Oliver Schneider Date: Fri, 31 Jul 2015 10:35:34 +0200 Subject: [PATCH 0437/1195] minor clarifications --- text/0000-compile-time-asserts.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/text/0000-compile-time-asserts.md b/text/0000-compile-time-asserts.md index b37380fd0c4..d3f8e424cf1 100644 --- a/text/0000-compile-time-asserts.md +++ b/text/0000-compile-time-asserts.md @@ -78,7 +78,8 @@ call __gnat_rcheck_CE_Range_Check The PRs https://github.com/rust-lang/rust/pull/26848 and https://github.com/rust-lang/rust/pull/25570 will be setting a precedent for warning about such situations (WIP, not pushed yet). All future additions to the const-evaluator need to notify the const evaluator -that when it encounters a statically known erroneous situation, the +that when it encounters a statically known erroneous situation while evaluating +an expression outside of a constant evaluation environment, the entire expression must be replaced by a panic and a warning must be emitted. # Drawbacks @@ -93,10 +94,13 @@ Let the compiler error on things that will unconditionally panic at runtime. ## only warn, don't influence code generation +The const evaluator should simply issue a warning and notify its caller that the expression cannot be evaluated and should be translated. This has the disadvantage, that in release-mode statically known issues like overflow or shifting more than the number of bits available will not be caught even at runtime. +On the other hand, this alternative does not change the behavior of existing code. + # Unresolved questions How to implement this?
From 81079080ca8087b9a5ebf118c437c13090aa3f42 Mon Sep 17 00:00:00 2001 From: Oliver Schneider Date: Fri, 31 Jul 2015 10:49:51 +0200 Subject: [PATCH 0438/1195] add const fn unresolved question --- text/0000-compile-time-asserts.md | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/text/0000-compile-time-asserts.md b/text/0000-compile-time-asserts.md index d3f8e424cf1..796e60a75e8 100644 --- a/text/0000-compile-time-asserts.md +++ b/text/0000-compile-time-asserts.md @@ -103,4 +103,13 @@ On the other hand, this alternative does not change the behavior of existing cod # Unresolved questions -How to implement this? +## How to implement this? + +## Const-eval the body of `const fn` that are never used in a constant environment + +Currently a `const fn` that is called in non-const code is treated just like a normal function. + +In case there is a statically known erroneous situation in the body of the function, +the compiler should raise an error, even if the function is never called. + +The same applies to unused associated constants. From 1354de7b180575bc31560bd26b3f6f93bc264252 Mon Sep 17 00:00:00 2001 From: Oliver Schneider Date: Fri, 31 Jul 2015 21:06:45 +0200 Subject: [PATCH 0439/1195] added c-like enums and patterns to const context list --- text/0000-compile-time-asserts.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-compile-time-asserts.md b/text/0000-compile-time-asserts.md index 796e60a75e8..e6423478b19 100644 --- a/text/0000-compile-time-asserts.md +++ b/text/0000-compile-time-asserts.md @@ -11,11 +11,13 @@ insert an unconditional runtime-panic and issue a warning. # Definition of constant evaluation context -There are exactly three places where an expression needs to be constant. +There are exactly five places where an expression needs to be constant. 
- the initializer of a constant `const foo: ty = EXPR` or `static foo: ty = EXPR` - the size of an array `[T; EXPR]` - the length of a repeat expression `[VAL; LEN_EXPR]` +- C-Like enum variant discriminant values +- patterns In the future the body of `const fn` might also be interpreted as a constant evaluation context. From 937d9adf039bf8aaa1239f248d7e3b8c5bc3702a Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Fri, 31 Jul 2015 16:46:37 -0700 Subject: [PATCH 0440/1195] Use the extra-field desugaring. --- text/0000-inclusive-ranges.md | 42 ++++++++++++++++++++++++++--------- 1 file changed, 31 insertions(+), 11 deletions(-) diff --git a/text/0000-inclusive-ranges.md b/text/0000-inclusive-ranges.md index 800bdd8d191..7783642eb18 100644 --- a/text/0000-inclusive-ranges.md +++ b/text/0000-inclusive-ranges.md @@ -29,25 +29,25 @@ more dots means more elements. pub struct RangeInclusive { pub start: T, pub end: T, + pub finished: bool, } ``` Writing `a...b` in an expression desugars to `std::ops::RangeInclusive -{ start: a, end: b }`. +{ start: a, end: b, finished: false }`. This struct implements the standard traits (`Clone`, `Debug` etc.), -but, unlike the other `Range*` types, does not implement `Iterator` -directly, since it cannot do so correctly without more internal -state. It can implement `IntoIterator` that converts it into an -iterator type that contains the necessary state. +and implements `Iterator`. The `finished` field is to allow the +`Iterator` implementation to work without hacks (see Alternatives). The use of `...` in a pattern remains as testing for inclusion within that range, *not* a struct match. The author cannot forsee problems with breaking backward compatibility. In particular, one tokenisation of syntax like `1...` -now would be `1. ..` i.e. a floating point number on the left, however, fortunately, -it is actually tokenised like `1 ...`, and is hence an error. +now would be `1. ..` i.e. 
a floating point number on the left, +however, fortunately, it is actually tokenised like `1 ...`, and is +hence an error with the current compiler. # Drawbacks @@ -58,10 +58,8 @@ semantically.) The `...` vs. `..` distinction is the exact inversion of Ruby's syntax. -Only implementing `IntoIterator` means uses of it in iterator chains -look like `(a...b).into_iter().collect()` instead of -`(a..b).collect()` as with exclusive ones (although this doesn't -affect `for` loops: `for _ in a...b` works fine). +Having an extra field in a language-level desugaring, catering to one +library use-case is a little non-"hygienic". # Alternatives @@ -74,6 +72,28 @@ winner. This RFC doesn't propose non-double-ended syntax, like `a...`, `...b` or `...` since it isn't clear that this is so useful. Maybe it is. +The `finished` field could be omitted, leaving two options: + +- `a...b` only implements `IntoIterator`, not `Iterator`, by + converting to a different type that does have the field. However, + this means that `a...b` behaves differently to `a..b`, so + `(a...b).map(|x| ...)` doesn't work (the `..` version of that is + used reasonably often, in the author's experience) +- `a...b` can implement `Iterator` for types that can be stepped + backwards: the only case that is problematic things cases like + `x...255u8` where the endpoint is the last value in the type's + range. A naive implementation that just steps `x` and compares + against the second value will never terminate: it will yield 254 + (final state: `255...255`), 255 (final state: `0...255`), 0 (final + state: `1...255`). I.e. it will wrap around because it has no way to + detect whether 255 has been yielded or not. However, implementations + of `Iterator` can detect cases like that, and, after yielding `255`, + backwards-step the second piece of state to `255...254`. + + This means that `a...b` can only implement `Iterator` for types that + can be stepped backwards, which isn't always guaranteed, e.g. 
types + might not have a unique predecessor (walking along a DAG). + # Unresolved questions None so far. From 17e23474ec708907c889e97d84aca1c6d478af62 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Fri, 31 Jul 2015 19:20:51 -0700 Subject: [PATCH 0441/1195] Propose `...b` too. --- text/0000-inclusive-ranges.md | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/text/0000-inclusive-ranges.md b/text/0000-inclusive-ranges.md index 7783642eb18..90a76cc8b94 100644 --- a/text/0000-inclusive-ranges.md +++ b/text/0000-inclusive-ranges.md @@ -31,13 +31,18 @@ pub struct RangeInclusive { pub end: T, pub finished: bool, } + +pub struct RangeToInclusive { + pub end: T, +} ``` Writing `a...b` in an expression desugars to `std::ops::RangeInclusive -{ start: a, end: b, finished: false }`. +{ start: a, end: b, finished: false }`. Writing `...b` in an +expression desugars to `std::ops::RangeToInclusive { end: b }`. -This struct implements the standard traits (`Clone`, `Debug` etc.), -and implements `Iterator`. The `finished` field is to allow the +`RangeInclusive` implements the standard traits (`Clone`, `Debug` +etc.), and implements `Iterator`. The `finished` field is to allow the `Iterator` implementation to work without hacks (see Alternatives). The use of `...` in a pattern remains as testing for inclusion @@ -59,7 +64,9 @@ semantically.) The `...` vs. `..` distinction is the exact inversion of Ruby's syntax. Having an extra field in a language-level desugaring, catering to one -library use-case is a little non-"hygienic". +library use-case is a little non-"hygienic". It is especially strange +that the field isn't consistent across the different `...` +desugarings. 
# Alternatives From f642cd8058a570f115fc14481e48834d484ec1df Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Sun, 2 Aug 2015 17:41:36 -0700 Subject: [PATCH 0442/1195] specify that CoerceUnsized should ignore PhantomData fields --- text/0982-dst-coercion.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0982-dst-coercion.md b/text/0982-dst-coercion.md index c4263fd28b8..deff76c3f61 100644 --- a/text/0982-dst-coercion.md +++ b/text/0982-dst-coercion.md @@ -113,7 +113,7 @@ in the `Target` type. Assuming `Fs` is the type of a field in `Self` and `Ft` is the type of the corresponding field in `Target`, then either `Ft <: Fs` or `Fs: CoerceUnsized` (note that this includes some built-in coercions, coercions unrelated to unsizing are excluded, these could probably be added later, if needed). -* There must be only one field that is coerced. +* There must be only one non-PhantomData field that is coerced. * We record for each impl, the index of the field in the `Self` type which is coerced. @@ -135,7 +135,7 @@ is auto-deref'ed, but not autoref'ed. ### On encountering an adjustment (translation phase) * In trans (which is post-monomorphisation) we should always be able to find an -impl for any `CoerceUnsized` bound. +impl for any `CoerceUnsized` bound. * If the impl is for a built-in pointer type, then we use the current coercion code for the various pointer kinds (`Box` has different behaviour than `&` and `*` pointers). From f71c4b32e27215dd40c0c0483a883c825c575ad0 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Mon, 3 Aug 2015 11:30:30 -0700 Subject: [PATCH 0443/1195] Use intrinsics for arithmetic instead of built-in operators. 
--- text/0000-simd-infrastructure.md | 35 +++++++++++++++++++------------- 1 file changed, 21 insertions(+), 14 deletions(-) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index 41c7d8bd2e3..7fedaabed8f 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -166,8 +166,8 @@ but will be shimmed as efficiently as possible. - shuffles and extracting/inserting elements - comparisons - -Lastly, arithmetic and conversions are supported via built-in operators. +- arithmetic +- conversions ### Shuffles & element operations @@ -260,21 +260,28 @@ shuffles. Ensuring that `T` and `U` has the same length, and that `U` is appropriately "boolean"-y. Libraries can use traits to ensure that these will be enforced by the type checker too. -### Built-in functionality +### Arithmetic + +Intrinsics will be provided for arithmetic operations like addition +and multiplication. + +```rust +extern { + fn simd_add(x: T, y: T) -> T; + fn simd_mul(x: T, y: T) -> T; + // ... +} +``` -Any type marked `repr(simd)` automatically has the `+`, `-` and `*` -operators work. The `/` operator works for floating point, and the -`<<` and `>>` ones work for integers. +These will have codegen time checks that the element type is correct: -SIMD vectors can be converted with `as`. As with intrinsics, this is -"duck-typed" it is possible to cast a vector type `V` to a type `W` if -their lengths match and their elements are castable (i.e. are -primitives), there's no enforcement of nominal types. +- `add`, `sub`, `mul`: any float or integer type +- `div`: any float type +- `and`, `or`, `xor`, `shl` (shift left), `shr` (shift right): any + integer type -All of these operators and conversions are never checked (in the sense -of the arithmetic overflow checks of `-C debug-assertions`): explicit -SIMD is essentially only required for speed, and checking inflates one -instruction to 5 or more. 
+(The integer types are `i8`, ..., `i64`, `u8`, ..., `u64` and the +float types are `f32` and `f64`.) ### Why not inline asm? From 8b2ec8c16f8b9eafd4d0bd7fb20a187a0399028a Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Mon, 3 Aug 2015 11:33:18 -0700 Subject: [PATCH 0444/1195] Accidentally: - an extra word. - a subject-verb agreement. - an ly. - a plural. --- text/0000-simd-infrastructure.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index 7fedaabed8f..cdfd8057c34 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -63,7 +63,7 @@ building tools that can be wrapped into a more uniform API later. ## Types -There is a new attributes: `repr(simd)`. +There is a new attribute: `repr(simd)`. ```rust #[repr(simd)] @@ -135,7 +135,7 @@ enforce a specific nominal type. NB. The structural typing is just for the declaration: if a SIMD intrinsic is declared to take a type `X`, it must always be called with `X`, even if other types are structurally equal to `X`. Also, within a -signature, SIMD types that must be structurally equal must be nominal +signature, SIMD types that must be structurally equal must be nominally equal. I.e. if the `add_...` all refer to the same intrinsic to add a SIMD vector of bytes, @@ -256,7 +256,7 @@ extern "rust-intrinsic" { ``` These are type checked during code-generation similarly to the -shuffles. Ensuring that `T` and `U` has the same length, and that `U` +shuffles: ensuring that `T` and `U` have the same length, and that `U` is appropriately "boolean"-y. Libraries can use traits to ensure that these will be enforced by the type checker too. @@ -406,6 +406,6 @@ cfg_if_else! { # Unresolved questions -- Should integer vectors get `/` and `%` automatically? Most CPUs - don't support them for vectors. However +- Should integer vectors get division automatically? Most CPUs + don't support them for vectors. 
- How should out-of-bounds shuffle and insert/extract indices be handled? From c9306cd7d40b529fb8cec947e35c6084c431410c Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 3 Aug 2015 17:30:06 -0700 Subject: [PATCH 0445/1195] Overhaul text --- text/0000-stabilize-catch-panic.md | 512 +++++++++++------------------ 1 file changed, 183 insertions(+), 329 deletions(-) diff --git a/text/0000-stabilize-catch-panic.md b/text/0000-stabilize-catch-panic.md index 66740209143..51a9e35a46e 100644 --- a/text/0000-stabilize-catch-panic.md +++ b/text/0000-stabilize-catch-panic.md @@ -50,78 +50,61 @@ worried about, but this form of control flow can often be surprising and unexpected. If an exception ends up causing unexpected behavior or a bug then code is said to not be **exception safe**. -Unexpected bugs arising because of an exception typically boil down to an -invariant being broken at runtime which is then observed later on. For example -many data structures often have a number of invariants that are dynamically -upheld for correctness, but if these invariants are broken then an observation -of the data structure may result in unexpected behavior. Routines inside these -data structures tend to temporarily break invariants as an inherent part of the -implementation, fixing up the state before a function returns, but if an -exception being thrown could cause the function to return early and expose the -broken invariant. The observation of this broken invariant can happen because -of: - -* A finally block (code run on a normal or exceptional return) may still have - access to the broken data structure. -* If an exception can be caught in the language, then the broken data structure - may still be accessible after the exception is caught. - -To be exception safe, code needs to be prepared for an exception to possibly be -thrown whenever an invariant it relies on is broken. 
There are a number of -tactics to do this, such as: - -* Audit code to ensure it only calls functions which are known to not throw an - exception. -* Place local "cleanup" handlers on the stack to restore invariants whenever a - function returns, either normally or exceptionally. This can be done through - finally blocks in some languages for via destructors in others. -* Catch exceptions locally to perform cleanup before possibly re-raising the - exception. - - - - - - - - - - - - - - - - - - - - - - - -The problem of exception safety often plagues many C++ programmers (and other -languages), and it essentially means that code needs to be ready to handle -exceptional control flow. This primarily matters when an invariant is -temporarily broken in a region of code which can have exceptional control flow. -What this largely boils down to is that a block of code having only one entry -point but possibly many exit points, and invariants need to be upheld on all -exit points. - -For Rust this means that code needs to be prepared to handle panics as any -unknown function call can cause a thread to panic. For example: - -```rust -let mut foo = true; -bar(); -foo = false; -``` - -It may be intuitive to say that this block of code returns that `foo`'s value is -always `false` (e.g. a local invariant of ours). If, however, the `bar` function -panics, then the block of code will "return" (because of unwinding), but the -value of `foo` is still `true`. Let's take a look at a more harmful example to -see how this can go wrong: +The idea of throwing an exception causing bugs may sound a bit alien, so it's +helpful to drill down into exactly why this is the case. Bugs related to +exception safety are comprised of two critical components: + +1. An invariant of a data structure is broken. +2. This broken invariant is the later observed. + +Exceptional control flow often exacerbates this first component of breaking +invariants. 
For example many data structures often have a number of invariants +that are dynamically upheld for correctness, and the type's routines can +temporarily break these invariants to be fixed up before the function returns. +If, however, an exception is thrown in this interim period the broken invariant +could be accidentally exposed. + +The second component, observing a broken invariant, can sometimes be difficult +in the face of exceptions, but languages often have constructs to enable these +sorts of witnesses. Two primary methods of doing so are something akin to +finally blocks (code run on a normal or exceptional return) or just catching the +exception. In both cases code which later runs that has access to the original +data structure then it will see the broken invariants. + +Now that we've got a better understanding of how an exception might cause a bug +(e.g. how code can be "exception unsafe"), let's take a look how we can make +code exception safe. To be exception safe, code needs to be prepared for an +exception to be thrown whenever an invariant it relies on is broken, for +example: + +* Code can be audited to ensure it only calls functions which are statically + known to not throw an exception. +* Local "cleanup" handlers can be placed on the stack to restore invariants + whenever a function returns, either normally or exceptionally. This can be + done through finally blocks in some languages for via destructors in others. +* Exceptions can be caught locally to perform cleanup before possibly re-raising + the exception. + +With all that in mind, we've now identified problems that can arise via +exceptions (an invariant is broken and then observed) as well as methods to +ensure that prevent this from happening. In languages like C++ this means that +we can be memory safe in the face of exceptions and in languages like Java we +can ensure that our logical invariants are upheld. Given this background let's +take a look at how any of this applies to Rust. 
+ +# Background: What is exception safety in Rust? + +> Note: This section describes the current state of Rust today without this RFC +> implemented + +Up to now we've been talking about exceptions and exception safety, but from a +Rust perspective we can just replace this with panics and panic safety. Panics +in Rust are currently implemented essentially as a C++ exception under the hood. +As a result, **exception safety is something that needs to be handled in Rust +code**. + +One of the primary examples where panics need to be handled in Rust is unsafe +code. Let's take a look at an example where this matters: ```rust pub fn push_ten_more(v: &mut Vec, t: T) { @@ -136,169 +119,108 @@ pub fn push_ten_more(v: &mut Vec, t: T) { } ``` -While this code may look correct, it's actually not memory safe. If the type -`T`'s `clone` method panics, then this vector will point to uninitialized data. -`Vec` has an internal invariant that the first `len` elements are safe to drop -at any time, and we have broken that invariant temporarily with a call to -`set_len`. If a call to `clone` panics then we'll exit this block before -reaching the end, causing the invariant breakage to be leaked. - -The problem with this code is that it's not **exception safe**. There are a -number of common strategies to help mitigate this problem: - -* Use a "finally" block or some other equivalent mechanism to restore invariants - on all exit paths. In Rust this typically manifests itself as a destructor on - a structure as the compiler will ensure that this is run whenever a panic - happens. -* Avoid calling code which can panic (e.g. functions with assertions or - functions with statically unknown implementations) whenever an invariant is - broken. - -In our example of `push_ten_more` we can take the second round of avoiding code -which can panic when an invariant is broken. If we call `set_len` on each -iteration of the loop with `len + i` then the vector's invariant will always bee -respected. 
- -### Catching Exceptions - -In languages with `catch` blocks exception unsafe code can often cause problems -more frequently. The core problem here is that shared state in the "try" block -and the "catch" block can end up getting corrupted. Due to a panic possibly -happening at any time, data may not often prepare for the panic and the catch -(or finally) block will then read this corrupt data. - -Rust has not had to deal with this problem much because there's no stable way to -catch a panic. One primary area this comes up is dealing with cross-thread -panics, and the standard library poisons mutexes and rwlocks by default to help -deal with this situation. The `catch_panic` function proposed in this RFC, -however, is exactly "catch for Rust". To see how this function is not making -Rust memory unsafe, let's take a look at how memory safety and exception safety -interact. - -### Exception Safety and Memory Safety - -If this is the first time you've ever heard about exception safety, this may -sound pretty bad! Chances are you haven't considered how Rust code can "exit" at -many points in a function beyond just the points where you wrote down `return`. -The good news is that Rust by default **is still memory safe** in the face of -this exception safety problem. - -All safe code in Rust is guaranteed to not cause any memory unsafety due to a -panic. There is never any invalid intermediate state which can then be read due -to a destructor running on a panic. As we've also seen, however, it's possible -to cause memory unsafety through panics when dealing with `unsafe` code. The key -part of this is that you have to have `unsafe` somewhere to inject the memory -unsafety, and you largely just need to worry about exception safety in the -context of unsafe code. - -Even though mixing safe Rust and panics cannot cause undefined behavior, it's -possible for a **logical** invariant to be violated as a result of a panic. 
-These sorts of situations can often become serious bugs and are difficult to -audit for, so it means that exception safety in Rust is unfortunately not a -situation that can be completely sidestepped. - -### Exception Safety in Rust - -Rust does not provide many primitives today to deal with exception safety, but -it's a situation you'll see handled in many locations when browsing unsafe -collections-related code, for example. One case where Rust does help you with -this is an aspect of Mutexes called [**poisoining**][poison]. - -[poison]: http://doc.rust-lang.org/std/sync/struct.Mutex.html#poisoning - -Poisoning is a mechanism for propagating panics among threads to ensure that -inconsistent state is not read. A mutex becomes poisoned if a thread holds the -lock and then panics. Most usage of a mutex simply `unwrap`s the result of -`lock()`, causing a panic in one thread to be propagated to all others that are -reachable. - -A key design aspect of poisoning, however, is that you can opt-out of poisoning. -The `Err` variant of the [`lock` method] provides the ability to gain access to -the mutex anyway. As explained above, exception safety can only lead to memory -unsafety when intermingled with unsafe code. This means that fundamentally -poisoning a Mutex is **not** guaranteeing memory safety, and hence getting -access to a poisoned mutex is not an unsafe operation. - -[`lock` method]: http://doc.rust-lang.org/std/sync/struct.Mutex.html#method.lock - -Exception safety is rarely considered when writing code in Rust, so the standard -library strives to help out as much as possible when it can. Poisoning mutexes -is a good example of this where ignoring panics in remote threads means that -mutexes could very commonly contain corrupted data (not memory unsafe, just -logically corrupt). There's typically an opt-out to these mechanisms, but by -default the standard library provides them. 
- -### `Send` and `'static` on `catch_panic` - -Alright, now that we've got a bit of background, let's explore why these bounds -were originally added to the `catch_panic` function. It was thought that these -two bounds would provide basically the same level of exception safety protection -that spawning a new thread does (e.g. today this requires both of these bounds). -This in theory meant that the addition of `catch_panic` to the standard library -would not exascerbate the concerns of exception safety. - -It [was discovered][cp-issue], however, that TLS can be used to bypass this -theoretical "this is the same as spawning a thread" boundary. Using TLS means -that you can share non-`Send` data across the `catch_panic` boundary, meaning -the caller of `catch_panic` may see invalid state. - -[cp-issue]: https://github.com/rust-lang/rust/issues/25662 - -As a result, these two bounds have been called into question, and this RFC is -recommending removing both bounds from the `catch_panic` function. - -### Is `catch_panic` unsafe? - -With the removal of the two bounds on this function, we can freely share state -across a "panic boundary". This means that we don't always know for sure if -arbitrary data is corrupted or not. As we've seen above, however, if we're only -dealing with safe Rust then this will not lead to memory unsafety. For memory -unsafety to happen it would require interaction with `unsafe` code at which -point the `unsafe` code is responsible for dealing with exception safety. - -The standard library has a clear definition for what functions are `unsafe`, and -it's precisely those which can lead to memory unsafety in otherwise safe Rust. -Because that is not the case for `catch_panic` it will not be declared as an -`unsafe` function. - -### What about other bounds? 
- -It has been discussed that there may be possible other bounds or mitigation -strategies for `catch_panic` (to help with the TLS problem described above), and -although it's somewhat unclear as to what this may precisely mean it's still the -case that the standard library will want a `catch_panic` with no bounds in -*some* form or another. - -The standard library is providing the lowest-level tools to create robust APIs, -and inevitably it should not forbid patterns that are safe. Rust itself does -this via the `unsafe` subset by allowing you to build up a safe abstraction on -unsafe underpinnings. Similarly any bound on `catch_panic` will eventually be -too restrictive for someone even though their usage is 100% safe. As a result -the standard library will always want (and was always going to have) a no-bounds -version of this function. - -As a result this RFC proposes not attempting to go through hoops to find a more -restrictive, but more helpful with exception safety, set of bounds for this -function and instead stabilize the no-bounds version. +While this code may look correct, it's actually not memory safe. +`Vec` has an internal invariant that its first `len` elements are safe to drop +at any time. Our function above has temporarily broken this invariant with the +call to `set_len` (the next 10 elements are uninitialized). If the type `T`'s +`clone` method panics then this broken invariant will escape the function. The +broken `Vec` is then observed during its destructor, leading to the eventual +memory unsafety. + +It's important to keep in mind that panic safety in Rust is not solely limited +to memory safety. *Logical invariants* are often just as critical to keep correct +during execution and no `unsafe` code in Rust is needed to break a logical +invariant. 
In practice, however, these sorts of bugs are rarely observed due to +Rust's design: + +* Rust doesn't expose uninitialized memory +* Panics cannot be caught in a thread +* Across threads data is poisoned by default on panics +* Idiomatic Rust must opt in to extra amounts of sharing data across boundaries + +With these mitigation tactics, it ends up being the case that **safe Rust code +can mostly ignore exception safety concerns**. That being said, it does not mean +that safe Rust code can *always* ignore exception safety issues. There are a +number of methods to subvert the mitigation strategies listed above: + +1. When poisoning data across threads, antidotes are available to access + poisoned data. Namely the [`PoisonError` type][pet] allows safe access to the + poisoned information. +2. Single-threaded types with interior mutability, such as `RefCell`, allow for + sharing data across stack frames such that a broken invariant could + eventually be observed. +3. Whenever a thread panics, the destructors for its stack variables will be run + as the thread unwinds. Destructors may have access to data which was also + accessible lower on the stack (such as through `RefCell` or `Rc`) which has a + broken invariant, and the destructor may then witness this. + +[pet]: http://doc.rust-lang.org/std/sync/struct.PoisonError.html + +Despite these methods to subvert the mitigations placed by default in Rust, a +key part of exception safety in Rust is that **safe code can never lead to +memory unsafety**, regardless of whether it panics or not. Memory unsafety +triggered as part of a panic can always be traced back to an `unsafe` block. + +With all that background out of the way now, let's take a look at the guts of +this RFC. 
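Before moving on, here is a minimal sketch (not part of the RFC text) of the third subversion listed above: a destructor witnessing a broken *logical* invariant through `Rc<RefCell<..>>` while its thread unwinds. The names `Ledger`, `Observer`, `add_entry`, and `observed_consistency` are hypothetical and exist only for this illustration. Everything here is safe code, so the worst outcome is a logic bug and never memory unsafety.

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Hypothetical type whose logical invariant is `total == sum(items)`.
struct Ledger {
    items: Vec<u32>,
    total: u32,
}

impl Ledger {
    fn is_consistent(&self) -> bool {
        self.items.iter().sum::<u32>() == self.total
    }
}

/// Hypothetical guard whose destructor inspects the shared ledger.
struct Observer {
    ledger: Rc<RefCell<Ledger>>,
    report: Sender<bool>,
}

impl Drop for Observer {
    fn drop(&mut self) {
        // `try_borrow` avoids a second panic if a mutable borrow were still alive.
        if let Ok(ledger) = self.ledger.try_borrow() {
            let _ = self.report.send(ledger.is_consistent());
        }
    }
}

/// Updates the ledger in two steps; a panic between them breaks the invariant.
fn add_entry(ledger: &RefCell<Ledger>, value: u32, fail: bool) {
    let mut l = ledger.borrow_mut();
    l.items.push(value);
    if fail {
        panic!("simulated failure between the two updates");
    }
    l.total += value;
}

/// Runs the scenario on a fresh thread and returns what the destructor saw.
fn observed_consistency(fail: bool) -> Option<bool> {
    let (tx, rx) = channel();
    let handle = thread::spawn(move || {
        let ledger = Rc::new(RefCell::new(Ledger { items: vec![], total: 0 }));
        // Declared after `ledger`, so it is dropped first during unwinding.
        let _observer = Observer { ledger: ledger.clone(), report: tx };
        add_entry(&ledger, 5, fail);
    });
    let _ = handle.join(); // `Err` when the thread panicked
    rx.recv().ok()
}

fn main() {
    println!("normal exit saw consistent data: {:?}", observed_consistency(false));
    println!("unwinding saw consistent data:   {:?}", observed_consistency(true));
}
```

In the panicking run the destructor observes `items` already updated but `total` not yet adjusted, which is exactly the "broken invariant is later observed" pattern described earlier.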
# Detailed design -Stabilize `std::thread::catch_panic` after removing the `Send` and `'static` -bounds from the closure parameter, modifying the signature to be: +At its heard, the change this RFC is proposing is to stabilize +`std::thread::catch_panic` after removing the `Send` and `'static` bounds from +the closure parameter, modifying the signature to be: ```rust -fn catch_panic<F, R>(f: F) -> thread::Result<R> where F: FnOnce() -> R +fn catch_panic<F: FnOnce() -> R, R>(f: F) -> thread::Result<R> ``` +More generally, however, this RFC also claims that this stable function does +not radically alter Rust's exception safety story (explained above). + +### Exception safety mitigation + +A mitigation strategy for exception safety listed above is that a panic cannot +be caught within a thread, and this change would move that bullet to the list of +"methods to subvert the mitigation strategies" instead. Catching a panic (and +not having `'static` on the bounds list) makes it easier to observe broken +invariants of data structures shared across the `catch_panic` boundary, which +can possibly increase the likelihood of exception safety issues arising. + +One of the key reasons Rust doesn't provide an exhaustive set of mitigation +strategies is that the design of the language and standard library lead to +idiomatic code not having to worry about exception safety. The use cases for +`catch_panic` are relatively niche, and it is not expected for `catch_panic` to +overnight become the idiomatic method of handling errors in Rust. + +Essentially, the addition of `catch_panic`: + +* Does not mean that *only now* does Rust code need to consider exception + safety. This is something that already must be handled today. +* Does not mean that safe code everywhere must start worrying about exception + safety.
This function is not the primary method to signal errors in Rust + (discussed later) and only adds a minor bullet to the list of situations that + safe Rust already needs to worry about exception safety in. + +### Will Rust have exceptions? + +In a technical sense this RFC is not "adding exceptions to Rust" as they +already exist in the form of panics. What this RFC is adding, however, is a +construct via which to catch these exceptions, bringing the standard library +closer to the exception support in other languages. Idiomatic usage of Rust, +however, will continue to follow the guidelines listed below for using a Result +vs using a panic (which also do not need to change to account for this RC). + +It's likely that the `catch_panic` function will only be used where it's +absolutely necessary, like FFI boundaries, instead of a general-purpose error +handling mechanism in all code. + # Drawbacks -A major drawback of this RFC is that it can mitigate Rust's error handling -story. On one hand this function can be seen as adding exceptions to Rust as -it's now possible to both throw (panic) and catch (`catch_panic`). The track -record of exceptions in languages like C++, Java, and Python hasn't been great, -and a drawing point of Rust for many has been the lack of exceptions. To help -understand what's going on, let's go through a brief overview of error handling -in Rust today: +A drawback of this RFC is that it can water down Rust's error handling story. +With the addition of a "catch" construct for exceptions, it may be unclear to +library authors whether to use panics or `Result` for their error types. There +are fairly clear guidelines and conventions about using a `Result` vs a `panic` +today, however, and they're summarized below for completeness. ### Result vs Panic @@ -316,13 +238,11 @@ Another way to put this division is that: * `Result`s represent errors that carry additional contextual information. 
This information allows them to be handled by the caller of the function producing the error, modified with additional contextual information, and eventually - converted into an error message fit for a human consumer of the top-level - program. + converted into an error message fit for a top-level program. * `panic`s represent errors that carry no contextual information (except, perhaps, debug information). Because they represented an unexpected error, - they cannot be easily handled by the caller of the function or presented to a - human consumer of the top-level program (except to say "something unexpected - has gone wrong"). + they cannot be easily handled by the caller of the function or presented to + the top-level program (except to say "something unexpected has gone wrong"). Some pros of `Result` are that it signals specific edge cases that you as a consumer should think about handling and it allows the caller to decide @@ -338,44 +258,24 @@ determine when a panic can happen or handle it in a custom fashion. These divisions justify the use of `panic`s for things like out-of-bounds indexing: such an error represents a programming mistake that (1) the author of the library was not aware of, by definition, and (2) cannot be easily handled by -the caller, except perhaps to indicate to the human user that an unexpected -error has occurred. - -In terms of heuristics for use: - -* `panic`s should rarely if ever be used to report errors that occurred through - communication with the system or through IO. For example, if a Rust program - shells out to `rustc`, and `rustc` is not found, it might be tempting to use a - panic, because the error is unexpected and hard to recover from. However, a - human consumer of the program would benefit from intermediate code adding - contextual information about the in-progress operation, and the program could - report the error in terms a human can understand. 
While the error is rare, - **when it happens it is not a programmer error**. -* assertions can produce `panic`s, because the programmer is saying that if the - assertion fails, it means that he has made an unexpected mistake. - -In short, if it would make sense to report an error as a context-free `500 -Internal Server Error` or a red an unknown error has occurred in all cases, it's -an appropriate panic. +the caller. + +In terms of heuristics for use, `panic`s should rarely if ever be used to report +routine errors for example through communication with the system or through IO. +If a Rust program shells out to `rustc`, and `rustc` is not found, it might be +tempting to use a panic because the error is unexpected and hard to recover +from. A user of the program, however, would benefit from intermediate code +adding contextual information about the in-progress operation, and the program +could report the error in terms a they can understand. While the error is +rare, **when it happens it is not a programmer error**. In short, panics are +roughly analogous to an opaque "an unexpected error has occurred" message. Another key reason to choose `Result` over a panic is that the compiler is likely to soon grow an option to map a panic to an abort. This is motivated for -portability, compile time, binary size, and a number of factors, but it +portability, compile time, binary size, and a number of other factors, but it fundamentally means that a library which signals errors via panics (and relies on consumers using `catch_panic`) will not be usable in this context. -### Will Rust have exceptions? - -After reviewing the cases for `Result` and `panic`, there's still clearly a -niche that both of these two systems are filling, so it's not the case that we -want to scrap one for the other. 
Rust will indeed have the ability to catch -exceptions to a greater extent than it does today with this RFC, but idiomatic -Rust will continue to follow the above rules for when to use a panic vs a result. - -It's likely that the `catch_panic` function will only be used where it's -absolutely necessary, like FFI boundaries, instead of a general-purpose error -handling mechanism in all code. - # Alternatives One alternative, which is somewhat more of an addition, is to have the standard @@ -383,72 +283,26 @@ library entirely abandon all exception safety mitigation tactics. As explained in the motivation section, exception safety will not lead to memory unsafety unless paired with unsafe code, so it is perhaps within the realm of possibility to remove the tactics of poisoning from mutexes and simply require that -consumers deal with exception safety 100%. - -This alternative is often motivated by saying that there are holes in our -poisoning story or the problem space is too large to tackle via targeted APIs. -This section will look a little bit more in detail about what's going on here. - -For the purpose of this discussion, let's use the term *dangerous* to -refer to code that can produce problems related to exception safety. Exception -safety means we're exposing the following possibly dangerous situation: - -> Dangerous code allows code that uses interior mutability to be interrupted in -> the process of making a mutation, and then allow other code to see the -> incomplete change. - -Today, most Rust code is protected from this danger from two angles: - -* If a piece of code acquires interior mutability through &mut and a panic - occurs, that panic will propagate through the owner of the original value. - Since there can be no outstanding & references to the same value, nobody can - see the incomplete change. 
-* If a piece of code acquires interior mutability through Mutex and a - panic occurs, attempts by another thread to read the value through - normal means will propagate the panic. - -There are areas in Rust that are not covered by these cases: - -* RefCell (especially with destructors) allows code to get access to a value - with an incomplete change. -* Generally speaking, destructors can observe an incomplete change. -* The Mutex API provides an alternate mechanism of reading a value with an - incomplete change. -* The proposed `catch_panic` API allows the propagation of panics to a boundary - that does not have any ownership restrictions. - -One open question that this question affects: - -* Should a theoretical `Thread::scoped` API propagate panics? - -Looking at these cases that aren't covered in Rust by default, and assuming that -`Thread::scoped` propagates panics by default (with an analogous API to -`PoisonError::into_inner`), we get a table that looks like: - -![img](https://www.evernote.com/l/AAJdvryuzOVFrakUiK6i0IBASP7wysYHN0sB/image.png) - -The main point here is that although this problem space seems sprawling, it is, -in reality, restricted to interior mutability. Enumerating the "dangerous" APIs -seems to be a tractable problem. Calling `RefCell` and `catch_panic` "dangerous" -(with the incomplete mutation problem) would not be problematic. `Mutex` or -`Thread::scoped` would not be dangerous because of the benefits associated with -detecting panics across threads, and this aligns with the table above. Note that -implementations of Drop, because they run during stack unwinding, should be -considered "dangerous" for the purposes of this summary. - -It may not be surprising that the threaded APIs ended up being protected via -APIs, because this kind of sharing is fundamental to threaded code. 
Making -them "dangerous" would make almost anything you would want to do with threads -"dangerous", and instead we ask users to learn about the danger only when they -try to access the possibly dangerous data. - -In contrast, both `RefCell` and `catch_panic` are more niche tools, making it -reasonable to ask users to learn about the danger when they begin using the -tools in the first place, and then making the access more ergonomic. Despite -labeling being "dangerous" there are strategies to mitigate this such as -building abstractions on top of these primitive which only use `RefCell` or -`catch_panic` as an implementation detail. These higher-level abstractions will -have fewer edge cases and risks associated with them. +consumers deal with exception safety 100% of the time. + +This alternative is often motivated by saying that there are enough methods to +subvert the default mitigation tactics that it's not worth trying to plug some +holes and not others. Upon closer inspection, however, the areas where safe code +needs to worry about exception safety are isolated to the single-threaded +situations. For example `RefCell`, destructors, and `catch_panic` all only +expose data possibly broken through a panic in a single thread. + +Once a thread boundary is crossed, the only current way to share data mutably is +via `Mutex` or `RwLock`, both of which are poisoned by default. This sort of +sharing is fundamental to threaded code, and poisoning by default allows safe +code to freely use many threads without having to consider exception safety +across threads (as poisoned data will tear down all connected threads). + +This property of multithreaded programming in Rust is seen as strong enough that +poisoning should not be removed by default, and in fact a new hypothetical +`thread::scoped` API (a rough counterpart of `catch_panic`) could also propagate +panics by default (like poisoning) with an ability to opt out (like +`PoisonError`). 
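The poisoning behaviour discussed in this section can be sketched as follows. This is an illustrative example rather than text from the RFC: the `poison_and_recover` function is hypothetical, while `Mutex::lock`, `Mutex::is_poisoned`, and `PoisonError::into_inner` are the standard library APIs the text refers to.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Poisons a mutex by panicking while the lock is held, then opts out of
/// poisoning to read the data anyway.
fn poison_and_recover() -> (bool, Vec<u32>) {
    let data = Arc::new(Mutex::new(vec![1, 2, 3]));

    let worker = {
        let data = Arc::clone(&data);
        thread::spawn(move || {
            let mut guard = data.lock().unwrap();
            guard.push(4);
            // The guard is dropped during unwinding, which poisons the mutex.
            panic!("simulated failure while holding the lock");
        })
    };
    let _ = worker.join(); // `Err`, because the worker panicked

    let was_poisoned = data.is_poisoned();

    // The default: `lock()` reports the poison as an `Err`.
    // The opt-out: `PoisonError::into_inner` still hands back the guard.
    let contents = match data.lock() {
        Ok(guard) => guard.to_vec(),
        Err(poisoned) => poisoned.into_inner().to_vec(),
    };
    (was_poisoned, contents)
}

fn main() {
    let (was_poisoned, contents) = poison_and_recover();
    println!("poisoned: {}, contents: {:?}", was_poisoned, contents);
}
```

Most code simply calls `lock().unwrap()`, which turns the poison into a panic on the reading thread; the explicit `match` above is the opt-out path and is where a caller takes on responsibility for any broken logical invariant.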
# Unresolved questions From 8c397d02f8dac91c66b92a9289aca6da27b92519 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 4 Aug 2015 09:12:44 -0700 Subject: [PATCH 0446/1195] Edits for clarity --- text/0000-stabilize-catch-panic.md | 170 ++++++++++++++++------------- 1 file changed, 96 insertions(+), 74 deletions(-) diff --git a/text/0000-stabilize-catch-panic.md b/text/0000-stabilize-catch-panic.md index 51a9e35a46e..43e7365e80c 100644 --- a/text/0000-stabilize-catch-panic.md +++ b/text/0000-stabilize-catch-panic.md @@ -10,14 +10,15 @@ bounds from the closure parameter. # Motivation -In today's stable Rust it's not currently possible to catch a panic. There are a -number of situations, however, where catching a panic is either required for -correctness or necessary for building a useful abstraction: +In today's stable Rust it's not possible to catch a panic on the thread that +caused it. There are a number of situations, however, where catching a panic is +either required for correctness or necessary for building a useful abstraction: * It is currently defined as undefined behavior to have a Rust program panic across an FFI boundary. For example if C calls into Rust and Rust panics, then this is undefined behavior. Being able to catch a panic will allow writing - robust C apis in Rust. + C apis in Rust that do not risk aborting the process they are embedded into. + * Abstactions like thread pools want to catch the panics of tasks being run instead of having the thread torn down (and having to spawn a new thread). @@ -32,9 +33,9 @@ fn catch_panic(f: F) -> thread::Result This function will run the closure `f` and if it panics return `Err(Box)`. If the closure doesn't panic it will return `Ok(val)` where `val` is the returned value of the closure. The closure, however, is restricted to only close -over `Send` and `'static` data. 
This can be overly restrictive at times and it's -also not clear what purpose the bounds are serving today, hence the desire to -remove these bounds. +over `Send` and `'static` data. These bounds can be overly restrictive, and due +to thread-local storage they can be subverted, making it unclear what purpose +they serve. This RFC proposes to remove the bounds as well. Historically Rust has purposefully avoided the foray into the situation of catching panics, largely because of a problem typically referred to as @@ -45,10 +46,11 @@ Rust. # Background: What is exception safety? Languages with exceptions have the property that a function can "return" early -if an exception is thrown. This is normally not something that needs to be -worried about, but this form of control flow can often be surprising and -unexpected. If an exception ends up causing unexpected behavior or a bug then -code is said to not be **exception safe**. +if an exception is thrown. While exceptions aren't too hard to reason about when +thrown explicitly, they can be problematic when they are thrown by code being +called -- especially when that code isn't known in advance. Code is **exception +safe** if it works correctly even when the functions it calls into throw +exceptions. The idea of throwing an exception causing bugs may sound a bit alien, so it's helpful to drill down into exactly why this is the case. Bugs related to @@ -58,11 +60,11 @@ exception safety are comprised of two critical components: 2. This broken invariant is the later observed. Exceptional control flow often exacerbates this first component of breaking -invariants. For example many data structures often have a number of invariants -that are dynamically upheld for correctness, and the type's routines can -temporarily break these invariants to be fixed up before the function returns. -If, however, an exception is thrown in this interim period the broken invariant -could be accidentally exposed. +invariants. 
For example many data structures have a number of invariants that +are dynamically upheld for correctness, and the type's routines can temporarily +break these invariants to be fixed up before the function returns. If, however, +an exception is thrown in this interim period the broken invariant could be +accidentally exposed. The second component, observing a broken invariant, can sometimes be difficult in the face of exceptions, but languages often have constructs to enable these @@ -81,7 +83,7 @@ example: known to not throw an exception. * Local "cleanup" handlers can be placed on the stack to restore invariants whenever a function returns, either normally or exceptionally. This can be - done through finally blocks in some languages for via destructors in others. + done through finally blocks in some languages or via destructors in others. * Exceptions can be caught locally to perform cleanup before possibly re-raising the exception. @@ -101,7 +103,7 @@ Up to now we've been talking about exceptions and exception safety, but from a Rust perspective we can just replace this with panics and panic safety. Panics in Rust are currently implemented essentially as a C++ exception under the hood. As a result, **exception safety is something that needs to be handled in Rust -code**. +code today**. One of the primary examples where panics need to be handled in Rust is unsafe code. Let's take a look at an example where this matters: @@ -136,12 +138,15 @@ Rust's design: * Rust doesn't expose uninitialized memory * Panics cannot be caught in a thread * Across threads data is poisoned by default on panics -* Idiomatic Rust must opt in to extra amounts of sharing data across boundaries +* Idiomatic Rust must opt in to extra sharing across boundaries (e.g. `RefCell`) +* Destructors are relatively rare and uninteresting in safe code -With these mitigation tactics, it ends up being the case that **safe Rust code -can mostly ignore exception safety concerns**. 
That being said, it does not mean -that safe Rust code can *always* ignore exception safety issues. There are a -number of methods to subvert the mitigation strategies listed above: +These mitigations all address the *second* aspect of exception unsafety: +observation of broken invariants. With the tactics in place, it ends up being +the case that **safe Rust code can largely ignore exception safety +concerns**. That being said, it does not mean that safe Rust code can *always* +ignore exception safety issues. There are a number of methods to subvert the +mitigation strategies listed above: 1. When poisoning data across threads, antidotes are available to access poisoned data. Namely the [`PoisonError` type][pet] allows safe access to the @@ -156,6 +161,11 @@ number of methods to subvert the mitigation strategies listed above: [pet]: http://doc.rust-lang.org/std/sync/struct.PoisonError.html +But all of these "subversions" fall outside the realm of normal, idiomatic, safe +Rust code, and so they all serve as a "heads up" that panic safety might be an +issue. Thus, in practice, Rust programmers worry about exception safety far less +than in languages with full-blown exceptions. + Despite these methods to subvert the mitigations placed by default in Rust, a key part of exception safety in Rust is that **safe code can never lead to memory unsafety**, regardless of whether it panics or not. Memory unsafety @@ -166,7 +176,7 @@ this RFC. # Detailed design -At its heard, the change this RFC is proposing is to stabilize +At its heart, the change this RFC is proposing is to stabilize `std::thread::catch_panic` after removing the `Send` and `'static` bounds from the closure parameter, modifying the signature to be: @@ -177,50 +187,39 @@ fn catch_panic<F: FnOnce() -> R, R>(f: F) -> thread::Result<R> More generally, however, this RFC also claims that this stable function does not radically alter Rust's exception safety story (explained above).
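As an illustration of what catching a panic within a thread looks like in practice, the sketch below uses `std::panic::catch_unwind`, the function that stable Rust provides today with a closely related signature. That substitution is an assumption made so the example compiles on a current toolchain: `catch_unwind` additionally requires an `UnwindSafe` bound, which is not part of the signature proposed in this revision of the RFC. The `run_task` helper is hypothetical.

```rust
use std::panic;

/// Runs `task`, converting a panic into an `Err` that carries the panic
/// message when the payload is a string, in the style of a thread pool that
/// must survive failing jobs.
fn run_task<F>(task: F) -> Result<i32, String>
where
    F: FnOnce() -> i32 + panic::UnwindSafe,
{
    match panic::catch_unwind(task) {
        Ok(value) => Ok(value),
        Err(payload) => {
            // Panic payloads are type-erased; string messages are the common case.
            let message = if let Some(s) = payload.downcast_ref::<&str>() {
                s.to_string()
            } else if let Some(s) = payload.downcast_ref::<String>() {
                s.clone()
            } else {
                "unknown panic payload".to_string()
            };
            Err(message)
        }
    }
}

fn main() {
    println!("{:?}", run_task(|| 40 + 2));
    println!("{:?}", run_task(|| panic!("boom")));
}
```

Note that the caller only learns "something unexpected went wrong" plus an opaque message, which is consistent with the RFC's position that panics are not a substitute for `Result`.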
-### Exception safety mitigation +## Will Rust have exceptions? -A mitigation strategy for exception safety listed above is that a panic cannot -be caught within a thread, and this change would move that bullet to the list of -"methods to subvert the mitigation strategies" instead. Catching a panic (and -not having `'static` on the bounds list) makes it easier to observe broken -invariants of data structures shared across the `catch_panic` boundary, which -can possibly increase the likelihood of exception safety issues arising. +In a technical sense this RFC is not "adding exceptions to Rust" as they already +exist in the form of panics. What this RFC is adding, however, is a construct +via which to catch these exceptions within a thread, bringing the standard +library closer to the exception support in other languages. -One of the key reasons Rust doesn't provide an exhaustive set of mitigation -strategies is that the design of the language and standard library lead to -idiomatic code not having to worry about exception safety. The use cases for -`catch_panic` are relatively niche, and it is not expected for `catch_panic` to -overnight become the idiomatic method of handling errors in Rust. +Catching a panic (and especially not having `'static` on the bounds list) makes +it easier to observe broken invariants of data structures shared across the +`catch_panic` boundary, which can possibly increase the likelihood of exception +safety issues arising. -Essentially, the addition of `catch_panic`: +The risk of this step is that catching panics becomes an idiomatic way to deal +with error-handling, thereby making exception safety much more of a headache +than it is today. Whereas we intend for the `catch_panic` function to only be +used where it's absolutely necessary, e.g. for FFI boundaries. How do we ensure +that `catch_panic` isn't overused? -* Does not mean that *only now* does Rust code need to consider exception - safety. 
This is something that already must be handled today. -* Does not mean that safe code everywhere must start worrying about exception - safety. This function is not the primary method to signal errors in Rust - (discussed later) and only adds a minor bullet to the list of situations that - safe Rust already needs to worry about exception safety in. +There are two key reasons we don't expect `catch_panic` to become idiomatic: -### Will Rust have exceptions? +1. We have already established very strong conventions around error handling, + and in particular around the use of panic and `Result`, and stabilized usage + around them in the standard library. There is little chance these conventions + would change overnight. -In a technical sense this RFC is not "adding exceptions to Rust" as they -already exist in the form of panics. What this RFC is adding, however, is a -construct via which to catch these exceptions, bringing the standard library -closer to the exception support in other languages. Idiomatic usage of Rust, -however, will continue to follow the guidelines listed below for using a Result -vs using a panic (which also do not need to change to account for this RC). +2. We have long intended to provide an option to treat every use of `panic!` as + an abort, which is motivated by portability, compile time, binary size, and a + number of other factors. Assuming we take this step, it would be extremely + unwise for a library to signal expected errors via panics and rely on + consumers using `catch_panic` to handle them. -It's likely that the `catch_panic` function will only be used where it's -absolutely necessary, like FFI boundaries, instead of a general-purpose error -handling mechanism in all code. - -# Drawbacks - -A drawback of this RFC is that it can water down Rust's error handling story. -With the addition of a "catch" construct for exceptions, it may be unclear to -library authors whether to use panics or `Result` for their error types.
There -are fairly clear guidelines and conventions about using a `Result` vs a `panic` -today, however, and they're summarized below for completeness. +For reference, here's a summary of the conventions around `Result` and `panic`, +which still hold good after this RFC: ### Result vs Panic @@ -229,9 +228,10 @@ today: * `Results` represent errors/edge-cases that the author of the library knew about, and expects the consumer of the library to handle. + * `panic`s represent errors that the author of the library did not expect to - occur, and therefore does not expect the consumer to handle in any particular - way. + occur, such as a contract violation, and therefore does not expect the + consumer to handle in any particular way. Another way to put this division is that: @@ -239,6 +239,7 @@ Another way to put this division is that: information allows them to be handled by the caller of the function producing the error, modified with additional contextual information, and eventually converted into an error message fit for a top-level program. + * `panic`s represent errors that carry no contextual information (except, perhaps, debug information). Because they represented an unexpected error, they cannot be easily handled by the caller of the function or presented to @@ -251,14 +252,13 @@ and writing down `Result` + `try!` is not always the most ergonomic. The pros and cons of `panic` are essentially the opposite of `Result`, being easy to use (nothing to write down other than the panic) but difficult to -determine when a panic can happen or handle it in a custom fashion. - -### Result? Or panic? +determine when a panic can happen or handle it in a custom fashion, even with +`catch_panic`. These divisions justify the use of `panic`s for things like out-of-bounds indexing: such an error represents a programming mistake that (1) the author of -the library was not aware of, by definition, and (2) cannot be easily handled by -the caller. 
+the library was not aware of, by definition, and (2) cannot be meaningfully +handled by the caller. In terms of heuristics for use, `panic`s should rarely if ever be used to report routine errors for example through communication with the system or through IO. @@ -270,11 +270,28 @@ could report the error in terms a they can understand. While the error is rare, **when it happens it is not a programmer error**. In short, panics are roughly analogous to an opaque "an unexpected error has occurred" message. -Another key reason to choose `Result` over a panic is that the compiler is -likely to soon grow an option to map a panic to an abort. This is motivated for -portability, compile time, binary size, and a number of other factors, but it -fundamentally means that a library which signals errors via panics (and relies -on consumers using `catch_panic`) will not be usable in this context. +Stabilizing `catch_panic` does little to change the tradeoffs around `Result` +and `panic` that led to these conventions. + +## Why remove the bounds? + +The main reason to remove the `'static` and `Send` bounds on `catch_panic` is +that they don't actually enforce anything. Using thread-local storage, it's +possible to share mutable data across a call to `catch_panic` even if that data +isn't `'static` or `Send`. And allowing borrowed data, in particular, is helpful +for thread pools that need to execute closures with borrowed data within them; +essentially, the worker threads are executing multiple "semantic threads" over +their lifetime, and the `catch_panic` boundary represents the end of these +"semantic threads". + +# Drawbacks + +A drawback of this RFC is that it can water down Rust's error handling story. +With the addition of a "catch" construct for exceptions, it may be unclear to +library authors whether to use panics or `Result` for their error types. 
As we +discussed above, however, Rust's design around error handling has always had to +deal with these two strategies, and our conventions don't materially change by +stabilizing `catch_panic`. # Alternatives @@ -306,4 +323,9 @@ panics by default (like poisoning) with an ability to opt out (like # Unresolved questions -None currently. +- Is it worth keeping the `'static` and `Send` bounds as a mitigation measure in + practice, even if they aren't enforceable in theory? That would require thread + pools to use unsafe code, but that could be acceptable. + +- Should `catch_panic` be stabilized within `std::thread` where it lives today, + or somewhere else? From 66ea017d3b167b3f747c9751640d0c3623ea363e Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 4 Aug 2015 09:56:14 -0700 Subject: [PATCH 0447/1195] Add link to issue about catch_panic + TLS --- text/0000-stabilize-catch-panic.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/text/0000-stabilize-catch-panic.md b/text/0000-stabilize-catch-panic.md index 43e7365e80c..a4f7371ac55 100644 --- a/text/0000-stabilize-catch-panic.md +++ b/text/0000-stabilize-catch-panic.md @@ -34,8 +34,10 @@ This function will run the closure `f` and if it panics return `Err(Box)`. If the closure doesn't panic it will return `Ok(val)` where `val` is the returned value of the closure. The closure, however, is restricted to only close over `Send` and `'static` data. These bounds can be overly restrictive, and due -to thread-local storage they can be subverted, making it unclear what purpose -they serve. This RFC proposes to remove the bounds as well. +to thread-local storage [they can be subverted][tls-subvert], making it unclear +what purpose they serve. This RFC proposes to remove the bounds as well. 
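The return-value contract described above (an `Err` when the closure panics, `Ok(val)` otherwise) can be sketched as follows. This is only an illustration under stated assumptions: it uses `std::panic::catch_unwind`, the function that current Rust toolchains provide for catching a panic, rather than the `catch_panic` name and `std::thread` location discussed in this RFC, and the helper `run_guarded` is a hypothetical name introduced here.

```rust
use std::panic;

/// Hypothetical helper: runs `f` and turns a panic into an `Err`.
/// Assumes the program is built with unwinding panics; with
/// `panic = "abort"` the process would terminate instead.
fn run_guarded<F>(f: F) -> Result<i32, String>
where
    F: FnOnce() -> i32 + panic::UnwindSafe,
{
    // `catch_unwind` returns `Ok(value)` if the closure completes and
    // `Err(payload)` if it panicked; the payload is discarded here.
    panic::catch_unwind(f).map_err(|_payload| "closure panicked".to_string())
}

fn main() {
    // A closure that returns normally yields `Ok`.
    assert_eq!(run_guarded(|| 1 + 1), Ok(2));

    // A closure that panics yields `Err` instead of unwinding further.
    let failed = run_guarded(|| -> i32 { panic!("boom") });
    assert!(failed.is_err());
}
```

Note that the default panic hook still prints the panic message to standard error even when the panic is caught.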
+ +[tls-subvert]: https://github.com/rust-lang/rust/issues/25662 Historically Rust has purposefully avoided the foray into the situation of catching panics, largely because of a problem typically referred to as From c68967f80b8cca85d27f745f626c41cb9fc66a4b Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 4 Aug 2015 10:09:13 -0700 Subject: [PATCH 0448/1195] Final adjustments --- text/0000-stabilize-catch-panic.md | 33 +++++++++++++++--------------- 1 file changed, 17 insertions(+), 16 deletions(-) diff --git a/text/0000-stabilize-catch-panic.md b/text/0000-stabilize-catch-panic.md index a4f7371ac55..f21ba09eecd 100644 --- a/text/0000-stabilize-catch-panic.md +++ b/text/0000-stabilize-catch-panic.md @@ -11,15 +11,15 @@ bounds from the closure parameter. # Motivation In today's stable Rust it's not possible to catch a panic on the thread that -caused it. There are a number of situations, however, where catching a panic is +caused it. There are a number of situations, however, where this is either required for correctness or necessary for building a useful abstraction: * It is currently defined as undefined behavior to have a Rust program panic across an FFI boundary. For example if C calls into Rust and Rust panics, then this is undefined behavior. Being able to catch a panic will allow writing - C apis in Rust that do not risk aborting the process they are embedded into. + C APIs in Rust that do not risk aborting the process they are embedded into. -* Abstactions like thread pools want to catch the panics of tasks being run +* Abstractions like thread pools want to catch the panics of tasks being run instead of having the thread torn down (and having to spawn a new thread). Stabilizing the `catch_panic` function would enable these two use cases, but @@ -73,7 +73,7 @@ in the face of exceptions, but languages often have constructs to enable these sorts of witnesses. 
Two primary methods of doing so are something akin to finally blocks (code run on a normal or exceptional return) or just catching the exception. In both cases code which later runs that has access to the original -data structure then it will see the broken invariants. +data structure will see the broken invariants. Now that we've got a better understanding of how an exception might cause a bug (e.g. how code can be "exception unsafe"), let's take a look how we can make @@ -203,22 +203,23 @@ safety issues arising. The risk of this step is that catching panics becomes an idiomatic way to deal with error-handling, thereby making exception safety much more of a headache -than it is today. Whereas we intend for the `catch_panic` function to only be -used where it's absolutely necessary, e.g. for FFI boundaries. How do we ensure -that `catch_panic` isn't overused? +than it is today (as it's more likely that a broken invariant is later +witnessed). The `catch_panic` function is intended to only be used +where it's absolutely necessary, e.g. for FFI boundaries, but how can it be +ensured that `catch_panic` isn't overused? -There are two key reasons we don't except `catch_panic` to become idiomatic: +There are two key reasons `catch_panic` likely won't become idiomatic: -1. We have already established very strong conventions around error handling, - and in particular around the use of panic and `Result`, and stabilized usage - around them in the standard library. There is little chance these conventions +1. There are already strong and established conventions around error handling, + and in particular around the use of panic and `Result` with stabilized usage + of them in the standard library. There is little chance these conventions would change overnight. -2. We have long intended to provide an option to treat every use of `panic!` as - an abort, which is motivated by portability, compile time, binary size, and a - number of other factors. 
Assuming we take this step, it would be extremely - unwise for a library to signal expected errors via panics and rely on - consumers using `catch_panic` to handle them. +2. There has long been a desire to treat every use of `panic!` as an abort + which is motivated by portability, compile time, binary size, and a number of + other factors. Assuming this step is taken, it would be extremely unwise for + a library to signal expected errors via panics and rely on consumers using + `catch_panic` to handle them. For reference, here's a summary of the conventions around `Result` and `panic`, which still hold good after this RFC: From 8637d7a429eb478de44ab3194ecc7ef96394487d Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Tue, 4 Aug 2015 13:05:12 -0700 Subject: [PATCH 0449/1195] clarify extreme operator behaviour --- text/0560-integer-overflow.md | 28 +++++++++++++--------------- 1 file changed, 13 insertions(+), 15 deletions(-) diff --git a/text/0560-integer-overflow.md b/text/0560-integer-overflow.md index de14896fe01..205b6dcb489 100644 --- a/text/0560-integer-overflow.md +++ b/text/0560-integer-overflow.md @@ -125,10 +125,14 @@ The error conditions that can arise, and their defined results, are as follows. The intention is that the defined results are the same as the defined results today. The only change is that now a panic may result. -- The operations `+`, `-`, `*`, `/`, `%` can underflow and - overflow. +- The operations `+`, `-`, `*`, can underflow and overflow. When checking is + enabled this will panic. When checking is disabled this will two's complement + wrap. +- The operations `/`, `%` are nonsensical for the arguments `INT_MIN` and `-1`. + When this occurs there is an unconditional panic. - Shift operations (`<<`, `>>`) can shift a value of width `N` by more - than `N` bits. + than `N` bits. This is prevented by unconditionally masking the bits + of the right-hand-side to wrap modulo `N`. 
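The three cases listed above can be observed without depending on build flags by using the explicit integer methods. The following is a minimal sketch, assuming the standard `wrapping_add`, `checked_div`, `checked_rem`, and `wrapping_shl` methods on primitive integers; the helper function names are hypothetical and exist only for illustration.

```rust
/// Two's complement wrapping addition, as when overflow checking is disabled.
fn add_wrapping(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}

/// Division that reports the `INT_MIN / -1` case as `None` instead of panicking.
fn div_checked(a: i32, b: i32) -> Option<i32> {
    a.checked_div(b)
}

/// Shift whose right-hand side is masked to the bit width of the type.
fn shl_masked(x: u32, by: u32) -> u32 {
    x.wrapping_shl(by)
}

fn main() {
    // Overflowing addition wraps around to the minimum value.
    assert_eq!(add_wrapping(i32::MAX, 1), i32::MIN);

    // `INT_MIN / -1` and `INT_MIN % -1` have no representable result.
    assert_eq!(div_checked(i32::MIN, -1), None);
    assert_eq!(i32::MIN.checked_rem(-1), None);

    // Shifting a 32-bit value by 33 masks the amount to 33 % 32 == 1.
    assert_eq!(shl_masked(1, 33), 2);
}
```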
## Enabling overflow checking @@ -145,7 +149,7 @@ potential overflow (and, in particular, for code where overflow is expected and normal, they will be immediately guided to use the wrapping methods introduced below). However, because these checks will be compiled out whenever an optimized build is produced, final code -wilil not pay a performance penalty. +will not pay a performance penalty. In the future, we may add additional means to control when overflow is checked, such as scoped attributes or a global, independent @@ -451,17 +455,7 @@ were: # Unresolved questions -The C semantics of wrapping operations in some cases are undefined: - -- `INT_MIN / -1`, `INT_MIN % -1` -- Shifts by an excessive number of bits - -This RFC takes no position on the correct semantics of these -operations, simply preserving the existing semantics. However, it may -be worth trying to define the wrapping semantics of these operations -in a portable way, even if that implies some runtime cost. Since these -are all error conditions, this is an orthogonal topic to the matter of -overflow. +None today (see Updates section below). # Future work @@ -491,6 +485,10 @@ Since it was accepted, the RFC has been updated as follows: 2. `as` was changed to restore the behavior before the RFC (that is, it truncates to the target bitwidth and reinterprets the highest order bit, a.k.a. sign-bit, as necessary, as a C cast would). +3. Shifts were specified to mask off the bits of over-long shifts. +4. Overflow was specified to be two's complement wrapping (this was mostly + a clarification). +5. `INT_MIN / -1` and `INT_MIN % -1` panics. 
# Acknowledgements and further reading From 2db1328d97151db6cf4b99a7df2a49178c12ab80 Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Tue, 4 Aug 2015 13:10:06 -0700 Subject: [PATCH 0450/1195] add updates section --- text/0982-dst-coercion.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/text/0982-dst-coercion.md b/text/0982-dst-coercion.md index deff76c3f61..92c26884981 100644 --- a/text/0982-dst-coercion.md +++ b/text/0982-dst-coercion.md @@ -176,3 +176,9 @@ indicate the field type which is coerced, for example). # Unresolved questions None + +# Updates since being accepted + +Since it was accepted, the RFC has been updated as follows: + +1. `CoerceUnsized` was specified to ingore PhantomData fields. From 3e400d96becbf74aa05e69764b374e28543121eb Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 4 Aug 2015 15:20:42 -0700 Subject: [PATCH 0451/1195] Expand on why Send/'static bounds are removed --- text/0000-stabilize-catch-panic.md | 26 ++++++++++++++++++-------- 1 file changed, 18 insertions(+), 8 deletions(-) diff --git a/text/0000-stabilize-catch-panic.md b/text/0000-stabilize-catch-panic.md index f21ba09eecd..95a229e4d4d 100644 --- a/text/0000-stabilize-catch-panic.md +++ b/text/0000-stabilize-catch-panic.md @@ -278,14 +278,24 @@ and `panic` that led to these conventions. ## Why remove the bounds? -The main reason to remove the `'static` and `Send` bounds on `catch_panic` is -that they don't actually enforce anything. Using thread-local storage, it's -possible to share mutable data across a call to `catch_panic` even if that data -isn't `'static` or `Send`. And allowing borrowed data, in particular, is helpful -for thread pools that need to execute closures with borrowed data within them; -essentially, the worker threads are executing multiple "semantic threads" over -their lifetime, and the `catch_panic` boundary represents the end of these -"semantic threads". 
+There are a few reasons to remove the `'static` and `Send` bounds on the +`catch_panic` function: + +* One of the primary use cases of `catch_panic` is in an FFI context, where lots + of `*mut` and `*const` pointers are flying around. These two types aren't + `Send` by default, so having their values cross the `catch_panic` boundary + would be highly un-ergonomic (albeit still possible). As a result, this RFC + proposes removing the `Send` bound from the function. + +* A reason to remove the `'static` bound is that it doesn't provide rock-solid + exception-safety mitigation. Using thread-local storage it's possible to + share mutable data across a call to `catch_panic` even if that data isn't + `'static`. + +* Borrowed data, in particular, is helpful for thread pools that need + to execute closures with borrowed data within them; essentially, the worker + threads are executing multiple "semantic threads" over their lifetime, and the + `catch_panic` boundary represents the end of these "semantic threads". # Drawbacks From d0eb19b4034cb1f8ac92cd36166d41c1fa3daf21 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 5 Aug 2015 18:56:17 +0200 Subject: [PATCH 0452/1195] First draft of nonparametric-dropck RFC. --- text/0000-nonparametric-dropck.md | 532 ++++++++++++++++++++++++++++++ 1 file changed, 532 insertions(+) create mode 100644 text/0000-nonparametric-dropck.md diff --git a/text/0000-nonparametric-dropck.md b/text/0000-nonparametric-dropck.md new file mode 100644 index 00000000000..d3cc35af451 --- /dev/null +++ b/text/0000-nonparametric-dropck.md @@ -0,0 +1,532 @@ +- Feature Name: dropck_parametricity +- Start Date: 2015-08-05 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Revise the Drop Check (`dropck`) part of Rust's static analyses in two +ways. In the context of this RFC, these revisions are respectively +named `cannot-assume-parametricity` and `unguarded-escape-hatch`. + + 1. 
`cannot-assume-parametricity` (CAP): Make `dropck` analysis stop + relying on parametricity of type-parameters. + + 2. `unguarded-escape-hatch` (UGEH): Add an attribute (with some name + starting with "unsafe") that a library designer can attach to a + `drop` implementation that will allow a destructor to side-step + the `dropck`'s constraints (unsafely). + +# Motivation + +## Background: Parametricity in `dropck` + +The Drop Check rule (`dropck`) for [Sound Generic Drop][] relies on a +reasoning process that needs to infer that the behavior of a +polymorphic function (e.g. `fn foo<T>`) does not depend on the +concrete type instantiations of any of its *unbounded* type parameters +(e.g. `T` in `fn foo<T>`), at least beyond the behavior of the +destructor (if any) for those type parameters. + +[Sound Generic Drop]: https://github.com/rust-lang/rfcs/blob/master/text/0769-sound-generic-drop.md + +This property is a (weakened) form of a property known in academic +circles as *Parametricity*. +(See e.g. [Reynolds, IFIP 1983][Rey83], [Wadler, FPCA 1989][Wad89].) + + * Parametricity, in this context, essentially says that the compiler + can reason about the body of `foo` (and the subroutines that `foo` + invokes) without having to think about the particular concrete + types that the type parameter `T` is instantiated with. + `foo` cannot do anything with a `t: T` except: + + 1. move `t` to some other owner expecting a `T` or, + + 2. drop `t`, running its destructor and freeing associated resources. + + * For example, this allows the compiler to deduce that even if `T` is + instantiated with a concrete type like `&Vec<u32>`, the body of + `foo` cannot actually read any `u32` data out of the vector. More + details about this are available on the [Sound Generic Drop][] RFC. + +### Reynolds +[Rey83]: #reynolds +John C. Reynolds. "Types, abstraction and parametric polymorphism".
IFIP 1983 +http://www.cse.chalmers.se/edu/year/2010/course/DAT140_Types/Reynolds_typesabpara.pdf + +### Wadler +[Wad89]: #wadler +Philip Wadler. "Theorems for free!". FPCA 1989 +http://ttic.uchicago.edu/~dreyer/course/papers/wadler.pdf + +## "Mistakes were made" + +The parametricity-based reasoning in the +[Drop Check analysis][Sound Generic Drop] (`dropck`) was clever, but +fragile and unproven. + + * Regarding its fragility, it has been shown to have + [bugs][parametricity-insufficient]; in particular, parametricity is + a necessary but *not* sufficient condition to justify the + inferences that `dropck` makes. + + * Regarding its unproven nature, `dropck` violated the heuristic in + Rust's design to not incorporate ideas unless those ideas had + already been proven effective elsewhere. + +[parametricity-insufficient]: https://github.com/rust-lang/rust/issues/26656 + +These issues might alone provide motivation for ratcheting back on +`dropck`'s rules in the short term, putting in a more conservative +rule in the stable release channel while allowing experimentation with +more-aggressive feature-gated rules in the development nightly release +channel. + +However, there is also a specific reason why we want to ratchet back +on the `dropck` analysis as soon as possible. + +## Impl specialization is inherently non-parametric + +The parametricity requirement in the Drop Check rule over-restricts +the design space for future language changes. + +In particular, the [impl specialization] RFC describes a language +change that will allow the invocation of a polymorphic function `f` to +end up in different sequences of code based solely on the concrete +type of `T`, *even* when `T` has no trait bounds within its +declaration in `f`. + +[impl specialization]: https://github.com/rust-lang/rfcs/pull/1210 + +# Detailed design + +Revise the Drop Check (`dropck`) part of Rust's static analyses in two +ways. 
In the context of this RFC, these revisions are respectively +named `cannot-assume-parametricity` (CAP) and `unguarded-escape-hatch` (UGEH). + +Though the revisions are given distinct names, they both fall under +the feature gate `dropck_parametricity`. (Note however that this +might be irrelevant to CAP; see [CAP stabilization details][].) + +## cannot-assume-parametricity + +The heart of CAP is this: make `dropck` analysis stop relying on +parametricity of type-parameters. + +### Changes to the Drop-Check Rule + +The Drop-Check Rule (both in its original form and as revised here) +dictates when a lifetime `'a` must strictly outlive some value `v`, +where `v` owns data of type `D`; the rule gave two circumstances where +`'a` must strictly outlive the scope of `v`. + + * The first circumstance (`D` is directly instantiated at `'a`) + remains unchanged by this RFC. + + * The second circumstance (`D` has some type parameter with + trait-provided methods, i.e. that could be invoked within `Drop`) + is broadened by this RFC to simply say "`D` has some type + parameter." + +That is, under the changes of this RFC, whether the type parameter has +a trait-bound is irrelevant to the Drop-Check Rule. The reason is that +any type parameter, regardless of whether it has a trait bound or not, +may end up participating in [impl specialization], and thus could +expose an otherwise invisible reference `&'a AlreadyDroppedData`. + +`cannot-assume-parametricity` is a breaking change, since the language +will start assuming that a destructor for a data-type definition such +as `struct Parametri<C>` may read from data held in its `C` parameter, +even though the `fn drop` formerly appeared to be parametric with +respect to `C`. This will cause `rustc` to reject code that it had +previously accepted.
+ +### CAP stabilization details +[CAP stabilization details]: #cap-stabilization-details + +`cannot-assume-parametricity` will be incorporated into the beta +and stable Rust channels, to ensure that destructor code atop +stable channels in the wild stops relying on parametricity as soon +as possible. This will enable new language features such as +[impl specialization]. + + * It is not yet clear whether it is feasible to include a warning + cycle for CAP. + + * For now, this RFC is proposing to remove the parts of Drop-Check + that attempted to prove that the `impl<T> Drop` was parametric with + respect to `T`. This would mean that there would be no warning + cycle; `dropck` would simply start rejecting more code. + There would be no way to opt back into the old `dropck` rules. + + * (However, during implementation of this change, we should + double-check whether a warning-cycle is in fact feasible.) + +## unguarded-escape-hatch + +The heart of `unguarded-escape-hatch` (UGEH) is this: Provide a new, +unsafe (and unstable) attribute-based escape hatch for use in the +standard library for cases where Drop Check is too strict. + +### Why we need an escape hatch + +The original motivation for the parametricity special-case in the +original Drop-Check rule was due to an observation that collection +types such as `TypedArena<T>` or `Vec<T>` were often used to +contain values that wanted to refer to each other. + +An example would be an element type like +`struct Concrete<'a>(u32, Cell<Option<&'a Concrete<'a>>>);`, and then +instantiations of `TypedArena<Concrete>` or `Vec<Concrete>`. +This pattern has been used within `rustc`, for example, +to store elements of a linked structure within an arena. + +Without the parametricity special-case, the existence of a destructor +on `TypedArena<T>` or `Vec<T>` led the Drop-Check analysis to conclude +that those destructors might hypothetically read from the references +held within `T` -- forcing `dropck` to reject those destructors.
+ +(Note that `Concrete` itself has no destructor; if it did, then +`dropck`, both as originally stated and under the changes of this RFC, +*would* force the `'a` parameter of any instance to strictly outlive +the instance value, thus ruling out cross-references in the same +`TypedArena` or `Vec`.) + +Of course, the whole point of this RFC is that using parametricity as +the escape hatch seems like it does not suffice. But we still need +*some* escape hatch. + +### The new escape hatch: an unsafe attribute + +This leads us to the second component of the RFC, `unguarded-escape-hatch` (UGEH): +Add an attribute (with a name starting with "unsafe") that a library +designer can attach to a `drop` implementation that will allow a +destructor to side-step the `dropck`'s constraints (unsafely). + +This RFC proposes the attribute name `unsafe_destructor_blind_to_params`. +This name was specifically chosen to be long and ugly; see +[UGEH stabilization details] for further discussion. + +Much like the `unsafe_destructor` attribute that we had in the past, +this attribute relies on the programmer to ensure that the destructor +cannot actually be used unsoundly. It states an (unproven) assumption +that the given implementation of `drop` (and all functions that this +`drop` may transitively calls) will never read or modify a value of +any type parameter, apart from the trivial operations of either +dropping the value or moving the value from one location to another. + + * (In particular, it certainly must not dereference any `&`-reference + within such a value, though this RFC adopts a somewhat stronger + requirement to encourage the attribute to only be used for the + limited case of parametric collection types, where one need not do + anything more than move or drop values.) + +The above assumption must hold regardless of what impact +[impl specialization][] has on the resolution of all function calls.
+ +### UGEH stabilization details +[UGEH stabilization details]: #ugeh-stabilization-details + +The proposed attribute is only a *short-term* patch to work-around a +bug exposed by the combination of two desirable features (namely +[impl specialization] and [`dropck`][Sound Generic Drop]). + +In particular, using the attribute in cases where control-flow in the +destructor can reach functions that may be specialized on a +type-parameter `T` may expose the system to use-after-free scenarios +or other unsound conditions. This may be a non-trivial thing for the +programmer to prove. + + * Short term strategy: The working assumption of this RFC is that the + standard library developers will use the proposed attribute in + cases where the destructor *is* parametric with respect to all type + parameters, even though the compiler cannot currently prove this to + be the case. + + The new attribute will be restricted to non-stable channels, like + any other new feature under a feature-gate. + + * Long term strategy: This RFC does not make any formal guarantees + about the long-term strategy for including an escape hatch. In + particular, this RFC does *not* propose that we stabilize the + proposed attribute. + + It may be possible for future language changes to allow us to + directly express the necessary parametricity properties. + See further discussion in [Alternatives][]. + + The suggested attribute name (`unsafe_destructor_blind_to_params` + above) was deliberately selected to be long and ugly, in order to + discourage it from being stabilized in the future without at least + some significant discussion. (Likewise, the acronym "UGEH" was + chosen for its likely pronunciation "ugh", again a reminder that + we do not *want* to adopt this approach for the long term.)
+ + +## Examples of code changes under the RFC + +This section shows some code examples, starting with code that works +today and must continue to work tomorrow, then showing an example of +code that will start being rejected, and ending with an example of the +UGEH attribute. + +### Examples of code that must continue to work + +Here is some code that works today and must continue to work in the future: + +```rust +use std::cell::Cell; + +struct Concrete<'a>(u32, Cell<Option<&'a Concrete<'a>>>); + +fn main() { + let mut data = Vec::new(); + data.push(Concrete(0, Cell::new(None))); + data.push(Concrete(0, Cell::new(None))); + + data[0].1.set(Some(&data[1])); + data[1].1.set(Some(&data[0])); +} +``` + +In the above, we are building up a vector, pushing `Concrete` elements +onto it, and then later linking those concrete elements together via +optional references held in a cell in each concrete element. + +We can even wrap the vector in a struct that holds it. This also must +continue to work (and will do so under this RFC); such structural +composition is a common idiom in Rust code. + +```rust +use std::cell::Cell; + +struct Concrete<'a>(u32, Cell<Option<&'a Concrete<'a>>>); + +struct Foo<T> { data: Vec<T> } + +fn main() { + let mut foo = Foo { data: Vec::new() }; + foo.data.push(Concrete(0, Cell::new(None))); + foo.data.push(Concrete(0, Cell::new(None))); + + foo.data[0].1.set(Some(&foo.data[1])); + foo.data[1].1.set(Some(&foo.data[0])); +} +``` + +### Examples of code that will start to be rejected + +The main change injected by this RFC is this: due to `cannot-assume-parametricity`, +an attempt to add a destructor to the `struct Foo` above will cause the +code above to be rejected, because we will assume that the destructor for `Foo` +may invoke methods on the concrete elements that dereference their links.
+ +Thus, this code will be rejected: + +```rust +use std::cell::Cell; + +struct Concrete<'a>(u32, Cell<Option<&'a Concrete<'a>>>); + +struct Foo<T> { data: Vec<T> } + +// This is the new `impl Drop` +impl<T> Drop for Foo<T> { + fn drop(&mut self) { } +} + +fn main() { + let mut foo = Foo { data: Vec::new() }; + foo.data.push(Concrete(0, Cell::new(None))); + foo.data.push(Concrete(0, Cell::new(None))); + + foo.data[0].1.set(Some(&foo.data[1])); + foo.data[1].1.set(Some(&foo.data[0])); +} +``` + +NOTE: Based on a preliminary crater run, it seems that mixing together +destructors with this sort of cyclic structure is sufficiently rare +that *no* crates on `crates.io` actually regressed under the new rule: +everything that compiled before the change continued to compile after +it. + +### Example of the unguarded-escape-hatch + +If the developer of `Foo` has access to the feature-gated +escape-hatch, and is willing to assert that the destructor for `Foo` +does nothing with the links in the data, then the developer can work +around the above rejection of the code by adding the corresponding +attribute. + +```rust +#![feature(dropck_parametricity)] +use std::cell::Cell; + +struct Concrete<'a>(u32, Cell<Option<&'a Concrete<'a>>>); + +struct Foo<T> { data: Vec<T> } + +impl<T> Drop for Foo<T> { + #[unsafe_destructor_blind_to_params] // This is the UGEH attribute + fn drop(&mut self) { } +} + +fn main() { + let mut foo = Foo { data: Vec::new() }; + foo.data.push(Concrete(0, Cell::new(None))); + foo.data.push(Concrete(0, Cell::new(None))); + + foo.data[0].1.set(Some(&foo.data[1])); + foo.data[1].1.set(Some(&foo.data[0])); +} +``` + +# Drawbacks + +As should be clear by the tone of this RFC, the +`unguarded-escape-hatch` is clearly a hack. It is subtle and unsafe, +just as `unsafe_destructor` was (and for the most part, the whole +point of [Sound Generic Drop][] was to remove `unsafe_destructor` from +the language). + + * However, the expectation is that most clients will have no need to + ever use the `unguarded-escape-hatch`.
+ + * It may suffice to use the escape hatch solely within the collection + types of `libstd`. + + * Otherwise, if clients outside of `libstd` determine that they *do* + need to be able to write destructors that need to bypass `dropck` + safely, then we can (and *should*) investigate one of the + [sound alternatives][continue supporting parametricity], rather + than stabilize the unsafe hackish escape hatch. + +# Alternatives +[alternatives]: #alternatives + +## CAP without UGEH + +One might consider adopting `cannot-assume-parametricity` without +`unguarded-escape-hatch`. However, unless some other sort of escape +hatch were added, this path would break much more code. + +## UGEH for lifetime parameters + +Since we're already being unsafe here, one might consider having +the `unsafe_destructor_blind_to_params` apply to lifetime parameters +as well as type parameters. + +However, given that the `unsafe_destructor_blind_to_params` attribute +is only intended as a short-term band-aid (see +[UGEH stabilization details][]) it seems better to just make it only as +broad as it needs to be (and no broader). + +## "Sort-of Guarded" Escape Hatch + +We could add the escape hatch but continue employing the current +dropck analysis to it. This would essentially mean that code would have +to apply the unsafe attribute to be considered for parametricity, but +if there were obvious problems (namely, if the type parameter had a trait bound) +then the attempt to opt into parametricity would be ignored and the +strict ordering restrictions on the lifetimes would be imposed. + +I only mention this because it occurred to me in passing; I do not +really think it has much of a benefit. It would potentially lead +someone to think that their code has been proven sound (since the +`dropck` would catch some mistakes in programmer reasoning) but the +pitfalls with respect to specialization would remain.
+ +## Continue Supporting Parametricity +[continue supporting parametricity]: #continue-supporting-parametricity +There may be ways to revise the language so that functions can declare +that they must be parametric with respect to their type parameters. +Here we sketch two potential ideas for how one might do this, mostly to +give a hint of why this is not a trivial change to the language. + +Neither design is likely to be adopted, at least as described here, +because both of them impose significant burdens on implementors of +parametric destructors, as we will see. + +(Also, if we go down this path, we will need to fix other bugs in the +Drop Check rule, where, as previously noted, parametricity is a +necessary but *insufficient* condition for soundness.) + +### Parametricity via effect-system attributes + +One feature of the [impl specialization] RFC is that all functions that +can be specialized must be declared as such, via the `default` keyword. + +This leads us to one way that a function could declare that its body +must not be allowed to call into specialized methods: an attribute like +`#[unspecialized]`. The `#[unspecialized]` attribute, when applied to +a function `fn foo()`, would mean two things: + + * `foo` is not allowed to call any functions that have the `default` keyword. + + * `foo` is only allowed to call functions that are also marked `#[unspecialized]` + +All `fn drop` methods would be required to be `#[unspecialized]`. + +It is the second bullet that makes this an ad-hoc effect system: it provides +a recursive property that ensures that during the extent of the call to `foo`, +we will never invoke a function marked as `default`. + +It is also this second bullet that represents a significant burden on +the destructor implementor. In particular, it immediately rules out +using any library routine unless that routine has been marked as +`#[unspecialized]`.
The attribute is unlikely to be included on any +function unless its developer is making a destructor that calls it +in tandem. + +### Parametricity via some `?`-bound + +Another approach starts from another angle: As described earlier, +parametricity in `dropck` is the requirement that `fn drop` cannot do +anything with a `t: T` (where `T` is some relevant type parameter) +except: + + 1. move `t` to some other owner expecting a `T` or, + + 2. drop `t`, running its destructor and freeing associated resources. + +So, perhaps it would be more natural to express this requirement +via a bound. + +We would start with the assumption that functions may be +non-parametric (and thus their implementations may be specialized to +specific types). + +But then if you want to declare a function as having a stronger +constraint on its behavior (and thus expanding its potential callers +to ones that require parametricity), you could add a bound `T: ?Special`. + +The Drop-check rule would treat `T: ?Special` type-parameters as parametric, +and other type-parameters as non-parametric. + +The marker trait `Special` would be an OIBIT that all sized types would get. + +Any expression in the context of a type-parameter binding of the form +`<T: ?Special>` would not be allowed to call any `default` method +where `T` could affect the specialization process. + +(The careful reader will probably notice the potential sleight-of-hand +here: is this really any different from the effect-system attributes +proposed earlier? Perhaps not, though it seems likely that the finer +grain parameter-specific treatment proposed here is more expressive, +at least in theory.) + +Like the previous proposal, this design represents a significant +burden on the destructor implementor: Again, the `T: ?Special` +attribute is unlikely to be included on any function unless its +developer is making a destructor that calls it in tandem. + +# Unresolved questions + + * What name to use for the attribute?
+ Is `unsafe_destructor_blind_to_params` sufficiently long and ugly? ;) + + * What is the real long-term plan? + + * Should we consider merging the discussion of alternatives + into the [impl specialization] RFC? From 145b46ddfda7278e5ffd4fe3a882d288b444bb38 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 5 Aug 2015 20:15:23 +0200 Subject: [PATCH 0453/1195] fix a typo. --- text/0000-nonparametric-dropck.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-nonparametric-dropck.md b/text/0000-nonparametric-dropck.md index d3cc35af451..86113d334a0 100644 --- a/text/0000-nonparametric-dropck.md +++ b/text/0000-nonparametric-dropck.md @@ -211,7 +211,7 @@ Much like the `unsafe_destructor` attribute that we had in the past, this attribute relies on the programmer to ensure that the destructor cannot actually be used unsoundly. It states an (unproven) assumption that the given implementation of `drop` (and all functions that this -`drop` may transitively calls) will never read or modify a value of + `drop` may transitively call) will never read or modify a value of any type parameter, apart from the trivial operations of either dropping the value or moving the value from one location to another. From 20d727324148af7ea56ef81dc94e601d595663a1 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 5 Aug 2015 20:17:37 +0200 Subject: [PATCH 0454/1195] move paper references to a separate bibliography section. --- text/0000-nonparametric-dropck.md | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-) diff --git a/text/0000-nonparametric-dropck.md b/text/0000-nonparametric-dropck.md index 86113d334a0..de4c008df73 100644 --- a/text/0000-nonparametric-dropck.md +++ b/text/0000-nonparametric-dropck.md @@ -49,16 +49,6 @@ circles as *Parametricity*. `foo` cannot actually read any `u32` data out of the vector. More details about this are available on the [Sound Generic Drop][] RFC. 
-### Reynolds -[Rey83]: #reynolds -John C. Reynolds. "Types, abstraction and parametric polymorphism". IFIP 1983 -http://www.cse.chalmers.se/edu/year/2010/course/DAT140_Types/Reynolds_typesabpara.pdf - -### Wadler -[Wad89]: #wadler -Philip Wadler. "Theorems for free!". FPCA 1989 -http://ttic.uchicago.edu/~dreyer/course/papers/wadler.pdf - ## "Mistakes were made" The parametricity-based reasoning in the @@ -530,3 +520,16 @@ developer is making a destructor that calls it in tandem. * Should we consider merging the discussion of alternatives into the [impl specialization] RFC? + +# Bibliography + +### Reynolds +[Rey83]: #reynolds +John C. Reynolds. "Types, abstraction and parametric polymorphism". IFIP 1983 +http://www.cse.chalmers.se/edu/year/2010/course/DAT140_Types/Reynolds_typesabpara.pdf + +### Wadler +[Wad89]: #wadler +Philip Wadler. "Theorems for free!". FPCA 1989 +http://ttic.uchicago.edu/~dreyer/course/papers/wadler.pdf + From ac82bb8ce17773ac71c8c8c8ca1a3ebedec4b7a1 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 5 Aug 2015 20:18:34 +0200 Subject: [PATCH 0455/1195] fix a typo --- text/0000-nonparametric-dropck.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-nonparametric-dropck.md b/text/0000-nonparametric-dropck.md index de4c008df73..0153f8c1b42 100644 --- a/text/0000-nonparametric-dropck.md +++ b/text/0000-nonparametric-dropck.md @@ -96,7 +96,7 @@ named `cannot-assume-parametricity` (CAP) and `unguarded-escape-hatch` (UGEH). Though the revisions are given distinct names, they both fall under the feature gate `dropck_parametricity`. (Note however that this -might be irrelevant to CAP; see [CAP stabilization details][]. +might be irrelevant to CAP; see [CAP stabilization details][]). ## cannot-assume-parametricity From f69b3e220183912f6928e4486a6b9ec57faa2922 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 5 Aug 2015 20:22:39 +0200 Subject: [PATCH 0456/1195] add fwd links to examples. 
--- text/0000-nonparametric-dropck.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/text/0000-nonparametric-dropck.md b/text/0000-nonparametric-dropck.md index 0153f8c1b42..4c04709197b 100644 --- a/text/0000-nonparametric-dropck.md +++ b/text/0000-nonparametric-dropck.md @@ -129,7 +129,9 @@ will start assuming that a destructor for a data-type definition such as `struct Parametri` may read from data held in its `C` parameter, even though the `fn drop` formerly appeared to be parametric with respect to `C`. This will cause `rustc` to reject code that it had -previously accepted. +previously accepted (below are some examples that +[continue to work][examples-continue-to-work] and +some that [start being rejected][examples-start-reject]). ### CAP stabilization details [CAP stabilization details]: #cap-stabilization-details @@ -261,6 +263,7 @@ code that will start being rejected, and ending with an example of the UGEH attribute. ### Examples of code that must continue to work +[examples-continue-to-work]: #examples-of-code-that-must-continue-to-work Here is some code that works today and must continue to work in the future: @@ -305,6 +308,7 @@ fn main() { ``` ### Examples of code that will start to be rejected +[examples-start-reject]: #examples-of-code-that-will-start-to-be-rejected The main change injected by this RFC is this: due to `cannot-assume-parametricity`, an attempt to add a destructor to the `struct Foo` above will cause the @@ -342,6 +346,7 @@ everything that compiled before the change continued to compile after it. ### Example of the unguarded-escape-hatch +[examples-escape-hatch]: #example-of-the-unguarded-escape-hatch If the developer of `Foo` has access to the feature-gated escape-hatch, and is willing to assert that the destructor for `Foo` From b506941d66c3ef7b7d8ba072772898271d97f7a4 Mon Sep 17 00:00:00 2001 From: "Felix S. 
Klock II" Date: Wed, 5 Aug 2015 20:27:59 +0200 Subject: [PATCH 0457/1195] a more specific link to the relevant alternative --- text/0000-nonparametric-dropck.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-nonparametric-dropck.md b/text/0000-nonparametric-dropck.md index 4c04709197b..a231bac4570 100644 --- a/text/0000-nonparametric-dropck.md +++ b/text/0000-nonparametric-dropck.md @@ -245,7 +245,7 @@ programmer to prove. It may be possible for future language changes to allow us to directly express the necessary parametricity properties. - See further discussion in [Alternatives][]. + See further discussion in the [continue supporting parametricity][] alternative. The suggested attribute name (`unsafe_destructor_blind_to_params` above) was deliberately selected to be long and ugly, in order to From 722049498f910d7415024db3b2a9931c8beb7349 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 5 Aug 2015 20:29:01 +0200 Subject: [PATCH 0458/1195] add link to the relevant parametricity-insufficient bit --- text/0000-nonparametric-dropck.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-nonparametric-dropck.md b/text/0000-nonparametric-dropck.md index a231bac4570..49fb5419081 100644 --- a/text/0000-nonparametric-dropck.md +++ b/text/0000-nonparametric-dropck.md @@ -445,7 +445,7 @@ parametric destructors, as we will see. (Also, if we go down this path, we will need to fix other bugs in the Drop Check rule, where, as previously noted, parametricity is a -necessary but *insufficient* condition for soundness.) +[necessary but *insufficient* condition][parametricity-insufficient] for soundness.) ### Parametricity via effect-system attributes From a576f8049ac57783057ced707109ccf369b7b085 Mon Sep 17 00:00:00 2001 From: "Felix S. 
Klock II" Date: Wed, 5 Aug 2015 20:37:32 +0200 Subject: [PATCH 0459/1195] fix poor wording --- text/0000-nonparametric-dropck.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-nonparametric-dropck.md b/text/0000-nonparametric-dropck.md index 49fb5419081..e250565ef65 100644 --- a/text/0000-nonparametric-dropck.md +++ b/text/0000-nonparametric-dropck.md @@ -464,7 +464,7 @@ a function `fn foo()`, would mean two things: All `fn drop` methods would be required to be `#[unspecialized]`. It is the second bullet that makes this an ad-hoc effect system: it provides -a recursive property that ensures that during the extent of the call to `foo`, +a recursive property ensuring that during the extent of the call to `foo`, we will never invoke a function marked as `default`. It is also this second bullet that represents a signficant burden on From 8e4e55e2573be990077bb82aa89332e92ace8751 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 5 Aug 2015 20:38:26 +0200 Subject: [PATCH 0460/1195] fill in a final missing step in the reasoning presented. --- text/0000-nonparametric-dropck.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/text/0000-nonparametric-dropck.md b/text/0000-nonparametric-dropck.md index e250565ef65..aab682378ba 100644 --- a/text/0000-nonparametric-dropck.md +++ b/text/0000-nonparametric-dropck.md @@ -465,7 +465,8 @@ All `fn drop` methods would be required to be `#[unspecialized]`. It is the second bullet that makes this an ad-hoc effect system: it provides a recursive property ensuring that during the extent of the call to `foo`, -we will never invoke a function marked as `default`. +we will never invoke a function marked as `default` (and therefore, I *think*, +will never even potentially invoke a method that has been specialized). It is also this second bullet that represents a signficant burden on the destructor implementor. 
In particular, it immediately rules out From 5614b94a69b6f4f68d88da93bdf288dd17fe3890 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Thu, 6 Aug 2015 08:55:11 -0400 Subject: [PATCH 0461/1195] Add text documenting the impact on crates.io --- text/0000-projections-lifetimes-and-wf.md | 120 +++++++++++++++++----- 1 file changed, 97 insertions(+), 23 deletions(-) diff --git a/text/0000-projections-lifetimes-and-wf.md b/text/0000-projections-lifetimes-and-wf.md index feff7ba7acd..68ba942ed1b 100644 --- a/text/0000-projections-lifetimes-and-wf.md +++ b/text/0000-projections-lifetimes-and-wf.md @@ -55,17 +55,17 @@ flexible, so that e.g. we can deduce that `T::Foo: 'a` if `T: 'a`, and similarly that `T::Foo` is well-formed if `T` is well-formed. As a bonus, the new rules are also sound. ;) -**The outlives relation is simpler.** The older definition for the -outlives relation `T: 'a` was rather subtle. The new rule basically -says that if all type/lifetime parameters appearing in the type `T` -must outlive `'a`, then `T: 'a` (though there can also be other ways -for us to decide that `T: 'a` is valid, such as in-scope where -clauses). So for example `fn(&'x X): 'a` if `'x: 'a` and `X: 'a` -(presuming that `X` is a type parameter). The older rules were based -on what kind of data was actually *reachable*, and hence accepted this -type (since no data of `&'x X` is reachable from a function pointer). -This change primarily affects struct declarations, since they may now -require additional outlives bounds: +**Simpler outlives relation.** The older definition for the outlives +relation `T: 'a` was rather subtle. The new rule basically says that +if all type/lifetime parameters appearing in the type `T` must outlive +`'a`, then `T: 'a` (though there can also be other ways for us to +decide that `T: 'a` is valid, such as in-scope where clauses). So for +example `fn(&'x X): 'a` if `'x: 'a` and `X: 'a` (presuming that `X` is +a type parameter). 
The older rules were based on what kind of data was +actually *reachable*, and hence accepted this type (since no data of +`&'x X` is reachable from a function pointer). This change primarily +affects struct declarations, since they may now require additional +outlives bounds: ```rust // OK now, but after this RFC requires `X: 'a`: @@ -89,7 +89,7 @@ doesn't check this in associated type definitions: ```rust impl Iterator for SomethingElse { - type Item = SomeStruct; // OK now, not after this RFC + type Item = SomeStruct; // accepted now, not after this RFC } ``` @@ -107,16 +107,86 @@ traits like the following were found in the wild: ```rust trait Foo { // currently accepted, but should require that Self: Sized - fn method(&self, value: Option) { - // note: default implement here - } + fn method(&self, value: Option); } ``` -Because this method supplies a default implementation, it requires -that the argument types are well-formed, which in turn means that -`Self: Sized` must hold. But for some reason (of which I am actually -not entirely sure) this is not checked now. +To be well-formed, an `Option` type requires that `T: Sized`. In +this case, though `T=Self`, and `Self` is not `Sized` by +default. Therefore, this trait should be declared `trait Foo: Sized` +to be legal. The compiler is currently *attempting* to enforce these +rules, but many cases were overlooked in practice. + +### Impact on crates.io + +This RFC has been largely implemented and tested against crates.io. A +[total of 43 (root) crates are affected][crater-all] by the +changes. Interestingly, **the vast majority of warnings/errors that +occur are not due to new rules introduced by this RFC**, but rather +due to older rules being more correctly enforced. + +Of the affected crates, **40 are receiving future compatibility +warnings and hence continue to build for the time being**. 
In the +[remaining three cases][crater-errors], it was not possible to isolate +the effects of the new rules, and hence the compiler reports an error +rather than a future compatibility warning. + +What follows is a breakdown of the reason that crates on crates.io are +receiving errors or warnings. Each row in the table corresponds to one +of the explanations above. + +Problem | Future-compat. warnings | Errors | +----------------------------- | ----------------------- | ------ | +More types are sanity checked | 35 | 3 | +Simpler outlives relation | 5 | | + +As you can see, by far the largest source of problems is simply that +we are now sanity checking more types. This was always the intent, but +there were bugs in the compiler that led to it either skipping +checking altogether or only partially applying the rules. It is +interesting to drill down a bit further into the 38 warnings/errors +that resulted from more types being sanity checked in order to see +what kinds of mistakes are being caught: + +Case | Problem | Number | +---- | ----------------------------- | ------ | + 1 | `Self: Sized` required | 26 | + 2 | `Foo: Bar` required | 11 | + 3 | Not object safe | 1 | + +An example of each case follows: + +**Cases 1 and 2.** In the compiler today, types appearing in trait methods +are incompletely checked. This leads to a lot of traits with +insufficient bounds. By far the most common example was that the +`Self` parameter would appear in a context where it must be sized, +usually when it is embedded within another type (e.g., +`Option`). Here is an example: + +```rust +trait Test { + fn test(&self) -> Option; + // ~~~~~~~~~~~~ + // Incorrectly permitted before. 
+} +``` + +Because `Option` requires that `T: Sized`, this trait should be +declared as follows: + +```rust +trait Test: Sized { + fn test(&self) -> Option; +} +``` + +**Case 2.** Case 2 is the same as case 1, except that the missing +bound is some trait other than `Sized`, or in some cases an outlives +bound like `T: 'a`. + +**Case 3.** The compiler currently permits non-object-safe traits to +be used as types, even if objects could never actually be created +([#21953]). ### Projections and the outlives relation @@ -148,10 +218,10 @@ still lead to [annoying errors in some situations][#23442]. Finding a better solution has been on the agenda for some time. Simultaneously, we realized in [#24622] that the compiler had a bug -that caused it erroneously assume that every projection like `I::Item` -outlived the current function body, just as it assumes that type -parameters like `I` outlive the current function body. **This bug can -lead to unsound behavior.** Unfortunately, simply implementing the +that caused it to erroneously assume that every projection like +`I::Item` outlived the current function body, just as it assumes that +type parameters like `I` outlive the current function body. **This bug +can lead to unsound behavior.** Unfortunately, simply implementing the naive fix for #24622 exacerbates the shortcomings of the current rules for projections, causing widespread compilation failures in all sorts of reasonable and obviously correct code. @@ -978,3 +1048,7 @@ then `R ⊢ P': 'a`. Proceed by induction and by cases over the form of `P`: 'a`. In other words, if all the type/lifetime parameters that appear in a type outlive `'a`, then the type outlives `'a`. Follows by inspection of the outlives rules. 
+ +[crater-errors]: https://gist.github.com/nikomatsakis/2f851e2accfa7ba2830d#root-regressions-sorted-by-rank +[crater-all]: https://gist.github.com/nikomatsakis/364fae49de18268680f2#root-regressions-sorted-by-rank +[#21953]: https://github.com/rust-lang/rust/issues/21953 From b17fdb8aa491834f8f0fe2bbefa6d87597714404 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Thu, 6 Aug 2015 15:14:59 -0400 Subject: [PATCH 0462/1195] pnkfelix nits --- text/0000-projections-lifetimes-and-wf.md | 23 +++++++++++++++-------- 1 file changed, 15 insertions(+), 8 deletions(-) diff --git a/text/0000-projections-lifetimes-and-wf.md b/text/0000-projections-lifetimes-and-wf.md index 68ba942ed1b..7499b062fe4 100644 --- a/text/0000-projections-lifetimes-and-wf.md +++ b/text/0000-projections-lifetimes-and-wf.md @@ -412,8 +412,8 @@ These are inference rules written in a primitive ASCII notation. :) As part of defining the outlives relation, we need to track the set of lifetimes that are bound within the type we are looking at. Let's call that set `R=`. Initially, this set `R` is empty, but it -will grow as we traverse through types like fns or objects, which can -bind region names. +will grow as we traverse through types like fns or object fragments, +which can bind region names via `for<..>`. #### Simple outlives rules @@ -455,22 +455,29 @@ or projections are involved: The outlives relation for lifetimes depends on whether the lifetime in question was bound within a type or not. In the usual case, we decide -the relationship between two lifetimes by consulting the environment. -Lifetimes representing scopes within the current fn have a -relationship derived from the code itself, while lifetime parameters -have relationships defined by where-clauses and implied bounds. +the relationship between two lifetimes by consulting the environment, +or using the reflexive property. 
Lifetimes representing scopes within +the current fn have a relationship derived from the code itself, while +lifetime parameters have relationships defined by where-clauses and +implied bounds. + OutlivesRegionEnv: 'x ∉ R // not a bound region ('x: 'a) in Env // derivable from where-clauses etc -------------------------------------------------- R ⊢ 'x: 'a - + + OutlivesRegionReflexive: + -------------------------------------------------- + R ⊢ 'a: 'a + For higher-ranked lifetimes, we simply ignore the relation, since the -lifetime is not yet known. This means for example that `fn<'a> fn(&'a +lifetime is not yet known. This means for example that `for<'a> fn(&'a i32): 'x` holds, even though we do not yet know what region `'a` is (and in fact it may be instantiated many times with different values on each call to the fn). + OutlivesRegionBound: 'x ∈ R // bound region -------------------------------------------------- R ⊢ 'x: 'a From af2135286944a53fa9ecd5e84713002cfa51309b Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Thu, 6 Aug 2015 15:19:28 -0400 Subject: [PATCH 0463/1195] correct some out of date text that Ralf identified --- text/0000-projections-lifetimes-and-wf.md | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/text/0000-projections-lifetimes-and-wf.md b/text/0000-projections-lifetimes-and-wf.md index 7499b062fe4..1d153100300 100644 --- a/text/0000-projections-lifetimes-and-wf.md +++ b/text/0000-projections-lifetimes-and-wf.md @@ -710,14 +710,15 @@ lifetime names. Let's start with the rule for fn types: -------------------------------------------------- R ⊢ for fn(T1..Tn) -> T0 WF -Basically, this rule says that a `fn` type is *always* WF, regardless -of what types it references. This certainly accepts a type like -`for<'a> fn(x: &'a T)`. However, it also accepts some types that it -probably shouldn't. 
Consider for example if we had a type like -`NoHash` that is not hashable; in that case, it'd be nice if -`fn(HashMap)` were not considered well-formed. But these -rules would accept it, because `HashMap` appears inside a -fn signature. +Basically, this rule adds the bound lifetimes to the set `R` and then +checks whether the argument and return type are well-formed. We'll see +in the next section that means that any requirements on those types +which reference bound identifiers are just assumed to hold, but the +remainder are checked. For example, if we have a type `HashSet` +which requires that `K: Hash`, then `fn(HashSet)` would be +illegal since `NoHash: Hash` does not hold, but `for<'a> +fn(HashSet<&'a NoHash>)` *would* be legal, since `&'a NoHash: Hash` +involves a bound region `'a`. See the next section for details. Note that `fn` types do not require that `T0..Tn` be `Sized`. This is intentional. The limitation that only sized values can be passed as From 47f6ae9c5f3c935b1d943da5366fbd4531a89e35 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Thu, 6 Aug 2015 16:01:19 -0700 Subject: [PATCH 0464/1195] Use the platform-intrinsic ABI instead of rust-intrinsic. --- text/0000-simd-infrastructure.md | 29 ++++++++++++++++++++--------- 1 file changed, 20 insertions(+), 9 deletions(-) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index cdfd8057c34..40aba05464c 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -1,4 +1,4 @@ -- Feature Name: simd_basics, cfg_target_feature +- Feature Name: simd_basics, platform_intrinsics, cfg_target_feature - Start Date: 2015-06-02 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) @@ -45,7 +45,7 @@ those features enabled. 
The design comes in three parts, all on the path to stabilisation: - types (`feature(simd_basics)`) -- operations (`feature(simd_basics)`) +- operations (`feature(platform_intrinsics)`) - platform detection (`feature(cfg_target_feature)`) The general idea is to avoid bad performance cliffs, so that an @@ -116,8 +116,10 @@ intrinsics would be on the path to stabilisation (that is, one can "import" them with `extern` in stable code), and would not be exported by `std`. +Example: + ```rust -extern "rust-intrinsic" { +extern "platform-intrinsic" { fn x86_mm_abs_epi16(a: Simd8) -> Simd8; // ... } @@ -144,7 +146,7 @@ SIMD vector of bytes, struct A(u8, u8, ..., u8); struct B(u8, u8, ..., u8); -extern "rust-intrinsic" { +extern "platform-intrinsic" { fn add_aaa(x: A, y: A) -> A; // ok fn add_bbb(x: B, y: B) -> B; // ok fn add_aab(x: A, y: A) -> B; // error, expected B, found A @@ -169,6 +171,16 @@ but will be shimmed as efficiently as possible. - arithmetic - conversions +All of these intrinsics are imported via an `extern` directive similar +to the process for pre-existing intrinsics like `transmute`, however, +the SIMD operations are provided under a special ABI: +`platform-intrinsic`. Use of this ABI (and hence the intrinsics) is +initially feature-gated under the `platform_intrinsics` feature +name. Why `platform-intrinsic` rather than say `simd-intrinsic`? There +are non-SIMD platform-specific instructions that may be nice to expose +(for example, Intel defines an `_addcarry_u32` intrinsic corresponding +to the `ADC` instruction). + ### Shuffles & element operations One of the most powerful features of SIMD is the ability to rearrange @@ -185,7 +197,7 @@ shuffles without having to understand all the details of every platform specific intrinsic for shuffling. 
```rust -extern "rust-intrinsic" { +extern "platform-intrinsic" { fn simd_shuffle2(v: T, w: T, i0: u32, i1: u32) -> Simd2; fn simd_shuffle4(v: T, w: T, i0: u32, i1: u32, i2: u32, i3: u32) -> Sidm4; fn simd_shuffle8(v: T, w: T, @@ -226,7 +238,7 @@ vectors are provided, to allow modelling the SIMD vectors as actual CPU registers as much as possible: ```rust -extern "rust-intrinsic" { +extern "platform-intrinsic" { fn simd_insert(v: T, i0: u32, elem: Elem) -> T; fn simd_extract(v: T, i0: u32) -> Elem; } @@ -245,7 +257,7 @@ return vectors, as required. The raw signatures would look like: ```rust -extern "rust-intrinsic" { +extern "platform-intrinsic" { fn simd_eq(v: T, w: T) -> U; fn simd_ne(v: T, w: T) -> U; fn simd_lt(v: T, w: T) -> U; @@ -266,7 +278,7 @@ Intrinsics will be provided for arithmetic operations like addition and multiplication. ```rust -extern { +extern "platform-intrinsic" { fn simd_add(x: T, y: T) -> T; fn simd_mul(x: T, y: T) -> T; // ... @@ -363,7 +375,6 @@ cfg_if_else! { # Alternatives -- The SIMD on-route-to-stable intrinsics could have their own ABI - Intrinsics could instead by namespaced by ABI, `extern "x86-intrinsic"`, `extern "arm-intrinsic"`. - There could be more syntactic support for shuffles, either with true From 4a4e6aefe43b0912b233c29d621e879c978c3875 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Thu, 6 Aug 2015 16:03:06 -0700 Subject: [PATCH 0465/1195] feature(simd_basics) -> feature(repr_simd) This feature gate now only applies to the attribute, so it might as well be more specific. 
--- text/0000-simd-infrastructure.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index 40aba05464c..a3527afe8dd 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -1,4 +1,4 @@ -- Feature Name: simd_basics, platform_intrinsics, cfg_target_feature +- Feature Name: repr_simd, platform_intrinsics, cfg_target_feature - Start Date: 2015-06-02 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) @@ -44,7 +44,7 @@ those features enabled. The design comes in three parts, all on the path to stabilisation: -- types (`feature(simd_basics)`) +- types (`feature(repr_simd)`) - operations (`feature(platform_intrinsics)`) - platform detection (`feature(cfg_target_feature)`) From 3fdbb607294b5fd0ddc6985d8dda299d0f09fba9 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Thu, 6 Aug 2015 17:27:51 -0700 Subject: [PATCH 0466/1195] References into repr(packed) structs should be `unsafe`. --- text/0000-repr-packed-unsafe-ref.md | 438 ++++++++++++++++++++++++++++ 1 file changed, 438 insertions(+) create mode 100644 text/0000-repr-packed-unsafe-ref.md diff --git a/text/0000-repr-packed-unsafe-ref.md b/text/0000-repr-packed-unsafe-ref.md new file mode 100644 index 00000000000..473c7a991e1 --- /dev/null +++ b/text/0000-repr-packed-unsafe-ref.md @@ -0,0 +1,438 @@ +- Feature Name: NA +- Start Date: 2015-08-06 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Taking a reference into a struct marked `repr(packed)` should become +`unsafe`, because it can lead to undefined behaviour. `repr(packed)` +structs need to be banned from storing `Drop` types for this reason. + +# Motivation + +Issue [#27060](https://github.com/rust-lang/rust/issues/27060) noticed +that it was possible to trigger undefined behaviour in safe code via +`repr(packed)`, by creating references `&T` which don't satisfy the +expected alignment requirements for `T`. 
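To make the misalignment concrete, here is a minimal sketch in present-day Rust. The struct `Foo` and the helper `offset_of_y` are hypothetical names chosen for illustration, and `std::ptr::addr_of!` is a later addition to the language than this RFC. It is used here only because it yields a raw pointer to a packed field without ever creating a reference to it.

```rust
use std::mem::{align_of, size_of};
use std::ptr::addr_of;

// Hypothetical packed struct: no padding is inserted between `x` and `y`.
#[repr(C, packed)]
struct Foo {
    x: u8,
    y: u32,
}

/// Byte offset of `y` inside `foo`, computed from raw addresses only,
/// so that no `&u32` to the (possibly misaligned) field is ever formed.
fn offset_of_y(foo: &Foo) -> usize {
    let base = foo as *const Foo as usize;
    let field = addr_of!(foo.y) as usize;
    field - base
}

fn main() {
    let foo = Foo { x: 1, y: 2 };
    let offset = offset_of_y(&foo);

    // Copy the fields out by value before printing: `println!` takes
    // references to its arguments, which is exactly what must be avoided
    // for a packed field such as `y`.
    let (x, y) = (foo.x, foo.y);

    println!(
        "size_of::<Foo>() = {}, offset of y = {}, align_of::<u32>() = {}",
        size_of::<Foo>(),
        offset,
        align_of::<u32>()
    );
    println!("x = {}, y = {}", x, y);
}
```

Whenever the offset of `y` is not a multiple of `align_of::<u32>()`, a `&u32` pointing at that field could not satisfy the alignment the compiler assumes for references to `u32`.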
+ +Concretely, the compiler assumes that any reference (or raw pointer, +in fact) will be aligned to at least `align_of::<T>()`, i.e. the +following snippet should run successfully: + +```rust +let some_reference: &T = /* arbitrary code */; + +let actual_address = some_reference as *const _ as usize; +let align = std::mem::align_of::<T>(); + +assert_eq!(actual_address % align, 0); +``` + +However, `repr(packed)` allows one to violate this, by creating values +of arbitrary types that are stored at "random" byte addresses, by +removing the padding normally inserted to maintain alignment in +`struct`s. E.g. suppose there's a struct `Foo` defined like +`#[repr(packed, C)] struct Foo { x: u8, y: u32 }`, and there's an +instance of `Foo` allocated at `0x1000`; the `u32` will then be placed at +`0x1001`, which isn't 4-byte aligned (the alignment of `u32`). + +Issue #27060 has a snippet which crashes at runtime on at least two +x86-64 CPUs (the author's and the one playpen runs on) and almost +certainly most other platforms. + +```rust +#![feature(simd, test)] + +extern crate test; + +// simd types require high alignment or the CPU faults +#[simd] +#[derive(Debug, Copy, Clone)] +struct f32x4(f32, f32, f32, f32); + +#[repr(packed)] +#[derive(Copy, Clone)] +struct Unalign<T>(T); + +struct Breakit { + x: u8, + y: Unalign<f32x4> +} + +fn main() { + let val = Breakit { x: 0, y: Unalign(f32x4(0.0, 0.0, 0.0, 0.0)) }; + + test::black_box(&val); + + println!("before"); + + let ok = val.y; + test::black_box(ok.0); + + println!("middle"); + + let bad = val.y.0; + test::black_box(bad); + + println!("after"); +} +``` + +On playpen, it prints: + +``` +before +middle +playpen: application terminated abnormally with signal 4 (Illegal instruction) +``` + +That is, the `bad` variable is causing the CPU to fault. The `let` +statement is (in pseudo-Rust) behaving like `let bad = +load_with_alignment(&val.y.0, align_of::<f32x4>());`, but the +alignment isn't satisfied.
(The `ok` line is compiled to a `movupd` +instruction, while the `bad` is compiled to a `movapd`: `u` == +unaligned, `a` == aligned.) + +(NB. The use of SIMD types in the example is just to be able to +demonstrate the problem on x86. That platform is generally fairly +relaxed about pointer alignments and so SIMD & its specialised `mov` +instructions are the easiest way to demonstrate the violated +assumptions at runtime. Other platforms may fault on other types.) + +Being able to assume that accesses are aligned is useful, for +performance, and almost all references will be correctly aligned +anyway (`repr(packed)` types and internal references into them are +quite rare). + +The problems with unaligned accesses can be avoided by ensuring that +the accesses are actually aligned (e.g. via runtime checks, or other +external constraints the compiler cannot understand directly). For +example, consider the following + +```rust +#[repr(packed, C)] +struct Bar { + x: u8, + y: u16, + z: u8, + w: u32, +} +``` + +Taking a reference to some of those fields may cause undefined +behaviour, but not always. It is always correct to take +a reference to `x` or `z` since `u8` has alignment 1. If the struct +value itself is 4-byte aligned (which is not guaranteed), `w` will +also be 4-byte aligned since the `u8, u16, u8` take up 4 bytes, hence +it is correct to take a reference to `w` in this case (and only that +case). Similarly, it is only correct to take a reference to `y` if the +struct is at an odd address, so that the `u16` starts at an even one +(i.e. is 2-byte aligned). + +# Detailed design + +It is `unsafe` to take a reference to the field of a `repr(packed)` +struct. It is still possible, but it is up to the programmer to ensure +that the alignment requirements are satisfied. 
Referencing +(by-reference, or by-value) a subfield of a struct (including indexing +elements of a fixed-length array) stored inside a `repr(packed)` +struct counts as taking a reference to the `packed` field and hence is +unsafe. + +It is still legal to manipulate the fields of a `packed` struct by +value, e.g. the following is correct (and not `unsafe`), no matter the +alignment of `bar`: + +```rust +let mut bar: Bar = ...; + +let x = bar.y; +bar.w = 10; +``` + +It is illegal to store a type `T` implementing `Drop` (including a +generic type) in a `repr(packed)` type, since the destructor of `T` is +passed a reference to that `T`. The crater run (see appendix) found no +crate that needs to use `repr(packed)` to store a `Drop` type (or a +generic type). The generic type rule is conservatively approximated by +disallowing generic `repr(packed)` structs altogether, but this can be +relaxed (see Alternatives). + +Concretely, this RFC is proposing the introduction of the `// error`s +in the following code. + +```rust +struct Baz { + x: u8, +} + +#[repr(packed)] +struct Qux<T> { // error: generic repr(packed) struct + y: Baz, + z: u8, + w: String, // error: storing a Drop type in a repr(packed) struct + t: [u8; 4], +} + +let mut qux = Qux { ... }; + +// all ok: +let y_val = qux.y; +let z_val = qux.z; +let t_val = qux.t; +qux.y = Baz { ... }; +qux.z = 10; +qux.t = [0, 1, 2, 3]; + +// new errors: + +let y_ref = &qux.y; // error: taking a reference to a field of a repr(packed) struct is unsafe +let z_ref = &mut qux.z; // ditto +let y_ptr: *const _ = &qux.y; // ditto +let z_ptr: *mut _ = &mut qux.z; // ditto + +let x_val = qux.y.x; // error: directly using a subfield of a field of a repr(packed) struct is unsafe +let x_ref = &qux.y.x; // ditto +qux.y.x = 10; // ditto + +let t_val = qux.t[0]; // error: directly indexing an array in a field of a repr(packed) struct is unsafe +let t_ref = &qux.t[0]; // ditto +qux.t[0] = 10; // ditto +``` + +(NB. 
the subfield and indexing cases can be resolved by first copying +the packed field's value onto the stack, and then accessing the +desired value.) + +## Staging + +This change will first land as warnings indicating that code will be +broken, with the warnings switched to the intended errors after one +release cycle. + +# Drawbacks + +This will cause some functionality to stop working in +possibly-surprising ways (NB. the drawback here is mainly the +"possibly-surprising", since the functionality is broken with general +`packed` types.). For example, `#[derive]` usually takes references to +the fields of structs, and so `#[derive(Clone)]` will generate +errors. However, this use of derive is incorrect in general (no +guarantee that the fields are aligned), and, one can easily replace it +by: + +```rust +#[derive(Copy)] +#[repr(packed)] +struct Foo { ... } + +impl Clone for Foo { fn clone(&self) -> Foo { *self } } +``` + +Similarly, `println!("{}", foo.bar)` will be an error despite there +not being a visible reference (`println!` takes one internally), +however, this can be resolved by, for instance, assigning to a +temporary. + +# Alternatives + +- A short-term solution would be to feature gate `repr(packed)` while + the kinks are worked out of it +- Taking an internal reference could be made flat-out illegal, and the + times when it is correct simulated by manual raw-pointer + manipulation. +- The rules could be made less conservative in several cases, however + the crater run didn't indicate any need for this: + - a generic `repr(packed)` struct can use the generic in ways that + avoids problems with `Drop`, e.g. if the generic is bounded by + `Copy`, or if the type is only used in ways that are `Copy` such + as behind a `*const T`. + - using a subfield of a field of a `repr(packed)` struct by-value + could be OK. + +# Unresolved questions + +None. + +# Appendix + +## Crater analysis + +Crater was run on 2015/07/23 with a patch that feature gated `repr(packed)`. 
+ +High-level summary: + +- several unnecessary uses of `repr(packed)` (patches have been + submitted and merged to remove all of these) +- most necessary ones are to match the declaration of a struct in C +- many "necessary" uses can be replaced by byte arrays/arrays of smaller types +- 8 crates are currently on stable themselves (unsure about deps), 4 are already on nightly + - 1 of the 8, http2parse, is essentially only used by a nightly-only crate (tendril) + - 4 of the stable and 1 of the nightly crates don't need `repr(packed)` at all + +| | stable | needed | FFI only | +|------------|--------|--------|----------| +| image | ✓ | | | +| nix | ✓ | ✓ | ✓ | +| tendril | | ✓ | | +| assimp-sys | ✓ | ✓ | ✓ | +| stemmer | ✓ | | | +| x86 | ✓ | ✓ | ✓ | +| http2parse | ✓ | ✓ | | +| nl80211rs | ✓ | ✓ | ✓ | +| openal | ✓ | | | +| elfloader | | ✓ | ✓ | +| x11 | ✓ | | | +| kiss3d | ✓ | | | + +More detailed analysis inline with broken crates. (Don't miss `kiss3d` in the non-root section.) + +### Regression report c85ba3e9cb4620c6ec8273a34cce6707e91778cb vs. 7a265c6d1280932ba1b881f31f04b03b20c258e5 + +* From: c85ba3e9cb4620c6ec8273a34cce6707e91778cb +* To: 7a265c6d1280932ba1b881f31f04b03b20c258e5 + +#### Coverage + +* 2617 crates tested: 1404 working / 1151 broken / 40 regressed / 0 fixed / 22 unknown. + +#### Regressions + +* There are 11 root regressions +* There are 40 regressions + +#### Root regressions, sorted by rank: + +* [image-0.3.11](https://crates.io/crates/image) + ([before](https://tools.taskcluster.net/task-inspector/#V6QBA9LfTT6mhFJ0Yo7nJg)) + ([after](https://tools.taskcluster.net/task-inspector/#QU9d4XEPSWOg7CIGFpATDg)) + - [use](https://github.com/PistonDevelopers/image/blob/8e64e0d78e465ddfa13cd6627dede5fd258386f6/src/tga/decoder.rs#L75) + seems entirely unnecessary (no raw bytewise operations on the + struct itself) + + On stable. 
+* [nix-0.3.9](https://crates.io/crates/nix) + ([before](https://tools.taskcluster.net/task-inspector/#X3HMXrq4S_GMNbeeAY8i6w)) + ([after](https://tools.taskcluster.net/task-inspector/#kz0vDaAhRRuKww2l-FvYpQ)) + - [use](https://github.com/carllerche/nix-rust/blob/5801318c0c4c6eeb3431144a89496830f55d6628/src/sys/epoll.rs#L98) + required to match + [C struct](https://github.com/torvalds/linux/blob/de182468d1bb726198abaab315820542425270b7/include/uapi/linux/eventpoll.h#L53-L62) + + On stable. +* [tendril-0.1.2](https://crates.io/crates/tendril) + ([before](https://tools.taskcluster.net/task-inspector/#zQH7ShADR5O9eQe1mg3e6A)) + ([after](https://tools.taskcluster.net/task-inspector/#zI-PoIZHTm-7Urq3CLsXeg)) + - [use 1](https://github.com/servo/tendril/blob/faf97ded26213e561f8ad2768113cc05b6424748/src/buf32.rs#L19) + not strictly necessary? + - [use 2](https://github.com/servo/tendril/blob/faf97ded26213e561f8ad2768113cc05b6424748/src/tendril.rs#L43) + required on 64-bit platforms to get size_of::<Header>() == 12 rather + than 16. + - [use 3](https://github.com/servo/tendril/blob/faf97ded26213e561f8ad2768113cc05b6424748/src/tendril.rs#L91), + as above, does some precise tricks with the layout for optimisation. + + Requires nightly. +* [assimp-sys-0.0.3](https://crates.io/crates/assimp-sys) ([before](https://tools.taskcluster.net/task-inspector/#rTrUh0VQR2uWXMQw14kRIA)) ([after](https://tools.taskcluster.net/task-inspector/#AR36o35FRV-mVInHKWFDrg)) + - [many uses](https://github.com/Eljay/assimp-sys/search?utf8=%E2%9C%93&q=packed), + required to match + [C structs](https://github.com/assimp/assimp/blob/f3d418a199cfb7864c826665016e11c65ddd7aa9/include/assimp/types.h#L227) + (one example). In author's words: + + > [11:36:15] <eljay> huon: well my assimp binding is basically abandoned for now if you are just worried about breaking things, and seems unlikely anyone is using it :P + + On stable. 
+* [stemmer-0.1.1](https://crates.io/crates/stemmer) ([before](https://tools.taskcluster.net/task-inspector/#0Affr5PrTnGoBukeRwuiKw)) ([after](https://tools.taskcluster.net/task-inspector/#8xGRmPxOQS2NHbvgXMvmWQ)) + - [use](https://github.com/lady-segfault/stemmer-rs/blob/4090dcf7a258df5031c10754c8de118e0ca93512/src/stemmer.rs#L7), completely unnecessary + + On stable. +* [x86-0.2.0](https://crates.io/crates/x86) ([before](https://tools.taskcluster.net/task-inspector/#__VYVs6QSYm4JF68fSXibw)) ([after](https://tools.taskcluster.net/task-inspector/#xj8paeiaR0OGkK1v2raHYg)) + - [several similar uses](https://github.com/gz/rust-x86/search?utf8=%E2%9C%93&q=packed), + specific layout necessary for raw interaction with CPU features + + Requires nightly. +* [http2parse-0.0.3](https://crates.io/crates/http2parse) ([before](https://tools.taskcluster.net/task-inspector/#CUr_5dfgQMywZmG_ER7ZGQ)) ([after](https://tools.taskcluster.net/task-inspector/#rQO3m_8iQQapN2l-PvGrRw)) + - [use](https://github.com/reem/rust-http2parse/blob/b363139ac2f81fa25db504a9256face9f8c799b6/src/payload.rs#L206), + used to get super-fast "parsing" of headers, by transmuting + `&[u8]` to `&[Setting]`. + + On stable, however: + + ```irc + [11:30:38] reem: why is https://github.com/reem/rust-http2parse/blob/b363139ac2f81fa25db504a9256face9f8c799b6/src/payload.rs#L208 packed? + [11:31:59] huon: I transmute from & [u8] to & [Setting] + [11:32:35] So repr packed gets me the layout I need + [11:32:47] With no padding between the u8 and u16 + [11:33:11] and between Settings + [11:33:17] ok + [11:33:22] (stop doing bad things :P ) + [11:34:00] (there's some problems with repr(packed) https://github.com/rust-lang/rust/issues/27060 and we may be feature gating it) + [11:35:02] reem: wait, aren't there endianness problems? 
+ [11:36:16] Ah yes, looks like I forgot to finish the Setting interface + [11:36:27] The identifier and value methods take care of converting to types values + [11:36:39] The goal is just to avoid copying the whole buffer and requiring an allocation + [11:37:01] Right now the whole parser takes like 9 ns to parse a frame + [11:39:11] would you be sunk if repr(packed) was feature gated? + [11:40:17] or, is maybe something like `struct SettingsRaw { identifier: [u8; 2], value: [u8; 4] }` OK (possibly with conversion functions etc.)? + [11:40:46] Yea, I could get around it if I needed to + [11:40:58] Anyway the primary consumer is transfer and I'm running on nightly there + [11:41:05] So it doesn't matter too much + ``` + +* [nl80211rs-0.1.0](https://crates.io/crates/nl80211rs) ([before](https://tools.taskcluster.net/task-inspector/#rhEG57vQQHWiVCcS3kIWrA)) ([after](https://tools.taskcluster.net/task-inspector/#s97ED8oXQ4WN-Pbm3ZsFJQ)) + - [three similar uses](https://github.com/carrotsrc/nl80211rs/search?utf8=%E2%9C%93&q=packed) + to match + [C struct](http://lxr.free-electrons.com/source/include/uapi/linux/nl80211.h#L2288). + + On stable. +* [openal-0.2.1](https://crates.io/crates/openal) ([before](https://tools.taskcluster.net/task-inspector/#XUvl-638T82xgGwkrxpz5g)) ([after](https://tools.taskcluster.net/task-inspector/#Oc9wEFpbQM2Tja9sv0qt4g)) + - [several similar uses](https://github.com/meh/rust-openal/blob/9e35fd284f25da7fe90a8307de85a6ec6d392ea1/src/util.rs#L6), + probably unnecessary, just need the struct to behave like + `[f32; 3]`: pointers to it + [are passed](https://github.com/meh/rust-openal/blob/9e35fd284f25da7fe90a8307de85a6ec6d392ea1/src/listener/listener.rs#L204-L205) + to [functions expecting `*mut f32`](https://github.com/meh/rust-openal-sys/blob/master/src/al.rs#L146) pointers. + + On stable. 
+* [elfloader-0.0.1](https://crates.io/crates/elfloader) ([before](https://tools.taskcluster.net/task-inspector/#ssE4lk0xR3q1qYZBXK24aA)) ([after](https://tools.taskcluster.net/task-inspector/#SAH7AAVIToKkhf7QRK4C1g)) + - [two similar uses](https://github.com/gz/rust-elfloader/blob/d61db7c83d66ce65da92aed5e33a4baf35f4c1e7/src/elf.rs#L362), + required to match file headers/formats exactly. + + Requires nightly. +* [x11cap-0.1.0](https://crates.io/crates/x11cap) ([before](https://tools.taskcluster.net/task-inspector/#7wn8cjqXSOaZfpekKRY-yw)) ([after](https://tools.taskcluster.net/task-inspector/#bA6LwPreTMa8R_zYNt8Z3w)) + - [use](https://github.com/bryal/X11Cap/blob/d11b7170e6fa7c1ab370c69887b9ce71a542335d/src/lib.rs#L41) unnecessary. + + Requires nightly. + +#### Non-root regressions, sorted by rank: + +* [glium-0.8.0](https://crates.io/crates/glium) ([before](https://tools.taskcluster.net/task-inspector/#m5yEIEu-QEeM_2t4_11Opg)) ([after](https://tools.taskcluster.net/task-inspector/#Wztxoh9SQ-GqA4F3inaR9Q)) +* [mio-0.4.1](https://crates.io/crates/mio) ([before](https://tools.taskcluster.net/task-inspector/#RtT-HmwbTYuG0djpAkVLvA)) ([after](https://tools.taskcluster.net/task-inspector/#Lx1d3ukPSGyRIwIDt_w0gw)) +* [piston_window-0.11.0](https://crates.io/crates/piston_window) ([before](https://tools.taskcluster.net/task-inspector/#QE421inlRgShgoXKcUkEEA)) ([after](https://tools.taskcluster.net/task-inspector/#wIKQPW_7TjmrztHQ4Kk3hw)) +* [piston2d-gfx_graphics-0.4.0](https://crates.io/crates/piston2d-gfx_graphics) ([before](https://tools.taskcluster.net/task-inspector/#hIUDm8m6QrCdOpSF30aPjQ)) ([after](https://tools.taskcluster.net/task-inspector/#HOw14MCoQxGj7GjYIy-Lng)) +* [piston-gfx_texture-0.2.0](https://crates.io/crates/piston-gfx_texture) ([before](https://tools.taskcluster.net/task-inspector/#om-wlRW-Tm65MTlrpa8u7Q)) ([after](https://tools.taskcluster.net/task-inspector/#m9e9Vx58RA6KhCljujzzMQ)) +* 
[piston2d-glium_graphics-0.3.0](https://crates.io/crates/piston2d-glium_graphics) ([before](https://tools.taskcluster.net/task-inspector/#vHeYcL2gRT2aIz9JeksAfw)) ([after](https://tools.taskcluster.net/task-inspector/#yEKBSm1BQ_C0O-4GKhQgUQ)) +* [html5ever-0.2.0](https://crates.io/crates/html5ever) ([before](https://tools.taskcluster.net/task-inspector/#C0yCazihTWa4x2GxCUxasQ)) ([after](https://tools.taskcluster.net/task-inspector/#Vbl4HjqcQlq4-sJ2m1yBnQ)) +* [caribon-0.6.2](https://crates.io/crates/caribon) ([before](https://tools.taskcluster.net/task-inspector/#AJZzG5gLSY-WVMKc-MoV5w)) ([after](https://tools.taskcluster.net/task-inspector/#ornLa3ZaSC-Zbz7ICg33Tg)) +* [gj-0.0.2](https://crates.io/crates/gj) ([before](https://tools.taskcluster.net/task-inspector/#xhaiB76FQAKCEsmBkQtp1A)) ([after](https://tools.taskcluster.net/task-inspector/#rBJke3wpQqaq7wmEiQtLJA)) +* [glium_text-0.5.0](https://crates.io/crates/glium_text) ([before](https://tools.taskcluster.net/task-inspector/#IMdXVtTYSIaDrCRQ6SbLTA)) ([after](https://tools.taskcluster.net/task-inspector/#t322h_mzQGarVmsf5MHqKA)) +* [glyph_packer-0.0.0](https://crates.io/crates/glyph_packer) ([before](https://tools.taskcluster.net/task-inspector/#JmIVzau8RyOhnlTvdsRIHQ)) ([after](https://tools.taskcluster.net/task-inspector/#7k9GF09SQPya4ZrLuR6cJw)) +* [html5ever_dom_sink-0.2.0](https://crates.io/crates/html5ever_dom_sink) ([before](https://tools.taskcluster.net/task-inspector/#7GJmaAYKS9WNqnbCx5XMrw)) ([after](https://tools.taskcluster.net/task-inspector/#pHotnKLkTAqK4-LP-n2MUQ)) +* [identicon-0.1.0](https://crates.io/crates/identicon) ([before](https://tools.taskcluster.net/task-inspector/#15nnASVgStmrwqdCS1q8Rg)) ([after](https://tools.taskcluster.net/task-inspector/#WgJb_jEMQIebNgb_D2uq7Q)) +* [assimp-0.0.4](https://crates.io/crates/assimp) ([before](https://tools.taskcluster.net/task-inspector/#-i-FYpJ2Rz-bcmxGVmxoOQ)) ([after](https://tools.taskcluster.net/task-inspector/#HXR8V8NeRMyOxF0Nnhdl0w)) +* 
[jamkit-0.2.4](https://crates.io/crates/jamkit) ([before](https://tools.taskcluster.net/task-inspector/#mcpl8Z62Td-DFfoi9AqRnw)) ([after](https://tools.taskcluster.net/task-inspector/#XGOIXxqpRbCMy5bZ42GV5w)) +* [coap-0.1.0](https://crates.io/crates/coap) ([before](https://tools.taskcluster.net/task-inspector/#SI137HlpRsSuQrlhxlRHpQ)) ([after](https://tools.taskcluster.net/task-inspector/#dT3pt46pQtmy3CvIaC_71Q)) +* [kiss3d-0.1.2](https://crates.io/crates/kiss3d) ([before](https://tools.taskcluster.net/task-inspector/#2Bbro6uZQQCudv2ClalFTw)) ([after](https://tools.taskcluster.net/task-inspector/#9vRbugDKTDm94fjw6BcS6A)) + - [use](https://github.com/sebcrozet/kiss3d/blob/1c1d39d5f8a428609b2f7809c7237e8853ac24e9/src/text/glyph.rs#L7) seems to be unnecessary: semantically useless, just a space "optimisation", which actually makes no difference because the Vec field will be appropriately aligned always. + + On stable. +* [compass-sprite-0.0.3](https://crates.io/crates/compass-sprite) ([before](https://tools.taskcluster.net/task-inspector/#dTcfDsk1QYKWtK7EH5gnwg)) ([after](https://tools.taskcluster.net/task-inspector/#rElhdv9GS8-Zi14LSL-6Ng)) +* [dcpu16-gui-0.0.3](https://crates.io/crates/dcpu16-gui) ([before](https://tools.taskcluster.net/task-inspector/#mtbOQfFUTDiZcMUc65LD3w)) ([after](https://tools.taskcluster.net/task-inspector/#co31ZVgNQ1mYyDCnSwBxJg)) +* [piston3d-gfx_voxel-0.1.1](https://crates.io/crates/piston3d-gfx_voxel) ([before](https://tools.taskcluster.net/task-inspector/#2nZmq4zORIOdJ-ErCOCmww)) ([after](https://tools.taskcluster.net/task-inspector/#epzWs2zuSiWxfoWyMCv0Kw)) +* [dev-0.0.7](https://crates.io/crates/dev) ([before](https://tools.taskcluster.net/task-inspector/#5hSafPV2RlKlubg7WHniPw)) ([after](https://tools.taskcluster.net/task-inspector/#ITQ6zXYpSAC3_AtmMe4xRw)) +* [rustty-0.1.3](https://crates.io/crates/rustty) ([before](https://tools.taskcluster.net/task-inspector/#jlstxp6mSPqzQ1n3FgHSRA)) 
([after](https://tools.taskcluster.net/task-inspector/#HgrQz6UVQ5yCkVX25Py-2w)) +* [skeletal_animation-0.1.1](https://crates.io/crates/skeletal_animation) ([before](https://tools.taskcluster.net/task-inspector/#nyMUzqs6RZKIZJ1v1xcglA)) ([after](https://tools.taskcluster.net/task-inspector/#10lM9Vh5SBa7YD3swbm6pw)) +* [slabmalloc-0.0.1](https://crates.io/crates/slabmalloc) ([before](https://tools.taskcluster.net/task-inspector/#li_vsJY8S9-OKEP_KIzEyQ)) ([after](https://tools.taskcluster.net/task-inspector/#1lcKVbKVQNqkKSfwEKIvkg)) +* [spidev-0.1.0](https://crates.io/crates/spidev) ([before](https://tools.taskcluster.net/task-inspector/#5YidcvWyQ0KSmX_9yHjL5A)) ([after](https://tools.taskcluster.net/task-inspector/#mmDafSdlSIS-xfDvyeIckQ)) +* [sysfs_gpio-0.3.2](https://crates.io/crates/sysfs_gpio) ([before](https://tools.taskcluster.net/task-inspector/#KEO87BJHSB-9wNHvTGgEiQ)) ([after](https://tools.taskcluster.net/task-inspector/#44Qnzq6CSBSrMti4utYEZQ)) +* [texture_packer-0.0.1](https://crates.io/crates/texture_packer) ([before](https://tools.taskcluster.net/task-inspector/#-yNhXPaFSBK59eEPRBChVw)) ([after](https://tools.taskcluster.net/task-inspector/#dY5YnW-uTRuCAxxh93_P1w)) +* [falcon-0.0.1](https://crates.io/crates/falcon) ([before](https://tools.taskcluster.net/task-inspector/#hsFGvgrWTL6yY5JVjm20Sw)) ([after](https://tools.taskcluster.net/task-inspector/#YMYfL2KkTH2fct8CD9nqUg)) +* [filetype-0.2.0](https://crates.io/crates/filetype) ([before](https://tools.taskcluster.net/task-inspector/#bCC3ps_gT6m05BNm5lEnFw)) ([after](https://tools.taskcluster.net/task-inspector/#trGw9uPMTgiuxp-w821ZgA)) From ddd634141ae96b9736d80695ebfe23d480d9153e Mon Sep 17 00:00:00 2001 From: Steven Fackler Date: Wed, 5 Aug 2015 22:54:42 -0700 Subject: [PATCH 0467/1195] Forbid wildcard dependencies on crates.io --- text/0000-no-wildcard-deps.md | 113 ++++++++++++++++++++++++++++++++++ 1 file changed, 113 insertions(+) create mode 100644 text/0000-no-wildcard-deps.md diff --git 
a/text/0000-no-wildcard-deps.md b/text/0000-no-wildcard-deps.md new file mode 100644 index 00000000000..cd87b43f56e --- /dev/null +++ b/text/0000-no-wildcard-deps.md @@ -0,0 +1,113 @@ +- Feature Name: N/A +- Start Date: 2015-07-23 +- RFC PR: +- Rust Issue: + +# Summary + +A Cargo crate's dependencies are associated with constraints that specify the +set of versions of the dependency with which the crate is compatible. These +constraints range from accepting exactly one version (`=1.2.3`), to +accepting a range of versions (`^1.2.3`, `~1.2.3`, `>= 1.2.3, < 3.0.0`), to +accepting any version at all (`*`). This RFC proposes to update crates.io to +reject publishes of crates that have compile or build dependencies with +version constraints that have no upper bound. + +# Motivation + +Version constraints are a delicate balancing act between stability and +flexibility. On one extreme, one can lock dependencies to an exact version. +From one perspective, this is great, since the dependencies a user will consume +will be the same that the developers tested against. However, on any nontrivial +project, one will inevitably run into conflicts where library A depends on +version `1.2.3` of library B, but library C depends on version `1.2.4`, at +which point, the only option is to force the version of library B to one of +them and hope everything works. + +On the other hand, a wildcard (`*`) constraint will never conflict with +anything! There are other things to worry about here, though. A version +constraint is fundamentally an assertion from a library's author to its users +that the library will work with any version of a dependency that matches its +constraint. A wildcard constraint is claiming that the library will work with +any version of the dependency that has ever been released *or will ever be +released, forever*. This is a somewhat absurd guarantee to make - forever is a +long time! 
+ +Absurd guarantees on their own are not necessarily sufficient motivation to +make a change like this. The real motivation is the effect that these +guarantees have on consumers of libraries. + +As an example, consider the [openssl](https://crates.io/crates/openssl) crate. +It is one of the most popular libraries on crates.io, with several hundred +downloads every day. 50% of the [libraries that depend on it](https://crates.io/crates/openssl/reverse_dependencies) +have a wildcard constraint on the version. Almost all of them will fail +to compile against version 0.7 of openssl when it is released. When that +happens, users of those libraries will be forced to manually override Cargo's +version selection every time it is recalculated. This is not a fun time. + +Bad version restrictions are also "viral". Even if a developer is careful to +pick dependencies that have reasonable version restrictions, there could be a +wildcard constraint hiding five transitive levels down. Manually searching the +entire dependency graph is an exercise in frustration that shouldn't be +necessary. + +On the other hand, consider a library that has a version constraint of `^0.6`. +When openssl 0.7 releases, the library will either continue to work against +version 0.7, or it won't. In the first case, the author can simply extend the +constraint to `>= 0.6, < 0.8` and consumers can use it with version 0.6 or 0.7 +without any trouble. If it does not work against version 0.7, consumers of the +library are fine! Their code will continue to work without any manual +intervention. The author can update the library to work with version 0.7 and +release a new version with a constraint of `^0.7` to support consumers that +want to use that newer release. 
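To make the distinction between bounded and unbounded constraints concrete, here is a rough sketch of how the constraint forms mentioned so far could be classified. It is an assumption-laden illustration, not how crates.io or Cargo implement anything: `has_upper_bound` is a hypothetical helper, it works on raw strings rather than a real semver parser, and it assumes that only `*`, `>` and `>=` comparators leave the version open-ended.

```rust
/// Hypothetical check: does this requirement cap the accepted versions?
/// A requirement is a comma-separated list of comparators, and one
/// comparator with a ceiling is enough to bound the whole requirement.
fn has_upper_bound(requirement: &str) -> bool {
    requirement
        .split(',')
        .map(str::trim)
        .filter(|comparator| !comparator.is_empty())
        // Assumption: a bare `*`, `>` and `>=` impose no ceiling, while
        // `=`, `^`, `~`, `<` and `<=` do.
        .any(|comparator| !(comparator == "*" || comparator.starts_with('>')))
}

fn main() {
    // Forms from the Summary and Motivation sections.
    for requirement in ["*", "=1.2.3", "^1.2.3", "~1.2.3", ">= 1.2.3, < 3.0.0", "^0.6", ">= 0.6, < 0.8"] {
        println!("{:<20} bounded: {}", requirement, has_upper_bound(requirement));
    }

    assert!(!has_upper_bound("*"));
    assert!(has_upper_bound("^0.6"));
    assert!(has_upper_bound(">= 0.6, < 0.8"));
}
```

A string check like this is only meant to show the shape of the rule; a real implementation would presumably inspect parsed comparators instead.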
+ +Making crates.io more picky than Cargo itself is not a new concept; it +currently [requires several items](https://github.com/rust-lang/crates.io/blob/8c85874b6b967e1f46ae2113719708dce0c16d32/src/krate.rs#L746-L759) in published crates that Cargo will not: + + * A valid license + * A description + * A list of authors + +All of these requirements are in place to make it easier for developers to use +the libraries uploaded to crates.io - that's why crates are published, after +all! A restriction on wildcards is another step down that path. + +Note that this restriction would only apply to normal compile dependencies and +build dependencies, but not to dev dependencies. Dev dependencies are only used +when testing a crate, so it doesn't matter to downstream consumers if they +break. + +# Detailed design + +Alter crates.io's pre-publish behavior to check the version constraints of all +compile and build dependencies, and reject those that have no upper bound. For +example, these would be rejected: + + * `*` + * `> 0.3` + * `>= 0.3` + +While these would not: + + * `>= 0.3, < 0.5` + * `^0.3` + * `~0.3` + * `=0.3.1` + +# Drawbacks + +The barrier to entry when publishing a crate will be mildly higher. + +In theory, there could be contexts where an unbounded version constraint is +actually appropriate? + +# Alternatives + +We could continue allowing these kinds of constraints, but complain in a +"sufficiently annoying" manner during publishes to discourage their use. + +# Unresolved questions + +Should crates.io also forbid constraints that reference versions of +dependencies that don't yet exist? For example, a constraint of `>= 0.3, < 0.5` +where the dependency has no published versions in the `0.4` range. From 4fe4506fe02df5d9a13981ef0faf1f1967b2c8fd Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 7 Aug 2015 06:35:59 -0400 Subject: [PATCH 0468/1195] Merge RFC #1219, "use group as". 
--- README.md | 1 + text/{0000-use-group-as.md => 1219-use-group-as.md} | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) rename text/{0000-use-group-as.md => 1219-use-group-as.md} (91%) diff --git a/README.md b/README.md index 60273a8fb6e..47535a73222 100644 --- a/README.md +++ b/README.md @@ -61,6 +61,7 @@ the direction the language is evolving in. * [1131-likely-intrinsic.md](text/1131-likely-intrinsic.md) * [1156-adjust-default-object-bounds.md](text/1156-adjust-default-object-bounds.md) * [1184-stabilize-no_std.md](text/1184-stabilize-no_std.md) +* [1219-use-group-as.md](text/1219-use-group-as.md) ## Table of Contents [Table of Contents]: #table-of-contents diff --git a/text/0000-use-group-as.md b/text/1219-use-group-as.md similarity index 91% rename from text/0000-use-group-as.md rename to text/1219-use-group-as.md index 1977175bbd8..4fbe3476c28 100644 --- a/text/0000-use-group-as.md +++ b/text/1219-use-group-as.md @@ -1,7 +1,7 @@ - Feature Name: use_group_as - Start Date: 2015-02-15 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1219](https://github.com/rust-lang/rfcs/pull/1219) +- Rust Issue: [rust-lang/rust#27578](https://github.com/rust-lang/rust/issues/27578) # Summary From abcd4be3b000864fcc89875d68e093040f1c74ca Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 7 Aug 2015 07:44:42 -0400 Subject: [PATCH 0469/1195] rewrite the inference section to be less reliant on the compiler implementation --- text/0000-projections-lifetimes-and-wf.md | 83 +++++++++++++++++------ 1 file changed, 64 insertions(+), 19 deletions(-) diff --git a/text/0000-projections-lifetimes-and-wf.md b/text/0000-projections-lifetimes-and-wf.md index 1d153100300..d58c16176be 100644 --- a/text/0000-projections-lifetimes-and-wf.md +++ b/text/0000-projections-lifetimes-and-wf.md @@ -446,7 +446,7 @@ or projections are involved: -------------------------------------------------- R ⊢ for fn(T1..Tn) -> T0 - OutlivesTraitRef: + 
OutlivesFragment: ∀i. R,r.. ⊢ Pi: 'a -------------------------------------------------- R ⊢ for TraitId: 'a @@ -471,6 +471,12 @@ implied bounds. -------------------------------------------------- R ⊢ 'a: 'a + OutlivesRegionTransitive: + R ⊢ 'a: 'c + R ⊢ 'c: 'b + -------------------------------------------------- + R ⊢ 'a: 'b + For higher-ranked lifetimes, we simply ignore the relation, since the lifetime is not yet known. This means for example that `for<'a> fn(&'a i32): 'x` holds, even though we do not yet know what region `'a` is @@ -508,13 +514,15 @@ higher-ranked lifetimes. (This is somewhat stricter than necessary, but reflects the behavior of my prototype implementation.) OutlivesProjectionEnv: - >::Id: 'a in Env + >::Id: 'b in Env + <> ⊢ 'b: 'a -------------------------------------------------- <> ⊢ >::Id: 'a OutlivesProjectionTraitDef: WC = [Xi => Pi] WhereClauses(Trait) - >::Id: 'a in WC + >::Id: 'b in WC + <> ⊢ 'b: 'a -------------------------------------------------- <> ⊢ >::Id: 'a @@ -624,16 +632,46 @@ any lifetime or type parameters. #### Implementation complications -One complication for the implementation is that there are so many -potential outlives rules for projections. In particular, the rule that -says `>>: 'a` holds if `Pi: 'a` is not an "if and -only if" rule. So, for example, if we know that `T: 'a` and `'b: 'a`, -then we know that `>:: Item: 'a` (for any trait and -item), but not vice versa. This complicates inference significantly, -since if variables are involved, we do not know whether to create -edges between the variables or not (put another way, the simple -dataflow model we are currently using doesn't truly suffice for these -rules). +The current region inference code only permits constraints of the +form: + +``` +C = r0: r1 + | C AND C +``` + +This is convenient because a simple fixed-point iteration suffices to +find the minimal regions which satisfy the constraints. 
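The fixed-point iteration mentioned above can be sketched as follows. This is a toy model under stated assumptions, not the compiler's region inference: regions are modelled as sets of abstract program points, a constraint `r0: r1` is read as "`r0` must contain every point of `r1`", and the names `Region` and `solve` are hypothetical.

```rust
use std::collections::BTreeSet;

/// Toy model: a region is the set of program points it covers.
type Region = BTreeSet<u32>;

/// Each pair `(longer, shorter)` stands for the constraint `longer: shorter`.
/// Regions only ever grow, and only by points some constraint forces in,
/// so the loop stops at the smallest solution.
fn solve(mut regions: Vec<Region>, constraints: &[(usize, usize)]) -> Vec<Region> {
    let mut changed = true;
    while changed {
        changed = false;
        for &(longer, shorter) in constraints {
            // Points the shorter region has that the longer one still lacks.
            let missing: Vec<u32> = regions[shorter]
                .difference(&regions[longer])
                .copied()
                .collect();
            if !missing.is_empty() {
                regions[longer].extend(missing);
                changed = true;
            }
        }
    }
    regions
}

fn main() {
    // Three region variables, each starting with one point of its own.
    let regions = vec![Region::from([1]), Region::from([2]), Region::from([3])];

    // C = r0: r1 AND r1: r2
    let solved = solve(regions, &[(0, 1), (1, 2)]);

    assert_eq!(solved[0], Region::from([1, 2, 3]));
    assert_eq!(solved[1], Region::from([2, 3]));
    assert_eq!(solved[2], Region::from([3]));
}
```

Because every constraint in this model is a plain conjunction, growing regions monotonically is enough; nothing in the loop ever has to choose between alternatives.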
+ +Unfortunately, this constraint model does not scale to the outlives +rules for projections. Consider a trait reference like `<T as +Trait<'X>>::Item: 'Y`, where `'X` and `'Y` are both region variables +whose value is being inferred. At this point, there are several +inference rules which could potentially apply. Let us assume that +there is a where-clause in the environment like `<T as +Trait<'a>>::Item: 'b`. In that case, *if* `'X == 'a` and `'b: 'Y`, +then we could employ the `OutlivesProjectionEnv` rule. This would +correspond to a constraint set like: + +``` +C = 'X:'a AND 'a:'X AND 'b:'Y +``` + +Otherwise, if `T: 'a` and `'X: 'Y`, then we could use the +`OutlivesProjectionComponents` rule, which would require a constraint +set like: + +``` +C = C1 AND 'X:'Y +``` + +where `C1` is the constraint set for `T:'a`. + +As you can see, these two rules yielded distinct constraint sets. +Ideally, we would combine them with an `OR` constraint, but no such +constraint is available. Adding such a constraint complicates how +inference works, since a fixed-point iteration is no longer +sufficient. This complication is unfortunate, but to a large extent already exists with where-clauses and trait matching (see e.g. [#21974]). (Moreover, @@ -642,11 +680,17 @@ take several inputs (the parameters to the trait) which may or may not be related to the actual type definition in question.) For the time being, the current implementation takes a pragmatic -approach based on heuristics. It tries to avoid adding edges to the -region graph in various common scenarios, and in the end falls back to -enforcing conditions that may be stricter than necessary, but which -certainly suffice. We have not yet encountered an example in practice -where the current implementation rules do not suffice. +approach based on heuristics. It first examines whether any region +bounds are declared in the trait and, if so, prefers to use +those. 
Otherwise, if there are region variables in the projection, +then it falls back to the `OutlivesProjectionComponents` rule. This is +always sufficient but may be stricter than necessary. If there are no +region variables in the projection, then it can simply run inference +to completion and check each of the other two rules in turn. (It is +still necessary to run inference because the bound may be a region +variable.) So far this approach has sufficed for all situations +encountered in practice. Eventually, we should extend the region +inferencer to a richer model that includes "OR" constraints. ### The WF relation @@ -718,7 +762,8 @@ remainder are checked. For example, if we have a type `HashSet` which requires that `K: Hash`, then `fn(HashSet)` would be illegal since `NoHash: Hash` does not hold, but `for<'a> fn(HashSet<&'a NoHash>)` *would* be legal, since `&'a NoHash: Hash` -involves a bound region `'a`. See the next section for details. +involves a bound region `'a`. See the "Checking Conditions" section +for details. Note that `fn` types do not require that `T0..Tn` be `Sized`. This is intentional. The limitation that only sized values can be passed as From a3d22d818ac271a7761e59801bdda62347b02272 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 7 Aug 2015 07:59:03 -0400 Subject: [PATCH 0470/1195] clarify object type fragments --- text/0000-projections-lifetimes-and-wf.md | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/text/0000-projections-lifetimes-and-wf.md b/text/0000-projections-lifetimes-and-wf.md index d58c16176be..bb91a827c09 100644 --- a/text/0000-projections-lifetimes-and-wf.md +++ b/text/0000-projections-lifetimes-and-wf.md @@ -404,6 +404,11 @@ types: r = 'x // Region name We'll use this to describe the rules in detail. + +A quick note on terminology: an "object type fragment" is part of an +object type: so if you have `Box`, `FnMut()` and `Send` +are object type fragments. 
Object type fragments are identical to full +trait references, except that they do not have a self type (no `P0`). ### Syntactic definition of the outlives relation @@ -799,10 +804,8 @@ and a trait object like `Foo+'x`, when we require that `'static: 'x` (which is true, clearly, but in some cases the implicit bounds from traits are not `'static` but rather some named lifetime). -The next clause states that all object type fragments must be WF (an -"object type fragment" is part of an object type: so if you have -`Box`, `FnMut()` and `Send` are object type -fragments). An object type fragment is WF if its components are WF: +The next clause states that all object type fragments must be WF. An +object type fragment is WF if its components are WF: WfObjectFragment: ∀i. R, r.. ⊢ Pi From 17522cb2e537a3f588cf439f4312e4d212f441d1 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 7 Aug 2015 08:04:36 -0400 Subject: [PATCH 0471/1195] merge and link RFC #1214 -- projections, lifetimes, and wf --- README.md | 1 + ...fetimes-and-wf.md => 1214-projections-lifetimes-and-wf.md} | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) rename text/{0000-projections-lifetimes-and-wf.md => 1214-projections-lifetimes-and-wf.md} (99%) diff --git a/README.md b/README.md index 47535a73222..1e11470a619 100644 --- a/README.md +++ b/README.md @@ -61,6 +61,7 @@ the direction the language is evolving in. 
* [1131-likely-intrinsic.md](text/1131-likely-intrinsic.md) * [1156-adjust-default-object-bounds.md](text/1156-adjust-default-object-bounds.md) * [1184-stabilize-no_std.md](text/1184-stabilize-no_std.md) +* [1214-projections-lifetimes-and-wf.md](text/1214-projections-lifetimes-and-wf.md) * [1219-use-group-as.md](text/1219-use-group-as.md) ## Table of Contents diff --git a/text/0000-projections-lifetimes-and-wf.md b/text/1214-projections-lifetimes-and-wf.md similarity index 99% rename from text/0000-projections-lifetimes-and-wf.md rename to text/1214-projections-lifetimes-and-wf.md index bb91a827c09..12a1aafe632 100644 --- a/text/0000-projections-lifetimes-and-wf.md +++ b/text/1214-projections-lifetimes-and-wf.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: (fill me in with today's date, YYYY-MM-DD) -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1214](https://github.com/rust-lang/rfcs/pull/1214) +- Rust Issue: [rust-lang/rust#27579](https://github.com/rust-lang/rust/issues/27579) # Summary From 61d1436d9d9f41f4741add6c201d6284661e1453 Mon Sep 17 00:00:00 2001 From: Oliver Schneider Date: Fri, 7 Aug 2015 15:08:26 +0200 Subject: [PATCH 0472/1195] use alternative "warn + normal codegen" instead --- text/0000-compile-time-asserts.md | 56 +++++++++++++------------------ 1 file changed, 23 insertions(+), 33 deletions(-) diff --git a/text/0000-compile-time-asserts.md b/text/0000-compile-time-asserts.md index e6423478b19..72cbcfcc7d6 100644 --- a/text/0000-compile-time-asserts.md +++ b/text/0000-compile-time-asserts.md @@ -5,9 +5,9 @@ # Summary -If the compiler can detect at compile-time that something will always -cause a `debug_assert` or an `assert` it should instead -insert an unconditional runtime-panic and issue a warning. 
+If the constant evaluator encounters erroneous code during the evaluation of +an expression that is not part of a true constant evaluation context, a warning +must be emitted and the expression needs to be translated normally. # Definition of constant evaluation context @@ -41,6 +41,26 @@ If the constant evaluator gets smart enough, it will be able to const evaluate the `blub` function. This would be a breaking change, since the code would not compile anymore. (this occurred in https://github.com/rust-lang/rust/pull/26848). +# Detailed design + +The PRs https://github.com/rust-lang/rust/pull/26848 and https://github.com/rust-lang/rust/pull/25570 will be setting a precedent +for warning about such situations (WIP, not pushed yet). + +When the constant evaluator fails while evaluating a normal expression, +a warning will be emitted and normal translation needs to be resumed. + +# Drawbacks + +None, if we don't do anything, the const evaluator cannot get much smarter. + +# Alternatives + +## allow breaking changes + +Let the compiler error on things that will unconditionally panic at runtime. + +## insert an unconditional panic instead of generating regular code + GNAT (an Ada compiler) does this already: ```ada @@ -75,38 +95,8 @@ call __gnat_rcheck_CE_Range_Check ``` -# Detailed design - -The PRs https://github.com/rust-lang/rust/pull/26848 and https://github.com/rust-lang/rust/pull/25570 will be setting a precedent -for warning about such situations (WIP, not pushed yet). -All future additions to the const-evaluator need to notify the const evaluator -that when it encounters a statically known erroneous situation while evaluating -an expression outside of a constant evaluation environment, the -entire expression must be replaced by a panic and a warning must be emitted. - -# Drawbacks - -None, if we don't do anything, the const evaluator cannot get much smarter.
- -# Alternatives - -## allow breaking changes - -Let the compiler error on things that will unconditionally panic at runtime. - -## only warn, don't influence code generation - -The const evaluator should simply issue a warning and notify it's caller that the expression cannot be evaluated and should be translated. -This has the disadvantage, that in release-mode statically known issues like -overflow or shifting more than the number of bits available will not be -caught even at runtime. - -On the other hand, this alternative does not change the behavior of existing code. - # Unresolved questions -## How to implement this? - ## Const-eval the body of `const fn` that are never used in a constant environment Currently a `const fn` that is called in non-const code is treated just like a normal function. From 10ded18b7f9c7d65d5b6ecdec9e30b381c476a11 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 7 Aug 2015 09:33:27 -0700 Subject: [PATCH 0473/1195] RFC 980: read_exact --- README.md | 1 + text/{0000-read-all.md => 0980-read-exact.md} | 7 +++---- 2 files changed, 4 insertions(+), 4 deletions(-) rename text/{0000-read-all.md => 0980-read-exact.md} (98%) diff --git a/README.md b/README.md index 1e11470a619..96d1a1774d1 100644 --- a/README.md +++ b/README.md @@ -49,6 +49,7 @@ the direction the language is evolving in. 
* [0909-move-thread-local-to-std-thread.md](text/0909-move-thread-local-to-std-thread.md) * [0911-const-fn.md](text/0911-const-fn.md) * [0968-closure-return-type-syntax.md](text/0968-closure-return-type-syntax.md) +* [0980-read-exact.md](text/0980-read-exact.md) * [0982-dst-coercion.md](text/0982-dst-coercion.md) * [0979-align-splitn-with-other-languages.md](text/0979-align-splitn-with-other-languages.md) * [1011-process.exit.md](text/1011-process.exit.md) diff --git a/text/0000-read-all.md b/text/0980-read-exact.md similarity index 98% rename from text/0000-read-all.md rename to text/0980-read-exact.md index d2c841813a4..f703b9c72e2 100644 --- a/text/0000-read-all.md +++ b/text/0980-read-exact.md @@ -1,7 +1,7 @@ -- Feature Name: read_exact and ErrorKind::UnexpectedEOF +- Feature Name: read_exact - Start Date: 2015-03-15 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/980 +- Rust Issue: https://github.com/rust-lang/rust/issues/27585 # Summary @@ -282,4 +282,3 @@ the following reasons: compressed file where the uncompressed size was given in a header), `read_full` has to always write to the output buffer, so there's not much to gain over a generic looping implementation calling `read`. 
- From 5ca9a352c65c0ee7ed1f1969930d6d60dc974a42 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Wed, 29 Jul 2015 14:30:23 -0700 Subject: [PATCH 0474/1195] RFC: Policy for rust-lang crates --- text/0000-rust-lang-crates.md | 193 ++++++++++++++++++++++++++++++++++ 1 file changed, 193 insertions(+) create mode 100644 text/0000-rust-lang-crates.md diff --git a/text/0000-rust-lang-crates.md b/text/0000-rust-lang-crates.md new file mode 100644 index 00000000000..b0e9f4ca840 --- /dev/null +++ b/text/0000-rust-lang-crates.md @@ -0,0 +1,193 @@ +- Feature Name: N/A +- Start Date: 2015-07-29 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +This RFC proposes a policy around the crates under the rust-lang github +organization that are not part of the Rust distribution (compiler or standard +library). At a high level, it proposes that these crates be: + +- Governed similarly to the standard library; +- Maintained at a similar level to the standard library, including platform support; +- Carefully curated for quality. + +# Motivation + +There are three main motivations behind this RFC. + +**Keeping `std` small**. There is a widespread desire to keep the standard + library reasonably small, and for good reason: the stability promises made in + `std` are tied to the versioning of Rust itself, as are updates to it, meaning + that the standard library has much less flexibility than other crates + enjoy. While we *do* plan to continue to grow `std`, and there are legitimate + reasons for APIs to live there, we still plan to take a minimalistic + approach. See + [this discussion](https://internals.rust-lang.org/t/what-should-go-into-the-standard-library/2158) + for more details. + +The desire to keep `std` small is in tension with the desire to provide +high-quality libraries *that belong to the whole Rust community* and cover a +wider range of functionality. 
The poster child here is the +[regex crate](https://github.com/rust-lang/regex), which provides vital +functionality but is not part of the standard library or basic Rust distribution +-- and which is, in principle, under the control of the whole Rust community. + +This RFC resolves the tension between a "batteries included" Rust and a small +`std` by treating `rust-lang` crates as, in some sense, "the rest of the +standard library". While this doesn't solve the entire problem of curating the +library ecosystem, it offers a big step for some of the most significant/core +functionality we want to commit to. + +**Staging `std`**. For cases where we do want to grow the standard library, we + of course want to heavily vet APIs before their stabilization. Historically + we've done so by landing the APIs directly in `std`, but marked unstable, + relegating their use to nightly Rust. But in many cases, new `std` APIs can + just as well begin their life as external crates, usable on stable Rust, and + ultimately stabilized wholesale. The recent + [`std::net` RFC](https://github.com/rust-lang/rfcs/pull/1158) is a good + example of this phenomenon. + +The main challenge to making this kind of "`std` staging" work is getting +sufficient visibility, central management, and community buy-in for the library +prior to stabilization. When there is widespread desire to extend `std` in a +certain way, this RFC proposes that the extension can start its life as an +external rust-lang crate (ideally usable by stable Rust). It also proposes an +eventual migration path into `std`. + +**Cleanup**. During the stabilization of `std`, a fair amount of functionality + was moved out into external crates hosted under the rust-lang github + organization. The quality and future prospects of these crates varies widely, + and we would like to begin to organize and clean them up. 
+ +# Detailed design + +## The lifecycle of a rust-lang crate + +First, two additional github organizations are proposed: + +- rust-lang-nursery +- rust-lang-deprecated + +New crates start their life in a `0.X` series that lives in the +rust-lang-nursery. Crates in this state do not represent a major commitment from +the Rust maintainers; rather, they signal a trial period. A crate enters the +nursery when (1) there is already a working body of code and (2) the library +subteam approves a petition for inclusion. The petition is informal (not an +RFC), and can take the form of a discuss post laying out the motivation and +perhaps some high-level design principles, and linking to the working code. + +If the library team accepts a crate into the nursery, they are indicating an +*interest* in ultimately advertising the crate as "a core part of Rust", and in +maintaining the crate permanently. During the 0.X series in the nursery, the +original crate author maintains control of the crate, approving PRs and so on, +but the library subteam and broader community are expected to participate. As +we'll see below, nursery crates will be advertised (though not in the same way +as full rust-lang crates), increasing the chances that the crate is scrutinized +before being promoted to the next stage. + +Eventually, a nursery crate will either fail (and move to rust-lang-deprecated) +or reach a point where a 1.0 release would be appropriate. The failure case can +be decided at any point by the library subteam. + +If, on the other hand, a library reaches the 1.0 point, it is ready to be +promoted into rust-lang proper. To do so, an RFC must be written outlining the +motivation for the crate, the reasons that community ownership is important, +and delving into the API design and its rationale.
These RFCs are +intended to follow similar lines to the pre-1.0 stabilization RFCs for the +standard library (such as +[collections](https://github.com/rust-lang/rfcs/pull/235) or +[Duration](https://github.com/rust-lang/rfcs/pull/1040)) -- which have been very +successful in improving API design prior to stabilization. Once a "1.0 RFC" is +approved by the libs team, the crate moves into the rust-lang organization, and +is henceforth governed by the whole Rust community. That means in particular +that significant changes (certainly those that would require a major version +bump, but other substantial PRs as well) are reviewed by the library subteam and +may require an RFC. On the other hand, the community has broadly agreed to +maintain the library in perpetuity (unless it is later deprecated). And again, +as we'll see below, the promoted crate is very visibly advertised as part of the +"core Rust" package. + +Promotion to 1.0 requires first-class support on all first-tier platforms, +except for platform-specific libraries. + +Crates in rust-lang may issue new major versions, just like any other crates, +though such changes should go through the RFC process. While the library subteam +is responsible for major decisions about the library after 1.0, its original +author(s) will of course wield a great deal of influence, and their objections +will be given due weight in the consensus process. + +### Relation to `std` + +In many cases, the above description of the crate lifecycle is complete. But +some rust-lang crates are destined for std. Usually this will be clear up front. + +When a std-destined crate has reached sufficient maturity, the libs subteam can +call a "final comment period" for moving it into `std` proper. 
Assuming there +are no blocking objections, the code is moved into `std`, and the original repo +is left intact, with the following changes: + +- a minor version bump, +- *conditionally* replacing all definitions with `pub use` from `std` (which + will require the ability to `cfg` switch on feature/API availability -- a + highly-desired feature on its own). + +By re-routing the library to `std` when available we provide seamless +compatibility between users of the library externally and in `std`. In +particular, traits and types defined in the crate are compatible across either +way of importing them. + +### Deprecation + +At some point a library may become stale -- either because it failed to make it +out of the nursery, or else because it was supplanted by a superior library. The +libs subteam can deprecate nursery crates at any time, and can deprecate +rust-lang crates through an RFC. Deprecated crates move to rust-lang-deprecated +and are subsequently minimally maintained. + +## Advertising + +Part of the reason for having rust-lang crates is to have a clear, short list of +libraries that are broadly useful, vetted and maintained. But where should this +list appear? + +This RFC doesn't specify the complete details, but proposes a basic direction: + +- The crates in rust-lang should appear in the sidebar in the core rustdocs + distributed with Rust, along side the standard library. (For nightly releases, + we should include the nursery crates as well.) + +- The crates should also be published on crates.io, and should somehow be +*badged*. But the design of a badging/curation system for crates.io is out of +scope for this RFC. + +# Drawbacks + +The drawbacks of this RFC are largely social: + +* Emphasizing rust-lang crates may alienate some in the Rust community, since it + means that certain libraries obtain a special "blessing". This is mitigated by + the fact that these libraries also become owned by the community at large. 
+ +* On the other hand, requiring that ownership/governance be transferred to the + library subteam may be a disincentive for library authors, since they lose + unilateral control of their libraries. But this is an inherent aspect of the + policy design, and the vastly increased visibility of libraries is likely a + strong enough incentive to overcome this downside. + +# Alternatives + +The main alternative would be to not maintain other crates under the rust-lang +umbrella, and to offer some other means of curation (the latter of which is +needed in any case). + +That would be a missed opportunity, however; Rust's governance and maintenance +model has been very successful so far, and given our minimalistic plans for the +standard library, it is very appealing to have *some* other way to apply the +full Rust community in taking care of additional crates. + +# Unresolved questions + +Part of the maintenance standard for Rust is the CI infrastructure, including +bors/homu, From aef4f00adcac61515cad4b4975dbf6f6c3237950 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 4 Aug 2015 10:31:12 -0700 Subject: [PATCH 0475/1195] Add plans for existing crates --- text/0000-rust-lang-crates.md | 40 +++++++++++++++++++++++++++++++++-- 1 file changed, 38 insertions(+), 2 deletions(-) diff --git a/text/0000-rust-lang-crates.md b/text/0000-rust-lang-crates.md index b0e9f4ca840..91ba5832b6a 100644 --- a/text/0000-rust-lang-crates.md +++ b/text/0000-rust-lang-crates.md @@ -143,8 +143,11 @@ way of importing them. At some point a library may become stale -- either because it failed to make it out of the nursery, or else because it was supplanted by a superior library. The libs subteam can deprecate nursery crates at any time, and can deprecate -rust-lang crates through an RFC. Deprecated crates move to rust-lang-deprecated -and are subsequently minimally maintained. +rust-lang crates through an RFC. This is expected to be a rare occurrence. 
+ +Deprecated crates move to rust-lang-deprecated and are subsequently minimally +maintained. Alternatively, if someone volunteers to maintain the crate, +ownership can be transferred externally. ## Advertising @@ -162,6 +165,39 @@ This RFC doesn't specify the complete details, but proposes a basic direction: *badged*. But the design of a badging/curation system for crates.io is out of scope for this RFC. +## Plan for existing crates + +There are already a number of non-`std` crates in rust-lang. Below, we give the +full list along with recommended actions: + +### Unsure + +- fourcc +- getopts +- glob +- hexfloat +- rlibc +- rustc-serialize +- semver +- tempdir +- term +- threadpool +- time +- uuid + +### Move to rust-lang-nursery + +- rand +- regex +- libc +- bitflags +- log + +### Move to rust-lang-deprecated + +- url +- num + # Drawbacks The drawbacks of this RFC are largely social: From e581824eac0ed268092b554738ca754790b64f4e Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 7 Aug 2015 10:21:57 -0700 Subject: [PATCH 0476/1195] Render final judgment on existing crates --- text/0000-rust-lang-crates.md | 32 +++++++++++++++++--------------- 1 file changed, 17 insertions(+), 15 deletions(-) diff --git a/text/0000-rust-lang-crates.md b/text/0000-rust-lang-crates.md index 91ba5832b6a..cd953084767 100644 --- a/text/0000-rust-lang-crates.md +++ b/text/0000-rust-lang-crates.md @@ -170,33 +170,35 @@ scope for this RFC. There are already a number of non-`std` crates in rust-lang. Below, we give the full list along with recommended actions: -### Unsure +### Transfer ownership + +Please volunteer if you're interested in taking one of these on! 
-- fourcc -- getopts -- glob -- hexfloat - rlibc -- rustc-serialize - semver -- tempdir -- term - threadpool -- time -- uuid ### Move to rust-lang-nursery -- rand -- regex -- libc - bitflags +- getopts +- glob +- libc - log +- rand (note, @huonw has a major revamp in the works) +- regex +- rustc-serialize (but will likely be replaced by serde or other approach eventually) +- tempdir (destined for `std` after reworking) +- uuid ### Move to rust-lang-deprecated -- url -- num +- fourcc: highly niche +- hexfloat: niche +- num: this is essentially a dumping ground from 1.0 stabilization; needs a complete re-think. +- term: API needs total overhaul +- time: needs total overhaul destined for std +- url: replaced by https://github.com/servo/rust-url # Drawbacks From a690d2faff5c63f9af5ef2095dc1dbcbe4e6bffa Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 7 Aug 2015 10:26:03 -0700 Subject: [PATCH 0477/1195] Add missing text to unresolved questions --- text/0000-rust-lang-crates.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-rust-lang-crates.md b/text/0000-rust-lang-crates.md index cd953084767..dcda7c41a84 100644 --- a/text/0000-rust-lang-crates.md +++ b/text/0000-rust-lang-crates.md @@ -228,4 +228,4 @@ full Rust community in taking care of additional crates. # Unresolved questions Part of the maintenance standard for Rust is the CI infrastructure, including -bors/homu, +bors/homu. What level of CI should we provide for these crates, and how do we do it? 
From 32ed8d465098ed16c89e3295479328c3ced052fa Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Fri, 7 Aug 2015 11:30:08 -0700 Subject: [PATCH 0478/1195] clarify reasoning --- text/0560-integer-overflow.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/text/0560-integer-overflow.md b/text/0560-integer-overflow.md index 205b6dcb489..539f225c1cd 100644 --- a/text/0560-integer-overflow.md +++ b/text/0560-integer-overflow.md @@ -128,11 +128,12 @@ defined results today. The only change is that now a panic may result. - The operations `+`, `-`, `*`, can underflow and overflow. When checking is enabled this will panic. When checking is disabled this will two's complement wrap. -- The operations `/`, `%` are nonsensical for the arguments `INT_MIN` and `-1`. - When this occurs there is an unconditional panic. -- Shift operations (`<<`, `>>`) can shift a value of width `N` by more - than `N` bits. This is prevented by unconditionally masking the bits - of the right-hand-side to wrap modulo `N`. +- The operations `/`, `%` for the arguments `INT_MIN` and `-1` + will unconditionally panic. This is unconditional for legacy reasons. +- Shift operations (`<<`, `>>`) on a value of width `N` can be passed a shift value + >= `N`. It is unclear what behaviour should result from this, so the shift value + is unconditionally masked to be modulo `N` to ensure that the argument is always + in range.
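The rules in the list above can be illustrated with a minimal sketch. It assumes the `checked_*` and `wrapping_*` integer methods from the current standard library, which this patch does not name. The helper `masked_shift` is hypothetical and exists only to spell out the "modulo `N`" masking for `N = 32`. The sketch deliberately avoids the bare `<<` operator with an out-of-range amount, since how a given compiler version treats that case is not established by the text here.

```rust
/// Hypothetical helper: the shift amount after masking modulo the bit
/// width (N = 32 for `u32`), as the shift bullet describes.
fn masked_shift(amount: u32) -> u32 {
    amount % u32::BITS
}

fn main() {
    // `INT_MIN / -1` and `INT_MIN % -1`: the checked forms report the
    // problem as `None` instead of panicking.
    assert_eq!(i32::MIN.checked_div(-1), None);
    assert_eq!(i32::MIN.checked_rem(-1), None);

    // The wrapping forms give the two's complement results.
    assert_eq!(i32::MIN.wrapping_div(-1), i32::MIN);
    assert_eq!(i32::MIN.wrapping_rem(-1), 0);

    // Shift masking: a shift of 33 on a 32-bit value behaves like a
    // shift of 33 % 32 = 1.
    assert_eq!(masked_shift(33), 1);
    assert_eq!(1u32.wrapping_shl(33), 1u32 << masked_shift(33));
    assert_eq!(0x8000_0000u32.wrapping_shr(33), 0x4000_0000);

    // The checked form rejects any shift amount >= N.
    assert_eq!(1u32.checked_shl(32), None);
}
```

Using the explicit `wrapping_shl` and `checked_shl` methods keeps the intended behaviour visible at the call site, whichever overflow-checking mode the build uses.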
## Enabling overflow checking From 005e5b201c6ddfbdac24ae5b079211874ac8f54c Mon Sep 17 00:00:00 2001 From: mdinger Date: Fri, 7 Aug 2015 15:12:37 -0400 Subject: [PATCH 0479/1195] typo --- text/1219-use-group-as.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/1219-use-group-as.md b/text/1219-use-group-as.md index 4fbe3476c28..15dd88f2ea6 100644 --- a/text/1219-use-group-as.md +++ b/text/1219-use-group-as.md @@ -18,7 +18,7 @@ use std::io::{ # Motivation -THe current design requires the above example to be written like this: +The current design requires the above example to be written like this: ```rust use std::io::Error as IoError; From 8f4ca22854aaba3233c52a8b901ce7d5bc112137 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 7 Aug 2015 16:05:48 -0400 Subject: [PATCH 0480/1195] Correct typo in text of RFC1214 --- text/1214-projections-lifetimes-and-wf.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/1214-projections-lifetimes-and-wf.md b/text/1214-projections-lifetimes-and-wf.md index 12a1aafe632..65b896495f5 100644 --- a/text/1214-projections-lifetimes-and-wf.md +++ b/text/1214-projections-lifetimes-and-wf.md @@ -574,7 +574,7 @@ and `T: 'x` (from the rule `OutlivesReference`). But often we are in a situation where we can't normalize the projection (for example, a projection like `I::Item` where we only -know that `I: Iterator`). (For example, What can we do then? The rule +know that `I: Iterator`). What can we do then? The rule `OutlivesProjectionComponents` says that if we can conclude that every lifetime/type parameter `Pi` to the trait reference outlives `'x`, then we know that a projection from those parameters outlives `'x`. 
In From 04d873699df679dc0095b98347bdaaa7085fddfd Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 7 Aug 2015 16:08:16 -0400 Subject: [PATCH 0481/1195] Correct another typo in RFC 1214 that was overlooked --- text/1214-projections-lifetimes-and-wf.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/1214-projections-lifetimes-and-wf.md b/text/1214-projections-lifetimes-and-wf.md index 65b896495f5..642fd041a3f 100644 --- a/text/1214-projections-lifetimes-and-wf.md +++ b/text/1214-projections-lifetimes-and-wf.md @@ -449,7 +449,7 @@ or projections are involved: OutlivesFunction: ∀i. R,r.. ⊢ Ti: 'a -------------------------------------------------- - R ⊢ for fn(T1..Tn) -> T0 + R ⊢ for fn(T1..Tn) -> T0: 'a OutlivesFragment: ∀i. R,r.. ⊢ Pi: 'a From 01befd552cffc1404ab4162ae7a731b64aaf3dab Mon Sep 17 00:00:00 2001 From: Tshepang Lekhonkhobe Date: Sun, 5 Jul 2015 16:11:34 +0200 Subject: [PATCH 0482/1195] 0401: misc improvements --- text/0401-coercions.md | 64 +++++++++++++++++++++--------------------- 1 file changed, 32 insertions(+), 32 deletions(-) diff --git a/text/0401-coercions.md b/text/0401-coercions.md index a6bebbe9d6b..554ac61e11c 100644 --- a/text/0401-coercions.md +++ b/text/0401-coercions.md @@ -65,62 +65,62 @@ equality. A coercion is implicit and has no syntax. A coercion can only occur at certain coercion sites in a program, these are typically places where the desired type -is explicit or can be dervied by propagation from explicit types (without type +is explicit or can be derived by propagation from explicit types (without type inference). The base cases are: * In `let` statements where an explicit type is given: in `let _: U = e;`, `e` - is coerced to to have type `U`; + is coerced to have type `U` -* In statics and consts, similarly to `let` statements; +* In statics and consts, similarly to `let` statements * In argument position for function calls. 
The value being coerced is the actual parameter and it is coerced to the type of the formal parameter. For example, where `foo` is defined as `fn foo(x: U) { ... }` and is called with `foo(e);`, - `e` is coerced to have type `U`; + `e` is coerced to have type `U` * Where a field of a struct or variant is instantiated. E.g., where `struct Foo - { x: U }` and the instantiation is `Foo { x: e }`, `e` is coerced to to have - type `U`; + { x: U }` and the instantiation is `Foo { x: e }`, `e` is coerced to have + type `U` * The result of a function, either the final line of a block if it is not semi- colon terminated or any expression in a `return` statement. For example, for - `fn foo() -> U { e }`, `e` is coerced to to have type `U`; + `fn foo() -> U { e }`, `e` is coerced to have type `U` If the expression in one of these coercion sites is a coercion-propagating expression, then the relevant sub-expressions in that expression are also coercion sites. Propagation recurses from these new coercion sites. 
Propagating expressions and their relevant sub-expressions are: -* array literals, where the array has type `[U, ..n]`, each sub-expression in - the array literal is a coercion site for coercion to type `U`; +* Array literals, where the array has type `[U, ..n]`, each sub-expression in + the array literal is a coercion site for coercion to type `U` -* array literals with repeating syntax, where the array has type `[U, ..n]`, the - repeated sub-expression is a coercion site for coercion to type `U`; +* Array literals with repeating syntax, where the array has type `[U, ..n]`, the + repeated sub-expression is a coercion site for coercion to type `U` -* tuples, where a tuple is a coercion site to type `(U_0, U_1, ..., U_n)`, each +* Tuples, where a tuple is a coercion site to type `(U_0, U_1, ..., U_n)`, each sub-expression is a coercion site for the respective type, e.g., the zero-th - sub-expression is a coercion site to `U_0`; + sub-expression is a coercion site to `U_0` -* the box expression, if the expression has type `Box`, the sub-expression is +* The box expression, if the expression has type `Box`, the sub-expression is a coercion site to `U` (I expect this to be generalised when `box` expressions - are); + are) -* parenthesised sub-expressions (`(e)`), if the expression has type `U`, then - the sub-expression is a coercion site to `U`; +* Parenthesised sub-expressions (`(e)`), if the expression has type `U`, then + the sub-expression is a coercion site to `U` -* blocks, if a block has type `U`, then the last expression in the block (if it +* Blocks, if a block has type `U`, then the last expression in the block (if it is not semicolon-terminated) is a coercion site to `U`. This includes blocks which are part of control flow statements, such as `if`/`else`, if the block has a known type. Note that we do not perform coercions when matching traits (except for -receivers, see below). 
If there is an impl for some type `U` and `T` coerces to +receivers, see below). If there is an impl for some type `U`, and `T` coerces to `U`, that does not constitute an implementation for `T`. For example, the following will not type check, even though it is OK to coerce `t` to `&T` and there is an impl for `&T`: -``` +```rust struct T; trait Trait {} @@ -136,33 +136,33 @@ fn main() { ``` In a cast expression, `e as U`, the compiler will first attempt to coerce `e` to -`U`, only if that fails will the conversion rules for casts (see below) be +`U`, and only if that fails will the conversion rules for casts (see below) be applied. Coercion is allowed between the following types: -* `T` to `U` if `T` is a subtype of `U` (the 'identity' case); +* `T` to `U` if `T` is a subtype of `U` (the 'identity' case) * `T_1` to `T_3` where `T_1` coerces to `T_2` and `T_2` coerces to `T_3` - (transitivity case); + (transitivity case) -* `&mut T` to `&T`; +* `&mut T` to `&T` -* `*mut T` to `*const T`; +* `*mut T` to `*const T` -* `&T` to `*const T`; +* `&T` to `*const T` -* `&mut T` to `*mut T`; +* `&mut T` to `*mut T` * `T` to `U` if `T` implements `CoerceUnsized` (see below) and `T = Foo<...>` and `U = Foo<...>` (for any `Foo`, when we get HKT I expect this could be a - constraint on the `CoerceUnsized` trait, rather than being checked here); + constraint on the `CoerceUnsized` trait, rather than being checked here) * From TyCtor(`T`) to TyCtor(coerce_inner(`T`)) (these coercions could be - provided by implementing `CoerceUnsized` for all instances of TyCtor); + provided by implementing `CoerceUnsized` for all instances of TyCtor) + where TyCtor(`T`) is one of `&T`, `&mut T`, `*const T`, `*mut T`, or `Box`. -where TyCtor(`T`) is one of `&T`, `&mut T`, `*const T`, `*mut T`, or `Box`. 
-And where coerce_inner is defined as +And where coerce_inner is defined as: * coerce_inner(`[T, ..n]`) = `[T]`; @@ -204,7 +204,7 @@ It should be possible to coerce smart pointers (e.g., `Rc`) in the same way as the built-in pointers. In order to do so, we provide two traits and an intrinsic to allow users to make their smart pointers work with the compiler's coercions. It might be possible to implement some of the coercions described for built-in -pointers using this machinery, whether that is a good idea or not is an +pointers using this machinery, and whether that is a good idea or not is an implementation detail. ``` From fd38d78feee04065ab0629785413057df4f64228 Mon Sep 17 00:00:00 2001 From: John Hodge Date: Sun, 9 Aug 2015 16:33:15 +0800 Subject: [PATCH 0483/1195] Amend #911 const-fn to allow unsafe const functions --- text/0911-const-fn.md | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/text/0911-const-fn.md b/text/0911-const-fn.md index 38dc58809ee..de4aae48e51 100644 --- a/text/0911-const-fn.md +++ b/text/0911-const-fn.md @@ -172,6 +172,18 @@ const fn arithmetic_ops() -> [fn(T, T) -> T; 4] { } ``` +`const` functions can also be unsafe, allowing construction of types that require +invariants to be maintained (e.g. `std::ptr::Unique` requires a non-null pointer) +```rust +struct OptionalInt(u32); +impl OptionalInt { + /// Value must be non-zero + unsafe const fn new(val: u32) -> OptionalInt { + OptionalInt(val) + } +} +``` + # Drawbacks * A design that is not conservative enough risks creating backwards compatibility @@ -211,8 +223,6 @@ that a certain method of that trait is implemented as `const`. # Unresolved questions -* Allow `unsafe const fn`? The implementation cost is negligible, but I am not -certain it needs to exist. * Keep recursion or disallow it for now? The conservative choice of having no recursive `const fn`s would not affect the usecases intended for this RFC. 
If we do allow it, we probably need a recursion limit, and/or an evaluation @@ -226,3 +236,9 @@ cannot be taken for granted, at least `if`/`else` should eventually work. - This RFC was accepted on 2015-04-06. The primary concerns raised in the discussion concerned CTFE, and whether the `const fn` strategy locks us into an undesirable plan there. + +# Updates since being accepted + +Since it was accepted, the RFC has been updated as follows: + +1. Allowed `unsafe const fn` From e528bfe4d414ec5431b0f780b0b4adea16502a51 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 10 Aug 2015 13:12:13 -0400 Subject: [PATCH 0484/1195] typo: change name of rule from WfObjectFragment to WfTraitReference --- text/1214-projections-lifetimes-and-wf.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/1214-projections-lifetimes-and-wf.md b/text/1214-projections-lifetimes-and-wf.md index 642fd041a3f..9364d3dfe04 100644 --- a/text/1214-projections-lifetimes-and-wf.md +++ b/text/1214-projections-lifetimes-and-wf.md @@ -825,7 +825,7 @@ In some contexts, we want to check a trait reference, such as the ones that appear in where clauses or type parameter bounds. The rules for this are given here: - WfObjectFragment: + WfTraitReference: ∀i. R, r.. ⊢ Pi C = WhereClauses(Id) // and the conditions declared on Id must hold... R, r0...rn ⊢ [P0..Pn] C // ...after substituting parameters, of course From c4bf5e186b339308e073a2e726e8f62e5e80b539 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Wed, 12 Aug 2015 11:48:29 -0700 Subject: [PATCH 0485/1195] Remove struct flattening. This is non-trivial (for me) to implement, and ended up not being that useful, i.e. it wasn't needed to make useful things. 
--- text/0000-simd-infrastructure.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index a3527afe8dd..1c6b8cbf74b 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -76,9 +76,7 @@ struct Simd2(T, T); The `simd` `repr` can be attached to a struct and will cause such a struct to be compiled to a SIMD vector. It can be generic, but it is required that any fully monomorphised instance of the type consist of -only a single "primitive" type, repeated some number of times. Types -are flattened, so, for `struct Bar(u64);`, `Simd2` has the same -representation as `Simd2`. +only a single "primitive" type, repeated some number of times. The `repr(simd)` may not enforce that any trait bounds exists/does the right thing at the type checking level for generic `repr(simd)` From 653267060bb2e8307c6eac27d8ecd02a9d136370 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Wed, 12 Aug 2015 11:50:16 -0700 Subject: [PATCH 0486/1195] Change shuffles to use arrays of indices. This is *far* more scalable than having an argument for each value. Thanks to @pnkfelix for the suggestion. --- text/0000-simd-infrastructure.md | 24 ++++++++---------------- 1 file changed, 8 insertions(+), 16 deletions(-) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index 1c6b8cbf74b..0b6fb616e1d 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -196,16 +196,10 @@ platform specific intrinsic for shuffling. 
```rust extern "platform-intrinsic" { - fn simd_shuffle2(v: T, w: T, i0: u32, i1: u32) -> Simd2; - fn simd_shuffle4(v: T, w: T, i0: u32, i1: u32, i2: u32, i3: u32) -> Sidm4; - fn simd_shuffle8(v: T, w: T, - i0: u32, i1: u32, i2: u32, i3: u32, - i4: u32, i5: u32, i6: u32, i7: u32) -> Simd8; - fn simd_shuffle16(v: T, w: T, - i0: u32, i1: u32, i2: u32, i3: u32, - i4: u32, i5: u32, i6: u32, i7: u32 - i8: u32, i9: u32, i10: u32, i11: u32, - i12: u32, i13: u32, i14: u32, i15: u32) -> Simd16; + fn simd_shuffle2(v: T, w: T, idx: [i32; 2]) -> Simd2; + fn simd_shuffle4(v: T, w: T, idx: [i32; 4]) -> Sidm4; + fn simd_shuffle8(v: T, w: T, idx: [i32; 8]) -> Simd8; + fn simd_shuffle16(v: T, w: T, idx: [i32; 16]) -> Simd16; } ``` @@ -214,10 +208,8 @@ time, ensure that `T` is a SIMD vector, `Elem` is the element type of `T` etc. Libraries can use traits to ensure that these will be enforced by the type checker too. -This approach has some downsides: `simd_shuffle32` (e.g. `Simd32` -on AVX, and `Simd32` on AVX-512) and especially `simd_shuffle64` -(e.g. `Simd64` on AVX-512) are unwieldy. These have similar type -"safety"/code-generation errors to the vectors themselves. +This approach has similar type "safety"/code-generation errors to the +vectors themselves. These operations are semantically: @@ -225,10 +217,10 @@ These operations are semantically: // vector of double length let z = concat(v, w); -return [z[i0], z[i1], z[i2], ...] +return [z[idx[0]], z[idx[1]], z[idx[2]], ...] ``` -The indices `iN` have to be compile time constants. Out of bounds +The index array `idx` has to be compile time constants. Out of bounds indices yield unspecified results. Similarly, intrinsics for inserting/extracting elements into/out of From 8e3a0deba208e4b42216cf210570755f79893aca Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Wed, 12 Aug 2015 11:54:37 -0700 Subject: [PATCH 0487/1195] shuffles don't rely on generic types for return values. 
This has less type safety, but doesn't require generic simd types to exist: #[repr(simd)] struct Simd2(T, T); --- text/0000-simd-infrastructure.md | 26 ++++++++++++++++++-------- 1 file changed, 18 insertions(+), 8 deletions(-) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index 0b6fb616e1d..7bc38d2f90d 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -196,17 +196,17 @@ platform specific intrinsic for shuffling. ```rust extern "platform-intrinsic" { - fn simd_shuffle2(v: T, w: T, idx: [i32; 2]) -> Simd2; - fn simd_shuffle4(v: T, w: T, idx: [i32; 4]) -> Sidm4; - fn simd_shuffle8(v: T, w: T, idx: [i32; 8]) -> Simd8; - fn simd_shuffle16(v: T, w: T, idx: [i32; 16]) -> Simd16; + fn simd_shuffle2(v: T, w: T, idx: [i32; 2]) -> U; + fn simd_shuffle4(v: T, w: T, idx: [i32; 4]) -> U; + fn simd_shuffle8(v: T, w: T, idx: [i32; 8]) -> U; + fn simd_shuffle16(v: T, w: T, idx: [i32; 16]) -> U; } ``` The raw definitions are only checked for validity at monomorphisation -time, ensure that `T` is a SIMD vector, `Elem` is the element type of -`T` etc. Libraries can use traits to ensure that these will be -enforced by the type checker too. +time, ensure that `T` and `U` are SIMD vector with the same element +type, `U` has the appropriate length etc. Libraries can use traits to +ensure that these will be enforced by the type checker too. This approach has similar type "safety"/code-generation errors to the vectors themselves. @@ -362,6 +362,17 @@ cfg_if_else! { pointers(/references?) in `repr(simd)` types. - allow (and ignore for everything but type checking) zero-sized types in `repr(simd)` structs, to allow tagging them with markers +- the shuffle intrinsics could be made more relaxed in their type + checking (i.e. 
not require that they return their second type + parameter), to allow more type safety when combined with generic + simd types: + + #[repr(simd)] struct Simd2(T, T); + extern "platform-intrinsic" { + fn simd_shuffle2(x: T, y: T, idx: [u32; 2]) -> Simd2; + } + + This should be a backwards-compatible generalisation. # Alternatives @@ -404,7 +415,6 @@ cfg_if_else! { - use generic intrinsics like shuffles for the arithmetic operations, instead of providing the operations implicitly. - # Unresolved questions - Should integer vectors get division automatically? Most CPUs From 54b0927ea11ae544549edaa9a79b5ca1b6225a91 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Wed, 12 Aug 2015 11:59:12 -0700 Subject: [PATCH 0488/1195] Intrinsics-for-operations is now the RFC, not an alternative. Also, the comparison comment no longer makes sense. --- text/0000-simd-infrastructure.md | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index 7bc38d2f90d..d4d8558099a 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -240,11 +240,8 @@ similarly to the shuffles. ### Comparisons -Comparisons are implemented via intrinsics, because the current -comparison operator infrastructure doesn't easily lend itself to -return vectors, as required. - -The raw signatures would look like: +Comparisons are implemented via intrinsics. The raw signatures would +look like: ```rust extern "platform-intrinsic" { @@ -412,8 +409,6 @@ cfg_if_else! { - have 100% guaranteed type-safety for generic `#[repr(simd)]` types and the generic intrinsics. This would probably require a relatively complicated set of traits (with compiler integration). -- use generic intrinsics like shuffles for the arithmetic operations, - instead of providing the operations implicitly. 
# Unresolved questions From 9e31ad3327327eb51849623f5f5bf7cc2afb58c0 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Wed, 12 Aug 2015 14:25:05 -0700 Subject: [PATCH 0489/1195] Out of bounds indices are errors (backwards compat to relax). --- text/0000-simd-infrastructure.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index d4d8558099a..dde310e1f16 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -221,7 +221,7 @@ return [z[idx[0]], z[idx[1]], z[idx[2]], ...] ``` The index array `idx` has to be compile time constants. Out of bounds -indices yield unspecified results. +indices yield errors. Similarly, intrinsics for inserting/extracting elements into/out of vectors are provided, to allow modelling the SIMD vectors as actual From 91a2b360ae7b4c6e820b57d937f6012ffc448f73 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Wed, 12 Aug 2015 14:27:05 -0700 Subject: [PATCH 0490/1195] Only invalid to *call* intrinsics on bad platforms. It's valid to `extern` them, though. --- text/0000-simd-infrastructure.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index dde310e1f16..f0d1f5eec04 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -126,11 +126,11 @@ extern "platform-intrinsic" { These all use entirely concrete types, and this is the core interface to these intrinsics: essentially it is just allowing code to exactly specify a CPU instruction to use. These intrinsics only actually work -on a subset of the CPUs that Rust targets, and are only be available -for `extern`ing on those targets. The signatures are typechecked, but -in a "duck-typed" manner: it will just ensure that the types are SIMD -vectors with the appropriate length and element type, it will not -enforce a specific nominal type. 
+on a subset of the CPUs that Rust targets, and will result in compile +time errors if they are called on platforms that do not support +them. The signatures are typechecked, but in a "duck-typed" manner: it +will just ensure that the types are SIMD vectors with the appropriate +length and element type, it will not enforce a specific nominal type. NB. The structural typing is just for the declaration: if a SIMD intrinsic is declared to take a type `X`, it must always be called with `X`, From 60931df73c8733fecd662dff900a5f172f01eee4 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Wed, 12 Aug 2015 14:27:55 -0700 Subject: [PATCH 0491/1195] There can be more shuffles. --- text/0000-simd-infrastructure.md | 1 + 1 file changed, 1 insertion(+) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index f0d1f5eec04..f1ad633420d 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -200,6 +200,7 @@ extern "platform-intrinsic" { fn simd_shuffle4(v: T, w: T, idx: [i32; 4]) -> U; fn simd_shuffle8(v: T, w: T, idx: [i32; 8]) -> U; fn simd_shuffle16(v: T, w: T, idx: [i32; 16]) -> U; + // ... } ``` From 05835cb8a72cd8e9970f297ef9eba4f0b4aeb230 Mon Sep 17 00:00:00 2001 From: Paul Dicker Date: Thu, 13 Aug 2015 14:25:57 +0200 Subject: [PATCH 0492/1195] Document and expand the open options RFC --- 0000-open-options.md | 543 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 543 insertions(+) create mode 100644 0000-open-options.md diff --git a/0000-open-options.md b/0000-open-options.md new file mode 100644 index 00000000000..e3874c56607 --- /dev/null +++ b/0000-open-options.md @@ -0,0 +1,543 @@ +- Feature Name: expand-open-options +- Start Date: 2015-08-04 +- RFC PR: +- Rust Issue: + +# Summary + +Document and expand the open options. + + +# Motivation + +The options that can be passed to the os when opening a file vary between +systems. 
And even if the options seem the same or similar, there may be +unexpected corner cases. + +This RFC attempts to +- describe the different corner cases and behaviour of various operating + systems. +- describe the intended behaviour and interaction of Rusts options. +- remedy cross-platform inconsistencies. +- suggest extra options to expose more platform-specific options. + + +# Detailed design + +## Access modes + +### Read-only +Open a file for read-only. + +### Write-only +Open a file for write-only. + +If a file already exist, the contents of that file get overwritten, but it is +not truncated. Example: +``` +// contents of file before: "aaaaaaaa" +file.write(b"bbbb") +// contents of file after: "bbbbaaaa" +``` + +### Read-write +This is the simple combinations of read-only and write-only. + +### Append-mode +Append-mode is similar to write-only, but all writes always happen at the end of +the file. This mode is especially useful if multiple processes or threads write +to a single file, like a log file. The operating system guarantees all writes +are atomic: no writes get mangled because another process writes at the same +time. No guarantees are made about the order writes end up in the file though. + +Note: sadly append-mode is not atomic on NFS filesystems. + +One maybe obvious note when using append-mode: make sure that all data that +belongs together, is written the the file in one operation. This can be done +by concatenating strings before passing them to `write()`, or using a buffered +writer (with a more than adequately sized buffer) and calling `flush()` when the +message is complete. + +_Implementation detail_: On Windows opening a file in append-mode has one flag +_less_, the right to change existing data is removed. On Unix opening a file in +append-mode has one flag _extra_, that sets the status of the file descriptor to +append-mode. You could say that on Windows write is a superset of append, while +on Unix append is a superset of write. 
+ +Because of this append is treated as a seperate access mode in Rust, and if +`.append(true)` is specified than `.write()` is ignored. + +### Read-append +Writing to the file works exactly the same as in append-mode. + +Reading is more difficult, and may involve a lot of seeking. When the file is +opened, the position for reading may be set at the end of the file, so you +should first seek to the beginning. Also after every write the position is set +to the end of the file. So before writing you should save the current position, +and restore it after the write. +``` +try!(file.seek(SeekFrom::Start(0))); +try!(file.read(&mut buffer)); +let pos = try!(file.seek(SeekFrom::Current(0))); +try!(file.write(b"foo")); +try!(file.seek(SeekFrom::Start(pos))); +try!(file.read(&mut buffer)); +``` + +### No access mode set +On Windows it is possible to open a file without setting an access mode. You can +do practically nothing with the file, but you can read +[metadata](https://msdn.microsoft.com/en-us/library/windows/desktop/aa363788%28v=vs.85%29.aspx) +such as the file size or timestamp. + +On Unix it is traditionally not possible to open a file without specifying the +access mode, because of the way the access flags where defined: something like +`O_RDONLY = 0`, `O_WRONLY = 1` and `O_RDWR = 2`. When no flags are set, the +access mode is `0` and you fall back to opening the file read-only. + +Linux since version 2.6.39 has functionality similar to Windows by opening the +file with `O_RDONLY | O_PATH`. Since version 3.6 you can call `fstat` on a file +descriptor opened this way. + +For what it's worth +[Gnu Hurd](http://www.gnu.org/software/libc/manual/html_node/Access-Modes.html) +allows opening files without an access mode, because it defines `O_RDONLY = 1` +and `O_WRONLY = 2`. It allows all operations on the file that do not involve +reading or writing the data, like `chmod`. 
+ +On Unix systems that fall back to opening the file read-only, Rust will fail +opening the file with `E_INVALID`. Otherwise, if for example you are developing +on OS X but forget to set `.read(true)` when opening a file, it would work on +OS X but not on other systems. + +### Windows-specific +`.desired_access(FILE_READ_DATA)` + +On Windows you can detail whether you want to have read and/or write access to +the files data, attributes and/or extended attributes. Managing premissions in +such detail has proven itself too difficult, and generally not worth it. + +In Rust, `.read(true)` gives you read access to the data, attributes and +extended attributes. Similarly, `.write(true)` gives write access to those +three, and the right to append data beyond the current end of the file. + +But if you want fine-grained control, with `desired_access` you have it. + + +## Creation modes + +creation mode | file exists | file does not exist | Unix | Windows | +:----------------------------|-------------|---------------------|:------------------|:------------------------------------------| +(not set) | open | fail | | OPEN_EXISTING | +.create(true) | open | create | O_CREAT | OPEN_ALWAYS | +.truncate(true) | truncate | fail | O_TRUNC | TRUNCATE_EXISTING | +.create(true).truncate(true) | truncate | create | O_CREAT + O_TRUNC | CREATE_ALWAYS | +.create_new(true) | fail | create | O_CREAT + O_EXCL | CREATE_NEW + FILE_FLAG_OPEN_REPARSE_POINT | + +### Not set +Open an existing file. Fails if the file does not exist. + +### Create +`.create(true)` + +Open an existing file, or create a new file if it already exists. + +Even if the access mode is read-only, it seems all operating systems can still +create a new file (for whatever use reading from an empty file may be). + +### Truncate +`.truncate(true)` + +Open an existing file, and truncate it to zero length. Fails if the file does +not exist. Attributes and permissions of the truncated file are preserved. 
+ +Truncating will not work in read-only or append mode. Some platforms may support +this, but Rust does not allow this for cross-platform consistency (besides it +being sane behaviour). + +### Create and truncate +`.create(true).truncate(true)` + +The logical combination of create and truncate. + +### Create_new +`.create_new(true)` + +Create a new file, and fail if it already exist. + +On Unix this options started its life as a security measure. If you first check +if a file does not exists with `exists()` and then call `open()`, some other +process may have created in the in mean time. `.create_new()` is an atomic +operation that will fail if a file already exist at the location. + +`.create_new()` has a special rule on Unix for dealing with symlinks. If there +is a symlink at the final element of its path (e.g. the filename), open will +fail. This is to prevent a vulnerability where an unprivileged process could +trick a privileged process into following a symlink and overwriting a file the +unprivileged process has no access to. +See [Exploiting symlinks and tmpfiles](https://lwn.net/Articles/250468/). +On Windows this behaviour is imitated by specifying not only `CREATE_NEW` but +also `FILE_FLAG_OPEN_REPARSE_POINT`. + +Simply put: nothing is allowed to exist on the target location, also no +(dangling) symlink. + +if `.create_new(true)` is set, `.create()` and `.truncate()` are ignored. + + +### Unix-specific: Mode +`.mode(0o666)` + +On Unix the new file is created by default with permissions `0o666` minus the +systems `umask` (see [Wikipedia](https://en.wikipedia.org/wiki/Umask)). It is +possible to set on other mode with this option. + +If a file already exist or `.create(true)` or `.create_new(true)` are not +specified, `.mode()` is ignored. + +Rust currently does not expose a way to modify the umask. 
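The creation modes in the table above can be exercised with `std::fs::OpenOptions`. The following is a minimal sketch, not part of the patch itself. It assumes the options behave as this draft describes, including `.create_new()`. The helper name `creation_modes_demo` and the scratch file name are hypothetical and exist only for illustration.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, ErrorKind, Write};
use std::path::Path;

/// Walks through the creation modes on `path` and returns the final contents.
/// `creation_modes_demo` is an illustrative helper, not an API from the RFC.
fn creation_modes_demo(path: &Path) -> io::Result<String> {
    // Start from the "file does not exist" column of the table.
    let _ = fs::remove_file(path);

    // (not set): opening a missing file fails.
    let missing = OpenOptions::new().read(true).open(path);
    assert_eq!(missing.unwrap_err().kind(), ErrorKind::NotFound);

    // .create_new(true): nothing exists at the path yet, so the file is created.
    let mut file = OpenOptions::new().write(true).create_new(true).open(path)?;
    file.write_all(b"aaaaaaaa")?;
    drop(file);

    // .create_new(true) again: fails because the file now exists.
    let again = OpenOptions::new().write(true).create_new(true).open(path);
    assert_eq!(again.unwrap_err().kind(), ErrorKind::AlreadyExists);

    // .create(true): opens the existing file without truncating it,
    // so a write at the start overwrites only the first bytes.
    let mut file = OpenOptions::new().write(true).create(true).open(path)?;
    file.write_all(b"bbbb")?;
    drop(file);
    assert_eq!(fs::read_to_string(path)?, "bbbbaaaa");

    // .create(true).truncate(true): the existing contents are discarded first.
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;
    file.write_all(b"cc")?;
    drop(file);

    let contents = fs::read_to_string(path)?;
    fs::remove_file(path)?;
    Ok(contents)
}

fn main() -> io::Result<()> {
    // Hypothetical scratch location, chosen only for this sketch.
    let path = std::env::temp_dir().join("open_options_creation_modes.txt");
    let contents = creation_modes_demo(&path)?;
    println!("final contents: {:?}", contents);
    Ok(())
}
```

The sketch pairs every creation mode with `.write(true)`, since the draft notes that truncation is not meant to work in read-only or append mode.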
+ +### Windows-specific: Attributes +`.attributes(FILE_ATTRIBUTE_READONLY | FILE_ATTRIBUTE_HIDDEN | FILE_ATTRIBUTE_SYSTEM)` + +Files on Windows can have several attributes, most commonly one or more of the +following four: readonly, hidden, system and archive. Most +[others](https://msdn.microsoft.com/en-us/library/windows/desktop/gg258117%28v=vs.85%29.aspx) +are properties set by the file system. It seems of the others only +`FILE_ATTRIBUTE_ENCRYPTED` and `FILE_ATTRIBUTE_TEMPORARY` are meaningful to set +when creating a new file. + +It is no use to set the archive flag, as Windows sets it automatically when the +file is newly created or modified. This flag may then be used by backup +applications as an indication of which files have changed. + +If a _new_ file is created because it does not yet exist and `.create(true)` or +`.create_new(true)` are specified, the new file is given the attributes declared +with `.attributes()`. + +If an _existing_ file is opened with `.create(true).truncate(true)`, its +attributes are also replaced by the ones specified with `.attributes()`. + +In all other cases the attributes get ignored. + + +## Sharing / locking +On Unix it is possible for multiple processes to read and write to the same file +at the same time. + +When you open a file on Windows, the system by default denies other processes to +read or write to the file, or delete it. By setting the sharing mode, it is +possible to allow other processes read, write and/or delete access. For +cross-platform consistency, Rust imitates Unix by setting all sharing flags. + +Unix has no equivalent to the kind of file locking that Windows has. It has two +types of advisory locking, POSIX and BSD-style. Advisory means any process that +does not use locking itself can happily ignore the locking af another process. +As if that is not bad enough, they both have +[problems](http://0pointer.de/blog/projects/locking.html) that make them close +to unusable for modern multi-treaded programs. 
Linux may in some very rare cases +support mandatory file locking, but it is just as broken as advisory. + +For Rust, the sharing mode can be set with a Windows-specific option. Given the +problems above, i don't expect there to ever be a cross-platform option for file +locking. + +### Windows-specific: Share mode +`.share_mode(FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE)` + +It is possible to set the individual share permissions with `.share_mode()`. + +The current philosophy of this function is that others should have no rights, +unless explicitely granted. I think a better fit for Rust would be to give all +others all rights, unless explicitly denied, e.g.: +`.share_mode(DENY_READ | DENY_WRITE | DENY_DELETE)`. + + +## Caching behaviour + +### Read cache hint + +Instead of requesting only the data neccesary for a single `read()` call from a +storage device, an operating system may request more data than nessesary to have +it already available for the next read call (e.g. the read-ahead cache). If you +read the file sequentially this is beneficial, for completely random access it +can become a penalty. Operating systems generally have good heuristics, but you +may get a performance win if you give the os a hint about how you will read the +file. + +Do some real-world benchmarks before setting this option. + +#### Cache hint +``` +.cache_hint(enum CacheHint) + +enum CacheHint { + Normal, + Sequential, + Random, +} +``` + +On Windows this maps to the flags `FILE_FLAG_SEQUENTIAL_SCAN` and +`FILE_FLAG_RANDOM_ACCESS`. On Linux and FreeBSD they map to the system call +`posix_fadvise()` with the flags `POSIX_FADV_SEQUENTIAL` and +`POSIX_FADV_RANDOM`. + +This option is ignored on operating systems that do not support caching hints. + +### Write cache + +See [Ensuring data reaches disk](https://lwn.net/Articles/457667/) + +1. copy data to kernel space +2. the kernel may wait a short wile +3. data is written to the cache of the storage device +4. 
data is written to persistent storage + +The Rust functions `sync_all()` and `sync_data()` control step 2: they force all +data in the write buffer of the kernel to be written to the storage device. This +is important to ensure critical data reaches the storage device in case of a +system crash or power outage, but comes with a large performance penalty. + +All modern operating systems also support a mode where each call to `write()` +will not return until the data is written to the storage device, dus removing +step 2 for _all_ writes. This can be a useful options for writing critical data, +where you would call `sync_data()` after each write. This saves a system call +for each write, and you are sure to never forget it. + +#### Sync all +`.sync_all(true)`: implement an open option with the same name as the free +standing call. + +On Windows this means setting the flag `FILE_FLAG_WRITE_THROUGH`, and on Unix +(except OS X) the flag `O_SYNC`. + +OS X does not support `O_SYNC`, but it is possible to call fcntl with +`F_NOCACHE` to get the same effect. This has the side-effect that data also does +not end up in the read cache, so this can have a performance penalty when +reading if a file is opened with a read-write access mode. + +#### Sync data +`.sync_data(true)` + +Some systems support only syncing the data written, but can wait with updating +less critical metadata such as the last modified timestamp. If the metadata is +not critical (and it rarely is), you should always use `sync_data()` as an easy +performance win. + +Linux since version 2.6.33 supports this mode with `O_DSYNC`, as does Solaris +and recent versions of NetBSD. If a system does not support only syncing data, +Rust will fall back to full syncing. + +If `.sync_all(true)` is specified, `.sync_data()` is ignored. + + +### Completely bypass the kernel +Normally the os kernel will process read or write calls and store the data +temporarily in a kernel-space buffer. 
The kernel makes sure the data size and +alignment of reads and writes correspondent to sectors on the storage device, +usually 512 or 4096 bytes. Also the kernel can keep data recently read or +written in cache, to speed up future file operations. + +Some operating systems allow you to completely bypass the copy of data to or +from kernel space. This is generally a bad idea. Applications will have figure +out and handle alignment restrictions themself, and implement manual caching. It +is mostly useful for database applications that may have more knowledge obout +their optimal caching behaviour than the os. And it can have a use when reading +many gigabytes of data (like a backup process), which may destroy the os cache +for other processes. + +This is available on Windows with the flag `FILE_FLAG_NO_BUFFERING`, and on +Linux and some variants of BSD with `O_DIRECT`. Making correct use of this mode +involves low-level tuning and operating system dependant behaviour. It makes no +sense for Rust to expose this as a simple, cross-platform option. For +applications that really wish to use it, it is no problem to submit it as a +custom flag. + + +## Asynchronous IO +Out op scope. + + +## Other options + +### Inheritance of file descriptors + +Leaking file descriptors to child processes can cause problems and can be a +security vulnerability. See this report by +[Python](https://www.python.org/dev/peps/pep-0446/). + +On Windows, child processes do not inherit file descriptors by default (but this +can be changed). On Unix they always inherit, unless the close-on-exec flag is +set. + +The close on exec flag can be set atomically when opening the file, or later +with `fcntl`. The `O_CLOEXEC` flag is in the relatively new POSIX-2008 standard, +and all modern versions of Unix support it. The following table lists for which +operating systems we can rely on the flag to be supported. 
+ +os | since version | oldest supported version +:-------------|:--------------|:------------------------ +OS X | 10.6 | 10.7? +Linux | 2.6.23 | 2.6.32 (supported by Rust) +FreeBSD | 8.3 | 8.4 +OpenBSD | 5.0 | 5.7 +NetBSD | 6.0 | 5.0 +Dragonfly BSD | 3.2 | ? (3.2 is not updated since 2012-12-14) +Solaris | 11 | 10 + +This means we can always set the flag `O_CLOEXEC`, and do an additional `fcntl` +if the os is NetBSD or Solaris. + + +### Custom flags +`.custom_flags()` + +Windows and the various flavors of Unix support flags that are not +cross-platform, but that can be useful in some circumstances. On Unix they will +be passed as the variable _flags_ to `open`, on Windows as the +_dwFlagsAndAttributes_ parameter. + +The cross-platform options of Rust can do magic: they can set any flag neccesary +to ensure it works as expected. For example, `.append(true)` on Unix not only +sets the flag `O_APPEND`, but also automatically `O_WRONLY` or `O_RDWR`. This +special treatment is not available for the custom flags. + +Custom flags can only set flags, not remove flags set by Rusts options. + +For the custom flags on Unix, the bits that define the access mode are masked +out with `O_ACCMODE`, to ensure they do not interfere with the access mode set by Rusts options. 
+ +| [Windows](https://msdn.microsoft.com/en-us/library/windows/desktop/hh449426%28v=vs.85%29.aspx): +|:--------------------------- +| FILE_FLAG_BACKUP_SEMANTICS +| FILE_FLAG_DELETE_ON_CLOSE +| FILE_FLAG_NO_BUFFERING +| FILE_FLAG_OPEN_NO_RECALL +| FILE_FLAG_OPEN_REPARSE_POINT +| FILE_FLAG_OVERLAPPED +| FILE_FLAG_POSIX_SEMANTICS +| FILE_FLAG_RANDOM_ACCESS +| FILE_FLAG_SESSION_AWARE +| FILE_FLAG_SEQUENTIAL_SCAN +| FILE_FLAG_WRITE_THROUGH + +Unix: + +| Posix | Linux | OS X | FreeBSD | OpenBSD | NetBSD |Dragonfly BSD| Solaris | +|:------------|:------------|:------------|:------------|:------------|:------------|:------------|:------------| +| O_DIRECTORY | O_DIRECTORY | | O_DIRECTORY | O_DIRECTORY | O_DIRECTORY | O_DIRECTORY | O_DIRECTORY | +| O_NOCTTY | O_NOCTTY | | O_NOCTTY | | O_NOCTTY | | O_NOCTTY | +| O_NOFOLLOW | O_NOFOLLOW | O_NOFOLLOW | O_NOFOLLOW | O_NOFOLLOW | O_NOFOLLOW | O_NOFOLLOW | O_NOFOLLOW | +| O_NONBLOCK | O_NONBLOCK | O_NONBLOCK | O_NONBLOCK | O_NONBLOCK | O_NONBLOCK | O_NONBLOCK | O_NONBLOCK | +| O_DSYNC | O_DSYNC | | | | O_DSYNC | | O_DSYNC | +| O_RSYNC | | | | | O_RSYNC | | O_RSYNC | +| O_SYNC | O_SYNC | | O_SYNC | O_SYNC | O_SYNC | O_FSYNC | O_SYNC | +| | O_DIRECT | | O_DIRECT | | O_DIRECT | O_DIRECT | | +| | O_ASYNC | | | | O_ASYNC | | | +| | O_NOATIME | | | | | | | +| | O_PATH | | | | | | | +| | O_TMPFILE | | | | | | | +| | | O_SHLOCK | O_SHLOCK | O_SHLOCK | O_SHLOCK | O_SHLOCK | | +| | | O_EXLOCK | O_EXLOCK | O_EXLOCK | O_EXLOCK | O_EXLOCK | | +| | | O_SYMLINK | | | | | | +| | | O_EVTONLY | | | | | | +| | | | | | O_NOSIGPIPE | | | +| | | | | | O_ALT_IO | | | +| | | | | | | | O_NOLINKS | +| | | | | | | | O_XATTR | +| [Posix](http://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html) | [Linux](http://man7.org/linux/man-pages/man2/open.2.html) | [OS X](https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man2/open.2.html) | [FreeBSD](https://www.freebsd.org/cgi/man.cgi?query=open&sektion=2) | 
[OpenBSD](http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man2/open.2?query=open&sec=2) | [NetBSD](http://netbsd.gw.com/cgi-bin/man-cgi?open+2+NetBSD-current) | [Dragonfly BSD](http://leaf.dragonflybsd.org/cgi/web-man?command=open&section=2) | [Solaris](http://docs.oracle.com/cd/E23824_01/html/821-1463/open-2.html) | + + +### Windows-specific flags and attributes + +The following variables for CreateFile2 currently have no equivalent functions +in Rust to set them: +``` +DWORD dwSecurityQosFlags; +LPSECURITY_ATTRIBUTES lpSecurityAttributes; +HANDLE hTemplateFile; +``` + + +## Changes from current + +### Access mode +- Current: `.append(true)` requires `.write(true)` on Unix, but not on Windows. + New: ignore `.write()` if `.append(true)` is specified. +- Current: when `.append(true)` is set, it is not possible to modify file + attributes on Windows, but it is possible to change the file mode on Unix. + New: allow file attributes to be modified on Windows in append-mode. +- Current: when no access mode is set, this falls back to opening the file + read-only on Unix. + New: open with `O_RDONLY | O_PATH` on Linux, and fail with `E_INVALID` on all + other Unix variants. +- Maybe rename the Windows-specific `.desired_access()` to `.access_mode()`? + +### Creation mode +- Do not allow `.truncate(true)` if the access mode is read-only and/or append. + This is currently buggy on Windows, and works on some versions of Unix, but + not on others (implementation defined). +- Implement `.create_new()`. +- Remove the Windows-specific `.creation_disposition()`. + It has no use, because all its options can be set in a cross-platform way. +- Split the Windows-specific `.flags_and_attributes()` into `.flags()` and + `.attributes()`. This is a form of future-proofing, as the new Windows 8 + `Createfile2` also splits these attributes.
This has the advantage of a clear + seperation between file attributes, that are somewhat similar to Unix mode + bits, and the custom flags that modify the behaviour of the current file + handle. + +### Sharing / locking +- Currently `.share_mode()` grants permissions, change it to grant by default, + and possibly deny permissions. + +### Caching behaviour +- Implement `.cache_hint()`. Should this use an enum? +- Implement `.sync_all()` and `.sync_data()`. + +### Other options +- Set the close-on-exec flag atomically on Unix if supported. +- Implement `.custom_flags()` on Windows and Unix to pass custom flags to the +system. + + + +# Drawbacks + +This adds a thin layer on top of the raw operating system calls. In this +[pull request](https://github.com/rust-lang/rust/pull/26772#issuecomment-126753342) +the conclusion was: this seems like a good idea for a "high level" abstraction +like OpenOptions. + +This adds extra options that many applications can do without (otherwise they +were already implemented). + +Also this RFC is in line with the vision for IO in the +[IO-OS-redesign](https://github.com/rust-lang/rfcs/blob/master/text/0517-io-os-reform.md#vision-for-io): +- [The APIs] should impose essentially zero cost over the underlying OS + services; the core APIs should map down to a single syscall unless more are + needed for cross-platform compatibility. +- The APIs should largely feel like part of "Rust" rather than part of any + legacy, and they should enable truly portable code. +- Coverage. The std APIs should over time strive for full coverage of non-niche, + cross-platform capabilities. + + +# Alternatives + +Keep the status quo. + +# Unresolved questions + +Implementation and testing of `.nofollow()`, `.sync_all()` and `.sync_data()` +could uncover some corner cases, but I don't expect any that would give great +trouble. + +Should `.cache_hint()` take an enum? + +Rename the Windows-specific `.desired_access()` to `.access_mode()`? 
+ +What should be done about the missing variables for `CreateFile2`? + +Are there any other options that we should define while at it? \ No newline at end of file From 0e4a855ce0a278f3abda80812cce9ee9080abc7c Mon Sep 17 00:00:00 2001 From: Paul Dicker Date: Fri, 14 Aug 2015 17:04:52 +0200 Subject: [PATCH 0493/1195] Document Windows access flags, and truncate problem Also some spelling fixes, and two comments from @nagisa --- 0000-open-options.md | 77 ++++++++++++++++++++++++++++++++------------ 1 file changed, 56 insertions(+), 21 deletions(-) diff --git a/0000-open-options.md b/0000-open-options.md index e3874c56607..a0dfeaa1b83 100644 --- a/0000-open-options.md +++ b/0000-open-options.md @@ -64,7 +64,7 @@ append-mode has one flag _extra_, that sets the status of the file descriptor to append-mode. You could say that on Windows write is a superset of append, while on Unix append is a superset of write. -Because of this append is treated as a seperate access mode in Rust, and if +Because of this append is treated as a separate access mode in Rust, and if `.append(true)` is specified than `.write()` is ignored. ### Read-append @@ -114,7 +114,7 @@ OS X but not on other systems. `.desired_access(FILE_READ_DATA)` On Windows you can detail whether you want to have read and/or write access to -the files data, attributes and/or extended attributes. Managing premissions in +the files data, attributes and/or extended attributes. Managing permissions in such detail has proven itself too difficult, and generally not worth it. In Rust, `.read(true)` gives you read access to the data, attributes and @@ -123,6 +123,34 @@ three, and the right to append data beyond the current end of the file. But if you want fine-grained control, with `desired_access` you have it. 
+As a reference, this are the flags set by Rusts access modes: + +bit| flag | read | write | read-write | append | read-append | +--:|:----------------------|:-----:|:-----:|:----------:|:------:|:-----------:| + | **generic rights** | | | | | | +31 | GENERIC_READ | set | | set | | set | +30 | GENERIC_WRITE | | set | set | | | +29 | GENERIC_EXECUTE | | | | | | +28 | GENERIC_ALL | | | | | | + | **specific rights** | | | | | | + 0 | FILE_READ_DATA |implied| | implied | | implied | + 1 | FILE_WRITE_DATA | |implied| implied | | | + 2 | FILE_APPEND_DATA | |implied| implied | set | set | + 3 | FILE_READ_EA |implied| | implied | | implied | + 4 | FILE_WRITE_EA | |implied| implied | set | set | + 6 | FILE_EXECUTE | | | | | | + 7 | FILE_READ_ATTRIBUTES |implied| | implied | | implied | + 8 | FILE_WRITE_ATTRIBUTES | |implied| implied | set | set | + | **standard rights** | | | | | | +16 | DELETE | | | | | | +17 | READ_CONTROL |implied|implied| implied | set | set+implied | +18 | WRITE_DAC | | | | | | +19 | WRITE_OWNER | | | | | | +20 | SYNCHRONIZE |implied|implied| implied | set | set+implied | + +The implied flags can be specified explicitly with the constants +`FILE_GENERIC_READ` and `FILE_GENERIC_WRITE`. + ## Creation modes @@ -140,7 +168,7 @@ Open an existing file. Fails if the file does not exist. ### Create `.create(true)` -Open an existing file, or create a new file if it already exists. +Open an existing file, or create a new file if it does not already exists. Even if the access mode is read-only, it seems all operating systems can still create a new file (for whatever use reading from an empty file may be). @@ -155,6 +183,9 @@ Truncating will not work in read-only or append mode. Some platforms may support this, but Rust does not allow this for cross-platform consistency (besides it being sane behaviour). +On Windows truncating will only work if the `GENERIC_WRITE` flag is set. Setting +the equivalent individual flags is not enough. 
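
The truncate rule above can be sketched with `std::fs::OpenOptions`. This is only an illustration: the helper name `try_truncate` is made up here, and the sketch assumes that the standard library rejects `.truncate(true)` without write access, which is what this text proposes and what recent versions of `std` do.

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::Path;

/// Hypothetical helper: try to open `path` with `.truncate(true)` and return
/// the file length seen through the new handle.
///
/// With `writable == false` the file is opened read-only, so the open call is
/// expected to fail instead of truncating (assumption: std enforces the rule).
fn try_truncate(path: &Path, writable: bool) -> io::Result<u64> {
    let file = OpenOptions::new()
        .read(!writable)
        .write(writable)
        .truncate(true)
        .open(path)?;
    // Only reached when the open succeeded; a truncated file has length 0.
    Ok(file.metadata()?.len())
}
```

Calling `try_truncate(path, false)` on an existing file should return an error and leave the contents untouched, while `try_truncate(path, true)` should report a length of zero.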
+ ### Create and truncate `.create(true).truncate(true)` @@ -235,11 +266,11 @@ types of advisory locking, POSIX and BSD-style. Advisory means any process that does not use locking itself can happily ignore the locking af another process. As if that is not bad enough, they both have [problems](http://0pointer.de/blog/projects/locking.html) that make them close -to unusable for modern multi-treaded programs. Linux may in some very rare cases -support mandatory file locking, but it is just as broken as advisory. +to unusable for modern multi-threaded programs. Linux may in some very rare +cases support mandatory file locking, but it is just as broken as advisory. For Rust, the sharing mode can be set with a Windows-specific option. Given the -problems above, i don't expect there to ever be a cross-platform option for file +problems above, I don't expect there to ever be a cross-platform option for file locking. ### Windows-specific: Share mode @@ -248,7 +279,7 @@ locking. It is possible to set the individual share permissions with `.share_mode()`. The current philosophy of this function is that others should have no rights, -unless explicitely granted. I think a better fit for Rust would be to give all +unless explicitly granted. I think a better fit for Rust would be to give all others all rights, unless explicitly denied, e.g.: `.share_mode(DENY_READ | DENY_WRITE | DENY_DELETE)`. @@ -257,8 +288,8 @@ others all rights, unless explicitly denied, e.g.: ### Read cache hint -Instead of requesting only the data neccesary for a single `read()` call from a -storage device, an operating system may request more data than nessesary to have +Instead of requesting only the data necessary for a single `read()` call from a +storage device, an operating system may request more data than necessary to have it already available for the next read call (e.g. the read-ahead cache). If you read the file sequentially this is beneficial, for completely random access it can become a penalty. 
Operating systems generally have good heuristics, but you @@ -272,7 +303,7 @@ Do some real-world benchmarks before setting this option. .cache_hint(enum CacheHint) enum CacheHint { - Normal, + None, Sequential, Random, } @@ -300,7 +331,7 @@ is important to ensure critical data reaches the storage device in case of a system crash or power outage, but comes with a large performance penalty. All modern operating systems also support a mode where each call to `write()` -will not return until the data is written to the storage device, dus removing +will not return until the data is written to the storage device, thus removing step 2 for _all_ writes. This can be a useful options for writing critical data, where you would call `sync_data()` after each write. This saves a system call for each write, and you are sure to never forget it. @@ -320,7 +351,7 @@ reading if a file is opened with a read-write access mode. #### Sync data `.sync_data(true)` -Some systems support only syncing the data written, but can wait with updating +Some systems support syncing only the data written, but can wait with updating less critical metadata such as the last modified timestamp. If the metadata is not critical (and it rarely is), you should always use `sync_data()` as an easy performance win. @@ -340,12 +371,12 @@ usually 512 or 4096 bytes. Also the kernel can keep data recently read or written in cache, to speed up future file operations. Some operating systems allow you to completely bypass the copy of data to or -from kernel space. This is generally a bad idea. Applications will have figure -out and handle alignment restrictions themself, and implement manual caching. It -is mostly useful for database applications that may have more knowledge obout -their optimal caching behaviour than the os. And it can have a use when reading -many gigabytes of data (like a backup process), which may destroy the os cache -for other processes. +from kernel space. This is generally a bad idea. 
Applications will have to +figure out and handle alignment restrictions themselves, and implement manual +caching. It is mostly useful for database applications that may have more +knowledge about their optimal caching behaviour than the os. And it can have a +use when reading many gigabytes of data (like a backup process), which may +destroy the os cache for other processes. This is available on Windows with the flag `FILE_FLAG_NO_BUFFERING`, and on Linux and some variants of BSD with `O_DIRECT`. Making correct use of this mode @@ -393,12 +424,12 @@ if the os is NetBSD or Solaris. ### Custom flags `.custom_flags()` -Windows and the various flavors of Unix support flags that are not +Windows and the various flavours of Unix support flags that are not cross-platform, but that can be useful in some circumstances. On Unix they will be passed as the variable _flags_ to `open`, on Windows as the _dwFlagsAndAttributes_ parameter. -The cross-platform options of Rust can do magic: they can set any flag neccesary +The cross-platform options of Rust can do magic: they can set any flag necessary to ensure it works as expected. For example, `.append(true)` on Unix not only sets the flag `O_APPEND`, but also automatically `O_WRONLY` or `O_RDWR`. This special treatment is not available for the custom flags. @@ -468,6 +499,10 @@ HANDLE hTemplateFile; - Current: when `.append(true)` is set, it is not possible to modify file attributes on Windows, but it is possible to change the file mode on Unix. New: allow file attributes to be modified on Windows in append-mode. +- Current: `.read()` and `.write()` set individual bit flags instead of generic + flags. New: Set generic flags, as recommend by Microsoft. e.g. `GENERIC_WRITE` + instead of `FILE_GENERIC_WRITE` and `GENERIC_READ` instead of + `FILE_GENERIC_READ`. Currently truncate is broken on Windows, this fixes it. - Current: when no access mode is set, this falls back to opening the file read-only on Unix. 
New: open with `O_RDONLY | O_PATH` on Linux, and fail with `E_INVALID` on all @@ -484,7 +519,7 @@ HANDLE hTemplateFile; - Split the Windows-specific `.flags_and_attributes()` into `.flags()` and `.attributes()`. This is a form of future-proofing, as the new Windows 8 `Createfile2` also splits these attributes. This has the advantage of a clear - seperation between file attributes, that are somewhat similar to Unix mode + separation between file attributes, that are somewhat similar to Unix mode bits, and the custom flags that modify the behaviour of the current file handle. From 135ba7d7b4809ce0656ddce668c9c1f3f26b20a8 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Fri, 14 Aug 2015 09:55:19 -0700 Subject: [PATCH 0494/1195] Internal references are legal. Automatic crazy boolean bit-packing is crazy. --- text/0000-simd-infrastructure.md | 6 ------ 1 file changed, 6 deletions(-) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index f1ad633420d..b3474958311 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -87,12 +87,6 @@ would layer type-safety on top (i.e. generic `repr(simd)` types would use some `unsafe` trait as a bound that is designed to only be implemented by types that will work). -It is illegal to take an internal reference to the fields of a -`repr(simd)` type, because the representation of booleans may require -modification, so that booleans are bit-packed. The official external -library providing SIMD support will have private fields so this will -not be generally observable. - Adding `repr(simd)` to a type may increase its minimum/preferred alignment, based on platform behaviour. (E.g. x86 wants its 128-bit SSE vectors to be 128-bit aligned.) From 67fea6e98fecb0a05adb419ddc0cf504e5e0ba04 Mon Sep 17 00:00:00 2001 From: Huon Wilson Date: Fri, 14 Aug 2015 10:09:45 -0700 Subject: [PATCH 0495/1195] Type-level integer/values alternatives for shuffles. 
--- text/0000-simd-infrastructure.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/text/0000-simd-infrastructure.md b/text/0000-simd-infrastructure.md index b3474958311..4c8974fdfce 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/0000-simd-infrastructure.md @@ -390,6 +390,17 @@ cfg_if_else! { compiler can know this. The `repr(simd)` approach means there may be more than one SIMD-vector type with the `Simd8` shape (or, in fact, there may be zero). +- With type-level integers, there could be one shuffle intrinsic: + + fn simd_shuffle(x: T, y: T, idx: [u32; N]) -> U; + + NB. It is possible to add this as an additional intrinsic (possibly + deprecating the `simd_shuffleNNN` forms) later. +- Type-level values can be applied more generally: since the shuffle + indices have to be compile time constants, the shuffle could be + + fn simd_shuffle(x: T, y: T) -> U; + - Instead of platform detection, there could be feature detection (e.g. "platform supports something equivalent to x86's `DPPS`"), but there probably aren't enough cross-platform commonalities for this From f088784d2f035aa5e3604f59a12c793760dc1e6c Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 14 Aug 2015 15:31:53 -0400 Subject: [PATCH 0496/1195] add an edit history to the patched RFc --- text/0550-macro-future-proofing.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/text/0550-macro-future-proofing.md b/text/0550-macro-future-proofing.md index 42f0608d8de..6cfdec75fa3 100644 --- a/text/0550-macro-future-proofing.md +++ b/text/0550-macro-future-proofing.md @@ -139,3 +139,8 @@ reasonable freedom and can be extended in the future. same issue would come up. 3. Do nothing. This is very dangerous, and has the potential to essentially freeze Rust's syntax for fear of accidentally breaking a macro. + +# Edit History + +- Updated by https://github.com/rust-lang/rfcs/pull/1209, which added + semicolons into the follow set for types. 
From 8290b61b8a7d7215d36977c50f6a315f710215de Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 14 Aug 2015 13:11:22 -0700 Subject: [PATCH 0497/1195] RFC 1200 is cargo install --- text/{0000-cargo-install.md => 1200-cargo-install.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-cargo-install.md => 1200-cargo-install.md} (99%) diff --git a/text/0000-cargo-install.md b/text/1200-cargo-install.md similarity index 99% rename from text/0000-cargo-install.md rename to text/1200-cargo-install.md index 3580320af93..f51d8e2a4fd 100644 --- a/text/0000-cargo-install.md +++ b/text/1200-cargo-install.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2015-07-10 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1200](https://github.com/rust-lang/rfcs/pull/1200) +- Rust Issue: N/A # Summary From eecfd7828180814a20453ef91c792c25d9a77f24 Mon Sep 17 00:00:00 2001 From: Ulrik Sverdrup Date: Mon, 10 Aug 2015 23:53:17 +0200 Subject: [PATCH 0498/1195] RFC: `.drain(range)` and `.drain()` --- text/0000-drain-range-2.md | 209 +++++++++++++++++++++++++++++++++++++ 1 file changed, 209 insertions(+) create mode 100644 text/0000-drain-range-2.md diff --git a/text/0000-drain-range-2.md b/text/0000-drain-range-2.md new file mode 100644 index 00000000000..459b4ad8eac --- /dev/null +++ b/text/0000-drain-range-2.md @@ -0,0 +1,209 @@ +- Feature Name: drain-range +- Start Date: 2015-08-14 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Implement `.drain(range)` and `.drain()` respectively as appropriate on collections. + +# Motivation + +The `drain` methods and their draining iterators serve to mass remove elements +from a collection, receiving them by value in an iterator, while the collection +keeps its allocation intact (if applicable). 
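
As a rough illustration of the paragraph above, the sketch below uses `Vec::drain` as it exists in the standard library today. The helper name `drain_middle` is invented for this example; it reports the capacity before and after the call so the "keeps its allocation" claim can be observed.

```rust
/// Hypothetical helper: remove `v[start..end]`, returning the removed
/// elements plus the vector's capacity before and after the drain.
fn drain_middle(v: &mut Vec<i32>, start: usize, end: usize) -> (Vec<i32>, usize, usize) {
    let cap_before = v.capacity();
    // `drain` hands the removed elements over by value; the remaining
    // elements are shifted down inside the existing buffer.
    let removed: Vec<i32> = v.drain(start..end).collect();
    (removed, cap_before, v.capacity())
}
```

For a vector `[1, 2, 3, 4, 5]`, `drain_middle(&mut v, 1, 3)` should hand back `[2, 3]`, leave `[1, 4, 5]` in place, and report the same capacity twice.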
+ +The range parameterized variants of drain are a generalization of `drain`, to +affect just a subrange of the collection, for example removing just an index range +from a vector. + +`drain` thus serves both to consume all or some elements from a collection without +consuming the collection itself. The ranged `drain` allows bulk removal of +elements, more efficiently than any other safe API. + +# Detailed design + +- Implement `.drain(a..b)` where `a` and `b` are indices, for all + collections that are sequences. +- Implement `.drain()` for other collections. This is just like `.drain(..)` would be + (drain the whole collection). +- Ranged drain accepts all range types, currently .., a.., ..b, a..b, + and drain will accept inclusive end ranges ("closed ranges") if they are implemented. +- Drain removes every element in the range. +- Drain returns an iterator that produces the removed items by value. +- Drain removes the whole range, regardless of whether you iterate the draining iterator + or not. +- Drain preserves the collection's capacity where it is possible. + +## Collections + +`Vec` and `String` already have ranged drain, so they are complete. + +`HashMap` and `HashSet` already have `.drain()`, so they are complete; +their elements have no meaningful order. + +`BinaryHeap` already has `.drain()`, and just like its other iterators, +it promises no particular order. So this collection is already complete. + +The following collections need updated implementations: + +`VecDeque` should implement `.drain(range)` for index ranges, just like `Vec` +does. + +`LinkedList` should implement `.drain(range)` for index ranges. Just +like the other sequences, this is an `O(n)` operation, and `LinkedList` already +has other indexed methods (`.split_off()`). + +## `BTreeMap` and `BTreeSet` + +`BTreeMap` already has a ranged iterator, `.range(a, b)`, and `drain` for +`BTreeMap` and `BTreeSet` should have arguments completely consistent with the range +method.
This will be addressed separately. + +## `IntoCheckedRange` trait + +The existing trait `collections::range::RangeArgument` will be replaced by +`IntoCheckedRange`, and will be used for `drain` methods that use a range +parameter. + +`IntoCheckedRange` is designed to allow bounds checking half-open and closed +ranges. Bounds checking before conversion allows handling otherwise tricky +extreme values correctly. It is an `unsafe trait` so that bounds checking can +be trusted. Below is a sketched-out implementation. + +```rust +/// Convert `Self` into a half open `usize` range that slices +/// a sequence indexed from 0 to `len`. +/// Return `Err` with a faulty index if out of bounds. +/// +/// Unsafe because: Implementation is trusted to bounds check correctly. +pub unsafe trait IntoCheckedRange { + fn into_checked_range(self, len: usize) -> Result<Range<usize>, usize>; +} + +unsafe impl IntoCheckedRange for RangeFull { + #[inline] + fn into_checked_range(self, len: usize) -> Result<Range<usize>, usize> { + Ok(0..len) + } +} + +unsafe impl IntoCheckedRange for RangeFrom<usize> { + #[inline] + fn into_checked_range(self, len: usize) -> Result<Range<usize>, usize> { + if self.start <= len { + Ok(self.start..len) + } else { Err(self.start) } + } +} + +unsafe impl IntoCheckedRange for RangeTo<usize> { + #[inline] + fn into_checked_range(self, len: usize) -> Result<Range<usize>, usize> { + if self.end <= len { + Ok(0..self.end) + } else { Err(self.end) } + } +} + +unsafe impl IntoCheckedRange for Range<usize> { + #[inline] + fn into_checked_range(self, len: usize) -> Result<Range<usize>, usize> { + if self.start <= self.end && self.end <= len { + Ok(self.start..self.end) + } else { Err(cmp::max(self.start, self.end)) } + } +} + +// For illustration, this is what a closed range impl would look like +pub struct ClosedRangeSketch<T> { + pub start: T, + pub end: T, +} + +unsafe impl IntoCheckedRange for ClosedRangeSketch<usize> { + fn into_checked_range(self, len: usize) -> Result<Range<usize>, usize> { + if self.start <= self.end && self.end < len { + Ok(self.start..self.end + 1) +
} else { Err(cmp::max(self.start, self.end)) } + } +} +``` + +Example use of `IntoCheckedRange`: + +```rust +pub fn drain<R>(&mut self, range: R) -> Drain + where R: IntoCheckedRange +{ + let remove_range = match range.into_checked_range(self.len()) { + Err(i) => panic!("drain: Index {} is out of bounds", i), + Ok(r) => r, + }; + /* impl omitted */ +``` + +## Stabilization + +The following can be stabilized as they are: + +- `HashMap::drain` +- `HashSet::drain` +- `BinaryHeap::drain` + +The following can be stabilized, but their argument's trait is not stable: + +- `Vec::drain` +- `String::drain` + +The following will be heading towards stabilization after changes: + +- `VecDeque::drain` + +The `IntoCheckedRange` trait will not be stabilized until we have closed ranges. + +# Drawbacks + +- Collections disagree on if they are drained with a range (`Vec`) or not (`HashMap`) +- No trait for the drain method. + +# Alternatives + +- Use a trait for the drain method and let all collections implement it. This + will force all collections to use a single parameter (a range) for the drain + method. + +- Provide `.splice(range, iterator)` for `Vec` instead of `.drain(range)`: + + ```rust + fn splice<R, I>(&mut self, range: R, iter: I) -> Splice + where R: IntoCheckedRange, I: IntoIterator + ``` + + if the method `.splice()` would both return an iterator of the replaced elements, + and consume an iterator (of arbitrary length) to replace the removed range, then + it includes drain's tasks. + +- RFC #574 proposed accepting either a single index (single key for maps) + or a range for ranged drain, so an alternative would be to do that. The + single index case is however out of place, and writing a range that spans + a single index is easy. + +- Use the name `.remove_range(a..b)` instead of `.drain(a..b)`. Since the method + has two simultaneous roles, removing a range and yielding a range as an iterator, + either role could guide the name.
+ This alternative name was not very popular with the rust developers I asked + (but they are already used to what `drain` means in rust context). + +- Provide `.drain()` without arguments and separate range drain into a separate + method name, implemented in addition to `drain` where applicable. + +- Do not support closed ranges in `drain`. + +- `BinaryHeap::drain` could drain the heap in sorted order. The primary proposal + is arbitrary order, to match preexisting `BinaryHeap` iterators. + +# Unresolved questions + +- Concrete shape of the `BTreeMap` API is not resolved here +- Will closed ranges be used for the `drain` API? From dd2feed6d2b7ee085287b7c9b400f2181f0ba5b8 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 14 Aug 2015 16:48:31 -0400 Subject: [PATCH 0499/1195] Merge RFC #1211 (MIR) --- text/{0000-mir.md => 1211-mir.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-mir.md => 1211-mir.md} (99%) diff --git a/text/0000-mir.md b/text/1211-mir.md similarity index 99% rename from text/0000-mir.md rename to text/1211-mir.md index 20bdbbf0843..547b9b7b27b 100644 --- a/text/0000-mir.md +++ b/text/1211-mir.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: (fill me in with today's date, YYYY-MM-DD) -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1211](https://github.com/rust-lang/rfcs/pull/1211) +- Rust Issue: [rust-lang/rust#27840](https://github.com/rust-lang/rust/issues/27840) # Summary From 85ee6c6e83b15ca5a132a289c9d473b5a1663fb5 Mon Sep 17 00:00:00 2001 From: Paul Dicker Date: Sun, 16 Aug 2015 14:11:34 +0200 Subject: [PATCH 0500/1195] Describe `.create(true).truncate(true)` on Windows And a few small nits --- 0000-open-options.md | 94 ++++++++++++++++++++++++++++++-------------- 1 file changed, 64 insertions(+), 30 deletions(-) diff --git a/0000-open-options.md b/0000-open-options.md index a0dfeaa1b83..b7474e40a18 100644 --- a/0000-open-options.md +++ b/0000-open-options.md @@ 
-29,6 +29,7 @@ This RFC attempts to ### Read-only Open a file for read-only. + ### Write-only Open a file for write-only. @@ -40,9 +41,11 @@ file.write(b"bbbb") // contents of file after: "bbbbaaaa" ``` + ### Read-write This is the simple combinations of read-only and write-only. + ### Append-mode Append-mode is similar to write-only, but all writes always happen at the end of the file. This mode is especially useful if multiple processes or threads write @@ -67,6 +70,7 @@ on Unix append is a superset of write. Because of this append is treated as a separate access mode in Rust, and if `.append(true)` is specified than `.write()` is ignored. + ### Read-append Writing to the file works exactly the same as in append-mode. @@ -100,7 +104,7 @@ file with `O_RDONLY | O_PATH`. Since version 3.6 you can call `fstat` on a file descriptor opened this way. For what it's worth -[Gnu Hurd](http://www.gnu.org/software/libc/manual/html_node/Access-Modes.html) +[GNU/Hurd](http://www.gnu.org/software/libc/manual/html_node/Access-Modes.html) allows opening files without an access mode, because it defines `O_RDONLY = 1` and `O_WRONLY = 2`. It allows all operations on the file that do not involve reading or writing the data, like `chmod`. @@ -110,8 +114,9 @@ opening the file with `E_INVALID`. Otherwise, if for example you are developing on OS X but forget to set `.read(true)` when opening a file, it would work on OS X but not on other systems. + ### Windows-specific -`.desired_access(FILE_READ_DATA)` +`.access_mode(FILE_READ_DATA)` On Windows you can detail whether you want to have read and/or write access to the files data, attributes and/or extended attributes. Managing permissions in @@ -121,7 +126,13 @@ In Rust, `.read(true)` gives you read access to the data, attributes and extended attributes. Similarly, `.write(true)` gives write access to those three, and the right to append data beyond the current end of the file. 
-But if you want fine-grained control, with `desired_access` you have it. +But if you want fine-grained control, with `access_mode` you have it. + +`.access_mode()` overrides the access mode set with Rusts cross-platform +options. Reasons to do so: +- it is not possible to un-set the flags set by Rusts options; +- otherwise the cross-platform options have to be wrapped with `#[cfg(unix)]`, + instead of only having to wrap the Windows-specific option. As a reference, this are the flags set by Rusts access modes: @@ -162,9 +173,11 @@ creation mode | file exists | file does not exist | Unix .create(true).truncate(true) | truncate | create | O_CREAT + O_TRUNC | CREATE_ALWAYS | .create_new(true) | fail | create | O_CREAT + O_EXCL | CREATE_NEW + FILE_FLAG_OPEN_REPARSE_POINT | + ### Not set Open an existing file. Fails if the file does not exist. + ### Create `.create(true)` @@ -173,23 +186,43 @@ Open an existing file, or create a new file if it does not already exists. Even if the access mode is read-only, it seems all operating systems can still create a new file (for whatever use reading from an empty file may be). + ### Truncate `.truncate(true)` Open an existing file, and truncate it to zero length. Fails if the file does not exist. Attributes and permissions of the truncated file are preserved. -Truncating will not work in read-only or append mode. Some platforms may support -this, but Rust does not allow this for cross-platform consistency (besides it -being sane behaviour). +Truncating will not work if the access mode is not write-only or read-write +(e.g. read-only and/or append mode). Some platforms may support this, but Rust +does not allow it for cross-platform consistency (besides it being sane +behaviour). On Windows truncating will only work if the `GENERIC_WRITE` flag is set. Setting the equivalent individual flags is not enough. + ### Create and truncate `.create(true).truncate(true)` -The logical combination of create and truncate. 
+Open an existing file and truncate it to zero length, or create a new file if it +does not already exists. + +Like `.create(true)`, even if the access mode is read-only it seems all +operating systems can still create a new file. + +Contrary to only `.truncate(true)`, with `.create(true).truncate(true)` Windows +_can_ truncate an existing file without requiring the `GENERIC_WRITE` flag. But +for cross-platform consistency Rust should not allow this if the access mode is +not write-only or read-write. +TODO: What to do if the access mode is set with the Windows-specific `.access_mode()`? + +On Windows the attributes of an existing file can cause `.open()` to fail. If +the existing file has the attribute _hidden_ set, it is necessary to open with +`FILE_ATTRIBUTE_HIDDEN`. Similarly if the existing file has the attribute +_system_ set, it is necessary to open with `FILE_ATTRIBUTE_SYSTEM`. See +the Windows-specific `.attributes()` below on how to set these. + ### Create_new `.create_new(true)` @@ -228,18 +261,20 @@ specified, `.mode()` is ignored. Rust currently does not expose a way to modify the umask. + ### Windows-specific: Attributes `.attributes(FILE_ATTRIBUTE_READONLY | FILE_ATTRIBUTE_HIDDEN | FILE_ATTRIBUTE_SYSTEM)` Files on Windows can have several attributes, most commonly one or more of the following four: readonly, hidden, system and archive. Most [others](https://msdn.microsoft.com/en-us/library/windows/desktop/gg258117%28v=vs.85%29.aspx) -are properties set by the file system. It seems of the others only -`FILE_ATTRIBUTE_ENCRYPTED` and `FILE_ATTRIBUTE_TEMPORARY` are meaningful to set -when creating a new file. +are properties set by the file system. Of the others only +`FILE_ATTRIBUTE_ENCRYPTED`, `FILE_ATTRIBUTE_TEMPORARY` and +`FILE_ATTRIBUTE_OFFLINE` can be set when creating a new file. All others are +silently ignored. -It is no use to set the archive flag, as Windows sets it automatically when the -file is newly created or modified. 
This flag may then be used by backup +It is no use to set the archive attribute, as Windows sets it automatically when +the file is newly created or modified. This flag may then be used by backup applications as an indication of which files have changed. If a _new_ file is created because it does not yet exist and `.create(true)` or @@ -247,7 +282,8 @@ If a _new_ file is created because it does not yet exist and `.create(true)` or with `.attributes()`. If an _existing_ file is opened with `.create(true).truncate(true)`, its -attributes are also replaced by the ones specified with `.attributes()`. +existing attributes are preserved and combined with the ones declared with +`.attributes()`. In all other cases the attributes get ignored. @@ -273,6 +309,7 @@ For Rust, the sharing mode can be set with a Windows-specific option. Given the problems above, I don't expect there to ever be a cross-platform option for file locking. + ### Windows-specific: Share mode `.share_mode(FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE)` @@ -287,7 +324,6 @@ others all rights, unless explicitly denied, e.g.: ## Caching behaviour ### Read cache hint - Instead of requesting only the data necessary for a single `read()` call from a storage device, an operating system may request more data than necessary to have it already available for the next read call (e.g. the read-ahead cache). If you @@ -298,12 +334,13 @@ file. Do some real-world benchmarks before setting this option. + #### Cache hint ``` .cache_hint(enum CacheHint) enum CacheHint { - None, + None, Sequential, Random, } @@ -316,8 +353,8 @@ On Windows this maps to the flags `FILE_FLAG_SEQUENTIAL_SCAN` and This option is ignored on operating systems that do not support caching hints. -### Write cache +### Write cache See [Ensuring data reaches disk](https://lwn.net/Articles/457667/) 1. copy data to kernel space @@ -336,6 +373,7 @@ step 2 for _all_ writes. 
This can be a useful option for writing critical data, where you would call `sync_data()` after each write. This saves a system call for each write, and you are sure to never forget it. + #### Sync all `.sync_all(true)`: implement an open option with the same name as the free standing call. @@ -348,6 +386,7 @@ OS X does not support `O_SYNC`, but it is possible to call fcntl with not end up in the read cache, so this can have a performance penalty when reading if a file is opened with a read-write access mode. + #### Sync data `.sync_data(true)` @@ -393,7 +432,6 @@ Out of scope. ## Other options ### Inheritance of file descriptors - Leaking file descriptors to child processes can cause problems and can be a security vulnerability. See this report by [Python](https://www.python.org/dev/peps/pep-0446/). @@ -437,7 +475,8 @@ special treatment is not available for the custom flags. Custom flags can only set flags, not remove flags set by Rust's options. For the custom flags on Unix, the bits that define the access mode are masked -out with `O_ACCMODE`, to ensure they do not interfere with the access mode set by Rust's options. +out with `O_ACCMODE`, to ensure they do not interfere with the access mode set +by Rust's options.
| [Windows](https://msdn.microsoft.com/en-us/library/windows/desktop/hh449426%28v=vs.85%29.aspx): |:--------------------------- @@ -455,7 +494,7 @@ out with `O_ACCMODE`, to ensure they do not interfere with the access mode set b Unix: -| Posix | Linux | OS X | FreeBSD | OpenBSD | NetBSD |Dragonfly BSD| Solaris | +| POSIX | Linux | OS X | FreeBSD | OpenBSD | NetBSD |Dragonfly BSD| Solaris | |:------------|:------------|:------------|:------------|:------------|:------------|:------------|:------------| | O_DIRECTORY | O_DIRECTORY | | O_DIRECTORY | O_DIRECTORY | O_DIRECTORY | O_DIRECTORY | O_DIRECTORY | | O_NOCTTY | O_NOCTTY | | O_NOCTTY | | O_NOCTTY | | O_NOCTTY | @@ -477,11 +516,10 @@ Unix: | | | | | | O_ALT_IO | | | | | | | | | | | O_NOLINKS | | | | | | | | | O_XATTR | -| [Posix](http://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html) | [Linux](http://man7.org/linux/man-pages/man2/open.2.html) | [OS X](https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man2/open.2.html) | [FreeBSD](https://www.freebsd.org/cgi/man.cgi?query=open&sektion=2) | [OpenBSD](http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man2/open.2?query=open&sec=2) | [NetBSD](http://netbsd.gw.com/cgi-bin/man-cgi?open+2+NetBSD-current) | [Dragonfly BSD](http://leaf.dragonflybsd.org/cgi/web-man?command=open&section=2) | [Solaris](http://docs.oracle.com/cd/E23824_01/html/821-1463/open-2.html) | +| [POSIX](http://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html) | [Linux](http://man7.org/linux/man-pages/man2/open.2.html) | [OS X](https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man2/open.2.html) | [FreeBSD](https://www.freebsd.org/cgi/man.cgi?query=open&sektion=2) | [OpenBSD](http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man2/open.2?query=open&sec=2) | [NetBSD](http://netbsd.gw.com/cgi-bin/man-cgi?open+2+NetBSD-current) | [Dragonfly BSD](http://leaf.dragonflybsd.org/cgi/web-man?command=open&section=2) |
[Solaris](http://docs.oracle.com/cd/E23824_01/html/821-1463/open-2.html) | ### Windows-specific flags and attributes - The following variables for CreateFile2 currently have no equivalent functions in Rust to set them: ``` @@ -507,7 +545,7 @@ HANDLE hTemplateFile; read-only on Unix. New: open with `O_RDONLY | O_PATH` on Linux, and fail with `E_INVALID` on all other Unix variants. -- Maybe rename the Windows-specific `.desired_access()` to `.access_mode()`? +- Rename the Windows-specific `.desired_access()` to `.access_mode()` ### Creation mode - Do not allow `.truncate(true)` if the access mode is read-only and/or append. @@ -528,7 +566,7 @@ HANDLE hTemplateFile; and possibly deny permissions. ### Caching behaviour -- Implement `.cache_hint()`. Should this use an enum? +- Implement `.cache_hint()`. - Implement `.sync_all()` and `.sync_data()`. ### Other options @@ -537,9 +575,7 @@ HANDLE hTemplateFile; system. - # Drawbacks - This adds a thin layer on top of the raw operating system calls. In this [pull request](https://github.com/rust-lang/rust/pull/26772#issuecomment-126753342) the conclusion was: this seems like a good idea for a "high level" abstraction @@ -560,14 +596,12 @@ Also this RFC is in line with the vision for IO in the # Alternatives - Keep the status quo. -# Unresolved questions -Implementation and testing of `.nofollow()`, `.sync_all()` and `.sync_data()` -could uncover some corner cases, but I don't expect any that would give great -trouble. +# Unresolved questions +Implementation and testing of `.sync_all()` and `.sync_data()` could uncover +some corner cases, but I don't expect any that would give great trouble. Should `.cache_hint()` take an enum? 
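The "create and truncate" and "create_new" creation modes described in the RFC text above can be sketched with `std::fs::OpenOptions` as it exists in today's standard library. This is only a minimal illustration, not part of the patch: the helper names `create_and_truncate` and `create_new_error` and the temporary file name are made up for the example.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, ErrorKind, Write};
use std::path::Path;

/// Writes some old contents, then reopens the file with
/// `.create(true).truncate(true)` and reports its length right after opening.
fn create_and_truncate(path: &Path) -> io::Result<u64> {
    fs::write(path, b"old contents")?;

    // Write access is requested together with create + truncate.
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;

    // The existing file was truncated when it was opened.
    let len_after_open = file.metadata()?.len();
    file.write_all(b"new")?;
    Ok(len_after_open)
}

/// Tries `.create_new(true)` on `path` and returns the error kind, if any.
fn create_new_error(path: &Path) -> Option<ErrorKind> {
    OpenOptions::new()
        .write(true)
        .create_new(true)
        .open(path)
        .err()
        .map(|e| e.kind())
}

fn main() -> io::Result<()> {
    // Hypothetical file name, chosen only for this sketch.
    let path = std::env::temp_dir().join("open_options_sketch.txt");

    let len = create_and_truncate(&path)?;
    println!("length right after open: {}", len);

    // The file exists at this point, so `create_new` is expected to fail.
    println!("create_new on existing file: {:?}", create_new_error(&path));

    fs::remove_file(&path)
}
```

Because `create_new` refuses to open a file that already exists, it is the mode to reach for when two processes might race to create the same path.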
From d4e6c55e6554ffdc924d87d6e046362133731f0f Mon Sep 17 00:00:00 2001 From: Paul Dicker Date: Mon, 17 Aug 2015 13:34:33 +0200 Subject: [PATCH 0501/1195] Put combination of access modes and creation modes in a seperate section --- 0000-open-options.md | 68 ++++++++++++++++++++++++++++++-------------- 1 file changed, 46 insertions(+), 22 deletions(-) diff --git a/0000-open-options.md b/0000-open-options.md index b7474e40a18..0e021b42cfa 100644 --- a/0000-open-options.md +++ b/0000-open-options.md @@ -167,14 +167,14 @@ The implied flags can be specified explicitly with the constants creation mode | file exists | file does not exist | Unix | Windows | :----------------------------|-------------|---------------------|:------------------|:------------------------------------------| -(not set) | open | fail | | OPEN_EXISTING | +not set (open existing) | open | fail | | OPEN_EXISTING | .create(true) | open | create | O_CREAT | OPEN_ALWAYS | .truncate(true) | truncate | fail | O_TRUNC | TRUNCATE_EXISTING | .create(true).truncate(true) | truncate | create | O_CREAT + O_TRUNC | CREATE_ALWAYS | .create_new(true) | fail | create | O_CREAT + O_EXCL | CREATE_NEW + FILE_FLAG_OPEN_REPARSE_POINT | -### Not set +### Not set (open existing) Open an existing file. Fails if the file does not exist. @@ -183,9 +183,6 @@ Open an existing file. Fails if the file does not exist. Open an existing file, or create a new file if it does not already exists. -Even if the access mode is read-only, it seems all operating systems can still -create a new file (for whatever use reading from an empty file may be). - ### Truncate `.truncate(true)` @@ -193,13 +190,9 @@ create a new file (for whatever use reading from an empty file may be). Open an existing file, and truncate it to zero length. Fails if the file does not exist. Attributes and permissions of the truncated file are preserved. -Truncating will not work if the access mode is not write-only or read-write -(e.g. read-only and/or append mode). 
Some platforms may support this, but Rust -does not allow it for cross-platform consistency (besides it being sane -behaviour). - -On Windows truncating will only work if the `GENERIC_WRITE` flag is set. Setting -the equivalent individual flags is not enough. +Note when using the Windows-specific `.access_mode()`: truncating will only work +if the `GENERIC_WRITE` flag is set. Setting the equivalent individual flags is +not enough. ### Create and truncate @@ -208,14 +201,9 @@ the equivalent individual flags is not enough. Open an existing file and truncate it to zero length, or create a new file if it does not already exists. -Like `.create(true)`, even if the access mode is read-only it seems all -operating systems can still create a new file. - -Contrary to only `.truncate(true)`, with `.create(true).truncate(true)` Windows -_can_ truncate an existing file without requiring the `GENERIC_WRITE` flag. But -for cross-platform consistency Rust should not allow this if the access mode is -not write-only or read-write. -TODO: What to do if the access mode is set with the Windows-specific `.access_mode()`? +Note when using the Windows-specific `.access_mode()`: Contrary to only +`.truncate(true)`, with `.create(true).truncate(true)` Windows _can_ truncate an +existing file without requiring any flags to be set. On Windows the attributes of an existing file can cause `.open()` to fail. If the existing file has the attribute _hidden_ set, it is necessary to open with @@ -288,6 +276,42 @@ existing attributes are preserved and combined with the ones declared with In all other cases the attributes get ignored. +### Combination of access modes and creation modes + +Some combinations of creation modes and access modes do not make sense. + +For example: `.create(true)` when opening read-only. If the file does not +already exist, it is created and you start reading from an empty file. 
And it is +questionable whether you have permission to create a new file if you don't have +write access. A new file is created on all systems I have tested, but there is +no documentation that explicitly guarantees this behaviour. + +The same is true for `.truncate(true)` with read and/or append mode. Should an +existing file be modified if you don't have write permission? On Unix it is +undefined +(see [some](http://www.monkey.org/openbsd/archive/tech/0009/msg00299.html) +[comments](http://www.monkey.org/openbsd/archive/tech/0009/msg00304.html) on the +OpenBSD mailing list). The behaviour on Windows is inconsistent and depends on +whether `.create(true)` is set. + +To give guarantees about cross-platform (and sane) behaviour, Rust should allow +only the following combinations of access modes and creations modes: + +creation mode | read | write | read-write | append | read-append | +:-----------------------|:-----:|:-----:|:----------:|:------:|:-----------:| +not set (open existing) | X | X | X | X | X | +create | | X | X | X | X | +truncate | | X | X | | | +create and truncate | | X | X | | | +create_new | | X | X | X | X | + +It is possible to bypass these restrictions by using system-specific options (as +in this case you already have to take care of cross-platform support yourself). +On Unix this is done by setting the creation mode using `.custom_flags()` with +`O_CREAT`, `O_TRUNC` and/or `O_EXCL`. On Windows this can be done by manually +specifying `.access_mode()` (see above). + + ## Sharing / locking On Unix it is possible for multiple processes to read and write to the same file at the same time. @@ -554,8 +578,8 @@ HANDLE hTemplateFile; - Implement `.create_new()`. - Remove the Windows-specific `.creation_disposition()`. It has no use, because all its options can be set in a cross-platform way. -- Split the Windows-specific `.flags_and_attributes()` into `.flags()` and - `.attributes()`. 
This is a form of future-proofing, as the new Windows 8 +- Split the Windows-specific `.flags_and_attributes()` into `.custom_flags()` + and `.attributes()`. This is a form of future-proofing, as the new Windows 8 `Createfile2` also splits these attributes. This has the advantage of a clear separation between file attributes, that are somewhat similar to Unix mode bits, and the custom flags that modify the behaviour of the current file From d41ee16dd86ce3bd8ef7c5df594dc22bc883a96b Mon Sep 17 00:00:00 2001 From: Tobias Bucher Date: Wed, 19 Aug 2015 17:17:20 +0200 Subject: [PATCH 0502/1195] RFC: Allow a re-export for `main` --- text/0000-main-reexport.md | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) create mode 100644 text/0000-main-reexport.md diff --git a/text/0000-main-reexport.md b/text/0000-main-reexport.md new file mode 100644 index 00000000000..64cba1c7ead --- /dev/null +++ b/text/0000-main-reexport.md @@ -0,0 +1,34 @@ +- Feature Name: main_reexport +- Start Date: 2015-08-19 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Allow a re-export of a function as entry point `main`. + +# Motivation + +Functions and re-exports of functions usually behave the same way, but they do +not for the program entry point `main`. This RFC aims to fix this inconsistency. + +The above mentioned inconsistency means that e.g. you currently cannot use a +library's exported function as your main function. + +# Detailed design + +Use the symbol `main` at the top-level of a crate that is compiled as a program +(`--crate-type=bin`) – instead of explicitly only accepting directly-defined +functions, also allow re-exports. + +# Drawbacks + +None. + +# Alternatives + +None. + +# Unresolved questions + +None. 
From e78f3e9cebc1497ea80b09dc20cb10d34248bdc3 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 19 Aug 2015 17:06:17 -0700 Subject: [PATCH 0503/1195] Put catch_panic in a new std::panic module instead --- text/0000-stabilize-catch-panic.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/text/0000-stabilize-catch-panic.md b/text/0000-stabilize-catch-panic.md index 95a229e4d4d..7be3ff68a76 100644 --- a/text/0000-stabilize-catch-panic.md +++ b/text/0000-stabilize-catch-panic.md @@ -5,8 +5,8 @@ # Summary -Stabilize `std::thread::catch_panic` after removing the `Send` and `'static` -bounds from the closure parameter. +Move `std::thread::catch_panic` to `std::panic::catch` after removing the `Send` +and `'static` bounds from the closure parameter. # Motivation @@ -178,12 +178,13 @@ this RFC. # Detailed design -At its heart, the change this RFC is proposing is to stabilize -`std::thread::catch_panic` after removing the `Send` and `'static` bounds from -the closure parameter, modifying the signature to be: +At its heart, the change this RFC is proposing is to move +`std::thread::catch_panic` to a new `std::panic` module and rename the function +to `catch`. 
Additionally, the `Send` and `'static` bounds from the closure +parameter will be removed, modifying the signature to be: ```rust -fn catch_panic<F: FnOnce() -> R, R>(f: F) -> thread::Result<R> +fn catch<F: FnOnce() -> R, R>(f: F) -> thread::Result<R> ``` More generally, however, this RFC also claims that this stable function does From 969912aaf0bde8cf8de886b03a816b0618992696 Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Thu, 20 Aug 2015 11:35:07 -0700 Subject: [PATCH 0504/1195] add unresolved question --- text/0982-dst-coercion.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/text/0982-dst-coercion.md b/text/0982-dst-coercion.md index 92c26884981..98d32fadeb5 100644 --- a/text/0982-dst-coercion.md +++ b/text/0982-dst-coercion.md @@ -175,7 +175,11 @@ indicate the field type which is coerced, for example). # Unresolved questions -None +It is unclear to what extent DST coercions should support multiple fields that +refer to the same type parameter. `PhantomData<T>` should definitely be +supported as an "extra" field that's skipped, but can all zero-sized fields +be skipped? Are there cases where this would enable by-passing the abstractions +that make some API safe? # Updates since being accepted From 2ac3284d209a9d6e000915f8af3035e525a04db7 Mon Sep 17 00:00:00 2001 From: Tobias Bucher Date: Fri, 21 Aug 2015 12:08:43 +0200 Subject: [PATCH 0505/1195] Add some examples and refer to the issue that led to the RFC --- text/0000-main-reexport.md | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/text/0000-main-reexport.md b/text/0000-main-reexport.md index 64cba1c7ead..0f1007cc5bf 100644 --- a/text/0000-main-reexport.md +++ b/text/0000-main-reexport.md @@ -15,6 +15,23 @@ not for the program entry point `main`. This RFC aims to fix this inconsistency. The above mentioned inconsistency means that e.g. you currently cannot use a library's exported function as your main function.
+Example: + + pub mod foo { + pub fn bar() { + println!("Hello world!"); + } + } + pub use foo::bar as main; + +Example 2: + + extern crate main_functions; + pub use main_functions::rmdir as main; + +See also https://github.com/rust-lang/rust/issues/27640 for the corresponding +issue discussion. + # Detailed design Use the symbol `main` at the top-level of a crate that is compiled as a program From c5b08af92aee5386970a77c708a59c7306583b2e Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 26 Aug 2015 17:03:00 -0700 Subject: [PATCH 0506/1195] RFC 1212 is tweaking lines()/lines_any() --- ...0-line-endings.md => 1212-line-endings.md} | 72 +++++++++---------- 1 file changed, 36 insertions(+), 36 deletions(-) rename text/{0000-line-endings.md => 1212-line-endings.md} (80%) diff --git a/text/0000-line-endings.md b/text/1212-line-endings.md similarity index 80% rename from text/0000-line-endings.md rename to text/1212-line-endings.md index 440f07ea401..aaf327b0607 100644 --- a/text/0000-line-endings.md +++ b/text/1212-line-endings.md @@ -1,68 +1,68 @@ -- Feature Name: line_endings +- Feature Name: `line_endings` - Start Date: 2015-07-10 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1212](https://github.com/rust-lang/rfcs/pull/1212) +- Rust Issue: [rust-lang/rust#28032](https://github.com/rust-lang/rust/issues/28032) # Summary -Change all functions dealing with reading "lines" to treat both '\n' and '\r\n' +Change all functions dealing with reading "lines" to treat both '\n' and '\r\n' as a valid line-ending. # Motivation -The current behavior of these functions is to treat only '\n' as line-ending. -This is surprising for programmers experienced in other languages. Many -languages open files in a "text-mode" per default, which means when they iterate -over the lines, they don't have to worry about the two kinds of line-endings. 
-Such programmers will be surprised to learn that they have to take care of such -details themselves in Rust. Some may not even have heard of the distinction +The current behavior of these functions is to treat only '\n' as line-ending. +This is surprising for programmers experienced in other languages. Many +languages open files in a "text-mode" per default, which means when they iterate +over the lines, they don't have to worry about the two kinds of line-endings. +Such programmers will be surprised to learn that they have to take care of such +details themselves in Rust. Some may not even have heard of the distinction between two styles of line-endings. -The current design also violates the "do what I mean" principle. Both '\r\n' and -'\n' are widely used as line-separators. By talking about the concept of -"lines", it is clear that the current file (or buffer, really) is considered to -be in text format. It is thus very reasonable to expect "lines" to apply to both +The current design also violates the "do what I mean" principle. Both '\r\n' and +'\n' are widely used as line-separators. By talking about the concept of +"lines", it is clear that the current file (or buffer, really) is considered to +be in text format. It is thus very reasonable to expect "lines" to apply to both kinds of encoding lines in binary format. -In particular, if the crate is developed on Linux or Mac, the programmer will -probably have most of his input encoded with only '\n' for the line-endings. He -may use the functions talking about "lines", and they will work all right. It is -only when someone runs this crate on input that contains '\r\n' that the bug +In particular, if the crate is developed on Linux or Mac, the programmer will +probably have most of his input encoded with only '\n' for the line-endings. He +may use the functions talking about "lines", and they will work all right. 
It is +only when someone runs this crate on input that contains '\r\n' that the bug will be uncovered. The editor has personally run into this issue when reading line-by-line from stdin, with the program suddenly failing on Windows. # Detailed design -The following functions will have to be changed: `BufRead::lines` and -`str::lines`. They both should treat '\r\n' as marking the end of a line. This -can be implemented, for example, by first splitting at '\n' like now and then +The following functions will have to be changed: `BufRead::lines` and +`str::lines`. They both should treat '\r\n' as marking the end of a line. This +can be implemented, for example, by first splitting at '\n' like now and then removing a trailing '\r' right before returning data to the caller. -Furthermore, `str::lines_any` (the only function currently dealing with both -kinds of line-endings) is deprecated, as it is then functionally equivalent with +Furthermore, `str::lines_any` (the only function currently dealing with both +kinds of line-endings) is deprecated, as it is then functionally equivalent with `str::lines`. # Drawbacks -This is a semantics-breaking change, changing the behavior of released, stable -API. However, as argued above, the new behavior is much less surprising than the -old one - so one could consider this fixing a bug in the original -implementation. There are alternatives available for the case that one really -wants to split at '\n' only, namely `BufRead::split` and `str::split`. However, -`BufRead::split` does not iterate over `String`, but rather over `Vec<u8>`, so +This is a semantics-breaking change, changing the behavior of released, stable +API. However, as argued above, the new behavior is much less surprising than the +old one - so one could consider this fixing a bug in the original +implementation. There are alternatives available for the case that one really +wants to split at '\n' only, namely `BufRead::split` and `str::split`.
However, +`BufRead::split` does not iterate over `String`, but rather over `Vec<u8>`, so users have to insert an additional explicit call to `String::from_utf8`. # Alternatives -There's the obvious alternative of not doing anything. This leaves a gap in the -features Rust provides to deal with text files, making it hard to treat both +There's the obvious alternative of not doing anything. This leaves a gap in the +features Rust provides to deal with text files, making it hard to treat both kinds of line-endings uniformly. -The second alternative is to add `BufRead::lines_any` which works similar to -`str::lines_any` in that it deals with both '\n' and '\r\n'. This provides all -the necessary functionality, but it still leaves people with the need to choose -one of the two functions - and potentially choosing the wrong one. In -particular, the functions with the shorter, nicer name (the existing ones) will +The second alternative is to add `BufRead::lines_any` which works similar to +`str::lines_any` in that it deals with both '\n' and '\r\n'. This provides all +the necessary functionality, but it still leaves people with the need to choose +one of the two functions - and potentially choosing the wrong one. In +particular, the functions with the shorter, nicer name (the existing ones) will almost always *not* be the right choice.
# Unresolved questions From 9ebc36978bc6bba6ab52021989be384d418170ab Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 26 Aug 2015 17:25:38 -0700 Subject: [PATCH 0507/1195] Rename to recover, re-add 'static --- text/0000-stabilize-catch-panic.md | 47 ++++++++++++++---------------- 1 file changed, 22 insertions(+), 25 deletions(-) diff --git a/text/0000-stabilize-catch-panic.md b/text/0000-stabilize-catch-panic.md index 7be3ff68a76..63614f9d047 100644 --- a/text/0000-stabilize-catch-panic.md +++ b/text/0000-stabilize-catch-panic.md @@ -5,8 +5,8 @@ # Summary -Move `std::thread::catch_panic` to `std::panic::catch` after removing the `Send` -and `'static` bounds from the closure parameter. +Move `std::thread::catch_panic` to `std::panic::recover` after removing the +`Send` bound from the closure parameter. # Motivation @@ -180,11 +180,11 @@ this RFC. At its heart, the change this RFC is proposing is to move `std::thread::catch_panic` to a new `std::panic` module and rename the function -to `catch`. Additionally, the `Send` and `'static` bounds from the closure -parameter will be removed, modifying the signature to be: +to `catch`. Additionally, the `Send` bound from the closure parameter will be +removed (`'static` will stay), modifying the signature to be: ```rust -fn catch<F: FnOnce() -> R, R>(f: F) -> thread::Result<R> +fn recover<F: FnOnce() -> R + 'static, R>(f: F) -> thread::Result<R> ``` More generally, however, this RFC also claims that this stable function does @@ -197,10 +197,9 @@ exist in the form of panics. What this RFC is adding, however, is a construct via which to catch these exceptions within a thread, bringing the standard library closer to the exception support in other languages. -Catching a panic (and especially not having `'static` on the bounds list) makes -it easier to observe broken invariants of data structures shared across the -`catch_panic` boundary, which can possibly increase the likelihood of exception -safety issues arising.
+Catching a panic makes it easier to observe broken invariants of data structures +shared across the `catch_panic` boundary, which can possibly increase the +likelihood of exception safety issues arising. The risk of this step is that catching panics becomes an idiomatic way to deal with error-handling, thereby making exception safety much more of a headache @@ -277,26 +276,24 @@ roughly analogous to an opaque "an unexpected error has occurred" message. Stabilizing `catch_panic` does little to change the tradeoffs around `Result` and `panic` that led to these conventions. -## Why remove the bounds? +## Why remove `Send`? -There are a few reasons to remove the `'static` and `Send` bounds on the -`catch_panic` function: +One of the primary use cases of `recover` is in an FFI context, where lots +of `*mut` and `*const` pointers are flying around. These two types aren't +`Send` by default, so having their values cross the `catch_panic` boundary +would be highly un-ergonomic (albeit still possible). As a result, this RFC +proposes removing the `Send` bound from the function. -* One of the primary use cases of `catch_panic` is in an FFI context, where lots - of `*mut` and `*const` pointers are flying around. These two types aren't - `Send` by default, so having their values cross the `catch_panic` boundary - would be highly un-ergonomic (albeit still possible). As a result, this RFC - proposes removing the `Send` bound from the function. +## Why keep `'static`? -* A reason to remove the `'static` bound is that it doesn't provide rock-solid - exception-safety mitigation. Using thread-local storage it's possible to - share mutable data across a call to `catch_panic` even if that data isn't - `'static`. +This RFC proposes leaving the `'static` bound on the closure parameter for now. 
+There isn't a clearly strong case (such as for `Send`) to remove this parameter +just yet, and it helps mitigate exception safety issues related to shared +references across the `recover` boundary. -* Borrowed data, in particular, is helpful for thread pools that need - to execute closures with borrowed data within them; essentially, the worker - threads are executing multiple "semantic threads" over their lifetime, and the - `catch_panic` boundary represents the end of these "semantic threads". +There is conversely also not a clearly strong case for *keeping* this bound, but +as it's the more conservative route (and backwards compatible to remove) it will +remain for now. # Drawbacks From 0893030d0e4c1213984641cf5f5358d32effea1d Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 26 Aug 2015 17:36:00 -0700 Subject: [PATCH 0508/1195] RFC 1242 is rust-lang crates --- text/{0000-rust-lang-crates.md => 1242-rust-lang-crates.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-rust-lang-crates.md => 1242-rust-lang-crates.md} (99%) diff --git a/text/0000-rust-lang-crates.md b/text/1242-rust-lang-crates.md similarity index 99% rename from text/0000-rust-lang-crates.md rename to text/1242-rust-lang-crates.md index dcda7c41a84..dc24add8ffe 100644 --- a/text/0000-rust-lang-crates.md +++ b/text/1242-rust-lang-crates.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2015-07-29 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1242](https://github.com/rust-lang/rfcs/pull/1242) +- Rust Issue: N/A # Summary From 5442de0c3ec6c663a05bcdf3d4d75bcdcc0a0422 Mon Sep 17 00:00:00 2001 From: Andrew Paseltiner Date: Thu, 27 Aug 2015 10:01:18 -0400 Subject: [PATCH 0509/1195] update for libs team changes --- text/0000-collection-recovery.md | 69 ++------------------------------ 1 file changed, 3 insertions(+), 66 deletions(-) diff --git a/text/0000-collection-recovery.md b/text/0000-collection-recovery.md index 
ea2846291f4..1fa126a32f2 100644 --- a/text/0000-collection-recovery.md +++ b/text/0000-collection-recovery.md @@ -5,8 +5,7 @@ # Summary -Add element-recovery methods to the set types in `std`. Add key-recovery methods to the map types -in `std` in order to facilitate this. +Add element-recovery methods to the set types in `std`. # Motivation @@ -87,10 +86,10 @@ Add the following element-recovery methods to `std::collections::{BTreeSet, Hash ```rust impl Set { // Like `contains`, but returns a reference to the element if the set contains it. - fn element(&self, element: &Q) -> Option<&T>; + fn get(&self, element: &Q) -> Option<&T>; // Like `remove`, but returns the element if the set contained it. - fn remove_element(&mut self, element: &Q) -> Option; + fn take(&mut self, element: &Q) -> Option; // Like `insert`, but replaces the element with the given one and returns the previous element // if the set contained it. @@ -98,72 +97,10 @@ impl Set { } ``` -In order to implement the above methods, add the following key-recovery methods to -`std::collections::{BTreeMap, HashMap}`: - -```rust -impl Map { - // Like `get`, but additionally returns a reference to the entry's key. - fn key_value(&self, key: &Q) -> Option<(&K, &V)>; - - // Like `get_mut`, but additionally returns a reference to the entry's key. - fn key_value_mut(&mut self, key: &Q) -> Option<(&K, &mut V)>; - - // Like `remove`, but additionally returns the entry's key. - fn remove_key_value(&mut self, key: &Q) -> Option<(K, V)>; - - // Like `insert`, but additionally replaces the key with the given one and returns the previous - // key and value if the map contained it. - fn replace(&mut self, key: K, value: V) -> Option<(K, V)>; -} -``` - -Add the following key-recovery methods to `std::collections::{btree_map, hash_map}::OccupiedEntry`: - -```rust -impl<'a, K, V> OccupiedEntry<'a, K, V> { - // Like `get`, but additionally returns a reference to the entry's key. 
- fn key_value(&self) -> (&K, &V); - - // Like `get_mut`, but additionally returns a reference to the entry's key. - fn key_value_mut(&mut self) -> (&K, &mut V); - - // Like `into_mut`, but additionally returns a reference to the entry's key. - fn into_key_value_mut(self) -> (&'a K, &'a mut V); - - // Like `remove`, but additionally returns the entry's key. - fn remove_key_value(self) -> (K, V); -} -``` - -Add the following key-recovery methods to `std::collections::{btree_map, hash_map}::VacantEntry`: - -```rust -impl<'a, K, V> VacantEntry<'a, K, V> { - /// Returns a reference to the entry's key. - fn key(&self) -> &K; - - // Like `insert`, but additionally returns a reference to the entry's key. - fn insert_key_value(self, value: V) -> (&'a K, &'a mut V); - - // Returns the entry's key without inserting it into the map. - fn into_key(self) -> K; -} -``` - # Drawbacks This complicates the collection APIs. -The distinction between `insert` and `replace` may be confusing. It would be more consistent to -call `Set::replace` `Set::insert_element` and `Map::replace` `Map::insert_key_value`, but -`BTreeMap` and `HashMap` do not replace equivalent keys in their `insert` methods, so rather than -have `insert` and `insert_key_value` behave differently in that respect, `replace` is used instead. - # Alternatives Do nothing. - -# Unresolved questions - -Are these the best method names? 
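The element-recovery methods that remain after this revision (`get`, `take` and `replace`) can be sketched against `std::collections::HashSet` as found in today's standard library. This is an illustrative sketch only: the `Tagged` type, whose equality and hash deliberately ignore the `label` field, and the `demo` function are hypothetical and do not appear in the RFC.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical element type: two values are "equal" when their keys match,
/// even if their labels differ, so recovering the stored element matters.
#[derive(Debug)]
struct Tagged {
    key: u32,
    label: &'static str,
}

impl PartialEq for Tagged {
    fn eq(&self, other: &Self) -> bool {
        self.key == other.key
    }
}

impl Eq for Tagged {}

impl Hash for Tagged {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash only the key, consistent with `eq`.
        self.key.hash(state);
    }
}

/// Returns the labels seen by `get`, `replace` and `take`, plus whether the
/// set is empty afterwards.
fn demo() -> (Option<&'static str>, Option<&'static str>, Option<&'static str>, bool) {
    let mut set = HashSet::new();
    set.insert(Tagged { key: 1, label: "first" });

    let probe = Tagged { key: 1, label: "probe" };

    // `get` hands back the element stored in the set, not the probe.
    let got = set.get(&probe).map(|t| t.label);

    // `replace` swaps in the new element and returns the previous one.
    let replaced = set.replace(Tagged { key: 1, label: "second" }).map(|t| t.label);

    // `take` removes the stored element and returns it.
    let taken = set.take(&probe).map(|t| t.label);

    (got, replaced, taken, set.is_empty())
}

fn main() {
    println!("{:?}", demo());
}
```

With plain `contains`, `insert` and `remove` the caller would only learn whether an equal element was present, never which one the set was actually holding.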
From b8648e413c2c43c85e02aa0f1ff64f56663d026e Mon Sep 17 00:00:00 2001 From: Andrew Paseltiner Date: Thu, 27 Aug 2015 10:18:34 -0400 Subject: [PATCH 0510/1195] rename RFC --- text/{0000-collection-recovery.md => 0000-set-recovery.md} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename text/{0000-collection-recovery.md => 0000-set-recovery.md} (98%) diff --git a/text/0000-collection-recovery.md b/text/0000-set-recovery.md similarity index 98% rename from text/0000-collection-recovery.md rename to text/0000-set-recovery.md index 1fa126a32f2..fac97bdbeee 100644 --- a/text/0000-collection-recovery.md +++ b/text/0000-set-recovery.md @@ -1,4 +1,4 @@ -- Feature Name: collection_recovery +- Feature Name: set_recovery - Start Date: 2015-07-08 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) From 50057bcba17bb33844527f1ee1ceb7c9ad9d4181 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 27 Aug 2015 13:47:36 -0700 Subject: [PATCH 0511/1195] RFC 1194 is set recovery methods --- text/{0000-set-recovery.md => 1194-set-recovery.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-set-recovery.md => 1194-set-recovery.md} (93%) diff --git a/text/0000-set-recovery.md b/text/1194-set-recovery.md similarity index 93% rename from text/0000-set-recovery.md rename to text/1194-set-recovery.md index fac97bdbeee..8a2e0a7e1ca 100644 --- a/text/0000-set-recovery.md +++ b/text/1194-set-recovery.md @@ -1,7 +1,7 @@ -- Feature Name: set_recovery +- Feature Name: `set_recovery` - Start Date: 2015-07-08 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1194](https://github.com/rust-lang/rfcs/pull/1194) +- Rust Issue: [rust-lang/rust#28050](https://github.com/rust-lang/rust/issues/28050) # Summary From cafe0d843603e9593416ac325b11fb3759c232f9 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 31 Aug 2015 08:44:14 -0700 Subject: [PATCH 0512/1195] RFC 1236 is std::panic::recover --- 
...ilize-catch-panic.md => 1236-stabilize-catch-panic.md} | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) rename text/{0000-stabilize-catch-panic.md => 1236-stabilize-catch-panic.md} (98%) diff --git a/text/0000-stabilize-catch-panic.md b/text/1236-stabilize-catch-panic.md similarity index 98% rename from text/0000-stabilize-catch-panic.md rename to text/1236-stabilize-catch-panic.md index 63614f9d047..6559a80b1a3 100644 --- a/text/0000-stabilize-catch-panic.md +++ b/text/1236-stabilize-catch-panic.md @@ -1,7 +1,7 @@ -- Feature Name: `catch_panic` +- Feature Name: `recover` - Start Date: 2015-07-24 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1236](https://github.com/rust-lang/rfcs/pull/1236) +- Rust Issue: [rust-lang/rust#27719](https://github.com/rust-lang/rust/issues/27719) # Summary @@ -180,7 +180,7 @@ this RFC. At its heart, the change this RFC is proposing is to move `std::thread::catch_panic` to a new `std::panic` module and rename the function -to `catch`. Additionally, the `Send` bound from the closure parameter will be +to `recover`. Additionally, the `Send` bound from the closure parameter will be removed (`'static` will stay), modifying the signature to be: ```rust From 4741ef8082271f7dc093ff5bb38fc5009be0223f Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Tue, 1 Sep 2015 13:14:51 +1200 Subject: [PATCH 0513/1195] Minor changes --- README.md | 10 ++++++---- compiler_changes.md | 7 ++++++- libs_changes.md | 11 +++++------ 3 files changed, 17 insertions(+), 11 deletions(-) diff --git a/README.md b/README.md index b49d4c1b0a2..232550e8a4b 100644 --- a/README.md +++ b/README.md @@ -199,7 +199,9 @@ and the RFC seems to be in a steady state, the shepherd and/or sub-team leader will announce an FCP. 
In general, the idea here is to "front-load" as much of the feedback as possible before the point where we actually reach a decision - by the end of the FCP, the decision on whether or not to accept the RFC should -be obvious from the RFC discussion thread. +usually be obvious from the RFC discussion thread. On occasion, there may not be +consensus but discussion has stalled. In this case, the relevant team will make +a decision. ## The RFC life-cycle @@ -248,9 +250,9 @@ posted back to the RFC pull request. A sub-team makes final decisions about RFCs after the benefits and drawbacks are well understood. These decisions can be made at any time, but the sub-team will regularly issue decisions. When a decision is made, the RFC PR will either be -merged or closed, in either case with a comment describing the rationale for the -decision. The comment should largely be a summary of discussion already on the -comment thread. +merged or closed. In either case, if the reasoning is not clear from the +discussion in the thread, the sub-team will add a comment describing the rationale +for the decision. ## Implementing an RFC diff --git a/compiler_changes.md b/compiler_changes.md index 45990e6a37b..75137743041 100644 --- a/compiler_changes.md +++ b/compiler_changes.md @@ -26,7 +26,12 @@ submitted later if there is scope for large changes to the language RFC. non-trivial ways * Adding, removing, or changing a stable compiler flag * The implementation of new language features where there is significant change - or addition to the compiler + or addition to the compiler. There is obviously some room for interpretation + about what constitutes a "significant" change and how much detail the + implementation RFC needs. For guidance, [associated items](text/0195-associated-items.md) + and [UFCS](text/0132-ufcs.md) would clearly need an implementation RFC, while + [type ascription](text/0803-type-ascription.md) and + [lifetime elision](text/0141-lifetime-elision.md) would not. 
* Any other change which causes backwards incompatible changes to stable behaviour of the compiler, language, or libraries diff --git a/libs_changes.md b/libs_changes.md index eb18ed5d271..31f1de0210d 100644 --- a/libs_changes.md +++ b/libs_changes.md @@ -27,8 +27,7 @@ at this point in time can still mean major community breakage regardless of trains, however. * HOWEVER: Big PRs can be a lot of work to make only to have that work rejected for - details that could have been hashed out first. *This is the motivation for - having RFCs*. + details that could have been hashed out first. * RFCs are *only* meaningful if a significant and diverse portion of the community actively participates in them. The official teams are not @@ -95,10 +94,10 @@ current implementation. * Once something has been merged as unstable, a shepherd should be assigned to promote and obtain feedback on the design. -* Once the API has been unstable for at least one full cycle (6 weeks), - the shepherd (or any library sub-team member) may nominate an API for a - *final comment period* of another cycle. Feedback and other comments should be - posted to the tracking issue. This should be publicized. +* Every time a release cycle ends, the libs team assesses the current unstable + APIs and selects some number of them for potential stabilization during the + next cycle. These are announced for FCP at the beginning of the cycle, and + (possibly) stabilized just before the beta is cut. 
* After the final comment period, an API should ideally take one of two paths: * **Stabilize** if the change is desired, and consensus is reached From edcb3f0c913a340d581c4450c9d314579955f37d Mon Sep 17 00:00:00 2001 From: llogiq Date: Wed, 3 Jun 2015 15:56:51 +0200 Subject: [PATCH 0514/1195] new RFC: deprecation --- text/0000-deprecation.md | 42 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) create mode 100644 text/0000-deprecation.md diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md new file mode 100644 index 00000000000..ace7e7e7d24 --- /dev/null +++ b/text/0000-deprecation.md @@ -0,0 +1,42 @@ +- Feature Name: A plan for deprecating APIs within Rust +- Start Date: 2015-06-03 +- RFC PR: +- Rust Issue: + +# Summary + +There has been an ongoing [discussion on internals](https://internals.rust-lang.org/t/thoughts-on-aggressive-deprecation-in-libstd/2176/55) about how we are going to evolve the standard library. This RFC tries to condense the consensus + +# Motivation + +We want to guide the deprecation efforts to allow std to evolve freely to get the best possible API while ensuring minimum-breakage backwards compatibility for users and allow std authors to remove API items for a given version of Rust. Basically have our cake and eat it. Yum, cake. + +Of course we cannot really keep and remove a feature at the same time. To square this circle, we can follow the process outlined herein. + +# Detailed design + +We already declare deprecation in terms of Rust versions (like "1.0", "1.2"). The current attribute looks like `#[deprecated(since = "1.0.0", reason="foo")]`. This should be extended to add an optional `removed_at` key, to state that the item should be made inaccessible at that version. Note that while this allows for marking items as deprecated, there is absolutely no provision to actually *remove* items. 
In fact this proposal bans removing an API type outright, unless security concerns are deemed more important than the resulting breakage from removing it or the API item has some fault that means it cannot be used correctly at all (thus leaving the API in place would result in the same level of breakage than removing it). + +Currently every rustc version implements only its own version, having multiple versions is possible using something like multirust, though this does not work within a build. Also currently rustc versions do not guarantee interoperability. This RFC aims to change this situation. + +First, crates should state their target version using a `#![version = "1.0.0"]` attribute. Cargo should insert the current rust version by default on `cargo new` and *warn* if no version is defined on all other commands. It may optionally *note* that the specified target version is outdated on `cargo package`. [crates.io](https://crates.io) may deny packages that do not declare a version to give the target version requirement more weight to library authors. Cargo should also be able to hold back a new library version if its declared target version is newer than the rust version installed on the system. In those cases, cargo should emit a warning urging the user to upgrade their rust installation. + +`rustc` should use this target version definition to check for deprecated items. If no target version is defined, deprecation checking is deactivated (as we cannot assume a specific rust version), however a warning stating the same should be issued (as with cargo – we should probably make cargo not warn on build to get rid of duplicate warnings). Otherwise, use of API items whose `since` attribute is less or equal to the target version of the crate should trigger a warning, while API items whose `removed_at` attribute is less or equal to the target version should trigger an error. + +`rustdoc` should mark deprecated APIs as such (e.g. 
make them in a lighter gray font) and relegate removed APIs to a section below all others (and that may be hidden via a checkbox). We should not completely remove the documentation, as users of libraries that target old versions may still have a use for them, but neither should we let them clutter the docs. + +# Drawbacks + +By requiring full backwards-compatibility, we will never be able to actually remove stuff from the APIs, which will probably lead to some bloat. + +# Alternatives + +* Follow a more agressive strategy that actually removes stuff from the API. This would make it easier for the libstd creators at some cost for library and application writers, as they are required to keep up to date or face breakage +* Hide deprecated items in the docs: This could be done either by putting them into a linked extra page or by adding a "show deprecated" checkbox that may be default be checked or not, depending on who you ask. This will however confuse people, who see the deprecated APIs in some code, but cannot find them in the docs anymore +* Allow to distinguish "soft" and "hard" deprecation, so that an API can be marked as "soft" deprecated to dissuade new uses before hard deprecation is decided. Allowing people to specify deprecation in future version appears to have much of the same benefits without needing a new attribute key. +* Decide deprecation on a per-case basis. This is what we do now. The proposal just adds a well-defined process to it +* Never deprecate anything. Evolve the API by adding stuff only. Rust would be crushed by the weight of its own cruft before 2.0 even has a chance to land. Users will be uncertain which APIs to use + +# Unresolved questions + +Should we allow library writers to use the same features for deprecating their API items? 
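Editorial illustration (not part of the patch): the warning/error rule that the deprecation RFC text above proposes for `rustc` can be sketched as follows. This is only a minimal sketch under stated assumptions. The names `Version`, `Deprecation`, `Diagnostic` and `check_use` are hypothetical, versions are simplified to numeric `major.minor[.patch]` strings, and `removed_at` is the key proposed by the RFC rather than an existing attribute key.

```rust
/// Simplified version triple; the derived ordering compares major, then
/// minor, then patch. (Hypothetical helper type, not a rustc or cargo API.)
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version(u32, u32, u32);

impl Version {
    /// Parses "1.2" or "1.2.0"; returns `None` for anything else.
    fn parse(s: &str) -> Option<Version> {
        let mut parts = s.split('.').map(|p| p.parse::<u32>().ok());
        let major = parts.next()??;
        let minor = parts.next()??;
        // A missing patch component is treated as 0 (an assumption).
        let patch = match parts.next() {
            Some(p) => p?,
            None => 0,
        };
        if parts.next().is_some() {
            return None;
        }
        Some(Version(major, minor, patch))
    }
}

/// Metadata attached to an API item, mirroring the `since` key and the
/// proposed optional `removed_at` key.
struct Deprecation {
    since: Version,
    removed_at: Option<Version>,
}

/// What the compiler would report for one use of the item.
#[derive(Debug, PartialEq, Eq)]
enum Diagnostic {
    None,
    /// No target version declared: checking is off, but the RFC asks for a warning.
    MissingTargetWarning,
    DeprecatedWarning,
    RemovedError,
}

/// Applies the rule from the RFC text: `removed_at <= target` is an error,
/// otherwise `since <= target` is a warning.
fn check_use(target: Option<Version>, dep: &Deprecation) -> Diagnostic {
    let target = match target {
        Some(t) => t,
        None => return Diagnostic::MissingTargetWarning,
    };
    if let Some(removed) = dep.removed_at {
        if removed <= target {
            return Diagnostic::RemovedError;
        }
    }
    if dep.since <= target {
        Diagnostic::DeprecatedWarning
    } else {
        Diagnostic::None
    }
}

fn main() {
    let dep = Deprecation {
        since: Version(1, 1, 0),
        removed_at: Some(Version(1, 4, 0)),
    };
    // Show the diagnostic for a few crate target versions.
    for target in ["1.0.0", "1.2.0", "1.4.0"] {
        println!("{} -> {:?}", target, check_use(Version::parse(target), &dep));
    }
}
```

Checking `removed_at` before `since` matters because a removed item is normally also deprecated, and the stronger diagnostic should win.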
From 282ed615b673dae01bb5dbedd5c9c8daaae6bb30 Mon Sep 17 00:00:00 2001 From: llogiq Date: Wed, 3 Jun 2015 16:25:21 +0200 Subject: [PATCH 0515/1195] word wrapped, thanks to steveklabnik --- text/0000-deprecation.md | 93 ++++++++++++++++++++++++++++++++-------- 1 file changed, 75 insertions(+), 18 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index ace7e7e7d24..7a48f1d055a 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -5,38 +5,95 @@ # Summary -There has been an ongoing [discussion on internals](https://internals.rust-lang.org/t/thoughts-on-aggressive-deprecation-in-libstd/2176/55) about how we are going to evolve the standard library. This RFC tries to condense the consensus +There has been an ongoing [discussion on internals](https://internals.rust-lang.org/t/thoughts-on-aggressive-deprecation-in-libstd/2176/55) about how we are going to evolve the standard library. This RFC tries to condense the consensus. # Motivation We want to guide the deprecation efforts to allow std to evolve freely to get the best possible API while ensuring minimum-breakage backwards compatibility for users and allow std authors to remove API items for a given version of Rust. Basically have our cake and eat it. Yum, cake. -Of course we cannot really keep and remove a feature at the same time. To square this circle, we can follow the process outlined herein. +Of course we cannot really keep and remove a feature at the same time. +To square this circle, we can follow the process outlined herein. # Detailed design -We already declare deprecation in terms of Rust versions (like "1.0", "1.2"). The current attribute looks like `#[deprecated(since = "1.0.0", reason="foo")]`. This should be extended to add an optional `removed_at` key, to state that the item should be made inaccessible at that version. Note that while this allows for marking items as deprecated, there is absolutely no provision to actually *remove* items. 
In fact this proposal bans removing an API type outright, unless security concerns are deemed more important than the resulting breakage from removing it or the API item has some fault that means it cannot be used correctly at all (thus leaving the API in place would result in the same level of breakage than removing it). - -Currently every rustc version implements only its own version, having multiple versions is possible using something like multirust, though this does not work within a build. Also currently rustc versions do not guarantee interoperability. This RFC aims to change this situation. - -First, crates should state their target version using a `#![version = "1.0.0"]` attribute. Cargo should insert the current rust version by default on `cargo new` and *warn* if no version is defined on all other commands. It may optionally *note* that the specified target version is outdated on `cargo package`. [crates.io](https://crates.io) may deny packages that do not declare a version to give the target version requirement more weight to library authors. Cargo should also be able to hold back a new library version if its declared target version is newer than the rust version installed on the system. In those cases, cargo should emit a warning urging the user to upgrade their rust installation. - -`rustc` should use this target version definition to check for deprecated items. If no target version is defined, deprecation checking is deactivated (as we cannot assume a specific rust version), however a warning stating the same should be issued (as with cargo – we should probably make cargo not warn on build to get rid of duplicate warnings). Otherwise, use of API items whose `since` attribute is less or equal to the target version of the crate should trigger a warning, while API items whose `removed_at` attribute is less or equal to the target version should trigger an error. - -`rustdoc` should mark deprecated APIs as such (e.g. 
make them in a lighter gray font) and relegate removed APIs to a section below all others (and that may be hidden via a checkbox). We should not completely remove the documentation, as users of libraries that target old versions may still have a use for them, but neither should we let them clutter the docs. +We already declare deprecation in terms of Rust versions (like "1.0", +"1.2"). The current attribute looks like `#[deprecated(since = "1.0.0", +reason="foo")]`. This should be extended to add an optional +`removed_at` key, to state that the item should be made inaccessible at +that version. Note that while this allows for marking items as +deprecated, there is purposely no provision to actually *remove* items. +In fact this proposal bans removing an API type outright, unless +security concerns are deemed more important than the resulting breakage +from removing it or the API item has some fault that means it cannot be +used correctly at all (thus leaving the API in place would result in +the same level of breakage than removing it). + +Currently every rustc version implements only its own version, having +multiple versions is possible using something like multirust, though +this does not work within a build. Also currently rustc versions do not +guarantee interoperability. This RFC aims to change this situation. + +First, crates should state their target version using a `#![version = +"1.0.0"]` attribute. Cargo should insert the current rust version by +default on `cargo new` and *warn* if no version is defined on all other +commands. It may optionally *note* that the specified target version is +outdated on `cargo package`. [crates.io](https://crates.io) may deny +packages that do not declare a version to give the target version +requirement more weight to library authors. Cargo should also be able +to hold back a new library version if its declared target version is +newer than the rust version installed on the system. 
In those cases, +cargo should emit a warning urging the user to upgrade their rust +installation. + +`rustc` should use this target version definition to check for +deprecated items. If no target version is defined, deprecation checking +is deactivated (as we cannot assume a specific rust version), however a +warning stating the same should be issued (as with cargo – we should +probably make cargo not warn on build to get rid of duplicate +warnings). Otherwise, use of API items whose `since` attribute is less +or equal to the target version of the crate should trigger a warning, +while API items whose `removed_at` attribute is less or equal to the +target version should trigger an error. + +`rustdoc` should mark deprecated APIs as such (e.g. make them in a +lighter gray font) and relegate removed APIs to a section below all +others (and that may be hidden via a checkbox). We should not +completely remove the documentation, as users of libraries that target +old versions may still have a use for them, but neither should we let +them clutter the docs. # Drawbacks -By requiring full backwards-compatibility, we will never be able to actually remove stuff from the APIs, which will probably lead to some bloat. +By requiring full backwards-compatibility, we will never be able to +actually remove stuff from the APIs, which will probably lead to some +bloat. Other successful languages have lived with this for multiple +decades, so it appears the tradeoff has seen some confirmation already. # Alternatives -* Follow a more agressive strategy that actually removes stuff from the API. This would make it easier for the libstd creators at some cost for library and application writers, as they are required to keep up to date or face breakage -* Hide deprecated items in the docs: This could be done either by putting them into a linked extra page or by adding a "show deprecated" checkbox that may be default be checked or not, depending on who you ask. 
This will however confuse people, who see the deprecated APIs in some code, but cannot find them in the docs anymore -* Allow to distinguish "soft" and "hard" deprecation, so that an API can be marked as "soft" deprecated to dissuade new uses before hard deprecation is decided. Allowing people to specify deprecation in future version appears to have much of the same benefits without needing a new attribute key. -* Decide deprecation on a per-case basis. This is what we do now. The proposal just adds a well-defined process to it -* Never deprecate anything. Evolve the API by adding stuff only. Rust would be crushed by the weight of its own cruft before 2.0 even has a chance to land. Users will be uncertain which APIs to use +* Follow a more agressive strategy that actually removes stuff from the +API. This would make it easier for the libstd creators at some cost for +library and application writers, as they are required to keep up to +date or face breakage * Hide deprecated items in the docs: This could +be done either by putting them into a linked extra page or by adding a +"show deprecated" checkbox that may be default be checked or not, +depending on who you ask. This will however confuse people, who see the +deprecated APIs in some code, but cannot find them in the docs anymore +* Allow to distinguish "soft" and "hard" deprecation, so that an API +can be marked as "soft" deprecated to dissuade new uses before hard +deprecation is decided. Allowing people to specify deprecation in +future version appears to have much of the same benefits without +needing a new attribute key. * Decide deprecation on a per-case basis. +This is what we do now. The proposal just adds a well-defined process +to it * Never deprecate anything. Evolve the API by adding stuff only. +Rust would be crushed by the weight of its own cruft before 2.0 even +has a chance to land. Users will be uncertain which APIs to use * We +could extend the deprecation feature to cover libraries. 
As Cargo.toml +already defines the target versions of dependencies (unless declared as +`"*"`), we could use much of the same machinery to allow library +authors to join the process # Unresolved questions -Should we allow library writers to use the same features for deprecating their API items? +Should we allow library writers to use the same features for +deprecating their API items? From 94acabc8910139983f22e473d8aa517dfcdc7617 Mon Sep 17 00:00:00 2001 From: llogiq Date: Wed, 3 Jun 2015 16:30:39 +0200 Subject: [PATCH 0516/1195] clarified how cargo gets rust version --- text/0000-deprecation.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index 7a48f1d055a..c7e84780a3f 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -37,7 +37,11 @@ First, crates should state their target version using a `#![version = "1.0.0"]` attribute. Cargo should insert the current rust version by default on `cargo new` and *warn* if no version is defined on all other commands. It may optionally *note* that the specified target version is -outdated on `cargo package`. [crates.io](https://crates.io) may deny +outdated on `cargo package`. To get the current rust version, cargo +could query rustc -V (with some postprocessing) or use some as yet +undefined symbol exported by the rust libraries. + +[crates.io](https://crates.io) may deny packages that do not declare a version to give the target version requirement more weight to library authors. 
Cargo should also be able to hold back a new library version if its declared target version is From 043d6a62a0322da374a3a3d2351e1445e29cc211 Mon Sep 17 00:00:00 2001 From: llogiq Date: Thu, 4 Jun 2015 16:06:09 +0200 Subject: [PATCH 0517/1195] Added paragraph on how rustc should handle future target versions --- text/0000-deprecation.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index c7e84780a3f..e07679259bd 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -57,7 +57,11 @@ probably make cargo not warn on build to get rid of duplicate warnings). Otherwise, use of API items whose `since` attribute is less or equal to the target version of the crate should trigger a warning, while API items whose `removed_at` attribute is less or equal to the -target version should trigger an error. +target version should trigger an error. + +Also if the target definition has a higher version than `rustc`, it +should warn that it probably has to be updated in order to build the +crate. `rustdoc` should mark deprecated APIs as such (e.g. make them in a lighter gray font) and relegate removed APIs to a section below all From 2f3acfb81bf1f0dc157cfe0e589a962f85c6c4f3 Mon Sep 17 00:00:00 2001 From: llogiq Date: Wed, 10 Jun 2015 07:17:06 +0200 Subject: [PATCH 0518/1195] Fleshed out the Motivation section a bit --- text/0000-deprecation.md | 61 ++++++++++++++++++++++++++++++++++++---- 1 file changed, 56 insertions(+), 5 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index e07679259bd..37b219f9658 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -5,14 +5,65 @@ # Summary -There has been an ongoing [discussion on internals](https://internals.rust-lang.org/t/thoughts-on-aggressive-deprecation-in-libstd/2176/55) about how we are going to evolve the standard library. This RFC tries to condense the consensus. 
+There has been an ongoing [discussion on +internals](https://internals.rust-lang.org/t/thoughts-on-aggressive-deprecation-in-libstd/2176/55) +about how we are going to evolve the standard library. This RFC tries +to condense the consensus. -# Motivation +As a starting point, the current deprecation feature allows a developer +to annotate their API items with `#[deprecated(since="1.1.0")]` and +have suitable warnings shown if the feature is used. -We want to guide the deprecation efforts to allow std to evolve freely to get the best possible API while ensuring minimum-breakage backwards compatibility for users and allow std authors to remove API items for a given version of Rust. Basically have our cake and eat it. Yum, cake. +# Motivation -Of course we cannot really keep and remove a feature at the same time. -To square this circle, we can follow the process outlined herein. +We want to: + +1. evolve the `std` API, including making items unavailable with new +versions +2. with minimal -- next to no -- breakage +3. be able to plug security/safety holes +4. avoid confusing users +5. stay backwards-compatible so people can continue to use dependencies +written for older versions (except where point 3. forbids this) +6. give users sensible defaults +7. and an update plan when they want to use a more current version + +This was quite short, so let me explain a bit: We want Rust to be +successful, and since the 1.0.0 release, there is an expectation of +stability. Therefore the first order of business when evolving the +`std` API is: **Don't break people's code**. + +In practice there will be some qualification, e.g. if fixing a security +hole requires breaking an API, it is nonetheless acceptable, because +the API was broken to begin with, as is code using it. So breaking this +code is acceptable. + +On the other hand, we really want to make features inaccessible in a +newer version, not just mark them as deprecated. 
Otherwise we would +bloat our APIs with deprecated features that no one uses (see Java). To +do this, it's not enough to hide the feature from the docs, as that +would be confusing (see point 4.) to those who encounter a hidden API. + +Not breaking code also mean we do not want to have the deprecation +feature interfere with a project's dependencies, which would teach +people to disable or ignore the warnings until their builds break. On +the other hand, we don't want to have all unavailable APIs show up +for library writers, as that -- apart from defeating the purpose of the +deprecation feature -- would create a confusing mirror world, +which is directly in conflict to point 4. + +We also want the feature to be *usable* to the programmer, therefore +any additional code we require should be minimal. If the feature is too +obscure, or too complicated to use, people will just +`#![allow(deprecate)]` and complain when their build finally breaks. + +Note that we expect many more *users* than *writers* of the `std` APIs, +so the wants of the former should count higher than those of the latter. + +Ideally, this can be done so that all parts play well together: Cargo +could help with setup (and possibly reporting), rustc warnings / error +reporting should be extended to inform people of pending or active +deprecations, rustdoc needs some way of reflecting the API lifecycle. # Detailed design From d95558d8dad8b97f8cf65b93fd6b7c5683badfc9 Mon Sep 17 00:00:00 2001 From: llogiq Date: Thu, 11 Jun 2015 23:30:35 +0200 Subject: [PATCH 0519/1195] added future/legacy flags to alternatives --- text/0000-deprecation.md | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index 37b219f9658..eef3c8b6c74 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -130,6 +130,15 @@ decades, so it appears the tradeoff has seen some confirmation already. 
# Alternatives +* Opt-in and / or opt-out "feature-flags" (e.g. `#[legacy(..)]` for +opting out of a change) was suggested. The big problem is that this +relies on the user being able to change their dependencies, which may +not be possible for legal, organizational or other reasons. In +contrast, a defined target version doesn't ever need to change. +Depending on the specific case, it may be useful to allow a combination +of `#![legacy(..)]`, `#![future(..)]` and `#![target(..)]` where each +API version can declare the currently active feature and permit or +forbid use of the opt-in/out flags. * Follow a more agressive strategy that actually removes stuff from the API. This would make it easier for the libstd creators at some cost for library and application writers, as they are required to keep up to @@ -155,4 +164,5 @@ authors to join the process # Unresolved questions Should we allow library writers to use the same features for -deprecating their API items? +deprecating their API items? I think we should at least make sure that +our design and implementation allow this in the future. From 06b4163177924e1f1c20df145f541279711ce02c Mon Sep 17 00:00:00 2001 From: llogiq Date: Thu, 11 Jun 2015 23:45:45 +0200 Subject: [PATCH 0520/1195] More explanation about security issues --- text/0000-deprecation.md | 28 ++++++++++++++++++---------- 1 file changed, 18 insertions(+), 10 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index eef3c8b6c74..ac8dbd90269 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -33,16 +33,24 @@ successful, and since the 1.0.0 release, there is an expectation of stability. Therefore the first order of business when evolving the `std` API is: **Don't break people's code**. -In practice there will be some qualification, e.g. if fixing a security -hole requires breaking an API, it is nonetheless acceptable, because -the API was broken to begin with, as is code using it. 
So breaking this -code is acceptable. - -On the other hand, we really want to make features inaccessible in a -newer version, not just mark them as deprecated. Otherwise we would -bloat our APIs with deprecated features that no one uses (see Java). To -do this, it's not enough to hide the feature from the docs, as that -would be confusing (see point 4.) to those who encounter a hidden API. +In practice there will be some qualification, e.g. if an API is +*inherently* unsafe, it should be acceptable to remove it completely, +as any code using it was in fact broken to begin with. Therefore it is +acceptable to make this code stop working altogether. + +On the other hand, if an API permits unsafe uses, and a safer +alternative is available, we may want to *retroactively deprecate* it, +so that people will get warnings even if they specified an older target +version. We may want to have a different kind of warning than the +standard deprecation warning, as there are already some crates (e.g. +compiletest.rs) on crates.io that declare `#![deny(deprecate)]`, so +those warnings would turn to errors. + +We also really want to make features inaccessible in a newer version, +not just mark them as deprecated. Otherwise we would bloat our APIs +with deprecated features that no one uses (see Java). To do this, it's +not enough to hide the feature from the docs, as that would be +confusing (see point 4.) to those who encounter a hidden API. 
Not breaking code also mean we do not want to have the deprecation feature interfere with a project's dependencies, which would teach From 1764038ed2eb83737a0bb671587c8798f46a92aa Mon Sep 17 00:00:00 2001 From: llogiq Date: Mon, 15 Jun 2015 07:19:27 +0200 Subject: [PATCH 0521/1195] Clarified the paragraph about cargo/crates.io, also added Policy section --- text/0000-deprecation.md | 71 +++++++++++++++++++++++++++++----------- 1 file changed, 51 insertions(+), 20 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index ac8dbd90269..4957eb6ddd4 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -81,32 +81,49 @@ reason="foo")]`. This should be extended to add an optional `removed_at` key, to state that the item should be made inaccessible at that version. Note that while this allows for marking items as deprecated, there is purposely no provision to actually *remove* items. -In fact this proposal bans removing an API type outright, unless -security concerns are deemed more important than the resulting breakage -from removing it or the API item has some fault that means it cannot be -used correctly at all (thus leaving the API in place would result in -the same level of breakage than removing it). +In fact this proposal strongly advises not to remove an API type, +unless security concerns are deemed more important than the resulting +breakage from removing it or the API item has some fault that means it +cannot be used correctly at all (thus leaving the API in place would +result in the same level of breakage than removing it). Currently every rustc version implements only its own version, having multiple versions is possible using something like multirust, though this does not work within a build. Also currently rustc versions do not guarantee interoperability. This RFC aims to change this situation. -First, crates should state their target version using a `#![version = -"1.0.0"]` attribute. 
Cargo should insert the current rust version by -default on `cargo new` and *warn* if no version is defined on all other -commands. It may optionally *note* that the specified target version is -outdated on `cargo package`. To get the current rust version, cargo -could query rustc -V (with some postprocessing) or use some as yet -undefined symbol exported by the rust libraries. - -[crates.io](https://crates.io) may deny -packages that do not declare a version to give the target version -requirement more weight to library authors. Cargo should also be able -to hold back a new library version if its declared target version is -newer than the rust version installed on the system. In those cases, -cargo should emit a warning urging the user to upgrade their rust -installation. +First, crates should state their target version using a +`#![target(std= "1.2.0"]` attribute on the main module. The version +string format is the one that cargo currently uses. + +Cargo should insert the current rust version by default on `cargo new` +and *warn* if no version is defined on all other commands. It may +optionally *note* if the specified target version is outdated on `cargo +package` or even `cargo build --release`. To get the current rust +version, cargo could query rustc -V (with some postprocessing) or use a +symbol exported by the rust libraries (e.g. `rustc::target_version`). + +Cargo should also be able to 'hold back' a new library version if its +declared target version is newer than the rust version installed on the +system. In those cases, cargo should emit a warning urging the user to +upgrade their rust installation. + +In the case of packages on crates.io, we could offer a mapping of +target versions to crate versions for each crate, so the corresponding +crate version can directly be used without further search. + +In the case of crates from git, the only reliable way to implement it +is to search the history for a suitable target version definition. 
Note +that we'd expect the target version to go up monotonously, so a binary +search should be possible, also we can filter out all commits that do +not touch lib.rs/mod.rs. + +This is a very complex feature to implement, so stopping with an error +and referring to the user to do the search is an acceptable option. + +[crates.io](https://crates.io) may start denying new packages that do +not declare a version to give the target version requirement more +weight to library authors. `rustc` should use this target version definition to check for deprecated items. If no target version is defined, deprecation checking @@ -129,6 +146,20 @@ completely remove the documentation, as users of libraries that target old versions may still have a use for them, but neither should we let them clutter the docs. +## Policy + +Even if this proposal reduces breakage arising from new versions +considerably, we should still exercise some care on evolving the APIs. +We already have a `beta` and `nightly` release train representing +future versions, this should be taken into account. + +In general, the Tarzan principle should be followed where applicable +(First grab a vine, *then* let go of the previous vine). In terms of +API evolution, this means not deprecating a feature before a +replacement has been stabilized. It is still possible to deprecate a +feature in a future version, to inform users of its impending +departure. 
+ # Drawbacks By requiring full backwards-compatibility, we will never be able to From ef9c63207a35b89477a0b027a3610b0d5a61e700 Mon Sep 17 00:00:00 2001 From: llogiq Date: Mon, 15 Jun 2015 17:52:51 +0200 Subject: [PATCH 0522/1195] clarification on insecurity and future-proofing --- text/0000-deprecation.md | 83 +++++++++++++++++++++++++++++----------- 1 file changed, 60 insertions(+), 23 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index 4957eb6ddd4..5c8c26dfbdd 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -21,7 +21,7 @@ We want to: 1. evolve the `std` API, including making items unavailable with new versions 2. with minimal -- next to no -- breakage -3. be able to plug security/safety holes +3. be able to plug security/safety holes 4. avoid confusing users 5. stay backwards-compatible so people can continue to use dependencies written for older versions (except where point 3. forbids this) @@ -33,18 +33,20 @@ successful, and since the 1.0.0 release, there is an expectation of stability. Therefore the first order of business when evolving the `std` API is: **Don't break people's code**. -In practice there will be some qualification, e.g. if an API is -*inherently* unsafe, it should be acceptable to remove it completely, -as any code using it was in fact broken to begin with. Therefore it is -acceptable to make this code stop working altogether. - -On the other hand, if an API permits unsafe uses, and a safer -alternative is available, we may want to *retroactively deprecate* it, -so that people will get warnings even if they specified an older target -version. We may want to have a different kind of warning than the +In practice there will be some qualification, e.g. if an API is +*inherently* unsafe, it should be acceptable make it completely +unavailable, as any code using it was in fact broken to begin with. 
+Therefore it is acceptable to make this code stop working altogether, +as long as the resulting error is not too confusing (which again means +we should make the item inaccessible instead of removing it). + +If an API permits unsafe uses, and a safer alternative is available, we +may want to mark it as insecure in addition to deprecating it, so that +people will get warnings even if they specified an older target +version. We want to have a different kind of warning than the standard deprecation warning, as there are already some crates (e.g. compiletest.rs) on crates.io that declare `#![deny(deprecate)]`, so -those warnings would turn to errors. +those warnings would turn to errors. We also really want to make features inaccessible in a newer version, not just mark them as deprecated. Otherwise we would bloat our APIs @@ -126,18 +128,35 @@ not declare a version to give the target version requirement more weight to library authors. `rustc` should use this target version definition to check for -deprecated items. If no target version is defined, deprecation checking -is deactivated (as we cannot assume a specific rust version), however a -warning stating the same should be issued (as with cargo – we should -probably make cargo not warn on build to get rid of duplicate -warnings). Otherwise, use of API items whose `since` attribute is less -or equal to the target version of the crate should trigger a warning, -while API items whose `removed_at` attribute is less or equal to the -target version should trigger an error. - -Also if the target definition has a higher version than `rustc`, it -should warn that it probably has to be updated in order to build the -crate. +deprecated items. If the target version is specified, use of API items +whose `since` attribute is less or equal to the target version of the +crate should trigger a warning, while API items whose `removed_at` +attribute is less or equal to the target version should trigger an +error. 
+ +We can also define a `future deprecation` lint set to `Allow` by +default to allow people being proactive about items that are going to +be deprecated. + +Also if the target definition has a higher version than `rustc`, it +should briefly warn that it probably has to be updated in order to +build the crate. However, `rustc` should try to build the code anyway; +further errors may give the user additional information. + +If *no* target version is defined, deprecation checking is deactivated +(as we cannot assume a specific rust version), however a note +stating the same should be printed (as with cargo – we should probably +make cargo not warn on build to get rid of duplicate warnings). Since +all current code comes without a target version, we have to assume +a minimal version 1.0.0. + +In addition to the note, the `std` authors could opt to create a new +`#[since="1.2.0"]` attribute, which would allow rustc to infer the +minimal target version of some code from the API features it uses in +absence of a specified version. Deprecation warnings/errors should then +refer to the inferred target versions as well as the APIs that led to +the inference of the latest version (at least perhaps on calling +`rustc` with `-v`). `rustdoc` should mark deprecated APIs as such (e.g. make them in a lighter gray font) and relegate removed APIs to a section below all @@ -146,6 +165,24 @@ completely remove the documentation, as users of libraries that target old versions may still have a use for them, but neither should we let them clutter the docs. +## Dealing with insecure items + +Since just removing insecure items, though tempting, would lead to user +confusion, a new `#[insecure(reason = "...")]` attribute should be +added to all insecure API items. An `insecure_api` lint that by default +raises `Error` can catch all uses of those items. 
To distinguish +between items *some uses of which* may be insecure and *inherently* +insecure items, either a second entry `inherent = true` could be added +or a `#[maybe_insecure(reason = "...")]` annotation could take the +latter part. + +The rationale for defining a separate attribute is that it avoids +mixing separate concerns (versioning and security), and that we want to +allow warnings/errors on dependencies regardless of specified target +versions. It also allows us to show the reason (from the attr) in the +lint message, which will be specific to the insecurity at hand and +hopefully be helpful to the user. + ## Policy Even if this proposal reduces breakage arising from new versions From 9d18ae4152b8932232349e523611eae1475cb836 Mon Sep 17 00:00:00 2001 From: llogiq Date: Wed, 17 Jun 2015 13:28:20 +0200 Subject: [PATCH 0523/1195] added Cargo.toml-based target to Alternatives section --- text/0000-deprecation.md | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index 5c8c26dfbdd..2dfef8e7c95 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -206,11 +206,19 @@ decades, so it appears the tradeoff has seen some confirmation already. # Alternatives -* Opt-in and / or opt-out "feature-flags" (e.g. `#[legacy(..)]` for -opting out of a change) was suggested. The big problem is that this -relies on the user being able to change their dependencies, which may -not be possible for legal, organizational or other reasons. In -contrast, a defined target version doesn't ever need to change. +* Have a flag in `Cargo.toml` instead of the crate root. This however +requires an argument to `rustc`, because Cargo (in addition to those +not using it) somehow has to pass it to `rustc`. Requiring such an +argument on every non-cargoized build would increase room for error and +thus pessimize usability. 
Also apart from availability of dependencies, +which arguably is Cargo's main raison d'être, we currently do not have +a precedent where Cargo.toml has direct effect on the working of a +crate's code. +* Opt-in and / or opt-out "feature-flags" (e.g. `#[legacy(..)]`) was +suggested. The big problem is that this relies on the user being able +to change their dependencies, which may not be possible for legal, +organizational or other reasons. In contrast, a defined target version +doesn't ever need to change. Depending on the specific case, it may be useful to allow a combination of `#![legacy(..)]`, `#![future(..)]` and `#![target(..)]` where each API version can declare the currently active feature and permit or From e59e20c3332d83eb75de7f5857c6d4ab44ad5c09 Mon Sep 17 00:00:00 2001 From: llogiq Date: Fri, 19 Jun 2015 18:12:04 +0200 Subject: [PATCH 0524/1195] Almost complete rewrite. --- text/0000-deprecation.md | 423 +++++++++++++++++++-------------------- 1 file changed, 207 insertions(+), 216 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index 2dfef8e7c95..7ebcbf537f8 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -1,252 +1,243 @@ -- Feature Name: A plan for deprecating APIs within Rust +- Feature Name: Target Version - Start Date: 2015-06-03 - RFC PR: - Rust Issue: # Summary -There has been an ongoing [discussion on +This RFC proposes a small number of extensions to improve the user +experience around different rust versions, while at the same time +giving Rust developers more freedom to make changes without breaking +anyones code. + +Namely the following items: + +1. Add a `--target=`* command line argument to rustc. +This will be used for deprecation checking and for selecting code paths +in the compiler. +2. Add an (optional for now) `rust = "..."` dependency to Cargo.toml, +which `cargo new` pre-fills with the current rust version +3. 
Allow `std` APIs to declare an `#[insecure(level="Warn", reason="...")]` +attribute that will produce a warning or error, depending on level, +that cannot be switched off (even with `-Awarnings`) +4. Add a `removed_at="..."` item to `#[deprecated]` attributes that +allows making API items unavailable starting from certain target +versions. +5. Add a number of warnings to steer users in the direction of using the +most recent Rust version that makes sense to them, while making it +easy for library writers to support a wide range of Rust versions + +## Background + +A good number of policies and mechanisms around versioning regarding +the Rust language and APIs have already been submitted, some accepted. +As a background, in no particular order: + +* [#0572 Feature gates](https://github.com/rust-lang/rfcs/blob/master/text/0572-rustc-attribute.md) +* [#1105 API Evolution](https://github.com/rust-lang/rfcs/blob/master/text/1105-api-evolution.md) +* [#1122 Language SemVer](https://github.com/rust-lang/rfcs/blob/master/text/1122-language-semver.md) +* [#1150 Rename Attribute](https://github.com/rust-lang/rfcs/pull/1150) + +In addition, there has been an ongoing [discussion on internals](https://internals.rust-lang.org/t/thoughts-on-aggressive-deprecation-in-libstd/2176/55) -about how we are going to evolve the standard library. This RFC tries -to condense the consensus. +about how we are going to evolve the standard library, which this +proposal is mostly based on. -As a starting point, the current deprecation feature allows a developer -to annotate their API items with `#[deprecated(since="1.1.0")]` and -have suitable warnings shown if the feature is used. +Finally, the recent discussion on the first breaking change +([RFC PR #1156 Adjust default object bounds](https://github.com/rust-lang/rfcs/pull/1156)) +has made it clear that we need a more flexible way of dealing with +(breaking) changes.
+ +The current setup allows the `std` API to declare `#[unstable]` and +`#[deprecated]` flags, whereas users can opt-in to unstable features +with `#![feature]` flags that are customarily added to the crate root. +On usage of deprecated APIs, a warning is shown unless suppressed. +`cargo` does this for dependencies by default by calling `rustc` with +the `-Awarnings` argument. # Motivation -We want to: - -1. evolve the `std` API, including making items unavailable with new -versions -2. with minimal -- next to no -- breakage -3. be able to plug security/safety holes -4. avoid confusing users -5. stay backwards-compatible so people can continue to use dependencies -written for older versions (except where point 3. forbids this) -6. give users sensible defaults -7. and an update plan when they want to use a more current version - -This was quite short, so let me explain a bit: We want Rust to be -successful, and since the 1.0.0 release, there is an expectation of -stability. Therefore the first order of business when evolving the -`std` API is: **Don't break people's code**. - -In practice there will be some qualification, e.g. if an API is -*inherently* unsafe, it should be acceptable make it completely -unavailable, as any code using it was in fact broken to begin with. -Therefore it is acceptable to make this code stop working altogether, -as long as the resulting error is not too confusing (which again means -we should make the item inaccessible instead of removing it). - -If an API permits unsafe uses, and a safer alternative is available, we -may want to mark it as insecure in addition to deprecating it, so that -people will get warnings even if they specified an older target -version. We want to have a different kind of warning than the -standard deprecation warning, as there are already some crates (e.g. -compiletest.rs) on crates.io that declare `#![deny(deprecate)]`, so -those warnings would turn to errors. 
- -We also really want to make features inaccessible in a newer version, -not just mark them as deprecated. Otherwise we would bloat our APIs -with deprecated features that no one uses (see Java). To do this, it's -not enough to hide the feature from the docs, as that would be -confusing (see point 4.) to those who encounter a hidden API. - -Not breaking code also mean we do not want to have the deprecation -feature interfere with a project's dependencies, which would teach -people to disable or ignore the warnings until their builds break. On -the other hand, we don't want to have all unavailable APIs show up -for library writers, as that -- apart from defeating the purpose of the -deprecation feature -- would create a confusing mirror world, -which is directly in conflict to point 4. - -We also want the feature to be *usable* to the programmer, therefore -any additional code we require should be minimal. If the feature is too -obscure, or too complicated to use, people will just -`#![allow(deprecate)]` and complain when their build finally breaks. - -Note that we expect many more *users* than *writers* of the `std` APIs, -so the wants of the former should count higher than those of the latter. - -Ideally, this can be done so that all parts play well together: Cargo -could help with setup (and possibly reporting), rustc warnings / error -reporting should be extended to inform people of pending or active -deprecations, rustdoc needs some way of reflecting the API lifecycle. +## 1. Language / `std` Evolution + +The following motivates items 1 and 2 (and to a lesser extent 5) + +With the current setup, we can already evolve the language and APIs, +albeit in a very limited way. For example, it is virtually impossible +to actually remove an API, because that would break code. 
Even minor +breaking changes (as [RFC PR #1156](https://github.com/rust-lang/rfcs/pull/1156) +cited above) generate huge discussion, because they call the general +stability of the language and environment into question. + +The problem with opt-out, as defined by +[RFC #1122](https://github.com/rust-lang/rfcs/blob/master/text/1122-language-semver.md) +is that code which previously compiled without problems stops working +on a Rust upgrade, and requires manual intervention to get working +again. + +This has the long-term potential of fragmenting the Rust ecosystem into +crates/projects using certain Rust versions and thus should be avoided +at some (but not all) cost. + +Note that a similar problem exists with deprecation as defined: People +may get used to deprecation warnings or just turn them off until their +build breaks. Worse, the current setup creates a mirror world for +libraries, in which deprecation doesn't exist! + +## 2. User Experience + +The following motivates items 2, 4 and 5. + +Currently, there is no way to make an API item unavailable via +deprecation. This means the API will only ever expand, with a lot of +churn (case in point, the Java language has about 20% deprecated API +surface [citation needed]). To better lead users to the right APIs and +to allow for more effective language/API evolution, this proposal adds +the `removed_at=""` item to the `#[deprecated]` attribute. + +This allows us to effectively remove an API item from a certain target +version while avoiding breaking code written for older target versions. + +Also rustc can emit better error messages than it could were the API +items actually removed. In the best case, the error messages can steer +the user to a working replacement. + +We want to avoid users setting `#[allow(deprecate)]` on their code to +get rid of the warnings. 
On the other hand, there have been instances +of failing builds because code was marked with `#![deny(deprecate)]`, +namely `compiletest.rs` and all crates using it. This shows that the +current system has room for improvement. + +We want to avoid failing builds because of a wrong or missing target +version definition. Therefore, supplying useful defaults is of the +essence. + +The documentation can be optionally reduced to items relating to the +current target version (e.g. by a switch), or deprecated items +relegated to a separate space, to reduce clutter and possible user +confusion. + +## 3. Security Considerations + +I believe that *should* a security issue in one of our APIs be found, +a swift and effective response will be required, and the current rules +make no provisions for it. Thus proposal item 3. # Detailed design -We already declare deprecation in terms of Rust versions (like "1.0", -"1.2"). The current attribute looks like `#[deprecated(since = "1.0.0", -reason="foo")]`. This should be extended to add an optional -`removed_at` key, to state that the item should be made inaccessible at -that version. Note that while this allows for marking items as -deprecated, there is purposely no provision to actually *remove* items. -In fact this proposal strongly advises not to remove an API type, -unless security concerns are deemed more important than the resulting -breakage from removing it or the API item has some fault that means it -cannot be used correctly at all (thus leaving the API in place would -result in the same level of breakage than removing it). - -Currently every rustc version implements only its own version, having -multiple versions is possible using something like multirust, though -this does not work within a build. Also currently rustc versions do not -guarantee interoperability. This RFC aims to change this situation. - -First, crates should state their target version using a -`#![target(std= "1.2.0"]` attribute on the main module.
The version -string format is the one that cargo currently uses. - -Cargo should insert the current rust version by default on `cargo new` -and *warn* if no version is defined on all other commands. It may -optionally *note* if the specified target version is outdated on `cargo -package` or even `cargo build --release`. To get the current rust -version, cargo could query rustc -V (with some postprocessing) or use a -symbol exported by the rust libraries (e.g. `rustc::target_version`). - -Cargo should also be able to 'hold back' a new library version if its -declared target version is newer than the rust version installed on the -system. In those cases, cargo should emit a warning urging the user to -upgrade their rust installation. - -In the case of packages on crates.io, we could offer a mapping of -target versions to crate versions for each crate, so the corresponding -crate version can directly be used without further search. - -In the case of crates from git, the only reliable way to implement it -is to search the history for a suitable target version definition. Note -that we'd expect the target version to go up monotonously, so a binary -search should be possible, also we can filter out all commits that do -not touch lib.rs/mod.rs. - -This is a very complex feature to implement, so stopping with an error -and referring to the user to do the search is an acceptable option. - -[crates.io](https://crates.io) may start denying new packages that do -not declare a version to give the target version requirement more -weight to library authors. - -`rustc` should use this target version definition to check for -deprecated items. If the target version is specified, use of API items -whose `since` attribute is less or equal to the target version of the -crate should trigger a warning, while API items whose `removed_at` -attribute is less or equal to the target version should trigger an -error. 
- -We can also define a `future deprecation` lint set to `Allow` by +Cargo parses the additional `rust = "..."` dependency as if it was a +library. The usual rules for version parsing apply. If no `rust` +dependency is supplied, it can either default to `*`. + +Cargo should also supply the current Rust version (which can be either +supplied by calling `rustc -V` or by linking to a rust library defining +a version object) on `cargo new`. Cargo supplies the given target +version to `rustc` via the `--target` command line argument. + +Cargo *may* also warn on `cargo package` if no `rust` version was +supplied. [crates.io](https://crates.io) *could* require a version +attribute on upload and display the required rust version on the site. + +One nice aspect of this is that `rust` looks just like yet another +dependency and effectively follows the same rules. + +`rustc` needs to accept the `--target ` command line argument. +If no argument is supplied, `rustc` defaults to its own version. The +same version syntax as Cargo applies: + +* `*` effectively means *any version*. For API items, it means +deprecation checking is disabled. For language changes, it means using +the 1.0.0 code paths (for now, we may opt to change this in the +future), because anything else would break all current code. +* `1.x` or e.g. `>=1.2.0` sets the target version to the minor version. +Deprecation checking and language code path selection occur relative +to the lowest given version. This might also affect stability handling, +though this RFC doesn't specify this as of yet. +* `1.0 - <2.0` as above, the *lowest* supplied version has to be +assumed for code path selection. However, deprecation checking should +assume the *highest* supplied version, if any. + +If the target version is *higher* than the current `rustc` version, +`rustc` should show a warning to suggest that it may need to be updated +in order to compile the crate and then try to compile the crate on a +best-effort basis. 
+ +Optionally, we can define a `future deprecation` lint set to `Allow` by default to allow people being proactive about items that are going to -be deprecated. - -Also if the target definition has a higher version than `rustc`, it -should briefly warn that it probably has to be updated in order to -build the crate. However, `rustc` should try to build the code anyway; -further errors may give the user additional information. - -If *no* target version is defined, deprecation checking is deactivated -(as we cannot assume a specific rust version), however a note -stating the same should be printed (as with cargo – we should probably -make cargo not warn on build to get rid of duplicate warnings). Since -all current code comes without a target version, we have to assume -a minimal version 1.0.0. - -In addition to the note, the `std` authors could opt to create a new -`#[since="1.2.0"]` attribute, which would allow rustc to infer the -minimal target version of some code from the API features it uses in -absence of a specified version. Deprecation warnings/errors should then -refer to the inferred target versions as well as the APIs that led to -the inference of the latest version (at least perhaps on calling -`rustc` with `-v`). - -`rustdoc` should mark deprecated APIs as such (e.g. make them in a -lighter gray font) and relegate removed APIs to a section below all -others (and that may be hidden via a checkbox). We should not -completely remove the documentation, as users of libraries that target -old versions may still have a use for them, but neither should we let -them clutter the docs. - -## Dealing with insecure items - -Since just removing insecure items, though tempting, would lead to user -confusion, a new `#[insecure(reason = "...")]` attribute should be -added to all insecure API items. An `insecure_api` lint that by default -raises `Error` can catch all uses of those items. 
To distinguish -between items *some uses of which* may be insecure and *inherently* -insecure items, either a second entry `inherent = true` could be added -or a `#[maybe_insecure(reason = "...")]` annotation could take the -latter part. - -The rationale for defining a separate attribute is that it avoids -mixing separate concerns (versioning and security), and that we want to -allow warnings/errors on dependencies regardless of specified target -versions. It also allows us to show the reason (from the attr) in the -lint message, which will be specific to the insecurity at hand and -hopefully be helpful to the user. - -## Policy - -Even if this proposal reduces breakage arising from new versions -considerably, we should still exercise some care on evolving the APIs. -We already have a `beta` and `nightly` release train representing -future versions, this should be taken into account. - -In general, the Tarzan principle should be followed where applicable -(First grab a vine, *then* let go of the previous vine). In terms of -API evolution, this means not deprecating a feature before a -replacement has been stabilized. It is still possible to deprecate a -feature in a future version, to inform users of its impending -departure. +be deprecated. + +`rustc` should also show a warning or error, depending on level, on +encountering usage of API items marked as `#[insecure]`. The attribute +has two values: + +* `level` can either be `Warning` or `Error` and default to `Error` +* `reason` contains a description on why usage of this item was deemed + a security risk. This attribute is mandatory. + +While `rustdoc` already parses the deprecation flags, it should in +addition relegate items removed in the current version to a separate +area below the other documentation and optically mark them as removed. +We should not completely remove them, because that would confuse users +who see the API items in code written for older target versions. 
+ +Also, `rustdoc` should show if API items are marked with `#[insecure]`, +including displaying the `reason` prominently. + +# Optional Extension: Legacy flags + +The `#[deprecated]` attribute could get an optional `legacy="xy"` +entry, which could effectively group a set of APIs under the given +name. Users can then declare the `#[legacy]` flag as defined in +[RFC #1122](https://github.com/rust-lang/rfcs/blob/master/text/1122-language-semver.md) +to specifically allow usage of the grouped APIs, thus selectively +removing the deprecation warning. + +This would create a nice feature parity between language code paths and +`std` API deprecation checking. # Drawbacks By requiring full backwards-compatibility, we will never be able to actually remove stuff from the APIs, which will probably lead to some -bloat. Other successful languages have lived with this for multiple -decades, so it appears the tradeoff has seen some confirmation already. +bloat. However, the cost of maintaining the outdated APIs is far +outweighed by the benefits. Case in point: Other successful languages +have lived with this for multiple decades, so it appears the tradeoff +has seen some confirmation already. + +Cargo and `rustc` need some code to manage the additional rules. I +estimate the effort to be reasonably low. # Alternatives -* Have a flag in `Cargo.toml` instead of the crate root. This however -requires an argument to `rustc`, because Cargo (in addition to those -not using it) somehow has to pass it to `rustc`. Requiring such an -argument on every non-cargoized build would increase room for error and -thus pessimize usability. Also apart from availability of dependencies, -which arguably is Cargo's main raison d'être, we currently do not have -a precedent where Cargo.toml has direct effect on the working of a -crate's code. -* Opt-in and / or opt-out "feature-flags" (e.g. `#[legacy(..)]`) was -suggested.
The big problem is that this relies on the user being able +* It was suggested that opt-in and opt-out (e.g. by `#[legacy(..)]`) +could be sufficient to work around any breaking code on API or language +changes. The big problem here is that this relies on the user being able to change their dependencies, which may not be possible for legal, organizational or other reasons. In contrast, a defined target version -doesn't ever need to change. +doesn't ever need to change + Depending on the specific case, it may be useful to allow a combination -of `#![legacy(..)]`, `#![future(..)]` and `#![target(..)]` where each -API version can declare the currently active feature and permit or -forbid use of the opt-in/out flags. +of `#![legacy(..)]`, `#![feature(..)]` and the target version where +each Rust version can declare the currently active feature set and +permit or forbid use of the opt-in/out flags + * Follow a more agressive strategy that actually removes stuff from the API. This would make it easier for the libstd creators at some cost for library and application writers, as they are required to keep up to -date or face breakage * Hide deprecated items in the docs: This could -be done either by putting them into a linked extra page or by adding a -"show deprecated" checkbox that may be default be checked or not, -depending on who you ask. This will however confuse people, who see the -deprecated APIs in some code, but cannot find them in the docs anymore +date or face breakage. The risk of breaking existing code makes this +strategy very unattractive + +* Hide deprecated items in the docs: This could be done either by +putting them into a linked extra page or by adding a "show deprecated" +checkbox that may be default be checked or not, depending on who you +ask. 
This will however confuse people, who see the deprecated APIs in +some code, but cannot find them in the docs anymore + * Allow to distinguish "soft" and "hard" deprecation, so that an API can be marked as "soft" deprecated to dissuade new uses before hard deprecation is decided. Allowing people to specify deprecation in future version appears to have much of the same benefits without -needing a new attribute key. * Decide deprecation on a per-case basis. -This is what we do now. The proposal just adds a well-defined process -to it * Never deprecate anything. Evolve the API by adding stuff only. -Rust would be crushed by the weight of its own cruft before 2.0 even -has a chance to land. Users will be uncertain which APIs to use * We -could extend the deprecation feature to cover libraries. As Cargo.toml -already defines the target versions of dependencies (unless declared as -`"*"`), we could use much of the same machinery to allow library -authors to join the process +needing a new attribute key. # Unresolved questions -Should we allow library writers to use the same features for -deprecating their API items? I think we should at least make sure that -our design and implementation allow this in the future. +I no longer have any. Please join the discussion to add yours. From 42af880bb795b767bea1ce515228ea5fa3be3d9c Mon Sep 17 00:00:00 2001 From: llogiq Date: Thu, 25 Jun 2015 09:45:14 +0200 Subject: [PATCH 0525/1195] Renamed --target to --target-version, reworded Cargo entry default --- text/0000-deprecation.md | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index 7ebcbf537f8..11bb3156c8c 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -12,14 +12,15 @@ anyones code. Namely the following items: -1. Add a `--target=`* command line argument to rustc. +1. Add a `--target-version=`* command line argument to rustc. 
This will be used for deprecation checking and for selecting code paths in the compiler. 2. Add an (optional for now) `rust = "..."` dependency to Cargo.toml, which `cargo new` pre-fills with the current rust version -3. Allow `std` APIs to declare an `#[insecure(level="Warn", reason="...")]` -attribute that will produce a warning or error, depending on level, -that cannot be switched off (even with `-Awarning`) +3. Allow `std` APIs to declare an +`#[insecure(level="Warn", reason="...")]` attribute that will produce +a warning or error, depending on level, that cannot be switched off +(even with `-Awarning`) 4. Add a `removed_at="..."` item to `#[deprecated]` attributes that allows making API items unavailable starting from certain target versions. @@ -126,12 +127,12 @@ make no provisions for it. Thus proposal item 3. Cargo parses the additional `rust = "..."` dependency as if it was a library. The usual rules for version parsing apply. If no `rust` -dependency is supplied, it can either default to `*`. +dependency is supplied, it defaults to `*`. Cargo should also supply the current Rust version (which can be either supplied by calling `rustc -V` or by linking to a rust library defining a version object) on `cargo new`. Cargo supplies the given target -version to `rustc` via the `--target` command line argument. +version to `rustc` via the `--target-version` command line argument. Cargo *may* also warn on `cargo package` if no `rust` version was supplied. [crates.io](https://crates.io) *could* require a version @@ -140,9 +141,9 @@ attribute on upload and display the required rust version on the site. One nice aspect of this is that `rust` looks just like yet another dependency and effectively follows the same rules. -`rustc` needs to accept the `--target ` command line argument. -If no argument is supplied, `rustc` defaults to its own version. The -same version syntax as Cargo applies: +`rustc` needs to accept the `--target-version ` command line +argument. 
If no argument is supplied, `rustc` defaults to its own +version. The same version syntax as Cargo applies: * `*` effectively means *any version*. For API items, it means deprecation checking is disabled. For language changes, it means using From c4cf32589a8126e243d9839e864cb44627609d0b Mon Sep 17 00:00:00 2001 From: llogiq Date: Thu, 25 Jun 2015 23:58:46 +0200 Subject: [PATCH 0526/1195] added open question about cargo and detailed design about feature flag integration --- text/0000-deprecation.md | 41 ++++++++++++++++++++++++---------------- 1 file changed, 25 insertions(+), 16 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index 11bb3156c8c..5b5138a3b67 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -12,11 +12,12 @@ anyones code. Namely the following items: -1. Add a `--target-version=`* command line argument to rustc. -This will be used for deprecation checking and for selecting code paths -in the compiler. +1. Add a `--target-version=`** command line argument to +rustc. This will be used for deprecation checking and for selecting +code paths in the compiler. 2. Add an (optional for now) `rust = "..."` dependency to Cargo.toml, -which `cargo new` pre-fills with the current rust version +which `cargo new` pre-fills with the current rust version +(alternatively this could be a package attribute) 3. Allow `std` APIs to declare an `#[insecure(level="Warn", reason="...")]` attribute that will produce a warning or error, depending on level, that cannot be switched off @@ -24,8 +25,8 @@ a warning or error, depending on level, that cannot be switched off 4. Add a `removed_at="..."` item to `#[deprecated]` attributes that allows making API items unavailable starting from certain target versions. -5. Add a number of warnings to steer users in the direction of using the -most recent Rust version that makes sense to them, while making it +5. 
Add a number of warnings to steer users in the direction of using +the most recent Rust version that makes sense to them, while making it easy for library writers to support a wide range of Rust versions ## Background @@ -39,8 +40,8 @@ As a background, in no particular order: * [#1122 Language SemVer](https://github.com/rust-lang/rfcs/blob/master/text/1122-language-semver.md) * [#1150 Rename Attribute](https://github.com/rust-lang/pull/1150) -In addition, there has been an ongoing [discussion on -internals](https://internals.rust-lang.org/t/thoughts-on-aggressive-deprecation-in-libstd/2176/55) +In addition, there has been an ongoing +[discussion on internals](https://internals.rust-lang.org/t/thoughts-on-aggressive-deprecation-in-libstd/2176/55) about how we are going to evolve the standard library, which this proposal is mostly based on. @@ -90,10 +91,13 @@ The following motivates items 2, 4 and 5. Currently, there is no way to make an API item unavailable via deprecation. This means the API will only ever expand, with a lot of -churn (case in point, the Java language has about 20% deprecated API -surface [citation needed]). To better lead users to the right APIs and -to allow for more effective language/API evolution, this proposal adds -the `removed_at=""` item to the `#[deprecated]` attribute. +churn (case in point, the Java language as of Version 8 lists 462 +deprecated constructs, including methods, constants and complete +classes, in relation to 4241 classes, this makes for about 10% of +deprecated API surface. Note that this is but a rough estimate). To +better lead users to the right APIs and to allow for more effective +language/API evolution, this proposal adds the `removed_at=""` +item to the `#[deprecated]` attribute. This allows us to effectively remove an API item from a certain target version while avoiding breaking code written for older target versions. @@ -104,7 +108,7 @@ the user to a working replacement. 
We want to avoid users setting `#[allow(deprecated)]` on their code to get rid of the warnings. On the other hand, there have been instances -of failing builds because code was marked with `#![deny(deprecate)]`, +of failing builds because code was marked with `#![deny(warnings)]`, namely `compiletest.rs` and all crates using it. This shows that the current system has room for improvement. @@ -147,7 +151,7 @@ version. The same version syntax as Cargo applies: * `*` effectively means *any version*. For API items, it means deprecation checking is disabled. For language changes, it means using -the 1.0.0 code paths (for now, we may opt to change this in the +the `1.0.0` code paths (for now, we may opt to change this in the future), because anything else would break all current code. * `1.x` or e.g. `>=1.2.0` sets the target version to the minor version. Deprecation checking and language code path selection occur relative @@ -166,6 +170,11 @@ Optionally, we can define a `future deprecation` lint set to `Allow` by default to allow people being proactive about items that are going to be deprecated. +`rustc` should resolve the `#[feature]` flags against the upper bound +of the specified target version instead of the current version, but +default to the current version if no target version is specified or the +specified version has no upper bound. + `rustc` should also show a warning or error, depending on level, on encountering usage of API items marked as `#[insecure]`. The attribute has two values: @@ -185,7 +194,7 @@ including displaying the `reason` prominently. # Optional Extension: Legacy flags -The `#[deprecated]` attribute could get an optional `legacy="xy` +The `#[deprecated]` attribute could get an optional `legacy="xy"` entry, which could effectively group a set of APIs under the given name. 
Users can then declare the `#[legacy]` flag as defined in [RFC #1122](https://github.com/rust-lang/rfcs/blob/master/text/1122-language-semver.md) @@ -241,4 +250,4 @@ needing a new attribute key. # Unresolved questions -I no longer have any. Please join the discussion to add yours. +Should the rust = "" be a *dependency* or a package attribute? From 151d52357d170f7dfbb91bc9b09d0d4886d14067 Mon Sep 17 00:00:00 2001 From: llogiq Date: Fri, 26 Jun 2015 23:42:12 +0200 Subject: [PATCH 0527/1195] made 'rust' a package attribute instead of a pseudo-dependency --- text/0000-deprecation.md | 31 ++++++++++++++++++------------- 1 file changed, 18 insertions(+), 13 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index 5b5138a3b67..cdd3e05bd58 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -15,19 +15,21 @@ Namely the following items: 1. Add a `--target-version=`** command line argument to rustc. This will be used for deprecation checking and for selecting code paths in the compiler. -2. Add an (optional for now) `rust = "..."` dependency to Cargo.toml, -which `cargo new` pre-fills with the current rust version -(alternatively this could be a package attribute) +2. Add an optional `rust = "..."` package attribute to Cargo.toml, +which `cargo new` pre-fills with the current rust version. 3. Allow `std` APIs to declare an `#[insecure(level="Warn", reason="...")]` attribute that will produce a warning or error, depending on level, that cannot be switched off (even with `-Awarning`) 4. Add a `removed_at="..."` item to `#[deprecated]` attributes that allows making API items unavailable starting from certain target -versions. +versions. 5. Add a number of warnings to steer users in the direction of using the most recent Rust version that makes sense to them, while making it easy for library writers to support a wide range of Rust versions +6. 
(optional) add a `legacy="..."` item to `#[deprecated]` attributes +that allows grouping API items under a legacy flag that is already +defined in RFC #1122 (see below) ## Background @@ -129,9 +131,9 @@ make no provisions for it. Thus proposal item 3. # Detailed design -Cargo parses the additional `rust = "..."` dependency as if it was a -library. The usual rules for version parsing apply. If no `rust` -dependency is supplied, it defaults to `*`. +Cargo parses the additional `rust = "..."` package attribute. The usual +rules for version parsing apply. If no `rust` attribute is supplied, it +defaults to `*`. Cargo should also supply the current Rust version (which can be either supplied by calling `rustc -V` or by linking to a rust library defining @@ -139,11 +141,12 @@ a version object) on `cargo new`. Cargo supplies the given target version to `rustc` via the `--target-version` command line argument. Cargo *may* also warn on `cargo package` if no `rust` version was -supplied. [crates.io](https://crates.io) *could* require a version -attribute on upload and display the required rust version on the site. +supplied. A few versions in the future, Cargo could also warn of +missing version attributes on build or other actions, at least if the +crate is a library. -One nice aspect of this is that `rust` looks just like yet another -dependency and effectively follows the same rules. +[crates.io](https://crates.io) *could* require a version attribute on +upload and display the required rust version on the site. `rustc` needs to accept the `--target-version ` command line argument. If no argument is supplied, `rustc` defaults to its own @@ -202,7 +205,9 @@ to specifically allow usage of the grouped APIs, thus selectively removing the deprecation warning. This would create a nice feature parity between language code paths and -`std` API deprecation checking. +`std` API deprecation checking. 
Also it would lessen the pain for users +who want to upgrade their systems one feature at a time and can use the +legacy flags to effectively manage their usage of deprecated items. # Drawbacks @@ -250,4 +255,4 @@ needing a new attribute key. # Unresolved questions -Should the rust = "" be a *dependency* or a package attribute? +None From 186fd9170cdc61033ca56e50d6cad011e271b5e5 Mon Sep 17 00:00:00 2001 From: llogiq Date: Sun, 28 Jun 2015 22:54:07 +0200 Subject: [PATCH 0528/1195] formatting improvements --- text/0000-deprecation.md | 85 ++++++++++++++++++++-------------------- 1 file changed, 43 insertions(+), 42 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index cdd3e05bd58..62d8964df39 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -13,23 +13,23 @@ anyones code. Namely the following items: 1. Add a `--target-version=`** command line argument to -rustc. This will be used for deprecation checking and for selecting -code paths in the compiler. + rustc. This will be used for deprecation checking and for selecting + code paths in the compiler. 2. Add an optional `rust = "..."` package attribute to Cargo.toml, -which `cargo new` pre-fills with the current rust version. + which `cargo new` pre-fills with the current rust version. 3. Allow `std` APIs to declare an -`#[insecure(level="Warn", reason="...")]` attribute that will produce -a warning or error, depending on level, that cannot be switched off -(even with `-Awarning`) + `#[insecure(level="Warn", reason="...")]` attribute that will + produce a warning or error, depending on level, that cannot be + switched off (even with `-Awarning`) 4. Add a `removed_at="..."` item to `#[deprecated]` attributes that -allows making API items unavailable starting from certain target -versions. + allows making API items unavailable starting from certain target + versions. 5. 
Add a number of warnings to steer users in the direction of using -the most recent Rust version that makes sense to them, while making it -easy for library writers to support a wide range of Rust versions + the most recent Rust version that makes sense to them, while making + it easy for library writers to support a wide range of Rust versions 6. (optional) add a `legacy="..."` item to `#[deprecated]` attributes -that allows grouping API items under a legacy flag that is already -defined in RFC #1122 (see below) + that allows grouping API items under a legacy flag that is already + defined in RFC #1122 (see below) ## Background @@ -153,16 +153,16 @@ argument. If no argument is supplied, `rustc` defaults to its own version. The same version syntax as Cargo applies: * `*` effectively means *any version*. For API items, it means -deprecation checking is disabled. For language changes, it means using -the `1.0.0` code paths (for now, we may opt to change this in the -future), because anything else would break all current code. -* `1.x` or e.g. `>=1.2.0` sets the target version to the minor version. -Deprecation checking and language code path selection occur relative -to the lowest given version. This might also affect stability handling, -though this RFC doesn't specify this as of yet. + deprecation checking is disabled. For language changes, it means + using the `1.0.0` code paths (for now, we may opt to change this in + the future), because anything else would break all current code. +* `1.x` or e.g. `>=1.2.0` sets the target version to the minor version. + Deprecation checking and language code path selection occur relative + to the lowest given version. This might also affect stability + handling, though this RFC doesn't specify this as of yet. * `1.0 - <2.0` as above, the *lowest* supplied version has to be -assumed for code path selection. However, deprecation checking should -assume the *highest* supplied version, if any. + assumed for code path selection. 
However, deprecation checking should + assume the *highest* supplied version, if any. If the target version is *higher* than the current `rustc` version, `rustc` should show a warning to suggest that it may need to be updated @@ -224,34 +224,35 @@ estimate the effort to be reasonably low. # Alternatives * It was suggested that opt-in and opt-out (e.g. by `#[legacy(..)]`) -could be sufficient to work around any breaking code on API or language -changes. The big problem here is that this relies on the user being able -to change their dependencies, which may not be possible for legal, -organizational or other reasons. In contrast, a defined target version -doesn't ever need to change + could be sufficient to work around any breaking code on API or + language changes. The big problem here is that this relies on the + user being able to change their dependencies, which may not be + possible for legal, organizational or other reasons. In contrast, a + defined target version doesn't ever need to change -Depending on the specific case, it may be useful to allow a combination -of `#![legacy(..)]`, `#![feature(..)]` and the target version where -each Rust version can declare the currently active feature set and -permit or forbid use of the opt-in/out flags + Depending on the specific case, it may be useful to allow a + combination of `#![legacy(..)]`, `#![feature(..)]` and the target + version where each Rust version can declare the currently active + feature set and permit or forbid use of the opt-in/out flags * Follow a more aggressive strategy that actually removes stuff from the -API. 
This would make it easier for the libstd creators at some cost + for library and application writers, as they are required to keep up + to date or face breakage. The risk of breaking existing code makes + this strategy very unattractive * Hide deprecated items in the docs: This could be done either by -putting them into a linked extra page or by adding a "show deprecated" -checkbox that may by default be checked or not, depending on who you -ask. This will however confuse people, who see the deprecated APIs in -some code, but cannot find them in the docs anymore + putting them into a linked extra page or by adding a "show + deprecated" checkbox that may by default be checked or not, depending + on who you ask. This will however confuse people, who see the + deprecated APIs in some code, but cannot find them in the docs + anymore * Allow to distinguish "soft" and "hard" deprecation, so that an API -can be marked as "soft" deprecated to dissuade new uses before hard -deprecation is decided. Allowing people to specify deprecation in -future version appears to have much of the same benefits without -needing a new attribute key. + can be marked as "soft" deprecated to dissuade new uses before hard + deprecation is decided. Allowing people to specify deprecation in + future version appears to have much of the same benefits without + needing a new attribute key. # Unresolved questions From ce6d4e79b6043c78eca8e5c0c1c41e2bb1790347 Mon Sep 17 00:00:00 2001 From: llogiq Date: Sat, 4 Jul 2015 13:25:14 +0200 Subject: [PATCH 0529/1195] Added previous proposal to alternatives, added bikeshedding to unresolved questions --- text/0000-deprecation.md | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index 62d8964df39..67e6bbfaa13 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -223,6 +223,14 @@ estimate the effort to be reasonably low. 
# Alternatives +* An earlier version of this proposal suggested using a crate attribute + instead of a cargo package attribute and a compiler option, to also + allow cargo-less use cases without manual interaction. However it was + determined that those cases usually target the current Rust version + anyway, and the current proposal allows us to default to the current + version for rustc, while the earlier proposal would have defaulted to + 1.0.0 by necessity of not breaking existing code + * It was suggested that opt-in and opt-out (e.g. by `#[legacy(..)]`) could be sufficient to work around any breaking code on API or language changes. The big problem here is that this relies on the @@ -252,8 +260,10 @@ estimate the effort to be reasonably low. can be marked as "soft" deprecated to dissuade new uses before hard deprecation is decided. Allowing people to specify deprecation in future version appears to have much of the same benefits without - needing a new attribute key. + requiring a new attribute key # Unresolved questions -None +The names for the cargo package attribute and the rustc compiler option +are still subject to bikeshedding (however, discussion has stalled, +suggesting the current names are good enough). From 3b9ca85bc43ca9c8769344c9fa41176a8c889ef6 Mon Sep 17 00:00:00 2001 From: llogiq Date: Thu, 9 Jul 2015 07:12:51 +0200 Subject: [PATCH 0530/1195] #[insecure]-flagging removed from RFC, added some open questions (as prompted by @alexcrichton) --- text/0000-deprecation.md | 59 +++++++++++++++++++--------------------- 1 file changed, 28 insertions(+), 31 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index 67e6bbfaa13..d04ef3698db 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -17,17 +17,13 @@ Namely the following items: code paths in the compiler. 2. Add an optional `rust = "..."` package attribute to Cargo.toml, which `cargo new` pre-fills with the current rust version. -3. 
Allow `std` APIs to declare an - `#[insecure(level="Warn", reason="...")]` attribute that will - produce a warning or error, depending on level, that cannot be - switched off (even with `-Awarning`) -4. Add a `removed_at="..."` item to `#[deprecated]` attributes that +3. Add a `removed_at="..."` item to `#[deprecated]` attributes that allows making API items unavailable starting from certain target versions. -5. Add a number of warnings to steer users in the direction of using +4. Add a number of warnings to steer users in the direction of using the most recent Rust version that makes sense to them, while making it easy for library writers to support a wide range of Rust versions -6. (optional) add a `legacy="..."` item to `#[deprecated]` attributes +5. (optional) add a `legacy="..."` item to `#[deprecated]` attributes that allows grouping API items under a legacy flag that is already defined in RFC #1122 (see below) @@ -45,7 +41,7 @@ As a background, in no particular order: In addition, there has been an ongoing [discussion on internals](https://internals.rust-lang.org/t/thoughts-on-aggressive-deprecation-in-libstd/2176/55) about how we are going to evolve the standard library, which this -proposal is mostly based on. +proposal is somewhat based on. Finally, the recent discussion on the first breaking change ([RFC PR #1156 Adjust default object bounds](https://github.com/rust-lang/rfcs/pull/1156)) @@ -63,7 +59,7 @@ the `-Awarnings` argument. ## 1. Language / `std` Evolution -The following motivates items 1 and 2 (and to a lesser extent 5) +The following motivates items 1 and 2 (and to a lesser extent 4) With the current setup, we can already evolve the language and APIs, albeit in a very limited way. For example, it is virtually impossible @@ -89,7 +85,7 @@ libraries, in which deprecation doesn't exist! ## 2. User Experience -The following motivates items 2, 4 and 5. +The following motivates items 2, 3 and 4. 
Currently, there is no way to make an API item unavailable via deprecation. This means the API will only ever expand, with a lot of @@ -123,12 +119,6 @@ current target version (e.g. by a switch), or deprecated items relegated to a separate space, to reduce clutter and possible user confusion. -## 3. Security Considerations - -I believe that *should* a security issue in one of our APIs be found, -a swift and effective response will be required, and the current rules -make no provisions for it. Thus proposal item 3. - # Detailed design Cargo parses the additional `rust = "..."` package attribute. The usual @@ -178,23 +168,12 @@ of the specified target version instead the current version, but default to the current version if no target version is specified or the specified version has no upper bound. -`rustc` should also show a warning or error, depending on level, on -encountering usage of API items marked as `#[insecure]`. The attribute -has two values: - -* `level` can either be `Warning` or `Error` and default to `Error` -* `reason` contains a description on why usage of this item was deemed - a security risk. This attribute is mandatory. - While `rustdoc` already parses the deprecation flags, it should in addition relegate items removed in the current version to a separate area below the other documentation and optically mark them as removed. We should not completely remove them, because that would confuse users who see the API items in code written for older target versions. -Also, `rustdoc` should show if API items are marked with `#[insecure]`, -including displaying the `reason` prominently. - # Optional Extension: Legacy flags The `#[deprecated]` attribute could get an optional `legacy="xy"` @@ -219,7 +198,11 @@ have lived with this for multiple decades, so it appears the tradeoff has seen some confirmation already. Cargo and `rustc` need some code to manage the additional rules. I -estimate the effort to be reasonably low. 
+estimate the effort to be reasonably low. For *compiler changes* +however, unless it's a genuine bug and unless there could be programs +relying on the old behaviours, both the old and new code paths have to +be maintained in the compiler, which is the biggest cost of +implementing this RFC. # Alternatives @@ -264,6 +247,20 @@ estimate the effort to be reasonably low. # Unresolved questions -The names for the cargo package attribute and the rustc compiler option -are still subject to bikeshedding (however, discussion has stalled, -suggesting the current names are good enough). +* The names for the cargo package attribute and the rustc compiler + option are still subject to bikeshedding (however, discussion has + stalled, suggesting the current names are good enough). + +* How do we determine if something is a genuine bug (and should be + changed retroactively)? + +* If we agree that something needs to be changed retroactively (i.e. in + older versions), do we also release the old versions anew? Which + ones? Should we nominate LTS versions? Who would maintain them? + +* Is *forward-compatibility* sufficiently handled? Seeing that e.g. + adding an item to a trait could break code using that trait, changing + a trait would require both versions being interoperable, which could + be impossible in the general case. This would need to be handled by + finding a new name for the trait or supplying a default + implementation. 
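The target-version rules from the RFC's Detailed design (`*`, `>=1.2.0`, `1.0 - <2.0`) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `TargetVersion` type and both function names are hypothetical and exist neither in `rustc` nor in Cargo, and falling back to the lowest version when no upper bound is given is one reading of the `>=1.2.0` rule.

```rust
/// A (major, minor, patch) triple; tuples compare lexicographically.
type Version = (u32, u32, u32);

/// Hypothetical model of a parsed target version (not a rustc or Cargo type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum TargetVersion {
    /// `*`: any version.
    Any,
    /// A lowest version plus an optional highest one,
    /// e.g. `>=1.2.0` (no highest) or `1.0 - <2.0`.
    Range { lowest: Version, highest: Option<Version> },
}

/// Version used to select language code paths:
/// `*` falls back to 1.0.0, everything else uses the lowest version.
fn code_path_version(target: TargetVersion) -> Version {
    match target {
        TargetVersion::Any => (1, 0, 0),
        TargetVersion::Range { lowest, .. } => lowest,
    }
}

/// Version used for deprecation checking; `None` means checking is disabled.
/// A range uses its highest version if one is given, otherwise its lowest.
fn deprecation_version(target: TargetVersion) -> Option<Version> {
    match target {
        TargetVersion::Any => None,
        TargetVersion::Range { lowest, highest } => Some(highest.unwrap_or(lowest)),
    }
}
```

Keeping the two lookups separate mirrors the text: code path selection stays conservative (lowest version), while deprecation checking looks at the newest version the crate claims to support.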
From f0021d0cf1ed2aa08169472da5822e5f824bafac Mon Sep 17 00:00:00 2001 From: llogiq Date: Fri, 4 Sep 2015 18:19:49 +0200 Subject: [PATCH 0531/1195] RFC to make stability attributes public --- text/0000-deprecation.md | 316 +++++++++------------------------------ 1 file changed, 71 insertions(+), 245 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index d04ef3698db..b775757e8ea 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -1,266 +1,92 @@ -- Feature Name: Target Version -- Start Date: 2015-06-03 +- Feature Name: Public Stability +- Start Date: 2015-09-03 - RFC PR: - Rust Issue: # Summary -This RFC proposes a small number of extensions to improve the user -experience around different rust versions, while at the same time -giving Rust developers more freedom to make changes without breaking -anyones code. - -Namely the following items: - -1. Add a `--target-version=`** command line argument to - rustc. This will be used for deprecation checking and for selecting - code paths in the compiler. -2. Add an optional `rust = "..."` package attribute to Cargo.toml, - which `cargo new` pre-fills with the current rust version. -3. Add a `removed_at="..."` item to `#[deprecated]` attributes that - allows making API items unavailable starting from certain target - versions. -4. Add a number of warnings to steer users in the direction of using - the most recent Rust version that makes sense to them, while making - it easy for library writers to support a wide range of Rust versions -5. (optional) add a `legacy="..."` item to `#[deprecated]` attributes - that allows grouping API items under a legacy flag that is already - defined in RFC #1122 (see below) - -## Background - -A good number of policies and mechanisms around versioning regarding -the Rust language and APIs have already been submitted, some accepted. 
-As a background, in no particular order: - -* [#0572 Feature gates](https://github.com/rust-lang/rfcs/blob/master/text/0572-rustc-attribute.md) -* [#1105 API Evolution](https://github.com/rust-lang/rfcs/blob/master/text/1105-api-evolution.md) -* [#1122 Language SemVer](https://github.com/rust-lang/rfcs/blob/master/text/1122-language-semver.md) -* [#1150 Rename Attribute](https://github.com/rust-lang/pull/1150) - -In addition, there has been an ongoing -[discussion on internals](https://internals.rust-lang.org/t/thoughts-on-aggressive-deprecation-in-libstd/2176/55) -about how we are going to evolve the standard library, which this -proposal is somewhat based on. - -Finally, the recent discussion on the first breaking change -([RFC PR #1156 Adjust default object bounds](https://github.com/rust-lang/rfcs/pull/1156)) -has made it clear that we need a more flexible way of dealing with -(breaking) changes. - -The current setup allows the `std` API to declare `#[unstable]` and -`#[deprecated]` flags, whereas users can opt-in to unstable features -with `#![feature]` flags that are customarily added to the crate root. -On usage of deprecated APIs, a warning is shown unless suppressed. -`cargo` does this for dependencies by default by calling `rustc` with -the `-Awarnings` argument. +This RFC proposes to make the stability attributes `#[deprecate]`, `#[stable]` +and `#[unstable]` publicly available, removing some and adding other +restrictions while keeping everything mostly the same for APIs shipped with +Rust. # Motivation -## 1. Language / `std` Evolution - -The following motivates items 1 and 2 (and to a lesser extent 4) - -With the current setup, we can already evolve the language and APIs, -albeit in a very limited way. For example, it is virtually impossible -to actually remove an API, because that would break code. 
Even minor -breaking changes (as [RFC PR #1156](https://github.com/rust-lang/rfcs/pull/1156) -cited above) generate huge discussion, because they call the general -stability of the language and environment into question. - -The problem with opt-out, as defined by -[RFC #1122](https://github.com/rust-lang/rfcs/blob/master/text/1122-language-semver.md) -is that code which previously compiled without problems stops working -on a Rust upgrade, and requires manual intervention to get working -again. +Library authors want a way to evolve their APIs without too much breakage. To +this end, Rust has long employed the aforementioned attributes. Now that Rust +is somewhat stable, it's time to open them up so that others can use them. -This has the long-term potential of fragmenting the Rust ecosystem into -crates/projects using certain Rust versions and thus should be avoided -at some (but not all) cost. - -Note that a similar problem exists with deprecation as defined: People -may get used to deprecation warnings or just turn them off until their -build breaks. Worse, the current setup creates a mirror world for -libraries, in which deprecation doesn't exist! - -## 2. User Experience - -The following motivates items 2, 3 and 4. - -Currently, there is no way to make an API item unavailable via -deprecation. This means the API will only ever expand, with a lot of -churn (case in point, the Java language as of Version 8 lists 462 -deprecated constructs, including methods, constants and complete -classes, in relation to 4241 classes, this makes for about 10% of -deprecated API surface. Note that this is but a rough estimate). To -better lead users to the right APIs and to allow for more effective -language/API evolution, this proposal adds the `removed_at=""` -item to the `#[deprecated]` attribute. - -This allows us to effectively remove an API item from a certain target -version while avoiding breaking code written for older target versions. 
- -Also rustc can emit better error messages than it could were the API -items actually removed. In the best case, the error messages can steer -the user to a working replacement. - -We want to avoid users setting `#[allow(deprecate)]` on their code to -get rid of the warnings. On the other hand, there have been instances -of failing builds because code was marked with `#![deny(warnings)]`, -namely `compiletest.rs` and all crates using it. This shows that the -current system has room for improvement. - -We want to avoid failing builds because of wrong or missing target -version definition. Therefor supplying useful defaults is of the -essence. - -The documentation can be optionally reduced to items relating to the -current target version (e.g. by a switch), or deprecated items -relegated to a separate space, to reduce clutter and possible user -confusion. +A pre-RFC on rust-users has seen a good number of supportive voices, which +suggests that the feature will improve the life of rust library authors +considerably. # Detailed design -Cargo parses the additional `rust = "..."` package attribute. The usual -rules for version parsing apply. If no `rust` attribute is supplied, it -defaults to `*`. - -Cargo should also supply the current Rust version (which can be either -supplied by calling `rustc -V` or by linking to a rust library defining -a version object) on `cargo new`. Cargo supplies the given target -version to `rustc` via the `--target-version` command line argument. - -Cargo *may* also warn on `cargo package` if no `rust` version was -supplied. A few versions in the future, Cargo could also warn of -missing version attributes on build or other actions, at least if the -crate is a library. - -[crates.io](https://crates.io) *could* require a version attribute on -upload and display the required rust version on the site. - -`rustc` needs to accept the `--target-version ` command line -argument. If no argument is supplied, `rustc` defaults to its own -version. 
The same version syntax as Cargo applies: - -* `*` effectively means *any version*. For API items, it means - deprecation checking is disabled. For language changes, it means - using the `1.0.0` code paths (for now, we may opt to change this in - the future), because anything else would break all current code. -* `1.x` or e.g. `>=1.2.0` sets the target version to the minor version. - Deprecation checking and language code path selection occur relative - to the lowest given version. This might also affect stability - handling, though this RFC doesn't specify this as of yet. -* `1.0 - <2.0` as above, the *lowest* supplied version has to be - assumed for code path selection. However, deprecation checking should - assume the *highest* supplied version, if any. - -If the target version is *higher* than the current `rustc` version, -`rustc` should show a warning to suggest that it may need to be updated -in order to compile the crate and then try to compile the crate on a -best-effort basis. - -Optionally, we can define a `future deprecation` lint set to `Allow` by -default to allow people being proactive about items that are going to -be deprecated. - -`rustc` should resolve the `#[feature]` flags against the upper bound -of the specified target version instead the current version, but -default to the current version if no target version is specified or the -specified version has no upper bound. - -While `rustdoc` already parses the deprecation flags, it should in -addition relegate items removed in the current version to a separate -area below the other documentation and optically mark them as removed. -We should not completely remove them, because that would confuse users -who see the API items in code written for older target versions. - -# Optional Extension: Legacy flags - -The `#[deprecated]` attribute could get an optional `legacy="xy"` -entry, which could effectively group a set of APIs under the given -name. 
Users can then declare the `#[legacy]` flag as defined in -[RFC #1122](https://github.com/rust-lang/rfcs/blob/master/text/1122-language-semver.md) -to specifically allow usage of the grouped APIs, thus selectively -removing the deprecation warning. - -This would create a nice feature parity between language code paths and -`std` API deprecation checking. Also it would lessen the pain for users -who want to upgrade their systems one feature at a time and can use the -legacy flags to effectively manage their usage of deprecated items. +Add another stability level variant `Undefined`, to be used whenever a +`#[deprecate]` attribute is without a `#[stable]` or `#[unstable]`. This lifts +the restriction to have the latter attributes whenever the former is used. To +keep the restriction for ASWRs, we add an `undefined_stability` lint that is +`Allow` by default, but set to `Warn` in the Rust build process, that catches +`Undefined` stability attributes (can be done within the Stability lint pass). + +Remove the rust API restriction on `#[deprecate]`, `#[stable]` and +`#[unstable]`. + +On all attributes the `version` field of the attribute should be checked to be +valid semver as per [RFC +#1122](https://github.com/rust-lang/rfcs/blob/master/text/1122-language-semver.md). +It is per this RFC redefined to mean the version of the crate (or rust +distribution, for `std`) as declared in `Cargo.toml`. + +The `issue` field of the `#[`(`un`)`stable]` attributes is defined per this RFC +to mean the suffix to a URL to the issue tracker (it may also be the complete +URL). Optionally rustdoc may link the issue from the documentation; a new +`--issue-tracker-url-prefix=...` option will be prefixed to all links. + +The `feature` field is defined to contain a feature name to be used with +`cfg(feature = "...")`. To check this, Cargo could put the list of *available* +features in a space-separated `CRATE_FEATURES_AVAILABLE` environment variable. 
+Alternative build processes can also set this. To simplify things, putting a +`"*"` in the environment variable should disable the check. Otherwise the +stability lint can check if the feature has been declared. + +The `reason` field is defined to contain a human-readable text with a +suggestion what to use instead and a rationale. This is how the field is used +currently. See the Alternatives section for a less conservative (but more +work-intensive) proposal. + +The language reference should be extended to describe this feature. # Drawbacks -By requiring full backwards-compatibility, we will never be able to -actually remove stuff from the APIs, which will probably lead to some -bloat. However, the cost of maintaining the outdated APIs is far -outweighted by the benefits. Case in point: Other successful languages -have lived with this for multiple decades, so it appears the tradeoff -has seen some confirmation already. - -Cargo and `rustc` need some code to manage the additional rules. I -estimate the effort to be reasonably low. For *compiler changes* -however, unless it's a genuine bug and unless there could be programs -relying on the old behaviours, both the old and new code paths have to -be maintained in the compiler, which is the biggest cost of -implementing this RFC. +* Work to be done will take time not to invest in other improvements +* There could be attribute definitions in the codebase that do not adhere to +the outlined design, and would have to be changed to fit. It is unclear whether +this is a real drawback +* Once the feature is public, we can no longer change its design +* Someone could misuse the API to e.g. add malicious links into their rustdoc. +However this is possible via plain links even now # Alternatives -* An earlier version of this proposal suggested using a crate attribute - instead of a cargo package attribute and a compiler option, to also - allow cargo-less use cases without manual interaction. 
However it was - determined that those cases usually target the current Rust version - anyway, and the current proposal allows us to default to the current - version for rustc, while the earlier proposal would have defaulted to - 1.0.0 by necessity of not breaking existing code - -* It was suggested that opt-in and opt-out (e.g. by `#[legacy(..)]`) - could be sufficient to work around any breaking code on API or - language changes. The big problem here is that this relies on the - user being able to change their dependencies, which may not be - possible for legal, organizational or other reasons. In contrast, a - defined target version doesn't ever need to change - - Depending on the specific case, it may be useful to allow a - combination of `#![legacy(..)]`, `#![feature(..)]` and the target - version where each Rust version can declare the currently active - feature set and permit or forbid use of the opt-in/out flags - -* Follow a more agressive strategy that actually removes stuff from the - API. This would make it easier for the libstd creators at some cost - for library and application writers, as they are required to keep up - to date or face breakage. The risk of breaking existing code makes - this strategy very unattractive - -* Hide deprecated items in the docs: This could be done either by - putting them into a linked extra page or by adding a "show - deprecated" checkbox that may be default be checked or not, depending - on who you ask. This will however confuse people, who see the - deprecated APIs in some code, but cannot find them in the docs - anymore - -* Allow to distinguish "soft" and "hard" deprecation, so that an API - can be marked as "soft" deprecated to dissuade new uses before hard - deprecation is decided. 
Allowing people to specify deprecation in - future version appears to have much of the same benefits without - requiring a new attribute key +* Do nothing +* Optionally the deprecation lint chould check the current version as set by +cargo in the CARGO_CRATE_VERSION environment variable (the rust build process +should set this environment variable, too). This would allow future +deprecations to be shown in the docs early, but not warned against by the +stability lint (there could however be a `future-deprecation` lint that should +be `Allow` by default). +* The `reason` field definition could be reduced to stating the *rationale* for +deprecating the API item. A new `instead` field then contains the full path to +a replacement item (trait, method, function, etc.). Since this path is well +defined, it can be checked against. However, some provision needs to be made to +allow those paths to be extended with a crate (e.g. for items that have been +moved to different crates). The upside is that this would open up the +possibility for rustdoc to link to the replacement, the downside is that the +check could potentially be costly. # Unresolved questions -* The names for the cargo package attribute and the rustc compiler - option are still subject to bikeshedding (however, discussion has - stalled, suggesting the current names are good enough). - -* How do we determine if something is a genuine bug (and should be - changed retroactively)? - -* If we agree that something needs to be changed retroactively (i.e. in - older versions), do we also release the old versions anew? Which - ones? Should we nominate LTS versions? Who would maintain them? - -* Is *forward-compatibility* sufficiently handled? Seeing that e.g. - adding an item to a trait could break code using that trait, changing - a trait would require both versions being interoperable, which could - be impossible in the general case. 
This would needed to be handled by - finding a new name for the trait or supplying a default - implementation. +* Is the current design (as outlined herein) good enough to be made public? +* What other restrictions should we introduce now to avoid being bound to a +possibly flawed design? From 10c39c404a20d7158543402dbcca1a011564cc3b Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 4 Sep 2015 14:37:35 -0400 Subject: [PATCH 0532/1195] Move RFC #953 in its rightful place --- text/{0000-op-assign.md => 0953-op-assign.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-op-assign.md => 0953-op-assign.md} (95%) diff --git a/text/0000-op-assign.md b/text/0953-op-assign.md similarity index 95% rename from text/0000-op-assign.md rename to text/0953-op-assign.md index 6657f0a982f..cf9d1397de9 100644 --- a/text/0000-op-assign.md +++ b/text/0953-op-assign.md @@ -1,7 +1,7 @@ - Feature Name: op_assign - Start Date: 2015-03-08 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#953](https://github.com/rust-lang/rfcs/pull/953) +- Rust Issue: [rust-lang/rust#28235](https://github.com/rust-lang/rust/issues/28235) # Summary From 973484f084090630a02f5a1630210d736c781ada Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 4 Sep 2015 14:48:03 -0400 Subject: [PATCH 0533/1195] link RFC #1135, raw pointer comparisons --- ...inter-comparisons.md => 1135-raw-pointer-comparisons.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-raw-pointer-comparisons.md => 1135-raw-pointer-comparisons.md} (91%) diff --git a/text/0000-raw-pointer-comparisons.md b/text/1135-raw-pointer-comparisons.md similarity index 91% rename from text/0000-raw-pointer-comparisons.md rename to text/1135-raw-pointer-comparisons.md index 5287c239951..0fe9423b5bb 100644 --- a/text/0000-raw-pointer-comparisons.md +++ b/text/1135-raw-pointer-comparisons.md @@ -1,7 +1,7 @@ - Feature Name: raw-pointer-comparisons - Start 
Date: 2015-05-27 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1135](https://github.com/rust-lang/rfcs/pull/1135) +- Rust Issue: [rust-lang/rust#28236](https://github.com/rust-lang/rust/issues/28236) # Summary @@ -55,4 +55,4 @@ lexicographic order or addr -> unsize lexicographic order. # Unresolved questions -See Alternatives. \ No newline at end of file +See Alternatives. From 2baf5c4c78fd37d362f6047e7ce3866af18cd8d7 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 4 Sep 2015 14:50:31 -0400 Subject: [PATCH 0534/1195] amend RFC 1135 to specify unresolved question more explicitly --- text/1135-raw-pointer-comparisons.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/1135-raw-pointer-comparisons.md b/text/1135-raw-pointer-comparisons.md index 0fe9423b5bb..abdca0cd814 100644 --- a/text/1135-raw-pointer-comparisons.md +++ b/text/1135-raw-pointer-comparisons.md @@ -55,4 +55,6 @@ lexicographic order or addr -> unsize lexicographic order. # Unresolved questions -See Alternatives. +What form of ordering should be adopted, if any?
+ + From 0881d7bc2adc9b0bcf91c9d8fc45ba0c4338c975 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 4 Sep 2015 14:53:16 -0400 Subject: [PATCH 0535/1195] Adopt RFC #1192, inclusive ranges --- text/{0000-inclusive-ranges.md => 1192-inclusive-ranges.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-inclusive-ranges.md => 1192-inclusive-ranges.md} (95%) diff --git a/text/0000-inclusive-ranges.md b/text/1192-inclusive-ranges.md similarity index 95% rename from text/0000-inclusive-ranges.md rename to text/1192-inclusive-ranges.md index 90a76cc8b94..78051be6b5a 100644 --- a/text/0000-inclusive-ranges.md +++ b/text/1192-inclusive-ranges.md @@ -1,7 +1,7 @@ - Feature Name: inclusive_range_syntax - Start Date: 2015-07-07 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1192](https://github.com/rust-lang/rfcs/pull/1192) +- Rust Issue: [rust-lang/rust#28237](https://github.com/rust-lang/rust/issues/28237) # Summary From a4b0b93c040df16c80cc7dbff2e6df1124bb00e7 Mon Sep 17 00:00:00 2001 From: llogiq Date: Fri, 4 Sep 2015 21:08:47 +0200 Subject: [PATCH 0536/1195] rewrote to start from clean slate, reduce surface area --- text/0000-deprecation.md | 124 ++++++++++++++++++++------------------- 1 file changed, 64 insertions(+), 60 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index b775757e8ea..2f38d0f589c 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -5,67 +5,76 @@ # Summary -This RFC proposes to make the stability attributes `#[deprecate]`, `#[stable]` -and `#[unstable]` publicly available, removing some and adding other -restrictions while keeping everything mostly the same for APIs shipped with -Rust. +This RFC proposes to allow library authors to use a `#[deprecate]` attribute, +with `since="`(version)`"`, `reason="`(free text)`"` and +`surrogate="`(text or surrogate declaration)`"` fields. 
The compiler can then +warn on deprecated items, while `rustdoc` can document their deprecation +accordingly. # Motivation -Library authors want a way to evolve their APIs without too much breakage. To -this end, Rust has long employed the aforementioned attributes. Now that Rust -is somewhat stable, it's time to open them up so that others can use them. +Library authors want a way to evolve their APIs; which also involves +deprecating items. To do this cleanly, they need to document their intentions +and give their users enough time to react. -A pre-RFC on rust-users has seen a good number of supportive voices, which -suggests that the feature will improve the life of rust library authors -considerably. +Currently there is no support from the language for this oft-wanted feature +(despite a similar feature existing for the sole purpose of evolving the Rust +standard library). This RFC aims to rectify that, while giving a pleasant +interface to use while maximizing usefulness of the metadata introduced. # Detailed design -Add another stability level variant `Undefined`, to be used whenever a -`#[deprecate]` attribute is without a `#[stable]` or `#[unstable]`. This lifts -the restriction to have the latter attributes whenever the former is used. To -keep the restriction for ASWRs, we add an `undefined_stability` lint that is -`Allow` by default, but set to `Warn` in the Rust build process, that catches -`Undefined` stability attributes (can be done within the Stability lint pass). - -Remove the rust API restriction on `#[deprecate]`, `#[stable]` and -`#[unstable]`. - -On all attributes the `version` field of the attribute should be checked to be -valid semver as per [RFC -#1122](https://github.com/rust-lang/rfcs/blob/master/text/1122-language-semver.md). -It is per this RFC redefined to mean the version of the crate (or rust -distribution, for `std`) as declared in `Cargo.toml`. 
- -The `issue` field of the `#[`(`un`)`stable]` attributes is defined per this RFC -to mean the suffix to a URL to the issue tracker (it may also be the complete -URL). Optionally rustdoc may link the issue from the documentation; a new -`--issue-tracker-url-prefix=...` option will be prefixed to all links. - -The `feature` field is defined to contain a feature name to be used with -`cfg(feature = "...")`. To check this, Cargo could put the list of *available* -features in a space-separated `CRATE_FEATURES_AVAILABLE` environment variable. -Alternative build processes can also set this. To simplify things, putting a -`"*"` in the environment variable should disable the check. Otherwise the -stability lint can check if the feature has been declared. - -The `reason` field is defined to contain a human-readable text with a -suggestion what to use instead and a rationale. This is how the field is used -currently. See the Alternatives section for a less conservative (but more -work-intensive) proposal. - -The language reference should be extended to describe this feature. +Public API items (both plain `fn`s, methods, trait- and inherent +`impl`ementations as well as `const` definitions) can be given a `#[deprecate]` +attribute. + +This attribute *must* have the `since` field, which contains the version of the +crate that deprecated the item, as defined by Cargo.toml (thus following the +semver scheme). It makes no sense to put a version number higher than the +current newest version here, and this is not checked (but could be by external +lints, e.g. [rust-clippy](https://github.com/Manishearth/rust-clippy). + +Other optional fields are: + +* `reason` should contain a human-readable string outlining the reason for +deprecating the item. While this field is not required, library authors are +strongly advised to make use of it to convey the reason to users of their +library. 
The string is required to be plain unformatted text (for now) so that +rustdoc can include it in the item's documentation without messing up the +formatting. +* `surrogate` should be the full path to an API item that will replace the +functionality of the deprecated item, optionally (if the surrogate is in a +different crate) followed by `@` and either a crate name (so that +`https://crates.io/crates/` followed by the name is a live link) or the URL to +a repository or other location where a surrogate can be obtained. Links must be +plain FTP, FTPS, HTTP or HTTPS links. The intention is to allow rustdoc (and +possibly other tools in the future, e.g. IDEs) to act on the included +information. + +On use of a *deprecated* item, `rustc` should `warn` of the deprecation. Note +that during Cargo builds, warnings on dependencies get silenced. Note that +while this has the upside of keeping things tidy, it has a downside when it +comes to deprecation: + +Let's say I have my `llogiq` crate that depends on `foobar` which uses a +deprecated item of `serde`. I will never get the warning about this unless I +try to build `foobar` directly. We may want to create a service like `crater` +to warn on use of deprecated items in library crates, however this is outside +the scope of this RFC. + +`rustdoc` should show deprecation on items, with a `[deprecated since x.y.z]` +box that may optionally show the reason and/or link to the surrogate if +available. + +The language reference should be extended to describe this feature as outlined +in this RFC. Authors shall be advised to leave their users enough time to react +before *removing* a deprecated item. # Drawbacks -* Work to be done will take time not to invest in other improvements -* There could be attribute definitions in the codebase that do not adhere to -the outlined design, and would have to be changed to fit. 
It is unclear whether -this is a real drawback +* The required checks for the `since` and `surrogate` fields are potentially +quite complex. * Once the feature is public, we can no longer change its design -* Someone could misuse the API to e.g. add malicious links into their rustdoc. -However this is possible via plain links even now # Alternatives @@ -75,18 +84,13 @@ cargo in the CARGO_CRATE_VERSION environment variable (the rust build process should set this environment variable, too). This would allow future deprecations to be shown in the docs early, but not warned against by the stability lint (there could however be a `future-deprecation` lint that should -be `Allow` by default). -* The `reason` field definition could be reduced to stating the *rationale* for -deprecating the API item. A new `instead` field then contains the full path to -a replacement item (trait, method, function, etc.). Since this path is well -defined, it can be checked against. However, some provision needs to be made to -allow those paths to be extended with a crate (e.g. for items that have been -moved to different crates). The upside is that this would open up the -possibility for rustdoc to link to the replacement, the downside is that the -check could potentially be costly. +be `Allow` by default) +* `reason` could include markdown formatting +* The `surrogate` could simply be plain text, which would remove much of the +complexity here # Unresolved questions -* Is the current design (as outlined herein) good enough to be made public? * What other restrictions should we introduce now to avoid being bound to a possibly flawed design? +* Can / Should the `std` library make use of the `#[deprecate]` extensions? From 24429e17be2e3879cd214481db2d212a65cf399d Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 4 Sep 2015 15:09:31 -0400 Subject: [PATCH 0537/1195] Merge RFC #1229, compile time asserts. 
--- ...0-compile-time-asserts.md => 1229-compile-time-asserts.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-compile-time-asserts.md => 1229-compile-time-asserts.md} (95%) diff --git a/text/0000-compile-time-asserts.md b/text/1229-compile-time-asserts.md similarity index 95% rename from text/0000-compile-time-asserts.md rename to text/1229-compile-time-asserts.md index 72cbcfcc7d6..c15720e2d36 100644 --- a/text/0000-compile-time-asserts.md +++ b/text/1229-compile-time-asserts.md @@ -1,7 +1,7 @@ - Feature Name: compile_time_asserts - Start Date: 2015-07-30 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1229](https://github.com/rust-lang/rfcs/pull/1229) +- Rust Issue: [rust-lang/rust#28238](https://github.com/rust-lang/rust/issues/28238) # Summary From e487613345bbad2594121398621cae542864f7cb Mon Sep 17 00:00:00 2001 From: llogiq Date: Sun, 6 Sep 2015 08:24:41 +0200 Subject: [PATCH 0538/1195] Clarified API items+versioning, more alternatives --- text/0000-deprecation.md | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index 2f38d0f589c..b4af43eb389 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -25,14 +25,19 @@ interface to use while maximizing usefulness of the metadata introduced. # Detailed design Public API items (both plain `fn`s, methods, trait- and inherent -`impl`ementations as well as `const` definitions) can be given a `#[deprecate]` -attribute. +`impl`ementations as well as `const` definitions, type definitions, struct +fields and enum variants) can be given a `#[deprecate]` attribute. -This attribute *must* have the `since` field, which contains the version of the -crate that deprecated the item, as defined by Cargo.toml (thus following the -semver scheme). 
It makes no sense to put a version number higher than the -current newest version here, and this is not checked (but could be by external -lints, e.g. [rust-clippy](https://github.com/Manishearth/rust-clippy). +This attribute *must* have the `since` field, which contains at least the +version of the crate that deprecated the item, as defined by Cargo.toml +(thus following the semver scheme). It makes no sense to put a version number +higher than the current newest version here, and this is not checked (but +could be by external lints, e.g. +[rust-clippy](https://github.com/Manishearth/rust-clippy)). + +Following semantic versioning would mean that the supplied value could +actually be a version *range*, which could imply an end-of-life for the +feature. Other optional fields are: @@ -79,12 +84,14 @@ quite complex. # Alternatives * Do nothing +* Make the `since` field optional * Optionally the deprecation lint chould check the current version as set by cargo in the CARGO_CRATE_VERSION environment variable (the rust build process should set this environment variable, too). This would allow future deprecations to be shown in the docs early, but not warned against by the stability lint (there could however be a `future-deprecation` lint that should be `Allow` by default) +* Require either `reason` or `surrogate` to be present * `reason` could include markdown formatting * The `surrogate` could simply be plain text, which would remove much of the complexity here @@ -94,3 +101,4 @@ complexity here * What other restrictions should we introduce now to avoid being bound to a possibly flawed design? * Can / Should the `std` library make use of the `#[deprecate]` extensions? +* Bikeshedding: Are the names good enough?
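To make the attribute described in the patch above concrete, here is a minimal sketch in Rust. It is an illustration, not text from the RFC: the item names `area`, `area_of` and `legacy_caller` and the version `1.2.0` are hypothetical. The draft's `#[deprecate]` spelling with `reason` and `surrogate` fields appears only in a comment, because rustc does not accept it. The compiled code instead uses the `#[deprecated]` attribute with `since` and `note` from current stable Rust, which is an assumption about the reader's toolchain and not part of this draft.

```rust
// Sketch only. The spelling proposed in this draft, which rustc does not
// accept, would look roughly like this:
//
//     #[deprecate(since = "1.2.0",
//                 reason = "superseded by a clearer name",
//                 surrogate = "area_of")]
//
// Current stable Rust spells the attribute `#[deprecated]` and takes `since`
// and `note`. It has no `surrogate` field, so the replacement is named in the
// note text instead.

/// Hypothetical replacement item.
pub fn area_of(width: u32, height: u32) -> u32 {
    width * height
}

/// Hypothetical deprecated item, kept so that existing callers still compile.
#[deprecated(since = "1.2.0", note = "use `area_of` instead")]
pub fn area(width: u32, height: u32) -> u32 {
    area_of(width, height)
}

/// A caller that has not migrated yet. Without the `allow`, rustc would emit
/// a deprecation warning (not an error) for the call to `area`.
#[allow(deprecated)]
pub fn legacy_caller() -> u32 {
    area(3, 4)
}
```

Calling `legacy_caller()` from a `main` function or a test exercises the deprecated path without a warning. Removing the `#[allow(deprecated)]` line shows the warning the RFC wants library users to see.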
From 2112969c6fb7e8ff88a330e5accc4a92b7fa4ef7 Mon Sep 17 00:00:00 2001 From: Hunan Rostomyan Date: Sun, 6 Sep 2015 00:20:55 -0700 Subject: [PATCH 0539/1195] Fix minor typos, capitalize for consistency --- text/0246-const-vs-static.md | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/text/0246-const-vs-static.md b/text/0246-const-vs-static.md index 89c4bdc8339..9daf3df619b 100644 --- a/text/0246-const-vs-static.md +++ b/text/0246-const-vs-static.md @@ -18,12 +18,12 @@ Divide global declarations into two categories: # Motivation We have been wrestling with the best way to represent globals for some -times. There are number of interrelated issues: +time. There are a number of interrelated issues: - *Significant addresses and inlining:* For optimization purposes, it is useful to be able to inline constant values directly into the program. It is even more useful if those constant values do not have - a known address, because that means the compiler is free to replicate + known addresses, because that means the compiler is free to replicate them as it wishes. Moreover, if a constant is inlined into downstream crates, then they must be recompiled whenever that constant changes. - *Read-only memory:* Whenever possible, we'd like to place large @@ -32,18 +32,18 @@ times. There are number of interrelated issues: - *Global atomic counters and the like:* We'd like to make it possible for people to create global locks or atomic counters that can be used without resorting to unsafe code. -- *Interfacing with C code:* some C libraries require the use of +- *Interfacing with C code:* Some C libraries require the use of global, mutable data. Other times it's just convenient and threading is not a concern.
-- *Initializer constants:* there must be a way to have initializer +- *Initializer constants:* There must be a way to have initializer constants for things like locks and atomic counters, so that people can write `static MY_COUNTER: AtomicUint = INIT_ZERO` or some such. It should not be possible to modify these initializer constants. The current design is that we have only one keyword, `static`, which -declares a global variable. By default, global variables do not have a -significant address and can be inlined into the program. You can make +declares a global variable. By default, global variables do not have +significant addresses and can be inlined into the program. You can make a global variable have a *significant* address by marking it `#[inline(never)]`. Furthermore, you can declare a mutable global using `static mut`: all accesses to `static mut` variables are @@ -56,8 +56,8 @@ Some concrete problems with this design are: - There is no way to have a safe global counter or lock. Those must be placed in `static mut` variables, which means that access to them is - illegal. To resolve this, there is an alternative proposal which - makes access to `static mut` be considered safe if the type of the + illegal. To resolve this, there is an alternative proposal, according + to which, access to `static mut` is considered safe if the type of the static mut meets the `Sync` trait. - The significance (no pun intended) of the `#[inline(never)]` annotation is not intuitive. @@ -68,14 +68,14 @@ Other less practical and more aesthetic concerns are: - Although `static` and `let` look and feel analogous, the two behave quite differently. Generally speaking, `static` declarations do not declare variables but rather values, which can be inlined and which - do not have a fixed address. You cannot have interior mutability in + do not have fixed addresses. You cannot have interior mutability in a `static` variable, but you can in a `let`. 
So that `static` variables can appear in patterns, it is illegal to shadow a `static` variable -- but `let` variables cannot appear in patterns. Etc. - There are other constructs in the language, such as nullary enum variants and nullary structs, which look like global data but in fact act quite differently. They are actual values which do not have - a address. They are categorized as rvalues and so forth. + addresses. They are categorized as rvalues and so forth. # Detailed design @@ -88,8 +88,8 @@ Reintroduce a `const` declaration which declares a *constant*: Constants may be declared in any scope. They cannot be shadowed. Constants are considered rvalues. Therefore, taking the address of a constant actually creates a spot on the local stack -- they by -definition have no significant address. Constants are intended to -behave exactly like a nullary enum variant. +definition have no significant addresses. Constants are intended to +behave exactly like nullary enum variants. ### Possible extension: Generic constants @@ -122,7 +122,7 @@ among other things. ## Static variables Repurpose the `static` declaration to declare static variables -only. Static variables always have a single address. `static` +only. Static variables always have single addresses. `static` variables can optionally be declared as `mut`. The lifetime of a `static` variable is `'static`. It is not legal to move from a static. Accesses to a static variable generate actual reads and writes: the @@ -162,7 +162,7 @@ compiler will reinterpret this as if it were written as: Here a `static` is introduced to be able to give the `const` a pointer which does indeed have the `'static` lifetime. Due to this rewriting, the compiler will disallow `SomeStruct` from containing an `UnsafeCell` (interior -mutability). In general a constant A cannot reference the address of another +mutability). In general, a constant A cannot reference the address of another constant B if B contains an `UnsafeCell` in its interior. 
### const => static From 0aaf9d45e055c16bba13b73ab2695f2c71a038ea Mon Sep 17 00:00:00 2001 From: llogiq Date: Tue, 8 Sep 2015 22:00:47 +0200 Subject: [PATCH 0540/1195] Require since to be exact version, add `d` --- text/0000-deprecation.md | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index b4af43eb389..731598af069 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -5,7 +5,7 @@ # Summary -This RFC proposes to allow library authors to use a `#[deprecate]` attribute, +This RFC proposes to allow library authors to use a `#[deprecated]` attribute, with `since="`(version)`"`, `reason="`(free text)`"` and `surrogate="`(text or surrogate declaration)`"` fields. The compiler can then warn on deprecated items, while `rustdoc` can document their deprecation @@ -26,18 +26,17 @@ interface to use while maximizing usefulness of the metadata introduced. Public API items (both plain `fn`s, methods, trait- and inherent `impl`ementations as well as `const` definitions, type definitions, struct -fields and enum variants) can be given a `#[deprecate]` attribute. +fields and enum variants) can be given a `#[deprecated]` attribute. -This attribute *must* have the `since` field, which contains at least the +This attribute *must* have the `since` field, which contains the exact version of the crate that deprecated the item, as defined by Cargo.toml (thus following the semver scheme). It makes no sense to put a version number higher than the current newest version here, and this is not checked (but could be by external lints, e.g. [rust-clippy](https://github.com/Manishearth/rust-clippy). -Following semantic versioning would mean that the supplied value could -actually be a version *range*, which could imply an end-of-life for the -feature. +It is required that the version be fully specified (e.g. no wildcards or +ranges). 
Other optional fields are: @@ -54,7 +53,7 @@ different crate) followed by `@` and either a crate name (so that a repository or other location where a surrogate can be obtained. Links must be plain FTP, FTPS, HTTP or HTTPS links. The intention is to allow rustdoc (and possibly other tools in the future, e.g. IDEs) to act on the included -information. +information. The `surrogate` field can have multiple values. On use of a *deprecated* item, `rustc` should `warn` of the deprecation. Note that during Cargo builds, warnings on dependencies get silenced. Note that @@ -100,5 +99,7 @@ complexity here * What other restrictions should we introduce now to avoid being bound to a possibly flawed design? -* Can / Should the `std` library make use of the `#[deprecate]` extensions? +* How should the multiple values in the `surrogate` field work? Just split by +some delimiter? +* Can / Should the `std` library make use of the `#[deprecated]` extensions? * Bikeshedding: Are the names good enough? From 449ec5330079eda27556a71b88c0f55eee8247ba Mon Sep 17 00:00:00 2001 From: llogiq Date: Wed, 9 Sep 2015 07:50:58 +0200 Subject: [PATCH 0541/1195] Incorporated chris-morgan's suggestion --- text/0000-deprecation.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index 731598af069..97b0d1f6832 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -6,8 +6,8 @@ # Summary This RFC proposes to allow library authors to use a `#[deprecated]` attribute, -with `since="`(version)`"`, `reason="`(free text)`"` and -`surrogate="`(text or surrogate declaration)`"` fields. The compiler can then +with `since = "`*version*`"`, `reason = "`*free text*`"` and +`surrogate = "`*text or surrogate declaration*`"` fields. The compiler can then warn on deprecated items, while `rustdoc` can document their deprecation accordingly. 
From 0c82b6bf272259df14ac561d750bc2ca184b5e1d Mon Sep 17 00:00:00 2001 From: llogiq Date: Wed, 9 Sep 2015 08:16:13 +0200 Subject: [PATCH 0542/1195] More alternatives, relation to internal feature --- text/0000-deprecation.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index 97b0d1f6832..d2d751c2fa4 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -74,6 +74,9 @@ The language reference should be extended to describe this feature as outlined in this RFC. Authors shall be advised to leave their users enough time to react before *removing* a deprecated item. +The internally used feature can either be subsumed by this or possibly renamed +to avoid a name clash. + # Drawbacks * The required checks for the `since` and `surrogate` fields are potentially @@ -94,6 +97,8 @@ be `Allow` by default) * `reason` could include markdown formatting * The `surrogate` could simply be plain text, which would remove much of the complexity here +* The `surrogate` field could be left out and added later. However, this would +lead people to describe it in the `reason` field # Unresolved questions From feec81762ec2ddb6c5486462723cc84db2f2abb8 Mon Sep 17 00:00:00 2001 From: llogiq Date: Wed, 16 Sep 2015 17:11:16 +0200 Subject: [PATCH 0543/1195] Rename "surrogate" to "use", "since" now optional Also added an alternative to "use" which resolves more cleverly. --- text/0000-deprecation.md | 57 ++++++++++++++++++++-------------------- 1 file changed, 29 insertions(+), 28 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index d2d751c2fa4..3ef33765724 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -6,8 +6,8 @@ # Summary This RFC proposes to allow library authors to use a `#[deprecated]` attribute, -with `since = "`*version*`"`, `reason = "`*free text*`"` and -`surrogate = "`*text or surrogate declaration*`"` fields. 
The compiler can then +with optional `since = "`*version*`"`, `reason = "`*free text*`"` and +`use = "`*substitute declaration*`"` fields. The compiler can then warn on deprecated items, while `rustdoc` can document their deprecation accordingly. @@ -26,34 +26,30 @@ interface to use while maximizing usefulness of the metadata introduced. Public API items (both plain `fn`s, methods, trait- and inherent `impl`ementations as well as `const` definitions, type definitions, struct -fields and enum variants) can be given a `#[deprecated]` attribute. - -This attribute *must* have the `since` field, which contains the exact -version of the crate that deprecated the item, as defined by Cargo.toml -(thus following the semver scheme). It makes no sense to put a version number -higher than the current newest version here, and this is not checked (but -could be by external lints, e.g. -[rust-clippy](https://github.com/Manishearth/rust-clippy). - -It is required that the version be fully specified (e.g. no wildcards or -ranges). - -Other optional fields are: - +fields and enum variants) can be given a `#[deprecated]` attribute. All +possible fields are optional: + +* `since` is defined to contain the exact version of the crate that +deprecated the item, as defined by Cargo.toml (thus following the semver +scheme). It makes no sense to put a version number higher than the current +newest version here, and this is not checked (but could be by external +lints, e.g. [rust-clippy](https://github.com/Manishearth/rust-clippy). +To maximize usefulness, the version should be fully specified (e.g. no +wildcards or ranges). * `reason` should contain a human-readable string outlining the reason for deprecating the item. While this field is not required, library authors are strongly advised to make use of it to convey the reason to users of their library. 
The string is required to be plain unformatted text (for now) so that rustdoc can include it in the item's documentation without messing up the formatting. -* `surrogate` should be the full path to an API item that will replace the -functionality of the deprecated item, optionally (if the surrogate is in a +* `use` should be the full path to an API item that will replace the +functionality of the deprecated item, optionally (if the replacement is in a different crate) followed by `@` and either a crate name (so that `https://crates.io/crates/` followed by the name is a live link) or the URL to a repository or other location where a surrogate can be obtained. Links must be plain FTP, FTPS, HTTP or HTTPS links. The intention is to allow rustdoc (and possibly other tools in the future, e.g. IDEs) to act on the included -information. The `surrogate` field can have multiple values. +information. The `use` field can have multiple values. On use of a *deprecated* item, `rustc` should `warn` of the deprecation. Note that during Cargo builds, warnings on dependencies get silenced. Note that @@ -67,7 +63,7 @@ to warn on use of deprecated items in library crates, however this is outside the scope of this RFC. `rustdoc` should show deprecation on items, with a `[deprecated since x.y.z]` -box that may optionally show the reason and/or link to the surrogate if +box that may optionally show the reason and/or link to the replacement if available. The language reference should be extended to describe this feature as outlined @@ -79,32 +75,37 @@ to avoid a name clash. # Drawbacks -* The required checks for the `since` and `surrogate` fields are potentially +* The required checks for the `since` and `use` fields are potentially quite complex. 
* Once the feature is public, we can no longer change its design # Alternatives * Do nothing -* make the `since` field optional +* make the `since` field required and check that it's a single version * Optionally the deprecation lint chould check the current version as set by cargo in the CARGO_CRATE_VERSION environment variable (the rust build process should set this environment variable, too). This would allow future deprecations to be shown in the docs early, but not warned against by the stability lint (there could however be a `future-deprecation` lint that should be `Allow` by default) -* require either `reason` or `surrogate` be present +* require either `reason` or `use` be present * `reason` could include markdown formatting -* The `surrogate` could simply be plain text, which would remove much of the +* The `use` could simply be plain text, which would remove much of the complexity here -* The `surrogate` field could be left out and added later. However, this would -lead people to describe it in the `reason` field +* The `use` field contents could make use of the context in finding +replacements, e.g. extern crates, so that `time::precise_time_ns` would resolve +to the `time::precise_time_ns` API in the `time` crate, provided an +`extern crate time;` declaration is present +* The `use` field could be left out and added later. However, this would +lead people to describe a replacement in the `reason` field, as is already +happening in the case of rustc-private deprecation # Unresolved questions * What other restrictions should we introduce now to avoid being bound to a possibly flawed design? -* How should the multiple values in the `surrogate` field work? Just split by -some delimiter? +* How should the multiple values in the `use` field work? Just split by +comma or some other delimiter? * Can / Should the `std` library make use of the `#[deprecated]` extensions? * Bikeshedding: Are the names good enough? 
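As a rough illustration of the attribute discussed in the patch above, here is a minimal Rust sketch. It assumes the attribute form that rustc accepts today, which takes `since` and `note`; the draft's `reason` field maps onto `note`, and the draft's `use` field has no direct counterpart, so the replacement is simply named in the note text. The item names `old_area` and `area` are hypothetical and do not come from the RFC.

```rust
// Hypothetical replacement item that callers should migrate to.
pub fn area(width: u32, height: u32) -> u32 {
    width * height
}

// `since` carries a fully specified version (no wildcards or ranges),
// as the draft recommends. `note` plays the role of the draft's `reason`.
#[deprecated(since = "0.2.0", note = "use `area` instead")]
pub fn old_area(width: u32, height: u32) -> u32 {
    area(width, height)
}

// Calling a deprecated item triggers the `deprecated` lint, which warns
// by default; `allow(deprecated)` silences it for this one function.
#[allow(deprecated)]
pub fn demo() -> u32 {
    old_area(3, 4)
}
```

Because deprecation is a lint rather than an error, existing callers keep compiling and only see a warning, which matches the draft's advice to give users time to react before an item is removed.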
From 10d7e82fa2ed97bea4a1198e9fb9accc9cd4999a Mon Sep 17 00:00:00 2001 From: Steven Fackler Date: Wed, 16 Sep 2015 21:47:29 -0700 Subject: [PATCH 0544/1195] Update to only block * constraints Also update to a warning-first rollout --- text/0000-no-wildcard-deps.md | 68 ++++++++++++++++++++++------------- 1 file changed, 43 insertions(+), 25 deletions(-) diff --git a/text/0000-no-wildcard-deps.md b/text/0000-no-wildcard-deps.md index cd87b43f56e..80c3b7ed84c 100644 --- a/text/0000-no-wildcard-deps.md +++ b/text/0000-no-wildcard-deps.md @@ -11,7 +11,7 @@ constraints range from accepting exactly one version (`=1.2.3`), to accepting a range of versions (`^1.2.3`, `~1.2.3`, `>= 1.2.3, < 3.0.0`), to accepting any version at all (`*`). This RFC proposes to update crates.io to reject publishes of crates that have compile or build dependencies with -version constraints that have no upper bound. +a wildcard version constraint. # Motivation @@ -40,10 +40,13 @@ guarantees have on consumers of libraries. As an example, consider the [openssl](https://crates.io/crates/openssl) crate. It is one of the most popular libraries on crates.io, with several hundred downloads every day. 50% of the [libraries that depend on it](https://crates.io/crates/openssl/reverse_dependencies) -have a wildcard constraint on the version. Almost all of them them will fail -to compile against version 0.7 of openssl when it is released. When that -happens, users of those libraries will be forced to manually override Cargo's -version selection every time it is recalculated. This is not a fun time. +have a wildcard constraint on the version. None of them can build against every +version that has ever been released. Indeed, no libraries can since many of +those releases came before Rust 1.0 was released. In addition, almost all of
+When that happens, users of those libraries will be forced to manually override +Cargo's version selection every time it is recalculated. This is not a fun +time. Bad version restrictions are also "viral". Even if a developer is careful to pick dependencies that have reasonable version restrictions, there could be a @@ -77,37 +80,52 @@ build dependencies, but not to dev dependencies. Dev dependencies are only used when testing a crate, so it doesn't matter to downstream consumers if they break. -# Detailed design - -Alter crates.io's pre-publish behavior to check the version constraints of all -compile and build dependencies, and reject those that have no upper bound. For -example, these would be rejected: +This RFC is not trying to prohibit *all* constraints that would run into the +issues described above. For example, the constraint `>= 0.0.0` is exactly +equivalent to `*`. This is for a couple of reasons: + +* It's not totally clear how to precisely define "reasonable" constraints. For +example, one might want to forbid constraints that allow unreleased major +versions. However, some crates provide strong guarantees that any breaks will +be followed by one full major version of deprecation. If a library author is +sure that their crate doesn't use any deprecated functionality of that kind of +dependency, it's completely safe and reasonable to explicitly extend the +version constraint to include the next unreleased version. +* Cargo and crates.io are missing tools to deal with overly-restrictive +constraints. For example, it's not currently possible to force Cargo to allow +dependency resolution that violates version constraints. Without this kind of +support, it is somewhat risky to push too hard towards tight version +constraints. +* Wildcard constraints are popular, at least in part, because they are the +path of least resistance when writing a crate. 
Without wildcard constraints, +crate authors will be forced to figure out what kind of constraints make the +most sense in their use cases, which may very well be good enough. - * `*` - * `> 0.3` - * `>= 0.3` +# Detailed design -While these would not: +The prohibition on wildcard constraints will be rolled out in stages to make +sure that crate authors have lead time to figure out their versioning stories. - * `>= 0.3, < 0.5` - * `^0.3` - * `~0.3` - * `=0.3.1` +In the next stable Rust release (1.4), Cargo will issue warnings for all +wildcard constraints on build and compile dependencies when publishing, but +publishes with those constraints will still succeed. Alongside the next stable +release after that (1.5 on December 11th, 2015), crates.io will be updated to reject +publishes of crates with those kinds of dependency constraints. Note that the +check will happen on the crates.io side rather than on the Cargo side since +Cargo can publish to locations other than crates.io which may not worry about +these restrictions. # Drawbacks The barrier to entry when publishing a crate will be mildly higher. -In theory, there could be contexts where an unbounded version constraint is -actually appropriate? +Tightening constraints has the potential to cause resolution breakage when no +breakage would occur otherwise. # Alternatives We could continue allowing these kinds of constraints, but complain in a "sufficiently annoying" manner during publishes to discourage their use. -# Unresolved questions - -Should crates.io also forbid constraints that reference versions of -dependencies that don't yet exist? For example, a constraint of `>= 0.3, < 0.5` -where the dependency has no published versions in the `0.4` range. +This RFC originally proposed forbidding all constraints that had no upper +version bound but has since been pulled back to just `*` constraints. 
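The staged rollout described in the patch above (warn first, reject later, dev dependencies exempt, only the bare `*` targeted) can be sketched roughly as follows. This is not the actual Cargo or crates.io code; `Stage`, `Outcome` and `check_constraint` are hypothetical names, and the sketch assumes the constraint arrives as a plain string.

```rust
/// Rollout stage: the warning-only period, then enforcement.
#[derive(Clone, Copy)]
pub enum Stage {
    WarnOnly,
    Enforce,
}

/// Result of checking a single dependency constraint at publish time.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Accepted,
    Warned,
    Rejected,
}

/// Hypothetical publish-time check for one dependency's version constraint.
pub fn check_constraint(req: &str, is_dev_dependency: bool, stage: Stage) -> Outcome {
    // Dev dependencies are exempt, and only the bare `*` counts as a
    // wildcard here; an equivalent constraint such as `>= 0.0.0` is
    // deliberately let through, as the revised RFC text explains.
    if is_dev_dependency || req.trim() != "*" {
        return Outcome::Accepted;
    }
    match stage {
        Stage::WarnOnly => Outcome::Warned,
        Stage::Enforce => Outcome::Rejected,
    }
}
```

Keeping the stage as an explicit parameter mirrors the RFC's split between the warning phase on the Cargo side and the later rejection on the crates.io side.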
From 7ea2b4dd4a0640163603a8f9ff1ec4b7facec110 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 18 Sep 2015 15:01:44 -0400 Subject: [PATCH 0545/1195] Move to proper place, link to tracking issue etc. --- ...000-simd-infrastructure.md => 1199-simd-infrastructure.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-simd-infrastructure.md => 1199-simd-infrastructure.md} (99%) diff --git a/text/0000-simd-infrastructure.md b/text/1199-simd-infrastructure.md similarity index 99% rename from text/0000-simd-infrastructure.md rename to text/1199-simd-infrastructure.md index 4c8974fdfce..aa71c3b4665 100644 --- a/text/0000-simd-infrastructure.md +++ b/text/1199-simd-infrastructure.md @@ -1,7 +1,7 @@ - Feature Name: repr_simd, platform_intrinsics, cfg_target_feature - Start Date: 2015-06-02 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1199 +- Rust Issue: https://github.com/rust-lang/rust/issues/27731 # Summary From a4697d75c28ba89948b6b52816348cf7c5c883d7 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 18 Sep 2015 15:08:28 -0400 Subject: [PATCH 0546/1195] Rename RFC #1238 and add a note to RFC #769. --- text/0769-sound-generic-drop.md | 5 +++++ ...-nonparametric-dropck.md => 1238-nonparametric-dropck.md} | 4 ++-- 2 files changed, 7 insertions(+), 2 deletions(-) rename text/{0000-nonparametric-dropck.md => 1238-nonparametric-dropck.md} (99%) diff --git a/text/0769-sound-generic-drop.md b/text/0769-sound-generic-drop.md index 1cf9947beb0..1842124f203 100644 --- a/text/0769-sound-generic-drop.md +++ b/text/0769-sound-generic-drop.md @@ -2,6 +2,11 @@ - RFC PR: [rust-lang/rfcs#769](https://github.com/rust-lang/rfcs/pull/769) - Rust Issue: [rust-lang/rust#8861](https://github.com/rust-lang/rust/issues/8861) +# History + +2015.09.18 -- This RFC was partially superseded by RFC 1238, which +removed the parametricity-based reasoning in favor of an attribute. 
+ # Summary Remove `#[unsafe_destructor]` from the Rust language. Make it safe diff --git a/text/0000-nonparametric-dropck.md b/text/1238-nonparametric-dropck.md similarity index 99% rename from text/0000-nonparametric-dropck.md rename to text/1238-nonparametric-dropck.md index aab682378ba..a79ead95e2a 100644 --- a/text/0000-nonparametric-dropck.md +++ b/text/1238-nonparametric-dropck.md @@ -1,7 +1,7 @@ - Feature Name: dropck_parametricity - Start Date: 2015-08-05 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1238/ +- Rust Issue: https://github.com/rust-lang/rust/issues/28498 # Summary From d8321d80778308d12beb79bb9a312e099255e019 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 18 Sep 2015 15:11:29 -0400 Subject: [PATCH 0547/1195] Accept and link RFC #1240. --- ...pr-packed-unsafe-ref.md => 1240-repr-packed-unsafe-ref.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-repr-packed-unsafe-ref.md => 1240-repr-packed-unsafe-ref.md} (99%) diff --git a/text/0000-repr-packed-unsafe-ref.md b/text/1240-repr-packed-unsafe-ref.md similarity index 99% rename from text/0000-repr-packed-unsafe-ref.md rename to text/1240-repr-packed-unsafe-ref.md index 473c7a991e1..6ac5b341974 100644 --- a/text/0000-repr-packed-unsafe-ref.md +++ b/text/1240-repr-packed-unsafe-ref.md @@ -1,7 +1,7 @@ - Feature Name: NA - Start Date: 2015-08-06 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1240 +- Rust Issue: https://github.com/rust-lang/rust/issues/27060 # Summary From 7fc338e5b4cdc46678cbcd20bbd2871f0ef34367 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 18 Sep 2015 15:13:24 -0400 Subject: [PATCH 0548/1195] Merge branch 'master' of github.com:rust-lang/rfcs --- README.md | 171 +++++++++++++++++++--------------- compiler_changes.md | 52 +++++++++++ lang_changes.md | 36 +++++++ libs_changes.md | 114 
+++++++++++++++++++++++ text/0560-integer-overflow.md | 31 +++--- 5 files changed, 313 insertions(+), 91 deletions(-) create mode 100644 compiler_changes.md create mode 100644 lang_changes.md create mode 100644 libs_changes.md diff --git a/README.md b/README.md index 96d1a1774d1..9620754f5ea 100644 --- a/README.md +++ b/README.md @@ -8,7 +8,7 @@ implemented and reviewed via the normal GitHub pull request workflow. Some changes though are "substantial", and we ask that these be put through a bit of a design process and produce a consensus among the Rust -community and the [core team]. +community and the [sub-team]s. The "RFC" (request for comments) process is intended to provide a consistent and controlled path for new features to enter the language @@ -80,19 +80,21 @@ the direction the language is evolving in. * [RFC Postponement] * [Help this is all too informal!] + ## When you need to follow this process [When you need to follow this process]: #when-you-need-to-follow-this-process -You need to follow this process if you intend to make "substantial" -changes to Rust, Cargo, Crates.io, or the RFC process itself. What constitutes -a "substantial" change is evolving based on community norms, but may include -the following. +You need to follow this process if you intend to make "substantial" changes to +Rust, Cargo, Crates.io, or the RFC process itself. What constitutes a +"substantial" change is evolving based on community norms and varies depending +on what part of the ecosystem you are proposing to change, but may include the +following. - Any semantic or syntactic change to the language that is not a bugfix. - Removing language features, including those that are feature-gated. - - Changes to the interface between the compiler and libraries, -including lang items and intrinsics. - - Additions to `std` + - Changes to the interface between the compiler and libraries, including lang + items and intrinsics. + - Additions to `std`. 
Some changes do not require an RFC: @@ -108,6 +110,15 @@ If you submit a pull request to implement a new feature without going through the RFC process, it may be closed with a polite request to submit an RFC first. +For more details on when an RFC is required, please see the following specific +guidelines, these correspond with some of the Rust community's +[sub-teams](http://www.rust-lang.org/team.html): + +* [language changes](lang_changes.md), +* [library changes](libs_changes.md), +* [compiler changes](compiler_changes.md). + + ## Before creating an RFC [Before creating an RFC]: #before-creating-an-rfc @@ -130,12 +141,12 @@ on the [RFC issue tracker][issues], and occasionally posting review. As a rule of thumb, receiving encouraging feedback from long-standing -project developers, and particularly members of the [core team][core] +project developers, and particularly members of the relevant [sub-team] is a good indication that the RFC is worth pursuing. [issues]: https://github.com/rust-lang/rfcs/issues [discuss]: http://discuss.rust-lang.org/ -[core]: https://github.com/rust-lang/rust/wiki/Note-core-team + ## What the process is [What the process is]: #what-the-process-is @@ -146,49 +157,57 @@ is 'active' and may be implemented with the goal of eventual inclusion into Rust. * Fork the RFC repo http://github.com/rust-lang/rfcs -* Copy `0000-template.md` to `text/0000-my-feature.md` (where -'my-feature' is descriptive. don't assign an RFC number yet). -* Fill in the RFC. Put care into the details: RFCs that do not -present convincing motivation, demonstrate understanding of the -impact of the design, or are disingenuous about the drawbacks or -alternatives tend to be poorly-received. -* Submit a pull request. As a pull request the RFC will receive design -feedback from the larger community, and the author should be prepared -to revise it in response. 
-* During Rust triage, the pull request will either be closed (for RFCs -that clearly will not be accepted) or assigned a *shepherd*. The -shepherd is a trusted developer who is familiar with the process, who -will help to move the RFC forward, and ensure that the right people -see and review it. -* Build consensus and integrate feedback. RFCs that have broad support -are much more likely to make progress than those that don't receive -any comments. The shepherd assigned to your RFC should help you get -feedback from Rust developers as well. +* Copy `0000-template.md` to `text/0000-my-feature.md` (where 'my-feature' is +descriptive. don't assign an RFC number yet). +* Fill in the RFC. Put care into the details: RFCs that do not present +convincing motivation, demonstrate understanding of the impact of the design, or +are disingenuous about the drawbacks or alternatives tend to be poorly-received. +* Submit a pull request. As a pull request the RFC will receive design feedback +from the larger community, and the author should be prepared to revise it in +response. +* Each pull request will be labeled with the most relevant [sub-team]. +* Each sub-team triages its RFC PRs. The sub-team will either close the PR +(for RFCs that clearly will not be accepted) or assign it a *shepherd*. The +shepherd is a trusted developer who is familiar with the RFC process, who will +help to move the RFC forward, and ensure that the right people see and review +it. +* Build consensus and integrate feedback. RFCs that have broad support are much +more likely to make progress than those that don't receive any comments. The +shepherd assigned to your RFC should help you get feedback from Rust developers +as well. * The shepherd may schedule meetings with the author and/or relevant -stakeholders to discuss the issues in greater detail, and in some -cases the topic may be discussed at the larger [weekly meeting]. 
In -either case a summary from the meeting will be posted back to the RFC -pull request. -* Once both proponents and opponents have clarified and defended -positions and the conversation has settled, the shepherd will take it -to the [core team] for a final decision. -* Eventually, someone from the [core team] will either accept the RFC -by merging the pull request, assigning the RFC a number (corresponding -to the pull request number), at which point the RFC is 'active', or -reject it by closing the pull request. +stakeholders to discuss the issues in greater detail. +* The sub-team will discuss the RFC PR, as much as possible in the comment +thread of the PR itself. Offline discussion will be summarized on the PR comment +thread. +* Once both proponents and opponents have clarified and defended positions and +the conversation has settled, the RFC will enter its *final comment period* +(FCP). This is a final opportunity for the community to comment on the PR and is +a reminder for all members of the sub-team to be aware of the RFC. +* The FCP lasts one week. It may be extended if consensus between sub-team +members cannot be reached. At the end of the FCP, the [sub-team] will either +accept the RFC by merging the pull request, assigning the RFC a number +(corresponding to the pull request number), at which point the RFC is 'active', +or reject it by closing the pull request. How exactly the sub-team decide on an +RFC is up to the sub-team. + ## The role of the shepherd [The role of the shepherd]: #the-role-of-the-shepherd -During triage, every RFC will either be closed or assigned a shepherd. -The role of the shepherd is to move the RFC through the process. This -starts with simply reading the RFC in detail and providing initial -feedback. The shepherd should also solicit feedback from people who -are likely to have strong opinions about the RFC. 
Finally, when this -feedback has been incorporated and the RFC seems to be in a steady -state, the shepherd will bring it to the meeting. In general, the idea -here is to "front-load" as much of the feedback as possible before the -point where we actually reach a decision. +During triage, every RFC will either be closed or assigned a shepherd from the +relevant sub-team. The role of the shepherd is to move the RFC through the +process. This starts with simply reading the RFC in detail and providing initial +feedback. The shepherd should also solicit feedback from people who are likely +to have strong opinions about the RFC. When this feedback has been incorporated +and the RFC seems to be in a steady state, the shepherd and/or sub-team leader +will announce an FCP. In general, the idea here is to "front-load" as much of +the feedback as possible before the point where we actually reach a decision - +by the end of the FCP, the decision on whether or not to accept the RFC should +usually be obvious from the RFC discussion thread. On occasion, there may not be +consensus but discussion has stalled. In this case, the relevant team will make +a decision. + ## The RFC life-cycle [The RFC life-cycle]: #the-rfc-life-cycle @@ -210,35 +229,36 @@ through to completion: authors should not expect that other project developers will take on responsibility for implementing their accepted feature. -Modifications to active RFC's can be done in followup PR's. We strive +Modifications to active RFC's can be done in follow-up PR's. We strive to write each RFC in a manner that it will reflect the final design of the feature; but the nature of the process means that we cannot expect every merged RFC to actually reflect what the end result will be at -the time of the next major release; therefore we try to keep each RFC -document somewhat in sync with the language feature as planned, -tracking such changes via followup pull requests to the document. +the time of the next major release. 
+ +In general, once accepted, RFCs should not be substantially changed. Only very +minor changes should be submitted as amendments. More substantial changes should +be new RFCs, with a note added to the original RFC. Exactly what counts as a +"very minor change" is up to the sub-team to decide. There are some more +specific guidelines in the sub-team RFC guidelines for the [language](lang_changes.md), +[libraries](libs_changes.md), and [compiler](compiler_changes.md). -An RFC that makes it through the entire process to implementation is -considered 'complete' and is moved to the 'complete' folder; an RFC -that fails after becoming active is 'inactive' and moves to the -'inactive' folder. ## Reviewing RFC's [Reviewing RFC's]: #reviewing-rfcs While the RFC PR is up, the shepherd may schedule meetings with the author and/or relevant stakeholders to discuss the issues in greater -detail, and in some cases the topic may be discussed at the larger -[weekly meeting]. In either case a summary from the meeting will be +detail, and in some cases the topic may be discussed at a sub-team +meeting. In either case a summary from the meeting will be posted back to the RFC pull request. -The core team makes final decisions about RFCs after the benefits and -drawbacks are well understood. These decisions can be made at any -time, but the core team will regularly issue decisions on at least a -weekly basis. When a decision is made, the RFC PR will either be -merged or closed, in either case with a comment describing the -rationale for the decision. The comment should largely be a summary of -discussion already on the comment thread. +A sub-team makes final decisions about RFCs after the benefits and drawbacks are +well understood. These decisions can be made at any time, but the sub-team will +regularly issue decisions. When a decision is made, the RFC PR will either be +merged or closed. 
In either case, if the reasoning is not clear from the +discussion in thread, the sub-team will add a comment describing the rationale +for the decision. + ## Implementing an RFC [Implementing an RFC]: #implementing-an-rfc @@ -248,7 +268,7 @@ implemented right away. Other accepted RFC's can represent features that can wait until some arbitrary developer feels like doing the work. Every accepted RFC has an associated issue tracking its implementation in the Rust repository; thus that associated issue can -be assigned a priority via the [triage process] that the team uses for +be assigned a priority via the triage process that the team uses for all issues in the Rust repository. The author of an RFC is not obligated to implement it. Of course, the @@ -259,15 +279,18 @@ If you are interested in working on the implementation for an 'active' RFC, but cannot determine if someone else is already working on it, feel free to ask (e.g. by leaving a comment on the associated issue). + ## RFC Postponement [RFC Postponement]: #rfc-postponement -Some RFC pull requests are tagged with the 'postponed' label when they -are closed (as part of the rejection process). An RFC closed with -“postponed” is marked as such because we want neither to think about -evaluating the proposal nor about implementing the described feature -until after the next major release, and we believe that we can afford -to wait until then to do so. +Some RFC pull requests are tagged with the 'postponed' label when they are +closed (as part of the rejection process). An RFC closed with “postponed” is +marked as such because we want neither to think about evaluating the proposal +nor about implementing the described feature until some time in the future, and +we believe that we can afford to wait until then to do so. Historically, +"postponed" was used to postpone features until after 1.0. Postponed PRs may be +re-opened when the time is right. 
We don't have any formal process for that, you +should ask members of the relevant sub-team. Usually an RFC pull request marked as “postponed” has already passed an informal first round of evaluation, namely the round of “do we @@ -285,6 +308,4 @@ present circumstances. As usual, we are trying to let the process be driven by consensus and community norms, not impose more structure than necessary. -[core team]: https://github.com/mozilla/rust/wiki/Note-core-team -[triage process]: https://github.com/rust-lang/rust/wiki/Note-development-policy#milestone-and-priority-nomination-and-triage -[weekly meeting]: https://github.com/rust-lang/meeting-minutes +[sub-team]: http://www.rust-lang.org/team.html diff --git a/compiler_changes.md b/compiler_changes.md new file mode 100644 index 00000000000..75137743041 --- /dev/null +++ b/compiler_changes.md @@ -0,0 +1,52 @@ +# RFC policy - the compiler + +We have not previously had an RFC system for compiler changes, so policy here is +likely to change as we get the hang of things. We don't want to slow down most +compiler development, but on the other hand we do want to do more design work +ahead of time on large additions and refactorings. + +Compiler RFCs will be managed by the compiler sub-team, and tagged `T-compiler`. +The compiler sub-team will do an initial triage of new PRs within a week of +submission. The result of triage will either be that the PR is assigned to a +member of the sub-team for shepherding, the PR is closed because the sub-team +believe it should be done without an RFC, or closed because the sub-team feel it +should clearly not be done and further discussion is not necessary. We'll follow +the standard procedure for shepherding, final comment period, etc. + +Where there is significant design work for the implementation of a language +feature, the preferred workflow is to submit two RFCs - one for the language +design and one for the implementation design. 
The implementation RFC may be +submitted later if there is scope for large changes to the language RFC. + + +## Changes which need an RFC + +* Large refactorings or redesigns of the compiler +* Changing the API presented to syntax extensions or other compiler plugins in + non-trivial ways +* Adding, removing, or changing a stable compiler flag +* The implementation of new language features where there is significant change + or addition to the compiler. There is obviously some room for interpretation + about what constitutes a "significant" change and how much detail the + implementation RFC needs. For guidance, [associated items](text/0195-associated-items.md) + and [UFCS](text/0132-ufcs.md) would clearly need an implementation RFC, + [type ascription](text/0803-type-ascription.md) and + [lifetime elision](text/0141-lifetime-elision.md) would not. +* Any other change which causes backwards incompatible changes to stable + behaviour of the compiler, language, or libraries + + +## Changes which don't need an RFC + +* Bug fixes, improved error messages, etc. +* Minor refactoring/tidying up +* Implementing language features which have an accepted RFC, where the + implementation does not significantly change the compiler or require + significant new design work +* Adding unstable API for tools (note that all compiler API is currently unstable) +* Adding, removing, or changing an unstable compiler flag (if the compiler flag + is widely used there should be at least some discussion on discuss, or an RFC + in some cases) + +If in doubt it is probably best to just announce the change you want to make to +the compiler subteam on discuss or IRC, and see if anyone feels it needs an RFC. diff --git a/lang_changes.md b/lang_changes.md new file mode 100644 index 00000000000..7e7e6a732e7 --- /dev/null +++ b/lang_changes.md @@ -0,0 +1,36 @@ +# RFC policy - language design + +Pretty much every change to the language needs an RFC.
+ +Language RFCs are managed by the language sub-team, and tagged `T-lang`. The +language sub-team will do an initial triage of new PRs within a week of +submission. The result of triage will either be that the PR is assigned to a +member of the sub-team for shepherding, the PR is closed as postponed because +the subteam believe it might be a good idea, but is not currently aligned with +Rust's priorities, or the PR is closed because the sub-team feel it should +clearly not be done and further discussion is not necessary. In the latter two +cases, the sub-team will give a detailed explanation. We'll follow the standard +procedure for shepherding, final comment period, etc. + + +## Amendments + +Sometimes in the implementation of an RFC, changes are required. In general +these don't require an RFC as long as they are very minor and in the spirit of +the accepted RFC (essentially bug fixes). In this case implementers should +submit an RFC PR which amends the accepted RFC with the new details. Although +the RFC repository is not intended as a reference manual, it is preferred that +RFCs do reflect what was actually implemented. Amendment RFCs will go through +the same process as regular RFCs, but should be less controversial and thus +should move more quickly. + +When a change is more dramatic, it is better to create a new RFC. The RFC should +be standalone and reference the original, rather than modifying the existing +RFC. You should add a comment to the original RFC referencing the new RFC +as part of the PR. + +Obviously there is some scope for judgment here. As a guideline, if a change +affects more than one part of the RFC (i.e., is a non-local change), affects the +applicability of the RFC to its motivating use cases, or there are multiple +possible new solutions, then the feature is probably not 'minor' and should get +a new RFC.
diff --git a/libs_changes.md b/libs_changes.md new file mode 100644 index 00000000000..31f1de0210d --- /dev/null +++ b/libs_changes.md @@ -0,0 +1,114 @@ +# RFC guidelines - libraries sub-team + +# Motivation + +* RFCs are heavyweight: + * RFCs generally take at minimum 2 weeks from posting to land. In + practice it can be more on the order of months for particularly + controversial changes. + * RFCs are a lot of effort to write; especially for non-native speakers or + for members of the community whose strengths are more technical than literary. + * RFCs may involve pre-RFCs and several rewrites to accommodate feedback. + * RFCs require a dedicated shepherd to herd the community and author towards + consensus. + * RFCs require review from a majority of the subteam, as well as an official + vote. + * RFCs can't be downgraded based on their complexity. Full process always applies. + Easy RFCs may certainly land faster, though. + * RFCs can be very abstract and hard to grok the consequences of (no implementation). + +* PRs are low *overhead* but potentially expensive nonetheless: + * Easy PRs can get insta-merged by any rust-lang contributor. + * Harder PRs can be easily escalated. You can ping subject-matter experts for second + opinions. Ping the whole team! + * Easier to grok the full consequences. Lots of tests and Crater to save the day. + * PRs can be accepted optimistically with bors, buildbot, and the trains to guard + us from major mistakes making it into stable. The size of the nightly community + at this point in time can still mean major community breakage regardless of trains, + however. + * HOWEVER: Big PRs can be a lot of work to make only to have that work rejected for + details that could have been hashed out first. + +* RFCs are *only* meaningful if a significant and diverse portion of the +community actively participates in them. 
The official teams are not +sufficiently diverse to establish meaningful community consensus by agreeing +amongst themselves. + +* If there are *tons* of RFCs -- especially trivial ones -- people are less +likely to engage with them. Official team members are super busy. Domain experts +and industry professionals are super busy *and* have no responsibility to engage +in RFCs. Since these are *exactly* the most important people to get involved in +the RFC process, it is important that we be maximally friendly towards their +needs. + + +# Is an RFC required? + +The overarching philosophy is: *do whatever is easiest*. If an RFC +would be less work than an implementation, that's a good sign that an RFC is +necessary. That said, if you anticipate controversy, you might want to short-circuit +straight to an RFC. For instance new APIs almost certainly merit an RFC. Especially +as `std` has become more conservative in favour of the much more agile cargoverse. + +* **Submit a PR** if the change is a: + * Bugfix + * Docfix + * Obvious API hole patch, such as adding an API from one type to a symmetric type. + e.g. `Vec<T> -> Box<[T]>` clearly motivates adding `String -> Box<str>` + * Minor tweak to an unstable API (renaming, generalizing) + * Implementing an "obvious" trait like Clone/Debug/etc +* **Submit an RFC** if the change is a: + * New API + * Semantic Change to a stable API + * Generalization of a stable API (e.g. how we added Pattern or Borrow) + * Deprecation of a stable API + * Nontrivial trait impl (because all trait impls are insta-stable) +* **Do the easier thing** if uncertain.
(choosing a path is not final) + + +# Non-RFC process + +* A (non-RFC) PR is likely to be **closed** if clearly not acceptable: + * Disproportionate breaking change (small inference breakage may be acceptable) + * Unsound + * Doesn't fit our general design philosophy around the problem + * Better as a crate + * Too marginal for std + * Significant implementation problems + +* A PR may also be closed because an RFC is appropriate. + +* A (non-RFC) PR may be **merged as unstable**. In this case, the feature +should have a fresh feature gate and an associated tracking issue for +stabilisation. Note that trait impls and docs are insta-stable and thus have no +tracking issue. This may imply requiring a higher level of scrutiny for such +changes. + +However, an accepted RFC is not a rubber-stamp for merging an implementation PR. +Nor must an implementation PR perfectly match the RFC text. Implementation details +may merit deviations, though obviously they should be justified. The RFC may be +amended if deviations are substantial, but are not generally necessary. RFCs should +favour immutability. The RFC + Issue + PR should form a total explanation of the +current implementation. + +* Once something has been merged as unstable, a shepherd should be assigned + to promote and obtain feedback on the design. + +* Every time a release cycle ends, the libs team assesses the current unstable + APIs and selects some number of them for potential stabilization during the + next cycle. These are announced for FCP at the beginning of the cycle, and + (possibly) stabilized just before the beta is cut.
+ +* After the final comment period, an API should ideally take one of two paths: + * **Stabilize** if the change is desired, and consensus is reached + * **Deprecate** if the change is undesired, and consensus is reached + * **Extend the FCP** if the change cannot meet consensus + * If consensus *still* can't be reached, consider requiring a new RFC or + just deprecating as "too controversial for std". + +* If any problems are found with a newly stabilized API during its beta period, + *strongly* favour reverting stability in order to prevent stabilizing a bad + API. Due to the speed of the trains, this is not a serious delay (~2-3 months + if it's not a major problem). + + diff --git a/text/0560-integer-overflow.md b/text/0560-integer-overflow.md index de14896fe01..539f225c1cd 100644 --- a/text/0560-integer-overflow.md +++ b/text/0560-integer-overflow.md @@ -125,10 +125,15 @@ The error conditions that can arise, and their defined results, are as follows. The intention is that the defined results are the same as the defined results today. The only change is that now a panic may result. -- The operations `+`, `-`, `*`, `/`, `%` can underflow and - overflow. -- Shift operations (`<<`, `>>`) can shift a value of width `N` by more - than `N` bits. +- The operations `+`, `-`, `*`, can underflow and overflow. When checking is + enabled this will panic. When checking is disabled this will two's complement + wrap. +- The operations `/`, `%` for the arguments `INT_MIN` and `-1` + will unconditionally panic. This is unconditional for legacy reasons. +- Shift operations (`<<`, `>>`) on a value of width `N` can be passed a shift value + >= `N`. It is unclear what behaviour should result from this, so the shift value + is unconditionally masked to be modulo `N` to ensure that the argument is always + in range.
## Enabling overflow checking @@ -145,7 +150,7 @@ potential overflow (and, in particular, for code where overflow is expected and normal, they will be immediately guided to use the wrapping methods introduced below). However, because these checks will be compiled out whenever an optimized build is produced, final code -wilil not pay a performance penalty. +will not pay a performance penalty. In the future, we may add additional means to control when overflow is checked, such as scoped attributes or a global, independent @@ -451,17 +456,7 @@ were: # Unresolved questions -The C semantics of wrapping operations in some cases are undefined: - -- `INT_MIN / -1`, `INT_MIN % -1` -- Shifts by an excessive number of bits - -This RFC takes no position on the correct semantics of these -operations, simply preserving the existing semantics. However, it may -be worth trying to define the wrapping semantics of these operations -in a portable way, even if that implies some runtime cost. Since these -are all error conditions, this is an orthogonal topic to the matter of -overflow. +None today (see Updates section below). # Future work @@ -491,6 +486,10 @@ Since it was accepted, the RFC has been updated as follows: 2. `as` was changed to restore the behavior before the RFC (that is, it truncates to the target bitwidth and reinterprets the highest order bit, a.k.a. sign-bit, as necessary, as a C cast would). +3. Shifts were specified to mask off the bits of over-long shifts. +4. Overflow was specified to be two's complement wrapping (this was mostly + a clarification). +5. `INT_MIN / -1` and `INT_MIN % -1` panics. 
# Acknowledgements and further reading From 5f06b95ac5b2500c1689f33bf4849000294916bf Mon Sep 17 00:00:00 2001 From: Yehuda Katz Date: Sun, 20 Sep 2015 21:49:01 -0700 Subject: [PATCH 0549/1195] Improvements to the Time APIs `SystemTime` and `LocalTime` --- text/0000-time-improvements.md | 327 +++++++++++++++++++++++++++++++++ 1 file changed, 327 insertions(+) create mode 100644 text/0000-time-improvements.md diff --git a/text/0000-time-improvements.md b/text/0000-time-improvements.md new file mode 100644 index 00000000000..861c5000fc8 --- /dev/null +++ b/text/0000-time-improvements.md @@ -0,0 +1,327 @@ +- Feature Name: time_improvements +- Start Date: 2015-09-20 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +This RFC proposes several new types and associated APIs for working with times in Rust. +The primary new types are `ProcessTime`, for working with monotonic time within a single +process, and `SystemTime`, for working with times across processes on a single system +(usually internally represented as a number of seconds since an epoch). + +# Motivations + +The primary motivation of this RFC is to flesh out a larger set of APIs for +representing instants in time and durations of time. + +For various reasons that this RFC will explore, APIs related to time are fairly +error-prone and have a number of caveats that programmers do not expect. + +Rust APIs tend to expose more of these kinds of caveats through their APIs, in +order to help programmers become aware of and handle edge-cases. At the same +time, un-ergonomic APIs can work against that goal. + +This RFC attempts to balance the desire to expose common footguns and help +programmers handle edge-cases with a desire to avoid creating so many hoops to +jump through that the useful caveats get ignored. 
+ +At a high level, this RFC covers two concepts related to time: + +* Instants, moments in time +* Durations, an amount of time between two instants + +We would like to be able to do some basic operations with these instants: + +* Compare two instants +* Add a time period to an instant +* Subtract a time period from an instant +* Compare an instant to "now" to discover time elapsed + +However, there are a number of problems that arise when trying to define these +types and operations. + +First of all, with the exception of instants created in the rutime of a single +process, instants are not monotonic. A simple example of this is that if a +program creates two files sequentially, it cannot assume that the creation time +of the second file is later than the creation time of the first file. + +This is because NTP (the network time protocol) can arbitrarily change the +system clock, and can even **rewind time**. This kind of time travel means that +the "system time-line" is not continuous and monotonic, which is something that +programmers very often forget when writing code involving machine times. + +This design attempts to help programmers avoid some of the most egregious and +unexpected consequences of this kind of "time travel". + +--- + +Leap seconds, which cannot be predicted, mean that it is impossible +to reliably add a number of seconds to a particular instant represented as a +human date and time ("1 million seconds from 2015-09-20 at midnight"). + +They also mean that seemingly simple concepts, like "1 minute", have caveats +depending on exactly how they are used. Caveats related to leap seconds +create real-world bugs, because of how unusual leap seconds are, and how +unlikely programmers are to consider "12:00:60" as a valid time. + +Certain kinds of seemingly simple operations may not make sense in +all cases. For example, adding "1 year" to February 29, 2012 would produce +February 29, 2013, which is not a valid date. 
Adding "1 month" to August 31, +2015 would produce September 31, 2015, which is also not a valid date. + +Certain human descriptions of durations, like "1 month and 35 days" +do not make sense, and human descriptions like "1 month and 5 days" have +ambiguous meaning when used in operations (do you add 1 month first and then +5 days or vice versa). + + +For these reasons, this RFC does not attempt to define a human duration with +fields for years, days or months. Such a duration would be difficult to use +in operations without hard-to-remember ordering rules. + +For these reasons, this RFC does not propose APIs related to human concepts +dates and times. It is intentionally forwards-compatible with such +extensions. + +--- + +Finally, many APIs that **take** a `Duration` can only do something useful with +positive values. For example, a timeout API would not know how to wait a +negative amount of time before timing out. Even discounting the possibility of +coding mistakes, the problem of system clock time travel means that programmers +often produce negative durations that they did not expect, and APIs that +liberally accept negative durations only propagate the error further. + +As a result, this RFC makes a number of simplifying assumptions that can be +relaxed over time with additional types or through further RFCs: + +It provides convenience methods for constructing Durations from larger units +of time (minutes, hours, days), but gives them names like +`Duration.from_standard_hour`. A standard hour is 3600 seconds, regardless of + +It provides APIs that are expected to produce positive `Duration`s, and expects +that APIs like timeouts will accept positive `Durations` (which is currently +the case in Rust's standard library). These APIs help the programmer discover +the possibility of system clock time travel, and either handle the error explicitly, +or at least avoid propagating the problem into other APIs (by using `unwrap`). 
+ +It separates monotonic time (`ProcessTime`) from time derived from the system +clock (`SystemTime`), which must account for the possibility of time travel. +This allows methods related to monotonic time to be uncaveated, while working +with the system clock has more methods that return `Result`s. + +This RFC does not attempt to define a type for calendared DateTimes, nor does it +directly address time zones. + +# Proposal + +## Types + +```rs +pub struct ProcessTime { + secs: u64, + nanos: u32 +} + +pub struct SystemTime { + secs: u64, + nanos: u32 +} + +pub struct Duration { + secs: u64, + nanos: u32 +} +``` + +### ProcessTime + +`ProcessTime` is the simplest of the instant types. It represents an opaque +(non-serializable!) timestamp that is guaranteed to be monotonic throughout +the timeframe of the process it was created in. + +> In this context, monotonic means that a timestamp created later in real-world +> time will always be larger than a timestamp created earlier in real-world +> time. + +The `Duration` type can be used in conjunction with `ProcessTime`, and these +operations have none of the usual time-related caveats. + +* Add a `Duration` to a `ProcessTime`, producing a new `ProcessTime` +* compare two `ProcessTime`s to each other +* subtract a `ProcessTime` from a later `ProcessTime`, producing a `Duration` +* ask for an amount of time elapsed since a `ProcessTime`, producing a `Duration` + +Asking for an amount of time elapsed from a given `SystemTime` is a very common +operation that is guaranteed to produce a positive `Duration`. Asking for the +difference between an earlier and a later `SystemTime` also produces a positive +`Duration` when used correctly. + +This design does not assume that negative `Duration`s are never useful, but +rather than the most common uses of `Duration` do not have a meaningful +use for negative values. 
Rather than require each API that takes a `Duration` +to produce an `Err` (or `panic!`) when receiving a negative value, this design +optimizes for the broadly useful positive `Duration`. + +```rs +impl SystemTime { + /// Panics if `earlier` is later than &self. + /// Because SystemTime is monotonic, the only ime that `earlier` should be + /// a later time is a bug in your code. + pub fn duration_from_earlier(&self, earlier: SystemTime) -> SystemTime; + + /// Panics if self is later than the current time (can happen if a SystemTime + /// is produced synthetically) + pub fn elapsed(&self) -> Duration; +} + +impl Add for SystemTime { + type Output = SystemTime; +} + +impl Sub for SystemTime { + type Output = SystemTime; +} + +impl PartialEq for SystemTime; +impl Eq for SystemTime; +impl PartialOrd for SystemTime; +impl Ord for SystemTime; +``` + +For convenience, several new constructors are added to `Duration`. Because any +unit greater than seconds has caveats related to leap seconds, all of the +constructors take "standard" units. For example a "standard minute" is 60 +seconds, while a "standard hour" is 3600 seconds. + +The "standard" terminology comes from [JodaTime][joda-time-standard]. 
+ +[joda-time-standard]: http://joda-time.sourceforge.net/apidocs/org/joda/time/Duration.html#standardDays(long) + +```rs +impl Duration { + /// a standard minute is 60 seconds + /// panics if the number of minutes is larger than u64 seconds + pub fn from_standard_minutes(minutes: u64) -> Duration; + + /// a standard hour is 60 standard minutes + /// panics if the number of hours is larger than u64 seconds + pub fn from_standard_hours(hours: u64) -> Duration; + + /// a standard day is 24 standard hours + /// panics if the number of days is larger than u64 seconds + pub fn from_standard_days(days: u64) -> Duration; +} +``` + +### SystemTime + +**This type should not be used for in-process timestamps, like those used in +benchmarks.** + +A `SystemTime` represents a time stored on the local machine derived from the +system clock. For example, it is used to represent `mtime` on the file system. + +The most important caveat of `SystemTime` is that it is **not monotonic**. This +means that you can save a file to the file system, then save another file to +the file system, **and the second file has an `mtime` earlier than the second**. + +> **This means that an operation that happens after another operation in real +> time may have an earlier `SystemTime`!** + +In practice, most programmers do not think about this kind of "time travel" +with the system clock, leading to strange bugs once the mistaken assumption +propagates through the system. + +This design attempts to help the programmer catch the most egregious of these +kinds of mistakes (unexpected travel **back in time**) before the mistake +propagates. + +```rs +impl SystemTime { + /// Returns an `Err` if `earlier` is later + pub fn duration_from_earlier(&self, earlier: SystemTime) -> Result; + + /// Returns an `Err` if &self is later than the current system time. 
+ pub fn elapsed(&self) -> Result; +} + +impl Add for SystemTime { + type Output = SystemTime; +} + +impl Sub for SystemTime { + type Output = SystemTime; +} + +// Note that none of these operations actually imply that the underlying system +// operation that produced these SystemTimes happened at the same time +// (for Eq) or before/after (for Ord) than the other system operation. +impl PartialEq for SystemTime; +impl Eq for SystemTime; +impl PartialOrd for SystemTime; +impl Ord for SystemTime; +``` + +The main difference from the design of `ProcessTime` is that it is impossible to +know for sure that a `SystemTime` is in the past, even if the operation that +produced it happened in the past (in real time). + +--- + +##### Illustrative Example: + +If a program requests a `SystemTime` that represents the `mtime` of a given file, +then writes a new file and requests its `SystemTime`, it may expect the second +`SystemTime` to be after the first. + +Using `duration_from_earlier` will remind the programmer that "time travel" is +possible, and make it easy to handle that case. As always, the programmer can +use `.unwrap()` in the prototype stage to avoid having to handle the edge-case +yet, while retaining a reminder that the edge-case is possible. + +# Drawbacks + +This RFC defines two new types for describing times, and posits a third type +to complete the picture. At first glance, having three different APIs for +working with times may seem overly complex. + +However, there are significant differences between times that only go forward +and times that can go forward or backward. There are also significant differences +between times represented as a number since an epoch and time represented in +human terms. + +As a result, this RFC chose to make these differences explicit, allowing +ergonomic, uncaveated use of monotonic time, and a small speedbump when +working with times that can move both forward and backward. 
+ +# Alternatives + +One alternative design would be to attempt to have a single unified time +type. The rationale for not doing so is explained under Drawbacks. + +Another possible alternative is to allow free math between instants, +rather than providing operations for comparing later instants to earlier +ones. + +In practice, the vast majority of APIs **taking** a `Duration` expect +a positive-only `Duration`, and therefore code that subtracts a time +from another time will usually want a positive `Duration`. + +The problem is especially acute when working with `SystemTime`, where +it is possible for a question like: "how much time has elapsed since +I created this file" to return a negative Duration! + +This RFC attempts to catch mistakes related to negative `Duration`s at +the point where they are produced, rather than requiring all APIs that +**take** a `Duration` to guard against negative values. + +Because `Ord` is implemented on `SystemTime` and `ProcessTime`, it is +possible to compare two arbitrary times to each other first, and then +use `duration_from_earlier` reliably to get a positive `Duration`. + +# Unresolved Questions + +This RFC leaves types related to human representations of dates and times +to a future proposal. \ No newline at end of file From 95d180e76d7b0c9051a5d840519bd8cea3b3ad16 Mon Sep 17 00:00:00 2001 From: Yehuda Katz Date: Mon, 21 Sep 2015 00:59:11 -0700 Subject: [PATCH 0550/1195] Fixed a bunch of typos; thx @nagisa!
--- text/0000-time-improvements.md | 35 ++++++++++++++++++---------------- 1 file changed, 19 insertions(+), 16 deletions(-) diff --git a/text/0000-time-improvements.md b/text/0000-time-improvements.md index 861c5000fc8..f7185aa0dd2 100644 --- a/text/0000-time-improvements.md +++ b/text/0000-time-improvements.md @@ -98,7 +98,8 @@ relaxed over time with additional types or through further RFCs: It provides convenience methods for constructing Durations from larger units of time (minutes, hours, days), but gives them names like -`Duration.from_standard_hour`. A standard hour is 3600 seconds, regardless of +`Duration.from_standard_hour`. A standard hour is always 3600 seconds, +regardless of leap seconds. It provides APIs that are expected to produce positive `Duration`s, and expects that APIs like timeouts will accept positive `Durations` (which is currently @@ -153,9 +154,9 @@ operations have none of the usual time-related caveats. * subtract a `ProcessTime` from a later `ProcessTime`, producing a `Duration` * ask for an amount of time elapsed since a `ProcessTime`, producing a `Duration` -Asking for an amount of time elapsed from a given `SystemTime` is a very common +Asking for an amount of time elapsed from a given `ProcessTime` is a very common operation that is guaranteed to produce a positive `Duration`. Asking for the -difference between an earlier and a later `SystemTime` also produces a positive +difference between an earlier and a later `ProcessTime` also produces a positive `Duration` when used correctly. This design does not assume that negative `Duration`s are never useful, but @@ -165,29 +166,29 @@ to produce an `Err` (or `panic!`) when receiving a negative value, this design optimizes for the broadly useful positive `Duration`. ```rs -impl SystemTime { +impl ProcessTime { /// Panics if `earlier` is later than &self. 
- /// Because SystemTime is monotonic, the only ime that `earlier` should be + /// Because ProcessTime is monotonic, the only time that `earlier` should be /// a later time is a bug in your code. - pub fn duration_from_earlier(&self, earlier: SystemTime) -> SystemTime; + pub fn duration_from_earlier(&self, earlier: ProcessTime) -> ProcessTime; - /// Panics if self is later than the current time (can happen if a SystemTime + /// Panics if self is later than the current time (can happen if a ProcessTime /// is produced synthetically) pub fn elapsed(&self) -> Duration; } -impl Add for SystemTime { +impl Add for ProcessTime { type Output = SystemTime; } -impl Sub for SystemTime { - type Output = SystemTime; +impl Sub for ProcessTime { + type Output = ProcessTime; } -impl PartialEq for SystemTime; -impl Eq for SystemTime; -impl PartialOrd for SystemTime; -impl Ord for SystemTime; +impl PartialEq for ProcessTime; +impl Eq for ProcessTime; +impl PartialOrd for ProcessTime; +impl Ord for ProcessTime; ``` For convenience, several new constructors are added to `Duration`. Because any @@ -241,10 +242,10 @@ propagates. ```rs impl SystemTime { /// Returns an `Err` if `earlier` is later - pub fn duration_from_earlier(&self, earlier: SystemTime) -> Result; + pub fn duration_from_earlier(&self, earlier: SystemTime) -> Result; /// Returns an `Err` if &self is later than the current system time. - pub fn elapsed(&self) -> Result; + pub fn elapsed(&self) -> Result; } impl Add for SystemTime { @@ -323,5 +324,7 @@ use `duration_from_earlier` reliably to get a positive `Duration`. # Unresolved Questions +What should `SystemTimeError` look like? + This RFC leaves types related to human representations of dates and times to a future proposal. 
\ No newline at end of file From c575a8e961881731cdccddfc274f043e84a97851 Mon Sep 17 00:00:00 2001 From: Wangshan Lu Date: Tue, 22 Sep 2015 23:34:45 +0800 Subject: [PATCH 0551/1195] Fix typo --- text/0000-time-improvements.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-time-improvements.md b/text/0000-time-improvements.md index f7185aa0dd2..56f3bc64a2a 100644 --- a/text/0000-time-improvements.md +++ b/text/0000-time-improvements.md @@ -170,7 +170,7 @@ impl ProcessTime { /// Panics if `earlier` is later than &self. /// Because ProcessTime is monotonic, the only time that `earlier` should be /// a later time is a bug in your code. - pub fn duration_from_earlier(&self, earlier: ProcessTime) -> ProcessTime; + pub fn duration_from_earlier(&self, earlier: ProcessTime) -> Duration; /// Panics if self is later than the current time (can happen if a ProcessTime /// is produced synthetically) @@ -327,4 +327,4 @@ use `duration_from_earlier` reliably to get a positive `Duration`. What should `SystemTimeError` look like? This RFC leaves types related to human representations of dates and times -to a future proposal. \ No newline at end of file +to a future proposal. From 415125fca3318d88a9279d81934980744190a585 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 21 Sep 2015 14:48:32 -0700 Subject: [PATCH 0552/1195] RFC: Promote the `libc` crate from the nursery Move the `libc` crate into the `rust-lang` organization after applying changes such as: * Remove the internal organization of the crate in favor of just one flat namespace at the top of the crate. * Set up a large number of CI builders to verify FFI bindings across many platforms in an automatic fashion. * Define the scope of libc in terms of bindings it will provide for each platform. 
--- text/0000-promote-libc.md | 308 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 308 insertions(+) create mode 100644 text/0000-promote-libc.md diff --git a/text/0000-promote-libc.md b/text/0000-promote-libc.md new file mode 100644 index 00000000000..127826b397e --- /dev/null +++ b/text/0000-promote-libc.md @@ -0,0 +1,308 @@ +- Feature Name: N/A +- Start Date: 2015-09-21 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Promote the `libc` crate from the nursery into the `rust-lang` organization +after applying changes such as: + +* Remove the internal organization of the crate in favor of just one flat + namespace at the top of the crate. +* Set up a large number of CI builders to verify FFI bindings across many + platforms in an automatic fashion. +* Define the scope of libc in terms of bindings it will provide for each + platform. + +# Motivation + +The current `libc` crate is a bit of a mess unfortunately, having long since +departed from its original organization and scope of definition. As more +platforms have been added over time as well as more APIs in general, the +internal as well as external facing organization has become a bit muddled. Some +specific concerns related to organization are: + +* There is a vast amount of duplication between platforms with some common + definitions. For example all BSD-like platforms end up defining a similar set + of networking struct constants with the same definitions, but duplicated in + many locations. +* Some subset of `libc` is reexported at the top level via globs, but not all of + `libc` is reexported in this fashion. +* When adding new APIs it's unclear what modules it should be placed into. It's + not always the case that the API being added conforms to one of the existing + standards that a module exist for and it's not always easy to consult the + standard itself to see if the API is in the standard. 
+* Adding a new platform to liblibc largely entails just copying a huge amount of + code from some previously similar platform and placing it at a new location in + the file. + +Additionally, on the technical and tooling side of things some concerns are: + +* None of the FFI bindings in this module are verified in terms of testing. + This means that they are neither automatically generated nor verified, and + it's highly likely that there are a good number of mistakes throughout. +* It's very difficult to explore the documentation for libc on different + platforms, but this is often one of the more important libraries to have + documentation for across all platforms. + +The purpose of this RFC is to largely propose a reorganization of the libc +crate, along with tweaks to some of the mundane details such as internal +organization, CI automation, how new additions are accepted, etc. These changes +should all help push `libc` to a more robust position where it can be well +trusted across all platforms both now and into the future! + +# Detailed design + +All design can be previewed as part of an [in progress fork][libc] available on +GitHub. Additionally, all mentions of the `libc` crate in this RFC refer to the +external copy on crates.io, not the in-tree one in the `rust-lang/rust` +repository. No changes (e.g. stabilization) are being proposed for the in-tree copy. + +[libc]: https://github.com/alexcrichton/libc + +### What is this crate? + +The primary purpose of this crate is to provide all of the definitions +necessary to easily interoperate with C code (or "C-like" code) on each of the +platforms that Rust supports. This includes type definitions (e.g. `c_int`), +constants (e.g. `EINVAL`) as well as function headers (e.g. `malloc`). + +One question that typically comes up with this sort of purpose is whether the +crate is "cross platform" in the sense that it basically just works across the +platforms it supports.
The `libc` crate, however, **is not intended to be cross +platform** but rather the opposite, an exact binding to the platform in +question. In essence, the `libc` crate is targeted as a "replacement for +`#include` in Rust" for traditional system header files, but it makes no +effort to help with portability by tweaking type definitions and signatures. + +### The Home of `libc` + +Currently this crate resides inside of the main `rust` repo of the `rust-lang` +organization, but this unfortunately somewhat hinders its development as it +takes a while to land PRs and isn't quite as quick to release as external +repositories. As a result, this RFC proposes having the crate reside externally +in the `rust-lang` organization so additions can be made through PRs (tested +much more quickly). + +The main repository will have a submodule pointing at the external repository to +continue building libstd. + +### Public API + +The `libc` crate will hide all internal organization of the crate from users of +the crate. All items will be reexported at the top level as part of a flat +namespace. This brings with it a number of benefits: + +* The internal structure can evolve over time to better fit new platforms + while being backwards compatible. +* This design matches what one would expect from C, where there's only a flat + namespace available. +* Finding an API is quite easy as the answer is "it's always at the root". + +A downside of this approach, however, is that the public API of `libc` will be +platform-specific (e.g. the set of symbols it exposes is different across +platforms), which isn't seen very commonly throughout the rest of the Rust +ecosystem today. This can be mitigated, however, by clearly indicating that this +is a platform specific library in the sense that it matches what you'd get if +you were writing C code across multiple platforms. + +The API itself will include any number of definitions typically found in C +header files such as: + +* C types, e.g.
typedefs, primitive types, structs, etc. +* C constants, e.g. `#define` directives +* C statics +* C functions (their headers) +* C macros (exported as `#[inline]` functions in Rust) + +As a technical detail, all `struct` types exposed in `libc` will be guaranteed +to implement the `Copy` and `Clone` traits. There will be an optional feature of +the library to implement `Debug` for all structs, but it will be turned off by +default. + +### Changes from today + +The [in progress][libc] implementation of this RFC has a number of API changes +and breakages from today's `libc` crate. Almost all of them are minor and +targeted at making bindings more correct in terms of faithfully representing the +underlying platforms. + +There is, however, one large notable change from today's crate. The `size_t`, +`ssize_t`, `ptrdiff_t`, `intptr_t`, and `uintptr_t` types are all defined in +terms of `isize` and `usize` instead of known sizes. Brought up by @briansmith +on [#28096][isizeusize] this helps decrease the number of casts necessary in +normal code and matches the existing definitions on all platforms that `libc` +supports today. In the future if a platform is added where these type +definitions are not correct then new ones will simply be available for that +target platform (and casts will be necessary if targeting it). + +[isizeusize]: https://github.com/rust-lang/rust/pull/28096 + +Note that part of this change depends upon removing the compiler's +lint-by-default about `isize` and `usize` being used in FFI definitions. This +lint is mostly a holdover from when the types were named `int` and `uint` and it +was easy to confuse them with C's `int` and `unsigned int` types. + +The final change to the `libc` crate will be to bump its version to 1.0.0, +signifying that breakage has happened (a bump from 0.1.x) as well as having a +future-stable interface until 2.0.0. + +### Scope of `libc` + +The name "libc" is a little nebulous as to what it means across platforms. 
It +is clear, however, that this library must have a well defined scope up to which +it can expand to ensure that it doesn't start pulling in dozens of runtime +dependencies to bind all the system APIs that are found. + +Unfortunately, however, this library also can't be "just libc" in the sense of +"just libc.so on Linux," for example, as this would omit common APIs like +pthreads and would also mean that pthreads would be included on platforms like +MUSL (where it is literally inside libc.a). Additionally, the purpose of libc +isn't to provide a cross platform API, so there isn't necessarily one true +definition in terms of sets of symbols that `libc` will export. + +In order to have a well defined scope while satisfying these constraints, this +RFC proposes that this crate will have a scope that is defined separately for +each platform that it targets. The proposals are: + +* Linux (and other unix-like platforms) - the libc, libm, librt, libdl, and + libpthread libraries. Additional platforms can include libraries whose symbols + are found in these libraries on Linux as well. +* OSX - the common library to link to on this platform is libSystem, but this + transitively brings in quite a few dependencies, so this crate will refine + what it depends upon from libSystem a little further, specifically: + libsystem\_c, libsystem\_m, libsystem\_pthread, libsystem\_malloc and libdyld. +* Windows - the VS CRT libraries. This library is currently intended to be + distinct from the `winapi` crate as well as bindings to common system DLLs + found on Windows, so the current scope of `libc` will be pared back to just + what the CRT contains. This notably means that a large amount of the current + contents will be removed on Windows. + +New platforms added to `libc` can decide the set of libraries `libc` will link +to and bind at that time. 
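The flat but platform-specific public API described above can be sketched in miniature. This is an illustration only: `mini_libc`, `FAMILY`, and `UNIX_ONLY_EXAMPLE` are hypothetical names, not items of the real crate, and the per-platform modules merely stand in for the per-platform scopes listed above.

```rust
// Hypothetical miniature of a flat, platform-specific namespace.
// None of the module or item names below come from the real `libc` crate.
mod mini_libc {
    // Shared by every platform in this sketch.
    #[allow(non_camel_case_types)]
    pub type c_int = i32;

    #[cfg(unix)]
    mod unix {
        pub const FAMILY: &str = "unix";
        // Stand-in for a binding that only exists on this platform family.
        pub const UNIX_ONLY_EXAMPLE: super::c_int = 1;
    }

    #[cfg(windows)]
    mod windows {
        pub const FAMILY: &str = "windows";
    }

    #[cfg(not(any(unix, windows)))]
    mod other {
        pub const FAMILY: &str = "other";
    }

    // Flat re-exports: callers never name `unix` or `windows` in a path.
    #[cfg(unix)]
    pub use self::unix::*;
    #[cfg(windows)]
    pub use self::windows::*;
    #[cfg(not(any(unix, windows)))]
    pub use self::other::*;
}

/// Which platform family's definitions were compiled in.
fn family() -> &'static str {
    mini_libc::FAMILY
}

fn main() {
    let zero: mini_libc::c_int = 0;
    println!("family = {}, zero = {}", family(), zero);

    // This item is part of the public API on one platform family only,
    // so portable callers have to guard their use of it.
    #[cfg(unix)]
    println!("unix-only value = {}", mini_libc::UNIX_ONLY_EXAMPLE);
}
```

The set of names visible at the root differs per target, which is the trade-off the Public API section accepts in exchange for an exact binding.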
+ +### Internal structure + +The primary change being made is that the crate will no longer be one large file +sprinkled with `#[cfg]` annotations. Instead, the crate will be split into a +tree of modules, and all modules will reexport the entire contents of their +children. Unlike most libraries, however, most modules in `libc` will be +hidden via `#[cfg]` at compile time. Each platform supported by `libc` will +correspond to a path from a leaf module to the root, picking up more +definitions, types, and constants as the tree is traversed upwards. + +This organization provides a simple method of deduplication between platforms. +For example `libc::unix` contains functions found across all unix platforms +whereas `libc::unix::bsd` is a refinement saying that the APIs within are common +to only BSD-like platforms (these may or may not be present on non-BSD platforms +as well). The benefits of this structure are: + +* For any particular platform, it's easy in the source to look up what its value + is (simply trace the path from the leaf to the root, aka the filesystem + structure, and the value can be found). +* When adding an API it's easy to know **where** the API should be added because + each node in the module hierarchy corresponds clearly to some subset of + platforms. +* Adding new platforms should be a relatively simple and confined operation. New + leaves of the hierarchy would be created and some definitions upwards may be + pushed to lower levels if APIs need to be changed or aren't present on the new + platform. It should be easy to audit, however, that a new platform doesn't + tamper with older ones. + +### Testing + +The current set of bindings in the `libc` crate suffer a drawback in that they +are not verified. This is often a pain point for new platforms where when +copying from an existing platform it's easy to forget to update a constant here +or there. 
This lack of testing leads to problems like a [wrong definition of +`ioctl`][ioctl] which in turn lead to [backwards compatibility +problems][backcompat] when the API is fixed. + +[ioctl]: https://github.com/rust-lang/rust/pull/26809 +[backcompat]: https://github.com/rust-lang/rust/pull/27762 + +In order to solve this problem altogether, the libc crate will be enhanced with +the ability to automatically test the FFI bindings it contains. As this crate +will begin to live in `rust-lang` instead of the `rust` repo itself, this means +it can leverage external CI systems like Travis CI and AppVeyor to perform these +tasks. + +The [current implementation][ctest] of the binding testing verifies attributes +such as type size/alignment, struct field offset, struct field types, constant +values, function definitions, etc. Over time it can be enhanced with more +metrics and properties to test. + +[ctest]: https://github.com/alexcrichton/ctest + +In theory adding a new platform to `libc` will be blocked until automation can +be set up to ensure that the bindings are correct, but it is unfortunately not +easy to add this form of automation for all platforms, so this will not be a +requirement (beyond "tier 1 platforms"). There is currently automation for the +following targets, however, through Travis and AppVeyor: + +* `{i686,x86_64}-pc-windows-{msvc,gnu}` +* `{i686,x86_64,mips,aarch64}-unknown-linux-gnu` +* `x86_64-unknown-linux-musl` +* `arm-unknown-linux-gnueabihf` +* `arm-linux-androideabi` +* `{i686,x86_64}-apple-{darwin,ios}` + +# Drawbacks + +### Loss of module organization + +The loss of an internal organization structure can be seen as a drawback of this +design. While perhaps not precisely true today, the principle of the structure +was that it is easy to constrain yourself to a particular C standard or subset +of C to in theory write "more portable programs by default" by only using the +contents of the respective module. 
Unfortunately in practice this does not seem +to be that much in use, and it's also not clear whether this can be expressed +simply through headers in `libc`. For example many platforms will have slight +tweaks to common structures, definitions, or types in terms of signedness or +value, so even if you were restricted to a particular subset it's not clear that +a program would automatically be more portable. + +That being said, it would still be useful to have these abstractions to *some +degree*, but the flip side is that it's easy to build this sort of layer on top +of `libc` as designed here externally on crates.io. For example `extern crate +posix` could just depend on `libc` and reexport all the contents for the +POSIX standard, perhaps with tweaked signatures here and there to work better +across platforms. + +### Loss of Windows bindings + +By only exposing the CRT functions on Windows, the contents of `libc` will be +quite trimmed down which means when accessing similar functions like `send` or +`connect` crates will be required to link to two libraries at least. + +This is also a bit of a maintenance burden on the standard library itself as it +means that all the bindings it uses must move to `src/libstd/sys/windows/c.rs` +in the immediate future. + +# Alternatives + +* Instead of *only* exporting a flat namespace the `libc` crate could optionally + also do what it does today with respect to reexporting modules corresponding + to various C standards. The downside to this, unfortunately, is that it's + unclear how much portability using these standards actually buys you. + +* The crate could be split up into multiple crates which represent an exact + correspondence to system libraries, but this has the downside that using common + functions available on both OSX and Linux would require at least two `extern + crate` directives and dependencies. + +# Unresolved questions + +* The only platforms without automation currently are the BSD-like platforms + (e.g.
FreeBSD, OpenBSD, Bitrig, DragonFly, etc), but if it were possible to + set up automation for these then it would be plausible to actually require + automation for any new platform. Is it possible to do this? + +* What is the relation between `std::os::*::raw` and `libc`? Given that the + standard library will probably always depend on an in-tree copy of the `libc` + crate, should `libc` define its own in this case, have the standard library + reexport, and then the out-of-tree `libc` reexports the standard library? + +* Should Windows be supported to a greater degree in `libc`? Should this crate + and `winapi` have a closer relationship? From 8e2d3a3341da533f846f61f10335b72c9a9f4740 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 24 Sep 2015 08:04:57 -0700 Subject: [PATCH 0553/1195] RFC 1241 is no wildcard deps on crates.io --- text/{0000-no-wildcard-deps.md => 1241-no-wildcard-deps.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-no-wildcard-deps.md => 1241-no-wildcard-deps.md} (97%) diff --git a/text/0000-no-wildcard-deps.md b/text/1241-no-wildcard-deps.md similarity index 97% rename from text/0000-no-wildcard-deps.md rename to text/1241-no-wildcard-deps.md index 80c3b7ed84c..b0fc80cf984 100644 --- a/text/0000-no-wildcard-deps.md +++ b/text/1241-no-wildcard-deps.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2015-07-23 -- RFC PR: -- Rust Issue: +- RFC PR: [rust-lang/rfcs#1241](https://github.com/rust-lang/rfcs/pull/1241) +- Rust Issue: [rust-lang/rust#28628](https://github.com/rust-lang/rust/issues/28628) # Summary From 09c71cd1873ce877bdf53535da6ee30dcc979a30 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 28 Sep 2015 12:46:39 -0400 Subject: [PATCH 0554/1195] Incremental compilation RFC --- text/0000-incremental-compilation.md | 616 +++++++++++++++++++++++++++ 1 file changed, 616 insertions(+) create mode 100644 text/0000-incremental-compilation.md diff --git a/text/0000-incremental-compilation.md
b/text/0000-incremental-compilation.md new file mode 100644 index 00000000000..c002b148181 --- /dev/null +++ b/text/0000-incremental-compilation.md @@ -0,0 +1,616 @@ +- Feature Name: incremental-compilation +- Start Date: 2015-08-04 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Enable the compiler to cache incremental workproducts for debug +builds. + +# Motivation + +The goal of incremental compilation is, naturally, to improve build +times when making small edits. Any reader who has never felt the need +for such a feature is strongly encouraged to attempt hacking on the +compiler or servo sometime (naturally, all readers are so encouraged, +regardless of their opinion on the need for incremental compilation). + +## Basic usage + +The basic usage will be that one enables incremental compilation using +a compiler flag like `-C incremental-compilation=TMPDIR`. The `TMPDIR` +directory is intended to be an empty directory that the compiler can +use to store intermediate by-products; the compiler will automatically +"GC" this directory, deleting older files that are no longer relevant +and creating new ones. + +## High-level design + +The high-level idea is that we will track the following intermediate +workproducts for every function (and, indeed, for other kinds of items +as well, but functions are easiest to describe): + +- External signature + - For a function, this would include the types of its arguments, + where-clauses declared on the function, and so forth. +- MIR + - The MIR represents the type-checked statements in the body, in + simplified forms. It is described by [RFC #1211][1211]. As the MIR + is not fully implemented, this is a non-trivial dependency. We + could instead use the existing annotated HIR, however that would + require a larger effort in terms of porting and adapting data + structures to an incremental setting. Using the MIR simplifies + things in this respect. 
+- Object files + - This represents the final result of running LLVM. It may be that + the best strategy is to "cache" compiled code in the form of an + rlib that is progressively patched, or it may be easier to store + individual `.o` files that must be relinked (anyone who has worked + in a substantial C++ project can attest, however, that linking can + take a non-trivial amount of time). + +Of course, the key to any incremental design is to determine what must +be changed. This can be encoded in a *dependency graph*. This graph +connects the various bits of the HIR to the external products +(signatures, MIR, and object files). It is of the utmost importance +that this dependency graph is complete: if edges are missing, the +result will be obscure errors where changes are not fully propagated, +yielding inexplicable behavior at runtime. This RFC proposes an +automatic scheme based on encapsulation. + +### Interaction with lints and compiler plugins + +Although rustc does not yet support compiler plugins through a stable +interface, we have long planned to allow for custom lints, syntax +extensions, and other sorts of plugins. It would be nice therefore to +be able to accommodate such plugins in the design, so that their +inputs can be tracked and accounted for as well. + +## Interaction with optimization + +It is important to clarify, though, that this design does not attempt +to enable full optimization for incremental compilation; indeed the two +are somewhat at odds with one another, as full optimization may +perform inlining and inter-function analysis, which can cause small +edits in one function to affect the generated code of another. This +situation is further exacerbated by the fact that LLVM does not +provide any way to track these sorts of dependencies (e.g., one cannot +even determine what inlining took place, though @dotdash suggested a +clever trick of using llvm lifetime hints).
Strategies for handling +this are discussed in the [Optimization section](#optimization) below. + +# Detailed design + +We begin with a high-level execution plan, followed by sections that +explore aspects of the plan in more detail. The high-level summary +includes links to each of the other sections. + +## High-level execution plan + +Regardless of whether it is invoked in incremental compilation mode or +not, the compiler will always parse and macro expand the entire crate, +resulting in a HIR tree. Once we have a complete HIR tree, and if we +are invoked in incremental compilation mode, the compiler will then +try to determine which parts of the crate have changed since the last +execution. For each item, we compute a [(mostly) stable id](#defid) +based primarily on the item's name and containing module. We then +compute a hash of its contents and compare that hash against the hash +that the item had in the previous compilation (if any). + +Once we know which items have changed, we consult a +[dependency graph](#depgraph) to tell us which artifacts are still +usable. These artifacts can take the form of serialized MIR graphs, +LLVM IR, compiled object code, and so forth. The dependency graph +tells us which bits of AST contributed to each artifact. It is +constructed by dynamically monitoring what the compiler accesses +during execution. + +Finally, we can begin execution. The compiler is currently structured +in a series of passes, each of which walks the entire AST. We do not +need to change this structure to enable incremental +compilation. Instead, we continue to do every pass as normal, but when +we come to an item for which we have a pre-existing artifact (for +example, if we are type-checking a fn that has not changed since the +last execution), we can simply skip over that fn instead. Similar +strategies can be used to enable lazy or parallel compilation at later +times.
(Eventually, though, it might be nice to restructure the +compiler so that it operates in more of a demand driven style, rather +than a series of sweeping passes.) + +When we come to the final LLVM stages, we must +[separate the functions into distinct "codegen units"](#optimization) +for the purpose of LLVM code generation. This will build on the +existing "codegen-units" used for parallel code generation. LLVM may +perform inlining or interprocedural analysis within a unit, but not +across units, which limits the amount of reoptimization needed when +one of those functions changes. + +Finally, the RFC closes with a discussion of +[testing strategies](#testing) we can use to help avoid bugs due to +incremental compilation. + +### Staging + +One important question is how to stage the incremental compilation +work. That is, it'd be nice to start seeing some benefit as soon as +possible. One possible plan is as follows: + +1. Implement stable def-ids (in progress, nearly complete). +2. Implement the dependency graph and tracking system (started). +3. Experiment with distinct modularization schemes to find the one which + gives the best fragmentation with minimal performance impact. + Or, at least, implement something finer-grained than today's codegen-units. +4. Persist compiled object code only. +5. Persist intermediate MIR and generated LLVM as well. + +The most notable staging point here is that we can begin by just +saving object code, and then gradually add more artifacts that get +saved. The effect of saving fewer things (such as only saving object +code) will simply be to make incremental compilation somewhat less +effective, since we will be forced to re-type-check and re-trans +functions where we might have gotten away with only generating new +object code. However, this is expected to be a second order effect +overall, particularly since LLVM optimization time can be a very large +portion of compilation.
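The change-detection step of the execution plan can be sketched as follows. This is a simplified illustration under stated assumptions, not the compiler's implementation: items are keyed by a plain string path and hashed from their source text, whereas the RFC describes hashing each item's contents once the HIR is built. `content_hash`, `snapshot`, and `changed_items` are hypothetical helpers.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hash the contents of one item (here: its source text, as a stand-in).
fn content_hash(contents: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    contents.hash(&mut hasher);
    hasher.finish()
}

/// Table saved at the end of a compilation: stable path -> content hash.
fn snapshot(items: &[(&str, &str)]) -> HashMap<String, u64> {
    items
        .iter()
        .map(|(path, body)| (path.to_string(), content_hash(body)))
        .collect()
}

/// Paths whose hash differs from, or is missing in, the previous snapshot.
fn changed_items(previous: &HashMap<String, u64>, items: &[(&str, &str)]) -> Vec<String> {
    let mut changed: Vec<String> = items
        .iter()
        .filter(|(path, body)| previous.get(*path) != Some(&content_hash(body)))
        .map(|(path, _)| path.to_string())
        .collect();
    changed.sort(); // deterministic order for reporting
    changed
}

fn main() {
    // State recorded by the previous compilation.
    let before = [("foo", "fn foo() { bar(); }"), ("bar", "fn bar() { }")];
    let previous = snapshot(&before);

    // Current crate: `bar` was edited and `baz` is new.
    let after = [
        ("foo", "fn foo() { bar(); }"),
        ("bar", "fn bar() { let x = 1; }"),
        ("baz", "fn baz() { }"),
    ];
    println!("changed: {:?}", changed_items(&previous, &after));
}
```

An item with no entry in the previous snapshot, such as a freshly added function, is reported as changed. Deciding which saved artifacts remain usable is then the job of the dependency graph, which this sketch leaves out.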
+ + +## Handling DefIds + +In order to correlate artifacts between compilations, we need some +stable way to name items across compilations (and across crates). The +compiler currently uses something called a `DefId` to identify each +item. However, these ids today are based on a node-id, which is just +an index into the HIR and hence will change whenever *anything* +preceding it in the HIR changes. We need to make the `DefId` for an +item independent of changes to other items. + +Conceptually, the idea is to change `DefId` into the pair of a crate +and a path: + +``` +CRATE = +PATH = Crate(ID) + | PATH :: Mod(ID) + | PATH :: Item(ID) + | PATH :: TypeParameter(ID) + | PATH :: LifetimeParameter(ID) + | PATH :: Member(ID) + | PATH :: Impl + | ... +``` + +However, rather than actually store the path in the compiler, we will +instead intern the paths in the `CStore`, and the `DefId` will simply +store an integer. So effectively the `node` field of `DefId`, which +currently indexes into the HIR of the appropriate crate, becomes an +index into the crate's list of paths. + +For the most part, these paths match up with user's intutions. So a +struct `Foo` declared in a module `bar` would just have a path like +`bar::Foo`. However, the paths are also able to express things for +which there is no syntax, such as an item declared within a function +body. + +### Disambiguation + +For the most part, paths should naturally be unique. However, there +are some cases where a single parent may have multiple children with +the same path. One case would be erroneous programs, where there are +(e.g.) two structs declared with the same name in the same +module. Another is that some items, such as impls, do not have a name, +and hence we cannot easily distinguish them. 
Finally, it is possible +to declare multiple functions with the same name within function bodies: + +```rust +fn foo() { + { + fn bar() { } + } + + { + fn bar() { } + } +} +``` + +All of these cases are handled by a simple *disambiguation* mechanism. +The idea is that we will assign a path to each item as we traverse the +HIR. If we find that a single parent has two children with the same +name, such as two impls, then we simply assign them unique integers in +the order that they appear in the program text. For example, the +following program would use the paths shown: + +```rust +mod foo { // Path: ::foo + pub struct Type { } // Path: ::foo::Type + impl Type { // Path: ::foo:: + fn bar() {..} // Path: ::foo::::bar + } + impl Type { } // Path: ::foo:: +} +``` + +Note that the impls were arbitarily assigned indices based on the order +in which they appear. This does mean that reordering impls may cause +spurious recompilations. We can try to mitigate this somewhat by making the +path entry for an impl include some sort of hash for its header or its contents, +but that will be something we can add later. + +*Implementation note:* Refactoring DefIds in this way is a large +task. I've made several attempts at doing it, but my latest branch +appears to be working out (it is not yet complete). As a side benefit, +I've uncovered a few fishy cases where we using the node id from +external crates to index into the local crate's HIR map, which is +certainly incorrect. --nmatsakis + + +## Identifying and tracking dependencies + +### Core idea: a fine-grained dependency graph + +Naturally any form of incremental compilation requires a detailed +understanding of how each work item is dependent on other work items. +This is most readily visualized as a dependency graph; the +finer-grained the nodes and edges in this graph, the better. For example, +consider a function `foo` that calls a function `bar`: + +```rust +fn foo() { + ... + bar(); + ... 
+} +``` + +Now imagine that the body (but not the external signature) of `bar` +changes. Do we need to type-check `foo` again? Of course not: `foo` +only cares about the signature of `bar`, not its body. For the +compiler to understand this, though, we'll need to create distinct +graph nodes for the signature and body of each function. + +(Note that our policy of making "external signatures" fully explicit +is helpful here. If we supported, e.g., return type inference, than it +would be harder to know whether a change to `bar` means `foo` must be +recompiled.) + +### Categories of nodes + +This section gives a kind of "first draft" of the set of graph +nodes/edges that we will use. It is expected that the full set of +nodes/edges will evolve in the course of implementation (and of course +over time as well). In particular, some parts of the graph as +presented here are intentionally quite coarse and we envision that the +graph will be gradually more fine-grained. + +The nodes fall into the following categories: + +- **HIR nodes.** Represent some portion of the input HIR. For example, + the body of a fn as a HIR node (or, perhaps, HIR node). These are + the inputs to the entire compilation process. + - Examples: + - `SIG(X)` would represent the signature of some fn item + `X` that the user wrote (i.e., the names of the types, + where-clauses, etc) + - `BODY(X)` would be the body of some fn item `X` + - and so forth +- **IR nodes.** Represent some portion of the computed IR. For + example, the MIR representation of a fn body, or the `ty` + representation of a fn signature. These also frequently correspond + to a single entry in one of the various compiler hashmaps. 
These are + the outputs (and intermediate steps) of the compilation process + - Examples: + - `ITEM_TYPE(X)` -- entry in the obscurely named `tcache` table + for `X` (what is returned by the rather-more-clearly-named + `lookup_item_type`) + - `PREDICATES(X)` -- entry in the `predicates` table + - `ADT(X)` -- ADT node for a struct (this may want to be more + fine-grained, particularly to cover the ivars) + - `MIR(X)` -- the MIR for the item `X` + - `LLVM(X)` -- the LLVM IR for the item `X` + - `OBJECT(X)` -- the object code generated by compiling some item + `X`; the precise way that this is saved will depend on whether + we use `.o` files that are linked together, or if we attempt to + amend the shared library in place. +- **Procedure nodes.** These represent various passes performed by the + compiler. For example, the act of type checking a fn body, or the + act of constructing MIR for a fn body. These are the "glue" nodes + that wind up reading the inputs and creating the outputs, and hence + which ultimately tie the graph together. + - Examples: + - `COLLECT(X)` -- the collect code executing on item `X` + - `WFCHECK(X)` -- the wfcheck code executing on item `X` + - `BORROWCK(X)` -- the borrowck code executing on item `X` + +To see how this all fits together, let's consider the graph for a +simple example: + +```rust +fn foo() { + bar(); +} + +fn bar() { +} +``` + +This might generate a graph like the following (the following sections +will describe how this graph is constructed). Note that this is not a +complete graph, it only shows the data needed to produce `MIR(foo)`. 
+ +``` +BODY(foo) ----------------------------> TYPECK(foo) --> MIR(foo) + ^ ^ ^ ^ | +SIG(foo) ----> COLLECT(foo) | | | | | + | | | | | v + +--> ITEM_TYPE(foo) -----+ | | | LLVM(foo) + +--> PREDICATES(foo) ------+ | | | + | | | +SIG(bar) ----> COLLECT(bar) | | v + | | | OBJECT(foo) + +--> ITEM_TYPE(bar) ---------+ | + +--> PREDICATES(bar) ----------+ +``` + +As you can see, this graph indicates that if the signature of either +function changes, we will need to rebuild the MIR for `foo`. But there +is no path from the body of `bar` to the MIR for foo, so changes there +need not trigger a rebuild. + +### Building the graph + +It is very important the dependency graph contain *all* edges. If any +edges are missing, it will mean that we will get inconsistent builds, +where something should have been rebuilt what was not. Hand-coding a +graph like this, therefore, is probably not the best choice -- we +might get it right at first, but it's easy to for such a setup to fall +out of sync as the code is edited. (For example, if a new table is +added, or a function starts reading data that it didn't before.) + +Another consideration is compiler plugins. At present, of course, we +don't have a stable API for such plugins, but eventually we'd like to +support a rich family of them, and they may want to participate in the +incremental compilation system as well. So we need to have an idea of +what data a plugin accesses and modifies, and for what purpose. + +The basic strategy then is to build the graph dynamically with an API +that looks something like this: + +- `push_procedure(procedure_node)` +- `pop_procedure(procedure_node)` +- `read_from(data_node)` +- `write_to(data_node)` + +Here, the `procedure_node` arguments are one of the procedure labels +above (like `COLLECT(X)`), and the `data_node` arguments are either +HIR or IR nodes (e.g., `SIG(X)`, `MIR(X)`). + +The idea is that we maintain for each thread a stack of active +procedures. 
When `push_procedure` is called, a new entry is pushed +onto that stack, and when `pop_procedure` is called, an entry is +popped. When `read_from(D)` is called, we add an edge from `D` to the +top of the stack (it is an error if the stack is empty). Similarly, +`write_to(D)` adds an edge from the top of the stack to `D`. + +Naturally it is easy to misuse the above methods: one might forget to +push/pop a procedure at the right time, or fail to invoke +read/write. There are a number of refactorings we can do on the +compiler to make this scheme more robust. + +#### Procedures + +Most of the compiler passes operate an item at a time. Nonetheless, +they are largely encoded using the standard visitor, which walks all +HIR nodes. We can refactor most of them to instead use an outer +visitor, which walks items, and an inner visitor, which walks a +particular item. (Many passes, such as borrowck, already work this +way.) This outer visitor will be parameterized with the label for the +pass, and will automatically push/pop procedure nodes as appropriate. +This means that as long as you base your pass on the generic +framework, you don't really have to worry. + +In general, while I described the general case of a stack of procedure +nodes, it may be desirable to try and maintain the invariant that +there is only ever one procedure node on the stack at a +time. Otherwise, failing to push/pop a procdure at the right time +could result in edges being added to the wrong procedure. It is likely +possible to refactor things to maintain this invariant, but that has +to be determined as we go. + +#### IR nodes + +Adding edges to the IR nodes that represent the compiler's +intermediate byproducts can be done by leveraging privacy. The idea is +to enforce the use of accessors to the maps and so forth, rather than +allowing direct access. These accessors will call the `read_from` and +`write_to` methods as appropriate to add edges to/from the current +active procedure. 
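
The tracking API and the accessor idea above can be sketched as follows. This is only an illustration under stated assumptions: the `Node` enum, the `DepGraph` and `ItemTypes` types, and the `collect_foo` helper are hypothetical names invented for this sketch and are not the compiler's actual data structures. Only the four method names (`push_procedure`, `pop_procedure`, `read_from`, `write_to`) come from the text.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical node labels; the real set of nodes is larger and still evolving.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Node {
    Sig(&'static str),      // HIR node, e.g. SIG(X)
    ItemType(&'static str), // IR node, e.g. ITEM_TYPE(X)
    Collect(&'static str),  // procedure node, e.g. COLLECT(X)
}

/// Hypothetical graph: a stack of active procedures plus a set of edges.
#[derive(Default)]
pub struct DepGraph {
    stack: Vec<Node>,
    edges: BTreeSet<(Node, Node)>,
}

impl DepGraph {
    pub fn push_procedure(&mut self, p: Node) {
        self.stack.push(p);
    }

    /// Popping checks that pushes and pops are properly paired.
    pub fn pop_procedure(&mut self, p: Node) {
        assert_eq!(self.stack.pop(), Some(p), "mismatched push/pop");
    }

    /// It is an error to read or write when no procedure is active.
    fn current(&self) -> Node {
        *self.stack.last().expect("read/write with no active procedure")
    }

    /// Adds an edge from the data node to the active procedure.
    pub fn read_from(&mut self, d: Node) {
        let p = self.current();
        self.edges.insert((d, p));
    }

    /// Adds an edge from the active procedure to the data node.
    pub fn write_to(&mut self, d: Node) {
        let p = self.current();
        self.edges.insert((p, d));
    }

    pub fn has_edge(&self, from: Node, to: Node) -> bool {
        self.edges.contains(&(from, to))
    }
}

/// A table with a private map: every access goes through an accessor
/// that records the corresponding edge.
pub struct ItemTypes {
    map: HashMap<&'static str, String>,
}

impl ItemTypes {
    pub fn new() -> Self {
        ItemTypes { map: HashMap::new() }
    }

    pub fn insert(&mut self, graph: &mut DepGraph, item: &'static str, ty: String) {
        graph.write_to(Node::ItemType(item));
        self.map.insert(item, ty);
    }

    pub fn get(&self, graph: &mut DepGraph, item: &'static str) -> Option<&String> {
        graph.read_from(Node::ItemType(item));
        self.map.get(item)
    }
}

/// Simulates COLLECT(foo): reads SIG(foo), writes ITEM_TYPE(foo).
pub fn collect_foo() -> DepGraph {
    let mut graph = DepGraph::default();
    let mut types = ItemTypes::new();
    graph.push_procedure(Node::Collect("foo"));
    graph.read_from(Node::Sig("foo"));
    types.insert(&mut graph, "foo", "fn()".to_string());
    graph.pop_procedure(Node::Collect("foo"));
    graph
}
```

Because the map field is private, a pass cannot reach the table without going through `insert` or `get`, so it cannot forget to record an edge. The sketch keeps a single stack and does not model the per-thread stacks described earlier.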
+ +#### HIR nodes + +HIR nodes are a bit trickier to encapsulate. After all, the HIR map +itself gives access to the root of the tree, which in turn gives +access to everything else -- and encapsulation is harder to enforce +here. + +Some experimentation will be required here, but the rough plan is to: + +1. Leveraging the HIR, move away from storing the HIR as one large tree, + and instead have a tree of items, with each item containing only its own + content. + - This way, giving access to the HIR node for an item doesn't implicitly + give access to all of its subitems. + - Ideally this would match precisely the HIR nodes we setup, which + means that e.g. a function would have a subtree corresponding to + its signature, and a separating subtree corresponding to its + body. + - We can still register the lexical nesting of items by linking "indirectly" + via a `DefId`. +2. Annotate the HIR map accessor methods so that they add appropriate + read/write edges. + +This will integrate with the "default visitor" described under +procedure nodes. This visitor can hand off just an opaque id for each +item, requiring the pass itself to go through the map to fetch the +actual HIR, thus triggering a read edge (we might also bake this +behavior into the visitor for convenience). + +### Persisting the graph + +Once we've built the graph, we have to persist it, along with some +associated information. The idea is that the compiler, when invoked, +will be supplied with a directory. It will store temporary files in +there. We could also consider extending the design to support use by +multiple simultaneous compiler invocations, which could mean +incremental compilation results even across branches, much like ccache +(but this may require tweaks to the GC strategy). + +Once we get to the point of persisting the graph, we don't need the +full details of the graph. The process nodes, in particular, can be +removed. They exist only to create links between the other nodes. 
To +remove them, we first compute the transitive reachability relationship +and then drop the process nodes out of the graph, leaving only the HIR +nodes (inputs) and IR nodes (output). (In fact, we only care about +the IR nodes that we intend to persist, which may be only a subset of +the IR nodes, so we can drop those that we do not plan to persist.) + +For each HIR node, we will hash the HIR and store that alongside the +node. This indicates precisely the state of the node at the time. +Note that we only need to hash the HIR itself; contextual information +(like `use` statements) that are needed to interpret the text will be +part of a separate HIR node, and there should be edges from that node +to the relevant compiler data structures (such as the name resolution +tables). + +For each IR node, we will serialize the relevant information from the +table and store it. The following data will need to be serialized: + +- Types, regions, and predicates +- ADT definitions +- MIR definitions +- Identifiers +- Spans + +This list was gathered primarily by spelunking through the compiler. +It is probably somewhat incomplete. The appendix below lists an +exhaustive exploration. + +### Reusing and garbage collecting artifacts + +The general procedure when the compiler starts up in incremental mode +will be to parse and macro expand the input, create the corresponding +set of HIR nodes, and compute their hashes. We can then load the +previous dependency graph and reconcile it against the current state: + +- If the dep graph contains a HIR node that is no longer present in the + source, that node is queued for deletion. +- If the same HIR node exists in both the dep graph and the input, but + the hash has changed, that node is queued for deletion. +- If there is a HIR node that exists only in the input, it is added + to the dep graph with no dependencies. 
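
The three reconciliation rules above might look roughly like this in code. This is a minimal sketch, assuming each HIR node is identified by a stable string id and summarized by a 64-bit content hash. The `NodeId`, `ContentHash`, `Reconciled`, and `reconcile` names are hypothetical and exist only for illustration.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-ins for a stable HIR node id and its content hash.
pub type NodeId = String;
pub type ContentHash = u64;

pub struct Reconciled {
    /// HIR nodes that were removed or whose hash changed.
    pub queued_for_deletion: HashSet<NodeId>,
    /// HIR nodes that exist only in the current input.
    pub newly_added: HashSet<NodeId>,
}

/// Compares the hashes saved with the previous dep graph against the
/// hashes computed from the current input.
pub fn reconcile(
    previous: &HashMap<NodeId, ContentHash>,
    current: &HashMap<NodeId, ContentHash>,
) -> Reconciled {
    let mut queued_for_deletion = HashSet::new();
    let mut newly_added = HashSet::new();

    for (id, old_hash) in previous {
        match current.get(id) {
            // Rule 1: the node is no longer present in the source.
            None => {
                queued_for_deletion.insert(id.clone());
            }
            // Rule 2: the node is present but its hash has changed.
            Some(new_hash) if new_hash != old_hash => {
                queued_for_deletion.insert(id.clone());
            }
            // Unchanged: anything derived from this node may be reusable.
            Some(_) => {}
        }
    }

    // Rule 3: nodes that exist only in the input enter with no dependencies.
    for id in current.keys() {
        if !previous.contains_key(id) {
            newly_added.insert(id.clone());
        }
    }

    Reconciled { queued_for_deletion, newly_added }
}
```

The sketch only classifies HIR nodes. It does not walk the graph or touch any on-disk artifacts.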
+ +We then delete the transitive closure of nodes queued for deletion +(that is, all the HIR nodes that have changed or been removed, and all +nodes reachable from those HIR nodes). As part of the deletion +process, we remove whatever on disk artifact that may have existed. + + +## Optimization and codegen units + +There is an inherent tension between incremental compilation and full +optimization. Full optimization may perform inlining and +inter-function analysis, which can cause small edits in one function +to affect the generated code of another. This situation is further +exacerbated by the fact that LLVM does not provide any means to track +when one function was inlined into another, or when some sort of +interprocedural analysis took place (to the best of our knowledge, at +least). + +This RFC proposes a simple mechanism for permitting aggressive +optimization, such as inlining, while also supporting reasonable +incremental compilation. The idea is to create *codegen units* that +compartmentalize closely related functions (for example, on a module +boundary). This means that those compartmentalized functions may +analyze one another, while treating functions from other compartments +as opaque entities. This means that when a function in compartment X +changes, we know that functions from other compartments are unaffected +and their object code can be reused. Moreover, while the other +functions in compartment X must be re-optimized, we can still reuse +the existing LLVM IR. (These are the same codegen units as we use for +parallel codegen, but setup differently.) + +In terms of the dependency graph, we would create one IR node +representing the codegen unit. This would have the object code as an +associated artifact. We would also have edges from each component of +the codegen unit. As today. 
generic or inlined functions would not +belong to any codegen unit, but rather would be instantiated anew into +each codegen unit in which they are (transitively) referenced. + +There is an analogy here with C++, which naturally faces the same +problems. In that setting, templates and inlineable functions are +often placed into header files. Editing those header files naturally +triggers more recompilation. The compiler could employ a similar +strategy by replicating things that look like good candidates for +inlining into each module; call graphs and profiling information may +be a good input for such heuristics. + + +## Testing strategy + +If we are not careful, incremental compilation has the potential to +produce an infinite stream of irreproducible bug reports, so it's +worth considering how we can best test this code. + +### Regression tests + +The first and most obvious piece of infrastructure is something for +reliable regression testing. The plan is simply to have a series of +sources and patches. The source will have each patch applied in +sequence, rebuilding (incrementally) at each point. We can then +compare the result with the result of a fresh build from scratch. +This allows us to build up tests for specific scenarios or bug +reports, but doesn't help with *finding* bugs in the first place. + +### Replaying crates.io versions and git history + +The next step is to search across crates.io for consecutive +releases. For a given package, we can checkout version `X.Y` and then +version `X.(Y+1)` and check that incrementally building from one to +the other is successful and that all tests still yield the same +results as before (pass or fail). + +A similar search can be performed across git history, where we +identify pairs of consecutive commits. This has the advantage of being +more fine-grained, but the disadvantage of being a MUCH larger search +space. 
+ +### Fuzzing + +The problem with replaying crates.io versions and even git commits is +that they are probably much larger changes than the typical +recompile. Another option is to use fuzzing, making "innocuous" +changes that should trigger a recompile. Fuzzing is made easier here +because we have an oracle -- that is, we can check that the results of +recompiling incrementally match the results of compiling from scratch. +It's also not necessary that the edits are valid Rust code, though we +should test that too -- in particular, we want to test that the proper +errors are reported when code is invalid, as well. @nrc also +suggested a clever hybrid, where we use git commits as a source for +the fuzzer's edits, gradually building up the commit. + +# Drawbacks + +The primary drawback is that incremental compilation may introduce a +new vector for bugs. The design mitigates this concern by attempting +to make the construction of the dependency graph as automated as +possible. We also describe automated testing strategies. + +# Alternatives + +This design is an evolution from a prior RFC. + +# Unresolved questions + +None. + +[1211]: https://github.com/rust-lang/rfcs/pull/1211 From 916834b883b4aba32c9ae56c4e392a75ecf731d7 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 28 Sep 2015 13:50:27 -0400 Subject: [PATCH 0555/1195] Update summary to not say "debug builds" --- text/0000-incremental-compilation.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/text/0000-incremental-compilation.md b/text/0000-incremental-compilation.md index c002b148181..a8e04481fa1 100644 --- a/text/0000-incremental-compilation.md +++ b/text/0000-incremental-compilation.md @@ -5,8 +5,7 @@ # Summary -Enable the compiler to cache incremental workproducts for debug -builds. +Enable the compiler to cache incremental workproducts. 
# Motivation From fa9ce6d68f7c3fe5e7251439f9f9866f8e50eff3 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 28 Sep 2015 13:56:56 -0400 Subject: [PATCH 0556/1195] add a brief note about cross-crate dependencies --- text/0000-incremental-compilation.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/text/0000-incremental-compilation.md b/text/0000-incremental-compilation.md index a8e04481fa1..9c978948f7e 100644 --- a/text/0000-incremental-compilation.md +++ b/text/0000-incremental-compilation.md @@ -293,6 +293,12 @@ The nodes fall into the following categories: where-clauses, etc) - `BODY(X)` would be the body of some fn item `X` - and so forth +- **Metadata nodes.** These represent portions of the metadata from + another crate. Each piece of metadata will include a hash of its + contents. When we need information about an external item, we load + that info out of the metadata and add it into the IR nodes below; + this can be represented in the graph using edges. This means that + incremental compilation can also work across crates. - **IR nodes.** Represent some portion of the computed IR. For example, the MIR representation of a fn body, or the `ty` representation of a fn signature. These also frequently correspond From 40cbce96359dec2982f59dfca2c237a3be53b4d7 Mon Sep 17 00:00:00 2001 From: James Miller Date: Tue, 29 Sep 2015 21:04:17 +1300 Subject: [PATCH 0557/1195] RFC on intrinsic semantics --- text/0000-intrinsic-semantics.md | 49 ++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) create mode 100644 text/0000-intrinsic-semantics.md diff --git a/text/0000-intrinsic-semantics.md b/text/0000-intrinsic-semantics.md new file mode 100644 index 00000000000..f5fdc4a67a3 --- /dev/null +++ b/text/0000-intrinsic-semantics.md @@ -0,0 +1,49 @@ +- Feature Name: intrinsic-semantics +- Start Date: 2015-09-29 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Define the general semantics of intrinsic functions. 
This does not define the semantics of the +individual intrinsics, instead defines the semantics around intrinsic functions in general. + +# Motivation + +Intrinsics are currently poorly-specified in terms of how they function. This means they are a +cause of ICEs and general confusion. The poor specification of them also means discussion affecting +intrinsics gets mired in opinions about what intrinsics should be like and how they should act or +be implemented. + +# Detailed design + +Intrinsics are currently implemented by generating the code for the intrinsic at the call +site. This allows for intrinsics to be implemented much more efficiently in many cases. For +example, `transmute` is able to evaluate the input expression directly into the storage for the +result, removing a potential copy. This is the main idea of intrinsics, a way to generate code that +is otherwise inexpressible in Rust. + +Keeping this in-place behaviour is desirable, so this RFC proposes that intrinsics should only be +usable as functions when called. This is not a change from the current behaviour, as you already +cannot use intrinsics as function pointers. Using an intrinsic in any way other than directly +calling should be considered an error. + +Intrinsics should continue to be defined and declared the same way. The `rust-intrinsic` and +`platform-intrinsic` ABIs indicate that the function is an intrinsic function. + +# Drawbacks + +* Fewer bikesheds to paint. +* Doesn't allow intrinsics to be used as regular functions. (Note that this is not something we + have evidence to suggest is a desired property, as it is currently the case anyway) + +# Alternatives + +* Allow coercion to regular functions and generate wrappers. This is similar to how we handle named + tuple constructors. Doing this undermines the idea of intrinsics as a way of getting the compiler + to generate specific code at the call-site however. +* Do nothing. + +# Unresolved questions + +None. 
From 5a34c450b18ac517e547046de11c9ea056ca72b6 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Tue, 29 Sep 2015 07:08:42 -0400 Subject: [PATCH 0558/1195] correct typos and minor things --- text/0000-incremental-compilation.md | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-) diff --git a/text/0000-incremental-compilation.md b/text/0000-incremental-compilation.md index 9c978948f7e..9c85e7ffd20 100644 --- a/text/0000-incremental-compilation.md +++ b/text/0000-incremental-compilation.md @@ -44,7 +44,7 @@ as well, but functions are easiest to describe): - Object files - This represents the final result of running LLVM. It may be that the best strategy is to "cache" compiled code in the form of an - rlib that is progessively patched, or it may be easier to store + rlib that is progressively patched, or it may be easier to store individual `.o` files that must be relinked (anyone who has worked in a substantial C++ project can attest, however, that linking can take a non-trivial amount of time). @@ -185,7 +185,7 @@ store an integer. So effectively the `node` field of `DefId`, which currently indexes into the HIR of the appropriate crate, becomes an index into the crate's list of paths. -For the most part, these paths match up with user's intutions. So a +For the most part, these paths match up with user's intuitions. So a struct `Foo` declared in a module `bar` would just have a path like `bar::Foo`. However, the paths are also able to express things for which there is no syntax, such as an item declared within a function @@ -230,7 +230,7 @@ mod foo { // Path: ::foo } ``` -Note that the impls were arbitarily assigned indices based on the order +Note that the impls were arbitrarily assigned indices based on the order in which they appear. This does mean that reordering impls may cause spurious recompilations. 
We can try to mitigate this somewhat by making the path entry for an impl include some sort of hash for its header or its contents, @@ -285,8 +285,8 @@ graph will be gradually more fine-grained. The nodes fall into the following categories: - **HIR nodes.** Represent some portion of the input HIR. For example, - the body of a fn as a HIR node (or, perhaps, HIR node). These are - the inputs to the entire compilation process. + the body of a fn as a HIR node. These are the inputs to the entire + compilation process. - Examples: - `SIG(X)` would represent the signature of some fn item `X` that the user wrote (i.e., the names of the types, @@ -417,7 +417,7 @@ framework, you don't really have to worry. In general, while I described the general case of a stack of procedure nodes, it may be desirable to try and maintain the invariant that there is only ever one procedure node on the stack at a -time. Otherwise, failing to push/pop a procdure at the right time +time. Otherwise, failing to push/pop a procedure at the right time could result in edges being added to the wrong procedure. It is likely possible to refactor things to maintain this invariant, but that has to be determined as we go. @@ -547,7 +547,7 @@ parallel codegen, but setup differently.) In terms of the dependency graph, we would create one IR node representing the codegen unit. This would have the object code as an associated artifact. We would also have edges from each component of -the codegen unit. As today. generic or inlined functions would not +the codegen unit. As today, generic or inlined functions would not belong to any codegen unit, but rather would be instantiated anew into each codegen unit in which they are (transitively) referenced. @@ -571,10 +571,11 @@ worth considering how we can best test this code. The first and most obvious piece of infrastructure is something for reliable regression testing. The plan is simply to have a series of sources and patches. 
The source will have each patch applied in -sequence, rebuilding (incrementally) at each point. We can then -compare the result with the result of a fresh build from scratch. -This allows us to build up tests for specific scenarios or bug -reports, but doesn't help with *finding* bugs in the first place. +sequence, rebuilding (incrementally) at each point. We can then check +that (a) we only rebuilt what we expected to rebuild and (b) compare +the result with the result of a fresh build from scratch. This allows +us to build up tests for specific scenarios or bug reports, but +doesn't help with *finding* bugs in the first place. ### Replaying crates.io versions and git history @@ -612,10 +613,11 @@ possible. We also describe automated testing strategies. # Alternatives -This design is an evolution from a prior RFC. +This design is an evolution from [RFC 594][]. # Unresolved questions None. [1211]: https://github.com/rust-lang/rfcs/pull/1211 +[RFC 594]: https://github.com/rust-lang/rfcs/pull/594 From 5d5fa510344c4acc465e388202a3fbdea820f6e6 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Tue, 29 Sep 2015 07:12:49 -0400 Subject: [PATCH 0559/1195] make PATH grammar match actual implementation more closely --- text/0000-incremental-compilation.md | 28 ++++++++++++++++------------ 1 file changed, 16 insertions(+), 12 deletions(-) diff --git a/text/0000-incremental-compilation.md b/text/0000-incremental-compilation.md index 9c85e7ffd20..778a04159d9 100644 --- a/text/0000-incremental-compilation.md +++ b/text/0000-incremental-compilation.md @@ -168,15 +168,18 @@ Conceptually, the idea is to change `DefId` into the pair of a crate and a path: ``` +DEF_ID = (CRATE, PATH) CRATE = -PATH = Crate(ID) - | PATH :: Mod(ID) - | PATH :: Item(ID) - | PATH :: TypeParameter(ID) - | PATH :: LifetimeParameter(ID) - | PATH :: Member(ID) - | PATH :: Impl - | ... 
+PATH = PATH_ELEM | PATH :: PATH_ELEM +PATH_ELEM = (PATH_ELEM_DATA, ) +PATH_ELEM_DATA = Crate(ID) + | Mod(ID) + | Item(ID) + | TypeParameter(ID) + | LifetimeParameter(ID) + | Member(ID) + | Impl + | ... ``` However, rather than actually store the path in the compiler, we will @@ -218,15 +221,16 @@ The idea is that we will assign a path to each item as we traverse the HIR. If we find that a single parent has two children with the same name, such as two impls, then we simply assign them unique integers in the order that they appear in the program text. For example, the -following program would use the paths shown: +following program would use the paths shown (I've elided the +disambiguating integer except where it is relevant): ```rust mod foo { // Path: ::foo pub struct Type { } // Path: ::foo::Type - impl Type { // Path: ::foo:: - fn bar() {..} // Path: ::foo::::bar + impl Type { // Path: ::foo::(,0) + fn bar() {..} // Path: ::foo::(,0)::bar } - impl Type { } // Path: ::foo:: + impl Type { } // Path: ::foo::(,1) } ``` From fe33e6da37e1b6eacf97b9e4b367f8686f3e12b5 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Wed, 30 Sep 2015 05:45:10 -0400 Subject: [PATCH 0560/1195] add a clarifying note about inlining --- text/0000-incremental-compilation.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-incremental-compilation.md b/text/0000-incremental-compilation.md index 778a04159d9..db6fd7bef8d 100644 --- a/text/0000-incremental-compilation.md +++ b/text/0000-incremental-compilation.md @@ -364,7 +364,9 @@ SIG(bar) ----> COLLECT(bar) | | v As you can see, this graph indicates that if the signature of either function changes, we will need to rebuild the MIR for `foo`. But there is no path from the body of `bar` to the MIR for foo, so changes there -need not trigger a rebuild. 
+need not trigger a rebuild (we are assuming here that `bar` is not +inlined into `foo`; see the [section on optimizations](#optimization) +for more details on how to handle those sorts of dependencies). ### Building the graph From daf752d2f0f69f082a1a02a1711ca45e6ed3f1e4 Mon Sep 17 00:00:00 2001 From: Ulrik Sverdrup Date: Fri, 2 Oct 2015 13:41:28 +0200 Subject: [PATCH 0561/1195] drain range: Remove trait IntoCheckedRange --- text/0000-drain-range-2.md | 89 +------------------------------------- 1 file changed, 2 insertions(+), 87 deletions(-) diff --git a/text/0000-drain-range-2.md b/text/0000-drain-range-2.md index 459b4ad8eac..96096ee8efa 100644 --- a/text/0000-drain-range-2.md +++ b/text/0000-drain-range-2.md @@ -28,7 +28,7 @@ elements, more efficently than any other safe API. - Implement `.drain()` for other collections. This is just like `.drain(..)` would be (drain the whole collection). - Ranged drain accepts all range types, currently .., a.., ..b, a..b, - and drain will accept inclusive end ranges ("closed ranges") if they are implemented. + and drain will accept inclusive end ranges ("closed ranges") when they are implemented. - Drain removes every element in the range. - Drain returns an iterator that produces the removed items by value. - Drain removes the whole range, regardless if you iterate the draining iterator @@ -60,89 +60,6 @@ has other indexed methods (`.split_off()`). `BTreeMap` and `BTreeSet` should have arguments completely consistent the range method. This will be addressed separately. -## `IntoCheckedRange` trait - -The existing trait `collections::range::RangeArgument` will be replaced by -`IntoCheckedRange`, and will be used for `drain` methods that use a range -parameter. - -`IntoCheckedRange` is designed to allow bounds checking half-open and closed -ranges. Bounds checking before conversion allows handling otherwise tricky -extreme values correctly. It is an `unsafe trait` so that bounds checking can -be trusted. 
Below is a sketched-out implementation. - -```rust -/// Convert `Self` into a half open `usize` range that slices -/// a sequence indexed from 0 to `len`. -/// Return `Err` with a faulty index if out of bounds. -/// -/// Unsafe because: Implementation is trusted to bounds check correctly. -pub unsafe trait IntoCheckedRange { - fn into_checked_range(self, len: usize) -> Result, usize>; -} - -unsafe impl IntoCheckedRange for RangeFull { - #[inline] - fn into_checked_range(self, len: usize) -> Result, usize> { - Ok(0..len) - } -} - -unsafe impl IntoCheckedRange for RangeFrom { - #[inline] - fn into_checked_range(self, len: usize) -> Result, usize> { - if self.start <= len { - Ok(self.start..len) - } else { Err(self.start) } - } -} - -unsafe impl IntoCheckedRange for RangeTo { - #[inline] - fn into_checked_range(self, len: usize) -> Result, usize> { - if self.end <= len { - Ok(0..self.end) - } else { Err(self.end) } - } -} - -unsafe impl IntoCheckedRange for Range { - #[inline] - fn into_checked_range(self, len: usize) -> Result, usize> { - if self.start <= self.end && self.end <= len { - Ok(self.start..self.end) - } else { Err(cmp::max(self.start, self.end)) } - } -} - -// For illustration, this is what a closed range impl would look like -pub struct ClosedRangeSketch { - pub start: T, - pub end: T, -} - -unsafe impl IntoCheckedRange for ClosedRangeSketch { - fn into_checked_range(self, len: usize) -> Result, usize> { - if self.start <= self.end && self.end < len { - Ok(self.start..self.end + 1) - } else { Err(cmp::max(self.start, self.end)) } - } -} -``` - -Example use of `IntoCheckedRange`: - -```rust -pub fn drain(&mut self, range: R) -> Drain - where R: IntoCheckedRange -{ - let remove_range = match range.into_checked_range(self.len()) { - Err(i) => panic!("drain: Index {} is out of bounds", i), - Ok(r) => r, - }; - /* impl omitted */ -``` - ## Stabilization The following can be stabilized as they are: @@ -160,8 +77,6 @@ The following will be heading towards 
stabilization after changes: - `VecDeque::drain` -The `IntoCheckedRange` trait will not be stabilized until we have closed ranges. - # Drawbacks - Collections disagree on if they are drained with a range (`Vec`) or not (`HashMap`) @@ -177,7 +92,7 @@ The `IntoCheckedRange` trait will not be stabilized until we have closed ranges. ```rust fn splice(&mut self, range: R, iter: I) -> Splice - where R: IntoCheckedRange, I: IntoIterator + where R: RangeArgument, I: IntoIterator ``` if the method `.splice()` would both return an iterator of the replaced elements, From aa533f93904f22a807204b53c0e43b470d22e128 Mon Sep 17 00:00:00 2001 From: Tobias Bucher Date: Fri, 2 Oct 2015 21:44:13 +0100 Subject: [PATCH 0562/1195] Add @pnkfelix's suggestions --- text/0000-main-reexport.md | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/text/0000-main-reexport.md b/text/0000-main-reexport.md index 0f1007cc5bf..08f48185fd5 100644 --- a/text/0000-main-reexport.md +++ b/text/0000-main-reexport.md @@ -22,7 +22,7 @@ Example: println!("Hello world!"); } } - pub use foo::bar as main; + use foo::bar as main; Example 2: @@ -32,11 +32,17 @@ Example 2: See also https://github.com/rust-lang/rust/issues/27640 for the corresponding issue discussion. +The `#[main]` attribute can also be used to change the entry point of the +generated binary. This is largely irrelevant for this RFC as this RFC tries to +fix an inconsistency with re-exports and directly defined functions. +Nevertheless, it can be pointed out that the `#[main]` attribute does not cover +all the above-mentioned use cases. + # Detailed design Use the symbol `main` at the top-level of a crate that is compiled as a program (`--crate-type=bin`) – instead of explicitly only accepting directly-defined -functions, also allow re-exports. +functions, also allow (possibly non-`pub`) re-exports. 
# Drawbacks From 59b01f1fac9711d163681afaf149382810d6be70 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 5 Oct 2015 12:31:40 -0400 Subject: [PATCH 0563/1195] insert some text about spans --- text/0000-incremental-compilation.md | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/text/0000-incremental-compilation.md b/text/0000-incremental-compilation.md index db6fd7bef8d..fb8a9e1a860 100644 --- a/text/0000-incremental-compilation.md +++ b/text/0000-incremental-compilation.md @@ -246,7 +246,7 @@ appears to be working out (it is not yet complete). As a side benefit, I've uncovered a few fishy cases where we using the node id from external crates to index into the local crate's HIR map, which is certainly incorrect. --nmatsakis - + ## Identifying and tracking dependencies @@ -525,6 +525,25 @@ We then delete the transitive closure of nodes queued for deletion nodes reachable from those HIR nodes). As part of the deletion process, we remove whatever on disk artifact that may have existed. + +### Handling spans + +There are times when the precise span of an item is a significant part +of its metadata. For example, debuginfo needs to identify line numbers +and so forth. However, editing one fn will affect the line numbers for +all subsequent fns in the same file, and it'd be best if we can avoid +recompiling all of them. Our plan is to phase span support in incrementally: + +1. Initially, the AST hash will include the filename/line/column, + which does mean that later fns in the same file will have to be + recompiled (somewhat unnecessarily). +2. Eventually, it would be better to encode spans by identifying a + particular AST node (relative to the root of the item). Since we + are hashing the structure of the AST, we know the AST from the + previous and current compilation will match, and thus we can + compute the current span by finding the corresponding AST node and + loading its span. 
This will require some refactoring and work however. + ## Optimization and codegen units From 736bfced96d3f16c080adb9f0e6c212a512a1e30 Mon Sep 17 00:00:00 2001 From: Eli Friedman Date: Sun, 4 Oct 2015 14:03:38 -0700 Subject: [PATCH 0564/1195] RFC to add some additional utility methods to OsString and OsStr. --- text/0000-osstring-methods.md | 77 +++++++++++++++++++++++++++++++++++ 1 file changed, 77 insertions(+) create mode 100644 text/0000-osstring-methods.md diff --git a/text/0000-osstring-methods.md b/text/0000-osstring-methods.md new file mode 100644 index 00000000000..42b265df776 --- /dev/null +++ b/text/0000-osstring-methods.md @@ -0,0 +1,77 @@ +- Feature Name: osstring_simple_functions +- Start Date: 2015-10-04 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Add some additional utility methods to OsString and OsStr. + +# Motivation + +OsString and OsStr are extremely bare at the moment; some utilities would make them +easier to work with. The given set of utilities is taken from String, and don't add +any additional restrictions to the implementation. + +I don't think any of the proposed methods are controversial. + +# Detailed design + +Add the following methods to OsString: + +```rust +/// Creates a new `OsString` with the given capacity. The string will be able +/// to hold exactly `capacity` bytes without reallocating. If `capacity` is 0, +/// the string will not allocate. +/// +/// See main `OsString` documentation information about encoding. +fn with_capacity(capacity: usize) -> OsString; + +/// Truncates `self` to zero length. +fn clear(&mut self); + +/// Returns the number of bytes this `OsString` can hold without reallocating. +/// +/// See `OsString` introduction for information about encoding. +fn capacity(&self) -> usize; + +/// Reserves capacity for at least `additional` more bytes to be inserted in the +/// given `OsString`. The collection may reserve more space to avoid frequent +/// reallocations. 
+fn reserve(&mut self, additional: usize); + +/// Reserves the minimum capacity for exactly `additional` more bytes to be +/// inserted in the given `OsString`. Does nothing if the capacity is already +/// sufficient. +/// +/// Note that the allocator may give the collection more space than it +/// requests. Therefore capacity can not be relied upon to be precisely +/// minimal. Prefer reserve if future insertions are expected. +fn reserve_exact(&mut self, additional: usize); +``` + +Add the following methods to OsStr: + +```rust +/// Checks whether `self` is empty. +fn is_empty(&self) -> bool; + +/// Returns the number of bytes in this string. +/// +/// See `OsStr` introduction for information about encoding. +fn len(&self) -> usize; +``` + +# Drawbacks + +The meaning of `len()` might be a bit confusing because it's the size of +the internal representation on Windows, which isn't otherwise visible to the +user. + +# Alternatives + +None. + +# Unresolved questions + +None. From 4f3db01a8debfa73c0ced1813d14773e50fdaaf2 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 9 Oct 2015 15:08:07 -0700 Subject: [PATCH 0565/1195] RFC 1228 is Placement left arrow syntax --- README.md | 1 + ...-placement-left-arrow.md => 1228-placement-left-arrow.md} | 5 ++--- 2 files changed, 3 insertions(+), 3 deletions(-) rename text/{0000-placement-left-arrow.md => 1228-placement-left-arrow.md} (98%) diff --git a/README.md b/README.md index 9620754f5ea..394b46371c7 100644 --- a/README.md +++ b/README.md @@ -64,6 +64,7 @@ the direction the language is evolving in. 
* [1184-stabilize-no_std.md](text/1184-stabilize-no_std.md) * [1214-projections-lifetimes-and-wf.md](text/1214-projections-lifetimes-and-wf.md) * [1219-use-group-as.md](text/1219-use-group-as.md) +* [1228-placement-left-arrow.md](text/1228-placement-left-arrow.md) ## Table of Contents [Table of Contents]: #table-of-contents diff --git a/text/0000-placement-left-arrow.md b/text/1228-placement-left-arrow.md similarity index 98% rename from text/0000-placement-left-arrow.md rename to text/1228-placement-left-arrow.md index 71758e27b16..d903a0cf12f 100644 --- a/text/0000-placement-left-arrow.md +++ b/text/1228-placement-left-arrow.md @@ -1,7 +1,7 @@ - Feature Name: place_left_arrow_syntax - Start Date: 2015-07-28 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1228 +- Rust Issue: https://github.com/rust-lang/rust/issues/27779 # Summary @@ -161,4 +161,3 @@ let ref_2 = in arena <- value_expression; # Unresolved questions None - From ac3662facefc9eec6c66e8f185e3ebcb8002d58d Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 9 Oct 2015 15:26:38 -0700 Subject: [PATCH 0566/1195] RFC 1260 is Allow a re-export for `main` --- README.md | 1 + text/{0000-main-reexport.md => 1260-main-reexport.md} | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) rename text/{0000-main-reexport.md => 1260-main-reexport.md} (92%) diff --git a/README.md b/README.md index 394b46371c7..a46080d6075 100644 --- a/README.md +++ b/README.md @@ -65,6 +65,7 @@ the direction the language is evolving in. 
* [1214-projections-lifetimes-and-wf.md](text/1214-projections-lifetimes-and-wf.md) * [1219-use-group-as.md](text/1219-use-group-as.md) * [1228-placement-left-arrow.md](text/1228-placement-left-arrow.md) +* [1260-main-reexport.md](text/1260-main-reexport.md) ## Table of Contents [Table of Contents]: #table-of-contents diff --git a/text/0000-main-reexport.md b/text/1260-main-reexport.md similarity index 92% rename from text/0000-main-reexport.md rename to text/1260-main-reexport.md index 08f48185fd5..9a9d6b35f16 100644 --- a/text/0000-main-reexport.md +++ b/text/1260-main-reexport.md @@ -1,7 +1,7 @@ - Feature Name: main_reexport - Start Date: 2015-08-19 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1260 +- Rust Issue: https://github.com/rust-lang/rust/issues/28937 # Summary From a388f27acab36e3229d463e79e519aeade86cdf7 Mon Sep 17 00:00:00 2001 From: Tobias Bucher Date: Wed, 19 Aug 2015 17:17:20 +0200 Subject: [PATCH 0567/1195] RFC: Allow a re-export for `main` --- text/0000-main-reexport.md | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) create mode 100644 text/0000-main-reexport.md diff --git a/text/0000-main-reexport.md b/text/0000-main-reexport.md new file mode 100644 index 00000000000..64cba1c7ead --- /dev/null +++ b/text/0000-main-reexport.md @@ -0,0 +1,34 @@ +- Feature Name: main_reexport +- Start Date: 2015-08-19 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Allow a re-export of a function as entry point `main`. + +# Motivation + +Functions and re-exports of functions usually behave the same way, but they do +not for the program entry point `main`. This RFC aims to fix this inconsistency. + +The above mentioned inconsistency means that e.g. you currently cannot use a +library's exported function as your main function. 
+ +# Detailed design + +Use the symbol `main` at the top-level of a crate that is compiled as a program +(`--crate-type=bin`) – instead of explicitly only accepting directly-defined +functions, also allow re-exports. + +# Drawbacks + +None. + +# Alternatives + +None. + +# Unresolved questions + +None. From f57cb490f4f4552fcdc93f0c76e42c6fc4c65708 Mon Sep 17 00:00:00 2001 From: Tobias Bucher Date: Fri, 21 Aug 2015 12:08:43 +0200 Subject: [PATCH 0568/1195] Add some examples and refer to the issue that led to the RFC --- text/0000-main-reexport.md | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/text/0000-main-reexport.md b/text/0000-main-reexport.md index 64cba1c7ead..0f1007cc5bf 100644 --- a/text/0000-main-reexport.md +++ b/text/0000-main-reexport.md @@ -15,6 +15,23 @@ not for the program entry point `main`. This RFC aims to fix this inconsistency. The above mentioned inconsistency means that e.g. you currently cannot use a library's exported function as your main function. +Example: + + pub mod foo { + pub fn bar() { + println!("Hello world!"); + } + } + pub use foo::bar as main; + +Example 2: + + extern crate main_functions; + pub use main_functions::rmdir as main; + +See also https://github.com/rust-lang/rust/issues/27640 for the corresponding +issue discussion. 
+ # Detailed design Use the symbol `main` at the top-level of a crate that is compiled as a program From fcfe2d60499e39e594b48aff41e6b7255f0bb680 Mon Sep 17 00:00:00 2001 From: Tobias Bucher Date: Fri, 2 Oct 2015 21:44:13 +0100 Subject: [PATCH 0569/1195] Add @pnkfelix's suggestions --- text/0000-main-reexport.md | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/text/0000-main-reexport.md b/text/0000-main-reexport.md index 0f1007cc5bf..08f48185fd5 100644 --- a/text/0000-main-reexport.md +++ b/text/0000-main-reexport.md @@ -22,7 +22,7 @@ Example: println!("Hello world!"); } } - pub use foo::bar as main; + use foo::bar as main; Example 2: @@ -32,11 +32,17 @@ Example 2: See also https://github.com/rust-lang/rust/issues/27640 for the corresponding issue discussion. +The `#[main]` attribute can also be used to change the entry point of the +generated binary. This is largely irrelevant for this RFC as this RFC tries to +fix an inconsistency with re-exports and directly defined functions. +Nevertheless, it can be pointed out that the `#[main]` attribute does not cover +all the above-mentioned use cases. + # Detailed design Use the symbol `main` at the top-level of a crate that is compiled as a program (`--crate-type=bin`) – instead of explicitly only accepting directly-defined -functions, also allow re-exports. +functions, also allow (possibly non-`pub`) re-exports. # Drawbacks From 411daabc242e1a19d6ae78101c59ffbe47f909ea Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 9 Oct 2015 15:26:38 -0700 Subject: [PATCH 0570/1195] RFC 1260 is Allow a re-export for `main` --- README.md | 1 + text/{0000-main-reexport.md => 1260-main-reexport.md} | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) rename text/{0000-main-reexport.md => 1260-main-reexport.md} (92%) diff --git a/README.md b/README.md index 394b46371c7..a46080d6075 100644 --- a/README.md +++ b/README.md @@ -65,6 +65,7 @@ the direction the language is evolving in. 
* [1214-projections-lifetimes-and-wf.md](text/1214-projections-lifetimes-and-wf.md) * [1219-use-group-as.md](text/1219-use-group-as.md) * [1228-placement-left-arrow.md](text/1228-placement-left-arrow.md) +* [1260-main-reexport.md](text/1260-main-reexport.md) ## Table of Contents [Table of Contents]: #table-of-contents diff --git a/text/0000-main-reexport.md b/text/1260-main-reexport.md similarity index 92% rename from text/0000-main-reexport.md rename to text/1260-main-reexport.md index 08f48185fd5..9a9d6b35f16 100644 --- a/text/0000-main-reexport.md +++ b/text/1260-main-reexport.md @@ -1,7 +1,7 @@ - Feature Name: main_reexport - Start Date: 2015-08-19 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1260 +- Rust Issue: https://github.com/rust-lang/rust/issues/28937 # Summary From 6b0b77189dff97bdd2987da099223e9d8b530ee6 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Thu, 24 Sep 2015 13:01:04 +1200 Subject: [PATCH 0571/1195] Changes to the compiler to support IDEs --- text/0000-ide.md | 530 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 530 insertions(+) create mode 100644 text/0000-ide.md diff --git a/text/0000-ide.md b/text/0000-ide.md new file mode 100644 index 00000000000..8a0ab3eaf92 --- /dev/null +++ b/text/0000-ide.md @@ -0,0 +1,530 @@ +- Feature Name: n/a +- Start Date: 2015-10-13 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +This RFC describes how we intend to modify the compiler to support IDEs. The +intention is that support will be as generic as possible. A follow-up internals +post will describe how we intend to focus our energies and deploy Rust support +in actual IDEs. + +There are two sets of technical changes proposed in this RFC: changes to how we +compile, and the creation of an 'oracle' tool (name of tool TBC). 
+ +This RFC is fairly detailed, it is intended as a straw-man plan to guide early +implementation, rather than as a strict blueprint. + + +## Compilation model + +An IDE will perform two kinds of compilation - an incremental check as the user +types (used to provide error and code completion information) and a full build. +The full build is explicitly signaled by the user (it could also happen +implicitly, for example when the user saves a file). A full build is basically +just a `cargo build` command, as would be done from the command line. It will +take advantage of any future improvements to regular compilation (such as +incremental compilation), but there is essentially no change from a compile +today. It is not very interesting and won't be discussed further. + +The incremental check follows a new model of compilation. This check must be as +fast as possible but does not need to generate machine code. We'll describe it +in more detail below. We call this kind of compilation a 'quick-check'. + +This RFC also covers making compilation more robust. + + +## The oracle + +The oracle is a long running daemon process. It will keep a database +representation of an entire project's source code and semantic information (as +opposed to the compiler which operates on a crate at a time). It is +incrementally updated by the compiler and provides an IPC API for providing +information about a program - the low-level information an IDE (or similar tool) +needs, e.g., code completion options, location of definitions/declarations, +documentation for items. + +The oracle is a general purpose, low-level tool and should be usable by any IDE +as well as other tools. End users and editors with less project knowledge should +use the oracle via a more friendly interface (such as Racer). + + +## Other shared functionality + +Other functionality, such as refactoring and reformatting will be provided by +separate tools rather than the oracle. 
These should be sharable between IDE +implementations. They are not covered in this RFC. + + +# Motivation + +An IDE collects together many tools into a single piece of software. Some of +these are entirely separate from the rest of the Rust eco-system (such as editor +functionality), some will reuse existing tools in pretty much the same way they +are already used (e.g., formatting code, which should straightforwardly use +Rustfmt), and some will have totally new ways of using the compiler or other +tools (e.g., code completion). + +Modern IDEs are large and complex pieces of software; creating a new one from +scratch for Rust would be impractical. Therefore we need to work with existing +IDEs (such as Eclipse, IntelliJ, and Visual Studio) to provide functionality. +These IDEs provide excellent editor and project management support out of the +box, but know nothing about the Rust language. + +An important aspect of IDE support is that response times must be extremely +short. Users expect information as they type. Running normal compilation of an +entire project is far too slow. Furthermore, as the user is typing, the program +will not be a valid, complete Rust program. + +We expect that an IDE may have its own lexer and parser. This is necessary for +the IDE to quickly give parse errors as the user types. Editors are free to rely +on the compiler's parsing if they prefer (the compiler will do its own parsing +in any case). Further information (name resolution, type information, etc.) will +be provided by the compiler via the oracle. + + +# Detailed design + +## Quick-check compilation + +(See also open questions, below). + +We run the quick-check compiler on a single crate. At some point after quick +checking, dependent crates must be rebuilt. This is the responsibility of an +external tool to manage (see below). Quick-check is driven by an IDE (or +possibly by the oracle), not by Cargo. 
+ + +### Incremental and lazy compilation + +Incremental compilation is where, rather than re-compiling an entire crate, only +code which is changed and its dependencies are re-compiled. See +[RFC #1298](https://github.com/rust-lang/rfcs/pull/1298). + +Lazy compilation is where, rather than compiling an entire crate, we start by +compiling a single function (or possibly some other unit of code), and re- +compiling code which is depended on until we are done. Not all of a crate will +be compiled in this fashion. + +These two compilation strategies are faster than the current compilation model +(compile everything, every time). They are somewhat orthogonal - compilation can +be either lazy or incremental without implying the other. The [current +proposal](https://github.com/rust-lang/rfcs/pull/1298) for supporting +incremental compilation involves some lazy compilation as an implementation +detail. + +For quick-checking, compilation should be both incremental and lazy. The input +to the compiler is not just the crate being re-compiled, but also the span of +code changed (normal incremental compilation computes this span for itself, but +the IDE already has this information, so it would be wasteful to recompute it). +As a further optimisation, if the IDE can refer to items by an id (such as a +path), then this could be fed to the compiler rather than a code span to save +the compiler the effort of finding an AST node from a code span. + +We begin by computing which code is invalidated by the change (that is, any code +which depends on the changed code). We then re-compile the changed code. +Information which is depended upon is looked up in the saved metadata used for +incremental compilation. When we have re-compiled the changed code, then we +output the result (see below). If there are no fatal errors, then we continue to +compile the rest of the invalidated code. 
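The first step described above (working out which code is invalidated by a change) can be pictured as a walk over a reverse dependency graph. The following is only a minimal sketch under stated assumptions: the `ItemId` alias, the `dependents` map, and the `invalidated` function are hypothetical names chosen for illustration and are not part of the compiler or of this RFC.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical item identifier, e.g. a path such as "krate::foo::bar".
type ItemId = &'static str;

/// Returns the changed items plus every item that transitively depends on
/// one of them. `dependents` maps an item to the items that use it directly
/// (an assumed representation, not the compiler's own dependency graph).
fn invalidated(
    dependents: &HashMap<ItemId, Vec<ItemId>>,
    changed: &[ItemId],
) -> HashSet<ItemId> {
    let mut seen: HashSet<ItemId> = changed.iter().copied().collect();
    let mut queue: VecDeque<ItemId> = changed.iter().copied().collect();

    // Breadth-first walk: each item is queued at most once.
    while let Some(item) = queue.pop_front() {
        if let Some(users) = dependents.get(item) {
            for &user in users {
                if seen.insert(user) {
                    queue.push_back(user);
                }
            }
        }
    }
    seen
}
```

In this picture, a quick-check would re-compile the changed items first, report the result, and only then work through the remainder of the returned set.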
+ + +### Compilation output + +The output of compilation is either success or a set of errors (as with today's +compiler, but see below for more detail on error message format). However, since +compilation can continue after returning an initial result, we might produce +further errors (I presume that IDEs provide a mechanism for the compiler to +communicate these asynchronously to the IDE plugin). + +In addition we must produce data to update the oracle, this should be done +directly, without involving the IDE plugin. + +Quick-check does not generate executable code or crate metadata. However, it +should (probably) update the metadata used for incremental compilation. + + +### Multiple crates + +Quick check only applies to a single crate, however, after some changes we might +need to re-compile dependent crates. This is the IDE's responsibility. In the +short term we can just trigger a full re-build (via Cargo) when the user starts +editing a file belonging to a different crate (there will obviously be some lag +there). The compiler must also generate crate metadata for the modified crate. + +Long term, the IDE might keep track of the dependency graph between crates +(provided by Cargo). The quick-check should signal when a crate's public +interface changes due to re-compilation. In that case the IDE can trigger +background re-compilation of dependent crates (possibly with some +delay/batching). + + +## The Oracle + +The oracle is a long-running tool which takes input from both full builds and +quick-checks, and responds to queries about a Rust program. Of particular note +is that it knows about a whole project, not just a single crate. In fact, other +than as a kind of module, it doesn't much care about the notion of a crate at +all. + +We require a data format for getting metadata from the compiler to the oracle. +Unfortunately none of the existing ones are quite right. 
Crate metadata is not +complete enough (it mostly only contains data about interfaces, not function +bodies); save-analysis data has been processed too far (basically into strings), +which loses some of the structure that would be useful; debuginfo is not +Rust-centric enough (i.e., does not contain Rust type information) and is based on +expanded source code. Furthermore, serialising any of the compiler's IRs is not +good enough: the AST and HIR do not contain any type or name information, and the +HIR and MIR are post-expansion. + +The best option seems to be the save-analysis information. This is in a poor +format, but is the 'right' data (it can be based on an early AST and includes +type and name information). It can be evolved into a more efficient form over the +long run (it has been a low priority task for a while to support different +formats for the information). + +Full builds will generate a dump of save-analysis data for a whole crate. Quick +checks will generate data for the changed code. In both cases the oracle must +incrementally update its knowledge of the source code. How exactly to do this +when neither names nor ids are stable is an interesting question, but too much +detail for this RFC (especially as the implementation of ids in the compiler is +evolving). + +For crates which are not built from source (for example the standard library), +authors can choose to distribute the oracle's metadata to allow users to get a +good IDE experience with these crates. In this case, we only need metadata for +interfaces, not the bodies of functions or private items. The oracle should +handle such reduced metadata. It should be possible to generate the oracle's +metadata from the crate metadata, but this is not a short-term goal. (Note this +will require some knowledge in the IDE too - if there is no corresponding source +code, the IDE cannot 'jump to definition', for example). + +The oracle's data is platform-dependent. 
We must be careful when working with a +cross-compiled project to generate metadata for the target machine. This +shouldn't be a problem for normal compilation, but it means that quick-check +compilation must be configured for the same target, and care should be taken +with downloaded metadata. + +As well as metadata based on types and names, the oracle should keep track of +warnings. Since code with warnings but no errors is not re-compiled, a tool +outside the compiler must track them for display in the IDE. This will be done +by the oracle. + + +### Details + +#### API + +The oracle's API is a set of IPC calls. How exactly these should be implemented +is not clear. The most promising options are sending JSON over TCP, using +[Thrift](https://thrift.apache.org/), or using Cap'n Proto (I'm unclear about +exactly what the transport layer looks like using Cap'n Proto; there is no Cap'n +Proto RPC implementation for Java, but I believe there is an alternative using +shared, memory mapped files as a buffer; I'm not familiar enough with the +library to work out what is needed). + +I've detailed the API I believe we'll need to start with. This is slightly more +than a minimal set. I expect it will expand as time goes by. At some point we +will want to stabilise parts of the API to allow for third party implementations +of the oracle and compiler. + +All API calls can return success or error results. Many calls involve a *span*; +for the oracle's API, this is defined as two byte offsets from the start of the +file (oracle spans must always be contained in a single file). + +There are some alternative span definitions: we could use line and column indices +rather than byte offsets (this has some edge case difficulties with the +definition of a newline - do unicode newlines count? It also requires some extra +computation), or we could use character offsets (again involves some more +computation, but might be more robust). 
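To make the span definition concrete, here is a minimal sketch of a byte-offset span and of translating it into UTF-16 code unit offsets, which some editors use to index text. The names `ByteSpan`, `utf16_offset`, and `utf16_span` are hypothetical, and the conversion shown is a plain linear scan from the start of the file, not a proposal for how the oracle should do it.

```rust
/// A span as two byte offsets from the start of a single file
/// (hypothetical type; the RFC does not fix a representation).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ByteSpan {
    lo: usize,
    hi: usize,
}

/// Converts a UTF-8 byte offset into a UTF-16 code unit offset by scanning
/// the text from the start. Returns `None` if the offset is past the end of
/// the text or falls inside a multi-byte character.
fn utf16_offset(text: &str, byte_offset: usize) -> Option<usize> {
    if !text.is_char_boundary(byte_offset) {
        return None;
    }
    Some(text[..byte_offset].chars().map(char::len_utf16).sum())
}

/// Converts both ends of a byte span; `None` if either end is invalid or
/// the span is reversed.
fn utf16_span(text: &str, span: ByteSpan) -> Option<(usize, usize)> {
    if span.lo > span.hi {
        return None;
    }
    Some((utf16_offset(text, span.lo)?, utf16_offset(text, span.hi)?))
}
```

Each conversion here costs time proportional to the offset, so a real implementation would probably want to cache per-line offsets or similar; that is left open.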
+ +A problem is that Visual Studio uses UTF16 while Rust uses UTF8; there is (I +understand) no efficient way to convert between byte counts in these systems. +I'm not sure how to address this. It might require the oracle to be able to +operate in UTF16 mode. + +Where no return value is specified, the call returns success or failure (with a +reason). + +The philosophy of the API is that most functions should only take a single call, +as opposed to making each function as minimal and orthogonal as possible. This +is because IPC can be slow and response time is important for IDEs. + + +**Projects** + +Note that the oracle stores no metadata about a project. + +*init project* + +Takes a project name, returns an id string (something close to the project's name). + +*delete project* + +Takes a project id. + +*list projects* + +Takes nothing, returns a list of project ids. + +Each of the remaining calls takes a project identifier. + + +**Update** + +See section on input data format below. + +*update* + +Takes input data (actual source code rather than spans since we cannot assume +the user has saved the file) and a list of spans to invalidate. Where there are +no invalidated spans, the update call adds data (which will cause an error if +there are conflicts). Where there is no input data, update just invalidates. + +We might want to allow some shortcuts to invalidate an entire file or +recursively invalidate a directory. + + +**Description** + +*get definition* + +Takes a span, returns all 'definitions and declarations' for the identifier +covered by the span. Can return an error if the span does not cover exactly one +identifier or the oracle has no data for an identifier. + +The returned data is a list of 'definition' data. That data includes the span for +the item, any documentation for the item, a code snippet for the item, +optionally a type for the item, and one or more kinds of definition (e.g., +'variable definition', 'field definition', 'function declaration'). 
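As a rough illustration of the *get definition* result described above, the data could be modelled along the following lines. This is a sketch only: every type, field, and function name here (`Span`, `DefKind`, `Definition`, `QueryError`, `get_definition`) is an assumption made for the example, and the in-memory lookup stands in for whatever the oracle would really do over IPC.

```rust
/// Two byte offsets into one file (hypothetical representation).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    lo: usize,
    hi: usize,
}

/// A few of the definition kinds mentioned in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DefKind {
    VariableDefinition,
    FieldDefinition,
    FunctionDeclaration,
}

/// One entry of the 'definition' data returned for an identifier.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Definition {
    span: Span,
    docs: Option<String>,
    snippet: String,
    ty: Option<String>,
    kinds: Vec<DefKind>,
}

/// The two error cases named in the text.
#[derive(Debug, PartialEq, Eq)]
enum QueryError {
    NotOneIdentifier,
    NoData,
}

/// Toy lookup: `index` pairs each known identifier span with its definitions.
fn get_definition(
    index: &[(Span, Vec<Definition>)],
    query: Span,
) -> Result<Vec<Definition>, QueryError> {
    match index.iter().find(|(ident, _)| *ident == query) {
        Some((_, defs)) if !defs.is_empty() => Ok(defs.clone()),
        Some(_) => Err(QueryError::NoData),
        None => Err(QueryError::NotOneIdentifier),
    }
}
```

Returning everything about an item in one structure matches the stated philosophy that most IDE functions should need only a single call.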
+ +*get references* + +Takes a span, returns a list of reference data (or an error). Each datum +consists of the span of the reference and a code snippet. + +*get docs* + +Takes a span, returns the same data as *get definition* but limited to doc strings. + +*get type* + +Takes a span, returns the same data as *get definition* but limited to type information. + +Question: are these useful/necessary? Or should users just call *get definition*? + +*search for identifier* + +Takes a search string or an id, and a struct of search parameters including case +sensitivity, and the kind of items to search (e.g., functions, traits, all +items). Returns a list of spans and code snippets. + + +**Code completion** + +*get suggestions* + +Takes a span (note that this span could be empty, e.g, for `foo.` we would use +the empty span which starts after the `.`; for `foo.b` we would use the span for +`b`), and returns a list of suggestions (is this useful? Is there any difference +from just using the caret position?). Each suggestion consists of the text for +completion plus the same information as returned for the *get definition* call. + + +#### Input data format + +The precise serialisation format of the oracle's input data will likely change +over time. At first, I propose we use csv, since that is what save-analysis +currently supports, and there is good decoding support for Rust. Longer term we +should use a binary format for more efficient serialisation and deserialisation. + +Each datum consists of an identifier, a kind, a span, and a set of fields (the +exact fields are dependent on the kind of data). + +If the datum is for a definition (of a trait, struct, etc.), then the identifier +is an absolute path (including the crate) to that definition. Question: how to +identify impls - do we need to distinguish multiple impls for the same trait and +data type? 
+ +For statements and expressions, the identifier is a path to the expression's +function (or static/const) and a function relative id. Note that this means we +have to invalidate an entire function at a time (or at least all of the function +after the edited portion). It would be nice if we could avoid this and be more +fine-grained about invalidation, any ideas? + +I propose that we follow the save-analysis data format to start with (in terms +of the kinds of data available and the fields for each). However, we should use +identifiers rather than DefIds and distinguish fields from variables. + + +### Racer + +The oracle fulfills a similar role to +[Racer](https://github.com/phildawes/racer). Indeed, forking Racer may be a good +way to start development of the oracle. The oracle should provide more +information and should be more accurate by being more closely integrated with +the compiler. + +Racer could be refactored to be a client of the oracle, thus taking advantage of +more accurate data and a simpler implementation, whilst maintaining its +interface. This would be a nice way to make the oracle's data available to less +sophisticated editors. Alternatively, Racer could make use of the oracle's +metadata but do its own processing of that data to provide an alternate +implementation of an oracle. + + +### DXR and Rustdoc + +Both DXR and Rustdoc could be rewritten to talk to the oracle and run in a live +mode, rather than maintaining their own pre-processed data. This would have some +benefit in keeping these resources up to date as programs are edited (and +reducing the number of ways for doing essentially the same thing). However, this +does not seem like enough motivation to actually do the work. Could be an +interesting student project or something. + + +## Robust compilation + +The goal here is that when the user is typing, we should be able to run the +early stages of the quick-check compiler and still come up with sensible code +completion suggestions. 
The IDE and compiler can collaborate to some extent +here. + +As long as we can compile as far as type checking, then the compiler should +still generate metadata for the oracle. If we fail later (e.g., in borrow +checking) then we should return errors *and* metadata for the oracle. If we fail +to type check, then we cannot generate meaningful data for the oracle (or if we +succeed at type checking, but use some error recovery). + +The IDE should instruct the oracle to invalidate some of its data. I believe that +this does not require deep knowledge about the program (i.e., we know a span has +changed and compilation has failed, we can instruct the oracle to invalidate all +data associated with that span. With luck, we can leverage the dependency +information the compiler has for incremental compilation here). + +In some cases a program would fail to parse or pass name resolution, but we +would like to try to type check. For example, + +```rust +fn main() { + let x = foo.bar. +``` + +will not parse, but we would like to suggest code completion options. + +```rust +fn main() { + let foo = foo(); + let x = fo; +} +``` + +will parse, but fail name resolution, but again we would like to suggest code +completion options. + +There are two issues: dealing with incomplete or incorrect names (e.g., `fo` in +the second example), and dealing with unfinished AST nodes (e.g., in the first +example we need an identifier to finish the `.` expression, a `;` to terminate +the let statement, and a `}` to terminate the `main` function). + +A solution to the first problem is replacing invalid names with some magic +identifier, and ignoring errors involving that identifier. @sanxiyn implemented +something like the second feature in a +[PR](https://github.com/rust-lang/rust/pull/21323). His approach was to take a command line argument for +where to 'complete at' and to treat that as the magic identifier.
An alternate +approach would be to use a keyword or distinguished identifier which the IDE +could insert (based on the caret position), or to fall back to the magic +identifier whenever there is a name resolution error. + +Similarly during type checking, if we find a mismatched or unknown type, we +should try to continue type checking with the information available so as to +still be able to provide code completion information. We already do this to some +extent with `TyErr`, but we should do better. + +For the second issue, the problem is where to start parsing again and how many +'open' items should be terminated. This is closely related to error recovery in +parsers, which is a well-developed area of research with a long history, and +which I won't attempt to summarise here. As far as I can see, there are two +major differences since we are doing this in the IDE context: we know the extent +of edited code (the span of changes we are passing to the quick-check compiler) +and the previous state of the edited code, and we can likely assume that even in +new code, braces and parentheses are likely to be paired (since an IDE will +insert closing braces, etc.). Assuming that we keep the state of the code the +last time it parsed completely, we can expand the edited span to cover an entire +expression (or other item) and thus we know exactly where to start re-parsing. +In the case where we are writing new code, we can just close all 'open' items. + +Being able to generate more errors before stopping would be an advantage for the +compiler in any case. However, we probably do not want to use these mechanisms +under normal compilation, only when performing a quick-check from the IDE. + + + +## Error format + +Currently the compiler generates text error messages. I propose that we add a +mechanism to the compiler to support different formats for error messages.
We +already structure our error messages to some extent (separating the span +information, the message, and the error code). Rather than turning these +components into text in a fairly ad hoc manner, we should preserve that +structure, and some central error handler should convert into a chosen format. +We should support the current text format, JSON (or some other structured +format) for tools to use, and HTML for rich error messages (this is somewhat +orthogonal to this RFC, but has been discussed in the past as a desirable +feature). + + +# Drawbacks + +It's a lot of work. On the other hand the largest changes are desirable for +general improvements in compilation speed or for other tools. + + +# Alternatives + +The oracle and quick-check compiler could be combined in a single tool. This +might be more efficient, but would increase complexity and decrease opportunity +for third party alternatives. + +The oracle could do more - actually perform some of the processing tasks usually +done by IDEs (such as editing source code) or other tools (refactoring, +reformatting, etc.). + +Should the oracle hide the quick-check compiler? I.e., the IDE talks only to the +oracle and the oracle requests compilation as needed. This might make things a +bit simpler for the IDE and means less IPC overhead and complexity. Either the +oracle could be responsible for all coordination, or the IDE could remain +responsible for coordinating when crates are handled, and the oracle is +responsible for coordinating calls to the quick-check compiler to build a single +crate. + + +# Unresolved questions + +Should the quick-check compilation be provided by a separate tool or a mode of +the compiler? It is fairly different in its operation from the compiler. It +might be better to provide a different 'frontend' rather than adding many more +options to the compiler. (I think the answer is 'yes'). + +Should quick-check be a long running process?
It could save some time by not +having to reload metadata, but having to keep metadata for an entire project in +memory would be expensive. We could perhaps compromise by unloading when the +user needs to recompile a different crate. I believe it is probably better in +the long run, but a batch process is OK to start with. + +How and when should we generate crate metadata. It seems sensible to generate +this when we switch to editing/re-compiling a different crate. However, it's not +clear if this must be done from scratch or if it can be produced from the +incremental compilation metadata (see that RFC, I guess). + +What should we call the oracle tool? I don't particularly like "oracle", +although it is descriptive (it comes from the Go tool of the same name). +Alternatives are 'Rider', 'Racer Server', or anything you can think of. + +How do we handle different versions of Rust and interact with multi-rust? +Upgrades to the next stable version of Rust? + +Do we need to standardise error messages for the various parsers to prevent user +confusion (i.e., try to ensure that rustc and the various IDEs give the same +error messages). From eed47f3242a77308d6a07c30cd9411345f0e6c49 Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Tue, 13 Oct 2015 15:02:16 -0400 Subject: [PATCH 0572/1195] Amend 1192 (RangeInclusive) to use an enum. Rational: 1. The word "finished" is very iterator specific. Really, this field is trying to indicate that the range is actually empty. 2. `start`/`end` don't make sense if the range is empty. Using an enum prevents coders from using the `start`/`end` of spent ranges. Basically, this makes it impossible for the coder to do something like `foo(my_range.take(10)); bar(my_range)` and forget to check `finished` in `bar`. 3. If we ever get better enum optimizations (specifically, utf8 code point ones) `'a'...'z'` should be the same size as `'a'..'z'`; the Empty variant can be encoded as an invalid code point. 
--- text/1192-inclusive-ranges.md | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/text/1192-inclusive-ranges.md b/text/1192-inclusive-ranges.md index 78051be6b5a..243702745a0 100644 --- a/text/1192-inclusive-ranges.md +++ b/text/1192-inclusive-ranges.md @@ -26,10 +26,12 @@ more dots means more elements. `std::ops` defines ```rust -pub struct RangeInclusive { - pub start: T, - pub end: T, - pub finished: bool, +pub enum RangeInclusive { + Empty, + NonEmpty { + start: T, + end: T, + } } pub struct RangeToInclusive { @@ -37,12 +39,11 @@ pub struct RangeToInclusive { } ``` -Writing `a...b` in an expression desugars to `std::ops::RangeInclusive -{ start: a, end: b, finished: false }`. Writing `...b` in an +Writing `a...b` in an expression desugars to `std::ops::RangeInclusive::NonEmpty { start: a, end: b }`. Writing `...b` in an expression desugars to `std::ops::RangeToInclusive { end: b }`. `RangeInclusive` implements the standard traits (`Clone`, `Debug` -etc.), and implements `Iterator`. The `finished` field is to allow the +etc.), and implements `Iterator`. The `Empty` variant is to allow the `Iterator` implementation to work without hacks (see Alternatives). The use of `...` in a pattern remains as testing for inclusion @@ -79,8 +80,9 @@ winner. This RFC doesn't propose non-double-ended syntax, like `a...`, `...b` or `...` since it isn't clear that this is so useful. Maybe it is. -The `finished` field could be omitted, leaving two options: +The `Empty` variant could be omitted, leaving two options: +- `RangeInclusive` could be a struct including a `finished` field. - `a...b` only implements `IntoIterator`, not `Iterator`, by converting to a different type that does have the field. 
However, this means that `a...b` behaves differently to `a..b`, so From b8da6c79a6c423be37aa7c545e6c267a0a2e09f3 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 14 Oct 2015 18:05:55 -0700 Subject: [PATCH 0573/1195] RFC 1257 is specifying drain() --- text/{0000-drain-range-2.md => 1257-drain-range-2.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-drain-range-2.md => 1257-drain-range-2.md} (95%) diff --git a/text/0000-drain-range-2.md b/text/1257-drain-range-2.md similarity index 95% rename from text/0000-drain-range-2.md rename to text/1257-drain-range-2.md index 96096ee8efa..533f1f60e69 100644 --- a/text/0000-drain-range-2.md +++ b/text/1257-drain-range-2.md @@ -1,7 +1,7 @@ - Feature Name: drain-range - Start Date: 2015-08-14 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1257](https://github.com/rust-lang/rfcs/pull/1257) +- Rust Issue: [rust-lang/rust#27711](https://github.com/rust-lang/rust/issues/27711) # Summary @@ -106,7 +106,7 @@ The following will be heading towards stabilization after changes: - Use the name `.remove_range(a..b)` instead of `.drain(a..b)`. Since the method has two simultaneous roles, removing a range and yielding a range as an iterator, - either role could guide the name. + either role could guide the name. This alternative name was not very popular with the rust developers I asked (but they are already used to what `drain` means in rust context). 
From 7c1cb23977167e706e83152d2610c9825a7b9d0e Mon Sep 17 00:00:00 2001 From: Yehuda Katz Date: Wed, 14 Oct 2015 20:30:16 -0700 Subject: [PATCH 0574/1195] Address RFC comments --- text/0000-time-improvements.md | 74 +++++++++++++++++----------------- 1 file changed, 38 insertions(+), 36 deletions(-) diff --git a/text/0000-time-improvements.md b/text/0000-time-improvements.md index 56f3bc64a2a..7e82c32566e 100644 --- a/text/0000-time-improvements.md +++ b/text/0000-time-improvements.md @@ -6,8 +6,8 @@ # Summary This RFC proposes several new types and associated APIs for working with times in Rust. -The primary new types are `ProcessTime`, for working with monotonic time within a single -process, and `SystemTime`, for working with times across processes on a single system +The primary new types are `Instance`, for working with time that is guaranteed to be +monotonic, and `SystemTime`, for working with times across processes on a single system (usually internally represented as a number of seconds since an epoch). # Motivations @@ -41,10 +41,12 @@ We would like to be able to do some basic operations with these instants: However, there are a number of problems that arise when trying to define these types and operations. -First of all, with the exception of instants created in the rutime of a single -process, instants are not monotonic. A simple example of this is that if a -program creates two files sequentially, it cannot assume that the creation time -of the second file is later than the creation time of the first file. +First of all, with the exception of moments in time created using system APIs that +guarantee monotonicity (because they were created within a single process, or +created during since the last boot), moments in time are not monotonic. +A simple example of this is that if a program creates two files sequentially, +it cannot assume that the creation time of the second file is later than the +creation time of the first file. 
This is because NTP (the network time protocol) can arbitrarily change the system clock, and can even **rewind time**. This kind of time travel means that @@ -57,8 +59,8 @@ unexpected consequences of this kind of "time travel". --- Leap seconds, which cannot be predicted, mean that it is impossible -to reliably add a number of seconds to a particular instant represented as a -human date and time ("1 million seconds from 2015-09-20 at midnight"). +to reliably add a number of seconds to a particular moment in time represented +as a human date and time ("1 million seconds from 2015-09-20 at midnight"). They also mean that seemingly simple concepts, like "1 minute", have caveats depending on exactly how they are used. Caveats related to leap seconds @@ -107,7 +109,7 @@ the case in Rust's standard library). These APIs help the programmer discover the possibility of system clock time travel, and either handle the error explicitly, or at least avoid propagating the problem into other APIs (by using `unwrap`). -It separates monotonic time (`ProcessTime`) from time derived from the system +It separates monotonic time (`Instant`) from time derived from the system clock (`SystemTime`), which must account for the possibility of time travel. This allows methods related to monotonic time to be uncaveated, while working with the system clock has more methods that return `Result`s. @@ -120,7 +122,7 @@ directly address time zones. ## Types ```rs -pub struct ProcessTime { +pub struct Instant { secs: u64, nanos: u32 } @@ -136,27 +138,27 @@ pub struct Duration { } ``` -### ProcessTime +### Instant -`ProcessTime` is the simplest of the instant types. It represents an opaque -(non-serializable!) timestamp that is guaranteed to be monotonic throughout -the timeframe of the process it was created in. +`Instant` is the simplest of the types representing moments in time. It +represents an opaque (non-serializable!) 
timestamp that is guaranteed to +be monotonic when compared to another `Instant`. > In this context, monotonic means that a timestamp created later in real-world > time will always be larger than a timestamp created earlier in real-world > time. -The `Duration` type can be used in conjunction with `ProcessTime`, and these +The `Duration` type can be used in conjunction with `Instant`, and these operations have none of the usual time-related caveats. -* Add a `Duration` to a `ProcessTime`, producing a new `ProcessTime` -* compare two `ProcessTime`s to each other -* subtract a `ProcessTime` from a later `ProcessTime`, producing a `Duration` -* ask for an amount of time elapsed since a `ProcessTime`, producing a `Duration` +* Add a `Duration` to a `Instant`, producing a new `Instant` +* compare two `Instant`s to each other +* subtract a `Instant` from a later `Instant`, producing a `Duration` +* ask for an amount of time elapsed since a `Instant`, producing a `Duration` -Asking for an amount of time elapsed from a given `ProcessTime` is a very common +Asking for an amount of time elapsed from a given `Instant` is a very common operation that is guaranteed to produce a positive `Duration`. Asking for the -difference between an earlier and a later `ProcessTime` also produces a positive +difference between an earlier and a later `Instant` also produces a positive `Duration` when used correctly. This design does not assume that negative `Duration`s are never useful, but @@ -166,29 +168,29 @@ to produce an `Err` (or `panic!`) when receiving a negative value, this design optimizes for the broadly useful positive `Duration`. ```rs -impl ProcessTime { +impl Instant { /// Panics if `earlier` is later than &self. - /// Because ProcessTime is monotonic, the only time that `earlier` should be + /// Because Instant is monotonic, the only time that `earlier` should be /// a later time is a bug in your code. 
- pub fn duration_from_earlier(&self, earlier: ProcessTime) -> Duration; + pub fn duration_from_earlier(&self, earlier: Instant) -> Duration; - /// Panics if self is later than the current time (can happen if a ProcessTime + /// Panics if self is later than the current time (can happen if a Instant /// is produced synthetically) pub fn elapsed(&self) -> Duration; } -impl Add for ProcessTime { +impl Add for Instant { type Output = SystemTime; } -impl Sub for ProcessTime { - type Output = ProcessTime; +impl Sub for Instant { + type Output = Instant; } -impl PartialEq for ProcessTime; -impl Eq for ProcessTime; -impl PartialOrd for ProcessTime; -impl Ord for ProcessTime; +impl PartialEq for Instant; +impl Eq for Instant; +impl PartialOrd for Instant; +impl Ord for Instant; ``` For convenience, several new constructors are added to `Duration`. Because any @@ -265,7 +267,7 @@ impl PartialOrd for SystemTime; impl Ord for SystemTime; ``` -The main difference from the design of `ProcessTime` is that it is impossible to +The main difference from the design of `Instant` is that it is impossible to know for sure that a `SystemTime` is in the past, even if the operation that produced it happened in the past (in real time). @@ -300,7 +302,7 @@ working with times that can move both forward and backward. # Alternatives One alternative design would be to attempt to have a single unified time -type. The rationale for now doing so is explained under Drawbacks. +type. The rationale for not doing so is explained under Drawbacks. Another possible alternative is to allow free math between instants, rather than providing operations for comparing later instants to earlier @@ -318,7 +320,7 @@ This RFC attempts to catch mistakes related to negative `Duration`s at the point where they are produced, rather than requiring all APIs that **take** a `Duration` to guard against negative values. 
-Because `Ord` is implemented on `SystemTime` and `ProcessTime`, it is +Because `Ord` is implemented on `SystemTime` and `Instant`, it is possible to compare two arbitrary times to each other first, and then use `duration_from_earlier` reliably to get a positive `Duration`. @@ -327,4 +329,4 @@ use `duration_from_earlier` reliably to get a positive `Duration`. What should `SystemTimeError` look like? This RFC leaves types related to human representations of dates and times -to a future proposal. +to a future proposal. \ No newline at end of file From 33775e6b88e13d0b52b2592f3dce21ee2b11e6f3 Mon Sep 17 00:00:00 2001 From: Simonas Kazlauskas Date: Tue, 13 Oct 2015 20:09:08 +0300 Subject: [PATCH 0575/1195] Amend RFC1228 with operator fixity and precedence --- text/1228-placement-left-arrow.md | 38 ++++++++++++++++++++++++++++++- 1 file changed, 37 insertions(+), 1 deletion(-) diff --git a/text/1228-placement-left-arrow.md b/text/1228-placement-left-arrow.md index d903a0cf12f..71f47e520e5 100644 --- a/text/1228-placement-left-arrow.md +++ b/text/1228-placement-left-arrow.md @@ -83,7 +83,9 @@ let ref_2 = in arena { value_expression }; # Detailed design -Extend the parser to parse `EXPR <- EXPR`. +Extend the parser to parse `EXPR <- EXPR`. The left arrow operator is +right-associative and has precedence higher than assignment and +binop-assignment, but lower than other binary operators. `EXPR <- EXPR` is parsed into an AST form that is desugared in much the same way that `in EXPR { BLOCK }` or `box (EXPR) EXPR` are @@ -158,6 +160,40 @@ let ref_1 = in arena <- value_expression; let ref_2 = in arena <- value_expression; ``` +## Precedence + +Finally, precedence of this operator may be defined to be anything from being +less than assignment/binop-assignment (set of right associative operators with +lowest precedence) to highest in the language. The most prominent choices are: + +1. 
Less than assignment: + + Assuming `()` never becomes a `Placer`, this resolves a pretty common + complaint that a statement such as `x = y <- z` is not clear or readable + by forcing the programmer to write `x = (y <- z)` for code to typecheck. + This, however introduces an inconsistency in parsing between `let x =` and + `x =`: `let x = (y <- z)` but `(x = z) <- y`. + +2. Same as assignment and binop-assignment: + + `x = y <- z = a <- b = c = d <- e <- f` parses as + `x = (y <- (z = (a <- (b = (c = (d <- (e <- f)))))))`. This is so far + the easiest option to implement in the compiler. + +3. More than assignment and binop-assignment, but less than any other operator: + + This is what currently this RFC proposes. This allows for various + expressions involving equality symbols and `<-` to be parsed reasonably and + consistently. For example `x = y <- z += a <- b <- c` would get parsed as `x + = ((y <- z) += (a <- (b <- c)))`. + +4. More than any operator: + + This is not a terribly interesting one, but still an option. Works well if + we want to force people enclose both sides of the operator into parentheses + most of the time. This option would get `x <- y <- z * a` parsed as `(x <- + (y <- z)) * a`. + # Unresolved questions None From 882d767d23e1bc7ee3513fe8d4e5006e49012623 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 9 Oct 2015 16:29:25 -0700 Subject: [PATCH 0576/1195] RFC: Amend `recover` with a `PanicSafe` bound Instead of a `'static` bound on the function, instead add a new marker trait, `PanicSafe`, to encapsulate the concept of exception safety as a trait bound which can be used to serve as a speed bump for users of `panic::recover`. 
--- text/1236-stabilize-catch-panic.md | 187 +++++++++++++++++++++++++++-- 1 file changed, 176 insertions(+), 11 deletions(-) diff --git a/text/1236-stabilize-catch-panic.md b/text/1236-stabilize-catch-panic.md index 6559a80b1a3..a01fc0b9391 100644 --- a/text/1236-stabilize-catch-panic.md +++ b/text/1236-stabilize-catch-panic.md @@ -5,8 +5,9 @@ # Summary -Move `std::thread::catch_panic` to `std::panic::recover` after removing the -`Send` bound from the closure parameter. +Move `std::thread::catch_panic` to `std::panic::recover` after replacing the +`Send + 'static` bounds on the closure parameter with a new `PanicSafe` +marker trait. # Motivation @@ -132,10 +133,10 @@ broken `Vec` is then observed during its destructor, leading to the eventual memory unsafety. It's important to keep in mind that panic safety in Rust is not solely limited -to memory safety. *Logical invariants* are often just as critical to keep correct -during execution and no `unsafe` code in Rust is needed to break a logical -invariant. In practice, however, these sorts of bugs are rarely observed due to -Rust's design: +to memory safety. *Logical invariants* are often just as critical to keep +correct during execution and no `unsafe` code in Rust is needed to break a +logical invariant. In practice, however, these sorts of bugs are rarely observed +due to Rust's design: * Rust doesn't expose uninitialized memory * Panics cannot be caught in a thread @@ -180,15 +181,179 @@ this RFC. At its heart, the change this RFC is proposing is to move `std::thread::catch_panic` to a new `std::panic` module and rename the function -to `recover`. Additionally, the `Send` bound from the closure parameter will be -removed (`'static` will stay), modifying the signature to be: +to `recover`. 
Additionally, the `Send + 'static` bounds on the closure parameter +will be replaced with a new trait `PanicSafe`, modifying the signature to +be: ```rust -fn recover R + 'static, R>(f: F) -> thread::Result +fn recover R + PanicSafe, R>(f: F) -> thread::Result ``` -More generally, however, this RFC also claims that this stable function does -not radically alter Rust's exception safety story (explained above). +Before analyzing this new signature, let's take a look at this new +`PanicSafe` trait. + +## An `PanicSafe` marker trait + +As discussed in the motivation section above, the current bounds of `Send + +'static` on the closure parameter are too restrictive for common use cases, but +they can serve as a "speed bump" (like poisoning on mutexes) to add to the +repertoire of mitigation strategies that Rust has by default for dealing with +panics. + +The purpose of this marker trait will be to identify patterns which do not need +to worry about exception safety and allow them by default. In situations where +exception safety *may* be concerned then an explicit annotation will be needed +to allow the usage. In other words, this marker trait will act similarly to a +"targeted `unsafe` block". + +For the implementation details, the following items will be added to the +`std::panic` module. + +```rust +pub trait PanicSafe {} +impl PanicSafe for .. {} + +impl<'a, T> !PanicSafe for &'a mut T {} +impl<'a, T: NoUnsafeCell> PanicSafe for &'a T {} +impl PanicSafe for Mutex {} + +pub trait NoUnsafeCell {} +impl NoUnsafeCell for .. {} +impl !NoUnsafeCell for UnsafeCell {} + +pub struct AssertPanicSafe(pub T); +impl PanicSafe for AssertPanicSafe {} + +impl Deref for AssertPanicSafe { + type Target = T; + fn deref(&self) -> &T { &self.0 } +} +impl DerefMut for AssertPanicSafe { + fn deref_mut(&mut self) -> &mut T { &mut self.0 } +} +``` + +Let's take a look at each of these items in detail: + +* `impl PanicSafe for .. 
{}` - this makes this trait a marker trait, implying + that a the trait is implemented for all types by default so long as the + consituent parts implement the trait. +* `impl !PanicSafe for &mut T {}` - this indicates that exception safety + needs to be handled when dealing with mutable references. Thinking about the + `recover` function, this means that the pointer could be modified inside the + block, but once it exits the data may or may not be in an invalid state. +* `impl PanicSafe for &T {}` - similarly to the above + implementation for `&mut T`, the purpose here is to highlight points where + data can be mutated across a `recover` boundary. If `&T` does not contains an + `UnsafeCell`, then no mutation should be possible and it is safe to allow. +* `impl PanicSafe for Mutex {}` - as mutexes are poisoned by default, they + are considered exception safe. +* `pub struct AssertPanicSafe(pub T);` - this is the "opt out" structure of + exception safety. Wrapping something in this type indicates an assertion that + it is exception safe and shouldn't be warned about when crossing the `recover` + boundary. Otherwise this type simply acts like a `T`. + +### Example usage + +The only consumer of the `PanicSafe` bound is the `recover` function on the +closure type parameter, and this ends up meaning that the *environment* needs to +be exception safe. In terms of error messages, this cause the compiler to emit +an error per closed-over-variable to indicate whether or not it is exception +safe to share across the boundary. + +It is also a critical design aspect that usage of `PanicSafe` or +`AssertPanicSafe` does not require `unsafe` code. As discussed above, panic +safety does not directly lead to memory safety problems in otherwise safe code. + +In the normal usage of `recover`, neither `PanicSafe` nor `AssertPanicSafe` +should be necessary to mention. 
For example when defining an FFI function: + +```rust +#[no_mangle] +pub extern fn called_from_c(ptr: *const c_char, num: i32) -> i32 { + let result = panic::recover(|| { + let s = unsafe { CStr::from_ptr(ptr) }; + println!("{}: {}", s, num); + }); + match result { + Ok(..) => 0, + Err(..) => 1, + } +} +``` + +Additionally, if FFI functions instead use normal Rust types, `AssertPanicSafe` +still need not be mentioned at all: + +```rust +#[no_mangle] +pub extern fn called_from_c(ptr: &i32) -> i32 { + let result = panic::recover(|| { + println!("{}", *ptr); + }); + match result { + Ok(..) => 0, + Err(..) => 1, + } +} +``` + +If, however, types are coming in which are flagged as not exception safe then +the `AssertPanicSafe` wrapper can be used to leverage `recover`: + +```rust +fn foo(data: &RefCell) { + panic::recover(|| { + println!("{}", data.borrow()); //~ ERROR RefCell is not panic safe + }); +} +``` + +This can be fixed with a simple assertion that the usage here is indeed +exception safe: + +```rust +fn foo(data: &RefCell) { + let data = AssertPanicSafe(data); + panic::recover(|| { + println!("{}", data.borrow()); // ok + }); +} +``` + +### Future extensions + +In the future, this RFC proposes adding the following implementation of +`PanicSafe`: + +```rust +impl PanicSafe for T {} +``` + +This implementation block encodes the "exception safe" boundary of +`thread::spawn` but is unfortunately not allowed today due to coherence rules. +If available, however, it would possibly reduce the number of false positives +which require using `AssertPanicSafe`. + +### Global complexity + +Adding a new marker trait is a pretty hefty move for the standard library. The +current marker traits, `Send` and `Sync`, are well known and are ubiquitous +throughout the ecosystem and standard library. Due to the way that these +properties are derived, adding a new marker trait can lead to a multiplicative +increase in global complexity (as all types must consider the marker trait). 
+ +With `PanicSafe`, however, it is expected that this is not the case. The +`recover` function is not intented to be used commonly outside of FFI or thread +pool-like abstractions. Within FFI the `PanicSafe` trait is typically not +mentioned due to most types being relatively simple. Thread pools, on the other +hand, will need to mention `AssertPanicSafe`, but will likely propagate panics +to avoid exposing `PanicSafe` as a bound. + +Overall, the expected idiomatic usage of `recover` should mean that `PanicSafe` +is rarely mentioned, if at all. It is intended that `AssertPanicSafe` is ideally +only necessary where it actually needs to be considered (which idiomatically +isn't too often) and even then it's lightweight to use. ## Will Rust have exceptions? From be5eb6f88e3e89059917f168123ba10dd2900f43 Mon Sep 17 00:00:00 2001 From: Barosl Lee Date: Fri, 16 Oct 2015 14:40:23 +0900 Subject: [PATCH 0577/1195] Add the text --- text/0000-panic-safe-slicing.md | 89 +++++++++++++++++++++++++++++++++ 1 file changed, 89 insertions(+) create mode 100644 text/0000-panic-safe-slicing.md diff --git a/text/0000-panic-safe-slicing.md b/text/0000-panic-safe-slicing.md new file mode 100644 index 00000000000..4794d491dee --- /dev/null +++ b/text/0000-panic-safe-slicing.md @@ -0,0 +1,89 @@ +- Feature Name: panic_safe_slicing +- Start Date: 2015-10-16 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Add "panic-safe" or "total" alternatives to the existing panicking slicing syntax. + +# Motivation + +`SliceExt::get` and `SliceExt::get_mut` can be thought as non-panicking versions of the simple +slicing syntax, `a[idx]`. However, there is no such equivalent for `a[start..end]`, `a[start..]`, +or `a[..end]`. This RFC proposes such methods to fill the gap. + +# Detailed design + +Add `get_range`, `get_range_mut`, `get_range_unchecked`, `get_range_unchecked_mut` to `SliceExt`. 
+
+`get_range` and `get_range_mut` may be implemented roughly as follows:
+
+```rust
+use std::ops::{RangeFrom, RangeTo, Range};
+use std::slice::from_raw_parts;
+use core::slice::SliceExt;
+
+trait Rangeable<T: ?Sized> {
+    fn start(&self, slice: &T) -> usize;
+    fn end(&self, slice: &T) -> usize;
+}
+
+impl<T: SliceExt + ?Sized> Rangeable<T> for RangeFrom<usize> {
+    fn start(&self, _: &T) -> usize { self.start }
+    fn end(&self, slice: &T) -> usize { slice.len() }
+}
+
+impl<T: SliceExt + ?Sized> Rangeable<T> for RangeTo<usize> {
+    fn start(&self, _: &T) -> usize { 0 }
+    fn end(&self, _: &T) -> usize { self.end }
+}
+
+impl<T: SliceExt + ?Sized> Rangeable<T> for Range<usize> {
+    fn start(&self, _: &T) -> usize { self.start }
+    fn end(&self, _: &T) -> usize { self.end }
+}
+
+trait GetRangeExt: SliceExt {
+    fn get_range<R: Rangeable<Self>>(&self, range: R) -> Option<&[Self::Item]>;
+}
+
+impl<T> GetRangeExt for [T] {
+    fn get_range<R: Rangeable<Self>>(&self, range: R) -> Option<&[T]> {
+        let start = range.start(self);
+        let end = range.end(self);
+
+        if start > end { return None; }
+        if end > self.len() { return None; }
+
+        unsafe { Some(from_raw_parts(self.as_ptr().offset(start as isize), end - start)) }
+    }
+}
+
+fn main() {
+    let a = [1, 2, 3, 4, 5];
+
+    assert_eq!(a.get_range(1..), Some(&a[1..]));
+    assert_eq!(a.get_range(..3), Some(&a[..3]));
+    assert_eq!(a.get_range(2..5), Some(&a[2..5]));
+    assert_eq!(a.get_range(..6), None);
+    assert_eq!(a.get_range(4..2), None);
+}
+```
+
+`get_range_unchecked` and `get_range_unchecked_mut` should be the unchecked versions of the methods
+above.
+
+# Drawbacks
+
+- Are these methods worth adding to `std`? Are such use cases common enough to justify such an
+  extension?
+
+# Alternatives
+
+- Stay as is.
+- Could there be any other (and better!) total functions that serve a similar purpose?
+
+# Unresolved questions
+
+- Naming, naming, naming: Is `get_range` the most suitable name? How about `get_slice`, or just
+  `slice`? Or any others?
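The two bounds checks in the proposed `get_range` (`start > end` and `end > len`) can also be written without `unsafe`. Below is a minimal sketch, assuming a hypothetical free function (also called `get_range` here, but not part of any library) built on the standard `RangeBounds` trait instead of the draft's `Rangeable` helper. It is an illustration of the checks, not the RFC's proposed implementation.

```rust
use std::ops::{Bound, RangeBounds};

// Hypothetical free function mirroring the proposed `get_range` checks.
// It uses safe indexing instead of `from_raw_parts`.
fn get_range<T, R: RangeBounds<usize>>(slice: &[T], range: R) -> Option<&[T]> {
    // Resolve the (possibly open) start bound to a concrete index.
    let start = match range.start_bound() {
        Bound::Included(&s) => s,
        Bound::Excluded(&s) => s.checked_add(1)?,
        Bound::Unbounded => 0,
    };
    // Resolve the (possibly open) end bound to an exclusive index.
    let end = match range.end_bound() {
        Bound::Included(&e) => e.checked_add(1)?,
        Bound::Excluded(&e) => e,
        Bound::Unbounded => slice.len(),
    };
    // Same two checks as in the draft: reversed or out-of-bounds ranges yield `None`.
    if start > end || end > slice.len() {
        return None;
    }
    Some(&slice[start..end])
}
```

On current stable Rust, `[T]::get` itself accepts range arguments, so `a.get(1..)` gives the same `Option<&[T]>` result as the sketch above.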
From 23793ba18b8abf3ff0f5ef1f99c238477ddf787f Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 16 Oct 2015 10:26:50 -0700 Subject: [PATCH 0578/1195] Fix typos --- text/1236-stabilize-catch-panic.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/1236-stabilize-catch-panic.md b/text/1236-stabilize-catch-panic.md index a01fc0b9391..1a14aefe5a0 100644 --- a/text/1236-stabilize-catch-panic.md +++ b/text/1236-stabilize-catch-panic.md @@ -192,7 +192,7 @@ fn recover R + PanicSafe, R>(f: F) -> thread::Result Before analyzing this new signature, let's take a look at this new `PanicSafe` trait. -## An `PanicSafe` marker trait +## A `PanicSafe` marker trait As discussed in the motivation section above, the current bounds of `Send + 'static` on the closure parameter are too restrictive for common use cases, but @@ -257,7 +257,7 @@ Let's take a look at each of these items in detail: The only consumer of the `PanicSafe` bound is the `recover` function on the closure type parameter, and this ends up meaning that the *environment* needs to -be exception safe. In terms of error messages, this cause the compiler to emit +be exception safe. In terms of error messages, this causes the compiler to emit an error per closed-over-variable to indicate whether or not it is exception safe to share across the boundary. From 54038cb160b616b00660ba6fc2558fc5341854cd Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 16 Oct 2015 10:26:56 -0700 Subject: [PATCH 0579/1195] Remove no-longer-needed questions --- text/1236-stabilize-catch-panic.md | 19 ------------------- 1 file changed, 19 deletions(-) diff --git a/text/1236-stabilize-catch-panic.md b/text/1236-stabilize-catch-panic.md index 1a14aefe5a0..f943c613293 100644 --- a/text/1236-stabilize-catch-panic.md +++ b/text/1236-stabilize-catch-panic.md @@ -441,25 +441,6 @@ roughly analogous to an opaque "an unexpected error has occurred" message. 
Stabilizing `catch_panic` does little to change the tradeoffs around `Result` and `panic` that led to these conventions. -## Why remove `Send`? - -One of the primary use cases of `recover` is in an FFI context, where lots -of `*mut` and `*const` pointers are flying around. These two types aren't -`Send` by default, so having their values cross the `catch_panic` boundary -would be highly un-ergonomic (albeit still possible). As a result, this RFC -proposes removing the `Send` bound from the function. - -## Why keep `'static`? - -This RFC proposes leaving the `'static` bound on the closure parameter for now. -There isn't a clearly strong case (such as for `Send`) to remove this parameter -just yet, and it helps mitigate exception safety issues related to shared -references across the `recover` boundary. - -There is conversely also not a clearly strong case for *keeping* this bound, but -as it's the more conservative route (and backwards compatible to remove) it will -remain for now. - # Drawbacks A drawback of this RFC is that it can water down Rust's error handling story. From 45cf48e2dfc83495767a10795e32c60b32e9c089 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Mon, 19 Oct 2015 15:40:52 +0200 Subject: [PATCH 0580/1195] first draft of dropck-eyepatch RFC. 
--- text/0000-dropck-param-eyepatch.md | 426 +++++++++++++++++++++++++++++ 1 file changed, 426 insertions(+) create mode 100644 text/0000-dropck-param-eyepatch.md diff --git a/text/0000-dropck-param-eyepatch.md b/text/0000-dropck-param-eyepatch.md new file mode 100644 index 00000000000..69526ec430c --- /dev/null +++ b/text/0000-dropck-param-eyepatch.md @@ -0,0 +1,426 @@ +- Feature Name: dropck_eyepatch +- Start Date: 2015-10-19 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Refine the unguarded-escape-hatch from [RFC 1238][] (nonparametric +dropck) so that instead of a single attribute side-stepping *all* +dropck constraints for a type's destructor, we instead have a more +focused attribute that specifies exactly which type and/or lifetime +parameters the destructor is guaranteed not to access. + +[RFC 1238]: https://github.com/rust-lang/rfcs/blob/master/text/1238-nonparametric-dropck.md +[RFC 769]: https://github.com/rust-lang/rfcs/blob/master/text/0769-sound-generic-drop.md + +# Motivation +[motivation]: #motivation + +The unguarded escape hatch (UGEH) from [RFC 1238] is a blunt +instrument: when you use `unsafe_destructor_blind_to_params`, it is +asserting that your destructor does not access borrowed data whose +type includes *any* lifetime or type parameter of the type. + +For example, the current destructor for `RawVec` (in `liballoc/`) +looks like this: + +```rust +impl Drop for RawVec { + #[unsafe_destructor_blind_to_params] + /// Frees the memory owned by the RawVec *without* trying to Drop its contents. + fn drop(&mut self) { + [... free memory using global system allocator ...] + } +} +``` + +The above is sound today, because the above destructor does not call +any methods that can access borrowed data in the values of type `T`, +and so we do not need to enforce the drop-ordering constraints imposed +when you leave out the `unsafe_destructor_blind_to_params` attribute. 
+ +While the above attribute suffices for many use cases today, it is not +fine-grain enough for other cases of interest. In particular, it +cannot express that the destructor will not access borrowed data +behind a *subset* of the type parameters. + +Here are two concrete examples of where the need for this arises: + +## Example: `CheckedHashMap` + +The original Sound Generic Drop proposal ([RFC 769][]) +had an [appendix][RFC 769 CheckedHashMap] with an example of a +`CheckedHashMap` type that called the hashcode method +for all of the keys in the map in its destructor. +This is clearly a type where we *cannot* claim that we do not access +borrowed data potentially hidden behind `K`, so it would be unsound +to use the blunt `unsafe_destructor_blind_to_params` attribute on this +type. + +However, the values of the `V` parameter to `CheckedHashMap` are, in +all likelihood, *not* accessed by the `CheckedHashMap` destructor. If +that is the case, then it should be sound to instantiate `V` with a +type that contains references to other parts of the map (e.g., +references to the keys or to other values in the map). However, we +cannot express this today: There is no way to say that the +`CheckedHashMap` will not access borrowed data that is behind *just* +`V`. + +[RFC 769 CheckedHashMap]: https://github.com/rust-lang/rfcs/blob/master/text/0769-sound-generic-drop.md#appendix-a-why-and-when-would-drop-read-from-borrowed-data + +## Example: `Vec` + +The Rust developers have been talking for [a long time][RFC Issue 538] +about adding an `Allocator` trait that would allow users to override +the allocator used for the backing storage of collection types like +`Vec` and `HashMap`. 
+ +For example, we would like to generalize the `RawVec` given above as +follows: + +```rust +#[unsafe_no_drop_flag] +pub struct RawVec { + ptr: Unique, + cap: usize, + alloc: A, +} + +impl Drop for RawVec { + #[should_we_put_ugeh_attribute_here_or_not(???)] + /// Frees the memory owned by the RawVec *without* trying to Drop its contents. + fn drop(&mut self) { + [... free memory using self.alloc ...] + } +} +``` + +However, we *cannot* soundly add an allocator parameter to a +collection that today uses the `unsafe_destructor_blind_to_params` +UGEH attribute in the destructor that deallocates, because that blunt +instrument would allow someone to write this: + +```rust +// (`ArenaAllocator`, when dropped, automatically frees its allocated blocks) + +// (Usual pattern for assigning same extent to `v` and `a`.) +let (v, a): (Vec, ArenaAllocator); + +a = ArenaAllocator::new(); +v = Vec::with_allocator(&a); + +... v.push(stuff) ... + +// at end of scope, `a` may be dropped before `v`, invalidating +// soundness of subsequent invocation of destructor for `v` (because +// that would try to free buffer of `v` via `v.buf.alloc` (== `&a`)). +``` + +The only way today to disallow the above unsound code would be to +remove `unsafe_destructor_blind_to_params` from `RawVec`/ `Vec`, which +would break other code (for example, code using `Vec` as the backing +storage for [cyclic graph structures][dropck_legal_cycles.rs]). + +[RFC Issue 538]: https://github.com/rust-lang/rfcs/issues/538 + +[dropck_legal_cycles.rs]: https://github.com/rust-lang/rust/blob/098a7a07ee6d11cf6d2b9d18918f26be95ee2f66/src/test/run-pass/dropck_legal_cycles.rs + +# Detailed design +[detailed design]: #detailed-design + + 1. Add a new fine-grained attribute, `unsafe_destructor_blind_to` + (which this RFC will sometimes call the "eyepatch", since it does + not make dropck totally blind; just blind on one "side"). + + 2. 
Remove `unsafe_destructor_blind_to_params`, since all uses of it + should be expressible via `unsafe_destructor_blind_to` (once that + has been completely implemented). + +## The "eyepatch" attribute + +Add a new attribute, `unsafe_destructor_blind_to(ARG)` (the "eyepatch"). + +The eyepatch is similar to `unsafe_destructor_blind_to_params`: it is +attached to the destructor[1](#footnote1), and it is meant +to assert that a destructor is guaranteed not to access certain kinds +of data accessible via `self`. + +The main difference is that the eyepatch has a single required +parameter, `ARG`. This is the place where you specify exactly *what* +the destructor is blind to (i.e., what will dropck treat as +inaccessible from the destructor for this type). + +There are two things one can put the `ARG` for a given eyepatch: one +of the type parameters for the type, or one of the lifetime parameters +for the type.[2](#footnote2) + +### Examples stolen from the Rustonomicon + +[nomicon dropck]: https://doc.rust-lang.org/nightly/nomicon/dropck.html + +So, adapting some examples from the Rustonomicon +[Drop Check][nomicon dropck] chapter, we would be able to write +the following. + +Example of eyepatch on a lifetime parameter:: + +```rust +struct InspectorA<'a>(&'a u8, &'static str); + +impl<'a> Drop for InspectorA<'a> { + #[unsafe_destructor_blind_to('a)] + fn drop(&mut self) { + println!("InspectorA(_, {}) knows when *not* to inspect.", self.1); + } +} +``` + +Example of eyepatch on a type parameter: + +```rust +use std::fmt; + +struct InspectorB(T, &'static str); + +impl Drop for InspectorB { + #[unsafe_destructor_blind_to(T)] + fn drop(&mut self) { + println!("InspectorB(_, {}) knows when *not* to inspect.", self.1); + } +} +``` + +Both of the above two examples are much the same as if we had used the +old `unsafe_destructor_blind_to_params` UGEH attribute. 
+ +### Example: RawVec + +To generalize `RawVec` from the [motivation](#motivation) with an +`Allocator` correctly (that is, soundly and without breaking existing +code), we would now write: + +```rust +impl Drop for RawVec { + #[unsafe_destructor_blind_to(T)] + /// Frees the memory owned by the RawVec *without* trying to Drop its contents. + fn drop(&mut self) { + [... free memory using self.alloc ...] + } +} +``` + +The use of `unsafe_destructor_blind_to(T)` here asserts that even +though the destructor may access borrowed data through `A` (and thus +dropck must impose drop-ordering constraints for lifetimes occurring +in the type of `A`), the developer is guaranteeing that no access to +borrowed data will occur via the type `T`. + +The latter is not expressible today even with +`unsafe_destructor_blind_to_params`; there is no way to say that a +type will not access `T` in its destructor while also ensuring the +proper drop-ordering relationship between `RawVec` and `A`. + +### Example; Multiple Lifetimes + +Example: The above `InspectorA` carried a `&'static str` that was +always safe to access from the destructor. + +If we wanted to generalize this type a bit, we might write: + +```rust +struct InspectorC<'a,'b,'c>(&'a str, &'b str, &'c str); + +impl<'a,'b,'c> Drop for InspectorC<'a,'b,'c> { + #[unsafe_destructor_blind_to('a)] + #[unsafe_destructor_blind_to('c)] + fn drop(&mut self) { + println!("InspectorA(_, {}, _) knows when *not* to inspect.", self.1); + } +} +``` + +This type, like `InspectorA`, is careful to only access the `&str` +that it holds in its destructor; but now the borrowed string slice +does not have `'static` lifetime, so we must make sure that we do not +claim that we are blind to its lifetime (`'b`). + +(This example also illustrates that one can attach multiple instances +of the eyepatch attribute to a destructor, each with a distinct input +for its `ARG`.) 
+ +Given the definition above, this code will compile and run properly: + +```rust +fn this_will_work() { + let b; // ensure that `b` strictly outlives `i`. + let (i,a,c); + a = format!("a"); + b = format!("b"); + c = format!("c"); + i = InspectorC(a, b, c); +} +``` + +while this code will be rejected by the compiler: + +```rust +fn this_will_not_work() { + let (a,c); + let (i,b); // OOPS: `b` not guaranteed to survive for `i`'s destructor. + a = format!("a"); + b = format!("b"); + c = format!("c"); + i = InspectorC(a, b, c); +} +``` + +## Semantics + +How does this work, you might ask? + +The idea is actually simple: the dropck rule stays mostly the same, +except for a small twist. + +The Drop-Check rule at this point essentially says: + +> if the type of `v` owns data of type `D`, where +> +> (1.) the `impl Drop for D` is either type-parametric, or lifetime-parametric over `'a`, and +> (2.) the structure of `D` can reach a reference of type `&'a _`, +> +> then `'a` must strictly outlive the scope of `v` + +The main change we want to make is to the second condition. +Instead of just saying "the structure of `D` can reach a reference of type `&'a _`", +we want first to replace eyepatched lifetimes and types within `D` with `'static` and `()`, +respectively. Call this revised type `patched(D)`. + +Then the new condition is: + +> (2.) the structure of patched(D) can reach a reference of type `&'a _`, + +*Everything* else is the same. + +In particular, the patching substitution is *only* applied with +respect to a particular destructor. Just because `Vec` is blind to `T` +does not mean that we will ignore the actual type instantiated at `T` +in terms of drop-ordering constraints. 
+ +For example, in `Vec>`, even though `Vec` +itself is blind to the whole type `InspectorC<'a, 'name, 'c>` when we +are considering the `impl Drop for Vec`, we *still* honor the +constraint that `'name` must strictly outlive the `Vec` (because we +continue to consider all `D` that is data owned by a value `v`, +including when `D` == `InspectorC<'a,'name,'c>`). + +## Prototype + +pnkfelix has implemented a proof-of-concept +[implementation][pnkfelix prototype] of this feature. +It uses the substitution machinery we already have in the compiler +to express the semantics above. + +## Limitations of prototype (not part of design) + +Here we note a few limitations of the current prototype. These +limitations are *not* being proposed as part of the specification of +the feature. + +1. The eyepatch is not attached to the +destructor in the current [prototype][pnkfelix prototype]; it is +instead attached to the `struct`/`enum` definition itself. + +2. The eyepatch is only able to accept a type +parameter, not a lifetime, in the current +[prototype][pnkfelix prototype]; it is instead attached to the +`struct`/`enum` definition itself. + +Fixing the above limitations should just be a matter of engineering, +not a fundamental hurdle to overcome in the feature's design in the +context of the language. + +[pnkfelix prototype]: https://github.com/pnkfelix/rust/commits/fsk-nonparam-blind-to-indiv + +# Drawbacks +[drawbacks]: #drawbacks + +This attribute, like the original `unsafe_destructor_blind_to_params` +UGEH attribute, is ugly. + +It would be nicer if to actually change the language in a way where we +could check the assertions being made by the programmer, rather than +trusting them. (pnkfelix has some thoughts on this, which are mostly +reflected in what he wrote in the [RFC 1238 alternatives][].) 
+
+[RFC 1238 alternatives]: https://github.com/rust-lang/rfcs/blob/master/text/1238-nonparametric-dropck.md#continue-supporting-parametricity
+
+# Alternatives
+[alternatives]: #alternatives
+
+## unsafe_destructor_blind_to(T1, T2, ...)
+
+The eyepatch could take multiple arguments, rather than requiring a
+distinct instance of the attribute for each parameter that we are
+blind to.
+
+However, I think that each usage of the attribute needs to be
+considered, since it represents a separate "attack vector" where
+unsoundness can be introduced, and therefore it deserves more than
+just a comma and a space added to the program text when it is added.
+
+(I only weakly support the latter position; it is obviously easy
+to support this form if that is deemed desirable.)
+
+## Wait for proper parametricity
+
+As alluded to in the [drawbacks][], in principle we could provide
+similar expressiveness to that offered by the eyepatch (which is
+acting as a fine-grained escape hatch from dropck) by instead offering
+some language extension where the compiler would actually analyze the
+code based on programmer annotations indicating which types and
+lifetimes are not used by a function.
+
+I am of two minds on this (but both are in favor of
+this RFC rather than waiting for a sound compiler analysis):
+
+ 1. We will always need an escape hatch. The programmer will always need
+    a way to assert something that she knows to be true, even if the compiler
+    cannot prove it. (A simple example: Calling a third-party API that has not
+    yet added the necessary annotations.)
+
+    This RFC is proposing that we keep an escape hatch, but we make it more
+    expressive.
+
+ 2. If we eventually *do* have a sound compiler analysis, I see the
+    compiler changes and library annotations suggested by this RFC as
+    being in line with what that compiler analysis would end up using
+    anyway. In other words: Assume we *did* add some way for the programmer
+    to write that `T` is parametric (e.g. `T: ?Special` in the [RFC 1238 alternatives]).
+    Even then, we would still need the compiler changes suggested by this RFC,
+    and at that point hopefully the task would be for the programmer to mechanically
+    replace occurrences of `#[unsafe_destructor_blind_to(T)]` with `T: ?Special`
+    (and then see if the library builds).
+
+    In other words, I see the form suggested by this RFC as being a step *towards*
+    a proper analysis, in the sense that it is getting programmers used to thinking
+    about the individual parameters and their relationship with the container, rather
+    than just reasoning about the container on its own without any consideration
+    of each type/lifetime parameter.
+
+## Do nothing
+
+If we do nothing, then we cannot add `Vec<T, A:Allocator>` soundly.
+
+# Unresolved questions
+[unresolved]: #unresolved-questions
+
+Is there any issue with writing `'a` in an attribute like
+`#[unsafe_destructor_blind_to('a)]`? (The prototype, as mentioned
+[above](#footnote2), does not currently accept lifetime parameter
+inputs, so I do not know the answer off hand.)
+
+Is the definition of the drop-check rule sound with this `patched(D)`
+variant? (We have not proven any previous variation of the rule
+sound; I think it would be an interesting student project though.)
From e4fd4e69e64be7bafcb83a66344775b999f91f72 Mon Sep 17 00:00:00 2001 From: Steven Fackler Date: Sun, 11 Oct 2015 22:27:33 -0400 Subject: [PATCH 0581/1195] Allow a custom panic handler --- text/0000-global-panic-handler.md | 183 ++++++++++++++++++++++++++++++ 1 file changed, 183 insertions(+) create mode 100644 text/0000-global-panic-handler.md diff --git a/text/0000-global-panic-handler.md b/text/0000-global-panic-handler.md new file mode 100644 index 00000000000..229fc04aabb --- /dev/null +++ b/text/0000-global-panic-handler.md @@ -0,0 +1,183 @@ +- Feature Name: panic_handler +- Start Date: 2015-10-08 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +When a thread panics in Rust, the unwinding runtime currently prints a message +to standard error containing the panic argument as well as the filename and +line number corresponding to the location from which the panic originated. +This RFC proposes a mechanism to allow user code to replace this logic with +custom handlers that will run before unwinding begins. + +# Motivation + +The default behavior is not always ideal for all programs: + +* Programs with command line interfaces do not want their output polluted by + random panic messages. +* Programs using a logging framework may want panic messages to be routed into + that system so that they can be processed like other events. +* Programs with graphical user interfaces may not have standard error attached + at all and want to be notified of thread panics to potentially display an + internal error dialog to the user. + +The standard library [previously +supported](https://doc.rust-lang.org/1.3.0/std/rt/unwind/fn.register.html) (in +unstable code) the registration of a set of panic handlers. This API had +several issues: + +* The system supported a fixed but unspecified number of handlers, and a + handler could never be unregistered once added. +* The callbacks were raw function pointers rather than closures. 
+* Handlers would be invoked on nested panics, which would result in a stack
+  overflow if a handler itself panicked.
+* The callbacks were specified to take the panic message, file name and line
+  number directly. This would prevent us from adding more functionality in
+  the future, such as access to backtrace information. In addition, the
+  presence of file names and line numbers for all panics causes some amount of
+  binary bloat and we may want to add some avenue to allow for the omission of
+  those values in the future.
+
+# Detailed design
+
+A new module, `std::panic`, will be created with a panic handling API:
+
+```rust
+/// Unregisters the current panic handler, returning it.
+///
+/// If no custom handler is registered, the default handler will be returned.
+///
+/// # Panics
+///
+/// Panics if called from a panicking thread. Note that this will be a nested
+/// panic and therefore abort the process.
+pub fn take_handler() -> Box<Fn(&PanicInfo) + 'static + Sync + Send> { ... }
+
+/// Registers a custom panic handler, replacing any that was previously
+/// registered.
+///
+/// # Panics
+///
+/// Panics if called from a panicking thread. Note that this will be a nested
+/// panic and therefore abort the process.
+pub fn set_handler<F>(handler: F) where F: Fn(&PanicInfo) + 'static + Sync + Send { ... }
+
+/// A struct providing information about a panic.
+pub struct PanicInfo { ... }
+
+impl PanicInfo {
+    /// Returns the payload associated with the panic.
+    ///
+    /// This will commonly, but not always, be a `&'static str` or `String`.
+    pub fn payload(&self) -> &Any + Send { ... }
+
+    /// Returns information about the location from which the panic originated,
+    /// if available.
+    pub fn location(&self) -> Option<Location> { ... }
+}
+
+/// A struct containing information about the location of a panic.
+pub struct Location<'a> { ... }
+
+impl<'a> Location<'a> {
+    /// Returns the name of the source file from which the panic originated.
+    pub fn file(&self) -> &str { ... }
+
+    /// Returns the line number from which the panic originated.
+    pub fn line(&self) -> u32 { ... }
+}
+```
+
+When a panic occurs, but before unwinding begins, the runtime will call the
+registered panic handler. After the handler returns, the runtime will then
+unwind the thread. If a thread panics while panicking (a "double panic"), the
+panic handler will *not* be invoked and the process will abort. Note that the
+thread is considered to be panicking while the panic handler is running, so a
+panic originating from the panic handler will result in a double panic.
+
+The `take_handler` method exists to allow for handlers to "chain" by closing
+over the previous handler and calling into it:
+
+```rust
+let old_handler = panic::take_handler();
+panic::set_handler(move |info| {
+    println!("uh oh!");
+    old_handler(info);
+});
+```
+
+This is obviously a racy operation, but as a single global resource, the global
+panic handler should only be adjusted by applications rather than libraries,
+most likely early in the startup process.
+
+The implementation of `set_handler` and `take_handler` will have to be
+carefully synchronized to ensure that a handler is not replaced while executing
+in another thread. This can be accomplished in a manner similar to [that used
+by the `log`
+crate](https://github.com/rust-lang-nursery/log/blob/aa8618c840dd88b27c487c9fc9571d89751583f3/src/lib.rs).
+`take_handler` and `set_handler` will wait until no other threads are currently
+running the panic handler, at which point they will atomically swap the handler
+out as appropriate.
+
+Note that `location` will always return `Some` in the current implementation.
+It returns an `Option` to hedge against possible future changes to the panic
+system that would allow a crate to be compiled with location metadata removed
+to minimize binary size.
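The chaining pattern described above can be tried on current stable Rust, where this API was stabilized under the names `std::panic::take_hook` and `std::panic::set_hook` rather than the `take_handler`/`set_handler` spelling used in this draft. The following is a minimal sketch under that assumption. The counter `PANICS_SEEN` and the helpers `install_counting_hook` and `panics_after_one_caught_panic` are hypothetical names introduced only for illustration.

```rust
use std::panic;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter, used only to make the hook's effect observable.
static PANICS_SEEN: AtomicUsize = AtomicUsize::new(0);

// Chain a custom hook in front of whichever hook was installed before.
fn install_counting_hook() {
    let previous = panic::take_hook();
    panic::set_hook(Box::new(move |info| {
        PANICS_SEEN.fetch_add(1, Ordering::SeqCst);
        // `location()` returns an `Option`, in line with the hedge described above.
        if let Some(location) = info.location() {
            eprintln!("panic at {}:{}", location.file(), location.line());
        }
        // Delegate to the previously registered hook.
        previous(info);
    }));
}

// Installs the hook, triggers one caught panic, and reports the counter.
fn panics_after_one_caught_panic() -> usize {
    install_counting_hook();
    let _ = panic::catch_unwind(|| {
        panic!("boom");
    });
    PANICS_SEEN.load(Ordering::SeqCst)
}
```

As the text notes, swapping the global hook is racy, so a real application would typically do this once, early in startup.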
+ +## Prior Art + +C++ has a +[`std::set_terminate`](http://www.cplusplus.com/reference/exception/set_terminate/) +function which registers a handler for uncaught exceptions, returning the old +one. The handler takes no arguments. + +Python passes uncaught exceptions to the global handler +[`sys.excepthook`](https://docs.python.org/2/library/sys.html#sys.excepthook) +which can be set by user code. + +In Java, uncaught exceptions [can be +handled](http://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html#setUncaughtExceptionHandler(java.lang.Thread.UncaughtExceptionHandler)) +by handlers registered on an individual `Thread`, by the `Thread`'s, +`ThreadGroup`, and by a handler registered globally. The handlers are provided +with the `Throwable` that triggered the handler. + +# Drawbacks + +The more infrastructure we add to interact with panics, the more attractive it +becomes to use them as a more normal part of control flow. + +# Alternatives + +Panic handlers could be run after a panicking thread has unwound rather than +before. This is perhaps a more intuitive arrangement, and allows `catch_panic` +to prevent panic handlers from running. However, running handlers before +unwinding allows them access to more context, for example, the ability to take +a stack trace. + +`PanicInfo::location` could be split into `PanicInfo::file` and +`PanicInfo::line` to cut down on the API size, though that would require +handlers to deal with weird cases like a line number but no file being +available. + +[RFC 1100](https://github.com/rust-lang/rfcs/pull/1100) proposed an API based +around thread-local handlers. While there are reasonable use cases for the +registration of custom handlers on a per-thread basis, most of the common uses +for custom handlers want to have a single set of behavior cover all threads in +the process. 
Being forced to remember to register a handler in every thread
+spawned in a program is tedious and error prone, and not even possible in many
+cases for threads spawned in libraries the author has no control over.
+
+While out of scope for this RFC, a future extension could add thread-local
+handlers on top of the global one proposed here in a straightforward manner.
+
+The implementation could be simplified by altering the API to store, and
+`take_handler` to return, an `Arc<Fn(&PanicInfo) + 'static + Sync + Send>` or
+a bare function pointer. This seems like a somewhat weirder API, however, and
+the implementation proposed above should not end up complex enough to justify
+the change.
+
+# Unresolved questions
+
+None at the moment.

From ceb091eb304292a6e1d9d1f444d595cc4acbcf7f Mon Sep 17 00:00:00 2001
From: "Felix S. Klock II"
Date: Mon, 19 Oct 2015 23:25:46 +0200
Subject: [PATCH 0582/1195] add note about interaction between macro hygiene and attributes.

---
 text/0000-dropck-param-eyepatch.md | 39 ++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/text/0000-dropck-param-eyepatch.md b/text/0000-dropck-param-eyepatch.md
index 69526ec430c..dd3175e2aac 100644
--- a/text/0000-dropck-param-eyepatch.md
+++ b/text/0000-dropck-param-eyepatch.md
@@ -346,14 +346,53 @@ context of the language.
 # Drawbacks
 [drawbacks]: #drawbacks
 
+## Ugliness
+
 This attribute, like the original `unsafe_destructor_blind_to_params`
 UGEH attribute, is ugly.
 
+## Unchecked assertions boo
+
 It would be nicer if to actually change the language in a way where we
 could check the assertions being made by the programmer, rather than
 trusting them. (pnkfelix has some thoughts on this, which are mostly
 reflected in what he wrote in the [RFC 1238 alternatives][].)
 
+## Attributes lack hygiene
+
+As noted by arielb1, putting type parameter identifiers into attributes
+is not likely to play well with macro hygiene.
+
+Here is a concrete example:
+
+```rust
+struct Yell2<A:Debug, B:Debug>(A, B);
+
+macro_rules! make_yell2a {
+    ($A:ident, $B:ident) => {
+        impl<$A:Debug,$B:Debug> Drop for Yell2<$A,$B> {
+            #[unsafe_destructor_blind_to(???)] // <----
+            fn drop(&mut self) {
+                println!("Yell2(_, {:?})", self.1);
+            }
+        }
+    }
+}
+
+make_yell2a!(X, Y);
+```
+
+Here is the issue: In the above, what does one put in for the `???` to
+say that we are blind to the first type parameter to `Yell2`?
+`#[unsafe_destructor_blind_to(A)]` would be nonsense, because in the instantiation of the macro, `$A` will be mapped to the identifier `X`. So perhaps we should write that it is blind to `X` -- but to me one big point of macro hygiene is that a macro definition should not have to build in knowledge of the identifiers chosen at the usage site, and this is the opposite of that.
+
+(I don't think `#[unsafe_destructor_blind_to($A)]` works, because our attribute system operates at the same meta-level that macros operate at, but I would be happy to be proven wrong.)
+
+----
+
+Despite my somewhat dire attitude above, I don't think this is a significant problem in the short term. This sort of macro is probably rare, and the combination of this macro with UGEH is doubly so. You cannot define a destructor multiple times for the same type, so it seems weird to me to abstract this code construction at this particular level.
+
+
 [RFC 1238 alternatives]: https://github.com/rust-lang/rfcs/blob/master/text/1238-nonparametric-dropck.md#continue-supporting-parametricity
 
 # Alternatives

From be75059fbf9468ab6f2be8cddce3f78503c1df94 Mon Sep 17 00:00:00 2001
From: "Felix S. Klock II"
Date: Tue, 20 Oct 2015 16:36:01 +0200
Subject: [PATCH 0583/1195] add arielb1 alternatives to the alts section.
--- text/0000-dropck-param-eyepatch.md | 105 ++++++++++++++++++++++++++++- 1 file changed, 104 insertions(+), 1 deletion(-) diff --git a/text/0000-dropck-param-eyepatch.md b/text/0000-dropck-param-eyepatch.md index dd3175e2aac..4cadb2b88fa 100644 --- a/text/0000-dropck-param-eyepatch.md +++ b/text/0000-dropck-param-eyepatch.md @@ -89,7 +89,7 @@ pub struct RawVec { } impl Drop for RawVec { - #[should_we_put_ugeh_attribute_here_or_not(???)] + #[should_we_put_ugeh_attribute_here_or_not(???)] /// Frees the memory owned by the RawVec *without* trying to Drop its contents. fn drop(&mut self) { [... free memory using self.alloc ...] @@ -316,6 +316,7 @@ continue to consider all `D` that is data owned by a value `v`, including when `D` == `InspectorC<'a,'name,'c>`). ## Prototype +[prototype]: #prototype pnkfelix has implemented a proof-of-concept [implementation][pnkfelix prototype] of this feature. @@ -359,6 +360,7 @@ trusting them. (pnkfelix has some thoughts on this, which are mostly reflected in what he wrote in the [RFC 1238 alternatives][].) ## Attributes lack hygiene +[attributes-lack-hygiene]: #attributes-lack-hygiene As noted by arielb1, putting type parameter identifiers into attributes is not likely to play well with macro hygiene. @@ -412,7 +414,108 @@ just a comma and a space added to the program text when it is added. (I only weakly support the latter position; it is obviously easy to support this form if that is deemed desirable.) +## Use a blacklist not a whitelist +[blacklist-not-whitelist]: #use-a-blacklist-not-a-whitelist + +The `unsafe_destructor_blind_to` attribute acts as a whitelist of +parameters that we are telling dropck to ignore in its analysis +of this destructor. + +We could instead add a way to list the lifetimes and/or +type-expressions (e.g. parameters, projections from parameters) that +the destructor may access (and thus treat that list as a blacklist of +parameters that dropck needs to *include* in its analysis). 
+ +arielb1 first suggested this as an attribute form +[here][blacklist attribute], but then provided a different formulation +of the idea by expressing it as a [`where`-clause][blacklist where] on +the `fn drop` method (which is what I will show in the next section). + +[blacklist attribute]: https://github.com/rust-lang/rfcs/pull/1327#issuecomment-149302743 + +[blacklist where]: https://github.com/rust-lang/rfcs/pull/1327#issuecomment-149329351 + +## Make dropck "see again" via (focused) where-clauses + +(This alternative carries over some ideas from +[the previous section][blacklist-not-whitelist], but it stands well on +its own as something to consider, so I am giving it its own section.) + +The idea is that we keep the UGEH attribute, blunt hammer that it is. +You first opt out of the dropck ordering constraints via that, and +then you add back in ordering constraints via `where` clauses. + +(The ordering constraints in question would normally be *implied* by +the dropck analysis; the point is that UGEH is opting out of that +analysis, and so we are now adding them back in.) + +Here is the allocator example expressed in this fashion: + +```rust +impl Drop for RawVec { + #[unsafe_destructor_blind_to_params] + /// Frees the memory owned by the RawVec *without* trying to Drop its contents. + fn drop<'s>(&'s mut self) where A: 's { + // ~~~~~~~~~~~ + // | + // | + // This constraint (that `A` outlives `'s`), and other conditions + // relating `'s` and `Self` are normally implied by Rust's type + // system, but `unsafe_destructor_blind_to_params` opts out of + // enforcing them. This `where`-clause is opting back into *just* + // the `A:'s` again. + // + // Note we are *still* opting out of `T: 's` via + // `unsafe_destructor_blind_to_params`, and thus our overall + // goal (of not breaking code that relies on `T` not having to + // survive the destructor call) is accomplished. + + [... free memory using self.alloc ...] 
+ } +} +``` + +This approach, if we can make it work, seems fine to me. It certainly +avoids a number of problems that the eyepatch attribute has. + +Advantages of fn-drop-with-where-clauses: + + * It completely sidesteps the [hygiene issue][attributes-lack-hygiene]. + + * If the eyepatch attribute is to be limited to identifiers (type + parameters) and lifetimes, then this approach is more expressive, + since it would allow one to put type-projections into the + constraints. + +Drawbacks of fn-drop-with-where-clauses: + + * It's not 100% clear what our implementation strategy will be for it, + while the eyepatch attribute does have a [prototype]. + + I actually do not give this drawback much weight; resolving this + may be merely a matter of just trying to do it: e.g., build up the + set of where-clauses when we make the ADT's representation, and + then have `dropck` instantiate and insert them as needed. + + * It might have the wrong ergonomics for developers: It seems bad to + have the blunt hammer introduce all sorts of potential + unsoundness, and rely on the developer to keep the set of + `where`-clauses on the `fn drop` up to date. + + This would be a pretty bad drawback, *if* the language and + compiler were to stagnate. But my intention/goal is to eventually + put in a [sound compiler analysis][wait-for-proper-parametricity]. + In other words, in the future, I will be more concerned about the + ergonomics of the code that uses the sound analysis. I will not be + concerned about "gotchas" associated with the UGEH escape hatch. + +(The most important thing I want to convey is that I believe that both +the eyepatch attribute and fn-drop-with-where-clauses are capable of +resolving the real issues that I face today, and I would be happy for +either proposal to be accepted.)
+ ## Wait for proper parametricity +[wait-for-proper-parametricity]: #wait-for-proper-parametricity As alluded to in the [drawbacks][], in principle we could provide similar expressiveness to that offered by the eyepatch (which is From 67c2049be5421c43df3aec28db4ea782c5c548ab Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Tue, 20 Oct 2015 16:45:27 +0200 Subject: [PATCH 0584/1195] I always find myself adding these anchors (or some variant thereof) while I'm drafting an RFC. Lets put them into the template so that people will get them by default. (Hopefully we don't need an RFC to decide whether to change the RFC template in this manner.) --- 0000-template.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/0000-template.md b/0000-template.md index 6685ace5dad..a45c6110e58 100644 --- a/0000-template.md +++ b/0000-template.md @@ -4,27 +4,33 @@ - Rust Issue: (leave this empty) # Summary +[summary]: #summary One para explanation of the feature. # Motivation +[motivation]: #motivation Why are we doing this? What use cases does it support? What is the expected outcome? # Detailed design +[design]: #detailed-design This is the bulk of the RFC. Explain the design in enough detail for somebody familiar with the language to understand, and for somebody familiar with the compiler to implement. This should get into specifics and corner-cases, and include examples of how the feature is used. # Drawbacks +[drawbacks]: #drawbacks Why should we *not* do this? # Alternatives +[alternatives]: #alternatives What other designs have been considered? What is the impact of not doing this? # Unresolved questions +[unresolved]: #unresolved-questions What parts of the design are still TBD? 
From 8c6042271e9e0a05d3aff1e82c4aa5b1d0c96e7a Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 20 Oct 2015 11:18:33 -0700 Subject: [PATCH 0585/1195] Fix a few minor typos --- text/0000-time-improvements.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/text/0000-time-improvements.md b/text/0000-time-improvements.md index 7e82c32566e..6e876dbc652 100644 --- a/text/0000-time-improvements.md +++ b/text/0000-time-improvements.md @@ -6,7 +6,7 @@ # Summary This RFC proposes several new types and associated APIs for working with times in Rust. -The primary new types are `Instance`, for working with time that is guaranteed to be +The primary new types are `Instant`, for working with time that is guaranteed to be monotonic, and `SystemTime`, for working with times across processes on a single system (usually internally represented as a number of seconds since an epoch). @@ -121,7 +121,7 @@ directly address time zones. ## Types -```rs +```rust pub struct Instant { secs: u64, nanos: u32 @@ -167,7 +167,7 @@ use for negative values. Rather than require each API that takes a `Duration` to produce an `Err` (or `panic!`) when receiving a negative value, this design optimizes for the broadly useful positive `Duration`. -```rs +```rust impl Instant { /// Panics if `earlier` is later than &self. /// Because Instant is monotonic, the only time that `earlier` should be @@ -180,7 +180,7 @@ impl Instant { } impl Add for Instant { - type Output = SystemTime; + type Output = Instant; } impl Sub for Instant { @@ -202,7 +202,7 @@ The "standard" terminology comes from [JodaTime][joda-time-standard]. 
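Returning to the `Instant` signatures quoted above, the following is a minimal sketch of the corrected `Add` output type in use. It relies on the API as it exists in today's `std::time`, where the method this draft calls `duration_from_earlier` is spelled `duration_since`; the helper name `elapsed_between` is hypothetical and not part of the RFC.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not from the RFC): the time elapsed between two
/// monotonic instants, using std's `duration_since`, which corresponds to
/// the draft's `duration_from_earlier`.
fn elapsed_between(earlier: Instant, later: Instant) -> Duration {
    later.duration_since(earlier)
}

fn main() {
    let start = Instant::now();

    // Adding a `Duration` to an `Instant` yields another `Instant`,
    // which is the `type Output = Instant` fix in the patch above.
    let later: Instant = start + Duration::from_secs(5);

    assert!(later > start);
    assert_eq!(elapsed_between(start, later), Duration::from_secs(5));
    println!("elapsed: {:?}", elapsed_between(start, later));
}
```

The sketch only subtracts an earlier instant from a later one, so it does not depend on how the reversed case is handled.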
[joda-time-standard]: http://joda-time.sourceforge.net/apidocs/org/joda/time/Duration.html#standardDays(long) -```rs +```rust impl Duration { /// a standard minute is 60 seconds /// panics if the number of minutes is larger than u64 seconds @@ -241,7 +241,7 @@ This design attempts to help the programmer catch the most egregious of these kinds of mistakes (unexpected travel **back in time**) before the mistake propagates. -```rs +```rust impl SystemTime { /// Returns an `Err` if `earlier` is later pub fn duration_from_earlier(&self, earlier: SystemTime) -> Result; @@ -329,4 +329,4 @@ use `duration_from_earlier` reliably to get a positive `Duration`. What should `SystemTimeError` look like? This RFC leaves types related to human representations of dates and times -to a future proposal. \ No newline at end of file +to a future proposal. From 0d8c21fd65beb9c858b127862b5de4f3607e733d Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 20 Oct 2015 11:18:42 -0700 Subject: [PATCH 0586/1195] Add now() constructors --- text/0000-time-improvements.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/text/0000-time-improvements.md b/text/0000-time-improvements.md index 6e876dbc652..178d5d57dec 100644 --- a/text/0000-time-improvements.md +++ b/text/0000-time-improvements.md @@ -169,6 +169,9 @@ optimizes for the broadly useful positive `Duration`. ```rust impl Instant { + /// Returns an instant corresponding to "now". + pub fn now() -> Instant; + /// Panics if `earlier` is later than &self. /// Because Instant is monotonic, the only time that `earlier` should be /// a later time is a bug in your code. @@ -243,6 +246,9 @@ propagates. ```rust impl SystemTime { + /// Returns the system time corresponding to "now". 
+ pub fn now() -> SystemTime; + /// Returns an `Err` if `earlier` is later pub fn duration_from_earlier(&self, earlier: SystemTime) -> Result; From 6a3745ab0cb6ed73ff184b8ef60b07ead2604e6f Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 20 Oct 2015 11:22:08 -0700 Subject: [PATCH 0587/1195] Add UNIX_EPOCH SystemTime constant --- text/0000-time-improvements.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/text/0000-time-improvements.md b/text/0000-time-improvements.md index 178d5d57dec..07a9319eb8c 100644 --- a/text/0000-time-improvements.md +++ b/text/0000-time-improvements.md @@ -264,6 +264,13 @@ impl Sub for SystemTime { type Output = SystemTime; } +// An anchor which can be used to generate new SystemTime instances from a known +// Duration or convert a SystemTime to a Duration which can later then be used +// again to recreate the SystemTime. +// +// Defined to be "1970-01-01 00:00:00 UTC" on all systems. +const UNIX_EPOCH: SystemTime = ...; + // Note that none of these operations actually imply that the underlying system // operation that produced these SystemTimes happened at the same time // (for Eq) or before/after (for Ord) than the other system operation. From 353b17f9cece309efcbbbbb4f046046c62099891 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 20 Oct 2015 11:24:33 -0700 Subject: [PATCH 0588/1195] Add a `duration` method to SystemTimeError --- text/0000-time-improvements.md | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/text/0000-time-improvements.md b/text/0000-time-improvements.md index 07a9319eb8c..c6d2f77a47e 100644 --- a/text/0000-time-improvements.md +++ b/text/0000-time-improvements.md @@ -278,6 +278,13 @@ impl PartialEq for SystemTime; impl Eq for SystemTime; impl PartialOrd for SystemTime; impl Ord for SystemTime; + +impl SystemTimeError { + /// A SystemTimeError originates from attempting to subtract two SystemTime + /// instances, a and b. 
If a < b then an error is returned, and the duration + /// returned represents (b - a). + pub fn duration(&self) -> Duration; +} ``` The main difference from the design of `Instant` is that it is impossible to @@ -339,7 +346,5 @@ use `duration_from_earlier` reliably to get a positive `Duration`. # Unresolved Questions -What should `SystemTimeError` look like? - This RFC leaves types related to human representations of dates and times to a future proposal. From 44e88d3a30fda3deee8330377e9517f79d194630 Mon Sep 17 00:00:00 2001 From: Simonas Kazlauskas Date: Thu, 22 Oct 2015 00:41:18 +0300 Subject: [PATCH 0589/1195] First draft on making src/grammar the grammar --- text/0000-grammar-is-canonical.md | 97 +++++++++++++++++++++++++++++++ 1 file changed, 97 insertions(+) create mode 100644 text/0000-grammar-is-canonical.md diff --git a/text/0000-grammar-is-canonical.md b/text/0000-grammar-is-canonical.md new file mode 100644 index 00000000000..17eaccd55e2 --- /dev/null +++ b/text/0000-grammar-is-canonical.md @@ -0,0 +1,97 @@ +- Feature Name: grammar +- Start Date: 2015-10-21 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary +[src/grammar]: https://github.com/rust-lang/rust/tree/master/src/grammar + +Grammar of the Rust language should not be rustc implementation-defined. We have a formal grammar +at [src/grammar] which is to be used as the canonical and formal representation of the Rust +language. 
+ +# Motivation +[motivation]: #motivation +[#1228]: https://github.com/rust-lang/rfcs/blob/master/text/1228-placement-left-arrow.md +[#1219]: https://github.com/rust-lang/rfcs/blob/master/text/1219-use-group-as.md +[#1192]: https://github.com/rust-lang/rfcs/blob/master/text/1192-inclusive-ranges.md + +In many RFCs proposing syntactic changes ([#1228], [#1219] and [#1192] being some of more recently +merged RFCs) the changes are described rather informally and are hard to both implement and +discuss which also leads to discussions containing a lot of guess-work. + +Making [src/grammar] to be the canonical grammar and demanding for description of syntactic changes +to be presented in terms of changes to the formal grammar should greatly simplify both the +discussion and implementation of the RFCs. Using a formal grammar also allows us to discover and +rule out existence of various issues with the grammar changes (e.g. grammar ambiguities) during +design phase rather than implementation phase or, even worse, after the stabilisation. + +# Detailed design +[design]: #detailed-design +[A-grammar]: https://github.com/rust-lang/rust/issues?utf8=✓&q=is:issue+is:open+label:A-grammar + +Sadly, the [grammar][src/grammar] in question is [not quite equivalent][A-grammar] to the +implementation in rustc yet. We cannot possibly hope to catch all the quirks in the rustc parser +implementation, therefore something else needs to be done. + +This RFC proposes following approach to making [src/grammar] the canonical Rust language grammar: + +1. Fix the already known discrepancies between implementation and [src/grammar]; +2. Make [src/grammar] a [semi-canonical grammar]; +3. After a period of time transition [src/grammar] to a [fully-canonical grammar]. 
+ +## Semi-canonical grammar +[semi-canonical grammar]: #semi-canonical-grammar + +Once all known discrepancies between the [src/grammar] and rustc parser implementation are +resolved, [src/grammar] enters the state of being the semi-canonical grammar of the Rust language. + +Semi-canonical means that all new development involving syntax changes is made and discussed in +terms of changes to the [src/grammar] and [src/grammar] is in general regarded as the canonical +grammar except when new discrepancies are discovered. These discrepancies must be swiftly resolved, +but resolution will depend on what kind of discrepancy it is: + +1. For syntax changes/additions introduced after [src/grammar] gained the semi-canonical state, the + [src/grammar] is canonical; +2. For syntax that was present before [src/grammar] gained the semi-canonical state, in most cases + the implementation is canonical. + +This process is sure to become ambiguous over time as syntax is increasingly adjusted (it is harder +to “blame” syntax changes compared to syntax additions), therefore the resolution process of +discrepancies will also depend more on a decision from the Rust team. + +## Fully-canonical grammar +[fully-canonical grammar]: #fully-canonical-grammar + +After some time passes, [src/grammar] will transition to the state of fully canonical grammar. +After [src/grammar] transitions into this state, for any discovered discrepancies the +rustc parser implementation must be adjusted to match the [src/grammar], unless decided otherwise +by the RFC process. + +## RFC process changes for syntactic changes and additions + +Once the [src/grammar] enters semi-canonical state, all RFCs must describe syntax additions and +changes in terms of the formal [src/grammar]. Discussions about these changes are also expected +(though not guaranteed) to become more formal and easier to follow.
+ +# Drawbacks +[drawbacks]: #drawbacks + +This RFC introduces a period of ambiguity during which neither the implementation nor [src/grammar] is +a truly canonical representation of the Rust language. This will be less of an issue over time as +discrepancies are resolved, but it's an issue nevertheless. + +# Alternatives +[alternatives]: #alternatives + +One alternative would be to immediately make [src/grammar] a fully-canonical grammar of the Rust +language at some arbitrary point in the future. + +Another alternative is to simply forget the idea of having a formal grammar be the canonical grammar of +the Rust language. + +# Unresolved questions +[unresolved]: #unresolved-questions + +How much time should pass between [src/grammar] becoming semi-canonical and fully-canonical? From 5f69ff50de1fb6d0dd8c005b4f11f6e436e1f34c Mon Sep 17 00:00:00 2001 From: John Hodge Date: Sat, 24 Oct 2015 16:57:24 +0800 Subject: [PATCH 0590/1195] The order `const unsafe fn` was chosen (rust-lang/rust#29107) --- text/0911-const-fn.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0911-const-fn.md b/text/0911-const-fn.md index de4aae48e51..388d6213c14 100644 --- a/text/0911-const-fn.md +++ b/text/0911-const-fn.md @@ -178,7 +178,7 @@ invariants to be maintained (e.g. `std::ptr::Unique` requires a non-null pointer struct OptionalInt(u32); impl OptionalInt { /// Value must be non-zero - unsafe const fn new(val: u32) -> OptionalInt { + const unsafe fn new(val: u32) -> OptionalInt { OptionalInt(val) } } @@ -241,4 +241,4 @@ cannot be taken for granted, at least `if`/`else` should eventually work. Since it was accepted, the RFC has been updated as follows: -1. Allowed `unsafe const fn` +1.
Allowed `const unsafe fn` From a3ffe694c62a2b6b8d165bef22ce64858f65ebdf Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 29 Oct 2015 10:20:16 -0700 Subject: [PATCH 0591/1195] RFC 1291 is promoting libc to rust-lang --- text/{0000-promote-libc.md => 1291-promote-libc.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename text/{0000-promote-libc.md => 1291-promote-libc.md} (100%) diff --git a/text/0000-promote-libc.md b/text/1291-promote-libc.md similarity index 100% rename from text/0000-promote-libc.md rename to text/1291-promote-libc.md From aff7879ee0cb44fd61349022a4ca8aa9a1d97f3a Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 29 Oct 2015 10:22:29 -0700 Subject: [PATCH 0592/1195] Add links for 1291 --- text/1291-promote-libc.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/1291-promote-libc.md b/text/1291-promote-libc.md index 127826b397e..92166b753fa 100644 --- a/text/1291-promote-libc.md +++ b/text/1291-promote-libc.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2015-09-21 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1291](https://github.com/rust-lang/rfcs/pull/1291) +- Rust Issue: N/A # Summary From ffe3f8b6ad61da89e0b9607ee78212cf442a8fe7 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 29 Oct 2015 10:23:25 -0700 Subject: [PATCH 0593/1195] RFC 1307 is some additional OsStr{,ing} methods --- text/{0000-osstring-methods.md => 1307-osstring-methods.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-osstring-methods.md => 1307-osstring-methods.md} (91%) diff --git a/text/0000-osstring-methods.md b/text/1307-osstring-methods.md similarity index 91% rename from text/0000-osstring-methods.md rename to text/1307-osstring-methods.md index 42b265df776..51d4ca1991d 100644 --- a/text/0000-osstring-methods.md +++ b/text/1307-osstring-methods.md @@ -1,7 +1,7 @@ -- Feature Name: osstring_simple_functions +- Feature Name: 
`osstring_simple_functions` - Start Date: 2015-10-04 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1307](https://github.com/rust-lang/rfcs/pull/1307) +- Rust Issue: [rust-lang/rust#29453](https://github.com/rust-lang/rust/issues/29453) # Summary From b7597d1c807d79c5b7e30c728c3a9991aa18834f Mon Sep 17 00:00:00 2001 From: llogiq Date: Fri, 30 Oct 2015 00:42:25 +0100 Subject: [PATCH 0594/1195] improvements thanks to brson's comments --- text/0000-deprecation.md | 58 +++++++++++++++++++--------------------- 1 file changed, 27 insertions(+), 31 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index 3ef33765724..9fe4f438de0 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -9,7 +9,7 @@ This RFC proposes to allow library authors to use a `#[deprecated]` attribute, with optional `since = "`*version*`"`, `reason = "`*free text*`"` and `use = "`*substitute declaration*`"` fields. The compiler can then warn on deprecated items, while `rustdoc` can document their deprecation -accordingly. +accordingly. # Motivation @@ -29,29 +29,26 @@ Public API items (both plain `fn`s, methods, trait- and inherent fields and enum variants) can be given a `#[deprecated]` attribute. All possible fields are optional: -* `since` is defined to contain the exact version of the crate that -deprecated the item, as defined by Cargo.toml (thus following the semver -scheme). It makes no sense to put a version number higher than the current -newest version here, and this is not checked (but could be by external -lints, e.g. [rust-clippy](https://github.com/Manishearth/rust-clippy). -To maximize usefulness, the version should be fully specified (e.g. no -wildcards or ranges). +* `since` is defined to contain the version of the crate at the time of +deprecating the item, following the semver scheme. 
It makes no sense to put a +version number higher than the current newest version here, and this is not +checked (but could be by external lints, e.g. +[rust-clippy](https://github.com/Manishearth/rust-clippy). * `reason` should contain a human-readable string outlining the reason for deprecating the item. While this field is not required, library authors are -strongly advised to make use of it to convey the reason to users of their -library. The string is required to be plain unformatted text (for now) so that -rustdoc can include it in the item's documentation without messing up the -formatting. -* `use` should be the full path to an API item that will replace the -functionality of the deprecated item, optionally (if the replacement is in a -different crate) followed by `@` and either a crate name (so that -`https://crates.io/crates/` followed by the name is a live link) or the URL to -a repository or other location where a surrogate can be obtained. Links must be -plain FTP, FTPS, HTTP or HTTPS links. The intention is to allow rustdoc (and -possibly other tools in the future, e.g. IDEs) to act on the included -information. The `use` field can have multiple values. - -On use of a *deprecated* item, `rustc` should `warn` of the deprecation. Note +strongly advised to make use of it to convey the reason for the deprecation to +users of their library. The string is interpreted as plain unformatted text +(for now) so that rustdoc can include it in the item's documentation without +messing up the formatting. +* `use`, if included, must be the import path (or a comma-separated list of +paths) to a set of API items that will replace the functionality of the +deprecated item. All crates in scope can be reached by this path. E.g. let's +say my `foo()` item was superceded by either the `bar()` or `baz()` functions +in the `bar` crate, I can `#[deprecate(use="bar::{bar,baz}")] foo()`, as long +as I have the `bar` crate in the library path. 
Rustc checks if the item is +actually available, otherwise returning an error. + +On use of a *deprecated* item, `rustc` will `warn` of the deprecation. Note that during Cargo builds, warnings on dependencies get silenced. Note that while this has the upside of keeping things tidy, it has a downside when it comes to deprecation: @@ -62,11 +59,11 @@ try to build `foobar` directly. We may want to create a service like `crater` to warn on use of deprecated items in library crates, however this is outside the scope of this RFC. -`rustdoc` should show deprecation on items, with a `[deprecated since x.y.z]` -box that may optionally show the reason and/or link to the replacement if -available. +`rustdoc` will show deprecation on items, with a `[deprecated]` +box that may optionally show the version, reason and/or link to the replacement +if available. -The language reference should be extended to describe this feature as outlined +The language reference will be extended to describe this feature as outlined in this RFC. Authors shall be advised to leave their users enough time to react before *removing* a deprecated item. @@ -83,7 +80,7 @@ quite complex. * Do nothing * make the `since` field required and check that it's a single version -* Optionally the deprecation lint chould check the current version as set by +* Optionally the deprecation lint could check the current version as set by cargo in the CARGO_CRATE_VERSION environment variable (the rust build process should set this environment variable, too). This would allow future deprecations to be shown in the docs early, but not warned against by the @@ -93,13 +90,12 @@ be `Allow` by default) * `reason` could include markdown formatting * The `use` could simply be plain text, which would remove much of the complexity here -* The `use` field contents could make use of the context in finding -replacements, e.g. 
extern crates, so that `time::precise_time_ns` would resolve -to the `time::precise_time_ns` API in the `time` crate, provided an -`extern crate time;` declaration is present * The `use` field could be left out and added later. However, this would lead people to describe a replacement in the `reason` field, as is already happening in the case of rustc-private deprecation +* Optionally, `cargo` could offer a new dependency category: "doc-dependencies" +which are used to pull in other crates' documentations to link them (this is +obviously not only relevant to deprecation). # Unresolved questions From c8a726c090c1cdb9acb9d95304ca4f8f021363f6 Mon Sep 17 00:00:00 2001 From: llogiq Date: Fri, 30 Oct 2015 05:56:41 +0100 Subject: [PATCH 0595/1195] Change `use` to semicolon-delimited, expand example --- text/0000-deprecation.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index 9fe4f438de0..1c2168974cb 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -40,13 +40,14 @@ strongly advised to make use of it to convey the reason for the deprecation to users of their library. The string is interpreted as plain unformatted text (for now) so that rustdoc can include it in the item's documentation without messing up the formatting. -* `use`, if included, must be the import path (or a comma-separated list of +* `use`, if included, must be the import path (or a semicolon-delimited list of paths) to a set of API items that will replace the functionality of the deprecated item. All crates in scope can be reached by this path. E.g. let's say my `foo()` item was superceded by either the `bar()` or `baz()` functions -in the `bar` crate, I can `#[deprecate(use="bar::{bar,baz}")] foo()`, as long -as I have the `bar` crate in the library path. Rustc checks if the item is -actually available, otherwise returning an error. 
+in the `bar` crate, in conjunction with the `bruzz(_)` function in the `baz` +crate, I can `#[deprecate(use="bar::{bar,baz};baz::bruzz")] foo()`, as long +as I have the `bar` and `baz` crates in the library path. Rustc checks if the +item is actually available, otherwise returning an error. On use of a *deprecated* item, `rustc` will `warn` of the deprecation. Note that during Cargo builds, warnings on dependencies get silenced. Note that From d1f520f7ee38a9583e1393e6dd6cad156a000b67 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 6 Nov 2015 14:05:56 -0500 Subject: [PATCH 0596/1195] RFC 1298 is incremental compilation --- ...incremental-compilation.md => 1298-incremental-compilation.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename text/{0000-incremental-compilation.md => 1298-incremental-compilation.md} (100%) diff --git a/text/0000-incremental-compilation.md b/text/1298-incremental-compilation.md similarity index 100% rename from text/0000-incremental-compilation.md rename to text/1298-incremental-compilation.md From 6c4f9f8e15005086789ec773a75d362392c73a8d Mon Sep 17 00:00:00 2001 From: Andrew Paseltiner Date: Mon, 9 Nov 2015 13:42:54 -0500 Subject: [PATCH 0597/1195] Fix typo in RFC 505 Closes #1356. --- text/0505-api-comment-conventions.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0505-api-comment-conventions.md b/text/0505-api-comment-conventions.md index 9db62ad3f76..a0243053ae4 100644 --- a/text/0505-api-comment-conventions.md +++ b/text/0505-api-comment-conventions.md @@ -65,7 +65,7 @@ The first line in any doc comment should be a single-line short sentence providing a summary of the code. This line is used as a summary description throughout Rustdoc's output, so it's a good idea to keep it short. -All doc comments, including the summary line, should be property punctuated. +All doc comments, including the summary line, should be properly punctuated. Prefer full sentences to fragments. 
The summary line should be written in third person singular present indicative From 0ed16cd15b93e62fc12f79fd77e56c2176e73b24 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 9 Nov 2015 15:26:49 -0800 Subject: [PATCH 0598/1195] RFC: Add #[repr(align = "N")] Extend the existing `#[repr]` attribute on structs with an `align = "N"` option to specify a custom alignment for `struct` types. [Rendered][link] [link]: https://github.com/alexcrichton/rfcs/blob/repr-align/text/0000-repr-align.md --- text/0000-repr-align.md | 104 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 104 insertions(+) create mode 100644 text/0000-repr-align.md diff --git a/text/0000-repr-align.md b/text/0000-repr-align.md new file mode 100644 index 00000000000..16d6e734957 --- /dev/null +++ b/text/0000-repr-align.md @@ -0,0 +1,104 @@ +- Feature Name: `repr_align` +- Start Date: 2015-11-09 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Extend the existing `#[repr]` attribute on structs with an `align = "N"` option +to specify a custom alignment for `struct` types. + +# Motivation +[motivation]: #motivation + +The alignment of a type is normally not worried about as the compiler will "do +the right thing" of picking an appropriate alignment for general use cases. +There are situations, however, where a nonstandard alignment may be desired when +operating with foreign systems. For example these sorts of situations tend to +necessitate or be much easier with a custom alignment: + +* Hardware can often have obscure requirements such as "this structure is + aligned to 32 bytes" when it in fact is only composed of 4-byte values. While + this can typically be manually calculated and managed, it's often also useful + to express this as a property of a type to get the compiler to do a little + extra work instead. 
+* C compilers like gcc and clang offer the ability to specify a custom alignment + for structures, and Rust can much more easily interoperate with these types if + Rust can also mirror the request for a custom alignment (e.g. passing a + structure to C correctly is much easier). +* Custom alignment can often be used for various tricks here and there and is + often convenient as "let's play around with an implementation" tool. For + example this can be used to statically allocate page tables in a kernel + or create an at-least cache-line-sized structure easily for concurrent + programming. + +Currently these sort of situations are possible in Rust but aren't necessarily +the most ergonomic as programmers must manually manage alignment. The purpose of +this RFC is to provide a lightweight annotation to alter the compiler-inferred +alignment of a structure to enable these situations much more easily. + +# Detailed design +[design]: #detailed-design + +The `#[repr]` attribute on `struct`s will be extended to include a form such as: + +```rust +#[repr(align = "16")] +struct MoreAligned(i32); +``` + +This structure will still have an alignment of 16 (as returned by +`mem::align_of`), and in this case the size will also be 16. + +Syntactically, the `repr` meta list will be extended to accept a meta item +name/value pair with the name "align" and the value as a string which can be +parsed as a `u64`. The restrictions on where this attribute can be placed along +with the accepted values are: + +* Custom alignment can only be specified on `struct` declarations for now. + Specifying a different alignment on perhaps `enum` or `type` definitions + should be a backwards-compatible extension. +* Alignment values must be a power of two. + +A custom alignment cannot *decrease* the alignment of a structure unless it is +also declared with `#[repr(packed)]` (to mirror what C does in this regard), but +it can increase the alignment (and hence size) of a structure (as shown +above). 
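As a rough illustration of the size and alignment claims above, the following sketch checks them with `std::mem`. It assumes the `#[repr(align(16))]` spelling that current stable Rust accepts, not the `align = "16"` form proposed in this draft; `Plain` and `layout_of` are hypothetical names introduced only for comparison.

```rust
use std::mem;

// Hypothetical baseline for comparison: same field, default alignment.
#[allow(dead_code)]
struct Plain(i32);

// Assumption: `align(16)` is the spelling accepted by current stable Rust;
// this draft proposes `align = "16"` for the same request.
#[allow(dead_code)]
#[repr(align(16))]
struct MoreAligned(i32);

/// Returns `(size, alignment)` in bytes for `T`.
fn layout_of<T>() -> (usize, usize) {
    (mem::size_of::<T>(), mem::align_of::<T>())
}
```

Comparing `layout_of::<Plain>()` with `layout_of::<MoreAligned>()` should show the size growing along with the alignment, since a type's size is always a multiple of its alignment.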
+ +Semantically, it will be guaranteed (modulo `unsafe` code) that custom alignment +will always be respected. If a pointer to a non-aligned structure exists and is +used then it is considered unsafe behavior. Local variables, objects in arrays, +statics, etc, will all respect the custom alignment specified for a type. + +# Drawbacks +[drawbacks]: #drawbacks + +Specifying a custom alignment isn't always necessarily easy to do so via a +literal integer value. It may require usage of `#[cfg_attr]` in some situations +and may otherwise be much more convenient to name a different type instead. +Working with a raw integer, however, should provide the building block for +building up other abstractions and should be maximally flexible. It also +provides a relatively straightforward implementation and understanding of the +attribute at hand. + +This also currently does not allow for specifying the custom alignment of a +struct field (as C compilers also allow doing) without the usage of a newtype +structure. Currently `#[repr]` is not recognized here, but it would be a +backwards compatible extension to start reading it on struct fields. + +# Alternatives +[alternatives]: #alternatives + +Instead of using the `#[repr]` attribute as the "house" for the custom +alignment, there could instead be a new `#[align = "..."]` attribute. This is +perhaps more extensible to alignment in other locations such as a local variable +(with attributes on expressions), a struct field (where `#[repr]` is more of an +"outer attribute"), or enum variants perhaps. + +# Unresolved questions +[unresolved]: #unresolved-questions + +* It is likely best to simply match the semantics of C/C++ in the regard of + custom alignment, but is it ensured that this RFC is the same as the behavior + of standard C compilers? 
From fa5b757ee9e2ad09b32583a383a7dad3ac7e67f9 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 9 Nov 2015 16:05:51 -0800 Subject: [PATCH 0599/1195] RFC: Add CommandExt::{exec, before_exec} Add two methods to the `std::os::unix::process::CommandExt` trait to provide more control over how processes are spawned on Unix, specifically: ```rust fn exec(&mut self) -> io::Error; fn before_exec<F>(&mut self, f: F) -> &mut Self; ``` [Rendered][link] [link]: https://github.com/alexcrichton/rfcs/blob/process-ext/text/0000-process-ext-unix.md --- text/0000-process-ext-unix.md | 126 ++++++++++++++++++++++++++++++++++ 1 file changed, 126 insertions(+) create mode 100644 text/0000-process-ext-unix.md diff --git a/text/0000-process-ext-unix.md b/text/0000-process-ext-unix.md new file mode 100644 index 00000000000..cc67f148dff --- /dev/null +++ b/text/0000-process-ext-unix.md @@ -0,0 +1,126 @@ +- Feature Name: `process_exec` +- Start Date: 2015-11-09 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Add two methods to the `std::os::unix::process::CommandExt` trait to provide +more control over how processes are spawned on Unix, specifically: + +```rust +fn exec(&mut self) -> io::Error; +fn before_exec<F>(&mut self, f: F) -> &mut Self + where F: FnOnce() -> io::Result<()> + Send + Sync + 'static; +``` + +# Motivation +[motivation]: #motivation + +Although the standard library's implementation of spawning processes on Unix is +relatively complex, it unfortunately doesn't provide the same flexibility as +calling `fork` and `exec` manually. For example, these sorts of use cases are +not possible with the `Command` API: + +* The `exec` function cannot be called without `fork`. It's often useful on Unix + in doing this to avoid spawning processes or improve debuggability if the + pre-`exec` code was some form of shim. +* Execute other flavorful functions between the fork/exec if necessary.
For + example some proposed extensions to the standard library are [dealing with the + controlling tty][tty] or dealing with [session leaders][session]. In theory + any sort of arbitrary code can be run between these two syscalls, and it may + not always be the case the standard library can provide a suitable + abstraction. + +[tty]: https://github.com/rust-lang/rust/pull/28982 +[session]: https://github.com/rust-lang/rust/pull/26470 + +Note that neither of these pieces of functionality are possible on Windows as +there is no equivalent of the `fork` or `exec` syscalls in the standard APIs, so +these are specifically proposed as methods on the Unix extension trait. + +# Detailed design +[design]: #detailed-design + +The following two methods will be added to the +`std::os::unix::process::CommandExt` trait: + +```rust +/// Performs all the required setup by this `Command`, followed by calling the +/// `execvp` syscall. +/// +/// On success this function will not return, and otherwise it will return an +/// error indicating why the exec (or another part of the setup of the +/// `Command`) failed. +/// +/// Note that the process may be in a "broken state" if this function returns in +/// error. For example the working directory, environment variables, signal +/// handling settings, various user/group information, or aspects of stdio +/// file descriptors may have changed. If a "transactional spawn" is required to +/// gracefully handle errors it is recommended to use the cross-platform `spawn` +/// instead. +fn exec(&mut self) -> io::Error; + +/// Schedules a closure to be run just before the `exec` function is invoked. +/// +/// This closure will be run in the context of the child process after the +/// `fork` and other aspects such as the stdio file descriptors and working +/// directory have successfully been changed. 
Note that this is often a very +/// constrained environment where normal operations like `malloc` or acquiring a +/// mutex are not guaranteed to work (due to other threads perhaps still running +/// when the `fork` was run). +/// +/// The closure is allowed to return an I/O error whose OS error code will be +/// communicated back to the parent and returned as an error from the call that +/// requested the spawn. +/// +/// Multiple closures can be registered and they will be called in order of +/// their registration. If a closure returns `Err` then no further closures will +/// be called and the spawn operation will immediately return with a failure. +fn before_exec<F>(&mut self, f: F) -> &mut Self + where F: FnOnce() -> io::Result<()> + Send + Sync + 'static; +``` + +The `exec` function is relatively straightforward: it is basically the entire +spawn operation minus the `fork`. The stdio handles will be inherited by default +if not otherwise configured. Note that a configuration of `piped` will likely +just end up with a broken half of a pipe on one of the file descriptors. + +The `before_exec` function has extra-restrictive bounds to preserve the same +qualities that the `Command` type has (notably `Send`, `Sync`, and `'static`). +The closure also runs after all other configuration has happened to ensure that +libraries can take advantage of the other operations on `Command` without having +to reimplement them manually in some circumstances. + +# Drawbacks +[drawbacks]: #drawbacks + +This change may be a breaking change to `Command` as it will no +longer implement all marker traits by default (due to it containing closure +trait objects). While the common marker traits are handled here, it's possible +that there are some traits in use in the wild which this could break. + +Much of the functionality which may initially get funneled through `before_exec` +may actually be best implemented as functions in the standard library itself.
+It's likely that many operations are well known across unixes and aren't niche +enough to stay outside the standard library. + +# Alternatives +[alternatives]: #alternatives + +Instead of souping up `Command` the type could instead provide accessors to all +of the configuration that it contains. This would enable this sort of +functionality to be built on crates.io first instead of requiring it to be built +into the standard library to start out with. Note that this may want to end up +in the standard library regardless, however. + +# Unresolved questions +[unresolved]: #unresolved-questions + +* Is it appropriate to run callbacks just before the `exec`? Should they instead + be run before any standard configuration like stdio has run? +* Is it possible to provide "transactional semantics" to the `exec` function + such that it is safe to recover from? Perhaps it's worthwhile to provide + partial transactional semantics in the form of "this can be recovered from so + long as all stdio is inherited". From 367e3096f064d52da462f3cee183ea88ea97dc66 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 10 Nov 2015 10:51:42 -0800 Subject: [PATCH 0600/1195] RFC: Improve Cargo target-specific dependencies Improve the target-specific dependency experience in Cargo by leveraging the same `#[cfg]` syntax that Rust has. 
[Rendered](https://github.com/alexcrichton/rfcs/blob/cargo-cfg-dependencies/text/0000-cargo-cfg-dependencies.md) --- text/0000-cargo-cfg-dependencies.md | 168 ++++++++++++++++++++++++++++ 1 file changed, 168 insertions(+) create mode 100644 text/0000-cargo-cfg-dependencies.md diff --git a/text/0000-cargo-cfg-dependencies.md b/text/0000-cargo-cfg-dependencies.md new file mode 100644 index 00000000000..b55c1d27207 --- /dev/null +++ b/text/0000-cargo-cfg-dependencies.md @@ -0,0 +1,168 @@ +- Feature Name: N/A +- Start Date: 2015-11-10 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Improve the target-specific dependency experience in Cargo by leveraging the +same `#[cfg]` syntax that Rust has. + +# Motivation +[motivation]: #motivation + +Currently in Cargo it's [relatively painful][issue] to list target-specific +dependencies. This can only be done by listing out the entire target string as +opposed to using the more-convenient `#[cfg]` annotations that Rust source code +has access to. Consequently a Windows-specific dependency ends up having to be +defined for four triples: `{i686,x86_64}-pc-windows-{gnu,msvc}`, and this is +unfortunately not forwards compatible as well! + +[issue]: https://github.com/rust-lang/cargo/issues/1007 + +As a result most crates end up unconditionally depending on target-specific +dependencies and rely on the crates themselves to have the relevant `#[cfg]` to +only be compiled for the right platforms. This experience leads to excessive +downloads, excessive compilations, and overall "unclean methods" to have a +platform specific dependency. + +This RFC proposes leveraging the same familiar syntax used in Rust itself to +define these dependencies. 
+ +# Detailed design +[design]: #detailed-design + +The target-specific dependency syntax in Cargo will be expanded to include +not only full target strings but also `#[cfg]` expressions: + +```toml +[target."cfg(windows)".dependencies] +winapi = "0.2" + +[target."cfg(unix)".dependencies] +unix-socket = "0.4" + +[target."cfg(target_os = \"macos\")".dependencies] +core-foundation = "0.2" +``` + +Specifically, the "target" listed here is considered special if it starts with +the string "cfg(" and ends with ")". If this is not true then Cargo will +continue to treat it as an opaque string and pass it to the compiler via +`--target` (Cargo's current behavior). + +> **Note**: There's an [issue open against TOML][toml-issue] to support +> single-quoted keys allowing more ergonomic syntax in some cases like: +> +> ```toml +> [target.'cfg(target_os = "macos")'.dependencies] +> core-foundation = "0.2" +> ``` + +[toml-issue]: https://github.com/toml-lang/toml/issues/354 + +Cargo will implement its own parser of this syntax inside the `cfg` expression; +it will not rely on the compiler itself. The grammar, however, will be the same +as the compiler for now: + +``` +cfg := "cfg(" meta-item * ")" +meta-item := ident | + ident "=" string | + ident "(" meta-item * ")" +``` + +Like Rust, Cargo will implement the `any`, `all`, and `not` operators for the +`ident(list)` syntax. The last missing piece is simply to understand what `ident` +and `ident = "string"` values are defined for a particular target. To learn this +information Cargo will query the compiler via a new command line flag: + +``` +$ rustc --print cfg +unix +target_os="macos" +target_pointer_width="64" +... + +$ rustc --print cfg --target i686-pc-windows-msvc +windows +target_os="windows" +target_pointer_width="32" +... +``` + +The `--print cfg` command line flag will print out all built-in `#[cfg]` +directives defined by the compiler onto standard output.
Each cfg will be +printed on its own line to allow external parsing. Cargo will use this to call +the compiler once (or twice if an explicit target is requested) when resolution +starts, and it will use these key/value pairs to execute the `cfg` queries in +the dependency graph being constructed. + +# Drawbacks +[drawbacks]: #drawbacks + +This is not a forwards-compatible extension to Cargo, so this will break +compatibility with older Cargo versions. If a crate is published with a Cargo +that supports this `cfg` syntax, it will not be buildable by a Cargo that does +not understand the `cfg` syntax. The registry itself is prepared to handle this +sort of situation as the "target" string is just opaque, however. + +This can be perhaps mitigated via a number of strategies: + +1. Have crates.io reject the `cfg` syntax until the implementation has landed on + stable Cargo for at least one full cycle. Applications, path dependencies, + and git dependencies would still be able to use this syntax, but crates.io + wouldn't be able to leverage it immediately. +2. Crates on crates.io wishing for compatibility could simply hold off on using + this syntax until this implementation has landed in stable Cargo for at least + a full cycle. This would mean that everyone could use it immediately but "big + crates" would be advised to hold off for compatibility for awhile. +3. Have crates.io rewrite dependencies as they're published. If you publish a + crate with a `cfg(windows)` dependency then crates.io could expand this to + all known triples which match `cfg(windows)` when storing the metadata + internally. This would mean that crates using `cfg` syntax would continue to + be compatible with older versions of Cargo so long as they were only used as + a crates.io dependency. + +For ease of implementation this RFC would recommend strategy (1) to help ease +this into the ecosystem without too much pain in terms of compatibility or +implementation. 
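The evaluation step from the detailed design, matching `cfg` expressions against the key/value pairs reported by the compiler, can be sketched roughly as follows. This is a minimal illustration and not Cargo's implementation: the `Cfg` and `CfgExpr` types and the `parse_print_cfg` and `eval` functions are hypothetical names, and parsing of the `cfg(...)` string itself is left out.

```rust
use std::collections::HashSet;

/// One line of `rustc --print cfg` output: a bare name such as `unix`
/// or a key/value pair such as `target_os="macos"`.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Cfg {
    Name(String),
    KeyPair(String, String),
}

/// Hypothetical AST for an already-parsed `cfg(...)` expression.
#[derive(Debug)]
enum CfgExpr {
    Value(Cfg),
    Not(Box<CfgExpr>),
    All(Vec<CfgExpr>),
    Any(Vec<CfgExpr>),
}

/// Turns the line-oriented compiler output into a set of cfgs.
fn parse_print_cfg(output: &str) -> HashSet<Cfg> {
    output
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(|line| match line.split_once('=') {
            // `key="value"`: strip the surrounding quotes from the value.
            Some((key, value)) => {
                Cfg::KeyPair(key.to_string(), value.trim_matches('"').to_string())
            }
            None => Cfg::Name(line.to_string()),
        })
        .collect()
}

/// Evaluates an expression against the cfgs reported for one target.
fn eval(expr: &CfgExpr, cfgs: &HashSet<Cfg>) -> bool {
    match expr {
        CfgExpr::Value(cfg) => cfgs.contains(cfg),
        CfgExpr::Not(inner) => !eval(inner, cfgs),
        CfgExpr::All(items) => items.iter().all(|e| eval(e, cfgs)),
        CfgExpr::Any(items) => items.iter().any(|e| eval(e, cfgs)),
    }
}
```

Under this sketch, a dependency table keyed by `cfg(windows)` would be kept for a target only when `eval` returns `true` for the set parsed from that target's `--print cfg` output.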
+ +# Alternatives +[alternatives]: #alternatives + +Instead of using Rust's `#[cfg]` syntax, Cargo could support other options such +as patterns over the target string. For example it could accept something along +the lines of: + +```toml +[target."*-pc-windows-*".dependencies] +winapi = "0.2" + +[target."*-apple-*".dependencies] +core-foundation = "0.2" +``` + +While certainly more flexible than today's implementation, it unfortunately is +relatively error prone and doesn't cover all the use cases one may want: + +* Matching against a string isn't necessarily guaranteed to be robust moving + forward into the future. +* This doesn't support negation and other operators, e.g. `all(unix, not(osx))`. +* This doesn't support meta-families like `cfg(unix)`. + +Another possible alternative would be to have Cargo supply pre-defined families +such as `windows` and `unix` as well as the above pattern matching, but this +eventually just moves into the territory of what `#[cfg]` already provides but +may not always quite get there. + +# Unresolved questions +[unresolved]: #unresolved-questions + +* This is not the only known change to Cargo that is not + forwards-compatible, so it may be best to lump them all together into one + Cargo release instead of releasing them over time, but should this be blocked + on those ideas?
(note they have not been formed into an RFC yet) + + From 06964898db0ce8f376b2f7f214bed004ea59d2ab Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 11 Nov 2015 17:09:36 -0800 Subject: [PATCH 0601/1195] Final clarifications * monotonic is not always increasing, just never decreasing * the `SystemTime` is in UTC * typos --- text/0000-time-improvements.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/text/0000-time-improvements.md b/text/0000-time-improvements.md index c6d2f77a47e..9b0ee1fe92a 100644 --- a/text/0000-time-improvements.md +++ b/text/0000-time-improvements.md @@ -145,7 +145,7 @@ represents an opaque (non-serializable!) timestamp that is guaranteed to be monotonic when compared to another `Instant`. > In this context, monotonic means that a timestamp created later in real-world -> time will always be larger than a timestamp created earlier in real-world +> time will always be not less than a timestamp created earlier in real-world > time. The `Duration` type can be used in conjunction with `Instant`, and these @@ -162,7 +162,7 @@ difference between an earlier and a later `Instant` also produces a positive `Duration` when used correctly. This design does not assume that negative `Duration`s are never useful, but -rather than the most common uses of `Duration` do not have a meaningful +rather that the most common uses of `Duration` do not have a meaningful use for negative values. Rather than require each API that takes a `Duration` to produce an `Err` (or `panic!`) when receiving a negative value, this design optimizes for the broadly useful positive `Duration`. @@ -227,7 +227,8 @@ impl Duration { benchmarks.** A `SystemTime` represents a time stored on the local machine derived from the -system clock. For example, it is used to represent `mtime` on the file system. +system clock (in UTC). For example, it is used to represent `mtime` on the file +system. 
The most important caveat of `SystemTime` is that it is **not monotonic**. This means that you can save a file to the file system, then save another file to From 66417584b3ce91e44f92858d5ca50658ab75e677 Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Fri, 13 Nov 2015 17:08:36 -0500 Subject: [PATCH 0602/1195] Don't throw away endpoint after exhausting range --- text/1192-inclusive-ranges.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/text/1192-inclusive-ranges.md b/text/1192-inclusive-ranges.md index 243702745a0..de512694888 100644 --- a/text/1192-inclusive-ranges.md +++ b/text/1192-inclusive-ranges.md @@ -27,7 +27,9 @@ more dots means more elements. ```rust pub enum RangeInclusive { - Empty, + Empty { + at: T, + }, NonEmpty { start: T, end: T, @@ -85,7 +87,7 @@ The `Empty` variant could be omitted, leaving two options: - `RangeInclusive` could be a struct including a `finished` field. - `a...b` only implements `IntoIterator`, not `Iterator`, by converting to a different type that does have the field. However, - this means that `a...b` behaves differently to `a..b`, so + this means that `a.. 
.b` behaves differently to `a..b`, so `(a...b).map(|x| ...)` doesn't work (the `..` version of that is used reasonably often, in the author's experience) - `a...b` can implement `Iterator` for types that can be stepped From f9ad4ed215350656bb3d89c96ae7724d38877d26 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 16 Nov 2015 11:56:23 -0500 Subject: [PATCH 0603/1195] Accept RFC #1300 -- intrinsic semantics --- ...000-intrinsic-semantics.md => 1300-intrinsic-semantics.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-intrinsic-semantics.md => 1300-intrinsic-semantics.md} (96%) diff --git a/text/0000-intrinsic-semantics.md b/text/1300-intrinsic-semantics.md similarity index 96% rename from text/0000-intrinsic-semantics.md rename to text/1300-intrinsic-semantics.md index f5fdc4a67a3..a4e3ffbe551 100644 --- a/text/0000-intrinsic-semantics.md +++ b/text/1300-intrinsic-semantics.md @@ -1,7 +1,7 @@ - Feature Name: intrinsic-semantics - Start Date: 2015-09-29 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1300 +- Rust Issue: N/A # Summary From bf37ca136a157fd2229b9dab026bf48ee903eb4d Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 16 Nov 2015 10:17:13 -0800 Subject: [PATCH 0604/1195] RFC 1288 is expanding std::time --- ...{0000-time-improvements.md => 1288-time-improvements.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-time-improvements.md => 1288-time-improvements.md} (98%) diff --git a/text/0000-time-improvements.md b/text/1288-time-improvements.md similarity index 98% rename from text/0000-time-improvements.md rename to text/1288-time-improvements.md index 9b0ee1fe92a..d480051c3dd 100644 --- a/text/0000-time-improvements.md +++ b/text/1288-time-improvements.md @@ -1,7 +1,7 @@ -- Feature Name: time_improvements +- Feature Name: `time_improvements` - Start Date: 2015-09-20 -- RFC PR: (leave this empty) -- Rust Issue: (leave this 
empty) +- RFC PR: [rust-lang/rfcs#1288](https://github.com/rust-lang/rfcs/pull/1288) +- Rust Issue: [rust-lang/rust#29866](https://github.com/rust-lang/rust/issues/29866) # Summary From 894fc45e48138775d4298ff64b72172101b34b62 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Tue, 17 Nov 2015 15:51:29 +1300 Subject: [PATCH 0605/1195] WIP --- text/0000-ide.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/text/0000-ide.md b/text/0000-ide.md index 8a0ab3eaf92..a6c9662c86e 100644 --- a/text/0000-ide.md +++ b/text/0000-ide.md @@ -294,8 +294,8 @@ Takes a span, returns all 'definitions and declarations' for the identifier covered by the span. Can return an error if the span does not cover exactly one identifier or the oracle has no data for an identifier. -The returned data is a list of 'defintion' data. That data includes the span for -the item, any documentation for the item, a code snippet for the item, +The returned data is a list of 'definition' data. That data includes the span +for the item, any documentation for the item, a code snippet for the item, optionally a type for the item, and one or more kinds of definition (e.g., 'variable definition', 'field definition', 'function declaration'). @@ -317,8 +317,8 @@ Question: are these useful/necessary? Or should users just call *get definition* *search for identifier* Takes a search string or an id, and a struct of search parameters including case -sensitivity, and the kind of items to search (e.g., functions, traits, all -items). Returns a list of spans and code snippets. +sensitivity, the scope of the search, and the kind of items to search (e.g., +functions, traits, all items). Returns a list of spans and code snippets. **Code completion** @@ -430,12 +430,12 @@ the let statement, and a `}` to terminate the `main` function). A solution to the first problem is replacing invalid names with some magic identifier, and ignoring errors involving that identifier. 
@sanxiyn implemented -something like the second feature in a [PR](https://github.com/rust- -lang/rust/pull/21323). His approach was to take a command line argument for -where to 'complete at' and to treat that as the magic identifier. An alternate -approach would be to use a keyword or distinguished identifier which the IDE -could insert (based on the caret position), or to fallback to the magic -identifier whenever there is a name resolution error. +something like the second feature in a +[PR](https://github.com/rust-lang/rust/pull/21323). His approach was to take a +command line argument for where to 'complete at' and to treat that as the magic +identifier. An alternate approach would be to use a keyword or distinguished +identifier which the IDE could insert (based on the caret position), or to +fallback to the magic identifier whenever there is a name resolution error. Similarly during type checking, if we find a mismatched or unknown type, we should try to continue type checking with the information available so as to From 78da1e95c5b9de4c1654a3e1cd5672117b065e97 Mon Sep 17 00:00:00 2001 From: Sean Griffin Date: Tue, 1 Sep 2015 21:07:18 -0600 Subject: [PATCH 0606/1195] Allow overlapping implementations for marker traits --- ...llow-overlapping-impls-on-marker-traits.md | 113 ++++++++++++++++++ 1 file changed, 113 insertions(+) create mode 100644 0000-allow-overlapping-impls-on-marker-traits.md diff --git a/0000-allow-overlapping-impls-on-marker-traits.md b/0000-allow-overlapping-impls-on-marker-traits.md new file mode 100644 index 00000000000..2172e786f4c --- /dev/null +++ b/0000-allow-overlapping-impls-on-marker-traits.md @@ -0,0 +1,113 @@ +- Feature Name: Allow overlapping impls for marker traits +- Start Date: 2015-09-02 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Preventing overlapping implementations of a trait makes complete sense in the +context of determining method dispatch. 
There must not be ambiguity in what code +will actually be run for a given type. However, for marker traits, there are no +associated methods for which to indicate ambiguity. There is no harm in a type +being marked as `Sync` for multiple reasons. + +# Motivation + +This is purely to improve the ergonomics of adding/implementing marker traits. +While specialization will certainly make all cases not covered today possible, +removing the restriction entirely will improve the ergonomics in several edge +cases. + +# Detailed design + +For the purpose of this RFC, the definition of a marker trait is a trait with no +associated functions, which does not inherit from any other trait. The design +here is quite straightforward. The following code fails to compile today: + +```rust +trait Marker<A> {} + +struct GenericThing<A, B> { + a: A, + b: B, +} + +impl<A, B> Marker<GenericThing<A, B>> for A {} +impl<A, B> Marker<GenericThing<A, B>> for B {} +``` + +The two impls are considered overlapping, as there is no way to prove currently +that `A` and `B` are not the same type. However, in the case of marker traits, +there is no actual reason that they couldn't be overlapping, as no code could +actually change based on the `impl`. + +For a concrete use case, consider some setup like the following: + +```rust +trait QuerySource { + fn select>(&self, columns: C) -> SelectSource { + ... + } +} + +trait Column {} +trait Table: QuerySource {} +trait Selectable: Column {} + +impl> Selectable for C {} +``` + +However, when the following becomes introduced: + +```rust +struct JoinSource<Left, Right> { + left: Left, + right: Right, +} + +impl<Left, Right> QuerySource for JoinSource<Left, Right> where + Left: Table + JoinTo, + Right: Table, +{ + ... +} +``` + +It becomes impossible to satisfy the requirements of `select`.
The following +impl is disallowed today: + +```rust +impl Selectable> for C where + Left: Table + JoinTo, + Right: Table, + C: Column, +{} + +impl Selectable> for C where + Left: Table + JoinTo, + Right: Table, + C: Column, +{} +``` + +Since `Left` and `Right` might be the same type, this causes an overlap. +However, there's also no reason to forbid the overlap. There is no way to work +around this today. Even if you write an impl that is more specific about the +tables, that would be considered a non-crate local blanket implementation. The +only way to write it today is to specify each column individually. + +# Drawbacks + +With this change, adding any methods to an existing marker trait, even +defaulted, would be a breaking change. Once specialization lands, this could +probably be considered an acceptable breakage. + +# Alternatives + +Once specialization lands, there does not appear to be a case that is impossible +to write, albeit with some additional boilerplate, as you'll have to manually +specify the empty impl for any overlap that might occur. + +# Unresolved questions + +None at this time. From 6773e6ff93f4fbdbf78e71d01ba4b8cc9a50a4ce Mon Sep 17 00:00:00 2001 From: Sean Griffin Date: Thu, 8 Oct 2015 15:52:30 -0600 Subject: [PATCH 0607/1195] Update the definition of "marker trait" The consensus has been that inheriting from a trait which has associated items does not introduce coherence violations for the subtrait. --- 0000-allow-overlapping-impls-on-marker-traits.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/0000-allow-overlapping-impls-on-marker-traits.md b/0000-allow-overlapping-impls-on-marker-traits.md index 2172e786f4c..1dfcba35587 100644 --- a/0000-allow-overlapping-impls-on-marker-traits.md +++ b/0000-allow-overlapping-impls-on-marker-traits.md @@ -21,8 +21,8 @@ cases. 
# Detailed design For the purpose of this RFC, the definition of a marker trait is a trait with no -associated functions, which does not inherit from any other trait. The design -here is quite straightforward. The following code fails to compile today: +associated items. The design here is quite straightforward. The following code +fails to compile today: ```rust trait Marker {} From 9b40614a64de1c247ef95b43cbe1f4e6d169c2dc Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 16 Nov 2015 12:08:32 -0500 Subject: [PATCH 0608/1195] Accept RFC #1268. Add notes on motivation and unresolved questions. --- ...llow-overlapping-impls-on-marker-traits.md | 29 ++++++++++++++++++- 1 file changed, 28 insertions(+), 1 deletion(-) rename 0000-allow-overlapping-impls-on-marker-traits.md => text/1268-allow-overlapping-impls-on-marker-traits.md (73%) diff --git a/0000-allow-overlapping-impls-on-marker-traits.md b/text/1268-allow-overlapping-impls-on-marker-traits.md similarity index 73% rename from 0000-allow-overlapping-impls-on-marker-traits.md rename to text/1268-allow-overlapping-impls-on-marker-traits.md index 1dfcba35587..0df23acbb26 100644 --- a/0000-allow-overlapping-impls-on-marker-traits.md +++ b/text/1268-allow-overlapping-impls-on-marker-traits.md @@ -18,6 +18,14 @@ While specialization will certainly make all cases not covered today possible, removing the restriction entirely will improve the ergonomics in several edge cases. +Some examples include: + +- the coercible trait design presents at [RFC #91][91]; +- the `ExnSafe` trait proposed in [RFC #1236][1236]. + +[91]: https://github.com/rust-lang/rfcs/pull/91 +[1236]: https://github.com/rust-lang/rfcs/pull/1236 + # Detailed design For the purpose of this RFC, the definition of a marker trait is a trait with no @@ -110,4 +118,23 @@ specify the empty impl for any overlap that might occur. # Unresolved questions -None at this time. +How can we implement this design? 
Simply lifting the coherence +restrictions is easy enough, but we will encounter some challenges +when we come to test whether a given trait impl holds. For example, if +we have something like: + +```rust +impl MarkerTrait for T { } +impl MarkerTrait for T { } +``` + +means that a type `Foo: MarkerTrait` can hold *either* by `Foo: Send` +*or* by `Foo: Sync`. Today, we prefer to break down an obligation like +`Foo: MarkerTrait` into component obligations (e.g., `Foo: Send`). Due +to coherence, there is always one best way to do this (sort of --- +where clauses complicate matters). That is, except for complications +due to type inference, there is a best impl to choose. But under this +proposal, there would not be. Experimentation is needed (similar +concerns arise with the proposals around specialization, so it may be +that progress on that front will answer the questions raised here). + From 71981be4b3be7402b69afc773b2936a28dcb49a5 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 16 Nov 2015 12:10:14 -0500 Subject: [PATCH 0609/1195] Add links to tracking issue etc for RFC #1268 --- text/1268-allow-overlapping-impls-on-marker-traits.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/1268-allow-overlapping-impls-on-marker-traits.md b/text/1268-allow-overlapping-impls-on-marker-traits.md index 0df23acbb26..48e0bcdd09b 100644 --- a/text/1268-allow-overlapping-impls-on-marker-traits.md +++ b/text/1268-allow-overlapping-impls-on-marker-traits.md @@ -1,7 +1,7 @@ - Feature Name: Allow overlapping impls for marker traits - Start Date: 2015-09-02 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1268 +- Rust Issue: https://github.com/rust-lang/rust/issues/29864 # Summary From 56f7d8745337a75a8e8f7fd89b5a5aa4b70be77a Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Tue, 17 Nov 2015 17:29:56 -0500 Subject: [PATCH 0610/1195] Add unresolved question for overlapping impls on marker 
traits --- text/1268-allow-overlapping-impls-on-marker-traits.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/text/1268-allow-overlapping-impls-on-marker-traits.md b/text/1268-allow-overlapping-impls-on-marker-traits.md index 48e0bcdd09b..0df66d3faa7 100644 --- a/text/1268-allow-overlapping-impls-on-marker-traits.md +++ b/text/1268-allow-overlapping-impls-on-marker-traits.md @@ -118,7 +118,7 @@ specify the empty impl for any overlap that might occur. # Unresolved questions -How can we implement this design? Simply lifting the coherence +**How can we implement this design?** Simply lifting the coherence restrictions is easy enough, but we will encounter some challenges when we come to test whether a given trait impl holds. For example, if we have something like: @@ -138,3 +138,6 @@ proposal, there would not be. Experimentation is needed (similar concerns arise with the proposals around specialization, so it may be that progress on that front will answer the questions raised here). +**Should we add some explicit way to indicate that this is a marker +trait?** This would address the drawback that adding items is a +backwards incompatible change. From 8ec2ccd11da542b6d5fb914a2485a2b27682c447 Mon Sep 17 00:00:00 2001 From: llogiq Date: Thu, 19 Nov 2015 09:21:29 +0100 Subject: [PATCH 0611/1195] relegated use to alternatives, user story, clippy --- text/0000-deprecation.md | 96 ++++++++++++++++++++++++---------------- 1 file changed, 58 insertions(+), 38 deletions(-) diff --git a/text/0000-deprecation.md b/text/0000-deprecation.md index 1c2168974cb..0528bb261f7 100644 --- a/text/0000-deprecation.md +++ b/text/0000-deprecation.md @@ -6,10 +6,9 @@ # Summary This RFC proposes to allow library authors to use a `#[deprecated]` attribute, -with optional `since = "`*version*`"`, `reason = "`*free text*`"` and -`use = "`*substitute declaration*`"` fields. 
The compiler can then -warn on deprecated items, while `rustdoc` can document their deprecation -accordingly. +with optional `since = "`*version*`"` and `reason = "`*free text*`"`fields. The +compiler can then warn on deprecated items, while `rustdoc` can document their +deprecation accordingly. # Motivation @@ -30,29 +29,20 @@ fields and enum variants) can be given a `#[deprecated]` attribute. All possible fields are optional: * `since` is defined to contain the version of the crate at the time of -deprecating the item, following the semver scheme. It makes no sense to put a -version number higher than the current newest version here, and this is not -checked (but could be by external lints, e.g. -[rust-clippy](https://github.com/Manishearth/rust-clippy). +deprecating the item, following the semver scheme. Rustc does not know about +versions, thus the content of this field is not checked (but will be by external +lints, e.g. [rust-clippy](https://github.com/Manishearth/rust-clippy). * `reason` should contain a human-readable string outlining the reason for deprecating the item. While this field is not required, library authors are strongly advised to make use of it to convey the reason for the deprecation to users of their library. The string is interpreted as plain unformatted text (for now) so that rustdoc can include it in the item's documentation without messing up the formatting. -* `use`, if included, must be the import path (or a semicolon-delimited list of -paths) to a set of API items that will replace the functionality of the -deprecated item. All crates in scope can be reached by this path. E.g. let's -say my `foo()` item was superceded by either the `bar()` or `baz()` functions -in the `bar` crate, in conjunction with the `bruzz(_)` function in the `baz` -crate, I can `#[deprecate(use="bar::{bar,baz};baz::bruzz")] foo()`, as long -as I have the `bar` and `baz` crates in the library path. 
Rustc checks if the -item is actually available, otherwise returning an error. On use of a *deprecated* item, `rustc` will `warn` of the deprecation. Note -that during Cargo builds, warnings on dependencies get silenced. Note that -while this has the upside of keeping things tidy, it has a downside when it -comes to deprecation: +that during Cargo builds, warnings on dependencies get silenced. While this has +the upside of keeping things tidy, it has a downside when it comes to +deprecation: Let's say I have my `llogiq` crate that depends on `foobar` which uses a deprecated item of `serde`. I will never get the warning about this unless I @@ -60,9 +50,8 @@ try to build `foobar` directly. We may want to create a service like `crater` to warn on use of deprecated items in library crates, however this is outside the scope of this RFC. -`rustdoc` will show deprecation on items, with a `[deprecated]` -box that may optionally show the version, reason and/or link to the replacement -if available. +`rustdoc` will show deprecation on items, with a `[deprecated]` box that may +optionally show the version and reason where available. The language reference will be extended to describe this feature as outlined in this RFC. Authors shall be advised to leave their users enough time to react @@ -71,38 +60,69 @@ before *removing* a deprecated item. The internally used feature can either be subsumed by this or possibly renamed to avoid a name clash. +# Intended Use + +Crate author Anna wants to evolve her crate's API. She has found that one +type, `Foo`, has a better implementation in the `rust-foo` crate. Also she has +written a `frob(Foo)` function to replace the earlier `Foo::frobnicate(self)` +method. + +So Anna first bumps the version of her crate (because deprecation is always +done on a version change) from `0.1.1` to `0.2.1`. 
She also adds the following +prefix to the `Foo` type: + +``` +extern crate rust_foo; + +#[deprecated(since = "0.2.1", use="rust_foo::Foo", + reason="The rust_foo version is more advanced, and this crates' will likely be discontinued")] +struct Foo { .. } +``` + +Users of her crate will see the following once they `cargo update` and `build`: + +``` +src/foo_use.rs:27:5: 27:8 warning: Foo is marked deprecated as of version 0.2.1 +src/foo_use.rs:27:5: 27:8 note: The rust_foo version is more advanced, and this crates' will likely be discontinued +``` + +Rust-clippy will likely gain more sophisticated checks for deprecation: + +* `future_deprecation` will warn on items marked as deprecated, but with a +version lower than their crates', while `current_deprecation` will warn only on +those items marked as deprecated where the version is equal or lower to the +crates' one. +* `deprecation_syntax` will check that the `since` field really contains a +semver number and not some random string. + +Clippy users can then activate the clippy checks and deactivate the standard +deprecation checks. + # Drawbacks -* The required checks for the `since` and `use` fields are potentially -quite complex. * Once the feature is public, we can no longer change its design # Alternatives * Do nothing * make the `since` field required and check that it's a single version -* Optionally the deprecation lint could check the current version as set by -cargo in the CARGO_CRATE_VERSION environment variable (the rust build process -should set this environment variable, too). 
This would allow future -deprecations to be shown in the docs early, but not warned against by the -stability lint (there could however be a `future-deprecation` lint that should -be `Allow` by default) * require either `reason` or `use` be present * `reason` could include markdown formatting -* The `use` could simply be plain text, which would remove much of the -complexity here -* The `use` field could be left out and added later. However, this would -lead people to describe a replacement in the `reason` field, as is already -happening in the case of rustc-private deprecation +* rename the `reason` field to `note` to clarify it's broader usage. +* add a `note` field and make `reason` a field with specific meaning, perhaps +even predefine a number of valid reason strings, as JEP277 currently does +* Add a `use` field containing a plain text of what to use instead +* Add a `use` field containing a path to some function, type, etc. to replace +the current feature. Currently with the rustc-private feature, people are +describing a replacement in the `reason` field, which is clearly not the +original intention of the field * Optionally, `cargo` could offer a new dependency category: "doc-dependencies" which are used to pull in other crates' documentations to link them (this is -obviously not only relevant to deprecation). +obviously not only relevant to deprecation) # Unresolved questions * What other restrictions should we introduce now to avoid being bound to a possibly flawed design? -* How should the multiple values in the `use` field work? Just split by -comma or some other delimiter? * Can / Should the `std` library make use of the `#[deprecated]` extensions? * Bikeshedding: Are the names good enough? 
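As a reading aid for the patch above, here is a minimal sketch of how the proposed attribute is used on an item. It makes two assumptions that are not in the patch: the functions `frob` and `frobnicate` are hypothetical free functions that only borrow names from the "Intended Use" story, and the free-text field is spelled `note`, as in the attribute that compilers ship, because the draft's `reason` spelling is not accepted by current toolchains.

```rust
// Hypothetical crate API, used only for illustration.

/// The replacement function.
pub fn frob(x: u32) -> u32 {
    x * 2
}

// The draft in this patch calls the free-text field `reason`; the shipped
// attribute spells it `note`, so that spelling is used here.
#[deprecated(since = "0.2.1", note = "use `frob` instead")]
pub fn frobnicate(x: u32) -> u32 {
    frob(x)
}

fn main() {
    // Using a deprecated item still compiles; rustc only warns, and the
    // warning can be silenced at the use site.
    #[allow(deprecated)]
    let old = frobnicate(21);
    let new = frob(21);
    assert_eq!(old, new);
    println!("old = {}, new = {}", old, new);
}
```

Without the `#[allow(deprecated)]` line the program still builds, and the compiler prints a deprecation warning that includes the note text.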
From eced7db19ab95858011121ddd3666c23649725b5 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 19 Nov 2015 10:37:42 -0800 Subject: [PATCH 0612/1195] RFC 1270 is #[deprecated] --- ...000-deprecation.md => 1270-deprecation.md} | 40 +++++++++---------- 1 file changed, 20 insertions(+), 20 deletions(-) rename text/{0000-deprecation.md => 1270-deprecation.md} (92%) diff --git a/text/0000-deprecation.md b/text/1270-deprecation.md similarity index 92% rename from text/0000-deprecation.md rename to text/1270-deprecation.md index 0528bb261f7..9aeac97ecdb 100644 --- a/text/0000-deprecation.md +++ b/text/1270-deprecation.md @@ -1,19 +1,19 @@ - Feature Name: Public Stability - Start Date: 2015-09-03 -- RFC PR: -- Rust Issue: +- RFC PR: [rust-lang/rfcs#1270](https://github.com/rust-lang/rfcs/pull/1270) +- Rust Issue: [rust-lang/rust#29935](https://github.com/rust-lang/rust/issues/29935) # Summary This RFC proposes to allow library authors to use a `#[deprecated]` attribute, -with optional `since = "`*version*`"` and `reason = "`*free text*`"`fields. The -compiler can then warn on deprecated items, while `rustdoc` can document their +with optional `since = "`*version*`"` and `reason = "`*free text*`"`fields. The +compiler can then warn on deprecated items, while `rustdoc` can document their deprecation accordingly. # Motivation -Library authors want a way to evolve their APIs; which also involves -deprecating items. To do this cleanly, they need to document their intentions +Library authors want a way to evolve their APIs; which also involves +deprecating items. To do this cleanly, they need to document their intentions and give their users enough time to react. Currently there is no support from the language for this oft-wanted feature @@ -23,7 +23,7 @@ interface to use while maximizing usefulness of the metadata introduced. 
# Detailed design -Public API items (both plain `fn`s, methods, trait- and inherent +Public API items (both plain `fn`s, methods, trait- and inherent `impl`ementations as well as `const` definitions, type definitions, struct fields and enum variants) can be given a `#[deprecated]` attribute. All possible fields are optional: @@ -34,14 +34,14 @@ versions, thus the content of this field is not checked (but will be by external lints, e.g. [rust-clippy](https://github.com/Manishearth/rust-clippy). * `reason` should contain a human-readable string outlining the reason for deprecating the item. While this field is not required, library authors are -strongly advised to make use of it to convey the reason for the deprecation to -users of their library. The string is interpreted as plain unformatted text -(for now) so that rustdoc can include it in the item's documentation without +strongly advised to make use of it to convey the reason for the deprecation to +users of their library. The string is interpreted as plain unformatted text +(for now) so that rustdoc can include it in the item's documentation without messing up the formatting. -On use of a *deprecated* item, `rustc` will `warn` of the deprecation. Note -that during Cargo builds, warnings on dependencies get silenced. While this has -the upside of keeping things tidy, it has a downside when it comes to +On use of a *deprecated* item, `rustc` will `warn` of the deprecation. Note +that during Cargo builds, warnings on dependencies get silenced. While this has +the upside of keeping things tidy, it has a downside when it comes to deprecation: Let's say I have my `llogiq` crate that depends on `foobar` which uses a @@ -50,7 +50,7 @@ try to build `foobar` directly. We may want to create a service like `crater` to warn on use of deprecated items in library crates, however this is outside the scope of this RFC. 
-`rustdoc` will show deprecation on items, with a `[deprecated]` box that may +`rustdoc` will show deprecation on items, with a `[deprecated]` box that may optionally show the version and reason where available. The language reference will be extended to describe this feature as outlined @@ -65,16 +65,16 @@ to avoid a name clash. Crate author Anna wants to evolve her crate's API. She has found that one type, `Foo`, has a better implementation in the `rust-foo` crate. Also she has written a `frob(Foo)` function to replace the earlier `Foo::frobnicate(self)` -method. +method. So Anna first bumps the version of her crate (because deprecation is always -done on a version change) from `0.1.1` to `0.2.1`. She also adds the following +done on a version change) from `0.1.1` to `0.2.1`. She also adds the following prefix to the `Foo` type: ``` extern crate rust_foo; -#[deprecated(since = "0.2.1", use="rust_foo::Foo", +#[deprecated(since = "0.2.1", use="rust_foo::Foo", reason="The rust_foo version is more advanced, and this crates' will likely be discontinued")] struct Foo { .. } ``` @@ -113,8 +113,8 @@ deprecation checks. even predefine a number of valid reason strings, as JEP277 currently does * Add a `use` field containing a plain text of what to use instead * Add a `use` field containing a path to some function, type, etc. to replace -the current feature. Currently with the rustc-private feature, people are -describing a replacement in the `reason` field, which is clearly not the +the current feature. 
Currently with the rustc-private feature, people are +describing a replacement in the `reason` field, which is clearly not the original intention of the field * Optionally, `cargo` could offer a new dependency category: "doc-dependencies" which are used to pull in other crates' documentations to link them (this is @@ -122,7 +122,7 @@ obviously not only relevant to deprecation) # Unresolved questions -* What other restrictions should we introduce now to avoid being bound to a +* What other restrictions should we introduce now to avoid being bound to a possibly flawed design? * Can / Should the `std` library make use of the `#[deprecated]` extensions? * Bikeshedding: Are the names good enough? From edeec7a00df24234ab7ef0b57896ddc9ec90c38f Mon Sep 17 00:00:00 2001 From: Paul Dicker Date: Sun, 22 Nov 2015 16:25:35 +0100 Subject: [PATCH 0613/1195] Move the sections about file locking and caching to alternatives --- 0000-open-options.md | 425 +++++++++++++++++++++++-------------------- 1 file changed, 225 insertions(+), 200 deletions(-) diff --git a/0000-open-options.md b/0000-open-options.md index 0e021b42cfa..10a4619bc22 100644 --- a/0000-open-options.md +++ b/0000-open-options.md @@ -89,30 +89,27 @@ try!(file.read(&mut buffer)); ``` ### No access mode set -On Windows it is possible to open a file without setting an access mode. You can -do practically nothing with the file, but you can read -[metadata](https://msdn.microsoft.com/en-us/library/windows/desktop/aa363788%28v=vs.85%29.aspx) -such as the file size or timestamp. +Even if you don't have read or write permission to a file, it is possible to +open it on some systems by opening it with no access mode set (or the equivalent +there of). This is true for Windows, Linux (with the flag `O_PATH`) and +GNU/Hurd. -On Unix it is traditionally not possible to open a file without specifying the -access mode, because of the way the access flags where defined: something like -`O_RDONLY = 0`, `O_WRONLY = 1` and `O_RDWR = 2`. 
When no flags are set, the -access mode is `0` and you fall back to opening the file read-only. +What can be done with a file opened this way is system-specific and niche. Since +Linux version 2.6.39 all three operating systems support reading metadata such +as the file size and timestamps. -Linux since version 2.6.39 has functionality similar to Windows by opening the -file with `O_RDONLY | O_PATH`. Since version 3.6 you can call `fstat` on a file -descriptor opened this way. +On practically all variants of Unix opening a file without specifying the access +mode falls back to opening the file read-only. This is because of the way the +access flags were traditionally defined: `O_RDONLY = 0`, `O_WRONLY = 1` and +`O_RDWR = 2`. When no flags are set, the access mode is `0`: read-only. But +code that relies on this is considered buggy and not portable. -For what it's worth -[GNU/Hurd](http://www.gnu.org/software/libc/manual/html_node/Access-Modes.html) -allows opening files without an access mode, because it defines `O_RDONLY = 1` -and `O_WRONLY = 2`. It allows all operations on the file that do not involve -reading or writing the data, like `chmod`. - -On Unix systems that fall back to opening the file read-only, Rust will fail -opening the file with `E_INVALID`. Otherwise, if for example you are developing -on OS X but forget to set `.read(true)` when opening a file, it would work on -OS X but not on other systems. +What should Rust do when no access mode is specified? Fall back to read-only, +open with the most similar system-specific mode, or always fail to open? This +RFC proposes to always fail. This is the conservative choice, and can be changed +to open in a system-specific mode if a clear use case arises. Implementing a +fallback is not worth it: it is no great effort to set the access mode +explicitly. 
### Windows-specific @@ -312,143 +309,6 @@ On Unix this is done by setting the creation mode using `.custom_flags()` with specifying `.access_mode()` (see above). -## Sharing / locking -On Unix it is possible for multiple processes to read and write to the same file -at the same time. - -When you open a file on Windows, the system by default denies other processes to -read or write to the file, or delete it. By setting the sharing mode, it is -possible to allow other processes read, write and/or delete access. For -cross-platform consistency, Rust imitates Unix by setting all sharing flags. - -Unix has no equivalent to the kind of file locking that Windows has. It has two -types of advisory locking, POSIX and BSD-style. Advisory means any process that -does not use locking itself can happily ignore the locking af another process. -As if that is not bad enough, they both have -[problems](http://0pointer.de/blog/projects/locking.html) that make them close -to unusable for modern multi-threaded programs. Linux may in some very rare -cases support mandatory file locking, but it is just as broken as advisory. - -For Rust, the sharing mode can be set with a Windows-specific option. Given the -problems above, I don't expect there to ever be a cross-platform option for file -locking. - - -### Windows-specific: Share mode -`.share_mode(FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE)` - -It is possible to set the individual share permissions with `.share_mode()`. - -The current philosophy of this function is that others should have no rights, -unless explicitly granted. I think a better fit for Rust would be to give all -others all rights, unless explicitly denied, e.g.: -`.share_mode(DENY_READ | DENY_WRITE | DENY_DELETE)`. 
- - -## Caching behaviour - -### Read cache hint -Instead of requesting only the data necessary for a single `read()` call from a -storage device, an operating system may request more data than necessary to have -it already available for the next read call (e.g. the read-ahead cache). If you -read the file sequentially this is beneficial, for completely random access it -can become a penalty. Operating systems generally have good heuristics, but you -may get a performance win if you give the os a hint about how you will read the -file. - -Do some real-world benchmarks before setting this option. - - -#### Cache hint -``` -.cache_hint(enum CacheHint) - -enum CacheHint { - None, - Sequential, - Random, -} -``` - -On Windows this maps to the flags `FILE_FLAG_SEQUENTIAL_SCAN` and -`FILE_FLAG_RANDOM_ACCESS`. On Linux and FreeBSD they map to the system call -`posix_fadvise()` with the flags `POSIX_FADV_SEQUENTIAL` and -`POSIX_FADV_RANDOM`. - -This option is ignored on operating systems that do not support caching hints. - - -### Write cache -See [Ensuring data reaches disk](https://lwn.net/Articles/457667/) - -1. copy data to kernel space -2. the kernel may wait a short wile -3. data is written to the cache of the storage device -4. data is written to persistent storage - -The Rust functions `sync_all()` and `sync_data()` control step 2: they force all -data in the write buffer of the kernel to be written to the storage device. This -is important to ensure critical data reaches the storage device in case of a -system crash or power outage, but comes with a large performance penalty. - -All modern operating systems also support a mode where each call to `write()` -will not return until the data is written to the storage device, thus removing -step 2 for _all_ writes. This can be a useful options for writing critical data, -where you would call `sync_data()` after each write. This saves a system call -for each write, and you are sure to never forget it. 
- - -#### Sync all -`.sync_all(true)`: implement an open option with the same name as the free -standing call. - -On Windows this means setting the flag `FILE_FLAG_WRITE_THROUGH`, and on Unix -(except OS X) the flag `O_SYNC`. - -OS X does not support `O_SYNC`, but it is possible to call fcntl with -`F_NOCACHE` to get the same effect. This has the side-effect that data also does -not end up in the read cache, so this can have a performance penalty when -reading if a file is opened with a read-write access mode. - - -#### Sync data -`.sync_data(true)` - -Some systems support syncing only the data written, but can wait with updating -less critical metadata such as the last modified timestamp. If the metadata is -not critical (and it rarely is), you should always use `sync_data()` as an easy -performance win. - -Linux since version 2.6.33 supports this mode with `O_DSYNC`, as does Solaris -and recent versions of NetBSD. If a system does not support only syncing data, -Rust will fall back to full syncing. - -If `.sync_all(true)` is specified, `.sync_data()` is ignored. - - -### Completely bypass the kernel -Normally the os kernel will process read or write calls and store the data -temporarily in a kernel-space buffer. The kernel makes sure the data size and -alignment of reads and writes correspondent to sectors on the storage device, -usually 512 or 4096 bytes. Also the kernel can keep data recently read or -written in cache, to speed up future file operations. - -Some operating systems allow you to completely bypass the copy of data to or -from kernel space. This is generally a bad idea. Applications will have to -figure out and handle alignment restrictions themselves, and implement manual -caching. It is mostly useful for database applications that may have more -knowledge about their optimal caching behaviour than the os. And it can have a -use when reading many gigabytes of data (like a backup process), which may -destroy the os cache for other processes. 
- -This is available on Windows with the flag `FILE_FLAG_NO_BUFFERING`, and on -Linux and some variants of BSD with `O_DIRECT`. Making correct use of this mode -involves low-level tuning and operating system dependant behaviour. It makes no -sense for Rust to expose this as a simple, cross-platform option. For -applications that really wish to use it, it is no problem to submit it as a -custom flag. - - ## Asynchronous IO Out op scope. @@ -502,31 +362,41 @@ For the custom flags on Unix, the bits that define the access mode are masked out with `O_ACCMODE`, to ensure they do not interfere with the access mode set by Rusts options. -| [Windows](https://msdn.microsoft.com/en-us/library/windows/desktop/hh449426%28v=vs.85%29.aspx): -|:--------------------------- -| FILE_FLAG_BACKUP_SEMANTICS -| FILE_FLAG_DELETE_ON_CLOSE -| FILE_FLAG_NO_BUFFERING -| FILE_FLAG_OPEN_NO_RECALL -| FILE_FLAG_OPEN_REPARSE_POINT -| FILE_FLAG_OVERLAPPED -| FILE_FLAG_POSIX_SEMANTICS -| FILE_FLAG_RANDOM_ACCESS -| FILE_FLAG_SESSION_AWARE -| FILE_FLAG_SEQUENTIAL_SCAN -| FILE_FLAG_WRITE_THROUGH +[Windows](https://msdn.microsoft.com/en-us/library/windows/desktop/hh449426%28v=vs.85%29.aspx): + +bit| flag +--:|:-------------------------------- +31 | FILE_FLAG_WRITE_THROUGH +30 | FILE_FLAG_OVERLAPPED +29 | FILE_FLAG_NO_BUFFERING +28 | FILE_FLAG_RANDOM_ACCESS +27 | FILE_FLAG_SEQUENTIAL_SCAN +26 | FILE_FLAG_DELETE_ON_CLOSE +25 | FILE_FLAG_BACKUP_SEMANTICS +24 | FILE_FLAG_POSIX_SEMANTICS +23 | FILE_FLAG_SESSION_AWARE +21 | FILE_FLAG_OPEN_REPARSE_POINT +20 | FILE_FLAG_OPEN_NO_RECALL +19 | FILE_FLAG_FIRST_PIPE_INSTANCE +18 | FILE_FLAG_OPEN_REQUIRING_OPLOCK + Unix: | POSIX | Linux | OS X | FreeBSD | OpenBSD | NetBSD |Dragonfly BSD| Solaris | |:------------|:------------|:------------|:------------|:------------|:------------|:------------|:------------| -| O_DIRECTORY | O_DIRECTORY | | O_DIRECTORY | O_DIRECTORY | O_DIRECTORY | O_DIRECTORY | O_DIRECTORY | -| O_NOCTTY | O_NOCTTY | | O_NOCTTY | | O_NOCTTY | | 
O_NOCTTY | +| O_TRUNC | O_TRUNC | O_TRUNC | O_TRUNC | O_TRUNC | O_TRUNC | O_TRUNC | O_TRUNC | +| O_CREAT | O_CREAT | O_CREAT | O_CREAT | O_CREAT | O_CREAT | O_CREAT | O_CREAT | +| O_EXCL | O_EXCL | O_EXCL | O_EXCL | O_EXCL | O_EXCL | O_EXCL | O_EXCL | +| O_APPEND | O_APPEND | O_APPEND | O_APPEND | O_APPEND | O_APPEND | O_APPEND | O_APPEND | +| O_CLOEXEC | O_CLOEXEC | O_CLOEXEC | O_CLOEXEC | O_CLOEXEC | O_CLOEXEC | O_CLOEXEC | O_CLOEXEC | +| O_DIRECTORY | O_DIRECTORY | O_DIRECTORY | O_DIRECTORY | O_DIRECTORY | O_DIRECTORY | O_DIRECTORY | O_DIRECTORY | +| O_NOCTTY | O_NOCTTY | O_NOCTTY | O_NOCTTY | | O_NOCTTY | | O_NOCTTY | | O_NOFOLLOW | O_NOFOLLOW | O_NOFOLLOW | O_NOFOLLOW | O_NOFOLLOW | O_NOFOLLOW | O_NOFOLLOW | O_NOFOLLOW | | O_NONBLOCK | O_NONBLOCK | O_NONBLOCK | O_NONBLOCK | O_NONBLOCK | O_NONBLOCK | O_NONBLOCK | O_NONBLOCK | -| O_DSYNC | O_DSYNC | | | | O_DSYNC | | O_DSYNC | +| O_SYNC | O_SYNC | O_SYNC | O_SYNC | O_SYNC | O_SYNC | O_FSYNC | O_SYNC | +| O_DSYNC | O_DSYNC | O_DSYNC | | | O_DSYNC | | O_DSYNC | | O_RSYNC | | | | | O_RSYNC | | O_RSYNC | -| O_SYNC | O_SYNC | | O_SYNC | O_SYNC | O_SYNC | O_FSYNC | O_SYNC | | | O_DIRECT | | O_DIRECT | | O_DIRECT | O_DIRECT | | | | O_ASYNC | | | | O_ASYNC | | | | | O_NOATIME | | | | | | | @@ -561,21 +431,20 @@ HANDLE hTemplateFile; - Current: when `.append(true)` is set, it is not possible to modify file attributes on Windows, but it is possible to change the file mode on Unix. New: allow file attributes to be modified on Windows in append-mode. -- Current: `.read()` and `.write()` set individual bit flags instead of generic - flags. New: Set generic flags, as recommend by Microsoft. e.g. `GENERIC_WRITE` - instead of `FILE_GENERIC_WRITE` and `GENERIC_READ` instead of +- Current: On Windows `.read()` and `.write()` set individual bit flags instead + of generic flags. New: Set generic flags, as recommend by Microsoft. e.g. 
+ `GENERIC_WRITE` instead of `FILE_GENERIC_WRITE` and `GENERIC_READ` instead of `FILE_GENERIC_READ`. Currently truncate is broken on Windows, this fixes it. - Current: when no access mode is set, this falls back to opening the file - read-only on Unix. - New: open with `O_RDONLY | O_PATH` on Linux, and fail with `E_INVALID` on all - other Unix variants. + read-only on Unix, and opening with no access permissions on Windows. + New: always fail to open if no access mode is set. - Rename the Windows-specific `.desired_access()` to `.access_mode()` ### Creation mode -- Do not allow `.truncate(true)` if the access mode is read-only and/or append. - This is currently buggy on Windows, and works on some versions of Unix, but - not on others (implementation defined). - Implement `.create_new()`. +- Do not allow `.truncate(true)` if the access mode is read-only and/or append. +- Do not allow `.create(true)` or `.create_new(true)` if the access mode is + read-only. - Remove the Windows-specific `.creation_disposition()`. It has no use, because all its options can be set in a cross-platform way. - Split the Windows-specific `.flags_and_attributes()` into `.custom_flags()` @@ -585,14 +454,6 @@ HANDLE hTemplateFile; bits, and the custom flags that modify the behaviour of the current file handle. -### Sharing / locking -- Currently `.share_mode()` grants permissions, change it to grant by default, - and possibly deny permissions. - -### Caching behaviour -- Implement `.cache_hint()`. -- Implement `.sync_all()` and `.sync_data()`. - ### Other options - Set the close-on-exec flag atomically on Unix if supported. - Implement `.custom_flags()` on Windows and Unix to pass custom flags to the @@ -620,17 +481,181 @@ Also this RFC is in line with the vision for IO in the # Alternatives -Keep the status quo. +The first version of this RFC contained a proposal for options that control +caching and file locking. They are out of scope for now, but included here for +reference. 
-# Unresolved questions -Implementation and testing of `.sync_all()` and `.sync_data()` could uncover -some corner cases, but I don't expect any that would give great trouble. +## Sharing / locking +On Unix it is possible for multiple processes to read and write to the same file +at the same time. + +When you open a file on Windows, the system by default denies other processes to +read or write to the file, or delete it. By setting the sharing mode, it is +possible to allow other processes read, write and/or delete access. For +cross-platform consistency, Rust imitates Unix by setting all sharing flags. + +Unix has no equivalent to the kind of file locking that Windows has. It has two +types of advisory locking, POSIX and BSD-style. Advisory means any process that +does not use locking itself can happily ignore the locking of another process. +As if that is not bad enough, they both have +[problems](http://0pointer.de/blog/projects/locking.html) that make them close +to unusable for modern multi-threaded programs. Linux may in some very rare +cases support mandatory file locking, but it is just as broken as advisory. + + +### Windows-specific: Share mode +`.share_mode(FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE)` -Should `.cache_hint()` take an enum? +It is possible to set the individual share permissions with `.share_mode()`. -Rename the Windows-specific `.desired_access()` to `.access_mode()`? +The current philosophy of this function is that others should have no rights, +unless explicitly granted. I think a better fit for Rust would be to give all +others all rights, unless explicitly denied, e.g.: +`.share_mode(DENY_READ | DENY_WRITE | DENY_DELETE)`. -What should be done about the missing variables for `CreateFile2`? -Are there any other options that we should define while at it? \ No newline at end of file +## Controlling caching +When dealing with file systems and hard disks, there are several kinds of +caches. 
Giving hints or controlling them may improve performance or data +consistency. +1. *read-ahead (performance of reads and overwrites)* + Instead of requesting only the data necessary for a single `read()` call from + a storage device, an operating system may request more data than necessary to + have it already available for the next read. +2. *os cache (performance of reads and overwrites)* + The os may keep the data of previous reads and writes in memory to increase + the performance of future reads and possibly writes. +3. *os staging area (convenience/performance of reads and writes)* + The size and alignment of data reads and writes to a disk should + correspond to sectors on the storage device, usually 512 or 4096 bytes. + The os makes sure a regular `write()` or `read()` doesn't have to care about + this. For example a small write (say 100 bytes) has to rewrite a whole + sector. The os often has the surrounding data in its cache and can + efficiently combine it to write the whole sector. +4. *delayed writing (performance/correctness of writes)* + The os may delay writes to improve performance, for example by batching + consecutive writes, and scheduling with reads to minimize seeking. +5. *on-disk write cache (performance/correctness of writes)* + Most hard disk / storage devices have a small RAM cache. It can speed up + reads, and writes can return as soon as the data is written to the device's + cache. + + +### Read-ahead hint +``` +.read_ahead_hint(enum CacheHint) + +enum ReadAheadHint { + Default, + Sequential, + Random, +} +``` + +If you read a file sequentially the read-ahead is beneficial; for completely +random access it can become a penalty. + +- `Default` uses the generally good heuristics of the operating system. +- `Sequential` indicates sequential but not necessarily consecutive access. + With this the os may increase the amount of data that is read ahead. +- `Random` indicates mainly random access. The os may disable its read-ahead + cache.
+ +This option is treated as a hint. It is ignored if the os does not support it, +or if the behaviour of the application proves it is set wrong. + +Open flags / system calls: +- Windows: flags `FILE_FLAG_SEQUENTIAL_SCAN` and `FILE_FLAG_RANDOM_ACCESS` +- Linux, FreeBSD, NetBSD: `posix_fadvise()` with the flags + `POSIX_FADV_SEQUENTIAL` and `POSIX_FADV_RANDOM` +- OS X: `fcntl()` with `F_RDAHEAD 0` for random (there is no special mode + for sequential). + + +### OS cache +`used_once(true)` + +When reading many gigabytes of data a process may push useful data from other +processes out of the os cache. To keep the performance of the whole system up, a +process could indicate to the os whether data is only needed once, or not needed +anymore. On Linux, FreeBSD and NetBSD this is possible with `posix_fadvise()` and +`POSIX_FADV_DONTNEED` after a read or write with sync (or before close). On +FreeBSD and NetBSD it is also possible to specify this up-front with `posix_fadvise()` and +`POSIX_FADV_NOREUSE`, and on OS X with `fcntl()` and `F_NOCACHE`. Windows does not seem +to provide an option for this. + +This option may negatively affect the performance of writes smaller than the +sector size, as cached data may not be available to the os staging area. + +This control over the os cache is the main reason some applications use direct +io, despite it being less convenient and disabling other useful caches. + + +### Delayed writing and on-disk write cache +`.sync_data(true)` and `.sync_all(true)` + +There can be two delays (by the os and by the disk cache) between when an +application performs a write, and when the data is written to persistent +storage. They increase performance, but increase the risk of data loss in case +of a system crash or power outage. + +When dealing with critical data, it may be useful to control these caches to +make the chance of data loss smaller. The application should normally do so by +calling Rust's stand-alone functions `sync_data()` or `sync_all()` at meaningful +points (e.g.
when the file is in a consistent state, or a state it can recover +from). + +However, `.sync_data()` and `.sync_all()` may also be given as an open option. +This guarantees every write will not return before the data is written to disk. +These options improve reliability, as you can never accidentally forget a +sync. + +Whether performance with these options is worse than with the stand-alone +functions is hard to say. With these options the data may have to be +synchronised more often. But the stand-alone functions often sync outstanding +writes to all files, while the options possibly sync only the current file. + +The difference between `.sync_all()` and `.sync_data(true)` is that +`.sync_data(true)` does not update the less critical metadata such as the last +modified timestamp (although it will be written eventually). + +Open flags: +- Windows: `FILE_FLAG_WRITE_THROUGH` for `.sync_all()` +- Unix: `O_SYNC` for `.sync_all()` and `O_DSYNC` for `.sync_data()` + +If a system does not support syncing only data, this option will fall back to +syncing both data and metadata. If `.sync_all(true)` is specified, +`.sync_data()` is ignored. + + +### Direct access / no caching +Most operating systems offer a mode that reads data straight from disk to an +application buffer, or that writes straight from a buffer to disk. This avoids +the small cost of a memory copy. It has the side effect that the data is not +available to the os to provide caching. Also, because this does not use the +_os staging area_ all reads and writes have to take care of data sizes and +alignment themselves.
+ +Overview: +- _os staging area_: not used +- _read-ahead_: not used +- _os cache_: data may be used, but is not added +- _delayed writing_: no delay +- _on-disk write cache_: maybe + +Open flags / system calls: +- Windows: flag `FILE_FLAG_NO_BUFFERING` +- Linux, FreeBSD, NetBSD, Dragonfly BSD: flag `O_DIRECT` + +The other options offer a more fine-grained control over caching, and usually +offer better performance or correctness guarantees. This option is sometimes +used by applications as a crude way to control (disable) the _os cache_. + +Rust should not currently expose this as an open option, because it should be +used with an abstraction / external crate that handles the data size and +alignment requirements. If it should be used at all. + + +# Unresolved questions +None. From b6b4b4b189f48668476d62f0a979e57c9d4f0923 Mon Sep 17 00:00:00 2001 From: Paul Dicker Date: Sun, 22 Nov 2015 16:27:33 +0100 Subject: [PATCH 0614/1195] Oops, move file to the correct directory --- 0000-open-options.md => text/0000-open-options.md | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename 0000-open-options.md => text/0000-open-options.md (100%) diff --git a/0000-open-options.md b/text/0000-open-options.md similarity index 100% rename from 0000-open-options.md rename to text/0000-open-options.md From 55d1032776bc5fac167a0044eaeab81c6a44fc14 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 23 Nov 2015 14:26:29 -0800 Subject: [PATCH 0615/1195] RFC 1252 is expanding the OpenOptions structure --- text/{0000-open-options.md => 1252-open-options.md} | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) rename text/{0000-open-options.md => 1252-open-options.md} (99%) diff --git a/text/0000-open-options.md b/text/1252-open-options.md similarity index 99% rename from text/0000-open-options.md rename to text/1252-open-options.md index 10a4619bc22..6854dc4f22c 100644 --- a/text/0000-open-options.md +++ b/text/1252-open-options.md @@ -1,7 +1,7 @@ -- Feature Name: 
expand-open-options +- Feature Name: `expand_open_options` - Start Date: 2015-08-04 -- RFC PR: -- Rust Issue: +- RFC PR: [rust-lang/rfcs#1252](https://github.com/rust-lang/rfcs/pull/1252) +- Rust Issue: [rust-lang/rust#30014](https://github.com/rust-lang/rust/issues/30014) # Summary @@ -65,7 +65,7 @@ _Implementation detail_: On Windows opening a file in append-mode has one flag _less_, the right to change existing data is removed. On Unix opening a file in append-mode has one flag _extra_, that sets the status of the file descriptor to append-mode. You could say that on Windows write is a superset of append, while -on Unix append is a superset of write. +on Unix append is a superset of write. Because of this append is treated as a separate access mode in Rust, and if `.append(true)` is specified than `.write()` is ignored. @@ -317,7 +317,7 @@ Out op scope. ### Inheritance of file descriptors Leaking file descriptors to child processes can cause problems and can be a -security vulnerability. See this report by +security vulnerability. See this report by [Python](https://www.python.org/dev/peps/pep-0446/). On Windows, child processes do not inherit file descriptors by default (but this From 3c4f0f6fb9b9c9ca2d8cb956129e934cdcf84239 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Mon, 23 Nov 2015 21:10:11 +0100 Subject: [PATCH 0616/1195] Massive redo of Detailed Design; expanded parts of spec in passing. As noted in the Edit History addition: * replaced detailed design with a specification-oriented presentation rather than an implementation-oriented algorithm. * fixed some oversights in the specification (that led to matchers like `break { stuff }` being accepted), * expanded the follows sets for `ty` to include `OpenDelim(Brace), Ident(where), Or` (since Rust's grammar already requires all of `|foo:TY| {}`, `fn foo() -> TY {}` and `fn foo() -> TY where {}` to work). 
* expanded the follow set for `pat` to include `Or` (since Rust's grammar already requires `match (true,false) { PAT | PAT => {} }` and `|PAT| {}` to work). See also [RFC issue 1336][]. Not noted in Edit History addition: * expanded/revised terminology section to fit new detailed design * added "examples of valid and invalid matchers" subsection, that uses the specification from detailed design to explain why each is valid/invalid. * rewrote the algorithm to actually implement the (new) specification, and moved the discussion of the algorithm to a non-binding appendix. --- text/0550-macro-future-proofing.md | 579 ++++++++++++++++++++++++++--- 1 file changed, 533 insertions(+), 46 deletions(-) diff --git a/text/0550-macro-future-proofing.md b/text/0550-macro-future-proofing.md index 6cfdec75fa3..6ebb5df51e0 100644 --- a/text/0550-macro-future-proofing.md +++ b/text/0550-macro-future-proofing.md @@ -2,39 +2,63 @@ - RFC PR: [550](https://github.com/rust-lang/rfcs/pull/550) - Rust Issue: [20563](https://github.com/rust-lang/rust/pull/20563) +# Summary + +Future-proof the allowed forms that input to an MBE can take by requiring +certain delimiters following NTs in a matcher. In the future, it will be +possible to lift these restrictions backwards compatibly if desired. + # Key Terminology - `macro`: anything invokable as `foo!(...)` in source code. - `MBE`: macro-by-example, a macro defined by `macro_rules`. -- `matcher`: the left-hand-side of a rule in a `macro_rules` invocation. -- `macro parser`: the bit of code in the Rust parser that will parse the input - using a grammar derived from all of the matchers. -- `NT`: non-terminal, the various "meta-variables" that can appear in a matcher. -- `fragment`: The piece of Rust syntax that an NT can accept. -- `fragment specifier`: The identifier in an NT that specifies which fragment - the NT accepts. +- `matcher`: the left-hand-side of a rule in a `macro_rules` invocation, or a subportion thereof. 
+- `macro parser`: the bit of code in the Rust parser that will parse the input using a grammar derived from all of the matchers. +- `fragment`: The class of Rust syntax that a given matcher will accept (or "match"). +- `repetition` : a fragment that follows a regular repeating pattern +- `NT`: non-terminal, the various "meta-variables" or repetition matchers that can appear in a matcher, specified in MBE syntax with a leading `$` character. +- `simple NT`: a "meta-variable" non-terminal (further discussion below). +- `complex NT`: a repetition matching non-terminal, specified via Kleene closure operators (`*`, `+`). +- `token`: an atomic element of a matcher; i.e. identifiers, operators, open/close delimiters, *and* simple NT's. +- `token tree`: a tree structure formed from tokens (the leaves), complex NT's, and finite sequences of token trees. +- `delimiter token`: a token that is meant to divide the end of one fragment and the start of the next fragment. +- `separator token`: an optional delimiter token in an complex NT that separates each pair of elements in the matched repetition. +- `separated complex NT`: a complex NT that has its own separator token. +- `delimited sequence`: a sequence of token trees with appropriate open- and close-delimiters at the start and end of the sequence. +- `empty fragment`: The class of invisible Rust syntax that separates tokens, i.e. whitespace, or (in some lexical contexts), the empty token sequence. +- `fragment specifier`: The identifier in a simple NT that specifies which fragment the NT accepts. - `language`: a context-free language. Example: ```rust macro_rules! i_am_an_mbe { - (start $foo:expr end) => ($foo) + (start $foo:expr $($i:ident),* end) => ($foo) } ``` -`(start $foo:expr end)` is a matcher, `$foo` is an NT with `expr` as its -fragment specifier. +`(start $foo:expr $($i:ident),* end)` is a matcher. 
The whole matcher +is a delimited sequence (with open- and close-delimiters `(` and `)`), +and `$foo` and `$i` are simple NT's with `expr` and `ident` as their +respective fragment specifiers. -# Summary +`$(i:ident),*` is *also* an NT; it is a complex NT that matches a +comma-seprated repetition of identifiers. The `,` is the separator +token for the complex NT; it occurs in between each pair of elements +(if any) of the matched fragment. -Future-proof the allowed forms that input to an MBE can take by requiring -certain delimiters following NTs in a matcher. In the future, it will be -possible to lift these restrictions backwards compatibly if desired. +Another example of a complex NT is `$(hi $e:expr ;)+`, which matches +any fragment of the form `hi ; hi ; ...` where `hi +;` occurs at least once. Note that this complex NT does not +have a dedicated separator token. + +(Note that Rust's parser ensures that delimited sequences always occur +with proper nesting of token tree structure and correct matching of open- +and close-delimiters.) # Motivation -In current Rust, the `macro_rules` parser is very liberal in what it accepts +In current Rust (version 0.12; i.e. pre 1.0), the `macro_rules` parser is very liberal in what it accepts in a matcher. This can cause problems, because it is possible to write an MBE which corresponds to an ambiguous grammar. When an MBE is invoked, if the macro parser encounters an ambiguity while parsing, it will bail out with a @@ -68,45 +92,329 @@ proposal is to prevent such scenarios in the future by requiring certain ambiguity need only be considered when combined with these sets of delimiters, rather than any possible arbitrary matcher. +---- + +Another example of a potential extension to the language that +motivates a restricted set of "delimiter tokens" is +([postponed][Postponed 961]) [RFC 352][], "Allow loops to return +values other than `()`", where the `break` expression would now accept +an optional input expression: `break `. 
+ + * This proposed extension to the language, combined with the facts that + `break` and `{ ... ? }` are Rust expressions, implies that + `{` should not be in the follow set for the `expr` fragment specifier. + + * Thus in a slightly more ideal world the following program would not be + accepted, because the interpretation of the macro could change if we + were to accept RFC 352: + + ```rust + macro_rules! foo { + ($e:expr { stuff }) => { println!("{:?}", $e) } + } + + fn main() { + loop { foo!(break { stuff }); } + } + ``` + + (in our non-ideal world, the program is legal in Rust versions 1.0 + through at least 1.4) + +[RFC 352]: https://github.com/rust-lang/rfcs/pull/352 + +[Postponed 961]: https://github.com/rust-lang/rfcs/issues/961 + # Detailed design -The algorithm for recognizing valid matchers `M` follows. Note that a matcher -is merely a token tree. A "simple NT" is an NT without repetitions. That is, -`$foo:ty` is a simple NT but `$($foo:ty)+` is not. `FOLLOW(NT)` is the set of -allowed tokens for the given NT's fragment specifier, and is defined below. -`F` is used for representing the separator in complex NTs. In `$($foo:ty),+`, -`F` would be `,`, and for `$($foo:ty)+`, `F` would be `EOF`. - -*input*: a token tree `M` representing a matcher and a token `F` - -*output*: whether M is valid - -For each token `T` in `M`: - -1. If `T` is not an NT, continue. -2. If `T` is a simple NT, look ahead to the next token `T'` in `M`. If - `T'` is `EOF` or a close delimiter of a token tree, replace `T'` with - `F`. If `T'` is in the set `FOLLOW(NT)`, `T'` is EOF, or `T'` is any close - delimiter, continue. Otherwise, reject. -3. Else, `T` is a complex NT. - 1. If `T` has the form `$(...)+` or `$(...)*`, run the algorithm on the - contents with `F` set to the token following `T`. If it accepts, - continue, else, reject. - 2. If `T` has the form `$(...)U+` or `$(...)U*` for some token `U`, run - the algorithm on the contents with `F` set to `U`. 
If it accepts, - check that the last token in the sequence can be followed by `F`. If - so, accept. Otherwise, reject. +We will tend to use the variable "M" to stand for a matcher, +variables "t" and "u" for arbitrary individual tokens, +and the variables "tt" and "uu" for arbitrary token trees. +(The use of "tt" does present potential ambiguity with its +additional role as a fragment specifier; but it will be clear +from context which interpretation is meant.) -This algorithm should be run on every matcher in every `macro_rules` -invocation, with `F` as `EOF`. If it rejects a matcher, an error should be -emitted and compilation should not complete. +"SEP" will range over separator tokens, +"OP" over the Kleene operators `*` and `+`, and +"OPEN"/"CLOSE" over matching token pairs surrounding a delimited sequence (e.g. `[` and `]`). + +We also use Greek letters "α" "β" "γ" "δ" to stand for potentially empty +token-tree sequences. (However, the +Greek letter "ε" (epsilon) has a special role in the presentation and +does not stand for a token-tree sequence.) + + * This Greek letter convention is usually just employed when the + presence of a sequence is a technical detail; in particular, when I + wish to *emphasize* that we are operating on a sequence of + token-trees, I will use the notation "tt ..." for the sequence, not + a Greek letter + +Note that a matcher is merely a token tree. A "simple NT", as +mentioned above, is an meta-variable NT; thus it is a +non-repetition. For example, `$foo:ty` is a simple NT but +`$($foo:ty)+` is a complex NT. + +Note also that in the context of this RFC, the term "token" generally +*includes* simple NTs. + +Finally, it is useful for the reader to keep in mind that according to +the definitions of this RFC, no simple NT matches +the empty fragment, and likewise no token matches +the empty fragment of Rust syntax. (Thus, the *only* NT that can match +the empty fragment is a complex NT.) 
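To make the terminology concrete, here is a small illustrative macro. The name `sum` and the helper `sum_examples` are invented for this sketch and are not part of the RFC. In its matcher, `$e:expr` is a simple NT, while `$($e:expr),*` is a separated complex NT whose separator `,` is in FOLLOW(`expr`). Only the complex NT can match the empty fragment, which is why `sum!()` is a legal invocation.

```rust
// Hypothetical macro, for illustration only.
macro_rules! sum {
    // `$($e:expr),*` is a complex NT (zero or more repetitions, separated
    // by `,`) wrapping the simple NT `$e:expr`.
    ($($e:expr),*) => { 0 $(+ $e)* };
}

// Returns the results of an empty and a non-empty invocation.
fn sum_examples() -> (i32, i32) {
    // `sum!()` works because the complex NT may match the empty fragment;
    // the simple NT `$e:expr` on its own never could.
    (sum!(), sum!(1, 2, 3))
}
```
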
+ +## The Matcher Invariant + +This RFC establishes the following two-part invariant for valid matchers: + + 1. For any two successive token tree sequences in a matcher `M` + (i.e. `M = ... tt uu ...`), we must have + FOLLOW(`... tt`) ⊇ FIRST(`uu ...`) + + 2. For any separated complex NT in a matcher, `M = ... $(tt ...) SEP OP ...`, + we must have + `SEP` ∈ FOLLOW(`tt ...`). + +The first part says that whatever actual token comes after a +matcher must be somewhere in the predetermined follow set. This +ensures that a legal macro definition will continue to assign the same +determination as to where `... tt` ends and `uu ...` begins, even as +new syntactic forms are added to the language. + +The second part says that a separated complex NT must use a separator +token that is part of the predetermined follow set for the internal +contents of the NT. This ensures that a legal macro definition will +continue to parse an input fragment into the same delimited sequence +of `tt ...`'s, even as new syntactic forms are added to the language. + +(This is assuming that all such changes are appropriately restricted, +by the definition of FOLLOW below, of course.) + +The above invariant is only formally meaningful if one knows what +FIRST and FOLLOW denote. We address this in the following sections. + +## FIRST and FOLLOW, informally + +FIRST and FOLLOW are defined as follows. + +A given matcher M maps to three sets: FIRST(M), LAST(M) and FOLLOW(M). + +Each of the three sets is made up of tokens. FIRST(M) and LAST(M) may +also contain a distinguished non-token element ε ("epsilon"), which +indicates that M can match the empty fragment. (But FOLLOW(M) is +always just a set of tokens.) + +Informally: + + * FIRST(M): collects the tokens potentially used first when matching a fragment to M. + + * LAST(M): collects the tokens potentially used last when matching a fragment to M. + + * FOLLOW(M): the set of tokens allowed to follow immediately after some fragment + matched by M.
+ + In other words: t ∈ FOLLOW(M) if and only if there exists (potentially empty) token sequences α, β, γ, δ where: + * M matches β, + * t matches γ, and + * The concatenation α β γ δ is a parseable Rust program. + +We use the shorthand ANYTOKEN to denote the set of all tokens (including simple NTs). + + * (For example, if any token is legal after a matcher M, then FOLLOW(M) = ANYTOKEN.) + +(To review one's understanding of the above informal descriptions, the +reader at this point may want to jump ahead to the +[examples of FIRST/LAST][examples-of-first-and-last] before reading +their formal definitions.) + +## FIRST, LAST + +Below are formal inductive definitions for FIRST and LAST. + +"A ∪ B" denotes set union, "A ∩ B" denotes set intersection, and +"A \ B" denotes set difference (i.e. all elements of A that are not present +in B). + +FIRST(M), defined by case analysis on the sequence M and the structure +of its first token-tree (if any): + + * if M is the empty sequence, then FIRST(M) = { ε }, + + * if M starts with a token t, then FIRST(M) = { t }, + + (Note: this covers the case where M starts with a delimited + token-tree sequence, `M = OPEN tt ... CLOSE ...`, in which case `t = OPEN` and + thus FIRST(M) = { `OPEN` }.) + + (Note: this critically relies on the property that no simple NT matches the + empty fragment.) + + * Otherwise, M is a token-tree sequence starting with a complex NT: + `M = $( tt ... ) OP α`, or `M = $( tt ... ) SEP OP α`, + (where `α` is the (potentially empty) sequence of token trees for the rest of the matcher). + + * Let sep_set = { SEP } if SEP present; otherwise sep_set = {}. 
+ + * If ε ∈ FIRST(`tt ...`), then FIRST(M) = (FIRST(`tt ...`) \ { ε }) ∪ sep_set ∪ FIRST(`α`) + + * Else if OP = `*`, then FIRST(M) = FIRST(`tt ...`) ∪ FIRST(`α`) + + * Otherwise (OP = `+`), FIRST(M) = FIRST(`tt ...`) + +Note: The ε-case above, + +> FIRST(M) = (FIRST(`tt ...`) \ { ε }) ∪ sep_set ∪ FIRST(`α`) + +may seem complicated, so lets take a moment to break it down. In the +ε case, the sequence `tt ...` may be empty. Therefore our first +token may be `SEP` itself (if it is present), or it may be the first +token of `α`); that's why the result is including "sep_set ∪ +FIRST(`α`)". Note also that if `α` itself may match the empty +fragment, then FIRST(`α`) will ensure that ε is included in our +result, and conversely, if `α` cannot match the empty fragment, then +we must *ensure* that ε is *not* included in our result; these two +facts together are why we can and should unconditionally remove ε +from FIRST(`tt ...`). + +---- + +LAST(M), defined by case analysis on M itself (a sequence of token-trees): + + * if M is the empty sequence, then LAST(M) = { ε } + + * if M is a singleton token t, then LAST(M) = { t } + + * if M is the singleton complex NT repeating zero or more times, + `M = $( tt ... ) *`, or `M = $( tt ... ) SEP *` + + * Let sep_set = { SEP } if SEP present; otherwise sep_set = {}. + + * if ε ∈ LAST(`tt ...`) then LAST(M) = LAST(`tt ...`) ∪ sep_set + + * otherwise, the sequence `tt ...` must be non-empty; LAST(M) = LAST(`tt ...`) ∪ { ε } + + * if M is the singleton complex NT repeating one or more times, + `M = $( tt ... ) +`, or `M = $( tt ... ) SEP +` + + * Let sep_set = { SEP } if SEP present; otherwise sep_set = {}. + + * if ε ∈ LAST(`tt ...`) then LAST(M) = LAST(`tt ...`) ∪ sep_set + + * otherwise, the sequence `tt ...` must be non-empty; LAST(M) = LAST(`tt ...`) + + * if M is a delimited token-tree sequence `OPEN tt ... 
CLOSE`, then LAST(M) = { `CLOSE` } + + * if M is a non-empty sequence of token-trees `tt uu ...`, + + * If ε ∈ LAST(`uu ...`), then LAST(M) = LAST(`tt`) ∪ (LAST(`uu ...`) \ { ε }). + + * Otherwise, the sequence `uu ...` must be non-empty; then LAST(M) = LAST(`uu ...`) + +NOTE: The presence or absence of SEP *is* relevant to the above +definitions, but solely in the case where the interior of the complex +NT could be empty (i.e. ε ∈ FIRST(interior)). (I overlooked this fact +in my first round of prototyping.) + +NOTE: The above definition for LAST assumes that we keep our +pre-existing rule that the separator token in a complex NT is *solely* for +separating elements; i.e. that such NT's do not match fragments that +*end with* the separator token. If we choose to lift this restriction +in the future, the above definition will need to be revised +accordingly. + +## Examples of FIRST and LAST +[examples-of-first-and-last]: #examples-of-first-and-last + +Below are some examples of FIRST and LAST. +(Note in particular how the special ε element is introduced and +eliminated based on the interaction between the pieces of the input.) + +Our first example is presented in a tree structure to elaborate on how +the analysis of the matcher composes. (Some of the simpler subtrees +have been elided.)
+ + INPUT: $( $d:ident $e:expr );* $( $( h )* );* $( f ; )+ g + ~~~~~~~~ ~~~~~~~ ~ + | | | + FIRST: { $d:ident } { $e:expr } { h } + + + INPUT: $( $d:ident $e:expr );* $( $( h )* );* $( f ; )+ + ~~~~~~~~~~~~~~~~~~ ~~~~~~~ ~~~ + | | | + FIRST: { $d:ident } { h, ε } { f } + + INPUT: $( $d:ident $e:expr );* $( $( h )* );* $( f ; )+ g + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~ ~~~~~~~~~ ~ + | | | | + FIRST: { $d:ident, ε } { h, ε, ; } { f } { g } + + + INPUT: $( $d:ident $e:expr );* $( $( h )* );* $( f ; )+ g + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + | + FIRST: { $d:ident, h, ;, f } + +Thus: + + * FIRST(`$($d:ident $e:expr );* $( $(h)* );* $( f ;)+ g`) = { `$d:ident`, `h`, `;`, `f` } + +Note however that: + + * FIRST(`$($d:ident $e:expr );* $( $(h)* );* $($( f ;)+ g)*`) = { `$d:ident`, `h`, `;`, `f`, ε } + +Here are similar examples but now for LAST. + + * LAST(`$d:ident $e:expr`) = { `$e:expr` } + * LAST(`$( $d:ident $e:expr );*`) = { `$e:expr`, ε } + * LAST(`$( $d:ident $e:expr );* $(h)*`) = { `$e:expr`, ε, `h` } + * LAST(`$( $d:ident $e:expr );* $(h)* $( f ;)+`) = { `;` } + * LAST(`$( $d:ident $e:expr );* $(h)* $( f ;)+ g`) = { `g` } + + and again, changing the end part of matcher changes its last set considerably: + + * LAST(`$( $d:ident $e:expr );* $(h)* $($( f ;)+ g)*`) = { `$e:expr`, ε, `h`, `g` } + +## FOLLOW(M) + +Finally, the definition for `FOLLOW(M)` is built up incrementally atop +more primitive functions. + +We first assume a primitive mapping, `FOLLOW(NT)` (defined +[below][follow-nt]) from a simple NT to the set of allowed tokens for +the fragment specifier for that NT. + +Second, we generalize FOLLOW to tokens: FOLLOW(t) = FOLLOW(NT) if t is (a simple) NT. +Otherwise, t must be some other (non NT) token; in this case FOLLOW(t) = ANYTOKEN. + +Finally, we generalize FOLLOW to arbitrary matchers by composing the primitive +functions above: + +``` +FOLLOW(M) = FOLLOW(t1) ∩ FOLLOW(t2) ∩ ... 
∩ FOLLOW(tN) + where { t1, t2, ..., tN } = (LAST(M) \ { ε }) +``` + +Examples of FOLLOW (expressed as equality relations between sets, to avoid +incoporating details of FOLLOW(NT) in these examples): + + * FOLLOW(`$( $d:ident $e:expr )*`) = FOLLOW(`$e:expr`) + * FOLLOW(`$( $d:ident $e:expr )* $(;)*`) = FOLLOW(`$e:expr`) ∩ ANYTOKEN = FOLLOW(`$e:expr`) + * FOLLOW(`$( $d:ident $e:expr )* $(;)* $( f |)+`) = ANYTOKEN + +## FOLLOW(NT) +[follow-nt]: #follownt + +Here is the definition for FOLLOW(NT), which maps every simple NT to +the set of tokens that are allowed to follow it, based on the fragment +specifier for the NT. The current legal fragment specifiers are: `item`, `block`, `stmt`, `pat`, `expr`, `ty`, `ident`, `path`, `meta`, and `tt`. -- `FOLLOW(pat)` = `{FatArrow, Comma, Eq}` +- `FOLLOW(pat)` = `{FatArrow, Comma, Eq, Or}` - `FOLLOW(expr)` = `{FatArrow, Comma, Semicolon}` -- `FOLLOW(ty)` = `{Comma, FatArrow, Colon, Eq, Gt, Ident(as), Semi}` +- `FOLLOW(ty)` = `{OpenDelim(Brace), Comma, FatArrow, Colon, Eq, Gt, Ident(as), Ident(where), Semi, Or}` - `FOLLOW(stmt)` = `FOLLOW(expr)` - `FOLLOW(path)` = `FOLLOW(ty)` - `FOLLOW(block)` = any token @@ -117,6 +425,26 @@ The current legal fragment specifiers are: `item`, `block`, `stmt`, `pat`, (Note that close delimiters are valid following any NT.) +## Examples of valid and invalid matchers + +With the above specification in hand, we can present arguments for +why particular matchers are legal and others are not. + + * `($ty:ty < foo ,)` : illegal, because FIRST(`< foo ,`) = { `<` } ⊈ FOLLOW(`ty`) + + * `($ty:ty , foo <)` : legal, because FIRST(`, foo <`) = { `,` } is ⊆ FOLLOW(`ty`). + + * `($pa:pat $pb:pat $ty:ty ,)` : illegal, because FIRST(`$pb:pat $ty:ty ,`) = { `$pb:pat` } ⊈ FOLLOW(`pat`), and also FIRST(`$ty:ty ,`) = { `$ty:ty` } ⊈ FOLLOW(`pat`). + + * `( $($a:tt $b:tt)* ; )` : legal, because FIRST(`$b:tt`) = { `$b:tt` } is ⊆ FOLLOW(`tt`) = ANYTOKEN, as is FIRST(`;`) = { `;` }. 
+ + * `( $($t:tt),* , $(t:tt),* )` : legal (though any attempt to actually use this macro will signal a local ambiguity error during expansion). + + * `($ty:ty $(; not sep)* -)` : illegal, because FIRST(`$(; not sep)* -`) = { `;`, `-` } is not in FOLLOW(`ty`). + + * `($($ty:ty)-+)` : illegal, because separator `-` is not in FOLLOW(`ty`). + + +# Drawbacks It does restrict the input to a MBE, but the choice of delimiters provides @@ -144,3 +472,162 @@ reasonable freedom and can be extended in the future. - Updated by https://github.com/rust-lang/rfcs/pull/1209, which added semicolons into the follow set for types. + +- Updated by (fill in after PR number is assigned). + * replaced detailed design with a specification-oriented presentation rather than an implementation-oriented algorithm. + * fixed some oversights in the specification (that led to matchers like `break { stuff }` being accepted), + * expanded the follow set for `ty` to include `OpenDelim(Brace), Ident(where), Or` (since Rust's grammar already requires all of `|foo:TY| {}`, `fn foo() -> TY {}` and `fn foo() -> TY where {}` to work). + * expanded the follow set for `pat` to include `Or` (since Rust's grammar already requires `match (true,false) { PAT | PAT => {} }` and `|PAT| {}` to work). See also [RFC issue 1336][]. + +[RFC issue 1336]: https://github.com/rust-lang/rfcs/issues/1336 + +# Appendices + +## Appendix A: Algorithm for recognizing valid matchers. + +The detailed design above only sought to provide a *specification* for +what a correct matcher is (by defining FIRST, LAST, and FOLLOW, and +specifying the invariant relating FIRST and FOLLOW for all valid +matchers). + +The above specification can be implemented efficiently; we here give +one example algorithm for recognizing valid matchers. + + * This is not the only possible algorithm; for example, one could + precompute a table mapping every suffix of every token-tree + sequence to its FIRST set, by augmenting `FirstSet` below + accordingly.
+ + Or one could store a subset of such information during the + precomputation, such as just the FIRST sets for complex NT's, and + then use that table to inform a *forward scan* of the input. + + The latter is in fact what my prototype implementation does; I must + emphasize the point that the algorithm here is not prescriptive. + + * The intent of this RFC is that the specifications of FIRST + and FOLLOW above will take precedence over this algorithm if the two + are found to be producing inconsistent results. + +The algorithm for recognizing valid matchers `M` is named ValidMatcher. + +To define it, we will need a mapping from submatchers of M to the +FIRST set for that submatcher; that is handled by `FirstSet`. + +### Procedure FirstSet(M) + +*input*: a token tree `M` representing a matcher + +*output*: `FIRST(M)` + +``` +Let M = tts[1] tts[2] ... tts[n]. +Let curr_first = { ε }. + +For i in n down to 1 (inclusive): + Let tt = tts[i]. + + 1. If tt is a token, curr_first := { tt } + + 2. Else if tt is a delimited sequence `OPEN uu ... ClOSE`, + curr_first := { OPEN } + + 3. Else tt is a complex NT `$(uu ...) SEP OP` + + Let inner_first = FirstSet(`uu ...`) i.e. recursive call + + if OP == `*` or ε ∈ inner_first then + curr_first := curr_first ∪ inner_first + else + curr_first := inner_first + +return curr_first +``` + +(Note: If we were precomputing a full table in this procedure, we would need +a recursive invocation on (uu ...) in step 2 of the for-loop.) + +### Predicate ValidMatcher(M) + +To simplify the specification, we assume in this presentation that all +simple NT's have a valid fragment specifier (i.e., one that has an +entry in the FOLLOW(NT) table above. + +This algorithm works by scanning forward across the matcher M = α β, +(where α is the prefix we have scanned so far, and β is the suffix +that remains to be scanned). We maintain LAST(α) as we scan, and use +it to compute FOLLOW(α) and compare that to FIRST(β). 
+ +*input*: a token tree, `M`, and a set of tokens that could follow it, `F`. + +*output*: LAST(M) (and also signals failure whenever M is invalid) + +``` +Let last_of_prefix = { ε } + +Let M = tts[1] tts[2] ... tts[n]. + +For i in 1 up to n (inclusive): + // For reference: + // α = tts[1] .. tts[i] + // β = tts[i+1] .. tts[n] + // γ is some outer token sequence; the input F represents FIRST(γ) + + 1. Let tt = tts[i]. + + 2. Let first_of_suffix; // aka FIRST(β γ) + + 3. let S = FirstSet(tts[i+1] .. tts[n]); + + 4. if ε ∈ S then + // (include the follow information if necessary) + + first_of_suffix := S ∪ F + + 5. else + + first_of_suffix := S + + 6. Update last_of_prefix via case analysis on tt: + + a. If tt is a token: + last_of_prefix := { tt } + + b. Else if tt is a delimited sequence `OPEN uu ... CLOSE`: + + i. run ValidMatcher( M = `uu ...`, F = { `CLOSE` }) + + ii. last_of_prefix := { `CLOSE` } + + c. Else, tt must be a complex NT, + in other words, `NT = $( uu .. ) SEP OP` or `NT = $( uu .. ) OP`: + + i. If SEP present, + let sublast = ValidMatcher( M = `uu ...`, F = first_of_suffix ∪ { `SEP` }) + + ii. else: + let sublast = ValidMatcher( M = `uu ...`, F = first_of_suffix) + + iii. If ε in sublast then: + last_of_prefix := last_of_prefix ∪ (sublast \ ε) + + iv. Else: + last_of_prefix := sublast + + 7. At this point, last_of_prefix == LAST(α) and first_of_suffix == FIRST(β γ). + + For each simple NT token t in last_of_prefix: + + a. If first_of_suffix ⊆ FOLLOW(t), then we are okay so far. + + b. Otherwise, we have found a token t whose follow set is not compatible + with the FIRST(β γ), and must signal failure. + +// After running the above for loop on all of `M`, last_of_prefix == LAST(M) + +Return last_of_prefix +``` + +This algorithm should be run on every matcher in every `macro_rules` +invocation, with `F` = { `EOF` }. If it rejects a matcher, an error +should be emitted and compilation should not complete. 
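For illustration, the `FirstSet` procedure from Appendix A can be transcribed into Rust roughly as follows. This is only a sketch under stated assumptions, not the prototype implementation: the `Tt` and `First` types and the `tok` helper are hypothetical names, simple NTs such as `$e:expr` are treated as opaque tokens, and the separator of a complex NT is ignored, just as in the pseudocode.

```rust
use std::collections::BTreeSet;

/// An element of a FIRST set: either ε or a concrete token.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum First {
    Epsilon,
    Tok(String),
}

/// A token tree in a matcher (hypothetical, simplified representation).
#[allow(dead_code)]
enum Tt {
    /// A plain token, or a simple NT such as `$e:expr` treated as opaque.
    Token(String),
    /// A delimited sequence `OPEN uu ... CLOSE`; only OPEN matters for FIRST.
    Delimited { open: String, inner: Vec<Tt> },
    /// A complex NT `$( uu ... ) OP`; `zero_ok` is true when OP is `*`.
    Repeat { inner: Vec<Tt>, zero_ok: bool },
}

/// Convenience constructor for a token element of a FIRST set.
fn tok(s: &str) -> First {
    First::Tok(s.to_string())
}

/// Scans the token trees right to left, as the pseudocode does.
fn first_set(tts: &[Tt]) -> BTreeSet<First> {
    let mut curr_first = BTreeSet::from([First::Epsilon]);
    for tt in tts.iter().rev() {
        match tt {
            // Step 1: a token replaces whatever came after it.
            Tt::Token(t) => curr_first = BTreeSet::from([tok(t)]),
            // Step 2: a delimited sequence always starts with OPEN.
            Tt::Delimited { open, .. } => curr_first = BTreeSet::from([tok(open)]),
            // Step 3: a complex NT may be skipped if it can match nothing.
            Tt::Repeat { inner, zero_ok } => {
                let inner_first = first_set(inner);
                if *zero_ok || inner_first.contains(&First::Epsilon) {
                    curr_first.extend(inner_first);
                } else {
                    curr_first = inner_first;
                }
            }
        }
    }
    curr_first
}

fn main() {
    // The matcher `$( $d:ident $e:expr )* ;`
    let m = vec![
        Tt::Repeat {
            inner: vec![Tt::Token("$d:ident".into()), Tt::Token("$e:expr".into())],
            zero_ok: true,
        },
        Tt::Token(";".into()),
    ];
    println!("{:?}", first_set(&m));
}
```

Scanning right to left means the running set is always the FIRST set of the suffix seen so far, which is why a single pass suffices.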
From 2077d8d944621f223676c2ada7a9d3536c9a613d Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 27 Nov 2015 16:47:23 +0100 Subject: [PATCH 0617/1195] Add the self-referential PR number for the amendment. --- text/0550-macro-future-proofing.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0550-macro-future-proofing.md b/text/0550-macro-future-proofing.md index 6ebb5df51e0..e0ff158030d 100644 --- a/text/0550-macro-future-proofing.md +++ b/text/0550-macro-future-proofing.md @@ -473,7 +473,7 @@ reasonable freedom and can be extended in the future. - Updated by https://github.com/rust-lang/rfcs/pull/1209, which added semicolons into the follow set for types. -- Updated by (fill in after PR number is assigned). +- Updated by https://github.com/rust-lang/rfcs/pull/1384: * replaced detailed design with a specification-oriented presentation rather than an implementation-oriented algorithm. * fixed some oversights in the specification (that led to matchers like `break { stuff }` being accepted), * expanded the follows sets for `ty` to include `OpenDelim(Brace), Ident(where), Or` (since Rust's grammar already requires all of `|foo:TY| {}`, `fn foo() -> TY {}` and `fn foo() -> TY where {}` to work). From 13fcc384211ddbafaddb6f14d85321439f9ac622 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Mon, 30 Nov 2015 15:35:44 +0100 Subject: [PATCH 0618/1195] current macro_rules.rs has { if, in } \subset FOLLOW(pat) --- text/0550-macro-future-proofing.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0550-macro-future-proofing.md b/text/0550-macro-future-proofing.md index e0ff158030d..9d69752abc8 100644 --- a/text/0550-macro-future-proofing.md +++ b/text/0550-macro-future-proofing.md @@ -412,9 +412,9 @@ specifier for the NT. The current legal fragment specifiers are: `item`, `block`, `stmt`, `pat`, `expr`, `ty`, `ident`, `path`, `meta`, and `tt`. 
-- `FOLLOW(pat)` = `{FatArrow, Comma, Eq, Or}` +- `FOLLOW(pat)` = `{FatArrow, Comma, Eq, Or, Ident(if), Ident(in)}` - `FOLLOW(expr)` = `{FatArrow, Comma, Semicolon}` -- `FOLLOW(ty)` = `{OpenDelim(Brace), Comma, FatArrow, Colon, Eq, Gt, Ident(as), Ident(where), Semi, Or}` +- `FOLLOW(ty)` = `{OpenDelim(Brace), Comma, FatArrow, Colon, Eq, Gt, Semi, Or, Ident(as), Ident(where)}` - `FOLLOW(stmt)` = `FOLLOW(expr)` - `FOLLOW(path)` = `FOLLOW(ty)` - `FOLLOW(block)` = any token From fa24ae5dd1c55530d1663bf1fd77c071a85817f6 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Mon, 30 Nov 2015 17:05:58 +0100 Subject: [PATCH 0619/1195] fix typo noted by cmr. thanks! --- text/0550-macro-future-proofing.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0550-macro-future-proofing.md b/text/0550-macro-future-proofing.md index 9d69752abc8..4b3e9b18c66 100644 --- a/text/0550-macro-future-proofing.md +++ b/text/0550-macro-future-proofing.md @@ -396,7 +396,7 @@ FOLLOW(M) = FOLLOW(t1) ∩ FOLLOW(t2) ∩ ... ∩ FOLLOW(tN) ``` Examples of FOLLOW (expressed as equality relations between sets, to avoid -incoporating details of FOLLOW(NT) in these examples): +incorporating details of FOLLOW(NT) in these examples): * FOLLOW(`$( $d:ident $e:expr )*`) = FOLLOW(`$e:expr`) * FOLLOW(`$( $d:ident $e:expr )* $(;)*`) = FOLLOW(`$e:expr`) ∩ ANYTOKEN = FOLLOW(`$e:expr`) From 602bb545c829d9a425614aa8edcbd9b08542386a Mon Sep 17 00:00:00 2001 From: "Felix S. 
Klock II" Date: Mon, 30 Nov 2015 17:08:50 +0100 Subject: [PATCH 0620/1195] updated edit history section to account for commit 13fcc384211ddbafaddb6f14d85321439f9ac622 --- text/0550-macro-future-proofing.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0550-macro-future-proofing.md b/text/0550-macro-future-proofing.md index 4b3e9b18c66..e513679b762 100644 --- a/text/0550-macro-future-proofing.md +++ b/text/0550-macro-future-proofing.md @@ -477,7 +477,7 @@ reasonable freedom and can be extended in the future. * replaced detailed design with a specification-oriented presentation rather than an implementation-oriented algorithm. * fixed some oversights in the specification (that led to matchers like `break { stuff }` being accepted), * expanded the follows sets for `ty` to include `OpenDelim(Brace), Ident(where), Or` (since Rust's grammar already requires all of `|foo:TY| {}`, `fn foo() -> TY {}` and `fn foo() -> TY where {}` to work). - * expanded the follow set for `pat` to include `Or` (since Rust's grammar already requires `match (true,false) { PAT | PAT => {} }` and `|PAT| {}` to work). See also [RFC issue 1336][]. + * expanded the follow set for `pat` to include `Or` (since Rust's grammar already requires `match (true,false) { PAT | PAT => {} }` and `|PAT| {}` to work); see also [RFC issue 1336][]. Also added `If` and `In` to follow set for `pat` (to make the specifiation match the old implementation). [RFC issue 1336]: https://github.com/rust-lang/rfcs/issues/1336 From 5aefeb5961cedb5c98ce46886de8971e556aa4c0 Mon Sep 17 00:00:00 2001 From: "Felix S. 
Klock II" Date: Mon, 30 Nov 2015 17:26:15 +0100 Subject: [PATCH 0621/1195] fixed poor wording in edit history --- text/0550-macro-future-proofing.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0550-macro-future-proofing.md b/text/0550-macro-future-proofing.md index e513679b762..230a146b209 100644 --- a/text/0550-macro-future-proofing.md +++ b/text/0550-macro-future-proofing.md @@ -475,7 +475,7 @@ reasonable freedom and can be extended in the future. - Updated by https://github.com/rust-lang/rfcs/pull/1384: * replaced detailed design with a specification-oriented presentation rather than an implementation-oriented algorithm. - * fixed some oversights in the specification (that led to matchers like `break { stuff }` being accepted), + * fixed some oversights in the specification that led to matchers like `$e:expr { stuff }` being accepted (which match fragments like `break { stuff }`, significantly limiting future language extensions), * expanded the follows sets for `ty` to include `OpenDelim(Brace), Ident(where), Or` (since Rust's grammar already requires all of `|foo:TY| {}`, `fn foo() -> TY {}` and `fn foo() -> TY where {}` to work). * expanded the follow set for `pat` to include `Or` (since Rust's grammar already requires `match (true,false) { PAT | PAT => {} }` and `|PAT| {}` to work); see also [RFC issue 1336][]. Also added `If` and `In` to follow set for `pat` (to make the specifiation match the old implementation). From 80740bac94ffb03603ed9b93fef8f4c037615cab Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Tue, 1 Dec 2015 17:35:28 +0100 Subject: [PATCH 0622/1195] Allocators, take III, at long last. 
--- text/0000-kinds-of-allocators.md | 2061 ++++++++++++++++++++++++++++++ 1 file changed, 2061 insertions(+) create mode 100644 text/0000-kinds-of-allocators.md diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md new file mode 100644 index 00000000000..4cbff5c22a3 --- /dev/null +++ b/text/0000-kinds-of-allocators.md @@ -0,0 +1,2061 @@ +- Feature Name: allocator_api +- Start Date: 2015-12-01 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Add a standard allocator interface and support for user-defined +allocators, with the following goals: + + 1. Allow libraries (in libstd and elsewhere) to be generic with + respect to the particular allocator, to support distinct, + stateful, per-container allocators. + + 2. Require clients to supply metadata (such as block size and + alignment) at the allocation and deallocation sites, to ensure + hot-paths are as efficient as possible. + + 3. Provide high-level abstraction over the layout of an object in + memory. + +Regarding GC: We plan to allow future allocators to integrate +themselves with a standardized reflective GC interface, but leave +specification of such integration for a later RFC. (The design +describes a way to add such a feature in the future while ensuring +that clients do not accidentally opt-in and risk unsound behavior.) + +# Motivation +[motivation]: #motivation + +As noted in [RFC PR 39][] (and reiterated in [RFC PR 244][]), modern general purpose allocators are good, +but due to the design tradeoffs they must make, cannot be optimal in +all contexts. (It is worthwhile to also read discussion of this claim +in papers such as +[Reconsidering Custom Malloc](#reconsidering-custom-memory-allocation).) + +Therefore, the standard library should allow clients to plug in their +own allocator for managing memory. 
+ +## Allocators are used in C++ system programming + +The typical reasons given for use of custom allocators in C++ are among the +following: + + 1. Speed: A custom allocator can be tailored to the particular + memory usage profiles of one client. This can yield advantages + such as: + + * A bump-pointer based allocator, when available, is faster + than calling `malloc`. + + * Adding memory padding can reduce/eliminate false sharing of + cache lines. + + 2. Stability: By segregating different sub-allocators and imposing + hard memory limits upon them, one has a better chance of handling + out-of-memory conditions. + + If everything comes from a single global heap, it becomes much + harder to handle out-of-memory conditions because by the time the + handler runs, it is almost certainly going to be unable to + allocate any memory for its own work. + + 3. Instrumentation and debugging: One can swap in a custom + allocator that collects data such as number of allocations, + or time for requests to be serviced. + +## Allocators should feel "rustic" + +In addition, for Rust we want an allocator API design that leverages +the core type machinery and language idioms (e.g. using `Result`, with +a `NonZero` okay variant and a zero-sized error variant), and provides +premade functions for common patterns for allocator clients (such as +allocating either single instances of a type, or arrays of some types +of dynamically-determined length). + +## Garbage Collection integration + +Finally, we want our allocator design to allow for a garbage +collection (GC) interface to be added in the future. + +At the very least, we do not want to accidentally *disallow* GC by +choosing an allocator API that is fundamentally incompatible with it. + +(However, this RFC does not actually propose a concrete solution for +how to integrate allocators with GC.) 
+ +# Detailed design +[design]: #detailed-design + +## The `Allocator` trait at a glance + +The source code for the `Allocator` trait prototype is provided in an +[appendix][Source for Allocator]. But since that section is long, here +we summarize the high-level points of the `Allocator` API. + +(See also the [walk thru][] section, which actually links to +individual sections of code.) + + * Basic implementation of the trait requires just two methods + (`alloc` and `dealloc`). You can get an initial implementation off + the ground with relatively little effort. + + * All methods that can fail to satisfy a request return a `Result` + (rather than building in an assumption that they panic or abort). + + * Furthermore, allocator implementations are discouraged from + directly panicking or aborting on out-of-memory (OOM) during + calls to allocation methods; instead, + clients that do wish to report that OOM occurred via a particular + allocator can do so via the `Allocator::oom()` method. + + * OOM is not the only type of error that may occur in general; + allocators can inject more specific error types to indicate + why an allocation failed. + + * The metadata for any allocation is captured in a `Kind` + abstraction. This type carries (at minimum) the size and alignment + requirements for a memory request. + + * The `Kind` type provides a large family of functional construction + methods for building up the description of how memory is laid out. + + * Any sized type `T` can be mapped to its `Kind`, via `Kind::new::<T>()`, + + * Heterogeneous structure; e.g. `kind1.extend(kind2)`, + + * Homogeneous array types: `kind.repeat(n)` (for `n: usize`), + + * There are packed and unpacked variants for the latter two methods. + + * Helper `Allocator` methods like `fn alloc_one` and `fn + alloc_array` allow client code to interact with an allocator + without ever directly constructing a `Kind`.
+ + * Once an `Allocator` implementor has the `fn alloc` and `fn dealloc` + methods working, it can provide overrides of the other methods, + providing hooks that take advantage of specific details of how your + allocator is working underneath the hood. + + * In particular, the interface provides a few ways to let clients + potentially reuse excess memory associated with a block + + * `fn realloc` is a common pattern (where the client hopes that + the method will reuse the original memory when satisfying the + `realloc` request). + + * `fn alloc_excess` and `fn usable_size` provide an alternative + pattern, where your allocator tells the client about the excess + memory provided to satisfy a request, and the client can directly + expand into that excess memory, without doing round-trip requests + through the allocator itself. + +## Semantics of allocators and their memory blocks +[semantics of allocators]: #semantics-of-allocators-and-their-memory-blocks + +In general, an allocator provides access to a memory pool that owns +some amount of backing storage. The pool carves off chunks of that +storage and hands them out, via the allocator, as individual blocks of +memory to service client requests. (A "client" here is usually some +container library, like `Vec` or `HashMap`, that has been suitably +parameterized so that it has an `A:Allocator` type parameter.) + +So, an interaction between a program, a collection library, and an +allocator might look like this: + + +If you cannot see the SVG linked here, try the [ASCII art version][ascii-art] appendix. +Also, if you have suggestions for changes to the SVG, feel free to write them as a comment +in that appendix; (but be sure to be clear that you are pointing out a suggestion for the SVG). + + +In general, an allocator might be the backing memory pool itself; or +an allocator might merely be a *handle* that references the memory +pool.
In the former case, when the allocator goes out of scope or is +otherwise dropped, the memory pool is dropped as well; in the latter +case, dropping the allocator has no effect on the memory pool. + + * One allocator that acts as a handle is the global heap allocator, + whose associated pool is the low-level `#[allocator]` crate. + + * Another allocator that acts as a handle is a `&'a Pool`, where + `Pool` is some structure implementing a sharable backing store. + The big [example][] section shows an instance of this. + + * An allocator that is its own memory pool would be a type + analogous to `Pool` that implements the `Allocator` interface + directly, rather than via `&'a Pool`. + + * A case in the middle of the two extremes might be something like an + allocator of the form `Rc<RefCell<Pool>>`. This reflects *shared* + ownership between a collection of allocator handles: dropping one + handle will not drop the pool as long as at least one other handle + remains, but dropping the last handle will drop the pool itself. + +A client that is generic over all possible `A:Allocator` instances +cannot know which of the above cases it falls in. This has consequences +in terms of the restrictions that must be met by client code +interfacing with an allocator, which we discuss in a +later [section on lifetimes][lifetimes]. + + +## Example Usage +[example]: #example-usage + +Let's jump into a demo. Here is a (super-dumb) bump-allocator that uses +the `Allocator` trait. + +### Implementing the `Allocator` trait + +First, the bump-allocator definition itself: each such allocator will +have its own name (for error reports from OOM), start and limit +pointers (`ptr` and `end`, respectively) to the backing storage it is +allocating into, as well as the byte alignment (`align`) of that +storage, and an `avail: AtomicPtr<u8>` for the cursor tracking how +much we have allocated from the backing storage.
+(The `avail` field is an atomic because eventually we want to try +sharing this demo allocator across scoped threads.) + +```rust +struct DumbBumpPool { + name: &'static str, + ptr: *mut u8, + end: *mut u8, + avail: AtomicPtr<u8>, + align: usize, +} +``` + +The initial implementation is pretty straightforward: just immediately +allocate the whole pool's backing storage. + +(If we wanted to be really clever we might layer this type on top of +*another* allocator. +For this demo I want to try to minimize cleverness, so we will use +`heap::allocate` to grab the backing storage instead of taking an +`Allocator` of our own.) + + +```rust +impl DumbBumpPool { + fn new(name: &'static str, + size_in_bytes: usize, + start_align: usize) -> DumbBumpPool { + unsafe { + let ptr = heap::allocate(size_in_bytes, start_align); + if ptr.is_null() { panic!("allocation failed."); } + let end = ptr.offset(size_in_bytes as isize); + DumbBumpPool { + name: name, + ptr: ptr, end: end, avail: AtomicPtr::new(ptr), + align: start_align + } + } + } +} +``` + +Since clients are not allowed to have blocks that outlive their +associated allocator (see the [lifetimes][] section), +it is sound for us to always drop the backing storage for an allocator +when the allocator itself is dropped +(regardless of what sequence of `alloc`/`dealloc` interactions occurred +with the allocator's clients). + +```rust +impl Drop for DumbBumpPool { + fn drop(&mut self) { + unsafe { + let size = self.end as usize - self.ptr as usize; + heap::deallocate(self.ptr, size, self.align); + } + } +} +``` + +Now, before we get into the trait implementation itself, here is an +interesting simple design choice: + + * To show off the error abstraction in the API, we make a special + error type that covers a third case that is not part of the + standard `enum AllocErr`. + +Specifically, our bump allocator has *three* error conditions that we +will expose: + + 1. the inputs could be invalid, + + 2.
the memory could be exhausted, or, + + 3. there could be *interference* between two threads. + This last scenario means that this allocator failed + on this memory request, but the client might + quite reasonably just *retry* the request. + +```rust +#[derive(Copy, Clone, PartialEq, Eq, Debug)] +enum BumpAllocError { Invalid, MemoryExhausted, Interference } + +impl alloc::AllocError for BumpAllocError { + fn invalid_input() -> Self { BumpAllocError::Invalid } + fn is_memory_exhausted(&self) -> bool { *self == BumpAllocError::MemoryExhausted } + fn is_request_unsupported(&self) -> bool { false } + fn is_transient(&self) -> bool { *self == BumpAllocError::Interference } +} +``` + +With that out of the way, here are some other design choices of note: + + * Our Bump Allocator is going to use a most simple-minded deallocation + policy: calls to `fn dealloc` are no-ops. Instead, every request takes + up fresh space in the backing storage, until the pool is exhausted. + (This is one reason I use the word "Dumb" in its name.) + + * Since we want to be able to share the bump-allocator amongst multiple + (lifetime-scoped) threads, we will implement the `Allocator` interface + as a *handle* pointing to the pool; in this case, a simple reference. + +Here is the demo implementation of `Allocator` for the type.
+ +```rust +impl<'a> Allocator for &'a DumbBumpPool { + type Kind = alloc::Kind; + type Error = BumpAllocError; + + unsafe fn alloc(&mut self, kind: &Self::Kind) -> Result<Address, Self::Error> { + let curr = self.avail.load(Ordering::Relaxed) as usize; + let align = *kind.align(); + // round the cursor up to the requested alignment + let curr_aligned = curr.wrapping_add(align - 1) & !(align - 1); + let size = *kind.size(); + let remaining = (self.end as usize) - curr_aligned; + if remaining <= size { + return Err(BumpAllocError::MemoryExhausted); + } + + let curr = curr as *mut u8; + let curr_aligned = curr_aligned as *mut u8; + let new_curr = curr_aligned.offset(size as isize); + + if curr != self.avail.compare_and_swap(curr, new_curr, Ordering::Relaxed) { + return Err(BumpAllocError::Interference); + } else { + println!("alloc finis ok: 0x{:x} size: {}", curr_aligned as usize, size); + return Ok(NonZero::new(curr_aligned)); + } + } + + unsafe fn dealloc(&mut self, _ptr: Address, _kind: &Self::Kind) -> Result<(), Self::Error> { + // this bump-allocator just no-op's on dealloc + Ok(()) + } + + unsafe fn oom(&mut self) -> ! { + panic!("exhausted memory in {}", self.name); + } + +} +``` + +(Niko Matsakis has pointed out that this particular allocator might +avoid interference errors by using fetch-and-add rather than +compare-and-swap. The devil's in the details as to how one might +accomplish that while still properly adjusting for alignment; in any +case, the overall point still holds in cases outside of this specific +demo.) + +And that is it; we are done with our allocator implementation. + +### Using an `A:Allocator` from the client side + +We assume that `Vec` has been extended with a `new_in` method that +takes an allocator argument that it uses to satisfy its allocation +requests.
+ +```rust +fn demo_alloc(a1:A1, a2: A2, print_state: F) { + let mut v1 = Vec::new_in(a1); + let mut v2 = Vec::new_in(a2); + println!("demo_alloc, v1; {:?} v2: {:?}", v1, v2); + for i in 0..10 { + v1.push(i as u64 * 1000); + v2.push(i as u8); + v2.push(i as u8); + } + println!("demo_alloc, v1; {:?} v2: {:?}", v1, v2); + print_state(); + for i in 10..100 { + v1.push(i as u64 * 1000); + v2.push(i as u8); + v2.push(i as u8); + } + println!("demo_alloc, v1.len: {} v2.len: {}", v1.len(), v2.len()); + print_state(); + for i in 100..1000 { + v1.push(i as u64 * 1000); + v2.push(i as u8); + v2.push(i as u8); + } + println!("demo_alloc, v1.len: {} v2.len: {}", v1.len(), v2.len()); + print_state(); +} + +fn main() { + use std::thread::catch_panic; + + if let Err(panicked) = catch_panic(|| { + let alloc = DumbBumpPool::new("demo-bump", 4096, 1); + demo_alloc(&alloc, &alloc, || println!("alloc: {:?}", alloc)); + }) { + match panicked.downcast_ref::() { + Some(msg) => { + println!("DumbBumpPool panicked: {}", msg); + } + None => { + println!("DumbBumpPool panicked"); + } + } + } + + // // The below will be (rightly) rejected by compiler when + // // all pieces are properly in place: It is not valid to + // // have the vector outlive the borrowed allocator it is + // // referencing. + // + // let v = { + // let alloc = DumbBumpPool::new("demo2", 4096, 1); + // let mut v = Vec::new_in(&alloc); + // for i in 1..4 { v.push(i); } + // v + // }; + + let alloc = DumbBumpPool::new("demo-bump", 4096, 1); + for i in 0..100 { + let r = ::std::thread::scoped(|| { + let v = Vec::new_in(&alloc); + for j in 0..10 { + v.push(j); + } + }); + } + + println!("got here"); +} +``` + +And that's all to the demo, folks. + +## Allocators and lifetimes +[lifetimes]: #allocators-and-lifetimes + +As mentioned above, allocators provide access to a memory pool. 
An +allocator can *be* the pool (in the sense that the allocator owns the +backing storage that represents the memory blocks it hands out), or an +allocator can just be a handle that points at the pool. + +Some pools have indefinite extent. An example of this is the global +heap allocator, requesting memory directly from the low-level +`#[allocator]` crate. Clients of an allocator with such a pool need +not think about how long the allocator lives; instead, they can just +freely allocate blocks, use them at will, and deallocate them at +arbitrary points in the future. Memory blocks that come from such a +pool will leak if it is not explicitly deallocated. + +Other pools have limited extent: they are created, they build up +infrastructure to manage their blocks of memory, and at some point, +such pools are torn down. Memory blocks from such a pool may or may +not be returned to the operating system during that tearing down. + +There is an immediate question for clients of an allocator with the +latter kind of pool (i.e. one of limited extent): whether it should +attempt to spend time deallocating such blocks, and if so, at what +time to do so? + +Again, note: + + * generic clients (i.e. that accept any `A:Allocator`) *cannot know* + what kind of pool they have, or how it relates to the allocator it + is given, + + * dropping the client's allocator may or may not imply the dropping + of the pool itself! + +That is, code written to a specific `Allocator` implementation may be +able to make assumptions about the relationship between the memory +blocks and the allocator(s), but the generic code we expect the +standard library to provide cannot make such assumptions. + +To satisfy the above scenarios in a sane, consistent, general fashion, +the `Allocator` trait assumes/requires all of the following: + + 1. (for allocator impls and clients): in the absence of other + information (e.g. 
specific allocator implementations), all blocks + from a given pool have lifetime equivalent to the lifetime of the + pool. + + This implies if a client is going to read from, write to, or + otherwise manipulate a memory block, the client *must* do so before + its associated pool is torn down. + + (It also implies the converse: if a client can prove that the pool + for an allocator is still alive, then it can continue to work + with a memory block from that allocator even after the allocator + is dropped.) + + 2. (for allocator impls): an allocator *must not* outlive its + associated pool. + + All clients can assume this in their code. + + (This constraint provides generic clients the preconditions they + need to satisfy the first condition. In particular, even though + clients do not generally know what kind of pool is associated with + its allocator, it can conservatively assume that all blocks will + live at least as long as the allocator itself.) + + 3. (for allocator impls and clients): all clients of an allocator + *should* eventually call the `dealloc` method on every block they + want freed (otherwise, memory may leak). + + However, allocator implementations *must* remain sound even if + this condition is not met: If `dealloc` is not invoked for all + blocks and this condition is somehow detected, then an allocator + can panic (or otherwise signal failure), but that sole violation + must not cause undefined behavior. + + (This constraint is to encourage generic client authors to write + code that will not leak memory when instantiated with allocators + of indefinite extent, such as the global heap allocator.) + + 4. (for allocator impls): moving an allocator value *must not* + invalidate its outstanding memory blocks. + + All clients can assume this in their code. + + So if a client allocates a block from an allocator (call it `a1`) + and then `a1` moves to a new place (e.g. 
via `let a2 = a1;`), then + it remains sound for the client to deallocate that block via + `a2`. + + Note that this implies that it is not sound to implement an + allocator that embeds its own pool structurally inline. + + E.g. this is *not* a legal allocator: + ```rust + struct MegaEmbedded { pool: [u8; 1024*1024], cursor: usize, ... } + impl Allocator for MegaEmbedded { ... } + ``` + The latter impl is simply unreasonable (at least if one is + intending to satisfy requests by returning pointers into + `self.pool`). + + (Note of course, `impl Allocator for &mut MegaEmbedded` is in + principle *fine*; that would then be an allocator that is an + indirect handle to an unembedded pool.) + + 5. (for allocator impls and clients) if an allocator is cloneable, the + client *can assume* that all clones + are interchangeably compatible in terms of their memory blocks: if + allocator `a2` is a clone of `a1`, then one can allocate a block + from `a1` and return it to `a2`, or vice versa, or use `a2.realloc` + on the block, et cetera. + + This essentially means that any cloneable + allocator *must* be a handle indirectly referencing a pool of some + sort. (Though do remember that such handles can collectively share + ownership of their pool, as illustrated in the + `Rc<RefCell<Pool>>` example given earlier.) + + (Note: one might be tempted to further conclude that this also + implies that allocators implementing `Copy` must have pools of + indefinite extent. While this seems reasonable for Rust as it + stands today, I am slightly worried whether it would continue to + hold e.g. in a future version of Rust with something like + `Gc: Copy`, where the `GcPool` and its blocks are reclaimed + (via finalization) sometime after being determined to be globally + unreachable. Then again, perhaps it would be better to simply say + "we will not support that use case for the allocator API", so that + clients would be able to employ the reasoning outlined at the + outset of this paragraph.)
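
To make condition 4 concrete, here is a small sketch (not part of the prototype; `InlinePool` and the helper functions are hypothetical names) of why an allocator that embeds its pool inline cannot keep outstanding blocks valid across a move, while a handle to a pool can. It only compares addresses and never dereferences a stale pointer.

```rust
/// A pool embedded inline in the value itself (hypothetical type, a
/// scaled-down stand-in for `MegaEmbedded`).
struct InlinePool {
    pool: [u8; 64],
}

/// Address of the first byte of backing storage, as an integer.
fn storage_addr(p: &InlinePool) -> usize {
    p.pool.as_ptr() as usize
}

/// Moves an inline pool from the stack into a `Box` and reports whether
/// the backing storage ended up at a different address.
fn inline_storage_moves() -> bool {
    let a1 = InlinePool { pool: [0; 64] };
    let before = storage_addr(&a1);
    let a2 = Box::new(a1); // `a1` is moved to the heap here
    let after = storage_addr(&a2);
    before != after
}

/// Moves a *handle* (`&InlinePool`) and reports whether the storage it
/// refers to stayed where it was.
fn handle_storage_stays() -> bool {
    let pool = InlinePool { pool: [0; 64] };
    let h1 = &pool;
    let before = storage_addr(h1);
    let h2 = h1; // moving (copying) the handle does not touch the pool
    before == storage_addr(h2)
}

fn main() {
    println!("inline storage moved: {}", inline_storage_moves());
    println!("handle storage stayed: {}", handle_storage_stays());
}
```

Any block address handed out from the inline pool before the move would point at the old location afterwards, which is exactly the situation the condition rules out.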
+ + +## A walk through the Allocator trait +[walk thru]: #a-walk-through-the-allocator-trait + +### Role-Based Type Aliases + +Allocation code often needs to deal with values that boil down to a +`usize` in the end. But there are distinct roles (e.g. "size", +"alignment") that such values play, and I decided those roles would be +worth hard-coding into the method signatures. + + * Therefore, I made [type aliases][] for `Size`, `Capacity`, `Alignment`, and `Address`. + +Furthermore, all values of the above types must be non-zero for any +allocation action to make sense. + + * Therefore, I made them instances of the `NonZero` type. + +### Basic implementation + +An instance of an allocator has many methods, but an implementor of +the trait need only provide two method bodies: [alloc and dealloc][]. + +(This is only *somewhat* analogous to the `Iterator` trait in Rust. It +is currently very uncommon to override any methods of `Iterator` except +for `fn next`. However, I expect it will be much more common for +`Allocator` to override at least some of the other methods, like `fn +realloc`.) + +The `alloc` method returns an `Address` when it succeeds, and +`dealloc` takes such an address as its input. But the client must also +provide metadata for the allocated block like its size and alignment. +This is encapsulated in the `Kind` argument to `alloc` and `dealloc`. + +### Kinds of allocations + +A `Kind` just carries the metadata necessary for satisfying an +allocation request. Its (current, private) representation is just a +size and alignment. + +The more interesting thing about `Kind` is the +family of public methods associated with it for building new kinds via +composition; these are shown in the [kind api][].
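To give a feel for building a layout by composition, here is a minimal sketch. It uses stable Rust's `std::alloc::Layout` as a stand-in for `Kind` (an assumption made so the example compiles today; `Layout` carries the same size-plus-alignment information and offers an `extend` method of similar shape). The function name `header_then_word` is hypothetical.

```rust
use std::alloc::Layout;

/// Describes a record holding a `u8` followed by a `u32`.
/// Returns the combined layout and the byte offset of the `u32` field.
pub fn header_then_word() -> (Layout, usize) {
    let first = Layout::new::<u8>();
    let second = Layout::new::<u32>();
    // `extend` inserts whatever padding `second` needs after `first`.
    let (combined, offset) = first
        .extend(second)
        .expect("two small scalar layouts cannot overflow");
    // Round the total size up to the record's alignment, as a struct would.
    (combined.pad_to_align(), offset)
}
```

The caller never computes padding by hand: the offset of the second field and the overall alignment fall out of the composition.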
+ +### Reallocation Methods + +Of course, real-world allocation often needs more than just +`alloc`/`dealloc`: in particular, one often wants to avoid extra +copying if the existing block of memory can be conceptually expanded +in place to meet new allocation needs. In other words, we want +`realloc`, plus alternatives to it that allow clients to avoid +round-tripping through the allocator API. + +For this, the [memory reuse][] family of methods is appropriate. + +### Type-based Helper Methods + +Some readers might skim over the `Kind` API and immediately say "yuck, +all I wanted to do was allocate some nodes for a tree-structure and +let my clients choose how the backing memory is chosen! Why do I have +to wrestle with this `Kind` business?" + +I agree with the sentiment; that's why the `Allocator` trait provides +a family of methods capturing [common usage patterns][]. + +## Unchecked variants + +Finally, all of the methods above return `Result`, and guarantee some +amount of input validation. (This is largely because I observed code +duplication doing such validation on the client side; or worse, such +validation accidentally missing.) + +However, some clients will want to bypass such checks (and do it +without risking undefined behavior by ensuring the preconditions hold +via local invariants in their container type). + +For these clients, the `Allocator` trait provides +["unchecked" variants][unchecked variants] of nearly all of its +methods. + +The idea here is that `Allocator` implementors are encouraged +to streamline the implementations of such methods by assuming that all +of the preconditions hold. + + * However, to ease initial `impl Allocator` development for a given + type, all of the unchecked methods have default implementations + that call out to their checked counterparts. + + * (In other words, "unchecked" is in some sense a privilege being + offered to impl's; but there is no guarantee that an arbitrary impl + takes advantage of the privilege.)
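The checked/unchecked split described above can be sketched with a toy trait. `MiniAlloc`, `Bump`, `Exhausted`, and the method names are all hypothetical and far simpler than the real `Allocator` trait; the sketch only shows an unchecked method whose default body forwards to the checked one.

```rust
/// Zero-sized error for the toy allocator below (hypothetical name).
#[derive(Debug, PartialEq)]
pub struct Exhausted;

/// Hypothetical, simplified stand-in for the `Allocator` trait.
pub trait MiniAlloc {
    /// Checked entry point: validates the request and reports failure.
    fn reserve(&mut self, size: usize) -> Result<usize, Exhausted>;

    /// Unchecked entry point. Callers promise that `size` is non-zero and
    /// that the request fits. The default body still routes through the
    /// checked method, so a new impl is correct before it is fast.
    unsafe fn reserve_unchecked(&mut self, size: usize) -> usize {
        self.reserve(size)
            .expect("caller violated the preconditions of reserve_unchecked")
    }
}

/// Toy bump allocator that hands out offsets into a fixed capacity.
pub struct Bump {
    pub cursor: usize,
    pub capacity: usize,
}

impl MiniAlloc for Bump {
    fn reserve(&mut self, size: usize) -> Result<usize, Exhausted> {
        if size == 0 {
            return Err(Exhausted);
        }
        let end = self.cursor.checked_add(size).ok_or(Exhausted)?;
        if end > self.capacity {
            return Err(Exhausted);
        }
        let start = self.cursor;
        self.cursor = end;
        Ok(start)
    }
    // `reserve_unchecked` is inherited; an optimized impl could override it
    // and skip the validation above.
}
```

An implementor who later wants the fast path overrides only `reserve_unchecked`; clients that call it are unaffected either way.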
+ +## Why this API +[Why this API]: #why-this-api + +Here are some quick points about how this API was selected + +### Why not just `free(ptr)` for deallocation? + +As noted in [RFC PR 39][] (and reiterated in [RFC PR 244][]), the basic `malloc` interface +{`malloc(size) -> ptr`, `free(ptr)`, `realloc(ptr, size) -> ptr`} is +lacking in a number of ways: `malloc` lacks the ability to request a +particular alignment, and `realloc` lacks the ability to express a +copy-free "reuse the input, or do nothing at all" request. Another +problem with the `malloc` interface is that it burdens the allocator +with tracking the sizes of allocated data and re-extracting the +allocated size from the `ptr` in `free` and `realloc` calls (the +latter can be very cheap, but there is still no reason to pay that +cost in a language like Rust where the relevant size is often already +immediately available as a compile-time constant). + +Therefore, in the name of (potential best-case) speed, we want to +require client code to provide the metadata like size and alignment +to both the allocation and deallocation call sites. + +### Why not just `alloc`/`dealloc` (or `alloc`/`dealloc`/`realloc`)? + +* The `alloc_one`/`dealloc_one` and `alloc_array`/`dealloc_array` + capture a very common pattern for allocation of memory blocks where + a simple value or array type is being allocated. + +* The `alloc_array_unchecked` and `dealloc_array_unchecked` likewise + capture a similar pattern, but are "less safe" in that they put more + of an onus on the caller to validate the input parameters before + calling the methods. + +* The `alloc_excess` and `realloc_excess` methods provide a way for + callers who can make use of excess memory to avoid unnecessary calls + to `realloc`. + + +### Why `alloc_array_unchecked` and `dealloc_array_unchecked`? + +### Why the `Kind` abstraction? 
+ +While we do want to require clients to hand the allocator the size and +alignment, we have found that the code to compute such things follows +regular patterns. It makes more sense to factor those patterns out +into a common abstraction; this is what `Kind` provides: a high-level +API for describing the memory layout of a composite structure by +composing the layout of its subparts. + +### Why return `Result` rather than a raw pointer? + +My hypothesis is that the standard allocator API should embrace +`Result` as the standard way for describing local error conditions in +Rust. + +In principle, we can use `Result` without adding *any* additional +overhead (at least in terms of the size of the values being returned +from the allocation calls), because the error type for the `Result` +can be zero-sized if so desired. That is why the error is an +associated type of the `Allocator`: allocators that want to ensure the +results have minimum size can use the zero-sized `RequestUnsatisfied` +or `MemoryExhausted` types as their associated `Self::Error`. + + * `RequestUnsatisfied` is a catch-all type that any allocator + could use as its error type; doing so provides no hint to the + client as to what they could do to try to service future memory + requests. + + * `MemoryExhausted` is a specific error type meant for allocators + that could in principle handle *any* sane input request, if there + were sufficient memory available. (By "sane" we mean for example + that the input arguments do not cause an arithmetic overflow during + computation of the size of the memory block -- if they do, then it + is reasonable for an allocator with this error type to respond that + insufficient memory was available, rather than e.g. panicking.) + +### Why return `Result` rather than directly `oom` on failure + +Again, my hypothesis is that the standard allocator API should embrace +`Result` as the standard way for describing local error conditions in +Rust.
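Both points about `Result` can be seen in a short sketch on stable Rust. The unit struct below borrows the name `MemoryExhausted` from the text, but its definition here is an assumption, as are the helper functions `try_alloc`, `with_scratch`, and `result_is_pointer_sized`. The sketch goes through the global allocator in `std::alloc` rather than the `Allocator` trait proposed in this RFC.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::mem::size_of;
use std::ptr::NonNull;

/// Zero-sized error type (definition assumed for this sketch).
#[derive(Debug, PartialEq)]
pub struct MemoryExhausted;

/// With a zero-sized error and a non-null pointer payload, the `Result`
/// occupies no more space than a raw pointer.
pub fn result_is_pointer_sized() -> bool {
    size_of::<Result<NonNull<u8>, MemoryExhausted>>() == size_of::<*mut u8>()
}

/// Hypothetical helper: request a block from the global allocator and
/// report failure through `Result` instead of aborting.
pub fn try_alloc(layout: Layout) -> Result<NonNull<u8>, MemoryExhausted> {
    if layout.size() == 0 {
        // The global allocator requires a non-zero size.
        return Err(MemoryExhausted);
    }
    // SAFETY: `layout` has non-zero size, as checked above.
    let raw = unsafe { alloc(layout) };
    NonNull::new(raw).ok_or(MemoryExhausted)
}

/// The client, not the allocator, decides how to react to failure.
pub fn with_scratch(len: usize) -> Option<usize> {
    let layout = Layout::array::<u8>(len).ok()?;
    match try_alloc(layout) {
        Ok(block) => {
            // SAFETY: `block` came from `alloc` with this same layout.
            unsafe { dealloc(block.as_ptr(), layout) };
            Some(len)
        }
        // A client could instead retry with a smaller request or abort.
        Err(MemoryExhausted) => None,
    }
}
```

Here `with_scratch` chooses to report `None`; a different client given the same `Err` could just as well abort, which is the flexibility that returning `Result` preserves.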
+ +I want to leave it up to the clients to decide if they can respond to +out-of-memory (OOM) conditions on allocation failure. + +However, since I also suspect that some programs would benefit from +contextual information about *which* allocator is reporting memory +exhaustion, I have made `oom` a method of the `Allocator` trait, so +that allocator clients can just call that on error (assuming they want +to trust the failure behavior of the allocator). + +### Why is `usable_size` ever needed? Why not call `kind.size()` directly, as is done in the default implementation? + +`kind.size()` returns the minimum required size that the client needs. +In a block-based allocator, this may be less than the *actual* size +that the allocator would ever provide to satisfy that `kind` of +request. Therefore, `usable_size` provides a way for clients to +observe what the minimum actual size of an allocated block for +that `kind` would be, for a given allocator. + +(Note that the documentation does say that in general it is better for +clients to use `alloc_excess` and `realloc_excess` instead, if they +can, as a way to directly observe the *actual* amount of slop provided +by the particular allocator.) + +### Why is `Allocator` an `unsafe trait`? + +It just seems like a good idea given how much of the standard library +is going to assume that allocators are implemented according to their +specification. + +(I had thought that `unsafe fn` for the methods would suffice, but +that is putting the burden of proof (of soundness) in the *wrong* +direction...) + +## The GC integration strategy +[gc integration]: #the-gc-integration-strategy + +One of the main reasons that [RFC PR 39] was not merged as written +was that it did not account for garbage collection (GC). + +In particular, assuming that we eventually add support for GC in some +form, then any value that holds a reference to an object on the GC'ed +heap will need some linkage to the GC.
In particular, if the *only* +such reference (i.e. the one with sole ownership) is held in a block +managed by a user-defined allocator, then we need to ensure that all +such references are found when the GC does its work. + +The Rust project has control over the `libstd` provided allocators, so +the team can adapt them as necessary to fit the needs of whatever GC +designs come around. But the same is not true for user-defined +allocators: we want to ensure that adding support for them does not +inadvertently kill any chance for adding GC later. + +### The inspiration for Kind + +Some aspects of the design of this RFC were selected in the hopes that +it would make such integration easier. In particular, the introduction +of the relatively high-level `Kind` abstraction was developed, in +part, as a way that a GC-aware allocator would build up a tracing +method associated with a kind. + +Then I realized that the `Kind` abstraction may be valuable on its +own, without GC: It encapsulates important patterns when working with +representing data as memory records. + +So, this RFC offers the `Kind` abstraction without promising that it +solves the GC problem. (It might, or it might not; we don't know yet.) + +### Forwards-compatibility + +So what *is* the solution for forwards-compatibility? + +It is this: Rather than trying to build GC support into the +`Allocator` trait itself, we instead assume that when GC support +comes, it may come with a new trait (call it `GcAwareAllocator`). + + * (Perhaps we will instead use an attribute; the point is, whatever + option we choose can be incorporated into the meta-data for a + crate.) + +Allocators that are GC-compatible have to explicitly declare +themselves as such, by implementing `GcAwareAllocator`, which will +then impose new conditions on the methods of `Allocator`, for example +ensuring that allocated blocks of memory can be scanned +(i.e. "parsed") by the GC (if that in fact ends up being necessary).
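One possible shape for such an opt-in is a marker trait layered on top of `Allocator`. The sketch below is speculative: the reduced `Allocator` trait, its `name` method, and the two heap types are all hypothetical, and the text leaves open whether the eventual mechanism is a trait, an attribute, or a link-time check.

```rust
/// Hypothetical, heavily reduced stand-in for the `Allocator` trait.
pub unsafe trait Allocator {
    fn name(&self) -> &'static str;
}

/// Hypothetical marker trait: implementing it is a promise that blocks
/// from this allocator satisfy whatever extra conditions a GC imposes.
pub unsafe trait GcAwareAllocator: Allocator {}

/// An allocator that makes no GC-related promises.
pub struct PlainHeap;

/// An allocator that opts in to GC compatibility.
pub struct ScannableHeap;

unsafe impl Allocator for PlainHeap {
    fn name(&self) -> &'static str {
        "plain"
    }
}

unsafe impl Allocator for ScannableHeap {
    fn name(&self) -> &'static str {
        "scannable"
    }
}

unsafe impl GcAwareAllocator for ScannableHeap {}

/// Code that may store GC references in allocated blocks demands the
/// stronger bound; passing `&PlainHeap` here would fail to compile.
pub fn store_gc_refs_in<A: GcAwareAllocator>(alloc: &A) -> &'static str {
    alloc.name()
}
```

Existing `Allocator` impls keep compiling unchanged under this scheme; they are simply rejected wherever the GC-aware bound is required.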
+ +This way, we can deploy an `Allocator` trait API today that does not +provide the necessary reflective hooks that a GC would need to access. + +Crates that define their own `Allocator` implementations without also +claiming them to be GC-compatible will be forbidden from linking with +crates that require GC support. (In other words, when GC support +comes, we assume that the linking component of the Rust compiler will +be extended to check such compatibility requirements.) + +# Drawbacks +[drawbacks]: #drawbacks + +The API may be over-engineered. + +The core set of methods (the ones without `unchecked`) return +`Result` and potentially impose unwanted input validation overhead. + + * The `_unchecked` variants are intended as the response to that, + for clients who take care to validate the many preconditions + themselves in order to minimize the allocation code paths. + +# Alternatives +[alternatives]: #alternatives + +## Just adopt [RFC PR 39][] with this RFC's GC strategy + +The GC-compatibility strategy described here (in [gc integration][]) +might work with a large number of alternative designs, such as that +from [RFC PR 39][]. + +While that is true, it seems like it would be a little short-sighted. +In particular, I have neither proven *nor* disproven the value of +the `Kind` system described here with respect to GC integration. + +As far as I know, it is the closest thing we have to a workable system +for allowing client code of allocators to accurately describe the +layout of values they are planning to allocate, which is the main +ingredient I believe to be necessary for the kind of dynamic +reflection that a GC will require of a user-defined allocator. + +## Make `Kind` an associated type of `Allocator` trait + +I explored making an `AllocKind` bound and then having + +```rust +pub unsafe trait Allocator { + /// Describes the sort of records that this allocator can + /// construct. + type Kind: AllocKind; + + ...
+} +``` + +Such a design might indeed be workable. (I found it awkward, which is +why I abandoned it.) + +But the question is: What benefit does it bring? + +The main one I could imagine is that it might allow us to introduce a +division, at the type-system level, between two kinds of allocators: +those that are integrated with the GC (i.e., have an associated +`Allocator::Kind` that ensures that all allocated blocks are scannable +by a GC) and allocators that are *not* integrated with the GC (i.e., +have an associated `Allocator::Kind` that makes no guarantees about +whether one will know how to scan the allocated blocks). + +However, no such design has proven itself to be "obviously feasible to +implement," and therefore it would be unreasonable to make the `Kind` +an associated type of the `Allocator` trait without having at least a +few motivating examples that *are* clearly feasible and useful. + +## Variations on the `Kind` API + + * Should `Kind` offer a `fn resize(&self, new_size: usize) -> Kind` constructor method? + (Such a method would rule out deriving GC tracers from kinds; but we could + maybe provide it as an `unsafe` method.) + + * Should `Kind` ensure an invariant that its associated size is + always a multiple of its alignment? + + * Doing this would allow simplifying a small part of the API, + namely the distinct `Kind::repeat` (returns both a kind and an + offset) versus `Kind::array` (where the offset is derivable from + the input `T`). + + * Such a constraint would have precedent; in particular, the + `aligned_alloc` function of C11 requires the given size + be a multiple of the alignment. + + * On the other hand, both the system and jemalloc allocators seem + to support more flexible allocation patterns. Imposing the above + invariant implies a certain loss of expressiveness over what we + already provide today. + + * Should `Kind` ensure an invariant that its associated size is always positive?
+ + * Pro: Removes something that allocators would need to check about + input kinds (the backing memory allocators will tend to require + that the input sizes are positive). + + * Con: Requiring positive size means that zero-sized types do not have an associated + `Kind`. That's not the end of the world, but it does make the `Kind` API slightly + less convenient (e.g. one cannot use `extend` with a zero-sized kind to + forcibly inject padding, because zero-sized kinds do not exist). + + * Should `Kind::align_to` add padding to the associated size? (Probably not; this would + make it impossible to express certain kinds of patterns.) + + * Should the `Kind` methods that might "fail" return `Result` instead of `Option`? + +## Variations on the `Allocator` API + + * Should `Allocator::alloc` be safe instead of `unsafe fn`? + + * Clearly `fn dealloc` and `fn realloc` need to be `unsafe`, since + feeding in improper inputs could cause unsound behavior. But is + there any analogous input to `fn alloc` that could cause + unsoundness (assuming that the `Kind` struct enforces invariants + like "the associated size is non-zero")? + + * (I left it as `unsafe fn alloc` just to keep the API uniform with + `dealloc` and `realloc`.) + + * Should `Allocator::realloc` not require that `new_kind.align()` + evenly divide `kind.align()`? In particular, it is not too + expensive to check if the two kinds are not compatible, and fall + back on `alloc`/`dealloc` in that case. + + * Should `Allocator` not provide unchecked variants on `fn alloc`, + `fn realloc`, et cetera? (To me it seems having them does no harm, + apart from potentially misleading clients who do not read the + documentation about what scenarios yield undefined behavior.)
+ + * Another option here would be to provide a `trait + UncheckedAllocator: Allocator` that carries the unchecked + methods, so that clients who require such micro-optimized paths + can ensure that their clients actually pass them an + implementation that has the checks omitted. + + * On the flip-side of the previous bullet, should `Allocator` provide + `fn alloc_one_unchecked` and `fn dealloc_one_unchecked` ? + I think the only check that such variants would elide would be that + `T` is not zero-sized; I'm not sure that's worth it. + (But the resulting uniformity of the whole API might shift the + balance to "worth it".) + +# Unresolved questions +[unresolved]: #unresolved-questions + + * Should `Kind` be an associated type of `Allocator` (see + [alternatives][] section for discussion). + (In fact, most of the "Variations" correspond to potentially + unresolved questions.) + + * Should `dealloc` return a `Result` or not? (Under what + circumstances would we expect `dealloc` to fail in a manner worth + signalling? The main one I can think of is a transient failure, + which is why the documentation for that method spends so much time + discussing it.) + + * Are the type definitions for `Size`, `Capacity`, `Alignment`, and + `Address` an abuse of the `NonZero` type? (Or do we just need some + constructor for `NonZero` that asserts that the input is non-zero)? + + * Should `fn oom(&self)` take in more arguments (e.g. to allow the + client to provide more contextual information about the OOM + condition)? + + * Does `AllocError::is_transient` belong in this version of the API, + or should we wait to add it later? (I originally suspected that + libstd data types would want to make use of it, which would mean + we should add it. However, in the absence of a concrete example + stdlib type that would use it, we may be better off removing `fn + is_transient` from this API (instead specifying that allocators + with such transient failures will block (i.e.
loop and retry + internally), with the expectation that if a need for such an + allocator does arise, we will then represent the API extension + via a different trait (perhaps an extension trait of `Allocator`)).) + + * On that note, if we remove the `fn is_transient` method, should + we get rid of the `AllocError` bound entirely? Is the given set + of methods actually worth providing to all generic clients? + + (Keeping it seems very low cost to me; implementors can always opt + to use the `MemoryExhausted` error type, which is cheap. But my + intuition may be wrong.) + + * Do we need `Allocator::max_size` and `Allocator::max_align` ? + + * Should default impl of `Allocator::max_align` return `None`, or is + there a more suitable default? (perhaps e.g. `PLATFORM_PAGE_SIZE`?) + + The previous allocator documentation provided by Daniel Micay + suggests that we should specify that behavior is unspecified if + the allocation is too large, but if that is the case, then we should + definitely provide some way to *observe* that threshold. + + From what I can tell, we cannot currently assume that all + low-level allocators will behave well for large alignments. + See https://github.com/rust-lang/rust/issues/30170 + + +# Appendices + +## Bibliography +[Bibliography]: #bibliography + +### RFC Pull Request #39: Allocator trait +[RFC PR 39]: https://github.com/rust-lang/rfcs/pull/39/files + +Daniel Micay, 2014. RFC: Allocator trait. https://github.com/thestinger/rfcs/blob/ad4cdc2662cc3d29c3ee40ae5abbef599c336c66/active/0000-allocator-trait.md + +### RFC Pull Request #244: Allocator RFC, take II +[RFC PR 244]: https://github.com/rust-lang/rfcs/pull/244 + +Felix Klock, 2014, Allocator RFC, take II, https://github.com/pnkfelix/rfcs/blob/d3c6068e823f495ee241caa05d4782b16e5ef5d8/active/0000-allocator.md + +### Dynamic Storage Allocation: A Survey and Critical Review +Paul R. Wilson, Mark S. Johnstone, Michael Neely, and David Boles, 1995.
[Dynamic Storage Allocation: A Survey and Critical Review](https://parasol.tamu.edu/~rwerger/Courses/689/spring2002/day-3-ParMemAlloc/papers/wilson95dynamic.pdf) ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps . Slightly modified version appears in Proceedings of 1995 International Workshop on Memory Management (IWMM '95), Kinross, Scotland, UK, September 27--29, 1995 Springer Verlag LNCS + +### Reconsidering custom memory allocation +[ReCustomMalloc]: http://dl.acm.org/citation.cfm?id=582421 + +Emery D. Berger, Benjamin G. Zorn, and Kathryn S. McKinley. 2002. [Reconsidering custom memory allocation][ReCustomMalloc]. In Proceedings of the 17th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications (OOPSLA '02). + +### The memory fragmentation problem: solved? +[MemFragSolvedP]: http://dl.acm.org/citation.cfm?id=286864 + +Mark S. Johnstone and Paul R. Wilson. 1998. [The memory fragmentation problem: solved?][MemFragSolvedP]. In Proceedings of the 1st international symposium on Memory management (ISMM '98). + +### EASTL: Electronic Arts Standard Template Library +[EASTL]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html + +Paul Pedriana. 2007. [EASTL] -- Electronic Arts Standard Template Library. Document number: N2271=07-0131 + +### Towards a Better Allocator Model +[Halpern proposal]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1850.pdf + +Pablo Halpern. 2005. [Towards a Better Allocator Model][Halpern proposal]. 
Document number: N1850=05-0110 + +### Various allocators + +[jemalloc], [tcmalloc], [Hoard] + +[jemalloc]: http://www.canonware.com/jemalloc/ + +[tcmalloc]: http://goog-perftools.sourceforge.net/doc/tcmalloc.html + +[Hoard]: http://www.hoard.org/ + +[tracing garbage collector]: http://en.wikipedia.org/wiki/Tracing_garbage_collection + +[malloc/free]: http://en.wikipedia.org/wiki/C_dynamic_memory_allocation + +## ASCII art version of Allocator message sequence chart +[ascii-art]: #ascii-art-version-of-allocator-message-sequence-chart + +This is an ASCII art version of the SVG message sequence chart +from the [semantics of allocators] section. + +``` +Program Vec Allocator + || + || + +--------------- create allocator -------------------> ** (an allocator is born) + *| <------------ return allocator A ---------------------+ + || | + || | + +- create vec w/ &mut A -> ** (a vec is born) | + *| <------return vec V ------+ | + || | | + *------- push W_1 -------> *| | + | || | + | || | + | +--- allocate W array ---> *| + | | || + | | || + | | +---- (request system memory if necessary) + | | *| <-- ... + | | || + | *| <--- return *W block -----+ + | || | + | || | + *| <------- (return) -------+| | + || | | + +------- push W_2 -------->+| | + | || | + *| <------- (return) -------+| | + || | | + +------- push W_3 -------->+| | + | || | + *| <------- (return) -------+| | + || | | + +------- push W_4 -------->+| | + | || | + *| <------- (return) -------+| | + || | | + +------- push W_5 -------->+| | + | || | + | +---- realloc W array ---> *| + | | || + | | || + | | +---- (request system memory if necessary) + | | *| <-- ... + | | || + | *| <--- return *W block -----+ + *| <------- (return) -------+| | + || | | + || | | + . . . + . . . + . . . + || | | + || | | + || (end of Vec scope) | | + || | | + +------ drop Vec --------> *| | + | || (Vec destructor) | + | || | + | +---- dealloc W array --> *| + | | || + | | +---- (potentially return system memory) + | | *| <-- ... 
+ | | || + | *| <------- (return) --------+ + *| <------- (return) --------+ | + || | + || | + || | + || (end of Allocator scope) | + || | + +------------------ drop Allocator ------------------> *| + | || + | |+---- (return any remaining associated memory) + | *| <-- ... + | || + *| <------------------ (return) -------------------------+ + || + || + . + . + . +``` + + +## Transcribed Source for Allocator trait API +[Source for Allocator]: #transcribed-source-for-allocator-trait-api + +Here is the whole source file for my prototype allocator API, +sub-divided roughly according to functionality. + +(We start with the usual boilerplate...) + +```rust +// Copyright 2015 The Rust Project Developers. See the COPYRIGHT +// file at the top-level directory of this distribution and at +// http://rust-lang.org/COPYRIGHT. +// +// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or +// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license +// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your +// option. This file may not be copied, modified, or distributed +// except according to those terms. + +#![unstable(feature = "allocator_api", + reason = "the precise API and guarantees it provides may be tweaked \ + slightly, especially to possibly take into account the \ + types being stored to make room for a future \ + tracing garbage collector", + issue = "27700")] + +use core::cmp; +use core::fmt; +use core::mem; +use core::nonzero::NonZero; +use core::ptr::{self, Unique}; + +``` + +### Type Aliases +[type aliases]: #type-aliases + +```rust +pub type Size = NonZero<usize>; +pub type Capacity = NonZero<usize>; +pub type Alignment = NonZero<usize>; + +pub type Address = NonZero<*mut u8>; + +/// Represents the combination of a starting address and +/// a total capacity of the returned block. +pub struct Excess(Address, Capacity); + +fn size_align<T>() -> (usize, usize) { + (mem::size_of::<T>(), mem::align_of::<T>()) +} + +``` + +### Kind API +[kind api]: #kind-api + +```rust +/// Category for a memory record. +/// +/// An instance of `Kind` describes a particular layout of memory.
+/// You build a `Kind` up as an input to give to an allocator. +/// +/// All kinds have an associated positive size; note that this implies +/// zero-sized types have no corresponding kind. +#[derive(Copy, Clone, Debug, PartialEq, Eq)] +pub struct Kind { + // size of the requested block of memory, measured in bytes. + size: Size, + // alignment of the requested block of memory, measured in bytes. + // we ensure that this is always a power-of-two, because API's + // like `posix_memalign` require it and it is a reasonable + // constraint to impose on Kind constructors. + // + // (However, we do not analogously require `align >= sizeof(void*)`, + // even though that is *also* a requirement of `posix_memalign`.) + align: Alignment, +} + + +// FIXME: audit default implementations for overflow errors, +// (potentially switching to overflowing_add and +// overflowing_mul as necessary). + +impl Kind { + // (private constructor) + fn from_size_align(size: usize, align: usize) -> Kind { + assert!(align.is_power_of_two()); + let size = unsafe { assert!(size > 0); NonZero::new(size) }; + let align = unsafe { assert!(align > 0); NonZero::new(align) }; + Kind { size: size, align: align } + } + + /// The minimum size in bytes for a memory block of this kind. + pub fn size(&self) -> NonZero<usize> { self.size } + + /// The minimum byte alignment for a memory block of this kind. + pub fn align(&self) -> NonZero<usize> { self.align } + + /// Constructs a `Kind` suitable for holding a value of type `T`. + /// Returns `None` if no such kind exists (e.g. for zero-sized `T`). + pub fn new<T>() -> Option<Self> { + let (size, align) = size_align::<T>(); + if size > 0 { Some(Kind::from_size_align(size, align)) } else { None } + } + + /// Produces kind describing a record that could be used to + /// allocate backing structure for `T` (which could be a trait + /// or other unsized type like a slice). + /// + /// Returns `None` when no such kind exists; for example, when `t` + /// is a reference to a zero-sized type.
+ pub fn for_value<T: ?Sized>(t: &T) -> Option<Self> { + let (size, align) = (mem::size_of_val(t), mem::align_of_val(t)); + if size > 0 { + Some(Kind::from_size_align(size, align)) + } else { + None + } + } + + /// Creates a kind describing the record that can hold a value + /// of the same kind as `self`, but that also is aligned to + /// alignment `align` (measured in bytes). + /// + /// If `self` already meets the prescribed alignment, then returns + /// `self`. + /// + /// Note that this method does not add any padding to the overall + /// size, regardless of whether the returned kind has a different + /// alignment. In other words, if `K` has size 16, `K.align_to(32)` + /// will *still* have size 16. + pub fn align_to(&self, align: Alignment) -> Self { + if align > self.align { + let pow2_align = align.checked_next_power_of_two().unwrap(); + debug_assert!(pow2_align > 0); // (this follows from self.align > 0...) + Kind { align: unsafe { NonZero::new(pow2_align) }, + ..*self } + } else { + *self + } + } + + /// Returns the amount of padding we must insert after `self` + /// to ensure that the following address will satisfy `align` + /// (measured in bytes). + /// + /// Behavior undefined if `align` is not a power-of-two. + /// + /// Note that in practice, this is only useable if `align <= + /// self.align`; otherwise, the amount of inserted padding would + /// need to depend on the particular starting address for the + /// whole record, because `self.align` would not provide + /// sufficient constraint. + pub fn padding_needed_for(&self, align: Alignment) -> usize { + debug_assert!(*align <= *self.align()); + let len = *self.size(); + let len_rounded_up = (len + *align - 1) & !(*align - 1); + return len_rounded_up - len; + } + + /// Creates a kind describing the record for `n` instances of + /// `self`, with a suitable amount of padding between each to + /// ensure that each instance is given its requested size and + /// alignment.
On success, returns `(k, offs)` where `k` is the + /// kind of the array and `offs` is the distance between the start + /// of each element in the array. + /// + /// On zero `n` or arithmetic overflow, returns `None`. + pub fn repeat(&self, n: usize) -> Option<(Self, usize)> { + if n == 0 { return None; } + let padded_size = match self.size.checked_add(self.padding_needed_for(self.align)) { + None => return None, + Some(padded_size) => padded_size, + }; + let alloc_size = match padded_size.checked_mul(n) { + None => return None, + Some(alloc_size) => alloc_size, + }; + Some((Kind::from_size_align(alloc_size, *self.align), padded_size)) + } + + /// Creates a kind describing the record for `self` followed by + /// `next`, including any necessary padding to ensure that `next` + /// will be properly aligned. Note that the result kind will + /// satisfy the alignment properties of both `self` and `next`. + /// + /// Returns `Some((k, offset))`, where `k` is kind of the concatenated + /// record and `offset` is the relative location, in bytes, of the + /// start of the `next` embedded within the concatenated record + /// (assuming that the record itself starts at offset 0). + /// + /// On arithmetic overflow, returns `None`. + pub fn extend(&self, next: Self) -> Option<(Self, usize)> { + let new_align = unsafe { NonZero::new(cmp::max(*self.align, *next.align)) }; + let realigned = Kind { align: new_align, ..*self }; + let pad = realigned.padding_needed_for(new_align); + // Use checked arithmetic so that overflow yields `None`, as documented. + let offset = match self.size().checked_add(pad) { + None => return None, + Some(offset) => offset, + }; + let new_size = match offset.checked_add(*next.size()) { + None => return None, + Some(new_size) => new_size, + }; + Some((Kind::from_size_align(new_size, *new_align), offset)) + } + + /// Creates a kind describing the record for `n` instances of + /// `self`, with no padding between each instance. + /// + /// On zero `n` or overflow, returns `None`.
+ pub fn repeat_packed(&self, n: usize) -> Option<Self> { + let scaled = match self.size().checked_mul(n) { + None => return None, + Some(scaled) => scaled, + }; + let size = unsafe { assert!(scaled > 0); NonZero::new(scaled) }; + Some(Kind { size: size, align: self.align }) + } + + /// Creates a kind describing the record for `self` followed by + /// `next` with no additional padding between the two. Since no + /// padding is inserted, the alignment of `next` is irrelevant, + /// and is not incorporated *at all* into the resulting kind. + /// + /// Returns `Some((k, offset))`, where `k` is the kind of the concatenated + /// record and `offset` is the relative location, in bytes, of the + /// start of the `next` embedded within the concatenated record + /// (assuming that the record itself starts at offset 0). + /// + /// (The `offset` is always the same as `self.size()`; we use this + /// signature out of convenience in matching the signature of + /// `fn extend`.) + /// + /// On arithmetic overflow, returns `None`. + pub fn extend_packed(&self, next: Self) -> Option<(Self, usize)> { + let new_size = match self.size().checked_add(*next.size()) { + None => return None, + Some(new_size) => new_size, + }; + let new_size = unsafe { NonZero::new(new_size) }; + Some((Kind { size: new_size, ..*self }, *self.size())) + } + + // Below family of methods *assume* inputs are pre- or + // post-validated in some manner. (The implementations here + // do indirectly validate, but that is not part of their + // specification.) + // + // Since invalid inputs could yield ill-formed kinds, these + // methods are `unsafe`. + + /// Creates a kind describing the record for a single instance of `T`. + /// Requires `T` has non-zero size.
+ pub unsafe fn new_unchecked<T>() -> Self { + let (size, align) = size_align::<T>(); + Kind::from_size_align(size, align) + } + + + /// Creates a kind describing the record for `self` followed by + /// `next`, including any necessary padding to ensure that `next` + /// will be properly aligned. Note that the result kind will + /// satisfy the alignment properties of both `self` and `next`. + /// + /// Returns `(k, offset)`, where `k` is the kind of the concatenated + /// record and `offset` is the relative location, in bytes, of the + /// start of the `next` embedded within the concatenated record + /// (assuming that the record itself starts at offset 0). + /// + /// Requires no arithmetic overflow from inputs. + pub unsafe fn extend_unchecked(&self, next: Self) -> (Self, usize) { + self.extend(next).unwrap() + } + + /// Creates a kind describing the record for `n` instances of + /// `self`, with a suitable amount of padding between each. + /// + /// Requires non-zero `n` and no arithmetic overflow from inputs. + /// (See also the `fn repeat` checked variant.) + pub unsafe fn repeat_unchecked(&self, n: usize) -> (Self, usize) { + self.repeat(n).unwrap() + } + + /// Creates a kind describing the record for `n` instances of + /// `self`, with no padding between each instance. + /// + /// Requires non-zero `n` and no arithmetic overflow from inputs. + /// (See also the `fn repeat_packed` checked variant.) + pub unsafe fn repeat_packed_unchecked(&self, n: usize) -> Self { + self.repeat_packed(n).unwrap() + } + + /// Creates a kind describing the record for `self` followed by + /// `next` with no additional padding between the two. Since no + /// padding is inserted, the alignment of `next` is irrelevant, + /// and is not incorporated *at all* into the resulting kind.
+ /// + /// Returns `(k, offset)`, where `k` is the kind of the concatenated + /// record and `offset` is the relative location, in bytes, of the + /// start of the `next` embedded within the concatenated record + /// (assuming that the record itself starts at offset 0). + /// + /// (The `offset` is always the same as `self.size()`; we use this + /// signature out of convenience in matching the signature of + /// `fn extend`.) + /// + /// Requires no arithmetic overflow from inputs. + /// (See also the `fn extend_packed` checked variant.) + pub unsafe fn extend_packed_unchecked(&self, next: Self) -> (Self, usize) { + self.extend_packed(next).unwrap() + } + + /// Creates a kind describing the record for a `[T; n]`. + /// + /// On zero `n`, zero-sized `T`, or arithmetic overflow, returns `None`. + pub fn array<T>(n: usize) -> Option<Self> { + Kind::new::<T>() + .and_then(|k| k.repeat(n)) + .map(|(k, offs)| { + debug_assert!(offs == mem::size_of::<T>()); + k + }) + } + + /// Creates a kind describing the record for a `[T; n]`. + /// + /// Requires nonzero `n`, nonzero-sized `T`, and no arithmetic + /// overflow; otherwise behavior undefined. + pub fn array_unchecked<T>(n: usize) -> Self { + Kind::array::<T>(n).unwrap() + } + +} + +``` + +### AllocError API +[error api]: #allocerror-api + +```rust +/// `AllocError` instances provide feedback about the cause of an allocation failure. +pub trait AllocError { + /// Construct an error that indicates operation failure due to + /// invalid input values for the request. + /// + /// This can be used, for example, to signal an overflow occurred + /// during arithmetic computation. (However, since overflows + /// frequently represent an allocation attempt that would exhaust + /// memory, clients are alternatively allowed to construct an error + /// representing memory exhaustion in such scenarios.) + fn invalid_input() -> Self; + + /// Returns true if the error is due to hitting some resource + /// limit or otherwise running out of memory.
This condition + /// strongly implies that *some* series of deallocations would + /// allow a subsequent reissuing of the original allocation + /// request to succeed. + /// + /// Exhaustion is a common interpretation of an allocation failure; + /// e.g. usually when `malloc` returns `null`, it is because of + /// hitting a user resource limit or system memory exhaustion. + /// + /// Note that the resource exhaustion could be specific to the + /// original allocator (i.e. the only way to free up memory is by + /// deallocating memory attached to that allocator), or it could + /// be associated with some other state outside of the original + /// allocator. The `AllocError` trait does not distinguish between + /// the two scenarios. + /// + /// Finally, error responses to allocation input requests that are + /// *always* illegal for *any* allocator (e.g. zero-sized or + /// arithmetic-overflowing requests) are allowed to respond `true` + /// here. (This is to allow `MemoryExhausted` as a valid error type + /// for an allocator that can handle all "sane" requests.) + fn is_memory_exhausted(&self) -> bool; + + /// Returns true if the allocator is fundamentally incapable of + /// satisfying the original request. This condition implies that + /// such an allocation request will never succeed on this + /// allocator, regardless of environment, memory pressure, or + /// other contextual conditions. + /// + /// An example where this might arise: A block allocator that only + /// supports satisfying memory requests where each allocated block + /// is at most `K` bytes in size. + fn is_request_unsupported(&self) -> bool; + + /// Returns true only if the error is transient. "Transient" is + /// meant here in the sense that there is a reasonable chance that + /// re-issuing the same allocation request in the future *could* + /// succeed, even if nothing else changes about the overall + /// context of the request.
+ /// + /// An example where this might arise: An allocator shared across + /// threads that fails upon detecting interference (rather than + /// e.g. blocking). + fn is_transient(&self) -> bool { false } // most errors are not transient +} + +/// The `MemoryExhausted` error represents a blanket condition +/// that the given request was not satisfied for some reason beyond +/// any particular limitations of a given allocator. +/// +/// It roughly corresponds to getting `null` back from a call to `malloc`: +/// you've probably exhausted memory (though there might be some other +/// explanation; see discussion with `AllocError::is_memory_exhausted`). +/// +/// Allocators that can in principle allocate any kind of legal input +/// might choose this as their associated error type. +#[derive(Copy, Clone, PartialEq, Eq, Debug)] +pub struct MemoryExhausted; + +/// The `AllocErr` error specifies whether an allocation failure is +/// specifically due to resource exhaustion or if it is due to +/// something wrong when combining the given input arguments with this +/// allocator. +/// +/// Allocators that only support certain classes of inputs might choose this +/// as their associated error type, so that clients can respond appropriately +/// to specific error failure scenarios. +#[derive(Copy, Clone, PartialEq, Eq, Debug)] +pub enum AllocErr { + /// Error due to hitting some resource limit or otherwise running + /// out of memory. This condition strongly implies that *some* + /// series of deallocations would allow a subsequent reissuing of + /// the original allocation request to succeed. + Exhausted, + + /// Error due to allocator being fundamentally incapable of + /// satisfying the original request. This condition implies that + /// such an allocation request will never succeed on the given + /// allocator, regardless of environment, memory pressure, or + /// other contextual conditions.
+ Unsupported, +} + +impl AllocError for MemoryExhausted { + fn invalid_input() -> Self { MemoryExhausted } + fn is_memory_exhausted(&self) -> bool { true } + fn is_request_unsupported(&self) -> bool { false } +} + +impl AllocError for AllocErr { + fn invalid_input() -> Self { AllocErr::Unsupported } + fn is_memory_exhausted(&self) -> bool { *self == AllocErr::Exhausted } + fn is_request_unsupported(&self) -> bool { *self == AllocErr::Unsupported } +} + +``` + +### Allocator trait header +[trait header]: #allocator-trait-header + +```rust +/// An implementation of `Allocator` can allocate, reallocate, and +/// deallocate arbitrary blocks of data described via `Kind`. +/// +/// Some of the methods require that a kind *fit* a memory block. +/// For a kind to "fit" a memory block means that +/// the following two conditions must hold: +/// +/// 1. The block's starting address must be aligned to `kind.align()`. +/// +/// 2. The block's size must fall in the range `[orig, usable]`, where: +/// +/// * `orig` is the size last used to allocate the block, and +/// +/// * `usable` is the capacity that was (or would have been) +/// returned when (if) the block was allocated via a call to +/// `alloc_excess` or `realloc_excess`. +/// +/// Note that due to the constraints in the methods below, a +/// lower-bound on `usable` can be safely approximated by a call to +/// `usable_size`. +pub unsafe trait Allocator { + /// When allocation requests cannot be satisfied, an instance of + /// this error is returned. + /// + /// Many allocators will want to use the zero-sized + /// `MemoryExhausted` type for this. + type Error: AllocError + fmt::Debug; + +``` + +### Allocator core alloc and dealloc +[alloc and dealloc]: #allocator-core-alloc-and-dealloc + +```rust + /// Returns a pointer suitable for holding data described by + /// `kind`, meeting its size and alignment guarantees.
+ /// + /// The returned block of storage may or may not have its contents + /// initialized. (Extension subtraits might restrict this + /// behavior, e.g. to ensure initialization.) + /// + /// Returns `Err` if allocation fails or if `kind` does + /// not meet allocator's size or alignment constraints. + unsafe fn alloc(&mut self, kind: Kind) -> Result<Address, Self::Error>; + + /// Deallocate the memory referenced by `ptr`. + /// + /// `ptr` must have previously been provided via this allocator, + /// and `kind` must *fit* the provided block (see above); + /// otherwise yields undefined behavior. + /// + /// Returns `Err` only if deallocation fails in some fashion. If + /// the returned error is *transient*, then ownership of the + /// memory block is transferred back to the caller (see + /// `AllocError::is_transient`). Otherwise, callers must assume + /// that ownership of the block has been unrecoverably lost. + /// + /// Note: Implementors are encouraged to avoid `Err`-failure from + /// `dealloc`; most memory allocation APIs do not support + /// signalling failure in their `free` routines, and clients are + /// likely to incorporate that assumption into their own code and + /// just `unwrap` the result of this call. + unsafe fn dealloc(&mut self, ptr: Address, kind: Kind) -> Result<(), Self::Error>; + + /// Allocator-specific method for signalling an out-of-memory + /// condition. + /// + /// Any activity done by the `oom` method should ensure that it + /// does not infinitely regress in nested calls to `oom`. In + /// practice this means implementors should eschew allocating, + /// especially from `self` (directly or indirectly). + /// + /// Implementors of this trait are discouraged from panicking or + /// aborting from other methods in the event of memory exhaustion; + /// instead they should return an appropriate error from the + /// invoked method, and let the client decide whether to invoke + /// this `oom` method. + unsafe fn oom(&mut self) -> !
{ ::core::intrinsics::abort() } + +``` + +### Allocator-specific quantities and limits +[quantites and limits]: #allocator-specific-quantities-and-limits + +```rust + // == ALLOCATOR-SPECIFIC QUANTITIES AND LIMITS == + // max_size, max_align, usable_size + + /// The maximum requestable size in bytes for memory blocks + /// managed by this allocator. + /// + /// Returns `None` if this allocator has no explicit maximum size. + /// (Note that such allocators may well still have an *implicit* + /// maximum size; i.e. allocation requests can always fail.) + fn max_size(&self) -> Option<Size> { None } + + /// The maximum requestable alignment in bytes for memory blocks + /// managed by this allocator. + /// + /// Returns `None` if this allocator has no assigned maximum + /// alignment. (Note that such allocators may well still have an + /// *implicit* maximum alignment; i.e. allocation requests can + /// always fail.) + fn max_align(&self) -> Option<Alignment> { None } + + /// Returns the minimum guaranteed usable size of a successful + /// allocation created with the specified `kind`. + /// + /// Clients who wish to make use of excess capacity are encouraged + /// to use the `alloc_excess` and `realloc_excess` methods instead, as + /// this method is constrained to conservatively report a value + /// less than or equal to the minimum capacity for *all possible* + /// calls to those methods. + /// + /// However, for clients that do not wish to track the capacity + /// returned by `alloc_excess` locally, this method is likely to + /// produce useful results. + unsafe fn usable_size(&self, kind: Kind) -> Capacity { kind.size() } + +``` + +### Allocator methods for memory reuse +[memory reuse]: #allocator-methods-for-memory-reuse + +```rust + // == METHODS FOR MEMORY REUSE == + // realloc, alloc_excess, realloc_excess + + /// Returns a pointer suitable for holding data described by + /// `new_kind`, meeting its size and alignment guarantees.
To + /// accomplish this, this may extend or shrink the allocation + /// referenced by `ptr` to fit `new_kind`. + /// + /// * `ptr` must have previously been provided via this allocator. + /// + /// * `kind` must *fit* the `ptr` (see above). (The `new_kind` + /// argument need not fit it.) + /// + /// Behavior undefined if either of the latter two constraints is unmet. + /// + /// In addition, `new_kind` should not impose a stronger alignment + /// constraint than `kind`. (In other words, `new_kind.align()` + /// must evenly divide `kind.align()`; note this implies the + /// alignment of `new_kind` must not exceed that of `kind`.) + /// However, behavior is well-defined (though underspecified) when + /// this constraint is violated; further discussion below. + /// + /// If this returns `Ok`, then ownership of the memory block + /// referenced by `ptr` has been transferred to this + /// allocator. The memory may or may not have been freed, and + /// should be considered unusable (unless of course it was + /// transferred back to the caller again via the return value of + /// this method). + /// + /// Returns `Err` only if `new_kind` does not meet the + /// size and alignment constraints of the allocator or the + /// alignment of `kind`, or if reallocation otherwise fails. (Note + /// that this did not say "if and only if" -- in particular, an + /// implementation of this method *can* return `Ok` if + /// `new_kind.align() > kind.align()`; or it can return `Err` + /// in that scenario.) + /// + /// If this method returns `Err`, then ownership of the memory + /// block has not been transferred to this allocator, and the + /// contents of the memory block are unaltered. + unsafe fn realloc(&mut self, + ptr: Address, + kind: Kind, + new_kind: Kind) -> Result<Address, Self::Error> { + // All Kind alignments are powers of two, so a comparison + // suffices here (rather than resorting to a `%` operation).
+ if new_kind.size() <= self.usable_size(kind) && new_kind.align() <= kind.align() { + return Ok(ptr); + } else { + let result = self.alloc(new_kind); + if let Ok(new_ptr) = result { + ptr::copy(*ptr as *const u8, *new_ptr, cmp::min(*kind.size(), *new_kind.size())); + loop { + if let Err(err) = self.dealloc(ptr, kind) { + // all we can do from the realloc abstraction + // is either: + // + // 1. free the block we just finished copying + // into and pass the error up, + // 2. ignore the dealloc error, or + // 3. try again. + // + // They are all terrible; 1 seems unjustifiable. + // So we choose 2, unless the error is transient. + if err.is_transient() { continue; } + } + break; + } + } + result + } + } + + /// Behaves like `fn alloc`, but also returns the whole size of + /// the returned block. For some `kind` inputs, like arrays, this + /// may include extra storage usable for additional data. + unsafe fn alloc_excess(&mut self, kind: Kind) -> Result<Excess, Self::Error> { + self.alloc(kind).map(|p| Excess(p, self.usable_size(kind))) + } + + /// Behaves like `fn realloc`, but also returns the whole size of + /// the returned block. For some `kind` inputs, like arrays, this + /// may include extra storage usable for additional data. + unsafe fn realloc_excess(&mut self, + ptr: Address, + kind: Kind, + new_kind: Kind) -> Result<Excess, Self::Error> { + self.realloc(ptr, kind, new_kind) + .map(|p| Excess(p, self.usable_size(new_kind))) + } + +``` + +### Allocator common usage patterns +[common usage patterns]: #allocator-common-usage-patterns + +```rust + // == COMMON USAGE PATTERNS == + // alloc_one, dealloc_one, alloc_array, realloc_array, dealloc_array + + /// Allocates a block suitable for holding an instance of `T`. + /// + /// Captures a common usage pattern for allocators. + /// + /// The returned block is suitable for passing to the + /// `alloc`/`realloc` methods of this allocator.
+ unsafe fn alloc_one<T>(&mut self) -> Result<Unique<T>, Self::Error> { + if let Some(k) = Kind::new::<T>() { + self.alloc(k).map(|p|Unique::new(*p as *mut T)) + } else { + // (only occurs for zero-sized T) + debug_assert!(mem::size_of::<T>() == 0); + Err(Self::Error::invalid_input()) + } + } + + /// Deallocates a block suitable for holding an instance of `T`. + /// + /// Captures a common usage pattern for allocators. + unsafe fn dealloc_one<T>(&mut self, mut ptr: Unique<T>) -> Result<(), Self::Error> { + let raw_ptr = NonZero::new(ptr.get_mut() as *mut T as *mut u8); + self.dealloc(raw_ptr, Kind::new::<T>().unwrap()) + } + + /// Allocates a block suitable for holding `n` instances of `T`. + /// + /// Captures a common usage pattern for allocators. + /// + /// The returned block is suitable for passing to the + /// `alloc`/`realloc` methods of this allocator. + unsafe fn alloc_array<T>(&mut self, n: usize) -> Result<Unique<T>, Self::Error> { + match Kind::array::<T>(n) { + Some(kind) => self.alloc(kind).map(|p|Unique::new(*p as *mut T)), + None => Err(Self::Error::invalid_input()), + } + } + + /// Reallocates a block previously suitable for holding `n_old` + /// instances of `T`, returning a block suitable for holding + /// `n_new` instances of `T`. + /// + /// Captures a common usage pattern for allocators. + /// + /// The returned block is suitable for passing to the + /// `alloc`/`realloc` methods of this allocator. + unsafe fn realloc_array<T>(&mut self, + ptr: Unique<T>, + n_old: usize, + n_new: usize) -> Result<Unique<T>, Self::Error> { + let old_new_ptr = (Kind::array::<T>(n_old), Kind::array::<T>(n_new), *ptr); + if let (Some(k_old), Some(k_new), ptr) = old_new_ptr { + self.realloc(NonZero::new(ptr as *mut u8), k_old, k_new) + .map(|p|Unique::new(*p as *mut T)) + } else { + Err(Self::Error::invalid_input()) + } + } + + /// Deallocates a block suitable for holding `n` instances of `T`. + /// + /// Captures a common usage pattern for allocators.
+ unsafe fn dealloc_array<T>(&mut self, ptr: Unique<T>, n: usize) -> Result<(), Self::Error> { + let raw_ptr = NonZero::new(*ptr as *mut u8); + if let Some(k) = Kind::array::<T>(n) { + self.dealloc(raw_ptr, k) + } else { + Err(Self::Error::invalid_input()) + } + } + +``` + +### Allocator unchecked method variants +[unchecked variants]: #allocator-unchecked-method-variants + +```rust + // UNCHECKED METHOD VARIANTS + + /// Returns a pointer suitable for holding data described by + /// `kind`, meeting its size and alignment guarantees. + /// + /// The returned block of storage may or may not have its contents + /// initialized. (Extension subtraits might restrict this + /// behavior, e.g. to ensure initialization.) + /// + /// Returns `None` if request unsatisfied. + /// + /// Behavior undefined if input does not meet size or alignment + /// constraints of this allocator. + unsafe fn alloc_unchecked(&mut self, kind: Kind) -> Option<Address>

{ + // (default implementation carries checks, but impl's are free to omit them.) + self.alloc(kind).ok() + } + + /// Deallocate the memory referenced by `ptr`. + /// + /// `ptr` must have previously been provided via this allocator, + /// and `kind` must *fit* the provided block (see above). + /// Otherwise yields undefined behavior. + unsafe fn dealloc_unchecked(&mut self, ptr: Address, kind: Kind) { + // (default implementation carries checks, but impl's are free to omit them.) + self.dealloc(ptr, kind).unwrap() + } + + /// Returns a pointer suitable for holding data described by + /// `new_kind`, meeting its size and alignment guarantees. To + /// accomplish this, may extend or shrink the allocation + /// referenced by `ptr` to fit `new_kind`. + /// + /// (In other words, ownership of the memory block associated with + /// `ptr` is first transferred back to this allocator, but the + /// same block may or may not be transferred back as the result of + /// this call.) + /// + /// * `ptr` must have previously been provided via this allocator. + /// + /// * `kind` must *fit* the `ptr` (see above). (The `new_kind` + /// argument need not fit it.) + /// + /// * `new_kind` must meet the allocator's size and alignment + /// constraints. In addition, `new_kind.align()` must equal + /// `kind.align()`. (Note that this is a stronger constraint + /// than that imposed by `fn realloc`.) + /// + /// Behavior undefined if any of the latter three constraints are unmet. + /// + /// If this returns `Some`, then the memory block referenced by + /// `ptr` may have been freed and should be considered unusable. + /// + /// Returns `None` if reallocation fails; in this scenario, the + /// original memory block referenced by `ptr` is unaltered. + unsafe fn realloc_unchecked(&mut self, + ptr: Address, + kind: Kind, + new_kind: Kind) -> Option<Address>
{ + // (default implementation carries checks, but impl's are free to omit them.) + self.realloc(ptr, kind, new_kind).ok() + } + + /// Behaves like `fn alloc_unchecked`, but also returns the whole + /// size of the returned block. + unsafe fn alloc_excess_unchecked(&mut self, kind: Kind) -> Option<Excess> { + self.alloc_excess(kind).ok() + } + + /// Behaves like `fn realloc_unchecked`, but also returns the + /// whole size of the returned block. + unsafe fn realloc_excess_unchecked(&mut self, + ptr: Address, + kind: Kind, + new_kind: Kind) -> Option<Excess> { + self.realloc_excess(ptr, kind, new_kind).ok() + } + + + /// Allocates a block suitable for holding `n` instances of `T`. + /// + /// Captures a common usage pattern for allocators. + /// + /// Requires inputs are non-zero and do not cause arithmetic + /// overflow, and `T` is not zero sized; otherwise yields + /// undefined behavior. + unsafe fn alloc_array_unchecked<T>(&mut self, n: usize) -> Option<Unique<T>> { + let kind = Kind::array_unchecked::<T>(n); + self.alloc_unchecked(kind).map(|p|Unique::new(*p as *mut T)) + } + + /// Reallocates a block suitable for holding `n_old` instances of `T`, + /// returning a block suitable for holding `n_new` instances of `T`. + /// + /// Captures a common usage pattern for allocators. + /// + /// Requires inputs are non-zero and do not cause arithmetic + /// overflow, and `T` is not zero sized; otherwise yields + /// undefined behavior. + unsafe fn realloc_array_unchecked<T>(&mut self, + ptr: Unique<T>, + n_old: usize, + n_new: usize) -> Option<Unique<T>> { + let (k_old, k_new, ptr) = (Kind::array_unchecked::<T>(n_old), + Kind::array_unchecked::<T>(n_new), + *ptr); + self.realloc_unchecked(NonZero::new(ptr as *mut u8), k_old, k_new) + .map(|p|Unique::new(*p as *mut T)) + } + + /// Deallocates a block suitable for holding `n` instances of `T`. + /// + /// Captures a common usage pattern for allocators.
+ /// + /// Requires inputs are non-zero and do not cause arithmetic + /// overflow, and `T` is not zero sized; otherwise yields + /// undefined behavior. + unsafe fn dealloc_array_unchecked<T>(&mut self, ptr: Unique<T>, n: usize) { + let kind = Kind::array_unchecked::<T>(n); + self.dealloc_unchecked(NonZero::new(*ptr as *mut u8), kind); + } +} +``` From 738ebe304ee4f7201f74eb01203871ea347f1530 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Sun, 6 Dec 2015 19:41:52 +0100 Subject: [PATCH 0623/1195] oops this question was folded into the previous one. --- text/0000-kinds-of-allocators.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 4cbff5c22a3..67c844d04c3 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -709,9 +709,6 @@ to both the allocation and deallocation call sites. callers who can make use of excess memory to avoid unnecessary calls to `realloc`. - -### Why `alloc_array_unchecked` and `dealloc_array_unchecked`? - ### Why the `Kind` abstraction? While we do want to require clients to hand the allocator the size and From af6090fbb297488cd7fff8763a1f6df62a391181 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Sun, 6 Dec 2015 19:43:49 +0100 Subject: [PATCH 0624/1195] oops `RequestUnsatisfied` was removed during the drafting process... --- text/0000-kinds-of-allocators.md | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 67c844d04c3..9732a1d2592 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -729,13 +729,8 @@ overhead (at least in terms of the size of the values being returned from the allocation calls), because the error type for the `Result` can be zero-sized if so desired.
That is why the error is an associated type of the `Allocator`: allocators that want to ensure the -results have minimum size can use the zero-sized `RequestUnsatisfied` -or `MemoryExhausted` types as their associated `Self::Error`. - - * `RequestUnsatisfied` is a catch-all type that any allocator - could use as its error type; doing so provides no hint to the - client as to what they could do to try to service future memory - requests. +results have minimum size can use the zero-sized `MemoryExhausted` type +as their associated `Self::Error`. * `MemoryExhausted` is a specific error type meant for allocators that could in principle handle *any* sane input request, if there From be627c252ad33f314bfd74251009d46ac9d971b0 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Sun, 6 Dec 2015 19:46:35 -0500 Subject: [PATCH 0625/1195] typo fix --- text/0000-kinds-of-allocators.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 9732a1d2592..79be93312eb 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -91,7 +91,7 @@ how to integrate allocators with GC.) ## The `Allocator` trait at a glance -The source code for the `Allocator` trait prototype ks provided in an +The source code for the `Allocator` trait prototype is provided in an [appendix][Source for Allocator]. But since that section is long, here we summarize the high-level points of the `Allocator` API. 
From dea46124b32c57d2ce9d4d3fb994b0acc9bcb311 Mon Sep 17 00:00:00 2001 From: Peter Atashian Date: Sun, 6 Dec 2015 20:31:02 -0500 Subject: [PATCH 0626/1195] RFC: Add #[repr(pack = "N")] Signed-off-by: Peter Atashian --- text/0000-repr-pack.md | 88 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 88 insertions(+) create mode 100644 text/0000-repr-pack.md diff --git a/text/0000-repr-pack.md b/text/0000-repr-pack.md new file mode 100644 index 00000000000..f2d2a1c01d7 --- /dev/null +++ b/text/0000-repr-pack.md @@ -0,0 +1,88 @@ +- Feature Name: `repr_pack` +- Start Date: 2015-12-06 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Extend the existing `#[repr]` attribute on structs with a `pack = "N"` option to +specify a custom packing for `struct` types. + +# Motivation +[motivation]: #motivation + +Many C/C++ compilers allow a packing to be specified for structs which +effectively lowers the alignment for a struct and its fields (for example with +MSVC there is `#pragma pack(N)`). Such packing is used extensively in certain +C/C++ libraries (such as the Windows API, which uses it all over the place, making +writing Rust libraries such as `winapi` a nightmare). + +At the moment the only way to work around the lack of a proper +`#[repr(pack = "N")]` attribute is to use `#[repr(packed)]` and then manually +fill in padding which is a burdensome task. + +# Detailed design +[design]: #detailed-design + +The `#[repr]` attribute on `struct`s will be extended to include a form such as: + +```rust +#[repr(pack = "2")] +struct LessAligned(i16, i32); +``` + +This structure will have an alignment of 2 and a size of 6, as well as the +second field having an offset of 2 instead of 4 from the base of the struct. +Without the attribute, the structure would have an +alignment of 4 and a size of 8, and the second field would have an offset of 4 +from the base of the struct.
+ +Syntactically, the `repr` meta list will be extended to accept a meta item +name/value pair with the name "pack" and the value as a string which can be +parsed as a `u64`. The restrictions on where this attribute can be placed along +with the accepted values are: + +* Custom packing can only be specified on `struct` declarations for now. + Specifying a different packing on perhaps `enum` or `type` definitions should + be a backwards-compatible extension. +* Packing values must be a power of two. + +By specifying this attribute, the alignment of the struct would be the smaller +of the specified packing and the default alignment of the struct otherwise. The +alignments of each struct field for the purpose of positioning fields would also +be the smaller of the specified packing and the alignment of the type of that +field. If the specified packing is greater than or equal to the default +alignment of the struct, then the alignment and layout of the struct should be +unaffected. + +When combined with `#[repr(C)]` the size, alignment, and layout of the struct +should match the equivalent struct in C. + +`#[repr(packed)]` and `#[repr(pack = "1")]` should have identical behavior. + +Because this lowers the effective alignment of fields in the same way that +`#[repr(packed)]` does (which caused https://github.com/rust-lang/rust/issues/27060 ), +while accessing a field should be safe, borrowing a field should be unsafe. + +# Drawbacks +[drawbacks]: #drawbacks + +This would unfortunately make my life easier even though one of the unstated +goals of Rust is to make my life as difficult as possible when doing FFI with +Windows API. + +# Alternatives +[alternatives]: #alternatives + +* The alternative is not doing this and forcing people to continue using + `#[repr(packed)]` with manual padding. +* Alternatively a new attribute could be used such as `#[pack]`.
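The layout numbers quoted in the detailed design can be checked against a current compiler. The sketch below is only an illustration and not part of the proposal: it assumes the `#[repr(packed(N))]` spelling that present-day Rust accepts instead of the `pack = "N"` syntax proposed here, it adds `repr(C)` so that field order is fixed, and the field names `a` and `b` and the helper functions are hypothetical.

```rust
use std::mem::{align_of, size_of};
use std::ptr::addr_of;

// Hypothetical stand-in for the `LessAligned(i16, i32)` example, written
// with the `packed(N)` spelling rather than the proposed `pack = "N"`.
#[repr(C, packed(2))]
struct LessAligned {
    a: i16,
    b: i32,
}

// The same fields without a packing attribute, for comparison.
#[repr(C)]
struct DefaultAligned {
    a: i16,
    b: i32,
}

/// Returns `(size, alignment)` of `T` in bytes.
fn layout<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}

/// Byte offset of `b` inside `LessAligned`. Raw pointers are used so that
/// no reference to the under-aligned field is ever created.
fn packed_offset_of_b() -> usize {
    let s = LessAligned { a: 1, b: 2 };
    addr_of!(s.b) as usize - addr_of!(s) as usize
}

/// Byte offset of `b` inside `DefaultAligned`.
fn default_offset_of_b() -> usize {
    let s = DefaultAligned { a: 1, b: 2 };
    addr_of!(s.b) as usize - addr_of!(s) as usize
}
```

Calling `layout::<LessAligned>()` and `packed_offset_of_b()` from a `main` function should reproduce the size, alignment, and offset described for the packed struct, while the `DefaultAligned` helpers give the unpacked figures. The offsets are computed through `addr_of!` because taking an ordinary reference to a field of a packed struct is the hazard the design text warns about.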
+ +# Unresolved questions +[unresolved]: #unresolved-questions + +* The behavior specified here should match the behavior of MSVC at least. Does + it match the behavior of other C/C++ compilers as well? +* Should it still be safe to borrow fields whose alignment is less than or equal + to the specified packing or should all field borrows be unsafe? From c6e522b99dba847b72819f53483503bbb9ddf54d Mon Sep 17 00:00:00 2001 From: Peter Atashian Date: Sun, 6 Dec 2015 20:38:26 -0500 Subject: [PATCH 0627/1195] More alternative Signed-off-by: Peter Atashian --- text/0000-repr-pack.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/text/0000-repr-pack.md b/text/0000-repr-pack.md index f2d2a1c01d7..9efc5dcbe14 100644 --- a/text/0000-repr-pack.md +++ b/text/0000-repr-pack.md @@ -78,6 +78,8 @@ Windows API. * The alternative is not doing this and forcing people to continue using `#[repr(packed)]` with manual padding. * Alternatively a new attribute could be used such as `#[pack]`. +* `#[repr(packed)]` could be extended as either `#[repr(packed(N))]` or + `#[repr(packed = "N")]`. # Unresolved questions [unresolved]: #unresolved-questions From c92aa339c06d51580a325dbea3540910ae58f9e2 Mon Sep 17 00:00:00 2001 From: Peter Atashian Date: Sun, 6 Dec 2015 21:36:23 -0500 Subject: [PATCH 0628/1195] Clarify that this is really needed Signed-off-by: Peter Atashian --- text/0000-repr-pack.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/text/0000-repr-pack.md b/text/0000-repr-pack.md index 9efc5dcbe14..f9a56bbc5c5 100644 --- a/text/0000-repr-pack.md +++ b/text/0000-repr-pack.md @@ -20,7 +20,10 @@ writing Rust libraries such as `winapi` a nightmare). At the moment the only way to work around the lack of a proper `#[repr(pack = "N")]` attribute is to use `#[repr(packed)]` and then manually -fill in padding which is a burdensome task. +fill in padding which is a burdensome task. 
Even then that isn't quite right +because the overall alignment of the struct would end up as 1 even though it +needs to be N (or the default if that is smaller than N), so this fills in a gap +which is basically impossible to do in Rust at the moment. # Detailed design [design]: #detailed-design From 898c79146648d438c8f944c70af7f88a2c046c51 Mon Sep 17 00:00:00 2001 From: Peter Atashian Date: Sun, 6 Dec 2015 21:43:37 -0500 Subject: [PATCH 0629/1195] Clarify wrongness of first alternative Signed-off-by: Peter Atashian --- text/0000-repr-pack.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/text/0000-repr-pack.md b/text/0000-repr-pack.md index f9a56bbc5c5..43f1fe0d232 100644 --- a/text/0000-repr-pack.md +++ b/text/0000-repr-pack.md @@ -79,7 +79,8 @@ Windows API. [alternatives]: #alternatives * The alternative is not doing this and forcing people to continue using - `#[repr(packed)]` with manual padding. + `#[repr(packed)]` with manual padding, although such structs would always have + an alignment of 1 which is often wrong. * Alternatively a new attribute could be used such as `#[pack]`. * `#[repr(packed)]` could be extended as either `#[repr(packed(N))]` or `#[repr(packed = "N")]`. From 6e269ded0b3f5857e00c68ebf816d071f187399d Mon Sep 17 00:00:00 2001 From: Peter Atashian Date: Sun, 6 Dec 2015 21:44:58 -0500 Subject: [PATCH 0630/1195] Address cmr's nit Signed-off-by: Peter Atashian --- text/0000-repr-pack.md | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/text/0000-repr-pack.md b/text/0000-repr-pack.md index 43f1fe0d232..61c1b194d21 100644 --- a/text/0000-repr-pack.md +++ b/text/0000-repr-pack.md @@ -52,12 +52,11 @@ with the accepted values are: * Packing values must be a power of two. By specifying this attribute, the alignment of the struct would be the smaller -of the specified packing and the default alignment of the struct otherwise. 
The -alignments of each struct field for the purpose of positioning fields would also -be the smaller of the specified packing and the alignment of the type of that -field. If the specified packing is greater than or equal to the default -alignment of the struct, then the alignment and layout of the struct should be -unaffected. +of the specified packing and the default alignment of the struct. The alignments +of each struct field for the purpose of positioning fields would also be the +smaller of the specified packing and the alignment of the type of that field. If +the specified packing is greater than or equal to the default alignment of the +struct, then the alignment and layout of the struct should be unaffected. When combined with `#[repr(C)]` the size alignment and layout of the struct should match the equivalent struct in C. From 64f62299f1de3c37a90a8e9df2580e5388b27e96 Mon Sep 17 00:00:00 2001 From: Peter Atashian Date: Sun, 6 Dec 2015 22:02:43 -0500 Subject: [PATCH 0631/1195] More cmr nitfixes Signed-off-by: Peter Atashian --- text/0000-repr-pack.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/text/0000-repr-pack.md b/text/0000-repr-pack.md index 61c1b194d21..61258e614ae 100644 --- a/text/0000-repr-pack.md +++ b/text/0000-repr-pack.md @@ -13,17 +13,17 @@ specify a custom packing for `struct` types. [motivation]: #motivation Many C/C++ compilers allow a packing to be specified for structs which -effectivally lowers the alignment for a struct and its fields (for example with +effectively lowers the alignment for a struct and its fields (for example with MSVC there is `#pragma pack(N)`). Such packing is used extensively in certain -C/C++ libraries (such as Windows API which uses it all over the place making -writing Rust libraries such as `winapi` a nightmare). +C/C++ libraries (such as Windows API which uses it pervasively making writing +Rust libraries such as `winapi` challenging). 
At the moment the only way to work around the lack of a proper `#[repr(pack = "N")]` attribute is to use `#[repr(packed)]` and then manually fill in padding which is a burdensome task. Even then that isn't quite right because the overall alignment of the struct would end up as 1 even though it needs to be N (or the default if that is smaller than N), so this fills in a gap -which is basically impossible to do in Rust at the moment. +which is impossible to do in Rust at the moment. # Detailed design [design]: #detailed-design From cbb18a4bbf71798da9dccfbe9c1662e9d06fc1fb Mon Sep 17 00:00:00 2001 From: Peter Atashian Date: Sun, 6 Dec 2015 22:23:19 -0500 Subject: [PATCH 0632/1195] Am I doing this right? Signed-off-by: Peter Atashian --- text/0000-repr-pack.md | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/text/0000-repr-pack.md b/text/0000-repr-pack.md index 61258e614ae..447137bdf8f 100644 --- a/text/0000-repr-pack.md +++ b/text/0000-repr-pack.md @@ -64,15 +64,21 @@ should match the equivalent struct in C. `#[repr(packed)]` and `#[repr(pack = "1")]` should have identical behavior. Because this lowers the effective alignment of fields in the same way that -`#[repr(packed)]` does (which caused https://github.com/rust-lang/rust/issues/27060 ), -while accessing a field should be safe, borrowing a field should be unsafe. +`#[repr(packed)]` does (which caused [issue #27060][gh27060]), while accessing a +field should be safe, borrowing a field should be unsafe. + +Specifying `#[repr(packed)]` and `#[repr(pack = "N")]` where N is not 1 should +result in an error. + +Specifying `#[repr(pack = "A")]` and `#[repr(align = "B")]` should still pack +together fields with the packing specified, but then increase the overall +alignment to the alignment specified. Depends on [RFC #1358][rfc1358] landing. 
# Drawbacks [drawbacks]: #drawbacks -This would unfortunately make my life easier even though one of the unstated -goals of Rust is to make my life as difficult as possible when doing FFI with -Windows API. +Duplication in the language where `#[repr(packed)]` and `#[repr(pack = "1")]` +have identical behavior. # Alternatives [alternatives]: #alternatives @@ -91,3 +97,6 @@ Windows API. it match the behavior of other C/C++ compilers as well? * Should it still be safe to borrow fields whose alignment is less than or equal to the specified packing or should all field borrows be unsafe? + +[gh27060]: https://github.com/rust-lang/rust/issues/27060 +[rfc1358]: https://github.com/rust-lang/rfcs/pull/1358 From 16152360f5906a844d5b4e45e68ded60d2590dc8 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Tue, 15 Dec 2015 22:29:06 +1300 Subject: [PATCH 0633/1195] Remove material on error format and robust compilation This is a somewhat separate topic, most of which doesn't need an RFC. The material here didn't really add much. --- text/0000-ide.md | 93 ++---------------------------------------------- 1 file changed, 2 insertions(+), 91 deletions(-) diff --git a/text/0000-ide.md b/text/0000-ide.md index a6c9662c86e..6f3eb3c015d 100644 --- a/text/0000-ide.md +++ b/text/0000-ide.md @@ -141,6 +141,8 @@ communicate these asynchronously to the IDE plugin). In addition we must produce data to update the oracle, this should be done directly, without involving the IDE plugin. +TODO metadata - really? + Quick-check does not generate executable code or crate metadata. However, it should (probably) update the metadata used for incremental compilation. @@ -384,97 +386,6 @@ does not seem like enough motivation to actually do the work. Could be an interesting student project or something. -## Robust compilation - -The goal here is that when the user is typing, we should be able to run the -early stages of the quick-check compiler and still come up with sensible code -completion suggestions. 
The IDE and compiler can collaborate to some extent -here. - -As long as we can compile as far as type checking, then the compiler should -still generate metadata for the oracle. If we fail later (e.g., in borrow -checking) then we should return errors *and* metadata for the oracle. If we fail -to type check, then we cannot generate meaningful data for the oracle (or if we -succeed at type checking, but use some error recovery). - -THE IDE should instruct the oracle to invalidate some of its data. I believe that -this does not require deep knowledge about the program (i.e., we know a span has -changed and compilation has failed, we can instruct the oracle to invalidate all -data associated with that span. With luck, we can leverage the dependency -information the compiler has for incremental compilation here). - -In some cases a program would fail to parse or pass name resolution, but we -would like to try to type check. For example, - -```rust -fn main() { - let x = foo.bar. -``` - -will not parse, but we would like to suggest code completion options. - -```rust -fn main() { - let foo = foo(); - let x = fo; -} -``` - -will parse, but fail name resolution, but again we would like to suggest code -completion options. - -There are two issues: dealing with incomplete or incorrect names (e.g., `fo` in -the second example), and dealing with unfinished AST nodes (e.g., in the first -example we need an identifier to finish the `.` expression, a `;` to terminate -the let statement, and a `}` to terminate the `main` function). - -A solution to the first problem is replacing invalid names with some magic -identifier, and ignoring errors involving that identifier. @sanxiyn implemented -something like the second feature in a -[PR](https://github.com/rust-lang/rust/pull/21323). His approach was to take a -command line argument for where to 'complete at' and to treat that as the magic -identifier. 
An alternate approach would be to use a keyword or distinguished -identifier which the IDE could insert (based on the caret position), or to -fallback to the magic identifier whenever there is a name resolution error. - -Similarly during type checking, if we find a mismatched or unknown type, we -should try to continue type checking with the information available so as to -still be able to provide code completion information. We already do this to some -extent with `TyErr`, but we should do better. - -For the second issue, the problem is where to start parsing again and how many -'open' items should be terminated. This is closely related to error recovery in -parsers, which is a well-developed are of research with a long history, and -which I won't attempt to summarise here. As far as I can see, there are two -major differences since we are doing this in the IDE context: we know the extent -of edited code (the span of changes we are passing to the quick-check compiler) -and the previous state of the edited code, and we can likely assume that even in -new code, braces and parentheses are likely to be paired (since an IDE will -insert closing braces, etc.). Assuming that we keep the state of the code the -last time it parsed completely, we can expand the edited span to cover an entire -expression (or other item) and thus we know exactly where to start re-parsing. -In the case where we are writing new code, we can just close all 'open' items. - -Being able to generate more errors before stopping would be an advantage for the -compiler in any case. However, we probably do not want to use these mechanisms -under normal compilation, only when performing a quick-check from the IDE. - - - -## Error format - -Currently the compiler generates text error messages. I propose that we add a -mechanism to the compiler to support different formats for error messages. 
We -already structure our error messages to some extent (separating the span -information, the message, and the error code). Rather than turning these -components into text in a fairly ad hoc manner, we should preserve that -structure, and some central error handler should convert into a chosen format. -We should support the current text format, JSON (or some other structured -format) for tools to use, and HTML for rich error messages (this is somewhat -orthogonal to this RFC, but has been discussed in the past as a desirable -feature). - - # Drawbacks It's a lot of work. On the other hand the largest changes are desirable for From e76929e6434768121919cfffe40953e565339745 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 16 Dec 2015 17:16:01 +0100 Subject: [PATCH 0634/1195] Fix realloc bug in spec and impl. --- text/0000-kinds-of-allocators.md | 43 ++++++++++++++++++++++++-------- 1 file changed, 32 insertions(+), 11 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 79be93312eb..c4a093b75a2 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -1636,17 +1636,22 @@ impl AllocError for AllocErr { /// /// 1. The block's starting address must be aligned to `kind.align()`. /// -/// 2. The block's size must fall in the range `[orig, usable]`, where: +/// 2. The block's size must fall in the range `[use_min, use_max]`, where: /// -/// * `orig` is the size last used to allocate the block, and +/// * `use_min` is `self.usable_size(kind).0`, and /// -/// * `usable` is the capacity that was (or would have been) +/// * `use_max` is the capacity that was (or would have been) /// returned when (if) the block was allocated via a call to /// `alloc_excess` or `realloc_excess`. /// -/// Note that due to the constraints in the methods below, a -/// lower-bound on `usable` can be safely approximated by a call to -/// `usable_size`. 
+/// Note that: +/// +/// * the size of the kind most recently used to allocate the block +/// is guaranteed to be in the range `[use_min, use_max]`, and +/// +/// * a lower-bound on `use_max` can be safely approximated by a call to +/// `usable_size`. +/// pub unsafe trait Allocator { /// When allocation requests cannot be satisified, an instance of /// this error is returned. @@ -1732,9 +1737,21 @@ pub unsafe trait Allocator { /// always fail.) fn max_align(&self) -> Option { None } - /// Returns the minimum guaranteed usable size of a successful + /// Returns bounds on the guaranteed usable size of a successful /// allocation created with the specified `kind`. /// + /// In particular, for a given kind `k`, if `usable_size(k)` returns + /// `(l, m)`, then one can use a block of kind `k` as if it has any + /// size in the range `[l, m]` (inclusive). + /// + /// (All implementors of `fn usable_size` must ensure that + /// `l <= k.size() <= m`) + /// + /// Both the lower- and upper-bounds (`l` and `m` respectively) are + /// provided: An allocator based on size classes could misbehave + /// if one attempts to deallocate a block without providing a + /// correct value for its size (i.e., one within the range `[l, m]`). + /// /// Clients who wish to make use of excess capacity are encouraged /// to use the `alloc_excess` and `realloc_excess` instead, as /// this method is constrained to conservatively report a value @@ -1744,7 +1761,9 @@ pub unsafe trait Allocator { /// However, for clients that do not wish to track the capacity /// returned by `alloc_excess` locally, this method is likely to /// produce useful results. 
- unsafe fn usable_size(&self, kind: Kind) -> Capacity { kind.size() } + unsafe fn usable_size(&self, kind: Kind) -> (Capacity, Capacity) { + (kind.size(), kind.size()) + } ``` @@ -1796,9 +1815,11 @@ pub unsafe trait Allocator { ptr: Address, kind: Kind, new_kind: Kind) -> Result { + let (min, max) = self.usable_size(kind); + let s = new_kind.size(); // All Kind alignments are powers of two, so a comparison // suffices here (rather than resorting to a `%` operation). - if new_kind.size() <= self.usable_size(kind) && new_kind.align() <= kind.align() { + if min <= s && s <= max && new_kind.align() <= kind.align() { return Ok(ptr); } else { let result = self.alloc(new_kind); @@ -1829,7 +1850,7 @@ pub unsafe trait Allocator { /// the returned block. For some `kind` inputs, like arrays, this /// may include extra storage usable for additional data. unsafe fn alloc_excess(&mut self, kind: Kind) -> Result { - self.alloc(kind).map(|p| Excess(p, self.usable_size(kind))) + self.alloc(kind).map(|p| Excess(p, self.usable_size(kind).1)) } /// Behaves like `fn realloc`, but also returns the whole size of @@ -1840,7 +1861,7 @@ pub unsafe trait Allocator { kind: Kind, new_kind: Kind) -> Result { self.realloc(ptr, kind, new_kind) - .map(|p| Excess(p, self.usable_size(new_kind))) + .map(|p| Excess(p, self.usable_size(new_kind).1)) } ``` From 087f4c136b44536d76ac748d132658941c552278 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 16 Dec 2015 17:35:09 +0100 Subject: [PATCH 0635/1195] removed transient errors from API. --- text/0000-kinds-of-allocators.md | 86 ++++++++++++++------------------ 1 file changed, 37 insertions(+), 49 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index c4a093b75a2..288fcc29e5b 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -292,17 +292,23 @@ will expose: 3. there could be *interference* between two threads. 
This latter scenario means that this allocator failed on this memory request, but the client might - quite reasonably just *retry* the request. + quite reasonably just *retry* the request. This is + an error condition specific to this allocator, so we + will identify it via a separate `fn is_transient` inherent + method. ```rust #[derive(Copy, Clone, PartialEq, Eq, Debug)] enum BumpAllocError { Invalid, MemoryExhausted, Interference } +impl BumpAllocError { + fn is_transient(&self) -> bool { *self == BumpAllocError::Interference } +} + impl alloc::AllocError for BumpAllocError { fn invalid_input() -> Self { BumpAllocError::MemoryExhausted } fn is_memory_exhausted(&self) -> bool { *self == BumpAllocError::MemoryExhausted } fn is_request_unsupported(&self) -> bool { false } - fn is_transient(&self) { *self == BumpAllocError::Interference } } ``` @@ -989,8 +995,9 @@ few motivating examples that *are* clearly feasible and useful. * Should `dealloc` return a `Result` or not? (Under what circumstances would we expect `dealloc` to fail in a manner worth signalling? The main one I can think of is a transient failure, - which is why the documentation for that method spends so much time - discussing it.) + which was in a previous version of the API but has since been removed. + Still, if errors *can* happen, maybe it's best to provide *some* way + for a client to catch them and report them in context.) * Are the type definitions for `Size`, `Capacity`, `Alignment`, and `Address` an abuse of the `NonZero` type? (Or do we just need some @@ -1000,25 +1007,13 @@ few motivating examples that *are* clearly feasible and useful. client to provide more contextual information about the OOM condition)? - * Does `AllocError::is_transient` belong in this version of the API, - or should we wait to add it later? (I originally suspected that - libstd data types would want to make use of it, which would means - we should add it. 
However, in the absence of a concrete example - stdlib type that would use it, we may be better off removing `fn - is_transient` from this API (instead specifying that allocators - with such transient failures will block (i.e. loop and retry - internally), with the expectation that if a need for such an - allocator does arise, we will then represent the API extension - via a different trait (perhaps an extension trait of `Allocator`). - - * On that note, if we remove the `fn is_transient` method, should - we get rid of the `AllocError` bound entirely? Is the given set + * Should we get rid of the `AllocError` bound entirely? Is the given set of methods actually worth providing to all generic clients? (Keeping it seems very low cost to me; implementors can always opt to use the `MemoryExhausted` error type, which is cheap. But my intuition may be wrong.) - + * Do we need `Allocator::max_size` and `Allocator::max_align` ? * Should default impl of `Allocator::max_align` return `None`, or is @@ -1033,6 +1028,12 @@ few motivating examples that *are* clearly feasible and useful. low-level allocators will behave well for large alignments. See https://github.com/rust-lang/rust/issues/30170 +# Change History + +* Changed `fn usable_size` to return `(l, m)` rather than just `m`. + +* Removed `fn is_transient` from `trait AllocError`, and removed discussion + of transient errors from the API. # Appendices @@ -1559,17 +1560,6 @@ pub trait AllocError { /// supports satisfying memory requests where each allocated block /// is at most `K` bytes in size. fn is_request_unsupported(&self) -> bool; - - /// Returns true only if the error is transient. "Transient" is - /// meant here in the sense that there is a reasonable chance that - /// re-issuing the same allocation request in the future *could* - /// succeed, even if nothing else changes about the overall - /// context of the request. 
- /// - /// An example where this might arise: An allocator shared across - /// threads that fails upon detecting interference (rather than - /// e.g. blocking). - fn is_transient(&self) -> bool { false } // most errors are not transient } /// The `MemoryExhausted` error represents a blanket condition @@ -1683,11 +1673,9 @@ pub unsafe trait Allocator { /// and `kind` must *fit* the provided block (see above); /// otherwise yields undefined behavior. /// - /// Returns `Err` only if deallocation fails in some fashion. If - /// the returned error is *transient*, then ownership of the - /// memory block is transferred back to the caller (see - /// `AllocError::is_transient`). Otherwise, callers must assume - /// that ownership of the block has been unrecoverably lost. + /// Returns `Err` only if deallocation fails in some fashion. + /// In this case callers must assume that ownership of the block has + /// been unrecoverably lost (memory may have been leaked). /// /// Note: Implementors are encouraged to avoid `Err`-failure from /// `dealloc`; most memory allocation APIs do not support @@ -1825,21 +1813,21 @@ pub unsafe trait Allocator { let result = self.alloc(new_kind); if let Ok(new_ptr) = result { ptr::copy(*ptr as *const u8, *new_ptr, cmp::min(*kind.size(), *new_kind.size())); - loop { - if let Err(err) = self.dealloc(ptr, kind) { - // all we can do from the realloc abstraction - // is either: - // - // 1. free the block we just finished copying - // into and pass the error up, - // 2. ignore the dealloc error, or - // 3. try again. - // - // They are all terrible; 1 seems unjustifiable. - // So we choose 2, unless the error is transient. - if err.is_transient() { continue; } - } - break; + if let Err(_) = self.dealloc(ptr, kind) { + // all we can do from the realloc abstraction + // is either: + // + // 1. free the block we just finished copying + // into and pass the error up, + // 2. panic (same as if we had called `unwrap`), + // 3. 
try to dealloc again, or + // 4. ignore the dealloc error. + // + // They are all terrible; (1.) and (2.) seem unjustifiable, + // and (3.) seems likely to yield an infinite loop (unless + // we add back in some notion of a transient error + // into the API). + // So we choose (4.): ignore the dealloc error. } } result From af0b05f3dffcad2b9579efc93a7c368a4a4ecaf7 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 16 Dec 2015 17:42:27 +0100 Subject: [PATCH 0636/1195] try to improve description of `fn is_memory_exhausted` --- text/0000-kinds-of-allocators.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 288fcc29e5b..6aaace1d2b0 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -1528,7 +1528,7 @@ pub trait AllocError { /// Returns true if the error is due to hitting some resource /// limit or otherwise running out of memory. This condition - /// strongly implies that *some* series of deallocations would + /// serves as a hint that some series of deallocations *might* /// allow a subsequent reissuing of the original allocation /// request to succeed. /// From 533bcf8bc52133cd01d0cf3ca0c917b1fe7d1da3 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 16 Dec 2015 17:59:56 +0100 Subject: [PATCH 0637/1195] small updates: IEtF 2119, and discussion of `&mut MegaEmbedded` --- text/0000-kinds-of-allocators.md | 24 +++++++++++++++++++----- 1 file changed, 19 insertions(+), 5 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 6aaace1d2b0..c84e375772d 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -493,7 +493,11 @@ blocks and the allocator(s), but the generic code we expect the standard library to provide cannot make such assumptions. 
To satisfy the above scenarios in a sane, consistent, general fashion, -the `Allocator` trait assumes/requires all of the following: +the `Allocator` trait assumes/requires all of the following conditions. +(Note: this list of conditions uses the phrases "should", "must", and "must not" +in a formal manner, in the style of [IETF RFC 2119][].) + +[IETF RFC 2119]: https://www.ietf.org/rfc/rfc2119.txt 1. (for allocator impls and clients): in the absence of other information (e.g. specific allocator implementations), all blocks @@ -550,15 +554,25 @@ the `Allocator` trait assumes/requires all of the following: E.g. this is *not* a legal allocator: ```rust struct MegaEmbedded { pool: [u8; 1024*1024], cursor: usize, ... } - impl Allocator for MegaEmbedded { ... } + impl Allocator for MegaEmbedded { ... } // INVALID IMPL ``` The latter impl is simply unreasonable (at least if one is intending to satisfy requests by returning pointers into `self.bytes`). - (Note of course, `impl Allocator for &mut MegaEmbedded` is in - principle *fine*; that would then be an allocator that is an - indirect handle to an unembedded pool.) + Note that an allocator that owns its pool *indirectly* + (i.e. does not have the pool's state embedded in the allocator) is fine: + ```rust + struct MegaIndirect { pool: *mut [u8; 1024*1024], cursor: usize, ... } + impl Allocator for MegaIndirect { ... } // OKAY + ``` + + (I originally claimed that `impl Allocator for &mut MegaEmbedded` + would also be a legal example of an allocator that is an indirect handle + to an unembedded pool, but others pointed out that handing out the + addresses pointing into that embedded pool could end up violating our + aliasing rules for `&mut`. I obviously did not expect that outcome; I + would be curious to see what the actual design space is here.) 5. 
(for allocator impls and clients) if an allocator is cloneable, the client *can assume* that all clones From 1fc45cd40a268ae6b534e56000ef7dc9eb729bfe Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 16 Dec 2015 18:05:35 +0100 Subject: [PATCH 0638/1195] account for the RefCell oversight. --- text/0000-kinds-of-allocators.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index c84e375772d..9c183e3633a 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -194,6 +194,14 @@ case, dropping the allocator has no effect on the memory pool. handle will not drop the pool as long as at least one other handle remains, but dropping the last handle will drop the pool itself. + FIXME: `RefCell` is not going to work with the allocator API + envisaged here; see [comment from gankro][]. We will need to + address this (perhaps just by pointing out that it is illegal and + suggesting a standard pattern to work around it) before this RFC + can be accepted. + +[comment from gankro]: https://github.com/rust-lang/rfcs/pull/1398#issuecomment-162681096 + A client that is generic over all possible `A:Allocator` instances cannot know which of the above cases it falls in. This has consequences in terms of the restrictions that must be met by client code @@ -1001,6 +1009,9 @@ few motivating examples that *are* clearly feasible and useful. # Unresolved questions [unresolved]: #unresolved-questions + * Since we cannot do `RefCell` (see FIXME above), what is + our standard recommendation for what to do instead? + * Should `Kind` be an associated type of `Allocator` (see [alternatives][] section for discussion). (In fact, most of the "Variations correspond to potentially From 553d59ed672038ab1a2964fcb978fbfb4b3f2e8b Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 16 Dec 2015 18:08:19 +0100 Subject: [PATCH 0639/1195] fix typo. 
--- text/0000-kinds-of-allocators.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 9c183e3633a..01aca4a5cf0 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -860,7 +860,7 @@ ensuring e.g. that allocated blocks of memory can be scanned (i.e. "parsed") by the GC (if that in fact ends up being necessary). This way, we can deploy an `Allocator` trait API today that does not -provide the necessary reflective hooks that a GC wuold need to access. +provide the necessary reflective hooks that a GC would need to access. Crates that define their own `Allocator` implementations without also claiming them to be GC-compatible will be forbidden from linking with From 7654c007b7c30f1388cb7b5240ae66cb52b29b28 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 17 Dec 2015 14:21:00 -0800 Subject: [PATCH 0640/1195] RFC 1328 is global panic handlers --- ...global-panic-handler.md => 1328-global-panic-handler.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-global-panic-handler.md => 1328-global-panic-handler.md} (97%) diff --git a/text/0000-global-panic-handler.md b/text/1328-global-panic-handler.md similarity index 97% rename from text/0000-global-panic-handler.md rename to text/1328-global-panic-handler.md index 229fc04aabb..299a4254a6a 100644 --- a/text/0000-global-panic-handler.md +++ b/text/1328-global-panic-handler.md @@ -1,7 +1,7 @@ -- Feature Name: panic_handler +- Feature Name: `panic_handler` - Start Date: 2015-10-08 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1328](https://github.com/rust-lang/rfcs/pull/1328) +- Rust Issue: [rust-lang/rust#30449](https://github.com/rust-lang/rust/issues/30449) # Summary From d95caf2447f683f9fc680603d720ceecc39c793d Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 18 Dec 2015 11:56:55 -0800 Subject: [PATCH 0641/1195] RFC: 
Deprecate type aliases in std::os::*::raw Deprecate type aliases and structs in `std::os::$platform::raw` in favor of trait-based accessors which return Rust types rather than the equivalent C type aliases. --- text/0000-trim-std-os.md | 146 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 146 insertions(+) create mode 100644 text/0000-trim-std-os.md diff --git a/text/0000-trim-std-os.md b/text/0000-trim-std-os.md new file mode 100644 index 00000000000..1496aff06f1 --- /dev/null +++ b/text/0000-trim-std-os.md @@ -0,0 +1,146 @@ +- Feature Name: N/A +- Start Date: 2015-12-18 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Deprecate type aliases and structs in `std::os::$platform::raw` in favor of +trait-based accessors which return Rust types rather than the equivalent C type +aliases. + +# Motivation +[motivation]: #motivation + +[RFC 517][io-reform] set forth a vision for the `raw` modules in the standard +library to perform lowering operations on various Rust types to their platform +equivalents. For example the `fs::Metadata` structure can be lowered to the +underlying `sys::stat` structure. The rationale for this was to enable building +abstractions externally from the standard library by exposing all of the +underlying data that is obtained from the OS. + +[io-reform]: https://github.com/rust-lang/rfcs/blob/master/text/0517-io-os-reform.md + +This strategy, however, runs into a few problems: + +* For some libc structures, such as `stat`, there's not actually one canonical + definition. For example on 32-bit Linux the definition of `stat` will change + depending on whether [LFS][lfs] is enabled (via the `-D_FILE_OFFSET_BITS` + macro). This means that if std advertises these `raw` types as being "FFI + compatible with libc", it's not actually correct in all circumstances!
+* Intricately exporting raw underlying interfaces (such as [`&stat` from + `&fs::Metadata`][std-as-stat]) makes it difficult to change the + implementation over time. Today the 32-bit Linux standard library [doesn't + use LFS functions][std-no-lfs], so files over 4GB cannot be opened. Changing + this, however, would [involve changing the `stat` + structure][libc-stat-change] and may be difficult to do. +* Trait extensions in the `raw` module attempt to return the `libc` aliased type + on all platforms, for example [`DirEntryExt::ino`][std-ino] returns a type of + `ino_t`. The `ino_t` type is billed as being FFI compatible with the libc + `ino_t` type, but not all platforms store the `d_ino` field in `dirent` with + the `ino_t` type. For example on Android the [definition of + `ino_t`][android-ino_t] is `u32` but the [actual stored value is + `u64`][android-d_ino]. This means that on Android we're actually silently + truncating the return value! + +[lfs]: http://users.suse.com/~aj/linux_lfs.html +[std-as-stat]: https://github.com/rust-lang/rust/blob/29ea4eef9fa6e36f40bc1f31eb1e56bf5941ee72/src/libstd/sys/unix/fs.rs#L81-L92 +[std-no-lfs]: https://github.com/rust-lang/rust/issues/30050 +[std-ino]: https://github.com/rust-lang/rust/blob/29ea4eef9fa6e36f40bc1f31eb1e56bf5941ee72/src/libstd/sys/unix/fs.rs#L192-L197 +[libc-stat-change]: https://github.com/rust-lang-nursery/libc/blob/2c7e08c959e599ca221581b1670a9ecbbeac2dcb/src/unix/notbsd/linux/other/b32/mod.rs#L28-L71 +[android-d_ino]: https://github.com/rust-lang-nursery/libc/blob/2c7e08c959e599ca221581b1670a9ecbbeac2dcb/src/unix/notbsd/android/mod.rs#L50 +[android-ino_t]: https://github.com/rust-lang-nursery/libc/blob/2c7e08c959e599ca221581b1670a9ecbbeac2dcb/src/unix/notbsd/android/mod.rs#L11 + +Over time it's basically turned out that exporting the somewhat-messy details of +libc has gotten a little messy in the standard library as well. Exporting this +functionality (e.g.
being able to access all of the fields), is quite useful +however! This RFC proposes tweaking the design of the extensions in +`std::os::*::raw` to allow the same level of information exposure that happens +today but also cut some of the ties from libc to std to give us more freedom to +change these implementation details and work around weird platforms. + +# Detailed design +[design]: #detailed-design + +First, the types and type aliases in `std::os::*::raw` will all be +deprecated. For example `stat`, `ino_t`, `dev_t`, `mode_t`, etc, will all be +deprecated (in favor of their definitions in the `libc` crate). Note that the C +integer types, `c_int` and friends, will not be deprecated. + +Next, all existing extension traits will cease to return platform specific type +aliases (such as the `DirEntryExt::ino` function). Instead they will return +`u64` across the board unless it's 100% known for sure that fewer bits will +suffice. This will improve consistency across platforms as well as avoid +truncation problems such as those Android is experiencing. Furthermore this +frees std from dealing with any odd FFI compatibility issues, punting that to +the libc crate itself if the values are handed back into C. + +The `std::os::*::fs::MetadataExt` will have its `as_raw_stat` method deprecated, +and it will instead grow functions to access all the associated fields of the +underlying `stat` structure. This means that there will now be a +trait-per-platform to expose all this information. Also note that all the +methods will likely return `u64` in accordance with the above modification. + +With these modifications to what `std::os::*::raw` includes and how it's +defined, it should be easy to tweak existing implementations and ensure values +are transmitted in a lossless fashion.
The changes, however, are both breaking +changes and don't immediately enable fixing bugs like using LFS on Linux: + +* Code such as `let a: ino_t = entry.ino()` would break as the `ino()` function + will return `u64`, but the definition of `ino_t` may not be `u64` for all + platforms. +* The `stat` structure itself on 32-bit Linux still uses 32-bit fields (e.g. it + doesn't mirror `stat64` in libc). + +To help with these issues, more extensive modifications can be made to the +platform specific modules. All type aliases can be switched over to `u64` and +the `stat` structure could simply be redefined to `stat64` on Linux (minus +keeping the same name). This would, however, explicitly mean that +**std::os::raw is no longer FFI compatible with C**. + +This breakage can be clearly indicated in the deprecation messages, however. +Additionally, this fits within std's [breaking changes policy][api-evolution] as +a local `as` cast should be all that's needed to patch code that breaks to +straddle versions of Rust. + +[api-evolution]: https://github.com/rust-lang/rfcs/blob/master/text/1105-api-evolution.md + +# Drawbacks +[drawbacks]: #drawbacks + +As mentioned above, this RFC is strictly-speaking a breaking change. It is +expected that not much code will break, but currently there is no data +supporting this. + +Returning `u64` across the board could be confusing in some circumstances as it +may wildly differ both in terms of signedness as well as size from the +underlying C type. Converting it back to the appropriate type runs the risk of +being onerous, but accessing these raw fields in theory happens quite rarely as +std should primarily be exporting cross-platform accessors for the various +fields here and there. 
+ +# Alternatives +[alternatives]: #alternatives + +* The documentation of the raw modules in std could be modified to indicate that + the types contained within are intentionally not FFI compatible, and the same + structure could be preserved today with the types all being rewritten to what + they would be anyway if this RFC were implemented. For example `ino_t` on + Android would change to `u64` and `stat` on 32-bit Linux would change to + `stat64`. In doing this, however, it's not clear why we'd keep around all the + C namings and structure. + +* Instead of breaking existing functionality, new accessors and types could be + added to acquire the "lossless" version of a type. For example we could add a + `ino64` function on `DirEntryExt` which returns a `u64`, and for `stat` we + could add `as_raw_stat64`. This would, however, force `Metadata` to store two + different `stat` structures, and the breakage in practice this will cause may + be small enough to not warrant these great lengths. + +# Unresolved questions +[unresolved]: #unresolved-questions + +* Is the policy of almost always returning `u64` too strict? Should types like + `mode_t` be allowed as `i32` explicitly? Should the sign at least attempt to + always be preserved? 
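The truncation problem described in the Motivation section, and the `u64` accessors proposed in the Detailed design, can be sketched in a few lines. This is only an illustration under stated assumptions: `NarrowIno`, `ino_truncating`, `ino_lossless`, and `caller_after_change` are hypothetical names invented here, not items from std or libc. `NarrowIno` stands in for a platform whose `ino_t` alias is narrower than the value actually stored in `dirent`.

```rust
/// Hypothetical stand-in for a platform `ino_t` alias that is only 32 bits wide.
pub type NarrowIno = u32;

/// Accessor in the current style: it returns the platform alias, so a 64-bit
/// stored value is silently truncated by the `as` cast.
pub fn ino_truncating(stored_d_ino: u64) -> NarrowIno {
    stored_d_ino as NarrowIno
}

/// Accessor in the style this RFC proposes: always `u64`, so no bits are lost.
pub fn ino_lossless(stored_d_ino: u64) -> u64 {
    stored_d_ino
}

/// Caller-side patch for code that still wants the narrow alias: a local
/// `as` cast, which makes any truncation explicit at the call site.
pub fn caller_after_change(stored_d_ino: u64) -> NarrowIno {
    ino_lossless(stored_d_ino) as NarrowIno
}
```

The point of the sketch is where the narrowing happens: with the `u64` accessor, any loss of bits is a visible choice made by the caller rather than something hidden inside the accessor.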
From 618677ff40a121f7ad79c49e0b4f4f2e4ab957dd Mon Sep 17 00:00:00 2001 From: Nicholas Mazzuca Date: Sat, 19 Dec 2015 15:45:48 -0800 Subject: [PATCH 0642/1195] std::slice::{ copy, set }; --- text/0000-slice-copy-set.md | 60 +++++++++++++++++++++++++++++++++++++ 1 file changed, 60 insertions(+) create mode 100644 text/0000-slice-copy-set.md diff --git a/text/0000-slice-copy-set.md b/text/0000-slice-copy-set.md new file mode 100644 index 00000000000..a8a453f7c5a --- /dev/null +++ b/text/0000-slice-copy-set.md @@ -0,0 +1,60 @@ +- Feature Name: std::slice::{ copy, set }; +- Start Date: (fill me in with today's date, YYYY-MM-DD) +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Safe `memcpy` from one slice to another of the same type and length, and a safe +`memset` of a slice of type `T: Copy`. + +# Motivation +[motivation]: #motivation + +Currently, the only way to quickly copy from one non-`u8` slice to another is to +use a loop, or unsafe methods like `std::ptr::copy_nonoverlapping`. This allows +us to guarantee a `memcpy` for `Copy` types, and is safe. The only way to +`memset` a slice, currently, is a loop, and we should expose a method to allow +people to do this. This also completely gets rid of the point of +`std::slice::bytes`, which means we can remove this deprecated and useless +module. + +# Detailed design +[design]: #detailed-design + +Add two functions to `std::slice`. + +```rust +pub fn set(slice: &mut [T], value: T); +pub fn copy(src: &[T], dst: &mut [T]); +``` + +`set` loops through slice, setting each member to value. This will lower to a +memset in all cases possible. + +`copy` panics if `src.len() != dst.len()`, then `memcpy`s the members from +`src` to `dst`. + +# Drawbacks +[drawbacks]: #drawbacks + +Two new functions in `std::slice`. + +# Alternatives +[alternatives]: #alternatives + +We could name these functions something different. + +`memcpy` is also pretty weird, here. 
Panicking if the lengths differ is +different from what came before; I believe it to be the safest path, because I +think I'd want to know, personally, if I'm passing the wrong lengths to copy. +However, `std::slice::bytes::copy_memory`, the function I'm basing this on, only +panics if `dst.len() < src.len()`. So... room for discussion, here. + +However, these are necessary functions. + +# Unresolved questions +[unresolved]: #unresolved-questions + +None, as far as I can tell. From 08a87e4ac99dcecec973ceef249c8305239529ba Mon Sep 17 00:00:00 2001 From: Nicholas Mazzuca Date: Sat, 19 Dec 2015 16:21:29 -0800 Subject: [PATCH 0643/1195] Change a few things --- text/0000-slice-copy-set.md | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/text/0000-slice-copy-set.md b/text/0000-slice-copy-set.md index a8a453f7c5a..6cd41da9d49 100644 --- a/text/0000-slice-copy-set.md +++ b/text/0000-slice-copy-set.md @@ -31,7 +31,7 @@ pub fn copy(src: &[T], dst: &mut [T]); ``` `set` loops through slice, setting each member to value. This will lower to a -memset in all cases possible. +memset in all possible cases. `copy` panics if `src.len() != dst.len()`, then `memcpy`s the members from `src` to `dst`. @@ -39,7 +39,15 @@ memset in all cases possible. # Drawbacks [drawbacks]: #drawbacks -Two new functions in `std::slice`. +Two new functions in `std::slice`. `std::slice::set` *will not* be lowered to a +`memset` in any case where the bytes of `value` are not all the same, as in + +```rust +// let points: [f32; 16]; +std::slice::set(&mut points, 1.0); // This is not lowered to a memset + // (However, it is lowered to a simd loop, + // which is what a memset is, in reality) +``` # Alternatives [alternatives]: #alternatives @@ -52,7 +60,7 @@ think I'd want to know, personally, if I'm passing the wrong lengths to copy. However, `std::slice::bytes::copy_memory`, the function I'm basing this on, only panics if `dst.len() < src.len()`. So... 
room for discussion, here. -However, these are necessary functions. +These are necessary functions, in the opinion of the author. # Unresolved questions [unresolved]: #unresolved-questions From ff919f110a1437f41b0d820312a79645c8ff447a Mon Sep 17 00:00:00 2001 From: Nicholas Mazzuca Date: Sat, 19 Dec 2015 19:30:54 -0800 Subject: [PATCH 0644/1195] Switch to fill, Add some language specifically about being defined for slices with uninitialized values Also add some language about alternatives --- ...ce-copy-set.md => 0000-slice-copy-fill.md} | 36 +++++++++++++------ 1 file changed, 25 insertions(+), 11 deletions(-) rename text/{0000-slice-copy-set.md => 0000-slice-copy-fill.md} (60%) diff --git a/text/0000-slice-copy-set.md b/text/0000-slice-copy-fill.md similarity index 60% rename from text/0000-slice-copy-set.md rename to text/0000-slice-copy-fill.md index 6cd41da9d49..4c3ff0dfac0 100644 --- a/text/0000-slice-copy-set.md +++ b/text/0000-slice-copy-fill.md @@ -1,4 +1,4 @@ -- Feature Name: std::slice::{ copy, set }; +- Feature Name: std::slice::copy, slice::fill - Start Date: (fill me in with today's date, YYYY-MM-DD) - RFC PR: (leave this empty) - Rust Issue: (leave this empty) @@ -23,15 +23,23 @@ module. # Detailed design [design]: #detailed-design -Add two functions to `std::slice`. +Add one function to `std::slice`. ```rust -pub fn set(slice: &mut [T], value: T); pub fn copy(src: &[T], dst: &mut [T]); ``` -`set` loops through slice, setting each member to value. This will lower to a -memset in all possible cases. +and one function to Primitive Type `slice`. + +```rust +impl [T] where T: Copy { + pub fn fill(&mut self, value: T); +} +``` + +`fill` loops through slice, setting each member to value. This will lower to a +memset in all possible cases. It is defined to call `fill` on a slice which has +uninitialized members. `copy` panics if `src.len() != dst.len()`, then `memcpy`s the members from `src` to `dst`. @@ -39,20 +47,21 @@ memset in all possible cases. 
# Drawbacks [drawbacks]: #drawbacks -Two new functions in `std::slice`. `std::slice::set` *will not* be lowered to a -`memset` in any case where the bytes of `value` are not all the same, as in +One new function in `std::slice`, and one new method on `slice`. `[T]::fill` +*will not* be lowered to a `memset` in any case where the bytes of `value` are +not all the same, as in ```rust // let points: [f32; 16]; -std::slice::set(&mut points, 1.0); // This is not lowered to a memset - // (However, it is lowered to a simd loop, - // which is what a memset is, in reality) +points.fill(1.0); // This is not lowered to a memset (However, it is lowered to + // a simd loop, which is what a memset is, in reality) ``` # Alternatives [alternatives]: #alternatives -We could name these functions something different. +We could name these functions something else. `fill`, for example, could be +called `set`. `memcpy` is also pretty weird, here. Panicking if the lengths differ is different from what came before; I believe it to be the safest path, because I @@ -60,6 +69,11 @@ think I'd want to know, personally, if I'm passing the wrong lengths to copy. However, `std::slice::bytes::copy_memory`, the function I'm basing this on, only panics if `dst.len() < src.len()`. So... room for discussion, here. +`fill` could be a free function, and `copy` could be a method. It is the +opinion of the author that `copy` is best as a free function, as it is +non-obvious which should be the "owner", `dst` or `src`. `fill` is more +obviously a method. + These are necessary functions, in the opinion of the author. 
# Unresolved questions From 094f5568e5eb9e106163f208366ed0be7fb617a4 Mon Sep 17 00:00:00 2001 From: Nicholas Mazzuca Date: Sun, 20 Dec 2015 05:50:25 -0800 Subject: [PATCH 0645/1195] More edits from the crowd --- text/0000-slice-copy-fill.md | 55 +++++++++++++++++++++++------------- 1 file changed, 36 insertions(+), 19 deletions(-) diff --git a/text/0000-slice-copy-fill.md b/text/0000-slice-copy-fill.md index 4c3ff0dfac0..e80bbc6e9be 100644 --- a/text/0000-slice-copy-fill.md +++ b/text/0000-slice-copy-fill.md @@ -23,33 +23,40 @@ module. # Detailed design [design]: #detailed-design -Add one function to `std::slice`. - -```rust -pub fn copy(src: &[T], dst: &mut [T]); -``` - -and one function to Primitive Type `slice`. +Add two methods to Primitive Type `slice`. ```rust impl [T] where T: Copy { pub fn fill(&mut self, value: T); + pub fn copy_from(&mut self, src: &[T]); } ``` `fill` loops through slice, setting each member to value. This will lower to a -memset in all possible cases. It is defined to call `fill` on a slice which has -uninitialized members. +memset in all possible cases. It is defined behavior to call `fill` on a slice +which has uninitialized members, and `dst` is guaranteed to be fully filled +afterwards. + +`copy` panics if `src.len() != dst.len()`, then `memcpy`s the members into +`dst` from `src`. Calling `copy_from` is semantically equivalent to a `memcpy`, +`dst` can have uninitialized members, and `dst` is guaranteed to be fully filled +afterwards. This means, for example, that the following is fully defined: -`copy` panics if `src.len() != dst.len()`, then `memcpy`s the members from -`src` to `dst`. +```rust +let s1: [u8; 16] = unsafe { std::mem::uninitialized() }; +let s2: [u8; 16] = unsafe { std::mem::uninitialized() }; +s1.fill(42); +s2.copy_from(s1); +println!("{}", s2); +``` + +And the program will print 16 '8's. # Drawbacks [drawbacks]: #drawbacks -One new function in `std::slice`, and one new method on `slice`. 
`[T]::fill` -*will not* be lowered to a `memset` in any case where the bytes of `value` are -not all the same, as in +Two new methods on `slice`. `[T]::fill` *will not* be lowered to a `memset` in +any case where the bytes of `value` are not all the same, as in ```rust // let points: [f32; 16]; @@ -57,24 +64,34 @@ points.fill(1.0); // This is not lowered to a memset (However, it is lowered to // a simd loop, which is what a memset is, in reality) ``` +Also, `copy_from` has it's arguments in a different order from it's most similar +`unsafe` alternative, `std::ptr::copy_nonoverlapping`. This is due to an +unfortunate error that cannot be solved with the now stable +`copy_nonoverlapping`, and the design decision should not be extended to +`copy_from`. + # Alternatives [alternatives]: #alternatives We could name these functions something else. `fill`, for example, could be called `set`. +`copy_from` could be called `copy_to`, and have the order of the arguments +switched around. This is a bad idea, as `copy_from` has a natural connection to +`dst = src` syntax. + `memcpy` is also pretty weird, here. Panicking if the lengths differ is different from what came before; I believe it to be the safest path, because I think I'd want to know, personally, if I'm passing the wrong lengths to copy. However, `std::slice::bytes::copy_memory`, the function I'm basing this on, only panics if `dst.len() < src.len()`. So... room for discussion, here. -`fill` could be a free function, and `copy` could be a method. It is the -opinion of the author that `copy` is best as a free function, as it is -non-obvious which should be the "owner", `dst` or `src`. `fill` is more -obviously a method. +`fill` and `copy_from` could both be free functions, and were in the original +draft of this document. However, overwhelming support for these as methods has +meant that these have become methods. -These are necessary functions, in the opinion of the author. 
+These are necessary, in the opinion of the author. Much unsafe code has been +written because these do not exist. # Unresolved questions [unresolved]: #unresolved-questions From 509a559c461c768b2b3d81228c27fab6c1fd486d Mon Sep 17 00:00:00 2001 From: Nicholas Mazzuca Date: Sun, 20 Dec 2015 06:32:30 -0800 Subject: [PATCH 0646/1195] Rename `fill` to `fill_with` --- text/0000-slice-copy-fill.md | 36 ++++++++++++++++++------------------ 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/text/0000-slice-copy-fill.md b/text/0000-slice-copy-fill.md index e80bbc6e9be..30519ca0e27 100644 --- a/text/0000-slice-copy-fill.md +++ b/text/0000-slice-copy-fill.md @@ -27,25 +27,25 @@ Add two methods to Primitive Type `slice`. ```rust impl [T] where T: Copy { - pub fn fill(&mut self, value: T); + pub fn fill_with(&mut self, value: T); pub fn copy_from(&mut self, src: &[T]); } ``` -`fill` loops through slice, setting each member to value. This will lower to a -memset in all possible cases. It is defined behavior to call `fill` on a slice -which has uninitialized members, and `dst` is guaranteed to be fully filled +`fill_with` loops through slice, setting each member to value. This will lower to a +memset in all possible cases. It is defined behavior to call `fill_with` on a slice +which has uninitialized members, and `self` is guaranteed to be fully filled afterwards. -`copy` panics if `src.len() != dst.len()`, then `memcpy`s the members into -`dst` from `src`. Calling `copy_from` is semantically equivalent to a `memcpy`, -`dst` can have uninitialized members, and `dst` is guaranteed to be fully filled +`copy` panics if `src.len() != self.len()`, then `memcpy`s the members into +`self` from `src`. Calling `copy_from` is semantically equivalent to a `memcpy`; +`self` can have uninitialized members, and `self` is guaranteed to be fully filled afterwards. 
This means, for example, that the following is fully defined: ```rust let s1: [u8; 16] = unsafe { std::mem::uninitialized() }; let s2: [u8; 16] = unsafe { std::mem::uninitialized() }; -s1.fill(42); +s1.fill_with(42); s2.copy_from(s1); println!("{}", s2); ``` @@ -55,13 +55,13 @@ And the program will print 16 '8's. # Drawbacks [drawbacks]: #drawbacks -Two new methods on `slice`. `[T]::fill` *will not* be lowered to a `memset` in +Two new methods on `slice`. `[T]::fill_with` *will not* be lowered to a `memset` in any case where the bytes of `value` are not all the same, as in ```rust // let points: [f32; 16]; -points.fill(1.0); // This is not lowered to a memset (However, it is lowered to - // a simd loop, which is what a memset is, in reality) +points.fill_with(1.0); // This is not lowered to a memset (However, it is lowered to + // a simd loop, which is what a memset is, in reality) ``` Also, `copy_from` has it's arguments in a different order from it's most similar @@ -73,8 +73,8 @@ unfortunate error that cannot be solved with the now stable # Alternatives [alternatives]: #alternatives -We could name these functions something else. `fill`, for example, could be -called `set`. +We could name these functions something else. `fill_with`, for example, could be +called `set` or `fill`. `copy_from` could be called `copy_to`, and have the order of the arguments switched around. This is a bad idea, as `copy_from` has a natural connection to @@ -86,12 +86,12 @@ think I'd want to know, personally, if I'm passing the wrong lengths to copy. However, `std::slice::bytes::copy_memory`, the function I'm basing this on, only panics if `dst.len() < src.len()`. So... room for discussion, here. -`fill` and `copy_from` could both be free functions, and were in the original -draft of this document. However, overwhelming support for these as methods has -meant that these have become methods. 
+`fill_with` and `copy_from` could both be free functions, and were in the +original draft of this document. However, overwhelming support for these as +methods has meant that these have become methods. -These are necessary, in the opinion of the author. Much unsafe code has been -written because these do not exist. +These are necessary, in my opinion. Much unsafe code has been written because +these do not exist. # Unresolved questions [unresolved]: #unresolved-questions From 824d43c131f99b30faa6537dc038e9780e57d46c Mon Sep 17 00:00:00 2001 From: Nicholas Mazzuca Date: Sun, 20 Dec 2015 17:41:02 -0800 Subject: [PATCH 0647/1195] Final name change; back to `fill` --- text/0000-slice-copy-fill.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/text/0000-slice-copy-fill.md b/text/0000-slice-copy-fill.md index 30519ca0e27..7e96e577950 100644 --- a/text/0000-slice-copy-fill.md +++ b/text/0000-slice-copy-fill.md @@ -27,13 +27,13 @@ Add two methods to Primitive Type `slice`. ```rust impl [T] where T: Copy { - pub fn fill_with(&mut self, value: T); + pub fn fill(&mut self, value: T); pub fn copy_from(&mut self, src: &[T]); } ``` -`fill_with` loops through slice, setting each member to value. This will lower to a -memset in all possible cases. It is defined behavior to call `fill_with` on a slice +`fill` loops through slice, setting each member to value. This will lower to a +memset in all possible cases. It is defined behavior to call `fill` on a slice which has uninitialized members, and `self` is guaranteed to be fully filled afterwards. @@ -45,7 +45,7 @@ afterwards. This means, for example, that the following is fully defined: ```rust let s1: [u8; 16] = unsafe { std::mem::uninitialized() }; let s2: [u8; 16] = unsafe { std::mem::uninitialized() }; -s1.fill_with(42); +s1.fill(42); s2.copy_from(s1); println!("{}", s2); ``` @@ -55,12 +55,12 @@ And the program will print 16 '8's. 
# Drawbacks [drawbacks]: #drawbacks -Two new methods on `slice`. `[T]::fill_with` *will not* be lowered to a `memset` in +Two new methods on `slice`. `[T]::fill` *will not* be lowered to a `memset` in any case where the bytes of `value` are not all the same, as in ```rust // let points: [f32; 16]; -points.fill_with(1.0); // This is not lowered to a memset (However, it is lowered to +points.fill(1.0); // This is not lowered to a memset (However, it is lowered to // a simd loop, which is what a memset is, in reality) ``` @@ -73,8 +73,8 @@ unfortunate error that cannot be solved with the now stable # Alternatives [alternatives]: #alternatives -We could name these functions something else. `fill_with`, for example, could be -called `set` or `fill`. +We could name these functions something else. `fill`, for example, could be +called `set`, `fill_from`, or `fill_with`. `copy_from` could be called `copy_to`, and have the order of the arguments switched around. This is a bad idea, as `copy_from` has a natural connection to @@ -86,7 +86,7 @@ think I'd want to know, personally, if I'm passing the wrong lengths to copy. However, `std::slice::bytes::copy_memory`, the function I'm basing this on, only panics if `dst.len() < src.len()`. So... room for discussion, here. -`fill_with` and `copy_from` could both be free functions, and were in the +`fill` and `copy_from` could both be free functions, and were in the original draft of this document. However, overwhelming support for these as methods has meant that these have become methods. 
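For readers who want to try the behaviour discussed in this revision, here is a minimal sketch using methods that exist in today's standard library. It rests on two assumptions that go beyond this draft: the copy operation is spelled `copy_from_slice` (the name std ended up with, rather than the `copy_from` proposed here), and the toolchain is recent enough to have `<[T]>::fill`. The arrays are zero-initialized rather than created with `std::mem::uninitialized`, and the function names `fill_then_copy` and `mismatched_copy_panics` are hypothetical.

```rust
/// Fills one buffer with a value, then copies it into a second buffer of
/// equal length and returns the copy.
pub fn fill_then_copy() -> [u8; 16] {
    let mut s1 = [0u8; 16];
    let mut s2 = [0u8; 16];
    s1.fill(42); // every element of `s1` becomes 42
    s2.copy_from_slice(&s1); // lengths match, so this succeeds
    s2
}

/// Returns `true` if copying between slices of different lengths panics.
/// This assumes the default unwinding panic strategy.
pub fn mismatched_copy_panics() -> bool {
    std::panic::catch_unwind(|| {
        let src = [1u8; 4];
        let mut dst = [0u8; 8];
        dst.copy_from_slice(&src); // 4 != 8, so this is expected to panic
    })
    .is_err()
}
```

The second function exercises the design choice debated under Alternatives: the copy panics whenever the two lengths differ, not only when the destination is too short.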
From a7bf742db2af8f5be06322a139171d8819ead929 Mon Sep 17 00:00:00 2001 From: Nicholas Mazzuca Date: Mon, 21 Dec 2015 00:17:58 -0800 Subject: [PATCH 0648/1195] Correction from bluss --- text/0000-slice-copy-fill.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-slice-copy-fill.md b/text/0000-slice-copy-fill.md index 7e96e577950..b2f9ca761f5 100644 --- a/text/0000-slice-copy-fill.md +++ b/text/0000-slice-copy-fill.md @@ -46,7 +46,7 @@ afterwards. This means, for example, that the following is fully defined: let s1: [u8; 16] = unsafe { std::mem::uninitialized() }; let s2: [u8; 16] = unsafe { std::mem::uninitialized() }; s1.fill(42); -s2.copy_from(s1); +s2.copy_from(&s1); println!("{}", s2); ``` From 4489604752d2629556cfa7ef93047652e5b1cb4e Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Mon, 21 Dec 2015 15:14:32 +0100 Subject: [PATCH 0649/1195] first draft. update: zebra stripe the embedded laundry list of bugs. update: try to clarify what I am talking about in the "In practice" section. update: add example of re-export. update: added pub(crate) example. update: added discussion of precedent in Scala. --- text/0000-pub-restricted.md | 924 ++++++++++++++++++++++++++++++++++++ 1 file changed, 924 insertions(+) create mode 100644 text/0000-pub-restricted.md diff --git a/text/0000-pub-restricted.md b/text/0000-pub-restricted.md new file mode 100644 index 00000000000..e9d991688dd --- /dev/null +++ b/text/0000-pub-restricted.md @@ -0,0 +1,924 @@ +- Feature Name: pub_restricted +- Start Date: 2015-12-18 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Expand the current `pub`/non-`pub` categorization of items with the +ability to say "make this item visible *solely* to a (named) module +tree." + +The current `crate` is one such tree, and would be expressed via: +`pub(crate) item`. Other trees can be denoted via a path employed in a +`use` statement, e.g. 
`pub(a::b) item`, or `pub(super) item`. + +# Motivation +[motivation]: #motivation + +Right now, if you have a definition for an item `X` that you want to +use in many places in a module tree, you can either +(1.) define `X` at the root of the tree as a non-`pub` item, or +(2.) you can define `X` as a `pub` item in some submodule +(and import into the root of the module tree via `use`). + +But: Sometimes neither of these options is really what you want. + +There are scenarios where developers would like an item to be visible +to a particular module subtree (or a whole crate in its entirety), but +it is not possible to move the item's (non-pub) definition to the root +of that subtree (which would be the usual way to expose an item to a +subtree without making it pub). + +If the definition of `X` itself needs access to other private items +within a submodule of the tree, then `X` *cannot* be put at the root +of the module tree. Illustration: + +```rust +// Intent: `a` exports `I` and `foo`, but nothing else. +pub mod a { + pub const I: i32 = 3; + + // `semisecret` will be used "many" places within `a`, but + // is not meant to be exposed outside of `a`. + fn semisecret(x: i32) -> i32 { use self::b::c::J; x + J } + + pub fn foo(y: i32) -> i32 { semisecret(I) + y } + pub fn bar(z: i32) -> i32 { semisecret(I) * z } + + mod b { + mod c { + const J: i32 = 4; // J is meant to be hidden from the outside world. + } + } +} +``` + +(Note: the `pub mod a` is meant to be at the root of some crate.) + +The latter code fails to compile, due to the privacy violation where +the body of `fn semisecret` attempts to access `a::b::c::J`, which +is not visible in the context of `a`. 
+ +A standard way to deal with this today is to use the second approach +described above (labelled "(2.)"): move `fn semisecret` down into the place where it can +access `J`, marking `fn semisecret` as `pub` so that it can still be +accessed within the items of `a`, and then re-exporting `semisecret` +as necessary up the module tree. + +```rust +// Intent: `a` exports `I` and `foo`, but nothing else. +pub mod a { + pub const I: i32 = 3; + + // `semisecret` will be used "many" places within `a`, but + // is not meant to be exposed outside of `a`. + // (If we put `pub use` here, then *anyone* could access it.) + use self::b::semisecret; + + pub fn foo(y: i32) -> i32 { semisecret(I) + y } + pub fn bar(z: i32) -> i32 { semisecret(I) * z } + + mod b { + pub use self::c::semisecret; + mod c { + const J: i32 = 4; // J is meant to be hidden from the outside world. + pub fn semisecret(x: i32) -> i32 { x + J } + } + } +} +``` + +This works, but there is a serious issue with it: One cannot easily +tell exactly how "public" `fn semisecret` is. In particular, +understanding who can access `semisecret` requires reasoning about +(1.) all of the `pub use`'s (aka re-exports) of `semisecret`, and +(2.) the `pub`-ness of every module in a path leading to `fn +semisecret` or one of its re-exports. + +This RFC seeks to remedy the above problem via two main changes. + + 1. Give the user a way to explicitly restrict the intended scope + of where a `pub`-licized item can be used. + + 2. Modify the privacy rules so that `pub`-restricted items cannot be + used nor re-exported outside of their respective restricted areas. + +## Impact + +This difficulty in reasoning about the "publicness" of a name is not +just a problem for users; it also complicates efforts within the +compiler to verify that a surface API for a type does not itself use +or expose any private names. 
+ +[There][18241] are [a][28325] number [of][28450] bugs [filed][28514] against +[privacy][29668] checking; some are simply +implementation issues, but the comment threads in the issues make it +clear that in some cases, different people have very different mental +models about how privacy interacts with aliases (e.g. `type` +declarations) and re-exports. + +In theory, we can add the changes of this RFC without breaking any old +code. (That is, in principle the only affected code is that for item +definitions that use `pub(restriction)`. This limited addition would +still provide value to users in their reasoning about the visibility +of such items.) + +In practice, I expect that as part of the implementation of this RFC, +we will probably fix pre-existing bugs in the parts of privacy +checking verifying that surface API's do not use or expose private +names. + +Important: No such fixes to such pre-existing bugs are being +concretely proposed by this RFC; I am merely musing that by adding a +more expressive privacy system, we will open the door to fix bugs +whose exploits, under the old system, were the only way to express +certain patterns of interest to developers. + + +[18241]: https://github.com/rust-lang/rust/issues/18241 + + +[28325]: https://github.com/rust-lang/rust/issues/28325 + + +[28450]: https://github.com/rust-lang/rust/issues/28450 + + +[28514]: https://github.com/rust-lang/rust/issues/28514 + + +[29668]: https://github.com/rust-lang/rust/issues/29668 + + +[RFC 136]: https://github.com/rust-lang/rfcs/blob/master/text/0136-no-privates-in-public.md + + +[RFC amendment 200]: https://github.com/rust-lang/rfcs/pull/200 + + +# Detailed design +[design]: #detailed-design + +The main problem identified in the [motivation][] section is this: + +From an module-internal definition like +```rust +pub mod a { [...] mod b { [...] pub fn semisecret(x: i32) -> i32 { x + J } [...] 
} } +``` +one cannot readily tell exactly how "public" the `fn semisecret` is meant to be. + +As already stated, this RFC seeks to remedy the above problem via two +main changes. + + 1. Give the user a way to explicitly restrict the intended scope + of where a `pub`-licized item can be used. + + 2. Modify the privacy rules so that `pub`-restricted items cannot be + used nor re-exported outside of their respective restricted areas. + +## Syntax + +The new feature is to restrict the scope by adding the module subtree +(which acts as the restricted area) in parentheses after the `pub` +keyword, like so: + +```rust +pub(a::b::c) item; +``` + +The path in the restriction is resolved just like a `use` statement: it +is resolved absolutely, from the crate root. + +Just like `use` statements, one can also write relative paths, by +starting them with `self` or a sequence of `super`'s. + +```rust +pub(super::super) item; +// or +pub(self) item; // (semantically equiv to no `pub`; see below) +``` + +In addition to the forms analogous to `use`, there is one new form: + +```rust +pub(crate) item; +``` + +In other words, the grammar is changed like so: + +old: +``` +VISIBILITY ::= | `pub` +``` + +new: +``` +VISIBILITY ::= | `pub` | `pub` `(` USE_PATH `)` | `pub` `(` `crate` `)` +``` + +One can use these `pub(restriction)` forms anywhere that one can +currently use `pub`. In particular, one can use them on item +defintions, methods in an impl, the fields of a struct +definition, and on `pub use` re-exports. + +## Semantics + +The meaning of `pub(restriction)` is as follows: The definition of +every item, method, field, or name (e.g. a re-export) is associated +with a restriction. + +A restriction is either: the universe of all crates (aka +"unrestricted"), the current crate, or an absolute path to a module +sub-hierarchy in the current crate. A restricted thing cannot be +directly "used" in source code outside of its restricted area. 
(The +term "used" here is meant to cover both direct reference in the +source, and also implicit reference as the inferred type of an +expression or pattern.) + + * `pub` written with no explicit restriction means that there is no + restriction, or in other words, the restriction is the universe of + all crates. + + * `pub(crate)` means that the restriction is the current crate. + + * `pub(<path>)` means that the restriction is the module + sub-hierarchy denoted by `<path>`, resolved in the context of the + occurrence of the `pub` modifier. (This is to ensure that `super` + and `self` make sense in such paths.) + +As noted above, the definition means that `pub(self) item` is the same +as if one had written just `item`. + + * The main reason to support this level of generality (which is + otherwise just "redundant syntax") is macros: one can write a macro + that expands to `pub($arg) item`, and a macro client can pass in + `self` as the `$arg` to get the effect of a non-pub definition. + +NOTE: even if the restriction of an item or name indicates that it is +accessible in some context, it may still be impossible to reference +it. In particular, we will still keep our existing rules regarding +`pub` items defined in non-`pub` modules; such items would have no +restriction, but still may be inaccessible if they are not re-exported in +some manner. + +## Revised Example +[revised]: #revised-example + +In the running example, one could instead write: + +```rust +// Intent: `a` exports `I` and `foo`, but nothing else. +pub mod a { + pub const I: i32 = 3; + + // `semisecret` will be used "many" places within `a`, but + // is not meant to be exposed outside of `a`. + // (`pub use` would be *rejected*; see Note 1 below) + use self::b::semisecret; + + pub fn foo(y: i32) -> i32 { semisecret(I) + y } + pub fn bar(z: i32) -> i32 { semisecret(I) * z } + + mod b { + pub(a) use self::c::semisecret; + mod c { + const J: i32 = 4; // J is meant to be hidden from the outside world.
+ + // `pub(a)` means "usable within hierarchy of `mod a`, but not + // elsewhere." + pub(a) fn semisecret(x: i32) -> i32 { x + J } + } + } +} +``` + +Note 1: The compiler would reject the variation of the above written +as: + +```rust +pub mod a { [...] pub use self::b::semisecret; [...] } +``` + +because `pub(a) fn semisecret` says that it cannot be used outside of +`a`, and therefore it be incorrect (or at least useless) to reexport +`semisecret` outside of `a`. + +Note 2: The most direct interpretation of the rules here leads me to +conclude that `b`'s re-export of `semisecret` needs to be restricted +to `a` as well. However, it may be possible to loosen things so that +the re-export could just stay as `pub` with no extra restriction; see +discussion of "IRS:PUNPM" in Unresolved Questions. + +This richer notion of privacy does offer us some other ways to +re-write the running example; instead of defining `fn semisecret` +within `c` so that it can access `J`, we might instead expose `J` to +`mod b` and then put `fn semisecret`, like so: + +```rust +pub mod a { + [...] + mod b { + use self::c::J; + pub(a) fn semisecret(x: i32) -> i32 { x + J } + mod c { + pub(b) const J: i32 = 4; + } + } +} +``` + +(This RFC takes no position on which of the above two structures is +"better"; a toy example like this does not provide enough context to +judge.) + +## Restrictions +[restrictions]: #restrictions + +Lets discuss what the restrictions actually mean. + +Some basic definitions: An item is just as it is declared in the Rust +reference manual: a component of a crate, located at a fixed path +(potentially at the "outermost" anonymous module) within the module +tree of the crate. + +Every item can be thought of as having some hidden implementation +component(s) along with an exposed surface API. + +So, for example, in `pub fn foo(x: Input) -> Output { Body }`, the +surface of `foo` includes `Input` and `Output`, while the `Body` is +hidden. 
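As a concrete illustration of the surface/body split just described, here is a minimal sketch. The names `api`, `Hidden`, and `helper` are hypothetical and chosen only for this illustration; `Input`, `Output`, and `foo` stand in for the abstract names used in the text. The surface of `foo` mentions only public types, while its body is free to use private ones:

```rust
// Illustrative sketch only: `api`, `Hidden`, and `helper` are made-up names.
pub mod api {
    pub struct Input(pub i32);
    pub struct Output(pub i32);

    // Private to `api`: it may appear in a body, but it must not
    // appear in the surface of a `pub` item.
    struct Hidden(i32);

    // Private helper, used only inside bodies within `api`.
    fn helper(h: Hidden) -> i32 {
        h.0 * 2
    }

    // Surface: `Input` and `Output`.
    // Body (hidden): uses the private `Hidden` and `helper`.
    pub fn foo(x: Input) -> Output {
        Output(helper(Hidden(x.0)))
    }
}

fn main() {
    let out = api::foo(api::Input(21));
    println!("{}", out.0);
}
```

If `foo` instead returned `Hidden`, the private type would leak into the surface of a public item, which is the situation that rule (2.) below is meant to rule out.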
+ +The pre-existing privacy rules (both prior to and after this RFC) try +to enforce two things: (1.) when a item references a path, all of the +names on that path need to be visible (in terms of privacy) in the +referencing context and, (2.) private items should not be exposed in +the surface of public API's. + + * I am using the term "surface" rather than "signature" deliberately, + since I think the term "signature" is too broad to be used to + accurately describe the current semantics of rustc. See my recent + [Surface blog post][] for further discussion. + +[Surface blog post]: http://blog.pnkfx.org/blog/2015/12/19/signatures-and-surfaces-thoughts-on-privacy-versus-dependency/ + +This RFC is expanding the scope of (2.) above, so that the rules are now: + + 1. when a item references a path (in its implementation or in its + signature), all of the names on that path must be visible in the + referencing context. + + 2. items *restricted* to an area R should not be exposed in the + surface API of names or items that can themselves be exported + beyond R. (Privacy is now a special case of this more general + notion.) + + For convenience, it is legal to declare a field (or inherent + method) with a strictly larger area of restriction than its + `self`. See discussion in the [examples][parts-more-public-than-whole]. + +In principle, validating (1.) can be done via the pre-existing privacy +code. (However, it may make sense to do it by mapping each name to its +associated restriction; I don't think that will change the outcome, +but it might make the checking code simpler. But I am not an expert on +the current state of the privacy checking code.) + +Validating (2.) requires traversing the surface API for each item and +comparing the restriction for every reference to the restriction of +the item itself. + +## Trait methods + +Currently, trait associated item syntax carries no `pub` modifier. 
+ +A question arises when trying to apply the terminology of this RFC: +are trait associated items implicitly `pub`, in the sense that they +are unrestricted? + +The simple answer is: No, associated items are not implicitly `pub`; +at least, not in general. (They are not in general implicitly `pub` +today either, as discussed in [RFC 136][when public (RFC 136)].) +(If they were implictly `pub`, things would be difficult; further +discussion in attached [appendix][associated items digression].) + +[when public (RFC 136)]: https://github.com/rust-lang/rfcs/blob/master/text/0136-no-privates-in-public.md#when-is-an-item-public + +However, since this RFC is introducing multiple kinds of `pub`, we +should address the topic of what *is* the `pub`-ness of associated +items. + + * When analyzing a trait definition, then associated items should be + considered to inherit the `pub`-ness, if any, of their defining + trait. + + We want to make sure that this code continues to work: + + ```rust + mod a { + struct S(String); + trait Trait { + fn make_s(&self) -> S; // referencing `S` is ok, b/c `Trait` is not `pub` + } + } + ``` + + And under this RFC, we now allow this as well: + + ```rust + mod a { + struct S(String); + mod b { + pub(a) trait Trait { + fn mk_s(&self) -> ::a::S; + // referencing `::a::S` is ok, b/c `Trait` is restricted to `::a` + } + } + use self::b::Trait; + } + ``` + + Note that in stable Rust today, it is an error to declare the latter trait + within `mod b` as non-`pub` (since the `use self::b::Trait` would be + referencing a private item), + *and* in the Rust nightly channel it is a warning to declare it + as `pub trait Trait { ... }`. + + The point of this RFC is to give users a sensible way to declare + such traits within `b`, without allowing them to be exposed outside + of `a`. + + * When analyzing an `impl Trait for Type`, there may be distinct + restrictions assigned to the `Trait` and the `Type`. 
However, + since both the `Trait` and the `Type` must be visible in the + context of the module where the `impl` occurs, there should + be a subtree relationship between the two restrictions; in other + words, one restriction should be less than (or equal to) the other. + + So just use the minimum of the two restrictions when analyzing + the right-hand sides of the associated items in the impl. + + Note: I am largely adopting this rule in an attempt to be + consistent with [RFC 136][when public (RFC 136)]. I invite + discussion of whether this rule actually makes sense as phrased + here. + +## More examples! +[examples]: #more-examples + +These examples meant to explore the syntax a bit. They are *not* meant +to provide motivation for the feature (i.e. I am not claiming that the +feature is making this code cleaner or easier to reason about). + +### Impl item example +[impl-item-example]: #impl-item-example + +```rust +pub struct S; + +mod a { + pub fn call_foo(s: &S) { s.foo(); } + + impl S { + pub(a) fn foo(&self) { println!("only callable within `a`"); } + } +} + +fn rejected(s: &S) { + s.foo(); //~ ERROR: `S::foo` not visible outside of module `a` +} +``` + +(You may be wondering: "Could we move that `impl S` out to the +top-level, out of `mod a`?" Well ... see discussion in the +[unresolved questions][def-outside-restriction].) + +### Restricting fields example +[restricting fields example]: #restricting-fields-example + +```rust +mod a { + #[derive(Default)] + struct Priv(i32); + + pub mod b { + use a::Priv as Priv_a; + + #[derive(Default)] + pub struct F { + pub x: i32, + y: Priv_a, + pub(a) z: Priv_a, + } + + #[derive(Default)] + pub struct G(pub i32, Priv_a, pub(a) Priv_a); + + // ... accesses to F.{x,y,z} ... + // ... accesses to G.{0,1,2} ... + } + // ... accesses to F.{x,z} ... + // ... accesses to G.{0,2} ... +} + +mod k { + use a::b::{F, G}; + // ... accesses to F and F.x ... + // ... accesses to G and G.0 ... 
+} +``` + + +### Fields and inherent methods more public than self +[parts-more-public-than-whole]: #fields-and-inherent-methods-more-public-than-self + +In Rust today, one can write + +```rust +mod a { struct X { pub y: i32, } } +``` + +This RFC was crafted to say that fields and inherent methods +can have an associated restriction that is larger than the restriction +of its `self`. This was both to keep from breaking the above +code, and also because it would be annoying to be forced to write: + +```rust +mod a { struct X { pub(a) y: i32, } } +``` + +(This RFC is not an attempt to resolve things like +[Rust Issue 30079][30079]; the decision of how to handle that issue +can be dealt with orthogonally, in my opinion.) + +[30079]: https://github.com/rust-lang/rust/issues/30079 + + +So, under this RFC, the following is legal: + +```rust +mod a { + pub use self::b::stuff_with_x; + mod b { + struct X { pub y: i32, pub(a) z: i32 } + mod c { + impl super::X { + pub(c) fn only_in_c(&mut self) { self.y += 1; } + + pub fn callanywhere(&mut self) { + self.only_in_c(); + println!("X.y is now: {}", self.y); + } + } + } + pub fn stuff_with_x() { + let mut x = X { y: 10, z: 20}; + x.callanywhere(); + } + } +} +``` + +In particular: + + * It is okay that the fields `y` and `z` and the inherent method + `fn callanywhere` are more publicly visible than `X`. + + (Just because we declare something `pub` does not mean it will + actually be *possible* to reach it from arbitrary contexts. Whether + or not such access is possible will depend on many things, including + but not limited to the restriction attached and also future decisions + about issues like [issue 30079][30079].) + + * We are allowed to restrict an inherent method, `fn only_in_c`, to + a subtree of the module tree where `X` is itself visible. + +### Re-exports + +Here is an example of a `pub use` re-export using the new +feature, including both correct and invalid uses of the extended form. 
+ +```rust +mod a { + mod b { + pub(a) struct X { pub y: i32, pub(a) z: i32 } // restricted to `mod a` tree + mod c { + pub mod d { + pub(super) use a::b::X as P; // ok: a::b::c is submodule of `a` + } + + fn swap_ok(x: d::P) -> d::P { // ok: `P` accessible here + X { z: x.y, y: x.z } + } + } + + fn swap_bad(x: c::d::P) -> c::d::P { //~ ERROR: `c::d::P` not visible outside `a::b::c` + X { z: x.y, y: x.z } + } + + mod bad { + pub use super::X; //~ ERROR: `X` cannot be reexported outside of `a` + } + } + + fn swap_ok2(x: X) -> X { // ok: `X` accessible from `mod a`. + X { z: x.y, y: x.z } + } +} +``` + +### Crate restricted visibility + +This is a concrete illusration of how one might use the `pub(crate) item` form, +(which is perhaps quite similar to Java's default "package visibility"). + +Crate `c1`: + +```rust +pub mod a { + struct Priv(i32); + + pub(crate) struct R { pub y: i32, z: Priv } // ok: field allowed to be more public + pub struct S { pub y: i32, z: Priv } + + pub fn to_r_bad(s: S) -> R { ... } //~ ERROR: `R` restricted solely to this crate + + pub(crate) fn to_r(s: S) -> R { R { y: s.y, z: s.z } } // ok: restricted to crate +} + +use a::{R, S}; // ok: `a::R` and `a::S` are both visible + +pub use a::R as ReexportAttempt; //~ ERROR: `a::R` restricted solely to this crate +``` + +Crate `c2`: + +```rust +extern crate c1; + +use c1::a::S; // ok: `S` is unrestricted + +use c1::a::R; //~ ERROR: `c1::a::R` not visible outside of its crate +``` + +## Precedent + +When I started on this I was not sure if this form of delimited access +to a particular module subtree had a precedent; the closest thing I +could think of was C++ `friend` modifiers (but `friend` is far more +ad-hoc and free-form than what is being proposed here). + +### Scala + +It has since been pointed out to me that Scala has scoped access +modifiers `protected[Y]` and `private[Y]`, which specify that access +is provided upto `Y` (where `Y` can be a package, class or singleton +object). 
+ +The feature proposed by this RFC appears to be similar in intent to +Scala's scoped access modifiers. + +Having said that, I will admit that I am not clear on what +distinction, if any, Scala draws between `protected[Y]` and +`private[Y]` when `Y` is a package, which is the main analogy for our +purposes, or if they just allow both forms as synonyms for +convenience. + +(I can imagine a hypothetical distinction in Scala when `Y` is a +class, but my skimming online has not provided insight as to what the +actual distinction is.) + +Even if there is some distinction drawn between the two forms in +Scala, I suspect Rust does not need an analogous distinction in it's +`pub(restricted)` + +# Drawbacks +[drawbacks]: #drawbacks + +Obviously, +`pub(restriction) item` complicates the surface syntax of the language. + + * However, my counter-argument to this drawback is that this feature + in fact *simplifies* the developer's mental model. It is easier to + directly encode the expected visibility of an item via + `pub(restriction)` than to figure out the right concoction via a + mix of nested `mod` and `pub use` statements. And likewise, it is + easier to read it too. + +Developers may misuse this form and make it hard to access the tasty +innards of other modules. + + * This is true, but I claim it is irrelevant. + + The effect of this change is solely on the visibility of items + *within* a crate. No rules for inter-crate access change. + + From the perspective of cross-crate development, this RFC changes + nothing, except that it may lead some crate authors to make some + things no longer universally `pub` that they were forced to make + visible before due to earlier limitations. I claim that in such + cases, those crate authors probably always intended for such items + to be non-`pub`, but language limitations were forcing their hand. 
+ + As for intra-crate access: My expectation is that an individual + crate will be made by a team of developers who can work out what + mutual visibility they want and how it should evolve over time. + This feature may affect their work flow to some degree, but they + can choose to either use it or not, based on their own internal + policies. + + +# Alternatives +[alternatives]: #alternatives + +## Do not extend the language! + + * Change privacy rules and make privacy analysis "smarter" + (e.g. global reachability analysis) + + The main problem with this approach is that we tried it, and it + did not work well: The implementation was buggy, and the user-visible + error messages were hard to understand. + + See discussion when the team was discussing the [public items amendment][]. + +[public items amendment]: https://github.com/rust-lang/meeting-minutes/blob/master/weekly-meetings/2014-09-16.md#rfc-public-items + + * "Fix" the mental model of privacy (if necessary) without extending + the language. + + The alternative is basically saying: "Our existing system is fine; all + of the problems with it are due to bugs in the implementation". + + I am sympathetic to this response. However, I think it doesn't + quite hold up. Some users want to be able to define items that are + exposed outside of their module but still restrict the scope of + where they can be referenced, as discussed in the [motivation][] + section, and I do not think the current model can be "fixed" to + support that use case, at least not without adding some sort of + global reachability analysis as discussed in the previous bullet. + +In addition, these two alternatives do not address the main point +being made in the [motivation][] section: one cannot tell exactly how +"public" a `pub` item is, without working backwards through the module +tree for all of its re-exports. + +## Curb your ambitions!
+ + * Instead of adding support for restricting to arbitrary module + subtrees, narrow the feature to just `pub(crate) item`, so that one + chooses either "module private" (by adding no modifier), or + "universally visible" (by adding `pub`), or "visible to just the + current crate" (by adding `pub(crate)`). + + This would be somewhat analogous to Java's relatively coarse + grained privacy rules, where one can choose `public`, `private`, + `protected`, or the unnamed "package" visibility. + + I am all for keeping the implementation simple. However, the reason + that we should support arbitrary module subtrees is that doing so + will enable certain refactorings. Namely, if I decide I want to + inline the definition for one or more crates `A1`, `A2`, ... into + client crate `C` (i.e. replacing `extern crate A1;` with a + suitably defined `mod A1 { ... }`), but I do not want to worry about + whether doing so will risk future changes violating abstraction + boundaries that were previously being enforced via `pub(crate)`, + then I believe allowing `pub(path)` will allow a mechanical tool to + do the inline refactoring, rewriting each `pub(crate)` as `pub(A1)` + as necessary. + +# Unresolved questions +[unresolved]: #unresolved-questions + +## Can definition site fall outside restriction? +[def-outside-restriction]: #can-definition-site-fall-outside-restriction + +For example, is it illegal to do the following: + +```rust +mod a { + mod child { } + mod b { pub(super::child) const J: i32 = 3; } +} +``` + +Or does it just mean that `J`, despite being defined in `mod b`, is +itself not accessible in `mod b`? + +pnkfelix is personally inclined to make this sort of thing illegal, +mainly because he finds it totally unintuitive, but is interested in +hearing counter-arguments.
Certainly the earlier [impl item example][] +would look prettier as: + +```rust +pub struct S; + +impl S { + pub(a) fn foo(&self) { println!("only callable within `a`"); } +} + +mod a { + pub fn call_foo(s: &S) { s.foo(); } + +} + +fn rejected(s: &S) { + s.foo(); //~ ERROR: `S::foo` not visible outside of module `a` +} +``` + +## Implicit Restriction Satisfaction (IRS:PUNPM) + +If a re-export occurs within a non-`pub` module, can we treat it as +implicitly satisfying a restriction to `super` imposed by the item it +is re-exporting? + +In particular, the [revised example][revised] included: + +```rust +// Intent: `a` exports `I` and `foo`, but nothing else. +pub mod a { + [...] + mod b { + pub(a) use self::c::semisecret; + mod c { pub(a) fn semisecret(x: i32) -> i32 { x + J } } + } +} +``` + +However, since `b` is non-`pub`, its `pub` items and re-exports are +solely accessible via the subhierarchy of its module parent (i.e., +`mod a`), as long as no entity attempts to re-export them to a broader +scope. + +In other words, in some sense `mod b { pub use item; }` *could* +implicitly satisfy a restriction to `super` imposed by `item` (if we +chose to allow it). + +Note: If it were `pub mod b` or `pub(restrict) mod b`, then the above +reasoning would not hold. Therefore, this discussion is limited to +re-exports from non-`pub` modules. + +If we do not allow such implicit restriction satisfaction +for `pub use` re-exports from non-`pub` modules (IRS:PUNPM), then: + +```rust +pub mod a { + [...] + mod b { + pub use self::c::semisecret; + mod c { pub(a) fn semisecret(x: i32) -> i32 { x + J } } + } +} +``` + +would be rejected, and one would be expected to write either: + +```rust + pub(super) use self::c::semisecret; +``` + +or + +```rust + pub(a) use self::c::semisecret; +``` + + +(Side note: I am *not* saying that under IRS:PUNPM, the two forms `pub +use item` and `pub(super) use item` would be considered synonymous, +even in the context of a non-pub module like `mod b`.
In particular, +`pub(super) use item` may be imposing a new restriction on the +re-exported name that was not part of its original definition.) + +# Appendices + +## Associated Items Digression +[associated items digression]: #associated-items-digression + +If associated items were implicitly `pub`, in the sense that they are +unrestricted, then that would conflict with the rules imposed by this +RFC, in the sense that the surface API of a non-`pub` trait is +composed of its associated items, and so if all associated items were +implicitly `pub` and unrestricted, then this code would be rejected: + +```rust +mod a { + struct S(String); + trait Trait { + fn mk_s(&self) -> S; // is this implicitly `pub` and unrestricted? + } + impl Trait for () { fn mk_s(&self) -> S { S(format!("():()")) } } + impl Trait for i32 { fn mk_s(&self) -> S { S(format!("{}:i32", self)) } } + pub fn foo(x:i32) -> String { format!("silly{}{}", ().mk_s().0, x.mk_s().0) } +} +``` + +If associated items were implicitly `pub` and unrestricted, then the +above code would be rejected under direct interpretation of the rules +of this RFC (because `fn make_s` is implicitly unrestricted, but the +surface of `fn make_s` references `S`, a non-`pub` item). This would +be backwards-incompatible (and just darn inconvenient too). + +So, to be clear, this RFC is *not* suggesting that associated items be +implicitly `pub` and unrestricted. From e35b29890a694fe6d652efcdd4e45808937a4685 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Mon, 21 Dec 2015 16:44:37 +0100 Subject: [PATCH 0650/1195] fix typo in link defn. --- text/0000-pub-restricted.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-pub-restricted.md b/text/0000-pub-restricted.md index e9d991688dd..47882d45b8c 100644 --- a/text/0000-pub-restricted.md +++ b/text/0000-pub-restricted.md @@ -471,7 +471,7 @@ to provide motivation for the feature (i.e. 
I am not claiming that the feature is making this code cleaner or easier to reason about). ### Impl item example -[impl-item-example]: #impl-item-example +[impl item example]: #impl-item-example ```rust pub struct S; From 65ea6716d0383d4174286b074ded37722bde1562 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 21 Dec 2015 09:25:15 -0800 Subject: [PATCH 0651/1195] Typos --- text/0000-cargo-cfg-dependencies.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-cargo-cfg-dependencies.md b/text/0000-cargo-cfg-dependencies.md index b55c1d27207..055a389cfbd 100644 --- a/text/0000-cargo-cfg-dependencies.md +++ b/text/0000-cargo-cfg-dependencies.md @@ -33,7 +33,7 @@ define these dependencies. # Detailed design [design]: #detailed-design -The target-specific dependency syntax in Cargo will be expanded to to include +The target-specific dependency syntax in Cargo will be expanded to include not only full target strings but also `#[cfg]` expressions: ```toml @@ -160,7 +160,7 @@ may not always quite get there. # Unresolved questions [unresolved]: #unresolved-questions -* This is no the only change that's known to Cargo which is known to not be +* This is not the only change that's known to Cargo which is known to not be forwards-compatible, so it may be best to lump them all together into one Cargo release instead of releasing them over time, but should this be blocked on those ideas? (note they have not been formed into an RFC yet) From b29c3f7420fb7f7d3c4e96be7cea35fb23e680ad Mon Sep 17 00:00:00 2001 From: Alex Burka Date: Tue, 22 Dec 2015 16:37:56 -0500 Subject: [PATCH 0652/1195] Amend RFC 1270 to describe actual implementation Change `reason` to `note` and remove `use`. 
--- text/1270-deprecation.md | 23 +++++++++++------------ 1 file changed, 11 insertions(+), 12 deletions(-) diff --git a/text/1270-deprecation.md b/text/1270-deprecation.md index 9aeac97ecdb..ebd327adaaf 100644 --- a/text/1270-deprecation.md +++ b/text/1270-deprecation.md @@ -6,7 +6,7 @@ # Summary This RFC proposes to allow library authors to use a `#[deprecated]` attribute, -with optional `since = "`*version*`"` and `reason = "`*free text*`"`fields. The +with optional `since = "`*version*`"` and `note = "`*free text*`"`fields. The compiler can then warn on deprecated items, while `rustdoc` can document their deprecation accordingly. @@ -32,12 +32,11 @@ possible fields are optional: deprecating the item, following the semver scheme. Rustc does not know about versions, thus the content of this field is not checked (but will be by external lints, e.g. [rust-clippy](https://github.com/Manishearth/rust-clippy). -* `reason` should contain a human-readable string outlining the reason for -deprecating the item. While this field is not required, library authors are -strongly advised to make use of it to convey the reason for the deprecation to -users of their library. The string is interpreted as plain unformatted text -(for now) so that rustdoc can include it in the item's documentation without -messing up the formatting. +* `note` should contain a human-readable string outlining the reason for +deprecating the item and/or what to use instead. While this field is not required, +library authors are strongly advised to make use of it. The string is interpreted +as plain unformatted text (for now) so that rustdoc can include it in the item's +documentation without messing up the formatting. On use of a *deprecated* item, `rustc` will `warn` of the deprecation. Note that during Cargo builds, warnings on dependencies get silenced. While this has @@ -51,7 +50,7 @@ to warn on use of deprecated items in library crates, however this is outside the scope of this RFC. 
`rustdoc` will show deprecation on items, with a `[deprecated]` box that may -optionally show the version and reason where available. +optionally show the version and note where available. The language reference will be extended to describe this feature as outlined in this RFC. Authors shall be advised to leave their users enough time to react @@ -74,8 +73,8 @@ prefix to the `Foo` type: ``` extern crate rust_foo; -#[deprecated(since = "0.2.1", use="rust_foo::Foo", - reason="The rust_foo version is more advanced, and this crates' will likely be discontinued")] +#[deprecated(since = "0.2.1", + note="The rust_foo version is more advanced, and this crate's will likely be discontinued")] struct Foo { .. } ``` @@ -83,7 +82,7 @@ Users of her crate will see the following once they `cargo update` and `build`: ``` src/foo_use.rs:27:5: 27:8 warning: Foo is marked deprecated as of version 0.2.1 -src/foo_use.rs:27:5: 27:8 note: The rust_foo version is more advanced, and this crates' will likely be discontinued +src/foo_use.rs:27:5: 27:8 note: The rust_foo version is more advanced, and this crate's will likely be discontinued ``` Rust-clippy will likely gain more sophisticated checks for deprecation: @@ -108,7 +107,7 @@ deprecation checks. * make the `since` field required and check that it's a single version * require either `reason` or `use` be present * `reason` could include markdown formatting -* rename the `reason` field to `note` to clarify it's broader usage. +* rename the `reason` field to `note` to clarify its broader usage. 
(**done!**) * add a `note` field and make `reason` a field with specific meaning, perhaps even predefine a number of valid reason strings, as JEP277 currently does * Add a `use` field containing a plain text of what to use instead From 51517e5135f12d4b5045b83edde65a660eabbef9 Mon Sep 17 00:00:00 2001 From: Simon Sapin Date: Mon, 28 Dec 2015 18:16:58 +0000 Subject: [PATCH 0653/1195] String/Vec::replace_range(RangeArgument, IntoIterator) --- text/0000-replace-slice.md | 209 +++++++++++++++++++++++++++++++++++++ 1 file changed, 209 insertions(+) create mode 100644 text/0000-replace-slice.md diff --git a/text/0000-replace-slice.md b/text/0000-replace-slice.md new file mode 100644 index 00000000000..cf3a2382108 --- /dev/null +++ b/text/0000-replace-slice.md @@ -0,0 +1,209 @@ +- Feature Name: replace-slice +- Start Date: 2015-12-28 +- RFC PR: +- Rust Issue: + +# Summary +[summary]: #summary + +Add a `replace_slice` method to `Vec` and `String` that removes a range of elements, +and replaces it in place with a given sequence of values. +The new sequence does not necessarily have the same length as the range it replaces. + +# Motivation +[motivation]: #motivation + +An implementation of this operation is either slow or dangerous. + +The slow way uses `Vec::drain`, and then `Vec::insert` repeatedly. +The latter part takes quadratic time: +potentially many elements after the replaced range are moved by one offset +potentially many times, once for each new element. + +The dangerous way, detailed below, takes linear time +but involves unsafely moving generic values with `std::ptr::copy`. +This is non-trivial `unsafe` code, where a bug could lead to double-dropping elements +or exposing uninitialized elements. +(Or for `String`, breaking the UTF-8 invariant.) +It therefore benefits from having a shared, carefully-reviewed implementation +rather than leaving it to every potential user to do it themselves.
+ +While it could be an external crate on crates.io, +this operation is general-purpose enough that I think it belongs in the standard library, +similar to `Vec::drain`. + +# Detailed design +[design]: #detailed-design + +An example implementation is below. + +The proposal is to have inherent methods instead of extension traits. +(Traits are used to make this testable outside of `std` +and to make a point in Unresolved Questions below.) + +```rust +#![feature(collections, collections_range, str_char)] + +extern crate collections; + +use collections::range::RangeArgument; +use std::ptr; + +trait ReplaceVecSlice<T> { + fn replace_slice<R, I>(&mut self, range: R, iterable: I) + where R: RangeArgument<usize>, I: IntoIterator<Item=T>, I::IntoIter: ExactSizeIterator; +} + +impl<T> ReplaceVecSlice<T> for Vec<T> { + fn replace_slice<R, I>(&mut self, range: R, iterable: I) + where R: RangeArgument<usize>, I: IntoIterator<Item=T>, I::IntoIter: ExactSizeIterator + { + let len = self.len(); + let range_start = *range.start().unwrap_or(&0); + let range_end = *range.end().unwrap_or(&len); + assert!(range_start <= range_end); + assert!(range_end <= len); + let mut iter = iterable.into_iter(); + // Overwrite range + for i in range_start..range_end { + if let Some(new_element) = iter.next() { + unsafe { + *self.get_unchecked_mut(i) = new_element + } + } else { + // Iterator shorter than range + self.drain(i..range_end); + return + } + } + // Insert rest + let iter_len = iter.len(); + let elements_after = len - range_end; + let free_space_start = range_end; + let free_space_end = free_space_start + iter_len; + + // FIXME: merge the reallocating case with the first ptr::copy below? + self.reserve(iter_len); + + let p = self.as_mut_ptr(); + unsafe { + // In case iter.next() panics, leak some elements rather than risk double-freeing them. + self.set_len(free_space_start); + // Shift everything over to make space (duplicating some elements).
+ ptr::copy(p.offset(free_space_start as isize), + p.offset(free_space_end as isize), + elements_after); + for i in free_space_start..free_space_end { + if let Some(new_element) = iter.next() { + *self.get_unchecked_mut(i) = new_element + } else { + // Iterator shorter than its ExactSizeIterator::len() + ptr::copy(p.offset(free_space_end as isize), + p.offset(i as isize), + elements_after); + self.set_len(i + elements_after); + return + } + } + self.set_len(free_space_end + elements_after); + } + // Iterator longer than its ExactSizeIterator::len(), degenerate to quadratic time + for (new_element, i) in iter.zip(free_space_end..) { + self.insert(i, new_element); + } + } +} + +trait ReplaceStringSlice { + fn replace_slice<R>(&mut self, range: R, s: &str) where R: RangeArgument<usize>; +} + +impl ReplaceStringSlice for String { + fn replace_slice<R>(&mut self, range: R, s: &str) where R: RangeArgument<usize> { + if let Some(&start) = range.start() { + assert!(self.is_char_boundary(start)); + } + if let Some(&end) = range.end() { + assert!(self.is_char_boundary(end)); + } + unsafe { + self.as_mut_vec() + }.replace_slice(range, s.bytes()) + } +} + +#[test] +fn it_works() { + let mut v = vec![1, 2, 3, 4, 5]; + v.replace_slice(2..4, [10, 11, 12].iter().cloned()); + assert_eq!(v, &[1, 2, 10, 11, 12, 5]); + v.replace_slice(1..3, Some(20)); + assert_eq!(v, &[1, 20, 11, 12, 5]); + let mut s = "Hello, world!".to_owned(); + s.replace_slice(7.., "世界!"); + assert_eq!(s, "Hello, 世界!"); +} + +#[test] +#[should_panic] +fn char_boundary() { + let mut s = "Hello, 世界!".to_owned(); + s.replace_slice(..8, "") +} +``` + +This implementation defends against `ExactSizeIterator::len()` being incorrect. +If `len()` is too high, it reserves more capacity than necessary +and does more copying than necessary, +but stays in linear time. +If `len()` is too low, the algorithm degenerates to quadratic time +using `Vec::insert` for each additional new element.
+ +# Drawbacks +[drawbacks]: #drawbacks + +Same as for any addition to `std`: +not every program needs it, and standard library growth has a maintenance cost. + +# Alternatives +[alternatives]: #alternatives + +* Status quo: leave it to everyone who wants this to do it the slow way or the dangerous way. +* Publish a crate on crates.io. + Individual crates tend to be not very discoverable, + so this situation would not be so different from the status quo. + +# Unresolved questions +[unresolved]: #unresolved-questions + +* Should the `ExactSizeIterator` bound be removed? + The lower bound of `Iterator::size_hint` could be used instead of `ExactSizeIterator::len`, + but the degenerate quadratic time case would become “normal”. + With `ExactSizeIterator` it only happens when `ExactSizeIterator::len` is incorrect + which means that someone is doing something wrong. + +* Alternatively, should `replace_slice` panic when `ExactSizeIterator::len` is incorrect? + +* It would be nice to be able to `Vec::replace_slice` with a slice + without writing `.iter().cloned()` explicitly. + This is possible with the same trick as for the `Extend` trait + ([RFC 839](https://github.com/rust-lang/rfcs/blob/master/text/0839-embrace-extend-extinguish.md)): + accept iterators of `&T` as well as iterators of `T`: + + ```rust + impl<'a, T: 'a> ReplaceVecSlice<&'a T> for Vec<T> where T: Copy { + fn replace_slice<R, I>(&mut self, range: R, iterable: I) + where R: RangeArgument<usize>, I: IntoIterator<Item=&'a T>, I::IntoIter: ExactSizeIterator + { + self.replace_slice(range, iterable.into_iter().cloned()) + } + } + ``` + + However, this trick cannot be used with an inherent method instead of a trait. + (By the way, what was the motivation for `Extend` being a trait rather than inherent methods, + before RFC 839?) + +* Naming. + I accidentally typed `replace_range` instead of `replace_slice` several times + while typing up this RFC.
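Reviewer note on the patch above: the semantics that `replace_slice` is meant to have can be pinned down with a small safe-code reference. This is only a sketch under stated assumptions: `replace_slice_safe` is a hypothetical free function (the name and the explicit `start`/`end` parameters are not from the RFC), and it trades the in-place shifting of the proposed `unsafe` implementation for a temporary buffer holding the tail. It stays in linear time and needs no `ExactSizeIterator` bound.

```rust
/// Hypothetical helper, not part of the RFC: replaces `v[start..end]` with
/// the items yielded by `new`, using only safe `Vec` methods.
fn replace_slice_safe<T, I>(v: &mut Vec<T>, start: usize, end: usize, new: I)
where
    I: IntoIterator<Item = T>,
{
    assert!(start <= end);
    assert!(end <= v.len());
    // Detach the elements after the replaced range into a temporary buffer.
    let tail = v.split_off(end);
    // Drop the elements inside the replaced range.
    v.truncate(start);
    // Append the replacement values, then put the saved tail back.
    v.extend(new);
    v.extend(tail);
}
```

The extra allocation for `tail` is the cost of avoiding `unsafe` here; the RFC's implementation avoids it by shifting elements in place with `ptr::copy`.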
From 5b5a56d9ea7682b478f6f264693e7cb3afc06cc1 Mon Sep 17 00:00:00 2001 From: Ticki Date: Mon, 28 Dec 2015 23:34:08 +0100 Subject: [PATCH 0654/1195] 'Contains' method for ranges --- text/0000-contains-method-for-ranges.md | 77 +++++++++++++++++++++++++ 1 file changed, 77 insertions(+) create mode 100644 text/0000-contains-method-for-ranges.md diff --git a/text/0000-contains-method-for-ranges.md b/text/0000-contains-method-for-ranges.md new file mode 100644 index 00000000000..a072fac9866 --- /dev/null +++ b/text/0000-contains-method-for-ranges.md @@ -0,0 +1,77 @@ +- Feature Name: contains_method +- Start Date: 2015-12-28 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Implement a method, `contains()`, for `Range`, `RangeFrom`, and `RangeTo`, checking if a number is in the range. + +Note that the alternatives are just as important as the main proposal. + +# Motivation +[motivation]: #motivation + +The motivation behind this is simple: To be able to write simpler and more expressive code. This RFC introduces a "syntactic sugar" without doing so. + +# Detailed design +[design]: #detailed-design + +Implement a method, `contains()`, for `Range`, `RangeFrom`, and `RangeTo`. This method will check if a number is bound by the range. It will yield a boolean based on the condition defined by the range. + +The implementation is as follows (placed in libcore, and reexported by libstd): + +```rust +use core::ops::{Range, RangeTo, RangeFrom}; + +impl<Idx> Range<Idx> where Idx: PartialOrd { + fn contains(&self, item: Idx) -> bool { + self.start <= item && self.end > item + } +} + +impl<Idx> RangeTo<Idx> where Idx: PartialOrd { + fn contains(&self, item: Idx) -> bool { + self.end > item + } +} + +impl<Idx> RangeFrom<Idx> where Idx: PartialOrd { + fn contains(&self, item: Idx) -> bool { + self.start <= item + } +} + +``` + +# Drawbacks +[drawbacks]: #drawbacks + +Lacks of generics (see Alternatives).
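Reviewer note: the half-open semantics proposed above (start inclusive, end exclusive) can be tried outside of libcore with an extension trait. This is a sketch under assumptions, not the RFC's implementation: `ContainsItem` and `contains_item` are hypothetical names, chosen so they cannot clash with any inherent `contains` method on the range types.

```rust
use std::ops::{Range, RangeFrom, RangeTo};

/// Hypothetical extension trait standing in for the proposed inherent methods.
trait ContainsItem<Idx> {
    fn contains_item(&self, item: Idx) -> bool;
}

impl<Idx: PartialOrd> ContainsItem<Idx> for Range<Idx> {
    fn contains_item(&self, item: Idx) -> bool {
        // Start is inclusive, end is exclusive.
        self.start <= item && self.end > item
    }
}

impl<Idx: PartialOrd> ContainsItem<Idx> for RangeTo<Idx> {
    fn contains_item(&self, item: Idx) -> bool {
        // Only the exclusive upper bound applies.
        self.end > item
    }
}

impl<Idx: PartialOrd> ContainsItem<Idx> for RangeFrom<Idx> {
    fn contains_item(&self, item: Idx) -> bool {
        // Only the inclusive lower bound applies.
        self.start <= item
    }
}
```

Because the bound is only `PartialOrd`, the same code covers floats and other ordered types, not just integers; a trait of this shape is also roughly what the "Add a `Contains` trait" alternative would look like.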
+ +# Alternatives +[alternatives]: #alternatives + +## Add a `Contains` trait + +This trait provides the method `.contains()` and implements it for all the Range types. + +## Add a `.contains(item: Self::Item)` iterator method + +This method returns a boolean, telling if the iterator contains the item given as parameter. Using method specialization, this can achieve the same performance as the method suggested in this RFC. + +This is more flexible, and provides better performance (due to specialization) than just passing a closure comparing the items to an `any()` method. + +## Make `.any()` generic over a new trait + +Call this trait, `ItemPattern`. This trait is implemented for `Item` and `FnMut(Item) -> bool`. This is, in a sense, similar to `std::str::pattern::Pattern`. + +Then make `.any()` generic over this trait (`T: ItemPattern`) to allow `any()` taking `Self::Item` searching through the iterator for this particular value. + +This will not achieve the same performance as the other proposals. + +# Unresolved questions +[unresolved]: #unresolved-questions + +None. From a57d6c46c1f28d0a4467718b9c00c5ae5ade4400 Mon Sep 17 00:00:00 2001 From: Ticki Date: Mon, 28 Dec 2015 23:54:31 +0100 Subject: [PATCH 0655/1195] Fix mistake in the trait bound --- text/0000-contains-method-for-ranges.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-contains-method-for-ranges.md b/text/0000-contains-method-for-ranges.md index a072fac9866..0ee3d1cc91f 100644 --- a/text/0000-contains-method-for-ranges.md +++ b/text/0000-contains-method-for-ranges.md @@ -57,7 +57,7 @@ Lacks of generics (see Alternatives). This trait provides the method `.contains()` and implements it for all the Range types. -## Add a `.contains(item: Self::Item)` iterator method +## Add a `.contains<I: PartialEq<Self::Item>>(i: I)` iterator method This method returns a boolean, telling if the iterator contains the item given as parameter.
Using method specialization, this can achieve the same performance as the method suggested in this RFC. From b8625ae9f02dfdbe07e0b9f53ba82fc3974ca8d4 Mon Sep 17 00:00:00 2001 From: Nicholas Mazzuca Date: Mon, 28 Dec 2015 23:06:09 -0800 Subject: [PATCH 0656/1195] Fix most of nagisa's complaints --- text/0000-slice-copy-fill.md | 25 +++++++------------------ 1 file changed, 7 insertions(+), 18 deletions(-) diff --git a/text/0000-slice-copy-fill.md b/text/0000-slice-copy-fill.md index b2f9ca761f5..fb8a45a6752 100644 --- a/text/0000-slice-copy-fill.md +++ b/text/0000-slice-copy-fill.md @@ -1,4 +1,4 @@ -- Feature Name: std::slice::copy, slice::fill +- Feature Name: slice\_copy\_fill - Start Date: (fill me in with today's date, YYYY-MM-DD) - RFC PR: (leave this empty) - Rust Issue: (leave this empty) @@ -37,7 +37,7 @@ memset in all possible cases. It is defined behavior to call `fill` on a slice which has uninitialized members, and `self` is guaranteed to be fully filled afterwards. -`copy` panics if `src.len() != self.len()`, then `memcpy`s the members into +`copy_from` panics if `src.len() != self.len()`, then `memcpy`s the members into `self` from `src`. Calling `copy_from` is semantically equivalent to a `memcpy`; `self` can have uninitialized members, and `self` is guaranteed to be fully filled afterwards. This means, for example, that the following is fully defined: @@ -64,12 +64,6 @@ points.fill(1.0); // This is not lowered to a memset (However, it is lowered to // a simd loop, which is what a memset is, in reality) ``` -Also, `copy_from` has it's arguments in a different order from it's most similar -`unsafe` alternative, `std::ptr::copy_nonoverlapping`. This is due to an -unfortunate error that cannot be solved with the now stable -`copy_nonoverlapping`, and the design decision should not be extended to -`copy_from`. - # Alternatives [alternatives]: #alternatives @@ -77,22 +71,17 @@ We could name these functions something else. 
`fill`, for example, could be called `set`, `fill_from`, or `fill_with`. `copy_from` could be called `copy_to`, and have the order of the arguments -switched around. This is a bad idea, as `copy_from` has a natural connection to -`dst = src` syntax. +switched around. This would follow `ptr::copy_nonoverlapping` ordering, and not +`dst = src` or `.clone_from()` ordering. -`memcpy` is also pretty weird, here. Panicking if the lengths differ is -different from what came before; I believe it to be the safest path, because I -think I'd want to know, personally, if I'm passing the wrong lengths to copy. -However, `std::slice::bytes::copy_memory`, the function I'm basing this on, only -panics if `dst.len() < src.len()`. So... room for discussion, here. +`copy_from` could panic only if `dst.len() < src.len()`. This would be the same +as what came before, but we would also lose the guarantee that an uninitialized +slice would be fully initialized. `fill` and `copy_from` could both be free functions, and were in the original draft of this document. However, overwhelming support for these as methods has meant that these have become methods. -These are necessary, in my opinion. Much unsafe code has been written because -these do not exist. 
- # Unresolved questions [unresolved]: #unresolved-questions From 22d0deaf7c24710cb0e72c4a5d24c21547586bbc Mon Sep 17 00:00:00 2001 From: Simon Sapin Date: Tue, 29 Dec 2015 07:25:09 +0000 Subject: [PATCH 0657/1195] replace_slice: incorporate some feedback --- text/0000-replace-slice.md | 53 ++++++++++++++++++++++---------------- 1 file changed, 31 insertions(+), 22 deletions(-) diff --git a/text/0000-replace-slice.md b/text/0000-replace-slice.md index cf3a2382108..3ee9d2c966e 100644 --- a/text/0000-replace-slice.md +++ b/text/0000-replace-slice.md @@ -82,30 +82,32 @@ impl ReplaceVecSlice for Vec { let free_space_start = range_end; let free_space_end = free_space_start + iter_len; - // FIXME: merge the reallocating case with the first ptr::copy below? - self.reserve(iter_len); - - let p = self.as_mut_ptr(); - unsafe { - // In case iter.next() panics, leak some elements rather than risk double-freeing them. - self.set_len(free_space_start); - // Shift everything over to make space (duplicating some elements). - ptr::copy(p.offset(free_space_start as isize), - p.offset(free_space_end as isize), - elements_after); - for i in free_space_start..free_space_end { - if let Some(new_element) = iter.next() { - *self.get_unchecked_mut(i) = new_element - } else { - // Iterator shorter than its ExactSizeIterator::len() - ptr::copy(p.offset(free_space_end as isize), - p.offset(i as isize), - elements_after); - self.set_len(i + elements_after); - return + if iter_len > 0 { + // FIXME: merge the reallocating case with the first ptr::copy below? + self.reserve(iter_len); + + let p = self.as_mut_ptr(); + unsafe { + // In case iter.next() panics, leak some elements rather than risk double-freeing them. + self.set_len(free_space_start); + // Shift everything over to make space (duplicating some elements). 
+ ptr::copy(p.offset(free_space_start as isize), + p.offset(free_space_end as isize), + elements_after); + for i in free_space_start..free_space_end { + if let Some(new_element) = iter.next() { + *self.get_unchecked_mut(i) = new_element + } else { + // Iterator shorter than its ExactSizeIterator::len() + ptr::copy(p.offset(free_space_end as isize), + p.offset(i as isize), + elements_after); + self.set_len(i + elements_after); + return + } } + self.set_len(free_space_end + elements_after); } - self.set_len(free_space_end + elements_after); } // Iterator longer than its ExactSizeIterator::len(), degenerate to quadratic time for (new_element, i) in iter.zip(free_space_end..) { @@ -207,3 +209,10 @@ not every program needs it, and standard library growth has a maintainance cost. * Naming. I accidentally typed `replace_range` instead of `replace_slice` several times while typing up this RFC. + Update: I’m told `splice` is how this operation is called. + +* The method could return an iterator of the replaced elements. + Nothing would happen when the method is called, + only when the returned iterator is advanced or dropped. + There’s is precedent of this in `Vec::drain`, + though the input iterator being lazily consumed could be surprising. From 3ef78eb1a10db93d953bf80fdff4b26bfa28d482 Mon Sep 17 00:00:00 2001 From: Nicholas Mazzuca Date: Tue, 29 Dec 2015 00:32:15 -0800 Subject: [PATCH 0658/1195] Tone down the stuff about lowering to memset --- text/0000-slice-copy-fill.md | 17 ++++++----------- 1 file changed, 6 insertions(+), 11 deletions(-) diff --git a/text/0000-slice-copy-fill.md b/text/0000-slice-copy-fill.md index fb8a45a6752..25f554bdd88 100644 --- a/text/0000-slice-copy-fill.md +++ b/text/0000-slice-copy-fill.md @@ -32,10 +32,11 @@ impl [T] where T: Copy { } ``` -`fill` loops through slice, setting each member to value. This will lower to a -memset in all possible cases. 
It is defined behavior to call `fill` on a slice -which has uninitialized members, and `self` is guaranteed to be fully filled -afterwards. +`fill` loops through slice, setting each member to value. This will usually +lower to a memset in optimized builds. It is likely that this is only the +initial implementation, and will be optimized later to be almost as fast as, or +as fast as, memset. It is defined behavior to call `fill` on a slice which has +uninitialized members, and `self` is guaranteed to be fully filled afterwards. `copy_from` panics if `src.len() != self.len()`, then `memcpy`s the members into `self` from `src`. Calling `copy_from` is semantically equivalent to a `memcpy`; @@ -56,13 +57,7 @@ And the program will print 16 '8's. [drawbacks]: #drawbacks Two new methods on `slice`. `[T]::fill` *will not* be lowered to a `memset` in -any case where the bytes of `value` are not all the same, as in - -```rust -// let points: [f32; 16]; -points.fill(1.0); // This is not lowered to a memset (However, it is lowered to - // a simd loop, which is what a memset is, in reality) -``` +all cases. # Alternatives [alternatives]: #alternatives From 73caab53a7e58f2042f471992c785efade94f214 Mon Sep 17 00:00:00 2001 From: Simon Sapin Date: Tue, 29 Dec 2015 10:59:10 +0000 Subject: [PATCH 0659/1195] replace_slice -> insert? --- text/0000-replace-slice.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/text/0000-replace-slice.md b/text/0000-replace-slice.md index 3ee9d2c966e..ab9046f77f5 100644 --- a/text/0000-replace-slice.md +++ b/text/0000-replace-slice.md @@ -216,3 +216,7 @@ not every program needs it, and standard library growth has a maintainance cost. only when the returned iterator is advanced or dropped. There’s is precedent of this in `Vec::drain`, though the input iterator being lazily consumed could be surprising. 
+ +* If coherence rules and backward-compatibility allow it, + this functionality could be added to `Vec::insert` and `String::insert` + by overloading them / making them more generic. From 5a44898eb98344b8d81b3fd6381589199ba9a27e Mon Sep 17 00:00:00 2001 From: Simon Sapin Date: Tue, 29 Dec 2015 11:08:01 +0000 Subject: [PATCH 0660/1195] impl RangeArgument for usize? --- text/0000-replace-slice.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/text/0000-replace-slice.md b/text/0000-replace-slice.md index ab9046f77f5..c6437981df1 100644 --- a/text/0000-replace-slice.md +++ b/text/0000-replace-slice.md @@ -220,3 +220,7 @@ not every program needs it, and standard library growth has a maintainance cost. * If coherence rules and backward-compatibility allow it, this functionality could be added to `Vec::insert` and `String::insert` by overloading them / making them more generic. + This would probably require implementing `RangeArgument` for `usize` + representing an empty range, + though a range of length 1 would maybe make more sense for `Vec::drain` + (another user of `RangeArgument`). From ab63c4fc3a5661eb396d0fdf5852df2d9ed33b71 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?G=C3=A1bor=20Lehel?= Date: Tue, 29 Dec 2015 22:10:42 +0100 Subject: [PATCH 0661/1195] RFC update, first pass: * Monomorphize to `Result`, and move all `Carrier` stuff to a "Future possibilities" section. * Move `throw` and `throws` to "Future possibilities". * Move the irrefutable-pattern form of `catch` to "Future possibilities". * Early-exit using `break` instead of `return`. * Rename `Carrier` to `ResultCarrier`. * Miscellaneous other improvements. 
--- active/0000-trait-based-exception-handling.md | 669 ++++++++---------- 1 file changed, 280 insertions(+), 389 deletions(-) diff --git a/active/0000-trait-based-exception-handling.md b/active/0000-trait-based-exception-handling.md index 2ef64c818ac..202d6058593 100644 --- a/active/0000-trait-based-exception-handling.md +++ b/active/0000-trait-based-exception-handling.md @@ -5,29 +5,22 @@ # Summary -Add sugar for working with existing algebraic datatypes such as `Result` and -`Option`. Put another way, use types such as `Result` and `Option` to model -common exception handling constructs. - -Add a trait which precisely spells out the abstract interface and requirements -for such types. +Add syntactic sugar for working with the `Result` type which models common exception handling constructs. The new constructs are: - * An `?` operator for explicitly propagating exceptions. - - * A `try`..`catch` construct for conveniently catching and handling exceptions. + * An `?` operator for explicitly propagating "exceptions". - * (Potentially) a `throw` operator, and `throws` sugar for function signatures. + * A `try`..`catch` construct for conveniently catching and handling "exceptions". -The idea for the `?` operator originates from [RFC PR 204][204] by @aturon. +The idea for the `?` operator originates from [RFC PR 204][204] by [@aturon](https://github.com/aturon). [204]: https://github.com/rust-lang/rfcs/pull/204 # Motivation and overview -Rust currently uses algebraic `enum` types `Option` and `Result` for error +Rust currently uses the `enum Result` type for error handling. This solution is simple, well-behaved, and easy to understand, but often gnarly and inconvenient to work with. We would like to solve the latter problem while retaining the other nice properties and avoiding duplication of @@ -35,10 +28,9 @@ functionality. 
We can accomplish this by adding constructs which mimic the exception-handling constructs of other languages in both appearance and behavior, while improving -upon them in typically Rustic fashion. These constructs are well-behaved in a -very precise sense and their meaning can be specified by a straightforward -source-to-source translation into existing language constructs (plus a very -simple and obvious new one). (They may also, but need not necessarily, be +upon them in typically Rustic fashion. Their meaning can be specified by a straightforward +source-to-source translation into existing language constructs, plus a very +simple and obvious new one. (They may also, but need not necessarily, be implemented in this way.) These constructs are strict additions to the existing language, and apart from @@ -47,17 +39,15 @@ programs is entirely unaffected. The most important additions are a postfix `?` operator for propagating "exceptions" and a `try`..`catch` block for catching and handling them. By an -"exception", we more or less just mean the `None` variant of an `Option` or the -`Err` variant of a `Result`. (See the "Detailed design" section for more +"exception", we essentially just mean the `Err` variant of a `Result`. (See the "Detailed design" section for more precision.) + ## `?` operator -The postfix `?` operator can be applied to expressions of types like `Option` -and `Result` which contain either a "success" or an "exception" value, and can -be thought of as a generalization of the current `try! { }` macro. It either -returns the "success" value directly, or performs an early exit and propagates -the "exception" value further out. (So given `my_result: Result<Foo, Bar>`, we +The postfix `?` operator can be applied to `Result` values and is equivalent to the current `try!()` macro. It either +returns the `Ok` value directly, or performs an early exit and propagates +the `Err` value further out. (So given `my_result: Result<Foo, Bar>`, we have `my_result?: Foo`.)
This allows it to be used for e.g. conveniently chaining method calls which may each "throw an exception": @@ -68,15 +58,13 @@ chaining method calls which may each "throw an exception": When used outside of a `try` block, the `?` operator propagates the exception to the caller of the current function, just like the current `try!` macro does. (If -the return type of the function isn't one, like `Result`, that's capable of -carrying the exception, then this is a type error.) When used inside a `try` +the return type of the function isn't a `Result`, then this is a type error.) When used inside a `try` block, it propagates the exception up to the innermost `try` block, as one would expect. Requiring an explicit `?` operator to propagate exceptions strikes a very pleasing balance between completely automatic exception propagation, which most -languages have, and completely manual propagation, which we currently have -(apart from the `try!` macro to lessen the pain). It means that function calls +languages have, and completely manual propagation, which we'd have apart from the `try!` macro. It means that function calls remain simply function calls which return a result to their caller, with no magic going on behind the scenes; and this also *increases* flexibility, because one gets to choose between propagation with `?` or consuming the returned @@ -86,11 +74,12 @@ The `?` operator itself is suggestive, syntactically lightweight enough to not be bothersome, and lets the reader determine at a glance where an exception may or may not be thrown. It also means that if the signature of a function changes with respect to exceptions, it will lead to type errors rather than silent -behavior changes, which is always a good thing. Finally, because exceptions are -tracked in the type system, there is no silent propagation of exceptions, and +behavior changes, which is a good thing. 
Finally, because exceptions are +tracked in the type system, and there is no silent propagation of exceptions, and all points where an exception may be thrown are readily apparent visually, this also means that we do not have to worry very much about "exception safety". + ## `try`..`catch` Like most other things in Rust, and unlike other languages that I know of, @@ -100,28 +89,18 @@ thrown, it is passed to the `catch` block, and the `try`..`catch` evaluates to the value of the `catch` block. As with `if`..`else` expressions, the types of the `try` and `catch` blocks must therefore unify. Unlike other languages, only a single type of exception may be thrown in the `try` block (a `Result` only has -a single `Err` type); and there may only be a single `catch` block, which -catches all exceptions. This dramatically simplifies matters and allows for nice -properties. - -There are two variations on the `try`..`catch` theme, each of which is more -convenient in different circumstances. - - 1. `try { EXPR } catch IRR-PAT { EXPR }` +a single `Err` type); all exceptions are always caught; and there may only be one `catch` block. This dramatically simplifies thinking about the behavior of exception-handling code. - For example: +There are two variations on this theme: - try { - foo()?.bar()? - } catch e { - let x = baz(e); - quux(x, e); - } + 1. `try { EXPR }` - Here the caught exception is bound to an irrefutable pattern immediately - following the `catch`. - This form is convenient when one does not wish to do case analysis on the - caught exception. + In this case the `try` block evaluates directly to a `Result` + containing either the value of `EXPR`, or the exception which was thrown. + For instance, `try { foo()? }` is essentially equivalent to `foo()`. + This can be useful if you want to coalesce *multiple* potential exceptions - + `try { foo()?.bar()?.baz()? }` - into a single `Result`, which you wish to + then e.g. 
pass on as-is to another function, rather than analyze yourself. 2. `try { EXPR } catch { PAT => EXPR, PAT => EXPR, ... }` @@ -134,88 +113,10 @@ convenient in different circumstances. Blue(bex) => quux(bex) } - Here the `catch` is not immediately followed by a pattern; instead, its body + Here the `catch` performs a `match` on the caught exception directly, using any number of - refutable patterns. - This form is convenient when one *does* wish to do case analysis on the - caught exception. - -While it may appear to be extravagant to provide both forms, there is reason to -do so: either form on its own leads to unavoidable rightwards drift under some -circumstances. - -The first form leads to rightwards drift if one wishes to `match` on the caught -exception: - - try { - foo()?.bar()? - } catch e { - match e { - Red(rex) => baz(rex), - Blue(bex) => quux(bex) - } - } - -This `match e` is quite redundant and unfortunate. - -The second form leads to rightwards drift if one wishes to do more complex -multi-statement work with the caught exception: - - try { - foo()?.bar()? - } catch { - e => { - let x = baz(e); - quux(x, e); - } - } - -This single case arm is quite redundant and unfortunate. - -Therefore, neither form can be considered strictly superior to the other, and it -is preferable to simply provide both. - -Finally, it is also possible to write a `try` block *without* a `catch` block: - - 3. `try { EXPR }` - - In this case the `try` block evaluates directly to a `Result`-like type - containing either the value of `EXPR`, or the exception which was thrown. - For instance, `try { foo()? }` is essentially equivalent to `foo()`. - This can be useful if you want to coalesce *multiple* potential exceptions - - `try { foo()?.bar()?.baz()? }` - into a single `Result`, which you wish to - then e.g. pass on as-is to another function, rather than analyze yourself. 
- -## (Optional) `throw` and `throws` - -It is possible to carry the exception handling analogy further and also add -`throw` and `throws` constructs. - -`throw` is very simple: `throw EXPR` is essentially the same thing as -`Err(EXPR)?`; in other words it throws the exception `EXPR` to the innermost -`try` block, or to the function's caller if there is none. - -A `throws` clause on a function: - - fn foo(arg; Foo) -> Bar throws Baz { ... } - -would do two things: - - * Less importantly, it would make the function polymorphic over the - `Result`-like type used to "carry" exceptions. - - * More importantly, it means that instead of writing `return Ok(foo)` and - `return Err(bar)` in the body of the function, one would write `return foo` - and `throw bar`, and these are implicitly embedded as the "success" or - "exception" value in the carrier type. This removes syntactic overhead from - both "normal" and "throwing" code paths and (apart from `?` to propagate - exceptions) matches what code might look like in a language with native - exceptions. - -(This could potentially be extended to allow writing `throws` clauses on `fn` -and closure *types*, desugaring to a type parameter with a `Carrier` bound on -the parent item (e.g. a HOF), but this would be considerably more involved, and -it's not clear whether there is value in doing so.) + refutable patterns. This form is convenient for checking and handling the + caught exception directly. # Detailed design @@ -225,6 +126,7 @@ translation. We make use of an "early exit from any block" feature which doesn't currently exist in the language, generalizes the current `break` and `return` constructs, and is independently useful. + ## Early exit from any block The capability can be exposed either by generalizing `break` to take an optional @@ -233,14 +135,14 @@ value argument and break out of any block (not just loops), or by generalizing just the outermost block of the function. 
This feature is independently useful and I believe it should be added, but as it is only used here in this RFC as an explanatory device, and implementing the RFC does not require exposing it, I am -going to arbitrarily choose the `return` syntax for the following and won't +going to arbitrarily choose the `break` syntax for the following and won't discuss the question further. -So we are extending `return` with an optional lifetime argument: `return 'a -EXPR`. This is an expression of type `!` which causes an early return from the +So we are extending `break` with an optional value argument: `break 'a EXPR`. +This is an expression of type `!` which causes an early return from the enclosing block specified by `'a`, which then evaluates to the value `EXPR` (of course, the type of `EXPR` must unify with the type of the last expression in -that block). +that block). This works for any block, not only loops. A completely artificial example: @@ -248,7 +150,7 @@ A completely artificial example: let my_thing = if have_thing { get_thing() } else { - return 'a None + break 'a None }; println!("found thing: {}", my_thing); Some(my_thing) @@ -256,109 +158,9 @@ A completely artificial example: Here if we don't have a thing, we escape from the block early with `None`. -If no lifetime is specified, it defaults to returning from the whole function: -in other words, the current behavior. We can pretend there is a magical lifetime -`'fn` which refers to the outermost block of the current function, which is the -default. - -## The trait - -Here we specify the trait for types which can be used to "carry" either a normal -result or an exception. There are several different, completely equivalent ways -to formulate it, which differ only in the set of methods: for other -possibilities, see the appendix. 
- - #[lang(carrier)] - trait Carrier { - type Normal; - type Exception; - fn embed_normal(from: Normal) -> Self; - fn embed_exception(from: Exception) -> Self; - fn translate>(from: Self) -> Other; - } - -This trait basically just states that `Self` is isomorphic to -`Result` for some types `Normal` and `Exception`. For greater -clarity on how these methods work, see the section on `impl`s below. (For a -simpler formulation of the trait using `Result` directly, see the appendix.) - -The `translate` method says that it should be possible to translate to any -*other* `Carrier` type which has the same `Normal` and `Exception` types. This -can be used to inspect the value by translating to a concrete type such as -`Result` and then, for example, pattern matching on it. +If no value is specified, it defaults to `()`: in other words, the current behavior. +We can also imagine there is a magical lifetime `'fn` which refers to the lifetime of the whole function: in this case, `break 'fn` is equivalent to `return`. -Laws: - - 1. For all `x`, `translate(embed_normal(x): A): B ` = `embed_normal(x): B`. - 2. For all `x`, `translate(embed_exception(x): A): B ` = `embed_exception(x): B`. - 3. For all `carrier`, `translate(translate(carrier: A): B): A` = `carrier: A`. - -Here I've used explicit type ascription syntax to make it clear that e.g. the -types of `embed_` on the left and right hand sides are different. - -The first two laws say that embedding a result `x` into one carrier type and -then translating it to a second carrier type should be the same as embedding it -into the second type directly. - -The third law says that translating to a different carrier type and then -translating back should be the identity function. 
- - -## `impl`s of the trait - - impl Carrier for Result { - type Normal = T; - type Exception = E; - fn embed_normal(a: T) -> Result { Ok(a) } - fn embed_exception(e: E) -> Result { Err(e) } - fn translate>(result: Result) -> Other { - match result { - Ok(a) => Other::embed_normal(a), - Err(e) => Other::embed_exception(e) - } - } - } - -As we can see, `translate` can be implemented by deconstructing ourself and then -re-embedding the contained value into the other carrier type. - - impl Carrier for Option { - type Normal = T; - type Exception = (); - fn embed_normal(a: T) -> Option { Some(a) } - fn embed_exception(e: ()) -> Option { None } - fn translate>(option: Option) -> Other { - match option { - Some(a) => Other::embed_normal(a), - None => Other::embed_exception(()) - } - } - } - -Potentially also: - - impl Carrier for bool { - type Normal = (); - type Exception = (); - fn embed_normal(a: ()) -> bool { true } - fn embed_exception(e: ()) -> bool { false } - fn translate>(b: bool) -> Other { - match b { - true => Other::embed_normal(()), - false => Other::embed_exception(()) - } - } - } - -The laws should be sufficient to rule out any "icky" impls. For example, an impl -for `Vec` where an exception is represented as the empty vector, and a normal -result as a single-element vector: here the third law fails, because if the -`Vec` has more than one element *to begin with*, then it's not possible to -translate to a different carrier type and then back without losing information. - -The `bool` impl may be surprising, or not useful, but it *is* well-behaved: -`bool` is, after all, isomorphic to `Result<(), ()>`. This `impl` may be -included or not; I don't have a strong opinion about it. ## Definition of constructs @@ -372,34 +174,21 @@ constructs, and a "deep" one which is "fully expanded". Of course, these could be defined in many equivalent ways: the below definitions are merely one way. 
- * Construct: - - throw EXPR - - Shallow: - - return 'here Carrier::embed_exception(EXPR) - - Where `'here` refers to the innermost enclosing `try` block, or to `'fn` if - there is none. As with `return`, `EXPR` may be omitted and defaults to `()`. - * Construct: EXPR? Shallow: - match translate(EXPR) { + match EXPR { Ok(a) => a, - Err(e) => throw e + Err(e) => break 'here Err(e) } - Deep: + Where `'here` refers to the innermost enclosing `try` block, or to `'fn` if + there is none. - match translate(EXPR) { - Ok(a) => a, - Err(e) => return 'here Carrier::embed_exception(e) - } + The `?` operator has the same precedence as `.`. * Construct: @@ -410,45 +199,16 @@ are merely one way. Shallow: 'here: { - Carrier::embed_normal(foo()?.bar()) + Ok(foo()?.bar()) } Deep: 'here: { - Carrier::embed_normal(match translate(foo()) { - Ok(a) => a, - Err(e) => return 'here Carrier::embed_exception(e) - }.bar()) - } - - * Construct: - - try { - foo()?.bar() - } catch e { - baz(e) - } - - Shallow: - - match try { - foo()?.bar() - } { - Ok(a) => a, - Err(e) => baz(e) - } - - Deep: - - match 'here: { - Carrier::embed_normal(match translate(foo()) { + Ok(match foo() { Ok(a) => a, - Err(e) => return 'here Carrier::embed_exception(e) + Err(e) => break 'here Err(e) }.bar()) - } { - Ok(a) => a, - Err(e) => baz(e) } * Construct: @@ -460,12 +220,13 @@ are merely one way. B(b) => quux(b) } - Shallow: + Shallow: - try { + match (try { foo()?.bar() - } catch e { - match e { + }) { + Ok(a) => a, + Err(e) => match e { A(a) => baz(a), B(b) => quux(b) } @@ -474,9 +235,9 @@ are merely one way. Deep: match 'here: { - Carrier::embed_normal(match translate(foo()) { + Ok(match foo() { Ok(a) => a, - Err(e) => return 'here Carrier::embed_exception(e) + Err(e) => break 'here Err(e) }.bar()) } { Ok(a) => a, @@ -486,38 +247,6 @@ are merely one way. 
} } - * Construct: - - fn foo(A) -> B throws C { - CODE - } - - Shallow: - - fn foo>(A) -> Car { - try { - 'fn: { - CODE - } - } - } - - Deep: - - fn foo>(A) -> Car { - 'here: { - Carrier::embed_normal('fn: { - CODE - }) - } - } - - (Here our desugaring runs into a stumbling block, and we resort to a pun: the - *whole function* should be conceptually wrapped in a `try` block, and a - `return` inside `CODE` should be embedded as a successful result into the - carrier, rather than escaping from the `try` block itself. We suggest this by - putting the "magical lifetime" `'fn` *inside* the `try` block.) - The fully expanded translations get quite gnarly, but that is why it's good that you don't have to write them! @@ -528,78 +257,47 @@ of their definitions. a source-to-source translation in this manner, they need not necessarily be *implemented* this way.) + ## Laws -Without any attempt at completeness, and modulo `translate()` between different -carrier types, here are some things which should be true: +Without any attempt at completeness, here are some things which should be true: * `try { foo() } ` = `Ok(foo())` - * `try { throw e } ` = `Err(e)` + * `try { Err(e)? } ` = `Err(e)` * `try { foo()? } ` = `foo()` * `try { foo() } catch e { e }` = `foo()` - * `try { throw e } catch e { e }` = `e` + * `try { Err(e)? } catch e { e }` = `e` * `try { Ok(foo()?) } catch e { Err(e) }` = `foo()` -## Misc - - * Our current lint for unused results could be replaced by one which warns for - any unused result of a type which implements `Carrier`. - - * If there is ever ambiguity due to the carrier type being underdetermined - (experience should reveal whether this is a problem in practice), we could - resolve it by defaulting to `Result`. (This would presumably involve making - `Result` a lang item.) - - * Translating between different carrier types with the same `Normal` and - `Exception` types *should*, but may not necessarily *currently* be, a no-op - most of the time. 
- - We should make it so that: - - * repr(`Option`) = repr(`Result`) - * repr(`bool`) = repr(`Option<()>`) = repr(`Result<(), ()>`) - - If these hold, then `translate` between these types could in theory be - compiled down to just a `transmute`. (Whether LLVM is smart enough to do - this, I don't know.) - - * The `translate()` function smells to me like a natural transformation between - functors, but I'm not category theorist enough for it to be obvious. - # Drawbacks - * Adds new constructs to the language. + * Increases the syntactic surface area of the language. - * Some people have a philosophical objection to "there's more than one way to - do it". + * No expressivity is added, only convenience. Some object to "there's more than one way to do it" on principle. - * Relative to first-class checked exceptions, our implementation options are - constrained: while actual checked exceptions could be implemented in a - similar way to this proposal, they could also be implemented using unwinding, - should we choose to do so, and we do not realistically have that option here. + * If at some future point we were to add higher-kinded types and syntactic sugar + for monads, a la Haskell's `do` or Scala's `for`, their functionality may overlap and result in redundancy. + However, a number of challenges would have to be overcome for a generic monadic sugar to be able to + fully supplant these features: the integration of higher-kinded types into Rust's type system in the + first place, the shape of a `Monad` `trait` in a language with lifetimes and move semantics, + interaction between the monadic control flow and Rust's native control flow (the "ambient monad"), + automatic upcasting of exception types via `Into` (the exception (`Either`, `Result`) monad normally does not + do this, and it's not clear whether it can), and potentially others. # Alternatives - * Do nothing. + * Don't. - * Only add the `?` operator, but not any of the other constructs. 
+ * Only add the `?` operator, but not `try`..`catch`. * Instead of a built-in `try`..`catch` construct, attempt to define one using macros. However, this is likely to be awkward because, at least, macros may only have their contents as a single block, rather than two. Furthermore, macros are excellent as a "safety net" for features which we forget to add - to the language itself, or which only have specialized use cases; but after - seeing this proposal, we need not forget `try`..`catch`, and its prevalence - in nearly every existing language suggests that it is, in fact, generally - useful. - - * Instead of a general `Carrier` trait, define everything directly in terms of - `Result`. This has precedent in that, for example, the `if`..`else` construct - is also defined directly in terms of `bool`. (However, this would likely also - lead to removing `Option` from the standard library in favor of - `Result<_, ()>`.) + to the language itself, or which only have specialized use cases; but generally + useful control flow constructs still work better as language features. * Add [first-class checked exceptions][notes], which are propagated automatically (without an `?` operator). @@ -615,27 +313,220 @@ carrier types, here are some things which should be true: [notes]: https://github.com/glaebhoerl/rust-notes/blob/268266e8fbbbfd91098d3bea784098e918b42322/my_rfcs/Exceptions.txt + * Wait (and hope) for HKTs and generic monad sugar. + + +# Future possibilities + +## An additional `catch` form to bind the caught exception irrefutably + +The `catch` described above immediately passes the caught exception into a `match` block. +It may sometimes be desirable to instead bind it directly to a single variable. That might +look like this: + + try { EXPR } catch IRR-PAT { EXPR } + +Where `catch` is followed by any irrefutable pattern (as with `let`). + +For example: + + try { + foo()?.bar()? 
+ } catch e { + let x = baz(e); + quux(x, e); + } + +While it may appear to be extravagant to provide both forms, there is reason to +do so: either form on its own leads to unavoidable rightwards drift under some +circumstances. + +The first form leads to rightwards drift if one wishes to do more complex +multi-statement work with the caught exception: + + try { + foo()?.bar()? + } catch { + e => { + let x = baz(e); + quux(x, e); + } + } + +This single case arm is quite redundant and unfortunate. + +The second form leads to rightwards drift if one wishes to `match` on the caught +exception: + + try { + foo()?.bar()? + } catch e { + match e { + Red(rex) => baz(rex), + Blue(bex) => quux(bex) + } + } + +This `match e` is quite redundant and unfortunate. + +Therefore, neither form can be considered strictly superior to the other, and it +may be preferable to simply provide both. + + +## `throw` and `throws` + +It is possible to carry the exception handling analogy further and also add +`throw` and `throws` constructs. + +`throw` is very simple: `throw EXPR` is essentially the same thing as +`Err(EXPR)?`; in other words it throws the exception `EXPR` to the innermost +`try` block, or to the function's caller if there is none. + +A `throws` clause on a function: + + fn foo(arg: Foo) -> Bar throws Baz { ... } + +would mean that instead of writing `return Ok(foo)` and +`return Err(bar)` in the body of the function, one would write `return foo` +and `throw bar`, and these are implicitly turned into `Ok` or `Err` for the caller. This removes syntactic overhead from +both "normal" and "throwing" code paths and (apart from `?` to propagate +exceptions) matches what code might look like in a language with native +exceptions. 
+ + +## Generalize over `Result`, `Option`, and other result-carrying types + +`Option` is completely equivalent to `Result` modulo names, and many common APIs +use the `Option` type, so it would make sense to extend all of the above syntax to `Option`, +and other (potentially user-defined) equivalent-to-`Result` types, as well. + +This can be done by specifying a trait for types which can be used to "carry" either a normal +result or an exception. There are several different, equivalent ways +to formulate it, which differ in the set of methods provided, but the meaning in any case is essentially just +that you can choose some types `Normal` and `Exception` such that `Self` is isomorphic to `Result`. + +Here is one way: + + #[lang(result_carrier)] + trait ResultCarrier { + type Normal; + type Exception; + fn embed_normal(from: Normal) -> Self; + fn embed_exception(from: Exception) -> Self; + fn translate>(from: Self) -> Other; + } + +For greater clarity on how these methods work, see the section on `impl`s below. (For a +simpler formulation of the trait using `Result` directly, see further below.) + +The `translate` method says that it should be possible to translate to any +*other* `ResultCarrier` type which has the same `Normal` and `Exception` types. +This may not appear to be very useful, but in fact, this is what can be used to inspect the result, +by translating it to a concrete type such as `Result` and then, for example, pattern matching on it. + +Laws: + + 1. For all `x`, `translate(embed_normal(x): A): B ` = `embed_normal(x): B`. + 2. For all `x`, `translate(embed_exception(x): A): B ` = `embed_exception(x): B`. + 3. For all `carrier`, `translate(translate(carrier: A): B): A` = `carrier: A`. + +Here I've used explicit type ascription syntax to make it clear that e.g. the +types of `embed_` on the left and right hand sides are different. 
+ +The first two laws say that embedding a result `x` into one result-carrying type and +then translating it to a second result-carrying type should be the same as embedding it +into the second type directly. + +The third law says that translating to a different result-carrying type and then +translating back should be a no-op. + + +## `impl`s of the trait + + impl ResultCarrier for Result { + type Normal = T; + type Exception = E; + fn embed_normal(a: T) -> Result { Ok(a) } + fn embed_exception(e: E) -> Result { Err(e) } + fn translate>(result: Result) -> Other { + match result { + Ok(a) => Other::embed_normal(a), + Err(e) => Other::embed_exception(e) + } + } + } + +As we can see, `translate` can be implemented by deconstructing ourself and then +re-embedding the contained value into the other result-carrying type. + + impl ResultCarrier for Option { + type Normal = T; + type Exception = (); + fn embed_normal(a: T) -> Option { Some(a) } + fn embed_exception(e: ()) -> Option { None } + fn translate>(option: Option) -> Other { + match option { + Some(a) => Other::embed_normal(a), + None => Other::embed_exception(()) + } + } + } + +Potentially also: + + impl ResultCarrier for bool { + type Normal = (); + type Exception = (); + fn embed_normal(a: ()) -> bool { true } + fn embed_exception(e: ()) -> bool { false } + fn translate>(b: bool) -> Other { + match b { + true => Other::embed_normal(()), + false => Other::embed_exception(()) + } + } + } + +The laws should be sufficient to rule out any "icky" impls. For example, an impl +for `Vec` where an exception is represented as the empty vector, and a normal +result as a single-element vector: here the third law fails, because if the +`Vec` has more than one element *to begin with*, then it's not possible to +translate to a different result-carrying type and then back without losing information. 
+ +The `bool` impl may be surprising, or not useful, but it *is* well-behaved: +`bool` is, after all, isomorphic to `Result<(), ()>`. + +### Other miscellaneous notes about `ResultCarrier` -# Unresolved questions + * Our current lint for unused results could be replaced by one which warns for + any unused result of a type which implements `ResultCarrier`. - * What should the precedence of the `?` operator be? + * If there is ever ambiguity due to the result-carrying type being underdetermined + (experience should reveal whether this is a problem in practice), we could + resolve it by defaulting to `Result`. - * Should we add `throw` and/or `throws`? + * Translating between different result-carrying types with the same `Normal` and + `Exception` types *should*, but may not necessarily *currently* be, a + machine-level no-op most of the time. - * Should we have `impl Carrier for bool`? + We could/should make it so that: - * Should we also add the "early return from any block" feature along with this - proposal, or should that be considered separately? (If we add it: should we - do it by generalizing `break` or `return`?) + * repr(`Option`) = repr(`Result`) + * repr(`bool`) = repr(`Option<()>`) = repr(`Result<(), ()>`) + If these hold, then `translate` between these types could in theory be + compiled down to just a `transmute`. (Whether LLVM is smart enough to do + this, I don't know.) + + * The `translate()` function smells to me like a natural transformation between + functors, but I'm not category theorist enough for it to be obvious. -# Appendices -## Alternative formulations of the `Carrier` trait +### Alternative formulations of the `ResultCarrier` trait All of these have the form: - trait Carrier { + trait ResultCarrier { type Normal; type Exception; ...methods... @@ -643,7 +534,7 @@ All of these have the form: and differ only in the methods, which will be given. 
-### Explicit isomorphism with `Result` +#### Explicit isomorphism with `Result` fn from_result(Result) -> Self; fn to_result(Self) -> Result; @@ -653,7 +544,7 @@ This is, of course, the simplest possible formulation. The drawbacks are that it, in some sense, privileges `Result` over other potentially equivalent types, and that it may be less efficient for those types: for any non-`Result` type, every operation requires two method calls (one into -`Result`, and one out), whereas with the `Carrier` trait in the main text, they +`Result`, and one out), whereas with the `ResultCarrier` trait in the main text, they only require one. Laws: @@ -664,7 +555,7 @@ Laws: Laws for the remaining formulations below are left as an exercise for the reader. -### Avoid privileging `Result`, most naive version +#### Avoid privileging `Result`, most naive version fn embed_normal(Normal) -> Self; fn embed_exception(Exception) -> Self; @@ -675,7 +566,7 @@ reader. Of course this is horrible. -### Destructuring with HOFs (a.k.a. Church/Scott-encoding) +#### Destructuring with HOFs (a.k.a. Church/Scott-encoding) fn embed_normal(Normal) -> Self; fn embed_exception(Exception) -> Self; @@ -686,7 +577,7 @@ This is probably the right approach for Haskell, but not for Rust. With this formulation, because they each take ownership of them, the two closures may not even close over the same variables! -### Destructuring with HOFs, round 2 +#### Destructuring with HOFs, round 2 trait BiOnceFn { type ArgA; @@ -696,7 +587,7 @@ closures may not even close over the same variables! 
fn callB(Self, ArgB) -> Ret; } - trait Carrier { + trait ResultCarrier { type Normal; type Exception; fn normal(Normal) -> Self; From ef6bb5c4c4657746392f913157eaaab37edf3638 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?G=C3=A1bor=20Lehel?= Date: Tue, 29 Dec 2015 22:24:04 +0100 Subject: [PATCH 0662/1195] make clearer that early-exit-from-any-block is not proposed, mention it in "Future possibilities" --- active/0000-trait-based-exception-handling.md | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/active/0000-trait-based-exception-handling.md b/active/0000-trait-based-exception-handling.md index 202d6058593..74d1b46504b 100644 --- a/active/0000-trait-based-exception-handling.md +++ b/active/0000-trait-based-exception-handling.md @@ -132,9 +132,8 @@ constructs, and is independently useful. The capability can be exposed either by generalizing `break` to take an optional value argument and break out of any block (not just loops), or by generalizing `return` to take an optional lifetime argument and return from any block, not -just the outermost block of the function. This feature is independently useful -and I believe it should be added, but as it is only used here in this RFC as an -explanatory device, and implementing the RFC does not require exposing it, I am +just the outermost block of the function. This feature is only used in this RFC as an +explanatory device, and implementing the RFC does not require exposing it, so I am going to arbitrarily choose the `break` syntax for the following and won't discuss the question further. @@ -161,6 +160,8 @@ Here if we don't have a thing, we escape from the block early with `None`. If no value is specified, it defaults to `()`: in other words, the current behavior. We can also imagine there is a magical lifetime `'fn` which refers to the lifetime of the whole function: in this case, `break 'fn` is equivalent to `return`. 
+Again, this RFC does not propose generalizing `break` in this way at this time: it is only used as a way to explain the meaning of the constructs it does propose. + ## Definition of constructs @@ -318,6 +319,10 @@ Without any attempt at completeness, here are some things which should be true: # Future possibilities +## Expose a generalized form of `break` or `return` as described + +This RFC doesn't propose doing so at this time, but as it would be an independently useful feature, it could be added as well. + ## An additional `catch` form to bind the caught exception irrefutably The `catch` described above immediately passes the caught exception into a `match` block. From 4288a756fd2ce132ba6330a03db277dfd74123be Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?G=C3=A1bor=20Lehel?= Date: Wed, 30 Dec 2015 03:30:08 +0100 Subject: [PATCH 0663/1195] add syntax as an unresolved question --- active/0000-trait-based-exception-handling.md | 45 ++++++++++++++++--- 1 file changed, 38 insertions(+), 7 deletions(-) diff --git a/active/0000-trait-based-exception-handling.md b/active/0000-trait-based-exception-handling.md index 74d1b46504b..29e787ddb4f 100644 --- a/active/0000-trait-based-exception-handling.md +++ b/active/0000-trait-based-exception-handling.md @@ -146,7 +146,7 @@ that block). This works for any block, not only loops. A completely artificial example: 'a: { - let my_thing = if have_thing { + let my_thing = if have_thing() { get_thing() } else { break 'a None @@ -271,6 +271,37 @@ Without any attempt at completeness, here are some things which should be true: * `try { Ok(foo()?) } catch e { Err(e) }` = `foo()` +# Unresolved questions + +## Choice of keywords + +The RFC to this point uses the keywords `try`..`catch`, but there are a number of other possibilities, each with different advantages and drawbacks: + + * `try { ... } catch { ... }` + + * `try { ... } match { ... }` + + * `try { ... } handle { ... }` + + * `catch { ... } match { ... }` + + * `catch { ... 
} handle { ... }` + + * `catch ...` (without braces or a second clause) + +Among the considerations: + + * Simplicity. Brevity. + + * Following precedent from existing, popular languages, and familiarity with respect to analogous constructs in them. + + * Fidelity to the constructs' actual behavior. For instance, the first clause always catches the "exception"; the second only branches on it. + + * Consistency with the existing `try!()` macro. If the first clause is called `try`, then `try { }` and `try!()` would have essentially inverse meanings. + + * Language-level backwards compatibility when adding new keywords. I'm not sure how this could or should be handled. + + # Drawbacks * Increases the syntactic surface area of the language. @@ -291,7 +322,9 @@ Without any attempt at completeness, here are some things which should be true: * Don't. - * Only add the `?` operator, but not `try`..`catch`. + * Only add the `?` operator, but not `try` and `try`..`catch`. + + * Only add `?` and `try`, but not `try`..`catch`. * Instead of a built-in `try`..`catch` construct, attempt to define one using macros. However, this is likely to be awkward because, at least, macros may @@ -312,10 +345,10 @@ Without any attempt at completeness, here are some things which should be true: serious an issue this would actually be in practice, I don't know - there's reason to believe that it would be much less of one than in C++. -[notes]: https://github.com/glaebhoerl/rust-notes/blob/268266e8fbbbfd91098d3bea784098e918b42322/my_rfcs/Exceptions.txt - * Wait (and hope) for HKTs and generic monad sugar. +[notes]: https://github.com/glaebhoerl/rust-notes/blob/268266e8fbbbfd91098d3bea784098e918b42322/my_rfcs/Exceptions.txt + # Future possibilities @@ -377,7 +410,6 @@ This `match e` is quite redundant and unfortunate. Therefore, neither form can be considered strictly superior to the other, and it may be preferable to simply provide both. 
- ## `throw` and `throws` It is possible to carry the exception handling analogy further and also add @@ -398,11 +430,10 @@ both "normal" and "throwing" code paths and (apart from `?` to propagate exceptions) matches what code might look like in a language with native exceptions. - ## Generalize over `Result`, `Option`, and other result-carrying types `Option` is completely equivalent to `Result` modulo names, and many common APIs -use the `Option` type, so it would make sense to extend all of the above syntax to `Option`, +use the `Option` type, so it would be useful to extend all of the above syntax to `Option`, and other (potentially user-defined) equivalent-to-`Result` types, as well. This can be done by specifying a trait for types which can be used to "carry" either a normal From f5bf33127b1a1e3dfbb5016aa1284d0e81ea0f20 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?G=C3=A1bor=20Lehel?= Date: Wed, 30 Dec 2015 04:25:41 +0100 Subject: [PATCH 0664/1195] Add exception-upcasting with `Into` and minor other stuff --- active/0000-trait-based-exception-handling.md | 72 ++++++++++++++++--- 1 file changed, 61 insertions(+), 11 deletions(-) diff --git a/active/0000-trait-based-exception-handling.md b/active/0000-trait-based-exception-handling.md index 29e787ddb4f..abc2146cda7 100644 --- a/active/0000-trait-based-exception-handling.md +++ b/active/0000-trait-based-exception-handling.md @@ -53,8 +53,8 @@ chaining method calls which may each "throw an exception": foo()?.bar()?.baz() -(Naturally, in this case the types of the "exceptions thrown by" `foo()` and -`bar()` must unify.) +Naturally, in this case the types of the "exceptions thrown by" `foo()` and +`bar()` must unify. Like the current `try!()` macro, the `?` operator will also perform an implicit "upcast" on the exception type. When used outside of a `try` block, the `?` operator propagates the exception to the caller of the current function, just like the current `try!` macro does. 
(If @@ -79,6 +79,23 @@ tracked in the type system, and there is no silent propagation of exceptions, an all points where an exception may be thrown are readily apparent visually, this also means that we do not have to worry very much about "exception safety". +### Exception type upcasting + +In a language with checked exceptions and subtyping, it is clear that if a function is declared as throwing a particular type, its body should also be able to throw any of its subtypes. Similarly, in a language with structural sum types (a.k.a. anonymous `enum`s, polymorphic variants), one should be able to throw a type with fewer cases in a function declaring that it may throw a superset of those cases. This is essentially what is achieved by the common Rust practice of declaring a custom error `enum` with `From` `impl`s for each of the upstream error types which may be propagated: + + enum MyError { + IoError(io::Error), + JsonError(json::Error), + OtherError(...) + } + + impl From for MyError { ... } + impl From for MyError { ... } + +Here `io::Error` and `json::Error` can be thought of as subtypes of `MyError`, with a clear and direct embedding into the supertype. + +The `?` operator should therefore perform such an implicit conversion in the nature of a subtype-to-supertype coercion. The present RFC uses the `std::convert::Into` trait for this purpose (which has a blanket `impl` forwarding from `From`). The precise requirements for a conversion to be "like" a subtyping coercion are an open question; see the "Unresolved questions" section. + ## `try`..`catch` @@ -183,7 +200,7 @@ are merely one way. match EXPR { Ok(a) => a, - Err(e) => break 'here Err(e) + Err(e) => break 'here Err(e.into()) } Where `'here` refers to the innermost enclosing `try` block, or to `'fn` if @@ -208,7 +225,7 @@ are merely one way. 'here: { Ok(match foo() { Ok(a) => a, - Err(e) => break 'here Err(e) + Err(e) => break 'here Err(e.into()) }.bar()) } @@ -238,7 +255,7 @@ are merely one way. 
match 'here: { Ok(match foo() { Ok(a) => a, - Err(e) => break 'here Err(e) + Err(e) => break 'here Err(e.into()) }.bar()) } { Ok(a) => a, @@ -263,16 +280,19 @@ a source-to-source translation in this manner, they need not necessarily be Without any attempt at completeness, here are some things which should be true: - * `try { foo() } ` = `Ok(foo())` - * `try { Err(e)? } ` = `Err(e)` - * `try { foo()? } ` = `foo()` - * `try { foo() } catch e { e }` = `foo()` - * `try { Err(e)? } catch e { e }` = `e` - * `try { Ok(foo()?) } catch e { Err(e) }` = `foo()` + * `try { foo() } ` = `Ok(foo())` + * `try { Err(e)? } ` = `Err(e.into())` + * `try { try_foo()? } ` = `try_foo().map_err(Into::into)` + * `try { Err(e)? } catch { e => e }` = `e.into()` + * `try { Ok(try_foo()?) } catch { e => Err(e) }` = `try_foo().map_err(Into::into)` + +(In the above, `foo()` is a function returning any type, and `try_foo()` is a function returning a `Result`.) # Unresolved questions +These questions should be satisfactorily resolved before stabilizing the relevant features, at the latest. + ## Choice of keywords The RFC to this point uses the keywords `try`..`catch`, but there are a number of other possibilities, each with different advantages and drawbacks: @@ -302,6 +322,36 @@ Among the considerations: * Language-level backwards compatibility when adding new keywords. I'm not sure how this could or should be handled. +## Semantics for "upcasting" + +What should the contract for a `From`/`Into` `impl` be? Are these even the right `trait`s to use for this feature? + +Two obvious, minimal requirements are: + + * It should be pure: no side effects, and no observation of side effects. (The result should depend *only* on the argument.) + + * It should be total: no panics or other divergence, except perhaps in the case of resource exhaustion (OOM, stack overflow).
+ +The other requirements for an implicit conversion to be well-behaved in the context of this feature should be thought through with care. + +Some further thoughts and possibilities on this matter: + + * It should be "like a coercion from subtype to supertype", as described earlier. The precise meaning of this is not obvious. + + * A common condition on subtyping coercions is coherence: if you can compound-coerce to go from `A` to `Z` indirectly along multiple different paths, they should all have the same end result. + + * It should be unambiguous, or preserve the meaning of the input: `impl From for u32` as `x as u32` feels right; as `(x as u32) * 12345` feels wrong, even though this is perfectly pure, total, and injective. What this means precisely in the general case is unclear. + + * It should be lossless, or in other words, injective: it should map each observably-different element of the input type to observably-different elements of the output type. (Observably-different means that it is possible to write a program which behaves differently depending on which one it gets, modulo things that "shouldn't count" like observing execution time or resource usage.) + + * The types converted between should the "same kind of thing": for instance, the *existing* `impl From for Ipv4Addr` is pretty suspect on this count. (This perhaps ties into the subtyping angle: `Ipv4Addr` is clearly not a supertype of `u32`.) + + +## Forwards-compatibility + +If we later want to generalize this feature to other types such as `Option`, as described below, will we be able to do so while maintaining backwards-compatibility? + + # Drawbacks * Increases the syntactic surface area of the language. 
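The `Err(e) => break 'here Err(e.into())` expansion introduced by the patch above can be tried out in current Rust for the case where the innermost enclosing block is the whole function, since `break 'fn` is then just `return`. The following is a minimal sketch under stated assumptions, not the RFC's reference translation: `MyError`, `non_empty`, `parse` and the two `double_*` functions are hypothetical names chosen for illustration, and `std::num::ParseIntError` stands in for `json::Error` so that only the standard library is needed.

```rust
use std::io;
use std::num::ParseIntError;

// Hypothetical error type in the style of the RFC's `MyError`.
// `ParseIntError` is an assumption standing in for `json::Error`.
#[derive(Debug)]
enum MyError {
    Io(io::Error),
    Parse(ParseIntError),
}

impl From<io::Error> for MyError {
    fn from(e: io::Error) -> Self {
        MyError::Io(e)
    }
}

impl From<ParseIntError> for MyError {
    fn from(e: ParseIntError) -> Self {
        MyError::Parse(e)
    }
}

// Hypothetical fallible step that "throws" an `io::Error`.
fn non_empty(s: &str) -> Result<&str, io::Error> {
    if s.trim().is_empty() {
        Err(io::Error::new(io::ErrorKind::InvalidInput, "empty input"))
    } else {
        Ok(s)
    }
}

// Hypothetical fallible step that "throws" a `ParseIntError`.
fn parse(s: &str) -> Result<i32, ParseIntError> {
    s.trim().parse::<i32>()
}

// Hand-written expansion: each `match` mirrors the RFC's translation of `?`,
// with `return` playing the role of `break 'fn`.
fn double_desugared(s: &str) -> Result<i32, MyError> {
    let s = match non_empty(s) {
        Ok(a) => a,
        Err(e) => return Err(e.into()), // io::Error -> MyError
    };
    let n = match parse(s) {
        Ok(a) => a,
        Err(e) => return Err(e.into()), // ParseIntError -> MyError
    };
    Ok(n * 2)
}

// The same function written with the `?` operator: two different
// "exception" types are upcast to the single declared error type.
fn double_sugared(s: &str) -> Result<i32, MyError> {
    Ok(parse(non_empty(s)?)? * 2)
}

fn main() {
    println!("{:?}", double_sugared("21"));
    println!("{:?}", double_desugared("21"));
    println!("{:?}", double_sugared("not a number"));
    println!("{:?}", double_desugared(""));
}
```

Both functions should agree on every input, which is the point of describing `?` as a source-to-source translation. Note that the sketch writes `.into()` as the RFC text does; stable Rust performs this conversion through `From::from`, which yields the same result here because of the blanket `Into` impl that the RFC mentions.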
From 1a50c01eb72d4ca49880f1d9a8617f72379166e4 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?G=C3=A1bor=20Lehel?= Date: Wed, 30 Dec 2015 04:39:41 +0100 Subject: [PATCH 0665/1195] wrap ALL the words; also mention lang items --- active/0000-trait-based-exception-handling.md | 280 +++++++++++------- 1 file changed, 170 insertions(+), 110 deletions(-) diff --git a/active/0000-trait-based-exception-handling.md b/active/0000-trait-based-exception-handling.md index abc2146cda7..6c5d4c04f3d 100644 --- a/active/0000-trait-based-exception-handling.md +++ b/active/0000-trait-based-exception-handling.md @@ -5,33 +5,35 @@ # Summary -Add syntactic sugar for working with the `Result` type which models common exception handling constructs. +Add syntactic sugar for working with the `Result` type which models common +exception handling constructs. The new constructs are: * An `?` operator for explicitly propagating "exceptions". - * A `try`..`catch` construct for conveniently catching and handling "exceptions". + * A `try`..`catch` construct for conveniently catching and handling + "exceptions". -The idea for the `?` operator originates from [RFC PR 204][204] by [@aturon](https://github.com/aturon). +The idea for the `?` operator originates from [RFC PR 204][204] by +[@aturon](https://github.com/aturon). [204]: https://github.com/rust-lang/rfcs/pull/204 # Motivation and overview -Rust currently uses the `enum Result` type for error -handling. This solution is simple, well-behaved, and easy to understand, but -often gnarly and inconvenient to work with. We would like to solve the latter -problem while retaining the other nice properties and avoiding duplication of -functionality. +Rust currently uses the `enum Result` type for error handling. This solution is +simple, well-behaved, and easy to understand, but often gnarly and inconvenient +to work with. We would like to solve the latter problem while retaining the +other nice properties and avoiding duplication of functionality. 
We can accomplish this by adding constructs which mimic the exception-handling constructs of other languages in both appearance and behavior, while improving -upon them in typically Rustic fashion. Their meaning can be specified by a straightforward -source-to-source translation into existing language constructs, plus a very -simple and obvious new one. (They may also, but need not necessarily, be -implemented in this way.) +upon them in typically Rustic fashion. Their meaning can be specified by a +straightforward source-to-source translation into existing language constructs, +plus a very simple and obvious new one. (They may also, but need not +necessarily, be implemented in this way.) These constructs are strict additions to the existing language, and apart from the issue of keywords, the legality and behavior of all currently existing Rust @@ -39,49 +41,58 @@ programs is entirely unaffected. The most important additions are a postfix `?` operator for propagating "exceptions" and a `try`..`catch` block for catching and handling them. By an -"exception", we essentially just mean the `Err` variant of a `Result`. (See the "Detailed design" section for more -precision.) +"exception", we essentially just mean the `Err` variant of a `Result`. (See the +"Detailed design" section for more precision.) ## `?` operator -The postfix `?` operator can be applied to `Result` values and is equivalent to the current `try!()` macro. It either -returns the `Ok` value directly, or performs an early exit and propagates -the `Err` value further out. (So given `my_result: Result`, we -have `my_result?: Foo`.) This allows it to be used for e.g. conveniently -chaining method calls which may each "throw an exception": +The postfix `?` operator can be applied to `Result` values and is equivalent to +the current `try!()` macro. It either returns the `Ok` value directly, or +performs an early exit and propagates the `Err` value further out. 
(So given +`my_result: Result`, we have `my_result?: Foo`.) This allows it to be +used for e.g. conveniently chaining method calls which may each "throw an +exception": foo()?.bar()?.baz() Naturally, in this case the types of the "exceptions thrown by" `foo()` and -`bar()` must unify. Like the current `try!()` macro, the `?` operator will also perform an implicit "upcast" on the exception type. +`bar()` must unify. Like the current `try!()` macro, the `?` operator will also +perform an implicit "upcast" on the exception type. When used outside of a `try` block, the `?` operator propagates the exception to the caller of the current function, just like the current `try!` macro does. (If -the return type of the function isn't a `Result`, then this is a type error.) When used inside a `try` -block, it propagates the exception up to the innermost `try` block, as one would -expect. +the return type of the function isn't a `Result`, then this is a type error.) +When used inside a `try` block, it propagates the exception up to the innermost +`try` block, as one would expect. Requiring an explicit `?` operator to propagate exceptions strikes a very pleasing balance between completely automatic exception propagation, which most -languages have, and completely manual propagation, which we'd have apart from the `try!` macro. It means that function calls -remain simply function calls which return a result to their caller, with no -magic going on behind the scenes; and this also *increases* flexibility, because -one gets to choose between propagation with `?` or consuming the returned -`Result` directly. +languages have, and completely manual propagation, which we'd have apart from +the `try!` macro. It means that function calls remain simply function calls +which return a result to their caller, with no magic going on behind the scenes; +and this also *increases* flexibility, because one gets to choose between +propagation with `?` or consuming the returned `Result` directly. 
The `?` operator itself is suggestive, syntactically lightweight enough to not be bothersome, and lets the reader determine at a glance where an exception may or may not be thrown. It also means that if the signature of a function changes with respect to exceptions, it will lead to type errors rather than silent -behavior changes, which is a good thing. Finally, because exceptions are -tracked in the type system, and there is no silent propagation of exceptions, and -all points where an exception may be thrown are readily apparent visually, this -also means that we do not have to worry very much about "exception safety". +behavior changes, which is a good thing. Finally, because exceptions are tracked +in the type system, and there is no silent propagation of exceptions, and all +points where an exception may be thrown are readily apparent visually, this also +means that we do not have to worry very much about "exception safety". ### Exception type upcasting -In a language with checked exceptions and subtyping, it is clear that if a function is declared as throwing a particular type, its body should also be able to throw any of its subtypes. Similarly, in a language with structural sum types (a.k.a. anonymous `enum`s, polymorphic variants), one should be able to throw a type with fewer cases in a function declaring that it may throw a superset of those cases. This is essentially what is achieved by the common Rust practice of declaring a custom error `enum` with `From` `impl`s for each of the upstream error types which may be propagated: +In a language with checked exceptions and subtyping, it is clear that if a +function is declared as throwing a particular type, its body should also be able +to throw any of its subtypes. Similarly, in a language with structural sum types +(a.k.a. anonymous `enum`s, polymorphic variants), one should be able to throw a +type with fewer cases in a function declaring that it may throw a superset of +those cases. 
This is essentially what is achieved by the common Rust practice of +declaring a custom error `enum` with `From` `impl`s for each of the upstream +error types which may be propagated: enum MyError { IoError(io::Error), @@ -92,9 +103,15 @@ In a language with checked exceptions and subtyping, it is clear that if a funct impl From for MyError { ... } impl From for MyError { ... } -Here `io::Error` and `json::Error` can be thought of as subtypes of `MyError`, with a clear and direct embedding into the supertype. +Here `io::Error` and `json::Error` can be thought of as subtypes of `MyError`, +with a clear and direct embedding into the supertype. -The `?` operator should therefore perform such an implicit conversion in the nature of a subtype-to-supertype coercion. The present RFC uses the `std::convert::Into` trait for this purpose (which has a blanket `impl` forwarding from `From`). The precise requirements for a conversion to be "like" a subtyping coercion are an open question; see the "Unresolved questions" section. +The `?` operator should therefore perform such an implicit conversion in the +nature of a subtype-to-supertype coercion. The present RFC uses the +`std::convert::Into` trait for this purpose (which has a blanket `impl` +forwarding from `From`). The precise requirements for a conversion to be "like" +a subtyping coercion are an open question; see the "Unresolved questions" +section. ## `try`..`catch` @@ -106,16 +123,18 @@ thrown, it is passed to the `catch` block, and the `try`..`catch` evaluates to the value of the `catch` block. As with `if`..`else` expressions, the types of the `try` and `catch` blocks must therefore unify. Unlike other languages, only a single type of exception may be thrown in the `try` block (a `Result` only has -a single `Err` type); all exceptions are always caught; and there may only be one `catch` block. This dramatically simplifies thinking about the behavior of exception-handling code. 
+a single `Err` type); all exceptions are always caught; and there may only be +one `catch` block. This dramatically simplifies thinking about the behavior of +exception-handling code. There are two variations on this theme: 1. `try { EXPR }` - In this case the `try` block evaluates directly to a `Result` - containing either the value of `EXPR`, or the exception which was thrown. - For instance, `try { foo()? }` is essentially equivalent to `foo()`. - This can be useful if you want to coalesce *multiple* potential exceptions - + In this case the `try` block evaluates directly to a `Result` containing + either the value of `EXPR`, or the exception which was thrown. For instance, + `try { foo()? }` is essentially equivalent to `foo()`. This can be useful if + you want to coalesce *multiple* potential exceptions - `try { foo()?.bar()?.baz()? }` - into a single `Result`, which you wish to then e.g. pass on as-is to another function, rather than analyze yourself. @@ -130,10 +149,9 @@ There are two variations on this theme: Blue(bex) => quux(bex) } - Here the `catch` - performs a `match` on the caught exception directly, using any number of - refutable patterns. This form is convenient for checking and handling the - caught exception directly. + Here the `catch` performs a `match` on the caught exception directly, using + any number of refutable patterns. This form is convenient for checking and + handling the caught exception directly. # Detailed design @@ -149,10 +167,10 @@ constructs, and is independently useful. The capability can be exposed either by generalizing `break` to take an optional value argument and break out of any block (not just loops), or by generalizing `return` to take an optional lifetime argument and return from any block, not -just the outermost block of the function. 
This feature is only used in this RFC as an -explanatory device, and implementing the RFC does not require exposing it, so I am -going to arbitrarily choose the `break` syntax for the following and won't -discuss the question further. +just the outermost block of the function. This feature is only used in this RFC +as an explanatory device, and implementing the RFC does not require exposing it, +so I am going to arbitrarily choose the `break` syntax for the following and +won't discuss the question further. So we are extending `break` with an optional value argument: `break 'a EXPR`. This is an expression of type `!` which causes an early return from the @@ -174,10 +192,14 @@ A completely artificial example: Here if we don't have a thing, we escape from the block early with `None`. -If no value is specified, it defaults to `()`: in other words, the current behavior. -We can also imagine there is a magical lifetime `'fn` which refers to the lifetime of the whole function: in this case, `break 'fn` is equivalent to `return`. +If no value is specified, it defaults to `()`: in other words, the current +behavior. We can also imagine there is a magical lifetime `'fn` which refers to +the lifetime of the whole function: in this case, `break 'fn` is equivalent to +`return`. -Again, this RFC does not propose generalizing `break` in this way at this time: it is only used as a way to explain the meaning of the constructs it does propose. +Again, this RFC does not propose generalizing `break` in this way at this time: +it is only used as a way to explain the meaning of the constructs it does +propose. ## Definition of constructs @@ -275,6 +297,9 @@ of their definitions. a source-to-source translation in this manner, they need not necessarily be *implemented* this way.) +As a result of this RFC, both `Into` and `Result` would have to become lang +items. 
+ ## Laws @@ -286,16 +311,19 @@ Without any attempt at completeness, here are some things which should be true: * `try { Err(e)? } catch { e => e }` = `e.into()` * `try { Ok(try_foo()?) } catch { e => Err(e) }` = `try_foo().map_err(Into::into)` -(In the above, `foo()` is a function returning any type, and `try_foo()` is a function returning a `Result`.) +(In the above, `foo()` is a function returning any type, and `try_foo()` is a +function returning a `Result`.) # Unresolved questions -These questions should be satisfactorally resolved before stabilizing the relevant features, at the latest. +These questions should be satisfactorally resolved before stabilizing the +relevant features, at the latest. ## Choice of keywords -The RFC to this point uses the keywords `try`..`catch`, but there are a number of other possibilities, each with different advantages and drawbacks: +The RFC to this point uses the keywords `try`..`catch`, but there are a number +of other possibilities, each with different advantages and drawbacks: * `try { ... } catch { ... }` @@ -313,59 +341,85 @@ Among the considerations: * Simplicity. Brevity. - * Following precedent from existing, popular languages, and familiarity with respect to analogous constructs in them. + * Following precedent from existing, popular languages, and familiarity with + respect to analogous constructs in them. - * Fidelity to the constructs' actual behavior. For instance, the first clause always catches the "exception"; the second only branches on it. + * Fidelity to the constructs' actual behavior. For instance, the first clause + always catches the "exception"; the second only branches on it. - * Consistency with the existing `try!()` macro. If the first clause is called `try`, then `try { }` and `try!()` would have essentially inverse meanings. + * Consistency with the existing `try!()` macro. If the first clause is called + `try`, then `try { }` and `try!()` would have essentially inverse meanings. 
- * Language-level backwards compatibility when adding new keywords. I'm not sure how this could or should be handled. + * Language-level backwards compatibility when adding new keywords. I'm not sure + how this could or should be handled. ## Semantics for "upcasting" -What should the contract for a `From`/`Into` `impl` be? Are these even the right `trait`s to use for this feature? +What should the contract for a `From`/`Into` `impl` be? Are these even the right +`trait`s to use for this feature? Two obvious, minimal requirements are: - * It should be pure: no side effects, and no observation of side effects. (The result should depend *only* on the argument.) + * It should be pure: no side effects, and no observation of side effects. (The + result should depend *only* on the argument.) - * It should be total: no panics or other divergence, except perhaps in the case of resource exhaustion (OOM, stack overflow). + * It should be total: no panics or other divergence, except perhaps in the case + of resource exhaustion (OOM, stack overflow). -The other requirements for an implicit conversion to be well-behaved in the context of this feature should be thought through with care. +The other requirements for an implicit conversion to be well-behaved in the +context of this feature should be thought through with care. Some further thoughts and possibilities on this matter: - * It should be "like a coercion from subtype to supertype", as described earlier. The precise meaning of this is not obvious. + * It should be "like a coercion from subtype to supertype", as described + earlier. The precise meaning of this is not obvious. - * A common condition on subtyping coercions is coherence: if you can compound-coerce to go from `A` to `Z` indirectly along multiple different paths, they should all have the same end result. 
+ * A common condition on subtyping coercions is coherence: if you can + compound-coerce to go from `A` to `Z` indirectly along multiple different + paths, they should all have the same end result. - * It should be unambiguous, or preserve the meaning of the input: `impl From for u32` as `x as u32` feels right; as `(x as u32) * 12345` feels wrong, even though this is perfectly pure, total, and injective. What this means precisely in the general case is unclear. + * It should be unambiguous, or preserve the meaning of the input: + `impl From for u32` as `x as u32` feels right; as `(x as u32) * 12345` + feels wrong, even though this is perfectly pure, total, and injective. What + this means precisely in the general case is unclear. - * It should be lossless, or in other words, injective: it should map each observably-different element of the input type to observably-different elements of the output type. (Observably-different means that it is possible to write a program which behaves differently depending on which one it gets, modulo things that "shouldn't count" like observing execution time or resource usage.) + * It should be lossless, or in other words, injective: it should map each + observably-different element of the input type to observably-different + elements of the output type. (Observably-different means that it is possible + to write a program which behaves differently depending on which one it gets, + modulo things that "shouldn't count" like observing execution time or + resource usage.) - * The types converted between should the "same kind of thing": for instance, the *existing* `impl From for Ipv4Addr` is pretty suspect on this count. (This perhaps ties into the subtyping angle: `Ipv4Addr` is clearly not a supertype of `u32`.) + * The types converted between should the "same kind of thing": for instance, + the *existing* `impl From for Ipv4Addr` is pretty suspect on this count. 
+ (This perhaps ties into the subtyping angle: `Ipv4Addr` is clearly not a + supertype of `u32`.) ## Forwards-compatibility -If we later want to generalize this feature to other types such as `Option`, as described below, will we be able to do so while maintaining backwards-compatibility? +If we later want to generalize this feature to other types such as `Option`, as +described below, will we be able to do so while maintaining backwards-compatibility? # Drawbacks * Increases the syntactic surface area of the language. - * No expressivity is added, only convenience. Some object to "there's more than one way to do it" on principle. + * No expressivity is added, only convenience. Some object to "there's more than + one way to do it" on principle. - * If at some future point we were to add higher-kinded types and syntactic sugar - for monads, a la Haskell's `do` or Scala's `for`, their functionality may overlap and result in redundancy. - However, a number of challenges would have to be overcome for a generic monadic sugar to be able to - fully supplant these features: the integration of higher-kinded types into Rust's type system in the - first place, the shape of a `Monad` `trait` in a language with lifetimes and move semantics, - interaction between the monadic control flow and Rust's native control flow (the "ambient monad"), - automatic upcasting of exception types via `Into` (the exception (`Either`, `Result`) monad normally does not - do this, and it's not clear whether it can), and potentially others. + * If at some future point we were to add higher-kinded types and syntactic + sugar for monads, a la Haskell's `do` or Scala's `for`, their functionality + may overlap and result in redundancy. 
However, a number of challenges would + have to be overcome for a generic monadic sugar to be able to fully supplant + these features: the integration of higher-kinded types into Rust's type + system in the first place, the shape of a `Monad` `trait` in a language with + lifetimes and move semantics, interaction between the monadic control flow + and Rust's native control flow (the "ambient monad"), automatic upcasting of + exception types via `Into` (the exception (`Either`, `Result`) monad normally + does not do this, and it's not clear whether it can), and potentially others. # Alternatives @@ -380,8 +434,9 @@ If we later want to generalize this feature to other types such as `Option`, as macros. However, this is likely to be awkward because, at least, macros may only have their contents as a single block, rather than two. Furthermore, macros are excellent as a "safety net" for features which we forget to add - to the language itself, or which only have specialized use cases; but generally - useful control flow constructs still work better as language features. + to the language itself, or which only have specialized use cases; but + generally useful control flow constructs still work better as language + features. * Add [first-class checked exceptions][notes], which are propagated automatically (without an `?` operator). @@ -408,9 +463,9 @@ This RFC doesn't propose doing so at this time, but as it would be an independen ## An additional `catch` form to bind the caught exception irrefutably -The `catch` described above immediately passes the caught exception into a `match` block. -It may sometimes be desirable to instead bind it directly to a single variable. That might -look like this: +The `catch` described above immediately passes the caught exception into a +`match` block. It may sometimes be desirable to instead bind it directly to a +single variable. 
That might look like this: try { EXPR } catch IRR-PAT { EXPR } @@ -473,23 +528,25 @@ A `throws` clause on a function: fn foo(arg: Foo) -> Bar throws Baz { ... } -would mean that instead of writing `return Ok(foo)` and -`return Err(bar)` in the body of the function, one would write `return foo` -and `throw bar`, and these are implicitly turned into `Ok` or `Err` for the caller. This removes syntactic overhead from -both "normal" and "throwing" code paths and (apart from `?` to propagate -exceptions) matches what code might look like in a language with native -exceptions. +would mean that instead of writing `return Ok(foo)` and `return Err(bar)` in the +body of the function, one would write `return foo` and `throw bar`, and these +are implicitly turned into `Ok` or `Err` for the caller. This removes syntactic +overhead from both "normal" and "throwing" code paths and (apart from `?` to +propagate exceptions) matches what code might look like in a language with +native exceptions. ## Generalize over `Result`, `Option`, and other result-carrying types -`Option` is completely equivalent to `Result` modulo names, and many common APIs -use the `Option` type, so it would be useful to extend all of the above syntax to `Option`, -and other (potentially user-defined) equivalent-to-`Result` types, as well. +`Option` is completely equivalent to `Result` modulo names, and many +common APIs use the `Option` type, so it would be useful to extend all of the +above syntax to `Option`, and other (potentially user-defined) +equivalent-to-`Result` types, as well. -This can be done by specifying a trait for types which can be used to "carry" either a normal -result or an exception. There are several different, equivalent ways -to formulate it, which differ in the set of methods provided, but the meaning in any case is essentially just -that you can choose some types `Normal` and `Exception` such that `Self` is isomorphic to `Result`. 
+This can be done by specifying a trait for types which can be used to "carry" +either a normal result or an exception. There are several different, equivalent +ways to formulate it, which differ in the set of methods provided, but the +meaning in any case is essentially just that you can choose some types `Normal` +and `Exception` such that `Self` is isomorphic to `Result`. Here is one way: @@ -502,13 +559,15 @@ Here is one way: fn translate>(from: Self) -> Other; } -For greater clarity on how these methods work, see the section on `impl`s below. (For a -simpler formulation of the trait using `Result` directly, see further below.) +For greater clarity on how these methods work, see the section on `impl`s below. +(For a simpler formulation of the trait using `Result` directly, see further +below.) The `translate` method says that it should be possible to translate to any *other* `ResultCarrier` type which has the same `Normal` and `Exception` types. -This may not appear to be very useful, but in fact, this is what can be used to inspect the result, -by translating it to a concrete type such as `Result` and then, for example, pattern matching on it. +This may not appear to be very useful, but in fact, this is what can be used to +inspect the result, by translating it to a concrete type such as `Result` and then, for example, pattern matching on it. Laws: @@ -519,9 +578,9 @@ Laws: Here I've used explicit type ascription syntax to make it clear that e.g. the types of `embed_` on the left and right hand sides are different. -The first two laws say that embedding a result `x` into one result-carrying type and -then translating it to a second result-carrying type should be the same as embedding it -into the second type directly. +The first two laws say that embedding a result `x` into one result-carrying type +and then translating it to a second result-carrying type should be the same as +embedding it into the second type directly. 
The third law says that translating to a different result-carrying type and then translating back should be a no-op. @@ -577,7 +636,8 @@ The laws should be sufficient to rule out any "icky" impls. For example, an impl for `Vec` where an exception is represented as the empty vector, and a normal result as a single-element vector: here the third law fails, because if the `Vec` has more than one element *to begin with*, then it's not possible to -translate to a different result-carrying type and then back without losing information. +translate to a different result-carrying type and then back without losing +information. The `bool` impl may be surprising, or not useful, but it *is* well-behaved: `bool` is, after all, isomorphic to `Result<(), ()>`. @@ -587,12 +647,12 @@ The `bool` impl may be surprising, or not useful, but it *is* well-behaved: * Our current lint for unused results could be replaced by one which warns for any unused result of a type which implements `ResultCarrier`. - * If there is ever ambiguity due to the result-carrying type being underdetermined - (experience should reveal whether this is a problem in practice), we could - resolve it by defaulting to `Result`. + * If there is ever ambiguity due to the result-carrying type being + underdetermined (experience should reveal whether this is a problem in + practice), we could resolve it by defaulting to `Result`. - * Translating between different result-carrying types with the same `Normal` and - `Exception` types *should*, but may not necessarily *currently* be, a + * Translating between different result-carrying types with the same `Normal` + and `Exception` types *should*, but may not necessarily *currently* be, a machine-level no-op most of the time. We could/should make it so that: @@ -630,8 +690,8 @@ This is, of course, the simplest possible formulation. 
The drawbacks are that it, in some sense, privileges `Result` over other potentially equivalent types, and that it may be less efficient for those types: for any non-`Result` type, every operation requires two method calls (one into -`Result`, and one out), whereas with the `ResultCarrier` trait in the main text, they -only require one. +`Result`, and one out), whereas with the `ResultCarrier` trait in the main text, +they only require one. Laws: From a12bad60bd9e66420b9debaf82d7513dcbd9db4f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?G=C3=A1bor=20Lehel?= Date: Wed, 30 Dec 2015 20:54:37 +0100 Subject: [PATCH 0666/1195] minor verbiage --- active/0000-trait-based-exception-handling.md | 33 +++++++++---------- 1 file changed, 16 insertions(+), 17 deletions(-) diff --git a/active/0000-trait-based-exception-handling.md b/active/0000-trait-based-exception-handling.md index 6c5d4c04f3d..0d868aaaee4 100644 --- a/active/0000-trait-based-exception-handling.md +++ b/active/0000-trait-based-exception-handling.md @@ -41,8 +41,7 @@ programs is entirely unaffected. The most important additions are a postfix `?` operator for propagating "exceptions" and a `try`..`catch` block for catching and handling them. By an -"exception", we essentially just mean the `Err` variant of a `Result`. (See the -"Detailed design" section for more precision.) +"exception", for now, we essentially just mean the `Err` variant of a `Result`. ## `?` operator @@ -106,7 +105,7 @@ error types which may be propagated: Here `io::Error` and `json::Error` can be thought of as subtypes of `MyError`, with a clear and direct embedding into the supertype. -The `?` operator should therefore perform such an implicit conversion in the +The `?` operator should therefore perform such an implicit conversion, in the nature of a subtype-to-supertype coercion. The present RFC uses the `std::convert::Into` trait for this purpose (which has a blanket `impl` forwarding from `From`). 
The precise requirements for a conversion to be "like" @@ -150,8 +149,8 @@ There are two variations on this theme: } Here the `catch` performs a `match` on the caught exception directly, using - any number of refutable patterns. This form is convenient for checking and - handling the caught exception directly. + any number of refutable patterns. This form is convenient for handling the + exception in-place. # Detailed design @@ -274,12 +273,12 @@ are merely one way. Deep: - match 'here: { + match ('here: { Ok(match foo() { Ok(a) => a, Err(e) => break 'here Err(e.into()) }.bar()) - } { + }) { Ok(a) => a, Err(e) => match e { A(a) => baz(a), @@ -342,7 +341,7 @@ Among the considerations: * Simplicity. Brevity. * Following precedent from existing, popular languages, and familiarity with - respect to analogous constructs in them. + respect to their analogous constructs. * Fidelity to the constructs' actual behavior. For instance, the first clause always catches the "exception"; the second only branches on it. @@ -370,7 +369,7 @@ Two obvious, minimal requirements are: The other requirements for an implicit conversion to be well-behaved in the context of this feature should be thought through with care. -Some further thoughts and possibilities on this matter: +Some further thoughts and possibilities on this matter, only as brainstorming: * It should be "like a coercion from subtype to supertype", as described earlier. The precise meaning of this is not obvious. @@ -379,11 +378,6 @@ Some further thoughts and possibilities on this matter: compound-coerce to go from `A` to `Z` indirectly along multiple different paths, they should all have the same end result. - * It should be unambiguous, or preserve the meaning of the input: - `impl From for u32` as `x as u32` feels right; as `(x as u32) * 12345` - feels wrong, even though this is perfectly pure, total, and injective. What - this means precisely in the general case is unclear. 
- * It should be lossless, or in other words, injective: it should map each observably-different element of the input type to observably-different elements of the output type. (Observably-different means that it is possible @@ -391,8 +385,13 @@ Some further thoughts and possibilities on this matter: modulo things that "shouldn't count" like observing execution time or resource usage.) + * It should be unambiguous, or preserve the meaning of the input: + `impl From for u32` as `x as u32` feels right; as `(x as u32) * 12345` + feels wrong, even though this is perfectly pure, total, and injective. What + this means precisely in the general case is unclear. + * The types converted between should the "same kind of thing": for instance, - the *existing* `impl From for Ipv4Addr` is pretty suspect on this count. + the *existing* `impl From for Ipv4Addr` feels suspect on this count. (This perhaps ties into the subtyping angle: `Ipv4Addr` is clearly not a supertype of `u32`.) @@ -566,8 +565,8 @@ below.) The `translate` method says that it should be possible to translate to any *other* `ResultCarrier` type which has the same `Normal` and `Exception` types. This may not appear to be very useful, but in fact, this is what can be used to -inspect the result, by translating it to a concrete type such as `Result` and then, for example, pattern matching on it. +inspect the result, by translating it to a concrete, known type such as +`Result` and then, for example, pattern matching on it. 
Laws: From 06f87fdd916396884b61aa143b3b572f178b7791 Mon Sep 17 00:00:00 2001 From: John Hodge Date: Fri, 1 Jan 2016 11:28:59 +0800 Subject: [PATCH 0667/1195] Initial draft --- text/0000-drop-types-in-const.md | 40 ++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) create mode 100644 text/0000-drop-types-in-const.md diff --git a/text/0000-drop-types-in-const.md b/text/0000-drop-types-in-const.md new file mode 100644 index 00000000000..44f05876cf2 --- /dev/null +++ b/text/0000-drop-types-in-const.md @@ -0,0 +1,40 @@ +- Feature Name: `drop_types_in_const` +- Start Date: 2016-01-01 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Allow types with destructors to be used in `const`/`static` items, as long as the destructor is never run during `const` evaluation. + +# Motivation +[motivation]: #motivation + +Most collection types do not allocate any memory when constructed empty. With the change to make leaking safe, the restriction on `static` items with destructors +is no longer trequired to be a hard error. + +Allowing types with destructors to be directly used in `const` functions and stored in `static`s will remove the need to have +runtime-initialisation for global variables. + +# Detailed design +[design]: #detailed-design + +- Remove the check for `Drop` types in constant expressions. +- Add an error lint ensuring that `Drop` types are not dropped in a constant expression + - This includes when another field is moved out of a struct/tuple, and unused arguments in constant functions. + +# Drawbacks +[drawbacks]: #drawbacks + +Destructors do not run on `static` items (by design), so this can lead to unexpected behavior when a side-effecting type is stored in a `static` (e.g. a RAII temporary folder handle). However, this can already happen using the `lazy_static` crate, or with `Option` (which bypasses the existing checks). 
+ +# Alternatives +[alternatives]: #alternatives + +Existing workarounds are based on storing `Option`, and initialising it to `Some` upon first access. These solutions work, but require runtime intialisation and incur a checking overhead on subsequent accesses. + +# Unresolved questions +[unresolved]: #unresolved-questions + +- TBD From c886a0aed5e912d731d359876b937903e666456c Mon Sep 17 00:00:00 2001 From: John Hodge Date: Fri, 1 Jan 2016 11:41:03 +0800 Subject: [PATCH 0668/1195] Update alternatives (UnsafeCell trick is a "bug", lazy_static uses raw pointers on stable) --- text/0000-drop-types-in-const.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/text/0000-drop-types-in-const.md b/text/0000-drop-types-in-const.md index 44f05876cf2..be6bdd6de6a 100644 --- a/text/0000-drop-types-in-const.md +++ b/text/0000-drop-types-in-const.md @@ -32,7 +32,10 @@ Destructors do not run on `static` items (by design), so this can lead to unexpe # Alternatives [alternatives]: #alternatives -Existing workarounds are based on storing `Option`, and initialising it to `Some` upon first access. These solutions work, but require runtime intialisation and incur a checking overhead on subsequent accesses. +- Runtime initialisation of a raw pointer can be used instead (as the `lazy_static` crate currently does on stable) +- On nightly, a bug related to `static` and `UnsafeCell>` can be used to remove the dynamic allocation. + +Both of these alternatives require runtime initialisation, and incur a checking overhead on subsequent accesses. 
# Unresolved questions [unresolved]: #unresolved-questions From d400320c7dd63ce2aec827c60b0fba0a3960c9f8 Mon Sep 17 00:00:00 2001 From: John Hodge Date: Fri, 1 Jan 2016 12:02:16 +0800 Subject: [PATCH 0669/1195] Replace detailed design with @eddyb's comments from rust-lang/rust#30667 --- text/0000-drop-types-in-const.md | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/text/0000-drop-types-in-const.md b/text/0000-drop-types-in-const.md index be6bdd6de6a..08c8a6ce85f 100644 --- a/text/0000-drop-types-in-const.md +++ b/text/0000-drop-types-in-const.md @@ -20,14 +20,17 @@ runtime-initialisation for global variables. # Detailed design [design]: #detailed-design -- Remove the check for `Drop` types in constant expressions. -- Add an error lint ensuring that `Drop` types are not dropped in a constant expression - - This includes when another field is moved out of a struct/tuple, and unused arguments in constant functions. + +- allow destructors in statics + - optionally warn about the "potential leak" +- allow instantiating structures that impl Drop in constant expressions +- prevent const items from holding values with destructors, but allow const fn to return them +- disallow constant expressions which would result in the Drop impl getting called, where they not in a constant context # Drawbacks [drawbacks]: #drawbacks -Destructors do not run on `static` items (by design), so this can lead to unexpected behavior when a side-effecting type is stored in a `static` (e.g. a RAII temporary folder handle). However, this can already happen using the `lazy_static` crate, or with `Option` (which bypasses the existing checks). +Destructors do not run on `static` items (by design), so this can lead to unexpected behavior when a side-effecting type is stored in a `static` (e.g. a RAII temporary folder handle). However, this can already happen using the `lazy_static` crate. 
# Alternatives [alternatives]: #alternatives From afea13ff8ec516c84ec34de968b5546db7f04fab Mon Sep 17 00:00:00 2001 From: John Hodge Date: Fri, 1 Jan 2016 12:24:43 +0800 Subject: [PATCH 0670/1195] Expand detailed design --- text/0000-drop-types-in-const.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/text/0000-drop-types-in-const.md b/text/0000-drop-types-in-const.md index 08c8a6ce85f..a96cbbb708b 100644 --- a/text/0000-drop-types-in-const.md +++ b/text/0000-drop-types-in-const.md @@ -6,7 +6,7 @@ # Summary [summary]: #summary -Allow types with destructors to be used in `const`/`static` items, as long as the destructor is never run during `const` evaluation. +Allow types with destructors to be used in `static` items and in `cosnt` functions, as long as the destructor never needs to run in const context. # Motivation [motivation]: #motivation @@ -20,17 +20,17 @@ runtime-initialisation for global variables. # Detailed design [design]: #detailed-design - -- allow destructors in statics - - optionally warn about the "potential leak" -- allow instantiating structures that impl Drop in constant expressions -- prevent const items from holding values with destructors, but allow const fn to return them -- disallow constant expressions which would result in the Drop impl getting called, where they not in a constant context +- Lift the restriction on types with destructors being used in statics. + - (Optionally adding a lint that warn about the possibility of resource leak) +- Alloc instantiating structures with destructors in constant expressions, +- Continue to prevent `const` items from holding types with destructors. +- Allow `const fn` to return types wth destructors. +- Disallow constant expressions which would result in the destructor being called (if the code were run at runtime). 
# Drawbacks [drawbacks]: #drawbacks -Destructors do not run on `static` items (by design), so this can lead to unexpected behavior when a side-effecting type is stored in a `static` (e.g. a RAII temporary folder handle). However, this can already happen using the `lazy_static` crate. +Destructors do not run on `static` items (by design), so this can lead to unexpected behavior when a type's destructor has effects outside the program (e.g. a RAII temporary folder handle, which deletes the folder on drop). However, this can already happen using the `lazy_static` crate. # Alternatives [alternatives]: #alternatives From a778e876e6514ba7ae37da330594a8b6c27a7fd7 Mon Sep 17 00:00:00 2001 From: John Hodge Date: Fri, 1 Jan 2016 18:37:50 +0800 Subject: [PATCH 0671/1195] Add some examples --- text/0000-drop-types-in-const.md | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/text/0000-drop-types-in-const.md b/text/0000-drop-types-in-const.md index a96cbbb708b..60f9434d212 100644 --- a/text/0000-drop-types-in-const.md +++ b/text/0000-drop-types-in-const.md @@ -27,6 +27,28 @@ runtime-initialisation for global variables. - Allow `const fn` to return types wth destructors. - Disallow constant expressions which would result in the destructor being called (if the code were run at runtime). +## Examples +Assuming that `RwLock` and `Vec` have `const fn new` methods, the following example is possible and avoids runtime validity checks. + +```rust +/// Logging output handler +trait LogHandler: Send + Sync { + // ... 
+} +/// List of registered logging handlers +static S_LOGGERS: RwLock >> = RwLock::new( Vec::new() ); +``` + +Disallowed code +```rust +static VAL: usize = (Vec::::new(), 0).1; // The `Vec` would be dropped +const EMPTY_BYTE_VEC: Vec = Vec::new(); // `const` items can't have destructors + +const fn sample(_v: Vec) -> usize { + 0 // Discards the input vector, dropping it +} +``` + # Drawbacks [drawbacks]: #drawbacks From 1c9c699e958ef89509f021c8f3b826aaccd5b9f8 Mon Sep 17 00:00:00 2001 From: John Hodge Date: Fri, 1 Jan 2016 21:44:05 +0800 Subject: [PATCH 0672/1195] Fix spelling errors --- text/0000-drop-types-in-const.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/0000-drop-types-in-const.md b/text/0000-drop-types-in-const.md index 60f9434d212..1db09dfa45d 100644 --- a/text/0000-drop-types-in-const.md +++ b/text/0000-drop-types-in-const.md @@ -6,13 +6,13 @@ # Summary [summary]: #summary -Allow types with destructors to be used in `static` items and in `cosnt` functions, as long as the destructor never needs to run in const context. +Allow types with destructors to be used in `static` items and in `const` functions, as long as the destructor never needs to run in const context. # Motivation [motivation]: #motivation Most collection types do not allocate any memory when constructed empty. With the change to make leaking safe, the restriction on `static` items with destructors -is no longer trequired to be a hard error. +is no longer required to be a hard error. Allowing types with destructors to be directly used in `const` functions and stored in `static`s will remove the need to have runtime-initialisation for global variables. @@ -24,7 +24,7 @@ runtime-initialisation for global variables. - (Optionally adding a lint that warn about the possibility of resource leak) - Alloc instantiating structures with destructors in constant expressions, - Continue to prevent `const` items from holding types with destructors. 
-- Allow `const fn` to return types wth destructors. +- Allow `const fn` to return types with destructors. - Disallow constant expressions which would result in the destructor being called (if the code were run at runtime). ## Examples From ccd4a78198d558810befcd44eb1282b80dbfeb9a Mon Sep 17 00:00:00 2001 From: John Hodge Date: Fri, 1 Jan 2016 22:19:26 +0800 Subject: [PATCH 0673/1195] Clarify collection types with non-allocating constructors, note that destructors will never run, acknowledge `.dtors` as alternative --- text/0000-drop-types-in-const.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/text/0000-drop-types-in-const.md b/text/0000-drop-types-in-const.md index 1db09dfa45d..92f79a19e7b 100644 --- a/text/0000-drop-types-in-const.md +++ b/text/0000-drop-types-in-const.md @@ -11,8 +11,8 @@ Allow types with destructors to be used in `static` items and in `const` functio # Motivation [motivation]: #motivation -Most collection types do not allocate any memory when constructed empty. With the change to make leaking safe, the restriction on `static` items with destructors -is no longer required to be a hard error. +Some of the collection types do not allocate any memory when constructed empty (most notably `Vec`). With the change to make leaking safe, the restriction on `static` items with destructors +is no longer required to be a hard error (as it is safe and accepted that these destructors may never run). Allowing types with destructors to be directly used in `const` functions and stored in `static`s will remove the need to have runtime-initialisation for global variables. @@ -21,6 +21,7 @@ runtime-initialisation for global variables. [design]: #detailed-design - Lift the restriction on types with destructors being used in statics. + - `static`s containing Drop-types will not run the destructor upon program/thread exit. 
- (Optionally adding a lint that warn about the possibility of resource leak) - Alloc instantiating structures with destructors in constant expressions, - Continue to prevent `const` items from holding types with destructors. @@ -59,8 +60,9 @@ Destructors do not run on `static` items (by design), so this can lead to unexpe - Runtime initialisation of a raw pointer can be used instead (as the `lazy_static` crate currently does on stable) - On nightly, a bug related to `static` and `UnsafeCell>` can be used to remove the dynamic allocation. - -Both of these alternatives require runtime initialisation, and incur a checking overhead on subsequent accesses. + - Both of these alternatives require runtime initialisation, and incur a checking overhead on subsequent accesses. +- Leaking of objects could be addressed by using C++-style `.dtors` support + - This is undesirable, as it introduces confusion around destructor execution order. # Unresolved questions [unresolved]: #unresolved-questions From 1f14fb2a6c4bd0cdb69a27cfaf963e056d9364c0 Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Tue, 5 Jan 2016 04:50:57 +0000 Subject: [PATCH 0674/1195] Extend atomic compare_and_swap --- text/0000-extended-compare-and-swap.md | 112 +++++++++++++++++++++++++ 1 file changed, 112 insertions(+) create mode 100644 text/0000-extended-compare-and-swap.md diff --git a/text/0000-extended-compare-and-swap.md b/text/0000-extended-compare-and-swap.md new file mode 100644 index 00000000000..b0eedcf9900 --- /dev/null +++ b/text/0000-extended-compare-and-swap.md @@ -0,0 +1,112 @@ +- Feature Name: extended_compare_and_swap +- Start Date: 2016-1-5 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Rust currently provides a `compare_and_swap` method on atomic types, but this method only exposes a subset of the functionality of the C++11 equivalents [`compare_exchange_strong` and 
`compare_exchange_weak`](http://en.cppreference.com/w/cpp/atomic/atomic/compare_exchange): + +- `compare_and_swap` maps to the C++11 `compare_exchange_strong`, but there is no Rust equivalent for `compare_exchange_weak`. The latter is allowed to fail spuriously even when the comparison succeeds, which allows the compiler to generate better assembly code when the compare and swap is used in a loop. + +- `compare_and_swap` only has a single memory ordering parameter, whereas the C++11 versions have two: the first describes the memory ordering when the operation succeeds while the second one describes the memory ordering on failure. + +# Motivation +[motivation]: #motivation + +While all of these variants are identical on x86, they can allow more efficient code to be generated on architectures such as ARM: + +- On ARM, the strong variant of compare and swap is compiled into an `LDREX` / `STREX` loop which restarts the compare and swap when a spurious failure is detected. This is unnecessary for many lock-free algorithms since the compare and swap is usually already inside a loop and a spurious failure is often caused by another thread modifying the atomic concurrently, which will probably cause the compare and swap to fail anyways. + +- When Rust lowers `compare_and_swap` to LLVM, it uses the same memory ordering type for success and failure, which on ARM adds extra memory barrier instructions to the failure path. Most lock-free algorithms which make use of compare and swap in a loop only need relaxed ordering on failure since the operation is going to be restarted anyways. + +# Detailed design +[design]: #detailed-design + +## Memory ordering on failure + +Since `compare_and_swap` is stable, we can't simply add a second memory ordering parameter to it. 
A new method is instead added to atomic types: + +```rust +fn compare_and_swap_explicit(&self, current: T, new: T, success: Ordering, failure: Ordering) -> T; +``` + +The restrictions on the failure ordering are the same as C++11: only `SeqCst`, `Acquire` and `Relaxed` are allowed and it must be equal or weaker than the success ordering. + +The documentation for the original `compare_and_swap` is updated to say that it is equivalent to `compare_and_swap_explicit` with the following mapping for memory orders: + +Original | Success | Failure +-------- | ------- | ------- +Relaxed | Relaxed | Relaxed +Acquire | Acquire | Acquire +Release | Release | Relaxed +AcqRel | AcqRel | Acquire +SeqCst | SeqCst | SeqCst + +## `compare_and_swap_weak` + +Two new methods are added to atomic types: + +```rust +fn compare_and_swap_weak(&self, current: T, new: T, order: Ordering) -> (T, bool); +fn compare_and_swap_weak_explicit(&self, current: T, new: T, success: Ordering, failure: Ordering) -> (T, bool); +``` + +`compare_and_swap` does not need to return a success flag because it can be inferred by checking if the returned value is equal to the expected one. This is not possible for `compare_and_swap_weak` because it is allowed to fail spuriously, which means that it could fail to perform the swap even though the returned value is equal to the expected one. + +A lock free algorithm using a loop would use the returned bool to determine whether to break out of the loop, and if not, use the returned value for the next iteration of the loop. 
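The retry loop described above can be sketched as follows. This is only an illustration under stated assumptions: it uses `compare_exchange_weak` as found in today's `std::sync::atomic`, which returns a `Result` rather than the `(T, bool)` pair proposed in this draft, and the function name `atomic_double` is hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Atomically doubles the value in `cell` and returns the previous value.
/// (`atomic_double` is a hypothetical helper, not part of the RFC.)
fn atomic_double(cell: &AtomicUsize) -> usize {
    // A relaxed load is enough here: the compare-exchange below re-validates it.
    let mut current = cell.load(Ordering::Relaxed);
    loop {
        match cell.compare_exchange_weak(
            current,
            current.wrapping_mul(2),
            Ordering::AcqRel,  // ordering used when the swap succeeds
            Ordering::Relaxed, // ordering used on failure, since we only retry
        ) {
            // The swap happened; `previous` is the value that was replaced.
            Ok(previous) => return previous,
            // Real or spurious failure: retry with the value actually observed.
            Err(observed) => current = observed,
        }
    }
}
```

Because the weak variant may fail spuriously, the loop relies on the explicit success/failure signal and not on comparing the returned value with the expected one, which is the point made in the paragraph above.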
+ +## Intrinsics + +These are the existing intrinsics used to implement `compare_and_swap`: + +```rust + pub fn atomic_cxchg<T>(dst: *mut T, old: T, src: T) -> T; + pub fn atomic_cxchg_acq<T>(dst: *mut T, old: T, src: T) -> T; + pub fn atomic_cxchg_rel<T>(dst: *mut T, old: T, src: T) -> T; + pub fn atomic_cxchg_acqrel<T>(dst: *mut T, old: T, src: T) -> T; + pub fn atomic_cxchg_relaxed<T>(dst: *mut T, old: T, src: T) -> T; +``` + +The following intrinsics need to be added to support relaxed memory orderings on failure: + +```rust + pub fn atomic_cxchg_acqrel_failrelaxed<T>(dst: *mut T, old: T, src: T) -> T; + pub fn atomic_cxchg_failacq<T>(dst: *mut T, old: T, src: T) -> T; + pub fn atomic_cxchg_failrelaxed<T>(dst: *mut T, old: T, src: T) -> T; + pub fn atomic_cxchg_acq_failrelaxed<T>(dst: *mut T, old: T, src: T) -> T; +``` + +The following intrinsics need to be added to support `compare_and_swap_weak`: + +```rust + pub fn atomic_cxchg_weak<T>(dst: *mut T, old: T, src: T) -> (T, bool); + pub fn atomic_cxchg_weak_acq<T>(dst: *mut T, old: T, src: T) -> (T, bool); + pub fn atomic_cxchg_weak_rel<T>(dst: *mut T, old: T, src: T) -> (T, bool); + pub fn atomic_cxchg_weak_acqrel<T>(dst: *mut T, old: T, src: T) -> (T, bool); + pub fn atomic_cxchg_weak_relaxed<T>(dst: *mut T, old: T, src: T) -> (T, bool); + pub fn atomic_cxchg_weak_acqrel_failrelaxed<T>(dst: *mut T, old: T, src: T) -> (T, bool); + pub fn atomic_cxchg_weak_failacq<T>(dst: *mut T, old: T, src: T) -> (T, bool); + pub fn atomic_cxchg_weak_failrelaxed<T>(dst: *mut T, old: T, src: T) -> (T, bool); + pub fn atomic_cxchg_weak_acq_failrelaxed<T>(dst: *mut T, old: T, src: T) -> (T, bool); +``` + +# Drawbacks +[drawbacks]: #drawbacks + +Ideally support for failure memory ordering would be added by simply adding an extra parameter to the existing `compare_and_swap` function. However this is not possible because `compare_and_swap` is stable.
+ +For consistency with `compare_and_swap`, `compare_and_swap_weak` also has a separate explicit variant with two memory ordering parameters, even though ideally only a single method would be required. + +# Alternatives +[alternatives]: #alternatives + +One alternative for supporting failure orderings is to add new enum variants to `Ordering` instead of adding new methods with two ordering parameters. The following variants would need to be added: `AcquireFailRelaxed`, `AcqRelFailRelaxed`, `SeqCstFailRelaxed`, `SeqCstFailAcquire`. The downside is that the names are quite ugly and are only valid for `compare_and_swap`, not other atomic operations. + +Not doing anything is also a possible option, but this will cause Rust to generate worse code for some lock-free algorithms. + +# Unresolved questions +[unresolved]: #unresolved-questions + +None From 4126461a38a45908bd239db4d67c7b2eba04083c Mon Sep 17 00:00:00 2001 From: Josh Triplett Date: Tue, 29 Dec 2015 06:10:14 -0800 Subject: [PATCH 0675/1195] RFC: native C-compatible unions via `untagged_union` --- text/0000-untagged_union.md | 325 ++++++++++++++++++++++++++++++++++++ 1 file changed, 325 insertions(+) create mode 100644 text/0000-untagged_union.md diff --git a/text/0000-untagged_union.md b/text/0000-untagged_union.md new file mode 100644 index 00000000000..0c67104b3cf --- /dev/null +++ b/text/0000-untagged_union.md @@ -0,0 +1,325 @@ +- Feature Name: `untagged_union` +- Start Date: 2015-12-29 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Provide native support for C-compatible unions, defined via a new keyword +`untagged_union`. + +# Motivation +[motivation]: #motivation + +Many FFI interfaces include unions. Rust does not currently have any native +representation for unions, so users of these FFI interfaces must define +multiple structs and transmute between them via `std::mem::transmute`. 
The +resulting FFI code must carefully understand platform-specific size and +alignment requirements for structure fields. Such code has little in common +with how a C client would invoke the same interfaces. + +Introducing native syntax for unions makes many FFI interfaces much simpler and +less error-prone to write, simplifying the creation of bindings to native +libraries, and enriching the Rust/Cargo ecosystem. + +A native union mechanism would also simplify Rust implementations of +space-efficient or cache-efficient structures relying on value representation, +such as machine-word-sized unions using the least-significant bits of aligned +pointers to distinguish cases. + +The syntax proposed here avoids reserving `union` as the new keyword, as +existing Rust code already uses `union` for other purposes, including [multiple +functions in the standard +library](https://doc.rust-lang.org/std/?search=union). + +To preserve memory safety, accesses to union fields may only occur in `unsafe` +code. Commonly, code using unions will provide safe wrappers around unsafe +union field accesses. + +# Detailed design +[design]: #detailed-design + +## Declaring a union type + +A union declaration uses the same field declaration syntax as a `struct` +declaration, except with the keyword `untagged_union` in place of `struct`: + +```rust +untagged_union MyUnion { + f1: u32, + f2: f32, +} +``` + +`untagged_union` implies `#[repr(C)]` as the default representation, making +`#[repr(C)] untagged_union` permissible but redundant. + +## Instantiating a union + +A union instantiation uses the same syntax as a struct instantiation, except +that it must specify exactly one field: + +```rust +let u = MyUnion { f1: 1 }; +``` + +Specifying multiple fields in a union instantiation results in a compiler +error. + +Safe code may instantiate a union, as no unsafe behavior can occur until +accessing a field of the union. 
Code that wishes to maintain invariants about +the union fields should make the union fields private and provide public +functions that maintain the invariants. + +## Reading fields + +Unsafe code may read from union fields, using the same dotted syntax as a +struct: + +```rust +fn f(u: MyUnion) -> f32 { + unsafe { u.f2 } +} +``` + +## Writing fields + +Unsafe code may write to fields in a mutable union, using the same syntax as a +struct: + +```rust +fn f(u: &mut MyUnion) { + unsafe { + u.f1 = 2; + } +} +``` + +If a union contains multiple fields of different sizes, assigning to a field +smaller than the entire union must not change the memory of the union outside +that field. + +## Pattern matching + +Unsafe code may pattern match on union fields, using the same syntax as a +struct, without the requirement to mention every field of the union in a match +or use `..`: + +```rust +fn f(u: MyUnion) { + unsafe { + match u { + MyUnion { f1: 10 } => { println!("ten"); } + MyUnion { f2 } => { println!("{}", f2); } + } + } +} +``` + +Matching a specific value from a union field makes a refutable pattern; naming +a union field without matching a specific value makes an irrefutable pattern. +Both require unsafe code. + +Pattern matching may match a union as a field of a larger structure. In +particular, when using an `untagged_union` to implement a C tagged union via +FFI, this allows matching on the tag and the corresponding field +simultaneously: + +```rust +#[repr(u32)] +enum Tag { I, F } + +untagged_union U { + i: i32, + f: f32, +} + +#[repr(C)] +struct Value { + tag: Tag, + u: U, +} + +fn is_zero(v: Value) -> bool { + unsafe { + match v { + Value { tag: I, u: U { i: 0 } } => true, + Value { tag: F, u: U { f: 0.0 } } => true, + _ => false, + } + } +} +``` + +Note that a pattern match on a union field that has a smaller size than the +entire union must not make any assumptions about the value of the union's +memory outside that field. 
+ +## Borrowing union fields + +Unsafe code may borrow a reference to a field of a union; doing so borrows the +entire union, such that any borrow conflicting with a borrow of the union +(including a borrow of another union field or a borrow of a structure +containing the union) will produce an error. + +```rust +untagged_union U { + f1: u32, + f2: f32, +} + +#[test] +fn test() { + let mut u = U { f1: 1 }; + unsafe { + let b1 = &mut u.f1; + // let b2 = &mut u.f2; // This would produce an error + *b1 = 5; + } + unsafe { + assert_eq!(u.f1, 5); + } +} +``` + +Simultaneous borrows of multiple fields of a struct contained within a union do +not conflict: + +```rust +struct S { + x: u32, + y: u32, +} + +untagged_union U { + s: S, + both: u64, +} + +#[test] +fn test() { + let mut u = U { s: S { x: 1, y: 2 } }; + unsafe { + let bx = &mut u.s.x; + // let bboth = &mut u.both; // This would fail + let by = &mut u.s.y; + *bx = 5; + *by = 10; + } + unsafe { + assert_eq!(u.s.x, 5); + assert_eq!(u.s.y, 10); + } +} +``` + +## Union and field visibility + +The `pub` keyword works on the union and on its fields, as with a struct. The +union and its fields default to private. Using a private field in a union +instantiation, field access, or pattern match produces an error. + +## Uninitialized unions + +The compiler should consider a union uninitialized if declared without an +initializer. However, providing a field during instantiation, or assigning to +a field, should cause the compiler to treat the entire union as initialized. + +## Unions and traits + +A union may have trait implementations, using the same syntax as a struct. + +The compiler should warn if a union field has a type that implements the `Drop` +trait. + +## Unions and undefined behavior + +Rust code must not use unions to invoke [undefined +behavior](https://doc.rust-lang.org/nightly/reference.html#behavior-considered-undefined). 
+In particular, Rust code must not use unions to break the pointer aliasing +rules with raw pointers, or access a field containing a primitive type with an +invalid value. + +## Union size and alignment + +A union must have the same size and alignment as an equivalent C union +declaration for the target platform. Typically, a union would have the maximum +size of any of its fields, and the maximum alignment of any of its fields. +Note that those maximums may come from different fields; for instance: + +```rust +untagged_union U { + f1: u16, + f2: [u8; 4], +} + +#[test] +fn test() { + assert_eq!(std::mem::size_of::<U>(), 4); + assert_eq!(std::mem::align_of::<U>(), 2); +} +``` + +# Drawbacks +[drawbacks]: #drawbacks + +Adding a new type of data structure would increase the complexity of the +language and the compiler implementation, albeit marginally. However, this +change seems likely to provide a net reduction in the quantity and complexity +of unsafe code. + +# Alternatives +[alternatives]: #alternatives + +- Don't do anything, and leave users of FFI interfaces with unions to continue + writing complex platform-specific transmute code. +- Create macros to define unions and access their fields. However, such macros + make field accesses and pattern matching look more cumbersome and less + structure-like. The implementation and use of such macros provides strong + motivation to seek a better solution, and indeed existing writers and users + of such macros have specifically requested native syntax in Rust. +- Define unions without a new keyword `untagged_union`, such as via + `#[repr(union)] struct`. This would avoid any possibility of breaking + existing code that uses the keyword, but would make declarations more + verbose, and introduce potential confusion with `struct` (or whatever + existing construct the `#[repr(union)]` attribute modifies).
+- Use a compound keyword like `unsafe union`, while not reserving `union` on + its own as a keyword, to avoid breaking use of `union` as an identifier. + Potentially more appealing syntax, if the Rust parser can support it. +- Use a new operator to access union fields, rather than the same `.` operator + used for struct fields. This would make union fields more obvious at the + time of access, rather than making them look syntactically identical to + struct fields despite the semantic difference in storage representation. +- The [unsafe enum](https://github.com/rust-lang/rfcs/pull/724) proposal: + introduce untagged enums, identified with `unsafe enum`. Pattern-matching + syntax would make field accesses significantly more verbose than structure + field syntax. +- The [unsafe enum](https://github.com/rust-lang/rfcs/pull/724) proposal with + the addition of struct-like field access syntax. The resulting field access + syntax would look much like this proposal; however, pairing an enum-style + definition with struct-style usage seems confusing for developers. An + enum-based declaration leads users to expect enum-like syntax; a new + construct distinct from both enum and struct does not lead to such + expectations, and developers used to C unions will expect struct-like field + access for unions. + +# Unresolved questions +[unresolved]: #unresolved-questions + +Can the borrow checker support the rule that "simultaneous borrows of multiple +fields of a struct contained within a union do not conflict"? If not, omitting +that rule would only marginally increase the verbosity of such code, by +requiring an explicit borrow of the entire struct first. + +Can a pattern match match multiple fields of a union at once? For rationale, +consider a union using the low bits of an aligned pointer as a tag; a pattern +match may match the tag using one field and a value identified by that tag +using another field. 
However, if this complicates the implementation, omitting +it would not significantly complicate code using unions. + +C APIs using unions often also make use of anonymous unions and anonymous +structs. For instance, a union may contain anonymous structs to define +non-overlapping fields, and a struct may contain an anonymous union to define +overlapping fields. This RFC does not define anonymous unions or structs, but +a subsequent RFC may wish to do so. From b2d8ca06aeac74643f298714f084c7d064c9a36b Mon Sep 17 00:00:00 2001 From: Josh Triplett Date: Mon, 4 Jan 2016 23:42:12 -0800 Subject: [PATCH 0676/1195] Make union fields that implement Drop an error, not a warning --- text/0000-untagged_union.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-untagged_union.md b/text/0000-untagged_union.md index 0c67104b3cf..eff2838e12f 100644 --- a/text/0000-untagged_union.md +++ b/text/0000-untagged_union.md @@ -230,8 +230,8 @@ a field, should cause the compiler to treat the entire union as initialized. A union may have trait implementations, using the same syntax as a struct. -The compiler should warn if a union field has a type that implements the `Drop` -trait. +The compiler should produce an error if a union field has a type that +implements the `Drop` trait. 
## Unions and undefined behavior From 4b471de048d27939a45002a42ef27c1e852f216e Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Wed, 6 Jan 2016 12:45:24 -0500 Subject: [PATCH 0677/1195] add new RFC --- text/0000-restrict-constants-in-patterns.md | 568 ++++++++++++++++++++ 1 file changed, 568 insertions(+) create mode 100644 text/0000-restrict-constants-in-patterns.md diff --git a/text/0000-restrict-constants-in-patterns.md b/text/0000-restrict-constants-in-patterns.md new file mode 100644 index 00000000000..c37e7d209ee --- /dev/null +++ b/text/0000-restrict-constants-in-patterns.md @@ -0,0 +1,568 @@ +- Feature Name: (fill me in with a unique ident, my_awesome_feature) +- Start Date: (fill me in with today's date, YYYY-MM-DD) +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Feature-gate the use of constants in patterns unless those constants +have simple types, like integers, booleans, and characters. The +semantics of constants in general were never widely discussed and the +compiler's current implementation is not broadly agreed upon (though +it has many proponents). The intention of adding a feature-gate is to +give us time to discuss and settle on the desired semantics in an +"affirmative" way. + +Because the compiler currently accepts a larger set of constants, this +is a backwards incompatible change. This is justified as part of the +["underspecified language semantics" clause of RFC 1122][ls]. A +[crater run] found 14 regressions on crates.io, which suggests that +the impact of this change on real code would be minimal. + +Note: this was also discussed on an [internals thread]. Major points +from that thread are summarized either inline or in alternatives. 
+ +[ls]: https://github.com/rust-lang/rfcs/blob/master/text/1122-language-semver.md#underspecified-language-semantics +[crater run]: https://gist.github.com/nikomatsakis/26096ec2a2df3c1fb224 +[internals thread]: https://internals.rust-lang.org/t/how-to-handle-pattern-matching-on-constants/2846 + +# Motivation +[motivation]: #motivation + +The compiler currently permits any kind of constant to be used within +a pattern. However, the *meaning* of such a pattern is somewhat +controversial: the current semantics implemented by the compiler were +[adopted in July of 2014](https://github.com/rust-lang/rust/pull/15650) +and were never widely discussed nor did they go through the RFC +process. Moreover, the discussion at the time was focused primarily on +implementation concerns, and overlooked the potential semantic +hazards. + +### Semantic vs structural equality + +Consider a program like this one, which references a constant value +from within a pattern: + +```rust +struct SomeType { + a: u32, + b: u32, +} + +const SOME_CONSTANT: SomeType = SomeType { a: 22+22, b: 44+44 }; + +fn test(v: SomeType) { + match v { + SOME_CONSTANT => println!("Yes"), + _ => println!("No"), + } +} +``` + +The question at hand is what we expect this match to do, precisely. +There are two main possibilities: semantic and structural equality. + +**Semantic equality.** Semantic equality states that a pattern +`SOME_CONSTANT` matches a value `v` if `v == SOME_CONSTANT`. In other +words, the `match` statement above would be exactly equivalent to an +`if`: + +```rust +if v == SOME_CONSTANT { + println!("Yes") +} else { + println!("No"); +} +``` + +Under semantic equality, the program above would not compile, because +`SomeType` does not implement the `PartialEq` trait. + +**Structural equality.** Under structural equality, `v` matches the +pattern `SOME_CONSTANT` if all of its fields are (structurally) equal.
+Primitive types like `u32` are structurally equal if they represent +the same value (but see below for discussion about floating point +types like `f32` and `f64`). This means that the `match` statement +above would be roughly equivalent to the following `if` (modulo +privacy): + +```rust +if v.a == SOME_CONSTANT.a && v.b == SOME_CONSTANT.b { + println!("Yes") +} else { + println!("No"); +} +``` + +Structural equality basically says "two things are structurally equal +if their fields are structurally equal". It is the sort of equality you +would get if everyone used `#[derive(PartialEq)]` on all types. Note +that the equality defined by structural equality is completely +distinct from the `==` operator, which is tied to the `PartialEq` +trait. That is, two values that are *semantically unequal* could be +*structurally equal* (an example where this might occur is the +floating point value `NaN`). + +**Current semantics.** The compiler's current semantics are basically +structural equality, though in the case of floating point numbers they +are arguably closer to semantic equality (details below). In +particular, when a constant appears in a pattern, the compiler first +evaluates that constant to a specific value. So we would reduce the +expression: + +```rust +const SOME_CONSTANT: SomeType = SomeType { a: 22+22, b: 44+44 }; +``` + +to the value `SomeType { a: 44, b: 88 }`. We then expand the pattern +`SOME_CONSTANT` as though you had typed this value in place (well, +almost as though, read on for some complications around privacy). +Thus the match statement above is equivalent to: + +```rust +match v { + SomeType { a: 44, b: 88 } => println!("Yes"), + _ => println!("No"), +} +``` + +### Disadvantages of the current approach + +Given that the compiler already has a defined semantics, it is +reasonable to ask why we might want to change it. There +are two main disadvantages: + +1.
**No abstraction boundary.** The current approach does not permit + types to define what equality means for themselves (at least not if + they can be constructed in a constant). +2. **Scaling to associated constants.** The current approach does not + permit associated constants or generic integers to be used in a + match statement. + +#### Disadvantage: Weakened abstraction boundary + +The single biggest concern with structural equality is that it +introduces two distinct notions of equality: the `==` operator, based +on the `PartialEq` trait, and pattern matching, based on a builtin +structural recursion. This will cause problems for user-defined types +that rely on `PartialEq` to define equality. Put another way, **it is +no longer possible for user-defined types to completely define what +equality means for themselves** (at least not if they can be +constructed in a constant). Furthermore, because the builtin +structural recursion does not consider privacy, `match` statements can +now be used to **observe private fields**. + +**Example: Normalized durations.** Consider a simple duration type: + +```rust +#[derive(Copy, Clone)] +pub struct Duration { + pub seconds: u32, + pub minutes: u32, +} +``` + +Let's say that this `Duration` type wishes to represent a span of +time, but it also wishes to preserve whether that time was expressed +in seconds or minutes. In other words, 60 seconds and 1 minute are +equal values, but we don't want to normalize 60 seconds into 1 minute; +perhaps because it comes from user input and we wish to keep things +just as the user chose to express it.
+ +We might implement `PartialEq` like so (actually the `PartialEq` trait +is slightly different, but you get the idea): + +```rust +impl PartialEq for Duration { + fn eq(&self, other: &Duration) -> bool { + let s1 = (self.seconds as u64) + (self.minutes as u64 * 60); + let s2 = (other.seconds as u64) + (other.minutes as u64 * 60); + s1 == s2 + } +} +``` + +Now imagine I have some constants: + +```rust +const TWENTY_TWO_SECONDS: Duration = Duration { seconds: 22, minutes: 0 }; +const ONE_MINUTE: Duration = Duration { seconds: 0, minutes: 1 }; +``` + +And I write a match statement using those constants: + +```rust +fn detect_some_case_or_other(d: Duration) { + match d { + TWENTY_TWO_SECONDS => /* do something */, + ONE_MINUTE => /* do something else */, + _ => /* do something else again */, + } +} +``` + +Now this code is, in all probability, buggy. Probably I meant to use +the notion of equality that `Duration` defined, where seconds and +minutes are normalized. But that is not the behavior I will see -- +instead I will use a pure structural match. What's worse, this means +the code will probably work in my local tests, since I like to say +"one minute", but it will break when I demo it for my customer, since +she prefers to write "60 seconds". + +**Example: Floating point numbers.** Another example is floating point +numbers. Consider the case of `0.0` and `-0.0`: these two values are +distinct, but they typically behave the same; so much so that they +compare equal (that is, `0.0 == -0.0` is `true`). So it is likely +that code such as: + +```rust +match some_computation() { + 0.0 => ..., + x => ..., +} +``` + +did not intend to discriminate between zero and negative zero. In +fact, in the compiler today, match *will* compare 0.0 and -0.0 as +equal. We simply do not extend that courtesy to user-defined types. + +**Example: observing private fields.** The current constant expansion +code does not consider privacy. 
In other words, constants are expanded +into equivalent patterns, but those patterns may not have been +something the user could have typed because of privacy rules. Consider +a module like: + +```rust +mod foo { + pub struct Foo { b: bool } + pub const V1: Foo = Foo { b: true }; + pub const V2: Foo = Foo { b: false }; +} +``` + +Note that there is an abstraction boundary here: `b` is a private +field. But now if I wrote code from another module that matches on a +value of type `Foo`, that abstraction boundary is pierced: + +```rust +fn bar(f: foo::Foo) { + // rustc knows this is exhaustive because it expanded `V1` into + // equivalent patterns; patterns you could not write by hand! + match f { + foo::V1 => { /* moreover, now we know that f.b is true */ } + foo::V2 => { /* and here we know it is false */ } + } +} +``` + +Note that, because `Foo` does not implement `PartialEq`, just having +access to `V1` would not otherwise allow us to observe the value of +`f.b`. (And even if `Foo` *did* implement `PartialEq`, that +implementation might not read `f.b`, so we still would not be able to +observe its value.) + +**More examples.** There are numerous possible examples here. For +example, strings that compare using case-insensitive comparisons, but +retain the original case for reference, such as those used in +file-systems. Views that extract a subportion of a larger value (and +hence which should only compare that subportion). And so forth. + +#### Disadvantage: Scaling to associated constants and generic integers + +Rewriting constants into patterns requires that we can **fully +evaluate** the constant at the time of exhaustiveness checking. For +associated constants and type-level integers, that is not possible -- +we have to wait until monomorphization time.
Consider: + +```rust +trait SomeTrait { + const A: bool; + const B: bool; +} + +fn foo<T: SomeTrait>(x: bool) { + match x { + T::A => println!("A"), + T::B => println!("B"), + } +} + +impl SomeTrait for i32 { + const A: bool = true; + const B: bool = true; +} + +impl SomeTrait for u32 { + const A: bool = true; + const B: bool = false; +} +``` + +Is this match exhaustive? Does it contain dead code? The answer will +depend on whether `T=i32` or `T=u32`, of course. + +### Advantages of the current approach + +However, structural equality also has a number of advantages: + +**Better optimization.** One of the biggest "pros" is that it can +potentially enable nice optimization. For example, given constants like the following: + +```rust +struct Value { x: u32 } +const V1: Value = Value { x: 0 }; +const V2: Value = Value { x: 1 }; +const V3: Value = Value { x: 2 }; +const V4: Value = Value { x: 3 }; +const V5: Value = Value { x: 4 }; +``` + +and a match pattern like the following: + +```rust +match v { + V1 => ..., + ..., + V5 => ..., +} +``` + +then, because pattern matching is always a process of structurally +extracting values, we can compile this to code that reads the field +`x` (which is a `u32`) and does an appropriate switch on that +value. Semantic equality would potentially force a more conservative +compilation strategy. + +**Better exhaustiveness and dead-code checking.** Similarly, we can do +more thorough exhaustiveness and dead-code checking. So for example if +I have a struct like: + +```rust +struct Value { field: bool } +const TRUE: Value = Value { field: true }; +const FALSE: Value = Value { field: false }; +``` + +and a match pattern like: + +```rust +match v { TRUE => .., FALSE => .. } +``` + +then we can prove that this match is exhaustive. Similarly, we can prove +that the following match contains dead-code: + +```rust +const A: Value = Value { field: true }; +match v { + TRUE => ..., + A => ..., +} +``` + +Again, some of the alternatives might not allow this.
(But note the +cons, which also raise the question of exhaustiveness checking.) + +**Nullary variants and constants are (more) equivalent.** Currently, +there is a sort of equivalence between enum variants and constants, at +least with respect to pattern matching. Consider a C-like enum: + +```rust +enum Modes { + Happy = 22, + Shiny = 44, + People = 66, + Holding = 88, + Hands = 110, +} + +const C: Modes = Modes::Happy; +``` + +Now if I match against `Modes::Happy`, that is matching against an +enum variant, and under *all* the proposals I will discuss below, it +will check the actual variant of the value being matched (regardless +of whether `Modes` implements `PartialEq`, which it does not here). On +the other hand, if matching against `C` were to require a `PartialEq` +impl, then it would be illegal. Therefore matching against an *enum +variant* is distinct from matching against a *constant*. + +# Detailed design +[design]: #detailed-design + +Define the set of builtin types `B` as follows: + +``` +B = i8 | i16 | i32 | i64 | isize // signed integers + | u8 | u16 | u32 | u64 | usize // unsigned integers + | char // characters + | bool // booleans + | (B, ..., B) // tuples of builtin types +``` + +Any constants appearing in a pattern whose type is not a member of `B` +will be feature-gated. This feature-gate will be phased in using a +deprecation cycle, as usual. + +# Drawbacks +[drawbacks]: #drawbacks + +This is a breaking change, which means some people will have to change +their code. Moreover, code that is currently using constants of disallowed +types becomes slightly more verbose. For example: + +```rust +match foo { + Some(CONSTANT) => ..., + None => ..., +} +``` + +would now be written: + +```rust +match foo { + Some(v) if v == CONSTANT => ..., + None => ..., +} +``` + +# Alternatives +[alternatives]: #alternatives + +**No changes.** Naturally we could opt to keep the semantics as they +are. The advantages and disadvantages are discussed above. 
+ +**Embrace semantic equality.** We could opt to just go straight +towards "semantic equality". However, it seems better to reset the +semantics to a base point that everyone can agree on, and then extend +from that base point. Moreover, adopting semantic equality straight +out would be a riskier breaking change, as it could silently change +the semantics of existing programs (whereas the current proposal only +causes compilation to fail, never changes what an existing program +will do). + +# Discussion thread summary + +This section summarizes various points that were raised in the +[internals thread] which are related to patterns but didn't seem to +fit elsewhere. + +**Overloaded patterns.** Some languages, notably Scala, permit +overloading of patterns. This is related to "semantic equality" in +that it involves executing custom, user-provided code at compilation +time. + +**Pattern synonyms.** Haskell offers a feature called "pattern +synonyms" and +[it was argued](https://internals.rust-lang.org/t/how-to-handle-pattern-matching-on-constants/2846/39?u=nikomatsakis) +that the current treatment of patterns can be viewed as a similar +feature. This may be true, but constants-in-patterns are lacking a +number of important features from pattern synonyms, such as bindings, +as +[discussed in this response](https://internals.rust-lang.org/t/how-to-handle-pattern-matching-on-constants/2846/48?u=nikomatsakis). +The author feels that pattern synonyms might be a useful feature, but +it would be better to design them as a first-class feature, not adapt +constants for that purpose. + +# Unresolved questions +[unresolved]: #unresolved-questions + +**Should we also adjust the exhaustiveness and match analysis +algorithm to be more conservative around constants?** This RFC just +proposes limiting the types of constants that can be used in a match +pattern.
However, since the code currently inlines the actual values +of constants before doing exhaustiveness checking, this also implies +that it can compute exhaustiveness and dead-code in cases where it +arguably should not be able to. + +For example, the following code +[fails to compile](http://is.gd/PJjNKl) because it contains dead-code: + +```rust +const X: u64 = 0; +const Y: u64 = 0; +fn bar(foo: u64) { + match foo { + X => { } + Y => { } + _ => { } + } +} +``` + +However, we would be unable to perform such an analysis in a more +generic context, such as with an associated constant: + +```rust +trait Trait { + const X: u64; + const Y: u64; +} + +fn bar<T: Trait>(foo: u64) { + match foo { + T::X => { } + T::Y => { } + _ => { } + } +} +``` + +Here, although it may well be that `T::X == T::Y`, we can't know for +sure. So, for consistency, we may wish to treat all constants opaquely +regardless of whether we are in a generic context or not. + +Another argument in favor of treating all constants opaquely is that +the current behavior can leak details that perhaps were intended to be +hidden. For example, imagine that I define a fn `add_to_hash` that, given a +previous hash and a value, produces a new hash. Because I am lazy and +prototyping my system, I decide for now to just ignore the new value +and pass the old hash through: + +```rust +const fn add_to_hash(prev_hash: u64, _value: u64) -> u64 { + prev_hash +} +``` + +Now I have some consumers of my library and they define a few constants: + +```rust +const HASH_OF_ZERO: u64 = add_to_hash(0, 0); +const HASH_OF_ONE: u64 = add_to_hash(0, 1); +``` + +And at some point they write a match statement: + +```rust +fn process_hash(h: u64) { + match h { + HASH_OF_ZERO => /* do something */, + HASH_OF_ONE => /* do something else */, + _ => /* do something else again */, + } +} +``` + +As before, what you get when you [compile this](http://is.gd/u5WtCo) +is a dead-code error, because the compiler can see that `HASH_OF_ZERO` +and `HASH_OF_ONE` are the same value.
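For comparison, here is a minimal sketch of the same consumer code written with the guard form shown under Drawbacks (`v if v == CONSTANT`). It assumes a toolchain where `const fn` is available, and the string return values are hypothetical stand-ins for the elided arm bodies. Because each arm is guarded by an ordinary `==` comparison rather than an expanded constant pattern, the match should not rely on the compiler inlining the constants' values, so the fact that the two hashes happen to coincide is not reported at the match site.

```rust
// Placeholder hash from the example above: it ignores the new value
// and passes the previous hash through unchanged.
const fn add_to_hash(prev_hash: u64, _value: u64) -> u64 {
    prev_hash
}

const HASH_OF_ZERO: u64 = add_to_hash(0, 0);
const HASH_OF_ONE: u64 = add_to_hash(0, 1);

// Guard-based rewrite: each arm binds the scrutinee and compares it
// with `==` instead of using the constant as a pattern.
// The returned labels are illustrative only.
fn process_hash(h: u64) -> &'static str {
    match h {
        v if v == HASH_OF_ZERO => "zero",
        v if v == HASH_OF_ONE => "one",
        _ => "other",
    }
}
```

With the placeholder hash, both constants evaluate to `0`, so the first arm is the one taken for an input of `0`; the second arm only becomes distinguishable once `add_to_hash` is given a real implementation.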
+ +Part of the solution here might be making "unreachable patterns" a +warning and not an error. The author feels this would be a good idea +regardless (though not necessarily as part of this RFC). However, +that's not a complete solution, since -- at least for `bool` constants +-- the same issues arise if you consider exhaustiveness checking. + +On the other hand, it feels very silly for the compiler not to +understand that `match some_bool { true => ..., false => ... }` is +exhaustive. Furthermore, there are other ways for the values of +constants to "leak out", such as when part of a type like +`[u8; SOME_CONSTANT]` (a point made by both [arielb1][arielb1ac] and +[glaebhoerl][gac] on the [internals thread]). Therefore, the proper +way to address this question is perhaps to consider an explicit form +of "abstract constant". + +[arielb1ac]: https://internals.rust-lang.org/t/how-to-handle-pattern-matching-on-constants/2846/9?u=nikomatsakis +[gac]: https://internals.rust-lang.org/t/how-to-handle-pattern-matching-on-constants/2846/32?u=nikomatsakis From 1cc857fb4239990e0620a11ae504ece7f410fcaa Mon Sep 17 00:00:00 2001 From: Peter Marheine Date: Fri, 8 Jan 2016 15:05:17 -0700 Subject: [PATCH 0678/1195] Update naked fns for later discussion. --- text/0000-naked-fns.md | 90 ++++++++++++++++++++++++++++-------------- 1 file changed, 61 insertions(+), 29 deletions(-) diff --git a/text/0000-naked-fns.md b/text/0000-naked-fns.md index d77a1416203..c86679ddfa1 100644 --- a/text/0000-naked-fns.md +++ b/text/0000-naked-fns.md @@ -87,17 +87,14 @@ any calling convention the compiler is compatible with, calls to naked functions from within Rust code are forbidden unless the function is also declared with a well-defined ABI. 
-The function `call_foo` in the following code block is an error because the -default (Rust) ABI is unspecified and as such a programmer can never write code -in `foo` which is compatible: +Defining a naked function with the default (Rust) ABI is an error, because the +Rust ABI is unspecified and the programmer can never write a function which is +guaranteed to be compatible. For example, the function declaration of `foo` in +the following code block is an error. ```rust #[naked] -fn foo() { } - -fn call_foo() { - foo(); -} +unsafe fn foo() { } ``` The following variant is not an error because the C calling convention is @@ -107,28 +104,36 @@ function: ```rust #[naked] extern "C" fn foo() { } - -fn call_foo() { - foo(); -} ``` --- -The current support for `extern` functions in `rustc` generates a minimum of two -basic blocks for any function declared in Rust code with a non-default calling -convention: a trampoline which translates the declared calling convention to the -Rust convention, and a Rust ABI version of the function containing the actual -implementation. Calls to the function from Rust code call the Rust ABI version -directly. +Because the compiler cannot verify the correctness of code written in a naked +function (since it may have an unknown calling convention), naked functions must +be declared `unsafe` or contain no non-`unsafe` statements in the body. The +function `error` in the following code block is a compile-time error, whereas +the functions `correct1` and `correct2` are permitted. -For naked functions, it is impossible for the compiler to generate a Rust ABI -version of the function because the implementation may depend on the calling -convention. In cases where calling a naked function from Rust is permitted, the -compiler must be able to use the target calling convention directly rather than -call the same function with the Rust convention.
+``` +#[naked] +extern "C" fn error(x: &mut u8) { + *x += 1; +} ---- +#[naked] +unsafe extern "C" fn correct1(x: &mut u8) { + *x += 1; +} + +#[naked] +extern "C" fn correct2(x: &mut u8) { + unsafe { + *x += 1; + } +} +``` + +## Example The following example illustrates the possible use of a naked function for implementation of an interrupt service routine on 32-bit x86. @@ -162,6 +167,21 @@ fn main() { } ``` +## Implementation Considerations + +The current support for `extern` functions in `rustc` generates a minimum of two +basic blocks for any function declared in Rust code with a non-default calling +convention: a trampoline which translates the declared calling convention to the +Rust convention, and a Rust ABI version of the function containing the actual +implementation. Calls to the function from Rust code call the Rust ABI version +directly. + +For naked functions, it is impossible for the compiler to generate a Rust ABI +version of the function because the implementation may depend on the calling +convention. In cases where calling a naked function from Rust is permitted, the +compiler must be able to use the target calling convention directly rather than +call the same function with the Rust convention. + # Drawbacks The utility of this feature is extremely limited to most users, and it might be @@ -179,8 +199,20 @@ external libraries such as `libffi`. It is easy to quietly generate wrong code in naked functions, such as by causing the compiler to allocate stack space for temporaries where none were -anticipated. It may be desirable to require that all statements inside naked -functions be inside `unsafe` blocks (either by declaring the function `unsafe` -or including `unsafe { }` in the function body) to reinforce the need for -extreme care in the use of this feature. Requiring that the function always be -marked `unsafe` is not desirable because its external API may be safe. +anticipated.
There is currently no restriction on writing Rust statements inside +a naked function, while most compilers supporting similar features either +require or strongly recommend that authors write only inline assembly inside +naked functions to ensure no code is generated that assumes a particular stack +layout. It may be desirable to place further restrictions on what statements are +permitted in the body of a naked function, such as permitting only `asm!` +statements. + +The `unsafe` requirement on naked functions may not be desirable in all cases. +However, relaxing that requirement in the future would not be a breaking change. + +Because a naked function may use a calling convention unknown to the compiler, +it may be useful to add an "unknown" calling convention to the compiler which is +illegal to call directly. Absent this feature, functions implementing an unknown +ABI would need to be declared with a calling convention which is known to be +incorrect and depend on the programmer to avoid calling such a function +incorrectly since it cannot be prevented statically. From 3dd97de465b2ff6e7adf034f32aa04df48aba9f3 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Fri, 8 Jan 2016 09:37:41 +1300 Subject: [PATCH 0679/1195] Rewrite most of the RFC More focussed on the RLS (formerly oracle). A smaller and mosty simpler design. --- text/0000-ide.md | 522 ++++++++++++++++------------------------------- 1 file changed, 180 insertions(+), 342 deletions(-) diff --git a/text/0000-ide.md b/text/0000-ide.md index 6f3eb3c015d..64a1e8c31f7 100644 --- a/text/0000-ide.md +++ b/text/0000-ide.md @@ -5,75 +5,51 @@ # Summary -This RFC describes how we intend to modify the compiler to support IDEs. The -intention is that support will be as generic as possible. A follow-up internals -post will describe how we intend to focus our energies and deploy Rust support -in actual IDEs. +This RFC describes the Rust Language Server (RLS). This is a program designed to +service IDEs and other tools.
It offers a new access point to compilation and +APIs for getting information about a program. The RLS can be thought of as an +alternate compiler, but internally will use the existing compiler. -There are two sets of technical changes proposed in this RFC: changes to how we -compile, and the creation of an 'oracle' tool (name of tool TBC). +Using the RLS offers very low latency compilation. This allows for an IDE to +present information based on compilation to the user as quickly as possible. -This RFC is fairly detailed, it is intended as a straw-man plan to guide early -implementation, rather than as a strict blueprint. +## Requirements -## Compilation model +To be concrete about the requirements for the RLS, it should enable the +following actions: -An IDE will perform two kinds of compilation - an incremental check as the user -types (used to provide error and code completion information) and a full build. -The full build is explicitly signaled by the user (it could also happen -implicitly, for example when the user saves a file). A full build is basically -just a `cargo build` command, as would be done from the command line. It will -take advantage of any future improvements to regular compilation (such as -incremental compilation), but there is essentially no change from a compile -today. It is not very interesting and won't be discussed further. +* show compilation errors and warnings, updated as the user types, +* code completion as the user types, +* highlight all references to an item, +* find all references to an item, +* jump to definition. -The incremental check follows a new model of compilation. This check must be as -fast as possible but does not need to generate machine code. We'll describe it -in more detail below. We call this kind of compilation a 'quick-check'. +These requirements will be covered in more detail in later sections. -This RFC also covers making compilation more robust. 
+## History note -## The oracle +This RFC started as a more wide-ranging RFC. Some of the details have been +scaled back to allow for more focused and incremental development. -The oracle is a long running daemon process. It will keep a database -representation of an entire project's source code and semantic information (as -opposed to the compiler which operates on a crate at a time). It is -incrementally updated by the compiler and provides an IPC API for providing -information about a program - the low-level information an IDE (or similar tool) -needs, e.g., code completion options, location of definitions/declarations, -documentation for items. +Parts of the RFC dealing with robust compilation have been removed - work here +is ongoing and mostly doesn't require an RFC. -The oracle is a general purpose, low-level tool and should be usable by any IDE -as well as other tools. End users and editors with less project knowledge should -use the oracle via a more friendly interface (such as Racer). - - -## Other shared functionality - -Other functionality, such as refactoring and reformatting will be provided by -separate tools rather than the oracle. These should be sharable between IDE -implementations. They are not covered in this RFC. +The RLS was earlier referred to as the oracle. # Motivation -An IDE collects together many tools into a single piece of software. Some of -these are entirely separate from the rest of the Rust eco-system (such as editor -functionality), some will reuse existing tools in pretty much the same way they -are already used (e.g., formatting code, which should straightforwardly use -Rustfmt), and some will have totally new ways of using the compiler or other -tools (e.g., code completion). - Modern IDEs are large and complex pieces of software; creating a new one from scratch for Rust would be impractical. Therefore we need to work with existing IDEs (such as Eclipse, IntelliJ, and Visual Studio) to provide functionality. 
These IDEs provide excellent editor and project management support out of the -box, but know nothing about the Rust language. +box, but know nothing about the Rust language. This information must come from +the compiler. An important aspect of IDE support is that response times must be extremely -short. Users expect information as they type. Running normal compilation of an +quick. Users expect some feedback as they type. Running normal compilation of an entire project is far too slow. Furthermore, as the user is typing, the program will not be a valid, complete Rust program. @@ -81,361 +57,223 @@ We expect that an IDE may have its own lexer and parser. This is necessary for the IDE to quickly give parse errors as the user types. Editors are free to rely on the compiler's parsing if they prefer (the compiler will do its own parsing in any case). Further information (name resolution, type information, etc.) will -be provided by the compiler via the oracle. - +be provided by the RLS. -# Detailed design - -## Quick-check compilation - -(See also open questions, below). +## Requirements -We run the quick-check compiler on a single crate. At some point after quick -checking, dependent crates must be rebuilt. This is the responsibility of an -external tool to manage (see below). Quick-check is driven by an IDE (or -possibly by the oracle), not by Cargo. +We stated some requirements in the summary, here we'll cover more detail and the +workflow between IDE and RLS. +The RLS should be safe to use in the face of concurrent actions. For example, +multiple requests for compilation could occur, with later requests occurring +before earlier requests have finished. There could be multiple clients making +requests to the RLS, some of which may mutate its data. The RLS should provide +reliable and consistent responses. 
However, it is not expected that clients are +totally isolated, e.g., if client 1 updates the program, then client 2 requests +information about the program, client 2's response will reflect the changes made +by client 1, even if these are not otherwise known to client 2. -### Incremental and lazy compilation -Incremental compilation is where, rather than re-compiling an entire crate, only -code which is changed and its dependencies are re-compiled. See -[RFC #1298](https://github.com/rust-lang/rfcs/pull/1298). +### Show compilation errors and warnings, updated as the user types -Lazy compilation is where, rather than compiling an entire crate, we start by -compiling a single function (or possibly some other unit of code), and re- -compiling code which is depended on until we are done. Not all of a crate will -be compiled in this fashion. +The IDE will request compilation of the in-memory program. The RLS will compile +the program and asynchronously supply the IDE with errors and warnings. -These two compilation strategies are faster than the current compilation model -(compile everything, every time). They are somewhat orthogonal - compilation can -be either lazy or incremental without implying the other. The [current -proposal](https://github.com/rust-lang/rfcs/pull/1298) for supporting -incremental compilation involves some lazy compilation as an implementation -detail. +### Code completion as the user types -For quick-checking, compilation should be both incremental and lazy. The input -to the compiler is not just the crate being re-compiled, but also the span of -code changed (normal incremental compilation computes this span for itself, but -the IDE already has this information, so it would be wasteful to recompute it). -As a further optimisation, if the IDE can refer to items by an id (such as a -path), then this could be fed to the compiler rather than a code span to save -the compiler the effort of finding an AST node from a code span. 
+The IDE will request compilation of the in-memory program and request code- +completion options for the cursor position. The RLS will compile the program. As +soon as it has enough information for code-completion it will return options to +the IDE. -We begin by computing which code is invalidated by the change (that is, any code -which depends on the changed code). We then re-compile the changed code. -Information which is depended upon is looked up in the saved metadata used for -incremental compilation. When we have re-compiled the changed code, then we -output the result (see below). If there are no fatal errors, then we continue to -compile the rest of the invalidated code. - - -### Compilation output - -The output of compilation is either success or a set of errors (as with today's -compiler, but see below for more detail on error message format). However, since -compilation can continue after returning an initial result, we might produce -further errors (I presume that IDEs provide a mechanism for the compiler to -communicate these asynchronously to the IDE plugin). - -In addition we must produce data to update the oracle, this should be done -directly, without involving the IDE plugin. - -TODO metadata - really? - -Quick-check does not generate executable code or crate metadata. However, it -should (probably) update the metadata used for incremental compilation. - - -### Multiple crates - -Quick check only applies to a single crate, however, after some changes we might -need to re-compile dependent crates. This is the IDE's responsibility. In the -short term we can just trigger a full re-build (via Cargo) when the user starts -editing a file belonging to a different crate (there will obviously be some lag -there). The compiler must also generate crate metadata for the modified crate. - -Long term, the IDE might keep track of the dependency graph between crates -(provided by Cargo). 
The quick-check should signal when a crate's public -interface changes due to re-compilation. In that case the IDE can trigger -background re-compilation of dependent crates (possibly with some -delay/batching). - - -## The Oracle - -The oracle is a long-running tool which takes input from both full builds and -quick-checks, and responds to queries about a Rust program. Of particular note -is that it knows about a whole project, not just a single crate. In fact, other -than as a kind of module, it doesn't much care about the notion of a crate at -all. - -We require a data format for getting metadata from the compiler to the oracle. -Unfortunately none of the existing ones are quite right. Crate metadata is not -complete enough (it mostly only contains data about interfaces, not function -bodies), save-analysis data has been processed too far (basically into strings) -which loses some of the structure that would be useful, debuginfo is not Rust- -centric enough (i.e., does not contain Rust type information) and is based on -expanded source code. Furthermore, serialising any of the compiler's IRs is not -good enough: the AST and HIR do not contain any type or name information, the -HIR and MIR are post-expansion. - -The best option seems to be the save-analysis information. This is in a poor -format, but is the 'right' data (it can be based on an early AST and includes -type and name information). It can be evolved to be more efficient form over the -long run (it has been a low priority task for a while to support different -formats for the information). - -Full builds will generate a dump of save-analysis data for a whole crate. Quick -checks will generate data for the changed code. In both cases the oracle must -incrementally update its knowledge of the source code. How exactly to do this -when neither names nor ids are stable is an interesting question, but too much -detail for this RFC (especially as the implementation of ids in the compiler is -evolving). 
- -For crates which are not built from source (for example the standard library), -authors can choose to distribute the oracle's metadata to allow users to get a -good IDE experience with these crates. In this case, we only need metadata for -interfaces, not the bodies of functions or private items. The oracle should -handle such reduced metadata. It should be possible to generate the oracle's -metadata from the crate metadata, but this is not a short-term goal. (Note this -will require some knowledge in the IDE too - if there is no corresponding source -code, the IDE cannot 'jump to definition', for example). - -The oracle's data is platform-dependent. We must be careful when working with a -cross-compiled project to generate metadata for the target machine. This -shouldn't be a problem for normal compilation, but it means that quick-check -compilation must be configured for the same target, and care should be taken -with downloaded metadata. - -As well as metadata based on types and names, the oracle should keep track of -warnings. Since code with warnings but no errors is not re-compiled, a tool -outside the compiler must track them for display in the IDE. This will be done -by the oracle. - - -### Details - -#### API - -The oracle's API is a set of IPC calls. How exactly these should be implemented -is not clear. The most promising options are sending JSON over TCP, using -[thrift](https://thrift.apache.org/), or using Cap'n Proto (I'm unclear about -exactly what the transport layer looks like using Cap'n Proto, there is no Cap'n -Proto RCP implementation for Java, but I believe there is an alternative using -shared, memory mapped files as a buffer; I'm not familiar enough with the -library to work out what is needed). - -I've detailed the API I believe we'll need to start with. This is slightly more -than a minimal set. I expect it will expand as time goes by. 
At some point we -will want to stabilise parts of the API to allow for third party implementations -of the oracle and compiler. - -All API calls can return success or error results. Many calls involve a *span*; -for the oracle's API, this is defined as two byte offsets from the start of the -file (oracle spans must always be contained in a single file). - -There are some alternative span definitions: we could use file and column indices -rather than byte offsets (this has some edge case difficulties with the -definition of a newline - do unicode newlines count? It also requires some extra -computation), we could use character offsets (again involves some more -computation, but might be more robust). - -A problem is that Visual Studio uses UTF16 while Rust uses UTF8, there is (I -understand) no efficient way to convert between byte counts in these systems. -I'm not sure how to address this. It might require the oracle to be able to -operate in UTF16 mode. +* The RLS should return code-completion options asynchronously to the IDE. + Alternatively, the RLS could block the IDE's request for options. +* The RLS should not filter the code-completion options. For example, if the + user types `foo.ba` where `foo` has available fields `bar` and `qux`, it + should return both these fields, not just `bar`. The IDE can perform its own + filtering since it might want to perform spell checking, etc. Put another way, + the RLS is not a code completion tool, but supplies the low-level data that a + code completion tool uses to provide suggestions. -Where no return value is specified, the call returns success or failure (with a -reason). +### Highlight all references to an item -The philosophy of the API is that most functions should only take a single call, -as opposed to making each function as minimal and orthogonal as possible. This -is because IPC can be slow and response time is important for IDEs.
+The IDE requests all references in the same file based on a position in the +file. The RLS returns a list of spans. +### Find all references to an item -**Projects** +The IDE requests all references based on a position in the file. The RLS returns +a list of spans. -Note that the oracle stores no metadata about a project. +### Jump to definition -*init project* +The IDE requests the definition of an item based on a position in a file. The RLS +returns a list of spans (a list is necessary since, for example, a dynamically +dispatched trait method could be defined in multiple places). -Takes a project name, returns an id string (something close to the project's name). -*delete project* - -Takes a project id. - -*list projects* - -Takes nothing, returns a list of project ids. - -Each of the remaining calls takes a project identifier. - - -**Update** - -See section on input data format below. - -*update* - -Takes input data (actual source code rather than spans since we cannot assume -the user has saved the file) and a list of spans to invalidate. Where there are -no invalidated spans, the update call adds data (which will cause an error if -there are conflicts). Where there is no input data, update just invalidates. +# Detailed design -We might want to allow some shortcuts to invalidate an entire file or -recursively invalidate a directory. +## Architecture +The basic requirements for the architecture of the RLS are that it should be: -**Description** +* reusable by different clients (IDEs, tools, ...), +* fast (we must provide semantic information about a program as the user types), +* handle multi-crate programs, +* consistent (it should handle multiple, potentially mutating, concurrent requests). -*get definition* +The RLS will be a long running daemon process. Communication between the RLS and +an IDE will be via IPC calls (tools (for example, Racer) will also be able to +use the RLS as an in-process library.). The RLS will include the compiler as a +library. 
-Takes a span, returns all 'definitions and declarations' for the identifier -covered by the span. Can return an error if the span does not cover exactly one -identifier or the oracle has no data for an identifier. +The RLS has three main components - the compiler, a database, and a work queue. -The returned data is a list of 'definition' data. That data includes the span -for the item, any documentation for the item, a code snippet for the item, -optionally a type for the item, and one or more kinds of definition (e.g., -'variable definition', 'field definition', 'function declaration'). +The RLS accepts two kinds of requests - compilation requests and queries. It +will also push data to registered programs (generally triggered by compilation +completing). Essentially, all communication with the RLS is asynchronous (when +used as an in-process library, the client will be able to use synchronous +function calls too). -*get references* +The work queue is used to sequentialise requests and ensure consistency of +responses. Both compilation requests and queries are stored in the queue. Some +compilation requests can cause earlier compilation requests to be canceled. +Queries blocked on the earlier compilation then become blocked on the new +request. -Takes a span, returns a list of reference data (or an error). Each datum -consists of the span of the reference and a code snippet. +In the future, we should move queries ahead of compilation requests where +possible. -*get docs* +When compilation completes, the database is updated (see below for more +details). All queries are answered from the database. The database has data for +the whole project, not just one crate. This also means we don't need to keep the +compiler's data in memory. -Takes a span, returns the same data as *get definition* but limited to doc strings. -*get type* +## Compilation -Takes a span, returns the same data as *get definition* but limited to type information. 
+The RLS is somewhat parametric in its compilation model. Theoretically, it could +run a full compile on the requested crate, however this would be too slow in +practice. -Question: are these useful/necessary? Or should users just call *get definition*? +The general procedure is that the IDE (or other client) requests that the RLS +compile a crate. It is up to the IDE to interact with Cargo (or some other +build system) in order to produce the correct build command and to ensure that +any dependencies are built. -*search for identifier* +Initially, the RLS will do a standard incremental compile on the specified +crate. See [RFC PR 1298](https://github.com/rust-lang/rfcs/pull/1298) for more +details on incremental compilation. -Takes a search string or an id, and a struct of search parameters including case -sensitivity, the scope of the search, and the kind of items to search (e.g., -functions, traits, all items). Returns a list of spans and code snippets. +I see two ways to improve compilation times: lazy compilation and keeping the +compiler in memory. We might also experiment with having the IDE specify which +parts of the program have changed, rather than having the compiler compute this. +### Lazy compilation -**Code completion** +With lazy compilation the IDE requests that a specific item is compiled, rather +than the whole program. The compiler compiles this function compiling other +items only as necessary to compile the requested item. -*get suggestions* +Lazy compilation should also be incremental - an item is only compiled if +required *and* if it has changed. -Takes a span (note that this span could be empty, e.g, for `foo.` we would use -the empty span which starts after the `.`; for `foo.b` we would use the span for -`b`), and returns a list of suggestions (is this useful? Is there any difference -from just using the caret position?). 
Each suggestion consists of the text for -completion plus the same information as returned for the *get definition* call. +Obviously, we could miss some errors with pure lazy compilation. To address this +the RLS schedules both a lazy and a full (but still incremental) compilation. +The advantage of this approach is that many queries scheduled after compilation +can be performed after the lazy compilation, but before the full compilation. +### Keeping the compiler in memory -#### Input data format +There are still overheads with the incremental compilation approach. We must +start up the compiler, initialising its data structures, we must parse the whole +crate, and we must read the incremental compilation data and metadata from disk. -The precise serialisation format of the oracle's input data will likely change -over time. At first, I propose we use csv, since that is what save-analysis -currently supports, and there is good decoding support for Rust. Longer term we -should use a binary format for more efficient serialisation and deserialisation. +If we can keep the compiler in memory, we avoid these costs. -Each datum consists of an identifier, a kind, a span, and a set of fields (the -exact fields are dependent on the kind of data). +However, this would require some significant refactoring of the compiler. There +is currently no way to invalidate data the compiler has already computed. It +also becomes difficult to cancel compilation: if we receive two compile requests +in rapid succession, we may wish to cancel the first compilation before it +finishes, since it will be wasted work. This is currently easy - the compilation +process is killed and all data released. However, if we want to keep the +compiler in memory we must invalidate some data and ensure the compiler is in a +consistent state. -If the datum is for a definition (of a trait, struct, etc.), then the identifier -is an absolute path (including the crate) to that definition.
Question: how to -identify impls - do we need to distinguish multiple impls for the same trait and -data type? -For statements and expressions, the identifier is a path to the expression's -function (or static/const) and a function relative id. Note that this means we -have to invalidate an entire function at a time (or at least all of the function -after the edited portion). It would be nice if we could avoid this and be more -fine-grained about invalidation, any ideas? +### Compilation output -I propose that we follow the save-analysis data format to start with (in terms -of the kinds of data available and the fields for each). However, we should use -identifiers rather than DefIds and distinguish fields from variables. +Once compilation is finished, the RLS's database must be updated. Errors and +warnings produced by the compiler are stored in the database. Information from +name resolution and type checking is stored in the database (exactly which +information will grow with time). The analysis information will be provided by +the save-analysis API. +The compiler will also provide data on which (old) code has been invalidated. +Any information (including errors) in the database concerning this code is +removed before the new data is inserted. -### Racer -The oracle fulfills a similar role to -[Racer](https://github.com/phildawes/racer). Indeed, forking Racer may be a good -way to start development of the oracle. The oracle should provide more -information and should be more accurate by being more closely integrated with -the compiler. +### Multiple crates -Racer could be refactored to be a client of the oracle, thus taking advantage of -more accurate data and a simpler implementation, whilst maintaining its -interface. This would be a nice way to make the oracle's data available to less -sophisticated editors. 
Alternatively, Racer could make use of the oracle's -metadata but do its own processing of that data to provide an alternate -implementation of an oracle. +The RLS does not track dependencies, nor much crate information. However, it +will be asked to compile many crates and it will keep track of which crate data +belongs to. It will also keep track of which crates belong to a single program +and will not share data between programs, even if the same crate is shared. This +helps avoid versioning issues. -### DXR and Rustdoc +## Versioning -Both DXR and Rustdoc could be rewritten to talk to the oracle and run in a live -mode, rather than maintaining their own pre-processed data. This would have some -benefit in keeping these resources up to date as programs are edited (and -reducing the number of ways for doing essentially the same thing). However, this -does not seem like enough motivation to actually do the work. Could be an -interesting student project or something. +The RLS will be released using the same train model as Rust. A version of the +RLS is pinned to a specific version of Rust. If users want to operate with +multiple versions, they will need multiple versions of the RLS (I hope we can +extend multirust/rustup.rs to handle the RLS as well as Rust). # Drawbacks -It's a lot of work. On the other hand the largest changes are desirable for -general improvements in compilation speed or for other tools. +It's a lot of work. But better we do it once than each IDE doing it themselves, +or having sub-standard IDE support. # Alternatives -The oracle and quick-check compiler could be combined in a single tool. This -might be more efficient, but would increase complexity and decrease opportunity -for third party alternatives. - -The oracle could do more - actually perform some of the processing tasks usually +The big design choice here is using a database rather than the compiler's data +structures. 
The primary motivation for this is the 'find all references' +requirement. References could be in multiple crates, so we would need to reload +incremental compilation data (which must include the serialised MIR, or +something equivalent) for all crates, then search this data for matching +identifiers. Assuming the serialisation format is not too complex, this should +be possible in a reasonable amount of time. Since identifiers might be in +function bodies, we can't rely on metadata. + +This is a reasonable alternative, and may be simpler than the database approach. +However, it is not planned to output this data in the near future (the initial +plan for incremental compilation is to not store information required to re- +check function bodies). This approach might be too slow for very large projects, +we might wish to do searches in the future that cannot be answered without doing +the equivalent of a database join, and the database simplifies questions about +concurrent accesses. + +We could only provide the RLS as a library, rather than providing an API via +IPC. An IPC interface allows a single instance of the RLS to service multiple +programs, is language-agnostic, and allows for easy asynchronous-ness between +the RLS and its clients. It also provides isolation - a panic in the RLS will +not cause the IDE to crash, nor can a long-running operation delay the IDE. Most +of these advantages could be captured using threads. However, the cost of +implementing an IPC interface is fairly low and means less effort for clients, +so it seems worthwhile to provide. + +The RLS could do more - actually perform some of the processing tasks usually done by IDEs (such as editing source code) or other tools (refactoring, reformating, etc.). -Should the oracle hide the quick-check compiler? I.e., the IDE talks only to the -oracle and the oracle requests compilation as needed. This might make things a -bit simpler for the IDE and means less IPC overhead and complexity.
Either the -oracle could be responsible for all coordination, or the IDE could remain -responsible for coordinating when crates are handled, and the oracle is -responsible for coordinating calls to the quick check compiler to build a single -crate. - # Unresolved questions -Should the quick-check compilation be provided by a separate tool or a mode of -the compiler? It is fairly different in its operation from the compiler. It -might be better to provide a different 'frontend' rather than adding many more -options to the compiler. (I think the answer is 'yes'). - -Should quick-check be a long running process? It could save some time by not -having to reload metadata, but having to keep metadata for an entire project in -memory would be expensive. We could perhaps compromise by unloading when the -user needs to recompile a different crate. I believe it is probably better in -the long run, but a batch process is OK to start with. - -How and when should we generate crate metadata. It seems sensible to generate -this when we switch to editing/re-compiling a different crate. However, it's not -clear if this must be done from scratch or if it can be produced from the -incremental compilation metadata (see that RFC, I guess). - -What should we call the oracle tool? I don't particularly like "oracle", -although it is descriptive (it comes from the Go tool of the same name). -Alternatives are 'Rider', 'Racer Server', or anything you can think of. - -How do we handle different versions of Rust and interact with multi-rust? -Upgrades to the next stable version of Rust? - -Do we need to standardise error messages for the various parsers to prevent user -confusion (i.e., try to ensure that rustc and the various IDEs give the same -error messages). +A problem is that Visual Studio uses UTF16 while Rust uses UTF8, there is (I +understand) no efficient way to convert between byte counts in these systems. +I'm not sure how to address this. 
It might require the RLS to be able to operate +in UTF16 mode. From 2f9a2a1887019fb70b0522f191345a8670d47563 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Fri, 15 Jan 2016 09:26:32 +1300 Subject: [PATCH 0680/1195] Add some more text to alternatives and unanswered questions --- text/0000-ide.md | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/text/0000-ide.md b/text/0000-ide.md index 64a1e8c31f7..a8b2bca3caf 100644 --- a/text/0000-ide.md +++ b/text/0000-ide.md @@ -266,6 +266,14 @@ of these advantages could be captured using threads. However, the cost of implementing an IPC interface is fairly low and means less effort for clients, so it seems worthwhile to provide. +Extending this idea, we could do less than the RLS - provide a high-level +library API for the Rust compiler and let other projects do the rest. In +particular, Racer does an excellent job at providing the information the RLS +would provide without much information from the compiler. This is certainly less +work for the compiler team and more flexible for clients. On the other hand, it +means more work for clients and possible fragmentation. Duplicated effort means +that different clients will not benefit from each other's innovations. + The RLS could do more - actually perform some of the processing tasks usually done by IDEs (such as editing source code) or other tools (refactoring, reformating, etc.). @@ -276,4 +284,15 @@ reformating, etc.). A problem is that Visual Studio uses UTF16 while Rust uses UTF8, there is (I understand) no efficient way to convert between byte counts in these systems. I'm not sure how to address this. It might require the RLS to be able to operate -in UTF16 mode. +in UTF16 mode. This is only a problem with byte offsets in spans, not with +row/column data (the RLS will supply both). It may be possible for Visual Studio +to just use the row/column data, or convert inefficiently to UTF16. 
I guess the +question comes down to should this conversion be done in the RLS or the client. +I think we should start assuming the client, and perhaps adjust course later. + +What kind of IPC protocol to use? HTTP is popular and simple to deal with. It's +platform-independent and used in many similar pieces of software. On the other +hand it is heavyweight and requires pulling in large libraries, and requires +some attention to security issues. Alternatives are some kind of custom +protocol, or using a solution like Thrift. My preference is for HTTP, since it +has been proven in similar situations. From 37d72bec6b93958a87e88fa10b861ac4cb955ce2 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Fri, 15 Jan 2016 09:29:26 +1300 Subject: [PATCH 0681/1195] Add text about dirty buffers --- text/0000-ide.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/text/0000-ide.md b/text/0000-ide.md index a8b2bca3caf..5e41fe2163e 100644 --- a/text/0000-ide.md +++ b/text/0000-ide.md @@ -166,6 +166,10 @@ Initially, the RLS will do a standard incremental compile on the specified crate. See [RFC PR 1298](https://github.com/rust-lang/rfcs/pull/1298) for more details on incremental compilation. +The crate being compiled should include any modifications made in the client and +not yet committed to a file (e.g., changes the IDE has in memory). The client +should pass such changes to the RLS along with the compilation request. + I see two ways to improve compilation times: lazy compilation and keeping the compiler in memory. We might also experiment with having the IDE specify which parts of the program have changed, rather than having the compiler compute this. From c39a7cb55ecad0f15344307362d52b8d80b85a3e Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 15 Jan 2016 00:28:01 +0100 Subject: [PATCH 0682/1195] Add `[` to the FOLLOW(ty) in macro future-proofing rules.
--- text/0550-macro-future-proofing.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0550-macro-future-proofing.md b/text/0550-macro-future-proofing.md index 230a146b209..17d9a990fd1 100644 --- a/text/0550-macro-future-proofing.md +++ b/text/0550-macro-future-proofing.md @@ -414,7 +414,7 @@ The current legal fragment specifiers are: `item`, `block`, `stmt`, `pat`, - `FOLLOW(pat)` = `{FatArrow, Comma, Eq, Or, Ident(if), Ident(in)}` - `FOLLOW(expr)` = `{FatArrow, Comma, Semicolon}` -- `FOLLOW(ty)` = `{OpenDelim(Brace), Comma, FatArrow, Colon, Eq, Gt, Semi, Or, Ident(as), Ident(where)}` +- `FOLLOW(ty)` = `{OpenDelim(Brace), Comma, FatArrow, Colon, Eq, Gt, Semi, Or, Ident(as), Ident(where), OpenDelim(Bracket)}` - `FOLLOW(stmt)` = `FOLLOW(expr)` - `FOLLOW(path)` = `FOLLOW(ty)` - `FOLLOW(block)` = any token From 6fef0b3439025e95a44e2e220a604860338c8ba7 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 15 Jan 2016 00:56:23 +0100 Subject: [PATCH 0683/1195] added self-referential note to the Edit History for RFC 550. --- text/0550-macro-future-proofing.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/text/0550-macro-future-proofing.md b/text/0550-macro-future-proofing.md index 17d9a990fd1..b7cc1de19cb 100644 --- a/text/0550-macro-future-proofing.md +++ b/text/0550-macro-future-proofing.md @@ -481,6 +481,9 @@ reasonable freedom and can be extended in the future. [RFC issue 1336]: https://github.com/rust-lang/rfcs/issues/1336 +- Updated by https://github.com/rust-lang/rfcs/pull/1462, which added + open square bracket into the follow set for types. + # Appendices ## Appendix A: Algorithm for recognizing valid matchers. 
From a2968d41857aaf61b3931409c030f3938adc3077 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 15 Jan 2016 14:45:49 -0500 Subject: [PATCH 0684/1195] merge and rename RFC #1331 --- ...0-grammar-is-canonical.md => 1331-grammar-is-canonical.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-grammar-is-canonical.md => 1331-grammar-is-canonical.md} (96%) diff --git a/text/0000-grammar-is-canonical.md b/text/1331-grammar-is-canonical.md similarity index 96% rename from text/0000-grammar-is-canonical.md rename to text/1331-grammar-is-canonical.md index 17eaccd55e2..e88f690d3ae 100644 --- a/text/0000-grammar-is-canonical.md +++ b/text/1331-grammar-is-canonical.md @@ -1,7 +1,7 @@ - Feature Name: grammar - Start Date: 2015-10-21 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1331](https://github.com/rust-lang/rfcs/pull/1331) +- Rust Issue: [rust-lang/rust#30942](https://github.com/rust-lang/rust/issues/30942) # Summary [summary]: #summary From 2025389e1084a7c74772b33d27aa93e53ecf2cc8 Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Fri, 15 Jan 2016 21:50:12 -0500 Subject: [PATCH 0685/1195] Add amendments section to note change. At @nikomatsakis request. --- text/1192-inclusive-ranges.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/text/1192-inclusive-ranges.md b/text/1192-inclusive-ranges.md index de512694888..7bef8c643c1 100644 --- a/text/1192-inclusive-ranges.md +++ b/text/1192-inclusive-ranges.md @@ -108,3 +108,8 @@ The `Empty` variant could be omitted, leaving two options: # Unresolved questions None so far. + +# Amendments + +* In rust-lang/rfcs#1320, this RFC was amended to change the `RangeInclusive` + type from a struct with a `finished` field to an enum. 
From 4f203e9fb3b2dee1001f49ad50af1dfbfd4b37c8 Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Mon, 18 Jan 2016 04:26:55 +0000 Subject: [PATCH 0686/1195] Stabilize volatile read and write --- text/0000-volatile.md | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) create mode 100644 text/0000-volatile.md diff --git a/text/0000-volatile.md b/text/0000-volatile.md new file mode 100644 index 00000000000..55c5d73ab91 --- /dev/null +++ b/text/0000-volatile.md @@ -0,0 +1,34 @@ +- Feature Name: volatile +- Start Date: 2016-01-18 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Stabilize the `volatile_load` and `volatile_store` intrinsics as `ptr::volatile_read` and `ptr::volatile_write`. + +# Motivation +[motivation]: #motivation + +This is necessary to allow volatile access to memory-mapping I/O in stable code. Currently this is only possible using unstable intrinsics, or by abusing a bug in the `load` and `store` functions on atomic types which gives them volatile semantics ([rust-lang/rust#30962](https://github.com/rust-lang/rust/pull/30962)). + +# Detailed design +[design]: #detailed-design + +`ptr::volatile_read` and `ptr::volatile_write` will work the same way as `ptr::read` and `ptr::write` respectively, except that the memory access will be done with volatile semantics. The semantics of a volatile access are already pretty well defined by the C standard and by LLVM. + +# Drawbacks +[drawbacks]: #drawbacks + +None. + +# Alternatives +[alternatives]: #alternatives + +We could also stabilize the `volatile_set_memory`, `volatile_copy_memory` and `volatile_copy_nonoverlapping_memory` intrinsics as `ptr::volatile_write_bytes`, `ptr::volatile_copy` and `ptr::volatile_copy_nonoverlapping`, but these are not as widely used and are not available in C. + +# Unresolved questions +[unresolved]: #unresolved-questions + +None. 
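The proposed functions mirror `ptr::read` and `ptr::write`, so a usage sketch is short. The example below is only an illustration, not part of the RFC text: it uses the names the standard library provides today, `ptr::read_volatile` and `ptr::write_volatile`, rather than the `volatile_read`/`volatile_write` spelling in this draft, and the helper `volatile_roundtrip` and its local `slot` are hypothetical stand-ins for a real memory-mapped register.

```rust
use std::ptr;

/// Hypothetical helper: writes `value` through a raw pointer with volatile
/// semantics and reads it back the same way. A local variable stands in for
/// a device register here; real MMIO code would use a fixed device address.
fn volatile_roundtrip(value: u32) -> u32 {
    let mut slot: u32 = 0;
    let p: *mut u32 = &mut slot;
    unsafe {
        // The compiler may not elide or merge these two accesses.
        ptr::write_volatile(p, value);
        ptr::read_volatile(p)
    }
}
```

Both calls are `unsafe` for the same reasons as `ptr::read` and `ptr::write`: the caller has to guarantee that the pointer is valid and properly aligned.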
From c826e6713cc7e25a8f86cb3736b6905671ae0ff3 Mon Sep 17 00:00:00 2001 From: Josh Triplett Date: Mon, 18 Jan 2016 13:14:38 -0800 Subject: [PATCH 0687/1195] Elaborate on the interaction between unions and Drop --- text/0000-untagged_union.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/text/0000-untagged_union.md b/text/0000-untagged_union.md index eff2838e12f..d68a95ea3ec 100644 --- a/text/0000-untagged_union.md +++ b/text/0000-untagged_union.md @@ -230,8 +230,11 @@ a field, should cause the compiler to treat the entire union as initialized. A union may have trait implementations, using the same syntax as a struct. -The compiler should produce an error if a union field has a type that -implements the `Drop` trait. +The compiler should provide a lint if a union field has a type that implements +the `Drop` trait. The compiler may optionally provide a pragma to disable that +lint, for code that intentionally stores a type with Drop in a union. The +compiler must never implicitly generate a Drop implementation for the union +itself, though Rust code may explicitly implement Drop for a union type. ## Unions and undefined behavior From 2e5ef18b235dcbddbb19f2b513cd11afed0b0f02 Mon Sep 17 00:00:00 2001 From: Josh Triplett Date: Mon, 18 Jan 2016 16:44:52 -0800 Subject: [PATCH 0688/1195] Change syntax to "union! U { ... }" This provides the clean syntax of a keyword, without breaking any existing code, and without attaching expectations based on the semantics or syntax of some existing keyword such as "struct" or "enum". 
--- .../{0000-untagged_union.md => 0000-union.md} | 48 ++++++++++--------- 1 file changed, 25 insertions(+), 23 deletions(-) rename text/{0000-untagged_union.md => 0000-union.md} (87%) diff --git a/text/0000-untagged_union.md b/text/0000-union.md similarity index 87% rename from text/0000-untagged_union.md rename to text/0000-union.md index d68a95ea3ec..5769022f3ff 100644 --- a/text/0000-untagged_union.md +++ b/text/0000-union.md @@ -1,4 +1,4 @@ -- Feature Name: `untagged_union` +- Feature Name: `union` - Start Date: 2015-12-29 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) @@ -6,8 +6,8 @@ # Summary [summary]: #summary -Provide native support for C-compatible unions, defined via a new keyword -`untagged_union`. +Provide native support for C-compatible unions, defined via a built-in syntax +macro `union!`. # Motivation [motivation]: #motivation @@ -28,10 +28,14 @@ space-efficient or cache-efficient structures relying on value representation, such as machine-word-sized unions using the least-significant bits of aligned pointers to distinguish cases. -The syntax proposed here avoids reserving `union` as the new keyword, as -existing Rust code already uses `union` for other purposes, including [multiple -functions in the standard -library](https://doc.rust-lang.org/std/?search=union). +The syntax proposed here avoids reserving a new keyword (such as `union`), and +thus will not break any existing code. This syntax also avoids adding a pragma +to some existing keyword that doesn't quite fit, such as `struct` or `enum`, +which avoids attaching any of the semantic significance of those keywords to +this new construct. Rust does not produce an error or warning about the +redefinition of a macro already defined in the standard library, so the +proposed syntax will not even break code that currently defines a macro named +`union!`. To preserve memory safety, accesses to union fields may only occur in `unsafe` code. 
Commonly, code using unions will provide safe wrappers around unsafe @@ -43,17 +47,16 @@ union field accesses. ## Declaring a union type A union declaration uses the same field declaration syntax as a `struct` -declaration, except with the keyword `untagged_union` in place of `struct`: +declaration, except with `union!` in place of `struct`. ```rust -untagged_union MyUnion { +union! MyUnion { f1: u32, f2: f32, } ``` -`untagged_union` implies `#[repr(C)]` as the default representation, making -`#[repr(C)] untagged_union` permissible but redundant. +`union!` implies `#[repr(C)]` as the default representation. ## Instantiating a union @@ -122,15 +125,14 @@ a union field without matching a specific value makes an irrefutable pattern. Both require unsafe code. Pattern matching may match a union as a field of a larger structure. In -particular, when using an `untagged_union` to implement a C tagged union via -FFI, this allows matching on the tag and the corresponding field -simultaneously: +particular, when using a Rust union to implement a C tagged union via FFI, this +allows matching on the tag and the corresponding field simultaneously: ```rust #[repr(u32)] enum Tag { I, F } -untagged_union U { +union! U { i: i32, f: f32, } @@ -164,7 +166,7 @@ entire union, such that any borrow conflicting with a borrow of the union containing the union) will produce an error. ```rust -untagged_union U { +union! U { f1: u32, f2: f32, } @@ -192,7 +194,7 @@ struct S { y: u32, } -untagged_union U { +union! U { s: S, both: u64, } @@ -252,7 +254,7 @@ size of any of its fields, and the maximum alignment of any of its fields. Note that those maximums may come from different fields; for instance: ```rust -untagged_union U { +union! U { f1: u16, f2: [u8; 4], } @@ -282,11 +284,11 @@ of unsafe code. structure-like. 
The implementation and use of such macros provides strong motivation to seek a better solution, and indeed existing writers and users of such macros have specifically requested native syntax in Rust. -- Define unions without a new keyword `untagged_union`, such as via - `#[repr(union)] struct`. This would avoid any possibility of breaking - existing code that uses the keyword, but would make declarations more - verbose, and introduce potential confusion with `struct` (or whatever - existing construct the `#[repr(union)]` attribute modifies). +- Define unions via a pragma modifying an existing keyword, such as via + `#[repr(union)] struct`. Like the macro approach, this avoids breaking + existing code via a new keyword. However, this would make declarations more + verbose and noisy, and would introduce potential confusion with `struct` (or + whatever existing construct the pragma modified). - Use a compound keyword like `unsafe union`, while not reserving `union` on its own as a keyword, to avoid breaking use of `union` as an identifier. Potentially more appealing syntax, if the Rust parser can support it. From 9b4b8af040156a5773e15240f36fbadc7400e396 Mon Sep 17 00:00:00 2001 From: Josh Triplett Date: Mon, 18 Jan 2016 17:18:44 -0800 Subject: [PATCH 0689/1195] Rewrite alternatives as prose, and expand --- text/0000-union.md | 98 +++++++++++++++++++++++++++++++--------------- 1 file changed, 67 insertions(+), 31 deletions(-) diff --git a/text/0000-union.md b/text/0000-union.md index 5769022f3ff..c70beb1d38e 100644 --- a/text/0000-union.md +++ b/text/0000-union.md @@ -277,37 +277,73 @@ of unsafe code. # Alternatives [alternatives]: #alternatives -- Don't do anything, and leave users of FFI interfaces with unions to continue - writing complex platform-specific transmute code. -- Create macros to define unions and access their fields. However, such macros - make field accesses and pattern matching look more cumbersome and less - structure-like. 
The implementation and use of such macros provides strong - motivation to seek a better solution, and indeed existing writers and users - of such macros have specifically requested native syntax in Rust. -- Define unions via a pragma modifying an existing keyword, such as via - `#[repr(union)] struct`. Like the macro approach, this avoids breaking - existing code via a new keyword. However, this would make declarations more - verbose and noisy, and would introduce potential confusion with `struct` (or - whatever existing construct the pragma modified). -- Use a compound keyword like `unsafe union`, while not reserving `union` on - its own as a keyword, to avoid breaking use of `union` as an identifier. - Potentially more appealing syntax, if the Rust parser can support it. -- Use a new operator to access union fields, rather than the same `.` operator - used for struct fields. This would make union fields more obvious at the - time of access, rather than making them look syntactically identical to - struct fields despite the semantic difference in storage representation. -- The [unsafe enum](https://github.com/rust-lang/rfcs/pull/724) proposal: - introduce untagged enums, identified with `unsafe enum`. Pattern-matching - syntax would make field accesses significantly more verbose than structure - field syntax. -- The [unsafe enum](https://github.com/rust-lang/rfcs/pull/724) proposal with - the addition of struct-like field access syntax. The resulting field access - syntax would look much like this proposal; however, pairing an enum-style - definition with struct-style usage seems confusing for developers. An - enum-based declaration leads users to expect enum-like syntax; a new - construct distinct from both enum and struct does not lead to such - expectations, and developers used to C unions will expect struct-like field - access for unions. +This proposal has a substantial history, with many variants and alternatives +prior to the current macro-based syntax. 
Thanks to many people in the Rust +community for helping to refine this RFC. + +As an alternative to the macro syntax, Rust could support unions via a new +keyword instead. However, any introduction of a new keyword will necessarily +break some code that previously compiled, such as code using the keyword as an +identifier. Using `union` as the keyword would break the substantial volume of +existing Rust code using `union` for other purposes, including [multiple +functions in the standard +library](https://doc.rust-lang.org/std/?search=union). Another keyword such as +`untagged_union` would reduce the likelihood of breaking code in practice; +however, in the absence of an explicit policy for introducing new keywords, +this RFC opts to not propose a new keyword. + +To avoid breakage caused by a new reserved keyword, Rust could use a compound +keyword like `unsafe union` (currently not legal syntax in any context), while +not reserving `union` on its own as a keyword, to avoid breaking use of `union` +as an identifier. This provides equally reasonable syntax, but potentially +introduces more complexity in the Rust parser. + +In the absence of a new keyword, since unions represent unsafe, untagged sum +types, and enum represents safe, tagged sum types, Rust could base unions on +enum instead. The [unsafe enum](https://github.com/rust-lang/rfcs/pull/724) +proposal took this approach, introducing unsafe, untagged enums, identified +with `unsafe enum`; further discussion around that proposal led to the +suggestion of extending it with struct-like field access syntax. Such a +proposal would similarly eliminate explicit use of `std::mem::transmute`, and +avoid the need to handle platform-specific size and alignment requirements for +fields. + +The standard pattern-matching syntax of enums would make field accesses +significantly more verbose than struct-like syntax, and in particular would +typically require more code inside unsafe blocks. 
Adding struct-like field +access syntax would avoid that; however, pairing an enum-like definition with +struct-like usage seems confusing for developers. A declaration using `enum` +leads users to expect enum-like syntax; a new construct distinct from both +`enum` and `struct` avoids leading users to expect any particular syntax or +semantics. Furthermore, developers used to C unions will expect struct-like +field access for unions. + +Since this proposal uses struct-like syntax for declaration, initialization, +pattern matching, and field access, the original version of this RFC used a +pragma modifying the `struct` keyword: `#[repr(union)] struct`. However, while +the proposed unions match struct syntax, they do not share the semantics of +struct; most notably, unions represent a sum type, while structs represent a +product type. The new construct `union!` avoids the semantics attached to +existing keywords. + +In the absence of any native support for unions, developers of existing Rust +code have resorted to either complex platform-specific transmute code, or +complex union-definition macros. In the latter case, such macros make field +accesses and pattern matching look more cumbersome and less structure-like, and +still require detailed platform-specific knowledge of structure layout and +field sizes. The implementation and use of such macros provides strong +motivation to seek a better solution, and indeed existing writers and users of +such macros have specifically requested native syntax in Rust. + +Finally, to call more attention to reads and writes of union fields, field +access could use a new access operator, rather than the same `.` operator used +for struct fields. This would make union fields more obvious at the time of +access, rather than making them look syntactically identical to struct fields +despite the semantic difference in storage representation. 
However, this does +not seem worth the additional syntactic complexity and divergence from other +languages. Union field accesses already require unsafe blocks, which calls +attention to them. Calls to unsafe functions use the same syntax as calls to +safe functions. # Unresolved questions [unresolved]: #unresolved-questions From 17feb14875594aa8add679b29e0045829686d937 Mon Sep 17 00:00:00 2001 From: Josh Triplett Date: Mon, 18 Jan 2016 17:19:18 -0800 Subject: [PATCH 0690/1195] Remove unnecessary backquotes. --- text/0000-union.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-union.md b/text/0000-union.md index c70beb1d38e..b3244f78b82 100644 --- a/text/0000-union.md +++ b/text/0000-union.md @@ -37,7 +37,7 @@ redefinition of a macro already defined in the standard library, so the proposed syntax will not even break code that currently defines a macro named `union!`. -To preserve memory safety, accesses to union fields may only occur in `unsafe` +To preserve memory safety, accesses to union fields may only occur in unsafe code. Commonly, code using unions will provide safe wrappers around unsafe union field accesses. @@ -46,7 +46,7 @@ union field accesses. ## Declaring a union type -A union declaration uses the same field declaration syntax as a `struct` +A union declaration uses the same field declaration syntax as a struct declaration, except with `union!` in place of `struct`. ```rust From c123950b0251a24f0a543932fa190ec9e8e46a21 Mon Sep 17 00:00:00 2001 From: Josh Triplett Date: Mon, 18 Jan 2016 17:21:11 -0800 Subject: [PATCH 0691/1195] Add example about pattern match on union field with smaller size than union. 
--- text/0000-union.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-union.md b/text/0000-union.md index b3244f78b82..4c0858e4634 100644 --- a/text/0000-union.md +++ b/text/0000-union.md @@ -156,7 +156,9 @@ fn is_zero(v: Value) -> bool { Note that a pattern match on a union field that has a smaller size than the entire union must not make any assumptions about the value of the union's -memory outside that field. +memory outside that field. For example, if a union contains a `u8` and a +`u32`, matching on the `u8` may not perform a `u32`-sized comparison over the +entire union. ## Borrowing union fields From 98a5809eec752ad48a042cddfa57d6dbd46e673c Mon Sep 17 00:00:00 2001 From: Josh Triplett Date: Mon, 18 Jan 2016 17:24:15 -0800 Subject: [PATCH 0692/1195] Reduce the size of unsafe blocks --- text/0000-union.md | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/text/0000-union.md b/text/0000-union.md index 4c0858e4634..00c7301db49 100644 --- a/text/0000-union.md +++ b/text/0000-union.md @@ -181,9 +181,7 @@ fn test() { // let b2 = &mut u.f2; // This would produce an error *b1 = 5; } - unsafe { - assert_eq!(u.f1, 5); - } + assert_eq!(unsafe { u.f1 }, 5); } ``` @@ -211,10 +209,8 @@ fn test() { *bx = 5; *by = 10; } - unsafe { - assert_eq!(u.s.x, 5); - assert_eq!(u.s.y, 10); - } + assert_eq!(unsafe { u.s.x }, 5); + assert_eq!(unsafe { u.s.y }, 10); } ``` From 55786e7e066574d88cbef8f624d836a77e1de857 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 15 Jan 2016 14:40:15 -0500 Subject: [PATCH 0693/1195] start adding structural_match --- text/0000-restrict-constants-in-patterns.md | 168 ++++++++++++-------- 1 file changed, 104 insertions(+), 64 deletions(-) diff --git a/text/0000-restrict-constants-in-patterns.md b/text/0000-restrict-constants-in-patterns.md index c37e7d209ee..6c1ec5f94b9 100644 --- a/text/0000-restrict-constants-in-patterns.md +++ b/text/0000-restrict-constants-in-patterns.md @@ -6,22 +6,43 
@@ # Summary [summary]: #summary -Feature-gate the use of constants in patterns unless those constants -have simple types, like integers, booleans, and characters. The -semantics of constants in general were never widely discussed and the -compiler's current implementation is not broadly agreed upon (though -it has many proponents). The intention of adding a feature-gate is to -give us time to discuss and settle on the desired semantics in an -"affirmative" way. - -Because the compiler currently accepts a larger set of constants, this -is a backwards incompatible change. This is justified as part of the -["underspecified language semantics" clause of RFC 1122][ls]. A -[crater run] found 14 regressions on crates.io, which suggests that -the impact of this change on real code would be minimal. - -Note: this was also discussed on an [internals thread]. Major points -from that thread are summarized either inline or in alternatives. +The current compiler implements a more expansive semantics for pattern +matching than was originally intended. This RFC introduces several +mechanisms to rein in these semantics without actually breaking +(much, if any) extant code: + +- Introduce a feature-gated attribute `#[structural_match]` which can + be applied to a struct or enum `T` to indicate that constants of + type `T` can be used within patterns. +- Have `#[derive(Eq)]` automatically apply this attribute to + the struct or enum that it decorates. **Automatically inserted attributes + do not require use of the feature gate.** +- When expanding constants of struct or enum type into equivalent + patterns, require that the struct or enum type is decorated with + `#[structural_match]`. Constants of builtin types are always + expanded.
+ +The practical effect of these changes will be to prevent the use of +constants in patterns unless the type of those constants is either a +built-in type (like `i32` or `&str`) or a user-defined type for +which `Eq` is **derived** (not merely *implemented*). + +To be clear, this `#[structural_match]` attribute is **never intended +to be stabilized**. Rather, the intention of this change is to +restrict constant patterns to those cases that everyone can agree on +for now. We can then have further discussion to settle the best +semantics in the long term. + +Because the compiler currently accepts arbitrary constant patterns, +this is technically a backwards incompatible change. However, the +design of the RFC means that existing code that uses constant patterns +will generally "just work". The justification for this change is that +it is covered by the +["underspecified language semantics" clause, as described in RFC 1122][ls]. + +**Note:** this was also discussed on an [internals thread]. Major +points from that thread are summarized either inline or in +alternatives. [ls]: https://github.com/rust-lang/rfcs/blob/master/text/1122-language-semver.md#underspecified-language-semantics [crater run]: https://gist.github.com/nikomatsakis/26096ec2a2df3c1fb224 @@ -391,51 +412,54 @@ variant* is distinct from matching against a *constant*. # Detailed design [design]: #detailed-design -Define the set of builtin types `B` as follows: +The goal of this RFC is not to decide between semantic and structural +equality. Rather, the goal is to restrict pattern matching to that subset +of types where the two variants behave roughly the same. -``` -B = i8 | i16 | i32 | i64 | isize // signed integers - | u8 | u16 | u32 | u64 | usize // unsigned integers - | char // characters - | bool // booleans - | (B, ..., B) // tuples of builtin types -``` +### The structural match attribute -Any constants appearing in a pattern whose type is not a member of `B` -will be feature-gated.
This feature-gate will be phased in using a -deprecation cycle, as usual. +We will introduce an attribute `#[structural_match]` which can be +applied to struct and enum types. Explicit use of this attribute will +(naturally) be feature-gated. When converting a constant value into a +pattern, if the constant is of struct or enum type, we will check +whether this attribute is present on the struct -- if so, we will +convert the value as we do today. If not, we will report an error that +the struct/enum value cannot be used in a pattern. -# Drawbacks -[drawbacks]: #drawbacks +### Behavior of `#[derive(Eq)]` -This is a breaking change, which means some people will have to change -their code. Moreover, code that is currently using constants of disallowed -types becomes slightly more verbose. For example: +When deriving the `Eq` trait, we will add the `#[structural_match]` to +the type in question. Attributes added in this way will be **exempt from +the feature gate**. -```rust -match foo { - Some(CONSTANT) => ..., - None => ..., -} -``` +### Phasing -would now be written: +We will not make this change instantaneously. Rather, for at least one +release cycle, users who are pattern matching on struct types that +lack `#[structural_match]` will be warned about imminent breakage. -```rust -match foo { - Some(v) if v == CONSTANT => ..., - None => ..., -} -``` +# Drawbacks +[drawbacks]: #drawbacks + +This is a breaking change, which means some people might have to +change their code. However, that is considered extremely unlikely, +because such users would have to be pattern matching on constants that +are not comparable for equality (this is likely a bug in any case). # Alternatives [alternatives]: #alternatives -**No changes.** Naturally we could opt to keep the semantics as they -are. The advantages and disadvantages are discussed above. 
+ **Limit matching to builtin types.** An earlier version of this RFC +limited matching to builtin types like integers (and tuples of +integers). This RFC is a generalization of that which also +accommodates struct types that derive `Eq`. + +**Embrace current semantics (structural equality).** Naturally we +could opt to keep the semantics as they are. The advantages and +disadvantages are discussed above. **Embrace semantic equality.** We could opt to just go straight -towards "semantic equality". Howver, it seems better to reset the +towards "semantic equality". However, it seems better to reset the semantics to a base point that everyone can agree on, and then extend from that base point. Moreover, adopting semantic equality straight out would be a riskier breaking change, as it could silently change @@ -470,14 +494,27 @@ constants for that purpose. [unresolved]: #unresolved-questions **Should we also adjust the exhaustiveness and match analysis -algorithm to be more conservative around constants?** This RFC just -proposes limiting the types of constants that can be used in a match -pattern. However, since the code currently inlines the actual values -of constants before doing exhaustiveness checking, this also implies -that it can compute exhaustiveness and dead-code in cases where it -arguably should not be able to. - -For example, the following code +algorithm to be more conservative around user-defined structs and +enums?** This RFC leaves exhaustiveness and dead-code checking +unchanged. If we adopted semantic equality semantics, then we would +have to assume that the `Eq` impls are not buggy in order for the +exhaustiveness checking to continue working like this (that is, we +would have to assume that `x == x` always returned true). That said, +this might be OK, so long as the compiler handles the failure in some +graceful way, rather than generating undefined behavior. 
Furthermore, +in practice it is rather challenging to successfully make an +exhaustive match using user-defined constants unless they are +something trivial like newtype'd bools. + +Still, for maximum flexibility, the ideal behavior would be to be +conservative around exhaustiveness checking, but still detect and warn +about "dead-code" arms (e.g., `match foo { C => _, C => _ }`). We +would want to determine how possible this is. + +**What about exhaustiveness etc on builtin types?** Even if we ignore +user-defined types, there are complications around exhaustiveness +checking for constants of any kind related to associated constants and +other possible future extensions. For example, the following code [fails to compile](http://is.gd/PJjNKl) because it contains dead-code: ```rust @@ -512,14 +549,17 @@ fn bar(foo: u64) { Here, although it may well be that `T::X == T::Y`, we can't know for sure. So, for consistency, we may wish to treat all constants opaquely -regardless of whether we are in a generic context or not. - -Another argument in favor of treating all constants opaquely is that -the current behavior can leak details that perhaps were intended to be -hidden. For example, imagine that I define a fn `hash` that, given a -previous hash and a value, produces a new hash. Because I am lazy and -prototyping my system, I decide for now to just ignore the new value -and pass the old hash through: +regardless of whether we are in a generic context or not. (However, it +also seems reasonable to make a "best effort" attempt at +exhaustiveness and dead pattern checking, erring on the conservative +side in those cases where constants cannot be fully evaluated.) + +A different argument in favor of treating all constants opaquely is +that the current behavior can leak details that perhaps were intended +to be hidden. For example, imagine that I define a fn `hash` that, +given a previous hash and a value, produces a new hash. 
Because I am +lazy and prototyping my system, I decide for now to just ignore the +new value and pass the old hash through: ```rust const fn add_to_hash(prev_hash: u64, _value: u64) -> u64 { From 7f839570e88bb471ada2dc518a86a0d07b4b4599 Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Fri, 22 Jan 2016 07:32:51 +0000 Subject: [PATCH 0694/1195] Rename to read_volatile/write_volatile and add a link to definition of volatile --- text/0000-volatile.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/0000-volatile.md b/text/0000-volatile.md index 55c5d73ab91..a2b9ac833a3 100644 --- a/text/0000-volatile.md +++ b/text/0000-volatile.md @@ -6,7 +6,7 @@ # Summary [summary]: #summary -Stabilize the `volatile_load` and `volatile_store` intrinsics as `ptr::volatile_read` and `ptr::volatile_write`. +Stabilize the `volatile_load` and `volatile_store` intrinsics as `ptr::read_volatile` and `ptr::write_volatile`. # Motivation [motivation]: #motivation @@ -16,7 +16,7 @@ This is necessary to allow volatile access to memory-mapping I/O in stable code. # Detailed design [design]: #detailed-design -`ptr::volatile_read` and `ptr::volatile_write` will work the same way as `ptr::read` and `ptr::write` respectively, except that the memory access will be done with volatile semantics. The semantics of a volatile access are already pretty well defined by the C standard and by LLVM. +`ptr::read_volatile` and `ptr::write_volatile` will work the same way as `ptr::read` and `ptr::write` respectively, except that the memory access will be done with volatile semantics. The semantics of a volatile access are already pretty well defined by the C standard and by LLVM. In documentation we can refer to http://llvm.org/docs/LangRef.html#volatile-memory-accesses. # Drawbacks [drawbacks]: #drawbacks @@ -26,7 +26,7 @@ None. 
# Alternatives [alternatives]: #alternatives -We could also stabilize the `volatile_set_memory`, `volatile_copy_memory` and `volatile_copy_nonoverlapping_memory` intrinsics as `ptr::volatile_write_bytes`, `ptr::volatile_copy` and `ptr::volatile_copy_nonoverlapping`, but these are not as widely used and are not available in C. +We could also stabilize the `volatile_set_memory`, `volatile_copy_memory` and `volatile_copy_nonoverlapping_memory` intrinsics as `ptr::write_bytes_volatile`, `ptr::copy_volatile` and `ptr::copy_nonoverlapping_volatile`, but these are not as widely used and are not available in C. # Unresolved questions [unresolved]: #unresolved-questions From 08eec931bc4724614c33b821f4f6e68542c4a2f7 Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Fri, 22 Jan 2016 08:35:46 +0000 Subject: [PATCH 0695/1195] Specify behavior on invalid memory ordering and add more alternatives --- text/0000-extended-compare-and-swap.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/text/0000-extended-compare-and-swap.md b/text/0000-extended-compare-and-swap.md index b0eedcf9900..ef26a144ad0 100644 --- a/text/0000-extended-compare-and-swap.md +++ b/text/0000-extended-compare-and-swap.md @@ -32,7 +32,7 @@ Since `compare_and_swap` is stable, we can't simply add a second memory ordering fn compare_and_swap_explicit(&self, current: T, new: T, success: Ordering, failure: Ordering) -> T; ``` -The restrictions on the failure ordering are the same as C++11: only `SeqCst`, `Acquire` and `Relaxed` are allowed and it must be equal or weaker than the success ordering. +The restrictions on the failure ordering are the same as C++11: only `SeqCst`, `Acquire` and `Relaxed` are allowed and it must be equal or weaker than the success ordering. Passing an invalid memory ordering will result in a panic, although this can often be optimized away since the ordering is usually statically known. 
The documentation for the original `compare_and_swap` is updated to say that it is equivalent to `compare_and_swap_explicit` with the following mapping for memory orders: @@ -102,7 +102,9 @@ For consistency with `compare_and_swap`, `compare_and_swap_weak` also has a sepa # Alternatives [alternatives]: #alternatives -One alternative for supporting failure orderings is to add new enum variants to `Ordering` instead of adding new methods with two ordering parameters. The following variants would need to be added: `AcquireFailRelaxed`, `AcqRelFailRelaxed`, `SeqCstFailRelaxed`, `SeqCstFailAcquire`. The downside is that the names are quite ugly and are only valid for `compare_and_swap`, not other atomic operations. +One alternative for supporting failure orderings is to add new enum variants to `Ordering` instead of adding new methods with two ordering parameters. The following variants would need to be added: `AcquireFailRelaxed`, `AcqRelFailRelaxed`, `SeqCstFailRelaxed`, `SeqCstFailAcquire`. The downside is that the names are quite ugly and are only valid for `compare_and_swap`, not other atomic operations. It is also a breaking change to a stable enum. + +Another alternative is to deprecate the existing `compare_and_swap` functions and replace them with `compare_exchange` which takes two ordering parameters. The new name matches the one used by C++11 and C11, which is a good thing since Rust's memory model is based on the C++11 one. Not doing anything is also a possible option, but this will cause Rust to generate worse code for some lock-free algorithms. 
From fd358fba362af5c4e893ba62348e0639b174647b Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 22 Jan 2016 10:41:23 -0800 Subject: [PATCH 0696/1195] Add specifics around feature gates --- active/0000-trait-based-exception-handling.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/active/0000-trait-based-exception-handling.md b/active/0000-trait-based-exception-handling.md index 0d868aaaee4..507af12c767 100644 --- a/active/0000-trait-based-exception-handling.md +++ b/active/0000-trait-based-exception-handling.md @@ -313,6 +313,12 @@ Without any attempt at completeness, here are some things which should be true: (In the above, `foo()` is a function returning any type, and `try_foo()` is a function returning a `Result`.) +## Feature gates + +The two major features here, the `?` syntax and the `try`/`catch` +syntax, will be tracked by independent feature gates. Each of the +features has a distinct motivation, and we should evaluate them +independently. # Unresolved questions From 063d0ad139cb35969737e950f1a7c29e4a62d005 Mon Sep 17 00:00:00 2001 From: Doug Goldstein Date: Fri, 22 Jan 2016 12:47:03 -0600 Subject: [PATCH 0697/1195] target spec: remove /etc/rustc as default path The RFC specifies that if RUST_TARGET_PATH is unset then the default is /etc/rustc but this won't work on all systems (e.g. Windows) and the Rust compiler never actually implemented this behavior so remove it from the RFC. closes rust-lang/rust#31117 Signed-off-by: Doug Goldstein --- text/0131-target-specification.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/text/0131-target-specification.md b/text/0131-target-specification.md index 42fbaceabde..2bbb008af6b 100644 --- a/text/0131-target-specification.md +++ b/text/0131-target-specification.md @@ -65,8 +65,7 @@ deciding how to build for a given target. The process would look like: 1. Look up the target triple in an internal map, and load that configuration if it exists. 
If that fails, check if the target name exists as a file, and try loading that. If the file does not exist, look up `.json` in - the `RUST_TARGET_PATH`, which is a colon-separated list of directories - defaulting to `/etc/rustc`. + the `RUST_TARGET_PATH`, which is a colon-separated list of directories. 2. If `-C linker` is specified, use that instead of the target-specified linker. 3. If `-C link-args` is given, add those to the ones specified by the target. From 8272bf64493c3d23f6cc98d98a185df21adf9eb8 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 22 Jan 2016 16:29:50 -0500 Subject: [PATCH 0698/1195] add second issue for later amendment --- text/0550-macro-future-proofing.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0550-macro-future-proofing.md b/text/0550-macro-future-proofing.md index b7cc1de19cb..705ea1880d0 100644 --- a/text/0550-macro-future-proofing.md +++ b/text/0550-macro-future-proofing.md @@ -1,6 +1,8 @@ - Start Date: 2014-12-21 - RFC PR: [550](https://github.com/rust-lang/rfcs/pull/550) -- Rust Issue: [20563](https://github.com/rust-lang/rust/pull/20563) +- Rust Issues: + - [20563](https://github.com/rust-lang/rust/pull/20563) + - [31135](https://github.com/rust-lang/rust/issues/31135) # Summary From 33da8e84d58b0d9b91b6977061fa0efc89e60c8b Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Sat, 23 Jan 2016 05:52:18 +0000 Subject: [PATCH 0699/1195] Use compare_exchange_strong/compare_exchange_weak instead --- text/0000-extended-compare-and-swap.md | 25 +++++++++++++------------ 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/text/0000-extended-compare-and-swap.md b/text/0000-extended-compare-and-swap.md index ef26a144ad0..d8fdcade35d 100644 --- a/text/0000-extended-compare-and-swap.md +++ b/text/0000-extended-compare-and-swap.md @@ -24,17 +24,19 @@ While all of these variants are identical on x86, they can allow more efficient # Detailed design [design]: #detailed-design -## Memory ordering on 
failure +Since `compare_and_swap` is stable, we can't simply add a second memory ordering parameter to it. This RFC proposes deprecating the `compare_and_swap` function and replacing it with `compare_exchange_strong` and `compare_exchange_weak`, which match the names of the equivalent C++11 functions. -Since `compare_and_swap` is stable, we can't simply add a second memory ordering parameter to it. A new method is instead added to atomic types: +## `compare_exchange_strong` + +A new method is instead added to atomic types: ```rust -fn compare_and_swap_explicit(&self, current: T, new: T, success: Ordering, failure: Ordering) -> T; +fn compare_exchange_strong(&self, current: T, new: T, success: Ordering, failure: Ordering) -> T; ``` The restrictions on the failure ordering are the same as C++11: only `SeqCst`, `Acquire` and `Relaxed` are allowed and it must be equal or weaker than the success ordering. Passing an invalid memory ordering will result in a panic, although this can often be optimized away since the ordering is usually statically known. 
-The documentation for the original `compare_and_swap` is updated to say that it is equivalent to `compare_and_swap_explicit` with the following mapping for memory orders: +The documentation for the original `compare_and_swap` is updated to say that it is equivalent to `compare_exchange_strong` with the following mapping for memory orders: Original | Success | Failure -------- | ------- | ------- @@ -44,16 +46,15 @@ Release | Release | Relaxed AcqRel | AcqRel | Acquire SeqCst | SeqCst | SeqCst -## `compare_and_swap_weak` +## `compare_exchange_weak` -Two new methods are added to atomic types: +A new method is instead added to atomic types: ```rust -fn compare_and_swap_weak(&self, current: T, new: T, order: Ordering) -> (T, bool); -fn compare_and_swap_weak_explicit(&self, current: T, new: T, success: Ordering, failure: Ordering) -> (T, bool); +fn compare_exchange_weak(&self, current: T, new: T, success: Ordering, failure: Ordering) -> (T, bool); ``` -`compare_and_swap` does not need to return a success flag because it can be inferred by checking if the returned value is equal to the expected one. This is not possible for `compare_and_swap_weak` because it is allowed to fail spuriously, which means that it could fail to perform the swap even though the returned value is equal to the expected one. +`compare_exchange_strong` does not need to return a success flag because it can be inferred by checking if the returned value is equal to the expected one. This is not possible for `compare_exchange_weak` because it is allowed to fail spuriously, which means that it could fail to perform the swap even though the returned value is equal to the expected one. A lock free algorithm using a loop would use the returned bool to determine whether to break out of the loop, and if not, use the returned value for the next iteration of the loop. 
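The retry loop described above can be sketched against the standard library as it exists today. Note one difference from this draft: the `compare_exchange_weak` that was eventually stabilized in `std::sync::atomic` returns `Result<T, T>` rather than the `(T, bool)` pair proposed here. The helper name `fetch_double` is hypothetical and used only for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Atomically doubles the stored value and returns the previous value.
/// `fetch_double` is an illustrative helper, not part of the RFC.
fn fetch_double(cell: &AtomicUsize) -> usize {
    let mut current = cell.load(Ordering::Relaxed);
    loop {
        // A weak exchange may fail spuriously, so the result (not a value
        // comparison) decides whether to leave the loop.
        match cell.compare_exchange_weak(
            current,
            current.wrapping_mul(2),
            Ordering::AcqRel,  // ordering on success
            Ordering::Relaxed, // ordering on failure (must not be stronger)
        ) {
            Ok(previous) => return previous,
            // On failure, retry with the value that was actually observed.
            Err(observed) => current = observed,
        }
    }
}

fn main() {
    let cell = AtomicUsize::new(21);
    let previous = fetch_double(&cell);
    println!("{} -> {}", previous, cell.load(Ordering::SeqCst));
}
```

The `Err` arm plays the role of the "returned value for the next iteration", and the `Ok` arm plays the role of the success flag.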
@@ -78,7 +79,7 @@ The following intrinsics need to be added to support relaxed memory orderings on pub fn atomic_cxchg_acq_failrelaxed(dst: *mut T, old: T, src: T) -> T; ``` -The following intrinsics need to be added to support `compare_and_swap_weak`: +The following intrinsics need to be added to support `compare_exchange_weak`: ```rust pub fn atomic_cxchg_weak(dst: *mut T, old: T, src: T) -> (T, bool); @@ -97,14 +98,14 @@ The following intrinsics need to be added to support `compare_and_swap_weak`: Ideally support for failure memory ordering would be added by simply adding an extra parameter to the existing `compare_and_swap` function. However this is not possible because `compare_and_swap` is stable. -For consistency with `compare_and_swap`, `compare_and_swap_weak` also has a separate explicit variant with two memory ordering parameters, even though ideally only a single method would be required. +This RFC proposes deprecating a stable function, which may not be desirable. # Alternatives [alternatives]: #alternatives One alternative for supporting failure orderings is to add new enum variants to `Ordering` instead of adding new methods with two ordering parameters. The following variants would need to be added: `AcquireFailRelaxed`, `AcqRelFailRelaxed`, `SeqCstFailRelaxed`, `SeqCstFailAcquire`. The downside is that the names are quite ugly and are only valid for `compare_and_swap`, not other atomic operations. It is also a breaking change to a stable enum. -Another alternative is to deprecate the existing `compare_and_swap` functions and replace them with `compare_exchange` which takes two ordering parameters. The new name matches the one used by C++11 and C11, which is a good thing since Rust's memory model is based on the C++11 one. +Another alternative is to not deprecate `compare_and_swap` and instead add `compare_and_swap_explicit`, `compare_and_swap_weak` and `compare_and_swap_weak_explicit`. 
However the distinction between the explicit and non-explicit variants isn't very clear and can lead to some confusion. Not doing anything is also a possible option, but this will cause Rust to generate worse code for some lock-free algorithms. From 60e6d04e5b08df5a8af155e7e2f003f58a30abd7 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 22 Jan 2016 23:53:19 -0800 Subject: [PATCH 0700/1195] The TOML spec has been updated to allow literal strings Note is now removed --- text/0000-cargo-cfg-dependencies.md | 12 +----------- 1 file changed, 1 insertion(+), 11 deletions(-) diff --git a/text/0000-cargo-cfg-dependencies.md b/text/0000-cargo-cfg-dependencies.md index 055a389cfbd..f9419f5fe7f 100644 --- a/text/0000-cargo-cfg-dependencies.md +++ b/text/0000-cargo-cfg-dependencies.md @@ -43,7 +43,7 @@ winapi = "0.2" [target."cfg(unix)".dependencies] unix-socket = "0.4" -[target."cfg(target_os = \"macos\")".dependencies] +[target.'cfg(target_os = "macos")'.dependencies] core-foundation = "0.2" ``` @@ -52,16 +52,6 @@ the string "cfg(" and ends with ")". If this is not true then Cargo will continue to treat it as an opaque string and pass it to the compiler via `--target` (Cargo's current behavior). -> **Note**: There's an [issue open against TOML][toml-issue] to support -> single-quoted keys allowing more ergonomic syntax in some cases like: -> -> ```toml -> [target.'cfg(target_os = "macos")'.dependencies] -> core-foundation = "0.2" -> ``` - -[toml-issue]: https://github.com/toml-lang/toml/issues/354 - Cargo will implement its own parser of this syntax inside the `cfg` expression, it will not rely on the compiler itself.
The grammar, however, will be the same as the compiler for now: From 7aa647fe770400b9b6eac8e11563125e7486eb66 Mon Sep 17 00:00:00 2001 From: Steven Fackler Date: Tue, 12 Jan 2016 22:23:57 -0800 Subject: [PATCH 0701/1195] Move some net2 functionality into libstd --- text/0000-net2-mutators.md | 129 +++++++++++++++++++++++++++++++++++++ 1 file changed, 129 insertions(+) create mode 100644 text/0000-net2-mutators.md diff --git a/text/0000-net2-mutators.md b/text/0000-net2-mutators.md new file mode 100644 index 00000000000..fd095df99df --- /dev/null +++ b/text/0000-net2-mutators.md @@ -0,0 +1,129 @@ +- Feature Name: net2_mutators +- Start Date: 2016-01-12 +- RFC PR: +- Rust Issue: + +# Summary +[summary]: #summary + +[RFC 1158](https://github.com/rust-lang/rfcs/pull/1158) proposed the addition +of more functionality for the `TcpStream`, `TcpListener` and `UdpSocket` types, +but was declined so that those APIs could be built up out of tree in the [net2 +crate](https://crates.io/crates/net2/). This RFC proposes pulling portions of +net2's APIs into the standard library. + +# Motivation +[motivation]: #motivation + +The functionality provided by the standard library's wrappers around standard +networking types is fairly limited, and there is a large set of well supported, +standard functionality that is not currently implemented in `std::net` but has +existed in net2 for some time. + +All of the methods to be added map directly to equivalent system calls. + +This does not cover the entirety of net2's APIs. In particular, this RFC does +not propose to touch the builder types. 
+ +# Detailed design +[design]: #detailed-design + +The following methods will be added: + +```rust +impl TcpStream { + fn set_nodelay(&self, nodelay: bool) -> io::Result<()>; + fn nodelay(&self) -> io::Result; + + fn set_keepalive(&self, keepalive: Option) -> io::Result<()>; + fn keepalive(&self) -> io::Result>; + + fn set_ttl(&self, ttl: u32) -> io::Result<()>; + fn ttl(&self) -> io::Result; + + fn set_only_v6(&self, only_v6: bool) -> io::Result<()>; + fn only_v6(&self) -> io::Result; + + fn take_error(&self) -> io::Result>; + + fn set_nonblocking(&self, nonblocking: bool) -> io::Result<()>; +} + +impl TcpListener { + fn set_ttl(&self, ttl: u32) -> io::Result<()>; + fn ttl(&self) -> io::Result; + + fn set_only_v6(&self, only_v6: bool) -> io::Result<()>; + fn only_v6(&self) -> io::Result; + + fn take_error(&self) -> io::Result>; + + fn set_nonblocking(&self, nonblocking: bool) -> io::Result<()>; +} + +impl UdpSocket { + fn set_broadcast(&self, broadcast: bool) -> io::Result<()>; + fn broadcast(&self) -> io::Result; + + fn set_multicast_loop_v4(&self, multicast_loop_v4: bool) -> io::Result<()>; + fn multicast_loop_v4(&self) -> io::Result; + + fn set_multicast_ttl_v4(&self, multicast_ttl_v4: u32) -> io::Result<()>; + fn multicast_ttl_v4(&self) -> io::Result; + + fn set_multicast_loop_v6(&self, multicast_loop_v6: bool) -> io::Result<()>; + fn multicast_loop_v6(&self) -> io::Result; + + fn set_ttl(&self, ttl: u32) -> io::Result<()>; + fn ttl(&self) -> io::Result; + + fn set_only_v6(&self, only_v6: bool) -> io::Result<()>; + fn only_v6(&self) -> io::Result; + + fn join_multicast_v4(&self, multiaddr: &Ipv4Addr, interface: &Ipv4Addr) -> io::Result<()>; + fn join_multicast_v6(&self, multiaddr: &Ipv6Addr, interface: u32) -> io::Result<()>; + + fn leave_multicast_v4(&self, multiaddr: &Ipv4Addr, interface: &Ipv4Addr) -> io::Result<()>; + fn leave_multicast_v6(&self, multiaddr: &Ipv6Addr, interface: u32) -> io::Result<()>; + + fn connect(&self, addr: A) -> Result<()>; + fn 
send(&self, buf: &[u8]) -> Result; + fn recv(&self, buf: &mut [u8]) -> Result; + + fn take_error(&self) -> io::Result>; + + fn set_nonblocking(&self, nonblocking: bool) -> io::Result<()>; +} +``` + +The traditional approach would be to add these as unstable, inherent methods. +However, since inherent methods take precedence over trait methods, this would +cause all code using the extension traits in net2 to start reporting stability +errors. Instead, we have two options: + +1. Add this functionality as *stable* inherent methods. The rationale here would + be that time in a nursery crate acts as a de facto stabilization period. +2. Add this functionality via *unstable* extension traits. When/if we decide to + stabilize, we would deprecate the trait and add stable inherent methods. + Extension traits are a bit more annoying to work with, but this would give + us a formal stabilization period. + +Option 2 seems like the safer approach unless people feel comfortable with these +APIs. + +# Drawbacks +[drawbacks]: #drawbacks + +This is a fairly significant increase in the surface areas of these APIs, and +most users will never touch some of the more obscure functionality that these +provide. + +# Alternatives +[alternatives]: #alternatives + +We can leave some or all of this functionality in net2. + +# Unresolved questions +[unresolved]: #unresolved-questions + +The stabilization path (see above). 
From 415fabde4131cacd3fcd388beb3cc4bfc8c302e4 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 19 Jun 2015 13:31:29 -0700 Subject: [PATCH 0702/1195] RFC: impl specialization --- text/0000-impl-specialization.md | 1572 ++++++++++++++++++++++++++++++ 1 file changed, 1572 insertions(+) create mode 100644 text/0000-impl-specialization.md diff --git a/text/0000-impl-specialization.md b/text/0000-impl-specialization.md new file mode 100644 index 00000000000..2a12837c74b --- /dev/null +++ b/text/0000-impl-specialization.md @@ -0,0 +1,1572 @@ +- Feature Name: specialization +- Start Date: 2015-06-17 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +This RFC proposes a design for *specialization*, which permits multiple `impl` +blocks to apply to the same type/trait, so long as one of the blocks is clearly +"more specific" than the other. The more specific `impl` block is used in a case +of overlap. The design proposed here also supports refining default trait +implementations based on specifics about the types involved. + +Altogether, this relatively small extension to the trait system yields benefits +for performance and code reuse, and it lays the groundwork for an "efficient +inheritance" scheme that is largely based on the trait system (described in a +[companion RFC][data]). + +[data]: + +# Motivation + +Specialization brings benefits along several different axes: + +* **Performance**: specialization expands the scope of "zero cost abstraction", + because specialized impls can provide custom high-performance code for + particular, concrete cases of an abstraction. + +* **Reuse**: the design proposed here also supports refining default (but + incomplete) implementations of a trait, given details about the types + involved. + +* **Groundwork**: the design lays the groundwork for supporting + ["efficient inheritance"](https://internals.rust-lang.org/t/summary-of-efficient-inheritance-rfcs/494) + through the trait system. 
+ +The following subsections dive into each of these motivations in more detail. + +## Performance + +The simplest and most longstanding motivation for specialization is +performance. + +To take a very simple example, suppose we add a trait for overloading the `+=` +operator: + +```rust +trait AddAssign<Rhs=Self> { + fn add_assign(&mut self, Rhs); +} +``` + +It's tempting to provide an impl for any type that you can both `Clone` and +`Add`: + +```rust +impl<R, T: Add<R> + Clone> AddAssign<R> for T { + fn add_assign(&mut self, rhs: R) { + let tmp = self.clone() + rhs; + *self = tmp; + } +} +``` + +This impl is especially nice because it means that you frequently don't have to +bound separately by `Add` and `AddAssign`; often `Add` is enough to give you +both operators. + +However, in today's Rust, such an impl would rule out any more specialized +implementation that, for example, avoids the call to `clone`. That means there's +a tension between simple abstractions and code reuse on the one hand, and +performance on the other. Specialization resolves this tension by allowing both +the blanket impl, and more specific ones, to coexist, using the specialized ones +whenever possible (and thereby guaranteeing maximal performance). + +More broadly, traits today can provide static dispatch in Rust, but they can +still impose an abstraction tax. For example, consider the `Extend` trait: + +```rust +pub trait Extend<A> { + fn extend<T>(&mut self, iterable: T) where T: IntoIterator<Item=A>; +} +``` + +Collections that implement the trait are able to insert data from arbitrary +iterators. Today, that means that the implementation can assume nothing about +the argument `iterable` that it's given except that it can be transformed into +an iterator. That means the code must work by repeatedly calling `next` and +inserting elements one at a time.
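As a rough illustration of the two code paths being contrasted, here is a minimal sketch in stable Rust. The function names `extend_generic` and `extend_from_known_slice` are hypothetical; the point is only that the first can do nothing except pull items one by one, while the second knows it has a slice and can delegate to a bulk operation.

```rust
/// Generic path: all we know is that `iterable` can become an iterator,
/// so elements are pulled and pushed one at a time.
fn extend_generic<A, I>(dst: &mut Vec<A>, iterable: I)
where
    I: IntoIterator<Item = A>,
{
    for item in iterable {
        dst.push(item);
    }
}

/// Slice path: the source length is known up front, so the whole slice
/// can be appended in one bulk call.
fn extend_from_known_slice<A: Clone>(dst: &mut Vec<A>, src: &[A]) {
    dst.extend_from_slice(src);
}

fn main() {
    let mut a = vec![1, 2];
    extend_generic(&mut a, vec![3, 4]);

    let mut b = vec![1, 2];
    extend_from_known_slice(&mut b, &[3, 4]);

    println!("{:?} {:?}", a, b);
}
```

Without specialization, a caller has to pick between these two entry points by hand; the generic `extend` cannot select the slice path on its own.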
+ +But in specific cases, like extending a vector with a slice, a much more +efficient implementation is possible -- and the optimizer isn't always capable +of producing it automatically. In such cases, specialization can be used to get +the best of both worlds: retaining the abstraction of `extend` while providing +custom code for specific cases. + +The design in this RFC relies on multiple, overlapping trait impls, so to take +advantage of it for `Extend` we need to refactor a bit: + +```rust +pub trait Extend<A, T: IntoIterator<Item=A>> { + fn extend(&mut self, iterable: T); +} + +// The generic implementation +impl<A, T> Extend<A, T> for Vec<A> where T: IntoIterator<Item=A> { + fn extend(&mut self, iterable: T) { + ... // implementation using push (like today's extend) + } +} + +// A specialized implementation for slices +impl<'a, A> Extend<A, &'a [A]> for Vec<A> { + fn extend(&mut self, iterable: &'a [A]) { + ... // implementation using ptr::write (like push_all) + } +} +``` + +Other kinds of specialization are possible, including using marker traits like: + +```rust +unsafe trait TrustedSizeHint {} +``` + +that can allow the optimization to apply to a broader set of types than slices, +but are still more specific than `T: IntoIterator`. + +## Reuse + +Today's default methods in traits are pretty limited: they can assume only the +`where` clauses provided by the trait itself, and there is no way to provide +conditional or refined defaults that rely on more specific type information. + +For example, consider a different design for overloading `+` and `+=`, such that +they are always overloaded together: + +```rust +trait Add<Rhs=Self> { + type Output; + fn add(self, rhs: Rhs) -> Self::Output; + fn add_assign(&mut self, Rhs); +} +``` + +In this case, there's no natural way to provide a default implementation of +`add_assign`, since we do not want to restrict the `Add` trait to `Clone` data.
+ +The specialization design in this RFC also allows for *partial* implementations, +which can provide specialized defaults without actually providing a full trait +implementation: + +```rust +partial impl<T: Clone, R> Add<R> for T { + fn add_assign(&mut self, rhs: R) { + let tmp = self.clone() + rhs; + *self = tmp; + } +} +``` + +This partial impl does *not* mean that `Add` is implemented for all `Clone` +data, but just that when you do impl `Add` and `Self: Clone`, you can leave off +`add_assign`: + +```rust +#[derive(Copy, Clone)] +struct Complex { + // ... +} + +impl Add<Complex> for Complex { + type Output = Complex; + fn add(self, rhs: Complex) -> Complex { + // ... + } + // no fn add_assign necessary +} +``` + +A particularly nice case of refined defaults comes from trait hierarchies: you +can sometimes use methods from subtraits to improve default supertrait +methods. For example, consider the relationship between `size_hint` and +`ExactSizeIterator`: + +```rust +partial impl<T> Iterator for T where T: ExactSizeIterator { + fn size_hint(&self) -> (usize, Option<usize>) { + (self.len(), Some(self.len())) + } +} +``` + +As we'll see later, the design of this RFC makes it possible to "lock down" such +method impls, preventing any further refinement (akin to Java's `final` +keyword); that in turn makes it possible to statically enforce the contract that +is supposed to connect the `len` and `size_hint` methods. (Of course, we can't +make *that* particular change, since the relevant APIs are already stable.) + +## Supporting efficient inheritance + +Finally, specialization can be seen as a form of inheritance, since methods +defined within a blanket impl can be overridden in a fine-grained way by a more +specialized impl. As we will see, this analogy is a useful guide to the design +of specialization.
But it is more than that: the specialization design proposed +here is specifically tailored to support "efficient inheritance" schemes (like +those discussed +[here](https://internals.rust-lang.org/t/summary-of-efficient-inheritance-rfcs/494)) +without adding an entirely separate inheritance mechanism. + +The key insight supporting this design is that virtual method definitions in +languages like C++ and Java actually encompass two distinct mechanisms: virtual +dispatch (also known as "late binding") and implementation inheritance. These +two mechanisms can be separated and addressed independently; this RFC +encompasses an "implementation inheritance" mechanism distinct from virtual +dispatch, and useful in a number of other circumstances. But it can be combined +nicely with an orthogonal mechanism for virtual dispatch to give a complete +story for the "efficient inheritance" goal that many previous RFCs targeted. + +The author is preparing a companion RFC showing how this can be done with a +relatively small further extension to the language. But it should be said that +the design in *this* RFC is fully motivated independently of its companion RFC. + +# Detailed design + +There's a fair amount of material to cover, so we'll start with a basic overview +of the design in intuitive terms, and then look more formally at a specification. + +At the simplest level, specialization is about allowing overlap between impl +blocks, so long as there is always an unambiguous "winner" for any type falling +into the overlap. 
For example: + +```rust +impl<T> Debug for T where T: Display { + fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { + <Self as Display>::fmt(self, f) + } +} + +impl Debug for String { + fn fmt(&self, f: &mut Formatter) -> Result { + try!(write!(f, "\"")); + for c in self.chars().flat_map(|c| c.escape_default()) { + try!(write!(f, "{}", c)); + } + write!(f, "\"") + } +} +``` + +The idea for this pair of impls is that you can rest assured that *any* type +implementing `Display` will also implement `Debug` via a reasonable default, but +go on to provide more specific `Debug` implementations when warranted. In +particular, the intuition is that a `Self` type of `String` is somehow "more +specific" or "more concrete" than `T where T: Display`. + +The bulk of the detailed design is aimed at making this intuition more +precise. But first, we need to explore some problems that arise when you +introduce specialization in any form. + +## Hazard: interactions with type checking + +Consider the following, somewhat odd example of overlapping impls: + +```rust +trait Example { + type Output; + fn generate(self) -> Self::Output; +} + +impl<T> Example for T { + type Output = Box<T>; + fn generate(self) -> Box<T> { Box::new(self) } +} + +impl Example for bool { + type Output = bool; + fn generate(self) -> bool { self } +} +``` + +The key point to pay attention to here is the difference in associated types: +the blanket impl uses `Box<T>`, while the impl for `bool` just uses `bool`. +If we write some code that uses the above impls, we can get into trouble: + +```rust +fn trouble<T>(t: T) -> Box<T> { + Example::generate(t) +} + +fn weaponize() -> bool { + let b: Box<bool> = trouble(true); + *b +} +``` + +What's going on? When type checking `trouble`, the compiler has a type `T` about +which it knows nothing, and sees an attempt to employ the `Example` trait via +`Example::generate(t)`.
Because of the blanket impl, this use of `Example` is +allowed -- but furthermore, the associated type found in the blanket impl is now +directly usable, so that `<T as Example>::Output` is known within `trouble` to +be `Box<T>`, allowing `trouble` to type check. But during *monomorphization*, +`weaponize` will actually produce a version of the code that returns a boolean +instead, and then attempt to dereference that boolean. In other words, things +look different to the typechecker than they do to codegen. Oops. + +So what went wrong? It should be fine for the compiler to assume that `T: +Example` for all `T`, given the blanket impl. But it's clearly problematic to +*also* assume that the associated types will be the ones given by that blanket +impl. Thus, the "obvious" solution is just to generate a type error in `trouble` +by preventing it from assuming `<T as Example>::Output` is `Box<T>`. + +Unfortunately, this solution doesn't work. For one thing, it would be a breaking +change, since the following code *does* compile today: + +```rust +trait Example { + type Output; + fn generate(self) -> Self::Output; +} + +impl<T> Example for T { + type Output = Box<T>; + fn generate(self) -> Box<T> { Box::new(self) } +} + +fn trouble<T>(t: T) -> Box<T> { + Example::generate(t) +} +``` + +And there are definitely cases where this pattern is important. To pick just one +example, consider the following impl for the slice iterator: + +```rust +impl<'a, T> Iterator for Iter<'a, T> { + type Item = &'a T; + // ... +} +``` + +It's essential that downstream code be able to assume that `<Iter<'a, T> as +Iterator>::Item` is just `&'a T`, no matter what `'a` and `T` happen to be. + +Furthermore, it doesn't work to say that the compiler can make this kind of +assumption *unless* specialization is being used, since we want to allow +downstream crates to add specialized impls. We need to know up front.
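To make the slice-iterator point concrete, here is a small sketch of downstream code that depends on the associated type of a generic impl being transparent. The helper name `first_two` is hypothetical. It type checks in stable Rust only because the compiler may normalize `<Iter<'a, T> as Iterator>::Item` to `&'a T` for every `'a` and `T`.

```rust
use std::slice::Iter;

/// Returns the first two elements yielded by a slice iterator.
/// `first_two` is an illustrative helper, not part of the RFC.
fn first_two<'a, T>(mut it: Iter<'a, T>) -> (Option<&'a T>, Option<&'a T>) {
    // Spelled with the projection: the compiler must see through it...
    let a: Option<<Iter<'a, T> as Iterator>::Item> = it.next();
    // ...to know it is the same type as the one written out directly.
    let b: Option<&'a T> = it.next();
    (a, b)
}

fn main() {
    let values = [10, 20, 30];
    println!("{:?}", first_two(values.iter()));
}
```

If associated types from generic impls were always treated opaquely, the return expression `(a, b)` would no longer match the declared return type.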
+ +The solution proposed in this RFC is instead to treat specialization of items in +a trait as a per-item *opt in*, described in the next section. + +(As a sidenote, the trouble described above isn't limited to associated +types. Every function/method in a trait has an implicit associated type that +implements the closure types, and similar bad assumptions about blanket impls +can crop up there. It's not entirely clear whether they can be weaponized, +however.) + +## The `default` keyword + +Many statically-typed languages that allow refinement of behavior in some +hierarchy also come with ways to signal whether or not this is allowed: + +- C++ requires the `virtual` keyword to permit a method to be overridden in + subclasses. Modern C++ also supports `final` and `override` qualifiers. + +- C# requires the `virtual` keyword at definition and `override` at point of + overriding an existing method. + +- Java makes things silently virtual, but supports `final` as an opt out. + +Why have these qualifiers? Overriding implementations is, in a way, "action at a +distance". It means that the code that's actually being run isn't obvious when +e.g. a class is defined; it can change in subclasses defined +elsewhere. Requiring qualifiers is a way of signaling that this non-local change +is happening, so that you know you need to look more globally to understand the +actual behavior of the class. + +While impl specialization does not directly involve virtual dispatch, it's +closely-related to inheritance, and it allows some amount of "action at a +distance" (modulo, as we'll see, coherence rules). We can thus borrow directly +from these previous designs. 
+ +This RFC proposes a "final-by-default" semantics akin to C++ that is +backwards-compatible with today's Rust, which means that the following +overlapping impls are prohibited: + +```rust +impl<T> Example for T { + type Output = Box<T>; + fn generate(self) -> Box<T> { Box::new(self) } +} + +impl Example for bool { + type Output = bool; + fn generate(self) -> bool { self } +} +``` + +The error in these impls is that the first impl is implicitly defining "final" +versions of its items, which are thus not allowed to be refined in further +specializations. + +If you want to allow specialization of an item, you do so via the `default` +qualifier *within the impl block*: + +```rust +impl<T> Example for T { + default type Output = Box<T>; + default fn generate(self) -> Box<T> { Box::new(self) } +} + +impl Example for bool { + type Output = bool; + fn generate(self) -> bool { self } +} +``` + +Thus, when you're trying to understand what code is going to be executed, if you +see an impl that applies to a type and the relevant item is *not* marked +`default`, you know that the definition you're looking at is the one that will +apply. If, on the other hand, the item is marked `default`, you need to scan for +other impls that could apply to your type. The coherence rules, described below, +help limit the scope of this search in practice. + +This design optimizes for fine-grained control over when specialization is +permitted. It's worth pausing for a moment and considering some alternatives and +questions about the design: + +- **Why mark `default` on impls rather than the trait?** There are a few reasons + to have `default` apply at the impl level. First of all, traits are + fundamentally *interfaces*, while `default` is really about + *implementations*. Second, as we'll see, it's useful to be able to "seal off" + a certain avenue of specialization while leaving others open; doing it at the + trait level is an all-or-nothing choice.
+
+- **Why mark `default` on items rather than the entire impl?** Again, this is
+  largely about granularity; it's useful to be able to pin down part of an impl
+  while leaving others open for specialization. Furthermore, while this RFC
+  doesn't propose to do it, we could easily add a shorthand later on in which
+  `default impl Trait for Type` is sugar for adding `default` to all items in
+  the impl.
+
+- **Won't `default` be confused with default methods?** Yes! But usefully so: as
+  we'll see, in this RFC's design today's default methods become sugar for
+  tomorrow's specialization.
+
+Finally, how does `default` help with the hazards described above? Easy: an
+associated type from a blanket impl must be treated "opaquely" if it's marked
+`default`. That is, if you write these impls:
+
+```rust
+impl<T> Example for T {
+    default type Output = Box<T>;
+    default fn generate(self) -> Box<T> { Box::new(self) }
+}
+
+impl Example for bool {
+    type Output = bool;
+    fn generate(self) -> bool { self }
+}
+```
+
+then the function `trouble` will fail to typecheck:
+
+```rust
+fn trouble<T>(t: T) -> Box<T> {
+    Example::generate(t)
+}
+```
+
+The error is that `<T as Example>::Output` no longer normalizes to `Box<T>`,
+because the applicable blanket impl marks the type as `default`. The fact that
+`default` is an opt in makes this behavior backwards-compatible.
+
+The main drawbacks of this solution are:
+
+- **API evolution**. Adding `default` to an associated type *takes away* some
+  abilities, which makes it a breaking change. (In principle, this is probably
+  true for functions/methods as well, but the breakage there is theoretical at
+  most.) However, given the design constraints discussed so far, this seems like
+  an inevitable aspect of any simple, backwards-compatible design.
+
+- **Verbosity**. It's possible that certain uses of the trait system will result
+  in typing `default` quite a bit.
+  This RFC takes a conservative approach of
+  introducing the keyword at a fine-grained level, but leaving the door open to
+  adding shorthands (like writing `default impl ...`) in the future, if need be.
+
+## Overlapping impls and specialization
+
+### What is overlap?
+
+Rust today does not allow any "overlap" between impls. Intuitively, this means
+that you cannot write two trait impls that could apply to the same "input"
+types. (An input type is either `Self` or a type parameter of the trait.) For
+overlap to occur, the input types must be able to "unify", which means that
+there's some way of instantiating any type parameters involved so that the input
+types are the same. Here are some examples:
+
+```rust
+trait Foo {}
+
+// No overlap: String and Vec<u8> cannot unify.
+impl Foo for String {}
+impl Foo for Vec<u8> {}
+
+// No overlap: Vec<u16> and Vec<u8> cannot unify because u16 and u8 cannot unify.
+impl Foo for Vec<u16> {}
+impl Foo for Vec<u8> {}
+
+// Overlap: T can be instantiated to String.
+impl<T> Foo for T {}
+impl Foo for String {}
+
+// Overlap: Vec<T> and Vec<u8> can unify because T can be instantiated to u8.
+impl<T> Foo for Vec<T> {}
+impl Foo for Vec<u8> {}
+
+// No overlap: String and Vec<T> cannot unify, no matter what T is.
+impl Foo for String {}
+impl<T> Foo for Vec<T> {}
+
+// Overlap: for any T that is Clone, both impls apply.
+impl<T> Foo for Vec<T> where T: Clone {}
+impl<T> Foo for Vec<T> {}
+
+// No overlap: implicitly, T: Sized, and since !Foo: Sized, you cannot instantiate T with it.
+impl<T> Foo for Box<T> {}
+impl Foo for Box<Foo> {}
+
+trait Trait1 {}
+trait Trait2 {}
+
+// Overlap: nothing prevents a T such that T: Trait1 + Trait2.
+impl<T: Trait1> Foo for T {}
+impl<T: Trait2> Foo for T {}
+
+trait Trait3 {}
+trait Trait4: Trait3 {}
+
+// Overlap: any T: Trait4 is covered by both impls.
+impl<T: Trait3> Foo for T {}
+impl<T: Trait4> Foo for T {}
+
+trait Bar<T> {}
+
+// No overlap: *all* input types must unify for overlap to happen.
+impl Bar<u8> for u8 {}
+impl Bar<u16> for u8 {}
+
+// No overlap: *all* input types must unify for overlap to happen.
+impl<T> Bar<u8> for T {}
+impl<T> Bar<u16> for T {}
+
+// No overlap: no way to instantiate T such that T == u8 and T == u16.
+impl<T> Bar<T> for T {}
+impl Bar<u16> for u8 {}
+
+// Overlap: instantiate U as T.
+impl<T> Bar<T> for T {}
+impl<T, U> Bar<T> for U {}
+
+// No overlap: no way to instantiate T such that T == &'a T.
+impl<T> Bar<T> for T {}
+impl<'a, T> Bar<&'a T> for T {}
+
+// Overlap: instantiate T = &'a U.
+impl<T> Bar<T> for T {}
+impl<'a, T, U> Bar<T> for &'a U where U: Bar<T> {}
+```
+
+### Permitting overlap
+
+The goal of specialization is to allow overlapping impls, but it's not as simple
+as permitting *all* overlap. There has to be a way to decide which of two
+overlapping impls to actually use for a given set of input types. The simpler
+and more intuitive the rule for deciding, the easier it is to write and reason
+about code -- and since dispatch is already quite complicated, simplicity here
+is a high priority. On the other hand, the design should support as many of the
+motivating use cases as possible.
+
+The basic intuition we've been using for specialization is the idea that one
+impl is "more specific" than another it overlaps with. Before turning this
+intuition into a rule, let's go through the previous examples of overlap and
+decide which, if any, of the impls is intuitively more specific:
+
+```rust
+trait Foo {}
+
+// Overlap: T can be instantiated to String.
+impl<T> Foo for T {}
+impl Foo for String {}          // String is more specific than T
+
+// Overlap: Vec<T> and Vec<u8> can unify because T can be instantiated to u8.
+impl<T> Foo for Vec<T> {}
+impl Foo for Vec<u8> {}         // Vec<u8> is more specific than Vec<T>
+
+// Overlap: for any T that is Clone, both impls apply.
+impl<T> Foo for Vec<T>          // "Vec<T> where T: Clone" is more specific than "Vec<T> for any T"
+    where T: Clone {}
+impl<T> Foo for Vec<T> {}
+
+trait Trait1 {}
+trait Trait2 {}
+
+// Overlap: nothing prevents a T such that T: Trait1 + Trait2
+impl<T: Trait1> Foo for T {}    // Neither is more specific;
+impl<T: Trait2> Foo for T {}    // there's no relationship between the traits here
+
+trait Trait3 {}
+trait Trait4: Trait3 {}
+
+// Overlap: any T: Trait4 is covered by both impls.
+impl<T: Trait3> Foo for T {}
+impl<T: Trait4> Foo for T {}    // T: Trait4 is more specific than T: Trait3
+
+trait Bar<T> {}
+
+// Overlap: instantiate U as T.
+impl<T> Bar<T> for T {}         // More specific since both input types are identical
+impl<T, U> Bar<T> for U {}
+
+// Overlap: instantiate T = &'a U.
+impl<T> Bar<T> for T {}         // Neither is more specific
+impl<'a, T, U> Bar<T> for &'a U
+    where U: Bar<T> {}
+```
+
+What are the patterns here?
+
+- Concrete types are more specific than type variables, e.g.:
+  - `String` is more specific than `T`
+  - `Vec<u8>` is more specific than `Vec<T>`
+- More constraints lead to more specific impls, e.g.:
+  - `T: Clone` is more specific than `T`
+  - `Bar<T> for T` is more specific than `Bar<T> for U`
+- Unrelated constraints don't contribute, e.g.:
+  - Neither `T: Trait1` nor `T: Trait2` is more specific than the other.
+
+For many purposes, the above simple patterns are sufficient for working with
+specialization. But to provide a spec, we need a more general, formal way of
+deciding precedence; we'll give one next.
+
+### Defining the precedence rules
+
+An impl block `I` contains basically two pieces of information relevant to
+specialization:
+
+- A set of type variables, like `T, U` in `impl<T, U> Bar<T> for U`.
+  - We'll call this `I.vars`.
+- A set of where clauses, like `T: Clone` in `impl<T: Clone> Foo for Vec<T>`.
+  - We'll call this `I.wc`.
+
+We're going to define a *specialization relation* `<=` between impl blocks, so
+that `I <= J` means that impl block `I` is "at least as specific as" impl block
+`J`.
+(If you want to think of this in terms of "size", you can imagine that the
+set of types `I` applies to is no bigger than those `J` applies to.)
+
+We'll say that `I < J` if `I <= J` and `!(J <= I)`. In this case, `I` is *more
+specialized* than `J`.
+
+To ensure specialization is coherent, we will ensure that for any two impls `I`
+and `J` that overlap, we have either `I < J` or `J < I`. That is, one must be
+truly more specific than the other. Specialization chooses the "smallest" impl
+in this order -- and the new overlap rule ensures there is a unique smallest
+impl among those that apply to a given set of input types.
+
+More broadly, while `<=` is not a total order on *all* impls of a given trait,
+it will be a total order on any set of impls that all mutually overlap, which is
+all we need to determine which impl to use.
+
+We'll start with an abstract/high-level formulation, and then build up toward an
+algorithm for deciding specialization by introducing a number of building
+blocks.
+
+#### Abstract formulation
+
+Recall that the
+[input types](https://github.com/aturon/rfcs/blob/associated-items/active/0000-associated-items.md)
+of a trait are the `Self` type and all trait type parameters. So the following
+impl has input types `bool`, `u8` and `String`:
+
+```rust
+trait Baz<X, Y> { .. }
+// impl I
+impl Baz<bool, u8> for String { .. }
+```
+
+If you think of these input types as a tuple, `(bool, u8, String)`, you can think
+of each trait impl `I` as determining a set `apply(I)` of input type tuples that
+obey `I`'s where clauses. For the impl above, this is just the singleton set
+`apply(I) = { (bool, u8, String) }`. Here's a more interesting case:
+
+```rust
+// impl J
+impl<T, U> Baz<T, u8> for U where T: Clone { .. }
+```
+
+which gives the set `apply(J) = { (T, u8, U) | T: Clone }`.
+
+Two impls `I` and `J` overlap if `apply(I)` and `apply(J)` intersect.
+
+**We can now define the specialization order abstractly**: `I <= J` if
+`apply(I)` is a subset of `apply(J)`.
+
+This is true of the two sets above:
+
+```
+apply(I) = { (bool, u8, String) }
+  is a strict subset of
+apply(J) = { (T, u8, U) | T: Clone }
+```
+
+Here are a few more examples.
+
+**Via where clauses**:
+
+```rust
+// impl I
+// apply(I) = { T | T a type }
+impl<T> Foo for T {}
+
+// impl J
+// apply(J) = { T | T: Clone }
+impl<T> Foo for T where T: Clone {}
+
+// J < I
+```
+
+**Via type structure**:
+
+```rust
+// impl I
+// apply(I) = { (T, U) | T, U types }
+impl<T, U> Bar<T> for U {}
+
+// impl J
+// apply(J) = { (T, T) | T a type }
+impl<T> Bar<T> for T {}
+
+// J < I
+```
+
+The same reasoning can be applied to all of the examples we saw earlier, and the
+reader is encouraged to do so. We'll look at one of the more subtle cases here:
+
+```rust
+// impl I
+// apply(I) = { (T, T) | T any type }
+impl<T> Bar<T> for T {}
+
+// impl J
+// apply(J) = { (T, &'a U) | U: Bar<T>, 'a any lifetime }
+impl<'a, T, U> Bar<T> for &'a U where U: Bar<T> {}
+```
+
+The claim is that `apply(I)` and `apply(J)` intersect, but neither contains the
+other. Thus, these two impls are not permitted to coexist according to this
+RFC's design. (We'll revisit this limitation toward the end of the RFC.)
+
+#### Algorithmic formulation
+
+The goal in the remainder of this section is to turn the above abstract
+definition of `<=` into something closer to an algorithm, connected to existing
+mechanisms in the Rust compiler. We'll start by reformulating `<=` in a way that
+effectively "inlines" `apply`:
+
+`I <= J` if:
+
+- For any way of instantiating `I.vars`, there is some way of instantiating
+  `J.vars` such that the `Self` type and trait type parameters match up.
+
+- For this instantiation of `I.vars`, if you assume `I.wc` holds, you can prove
+  `J.wc`.
+
+It turns out that the compiler is already quite capable of answering these
+questions, via "unification" and "skolemization", which we'll see next.
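
As a rough sanity check of the abstract formulation, the subset reading of `<=` can be tried out on a toy model. The sketch below is only an illustration under stated assumptions: the four-element universe `Ty`, the pretend `is_clone` predicate, and the helper names (`apply_any_pair`, `apply_same`, `apply_same_clone`, `apply_u8_param`, `le`, `lt`, `overlaps`) are all hypothetical and have nothing to do with how the compiler represents impls. It enumerates `apply(..)` for a few impls of `trait Bar<T>` and compares the resulting sets directly.

```rust
use std::collections::BTreeSet;

// A tiny closed universe of types; purely an assumption for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Ty { Bool, U8, Str, Cell }

static ALL: [Ty; 4] = [Ty::Bool, Ty::U8, Ty::Str, Ty::Cell];

// Pretend every type except `Cell` implements `Clone`.
fn is_clone(t: Ty) -> bool { t != Ty::Cell }

// Input types of `trait Bar<T>`, written as (trait parameter, Self).
type Inputs = (Ty, Ty);

// apply(..) for `impl<T, U> Bar<T> for U {}`: every pair.
fn apply_any_pair() -> BTreeSet<Inputs> {
    ALL.iter().flat_map(|&t| ALL.iter().map(move |&u| (t, u))).collect()
}

// apply(..) for `impl<T> Bar<T> for T {}`: only the diagonal.
fn apply_same() -> BTreeSet<Inputs> {
    ALL.iter().map(|&t| (t, t)).collect()
}

// apply(..) for `impl<T> Bar<T> for T where T: Clone {}`.
fn apply_same_clone() -> BTreeSet<Inputs> {
    ALL.iter().filter(|&&t| is_clone(t)).map(|&t| (t, t)).collect()
}

// apply(..) for a hypothetical `impl<T> Bar<u8> for T {}`.
fn apply_u8_param() -> BTreeSet<Inputs> {
    ALL.iter().map(|&u| (Ty::U8, u)).collect()
}

// Two impls overlap if their apply sets intersect.
fn overlaps(a: &BTreeSet<Inputs>, b: &BTreeSet<Inputs>) -> bool { !a.is_disjoint(b) }

// `I <= J`: apply(I) is a subset of apply(J).
fn le(a: &BTreeSet<Inputs>, b: &BTreeSet<Inputs>) -> bool { a.is_subset(b) }

// `I < J`: `I <= J` and not `J <= I`.
fn lt(a: &BTreeSet<Inputs>, b: &BTreeSet<Inputs>) -> bool { le(a, b) && !le(b, a) }

fn main() {
    let any = apply_any_pair();
    let same = apply_same();
    let same_clone = apply_same_clone();
    let u8_param = apply_u8_param();

    // Type structure: `Bar<T> for T` is strictly more specific than `Bar<T> for U`.
    assert!(lt(&same, &any));
    // Where clauses: adding `T: Clone` makes the impl strictly more specific.
    assert!(lt(&same_clone, &same));
    // These two overlap at (u8, u8) but neither contains the other,
    // so the rule would reject them as a pair.
    assert!(overlaps(&same, &u8_param));
    assert!(!le(&same, &u8_param) && !le(&u8_param, &same));
    println!("all checks passed");
}
```

The real relation ranges over infinitely many types, so enumeration like this is only a way to build intuition; the sections that follow replace it with unification.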
+
+##### Unification: solving equations on types
+
+Unification is the workhorse of type inference and many other mechanisms in the
+Rust compiler. You can think of it as a way of solving equations on types that
+contain variables. For example, consider the following situation:
+
+```rust
+fn use_vec<T>(v: Vec<T>) { .. }
+
+fn caller() {
+    let v = vec![0u8, 1u8];
+    use_vec(v);
+}
+```
+
+The compiler ultimately needs to infer what type to use for the `T` in `use_vec`
+within the call in `caller`, given that the actual argument has type
+`Vec<u8>`. You can frame this as a unification problem: solve the equation
+`Vec<T> = Vec<u8>`. Easy enough: `T = u8`!
+
+Some equations can't be solved. For example, if we wrote instead:
+
+```rust
+fn caller() {
+    let s = "hello";
+    use_vec(s);
+}
+```
+
+we would end up equating `Vec<T> = &str`. There's no choice of `T` that makes
+that equation work out. Type error!
+
+Unification often involves solving a series of equations between types
+simultaneously, but it's not like high school algebra; the equations involved
+all have the limited form of `type1 = type2`.
+
+One immediate way in which unification is relevant to this RFC is in determining
+when two impls "overlap": roughly speaking, they overlap if each pair of
+input types can be unified simultaneously. For example:
+
+```rust
+// No overlap: String and bool do not unify
+impl Foo for String { .. }
+impl Foo for bool { .. }
+
+// Overlap: String and T unify
+impl Foo for String { .. }
+impl<T> Foo for T { .. }
+
+// Overlap: T = U, T = V is trivially solvable
+impl<T> Bar<T> for T { .. }
+impl<U, V> Bar<U> for V { .. }
+
+// No overlap: T = u8, T = bool not solvable
+impl<T> Bar<T> for T { .. }
+impl Bar<u8> for bool { .. }
+```
+
+Note the difference in how *concrete types* and *type variables* work for
+unification. When `T`, `U` and `V` are variables, it's fine to say that `T = U`,
+`T = V` is solvable: we can make the impls overlap by instantiating all three
+variables with the same type.
+But asking for e.g. `String = bool` fails, because
+these are concrete types, not variables. (The same happens in algebra; consider
+that `2 = 3` cannot be solved, but `x = y` and `y = z` can be.) This
+distinction may seem obvious, but we'll next see how to leverage it in a
+somewhat subtle way.
+
+##### Skolemization: asking forall/there exists questions
+
+We've already rephrased `<=` to start with a "for all, there exists" problem:
+
+- For any way of instantiating `I.vars`, there is some way of instantiating
+  `J.vars` such that the `Self` type and trait type parameters match up.
+
+For example:
+
+```rust
+// impl I
+impl<T> Bar<T> for T {}
+
+// impl J
+impl<U, V> Bar<U> for V {}
+```
+
+For any choice of `T`, it's possible to choose a `U` and `V` such that the two
+impls match -- just choose `U = T` and `V = T`. But the opposite isn't possible:
+if `U` and `V` are different (say, `String` and `bool`), then no choice of `T`
+will make the two impls match up.
+
+This feels similar to a unification problem, and it turns out we can solve it
+with unification using a scary-sounding trick known as "skolemization".
+
+Basically, to "skolemize" a type variable is to treat it *as if it were a
+concrete type*. So if `U` and `V` are skolemized, then `U = V` is unsolvable, in
+the same way that `String = bool` is unsolvable. That's perfect for capturing
+the "for any instantiation of I.vars" part of what we want to formalize.
+
+With this tool in hand, we can further rephrase the "for all, there exists" part
+of `<=` in the following way:
+
+- After skolemizing `I.vars`, it's possible to unify `I` and `J`.
+
+Note that a successful unification through skolemization gives you the same
+answer as you'd get if you unified without skolemizing.
+
+##### The algorithmic version
+
+One outcome of running unification on two impls as above is that we can
+understand both impl headers in terms of a single set of type variables.
+For example:
+
+```rust
+// Before unification:
+impl<T> Bar<T> for T where T: Clone { .. }
+impl<U, V> Bar<Vec<U>> for Vec<V> where V: Debug { .. }
+
+// After unification:
+// T = Vec<W>
+// U = W
+// V = W
+impl<W> Bar<Vec<W>> for Vec<W> where Vec<W>: Clone { .. }
+impl<W> Bar<Vec<W>> for Vec<W> where W: Debug { .. }
+```
+
+By putting everything in terms of a single set of type params, it becomes
+possible to do things like compare the `where` clauses, which is the last piece
+we need for a final rephrasing of `<=` that we can implement directly.
+
+Putting it all together, we'll say `I <= J` if:
+
+- After skolemizing `I.vars`, it's possible to unify `I` and `J`.
+- Under the resulting unification, `I.wc` implies `J.wc`
+
+Let's look at a couple more examples to see how this works:
+
+```rust
+trait Trait1 {}
+trait Trait2 {}
+
+// Overlap: nothing prevents a T such that T: Trait1 + Trait2
+impl<T: Trait1> Foo for T {}    // Neither is more specific;
+impl<T: Trait2> Foo for T {}    // there's no relationship between the traits here
+```
+
+In comparing these two impls in either direction, we make it past unification
+and must try to prove that one where clause implies another. But `T: Trait1`
+does not imply `T: Trait2`, nor vice versa, so neither impl is more specific
+than the other. Since the impls do overlap, an ambiguity error is reported.
+
+On the other hand:
+
+```rust
+trait Trait3 {}
+trait Trait4: Trait3 {}
+
+// Overlap: any T: Trait4 is covered by both impls.
+impl<T: Trait3> Foo for T {}
+impl<T: Trait4> Foo for T {}    // T: Trait4 is more specific than T: Trait3
+```
+
+Here, since `T: Trait4` implies `T: Trait3` but not vice versa, we get
+
+```rust
+impl<T: Trait4> Foo for T    <    impl<T: Trait3> Foo for T
+```
+
+##### Key properties
+
+Remember that for each pair of impls `I`, `J`, the compiler will check that
+exactly one of the following holds:
+
+- `I` and `J` do not overlap (a unification check), or else
+- `I < J`, or else
+- `J < I`
+
+Since `I <= J` ultimately boils down to a subset relationship, we get a lot of
+nice properties for free (e.g., transitivity: if `I <= J <= K` then `I <= K`).
+Together with the compiler check above, we know that at monomorphization time,
+after filtering to the impls that apply to some concrete input types, there will
+always be a unique, smallest impl in specialization order. (In particular, if
+multiple impls apply to concrete input types, those impls must overlap.)
+
+There are various implementation strategies that avoid having to recalculate the
+ordering during monomorphization, but we won't delve into those details in this
+RFC.
+
+### Implications for coherence
+
+The coherence rules ensure that there is never an ambiguity about which impl to
+use when monomorphizing code. Today, the rules consist of the simple overlap
+check described earlier, and the "orphan" check which limits the crates in which
+impls are allowed to appear ("orphan" refers to an impl in a crate that defines
+neither the trait nor the types it applies to). The orphan check is needed, in
+particular, so that overlap cannot be created accidentally when linking crates
+together.
+
+The design in this RFC heavily revises the overlap check, as described above,
+but does not propose any changes to the orphan check (which is described in
+[a blog post](http://smallcultfollowing.com/babysteps/blog/2015/01/14/little-orphan-impls/)). Basically,
+the change to the overlap check does not appear to change the cases in which
+orphan impls can cause trouble.
+And a moment's thought reveals why: if two
+sibling crates are unaware of each other, there's no way that they could each
+provide an impl overlapping with the other, yet be sure that one of those impls
+is more specific than the other in the overlapping region.
+
+## Partial impls
+
+An interesting consequence of specialization is that impls need not (and in fact
+sometimes *cannot*) provide all of the items that a trait specifies. Of course,
+this is already the case with defaulted items in a trait -- but as we'll see,
+that mechanism can be seen as just a way of using specialization.
+
+Let's start with a simple example:
+
+```rust
+trait MyTrait {
+    fn foo(&self);
+    fn bar(&self);
+}
+
+impl<T: Clone> MyTrait for T {
+    default fn foo(&self) { ... }
+    default fn bar(&self) { ... }
+}
+
+impl MyTrait for String {
+    fn bar(&self) { ... }
+}
+```
+
+Here, we're acknowledging that the blanket impl has already provided definitions
+for both methods, so the impl for `String` can opt to just re-use the earlier
+definition of `foo`. This is one reason for the choice of the keyword `default`.
+Viewed this way, items defined in a specialized impl are optional overrides of
+those in overlapping blanket impls.
+
+And, in fact, if we'd written the blanket impl differently, we could *force* the
+`String` impl to leave off `foo`:
+
+```rust
+impl<T: Clone> MyTrait for T {
+    // now `foo` is "final"
+    fn foo(&self) { ... }
+
+    default fn bar(&self) { ... }
+}
+```
+
+Being able to leave off items that are covered by blanket impls means that
+specialization is close to providing a finer-grained version of defaulted items
+in traits -- one in which the defaults can become ever more refined as more is
+known about the input types to the traits (as described in the Motivation
+section). But to fully realize this goal, we need one other ingredient: the
+ability for the *blanket* impl itself to leave off some items.
+We do this by
+using the `partial` keyword:
+
+```rust
+trait Add<Rhs=Self> {
+    type Output;
+    fn add(self, rhs: Rhs) -> Self::Output;
+    fn add_assign(&mut self, rhs: Rhs);
+}
+
+partial impl<T: Clone, Rhs> Add<Rhs> for T {
+    fn add_assign(&mut self, rhs: Rhs) {
+        let tmp = self.clone() + rhs;
+        *self = tmp;
+    }
+}
+```
+
+A subsequent overlapping impl of `Add` where `Self: Clone` can choose to leave
+off `add_assign`, "inheriting" it from the partial impl above.
+
+A key point here is that, as the keyword suggests, a `partial` impl may be
+incomplete: from the above code, you *cannot* assume that `T: Add<T>` for any
+`T: Clone`, because no such complete impl has been provided.
+
+With partial impls, defaulted items in traits are just sugar for a partial
+blanket impl:
+
+```rust
+trait Iterator {
+    type Item;
+    fn next(&mut self) -> Option<Self::Item>;
+
+    fn size_hint(&self) -> (usize, Option<usize>) {
+        (0, None)
+    }
+    // ...
+}
+
+// desugars to:
+
+trait Iterator {
+    type Item;
+    fn next(&mut self) -> Option<Self::Item>;
+    fn size_hint(&self) -> (usize, Option<usize>);
+    // ...
+}
+
+partial impl<T> Iterator for T {
+    default fn size_hint(&self) -> (usize, Option<usize>) {
+        (0, None)
+    }
+    // ...
+}
+```
+
+Partial impls are somewhat akin to abstract base classes in object-oriented
+languages; they provide some, but not all, of the materials needed for a fully
+concrete implementation, and thus enable code reuse but cannot be used concretely.
+
+Note that partial impls still need to use `default` to allow for overriding --
+leaving off the qualifier will lock down the implementation in any
+more-specialized complete impls, which is actually a useful pattern (as
+explained in the Motivation).
+
+There are a few important details to nail down with the design. This RFC
+proposes starting with the conservative approach of applying the general overlap
+rule to partial impls, same as with complete ones. That ensures that there is
+always a clear definition to use when providing subsequent complete impls.
+It
+would be possible, though, to relax this constraint and allow *arbitrary*
+overlap between partial impls, requiring that whenever a complete impl overlaps
+with them, *for each item*, there is either a unique "most specific" partial
+impl that applies, or else the complete impl provides its own definition for
+that item. Such a relaxed approach is much more flexible, probably easier to
+work with, and can enable more code reuse -- but it's also more complicated, and
+backwards-compatible to add on top of the proposed conservative approach.
+
+## Inherent impls
+
+It has long been folklore that inherent impls can be thought of as special,
+anonymous traits that are:
+
+- Automatically in scope;
+- Given higher dispatch priority than normal traits.
+
+It is easiest to make this idea work out if you think of each inherent item as
+implicitly defining and implementing its own trait, so that you can account for
+examples like the following:
+
+```rust
+struct Foo<T> { .. }
+
+impl<T> Foo<T> {
+    fn foo(&self) { .. }
+}
+
+impl<T: Clone> Foo<T> {
+    fn bar(&self) { .. }
+}
+```
+
+In this example, the availability of each inherent item is dependent on a
+distinct `where` clause. A reasonable "desugaring" would be:
+
+```rust
+#[inherent] // an imaginary attribute turning on the "special" treatment of inherent impls
+trait Foo_foo<T> {
+    fn foo(&self);
+}
+
+#[inherent]
+trait Foo_bar<T> {
+    fn bar(&self);
+}
+
+impl<T> Foo_foo<T> for Foo<T> {
+    fn foo(&self) { .. }
+}
+
+impl<T: Clone> Foo_bar<T> for Foo<T> {
+    fn bar(&self) { .. }
+}
+```
+
+With this idea in mind, it is natural to expect specialization to work for
+inherent impls, e.g.:
+
+```rust
+impl<T, I> Vec<T> where I: IntoIterator<Item = T> {
+    fn extend(iter: I) { .. }
+}
+
+impl<T> Vec<T> {
+    fn extend(slice: &[T]) { .. }
+}
+```
+
+This RFC proposes to permit such specialization at the inherent impl level. The
+semantics is defined in terms of the folklore desugaring above.
+
+(Note: this example was chosen purposefully: it's possible to use specialization
+at the inherent impl level to avoid refactoring the `Extend` trait as described
+in the Motivation section.)
+
+One tricky aspect here is that, since there is no explicit trait definition,
+there is no general signature that each definition of an inherent item must
+match. Thinking about `Vec` above, for example, notice that the two signatures
+for `extend` look superficially different, although it's clear that the first
+impl is the more general of the two.
+
+For the intended desugaring into "inherent traits" to be coherent, we need to
+determine the items' signatures. To do this, we apply the following test:
+
+- Suppose an item of the same kind, named `foo`, occurs in two inherent impl
+  blocks for the same type constructor.
+
+- If it's possible to unify the two impl headers, then the two signatures for
+  `foo` must be "equivalent" under that unification.
+
+- Two signatures `S` and `T` are equivalent if:
+  - After skolemizing all type variables in `S`, it is possible to unify the two signatures, and,
+  - After that unification, the where clauses in `S` imply those in `T`, and
+  - Vice versa.
+
+Basically, this check ensures that any overlapping inherent items provide
+"identical" signatures for the area of overlap. That in turn means that callers
+can be typechecked against *any* of the definitions that applies, and
+typechecking would equally succeed with any other.
+
+Note that this is a breaking change, since examples like the following are
+(surprisingly!)
+allowed today:
+
+```rust
+struct Foo<A, B>(A, B);
+
+impl<A> Foo<A, A> {
+    fn foo(&self, _: u32) {}
+}
+
+impl<A, B> Foo<A, B> {
+    fn foo(&self, _: bool) {}
+}
+
+fn use_foo<A, B>(f: Foo<A, B>) {
+    f.foo(true)
+}
+```
+
+As has been proposed
+[elsewhere](https://internals.rust-lang.org/t/pre-rfc-adjust-default-object-bounds/2199/),
+this "breaking change" would be made available through a feature flag that must
+be used even after stabilization (to opt in to specialization of inherent
+impls); the full details will depend on pending revisions to
+[RFC 1122](https://github.com/rust-lang/rfcs/pull/1122).
+
+## Limitations
+
+One frequent motivation for specialization is broader "expressiveness", in
+particular providing a larger set of trait implementations than is possible
+today.
+
+For example, the standard library currently includes an `AsRef` trait
+for "as-style" conversions:
+
+```rust
+pub trait AsRef<T> where T: ?Sized {
+    fn as_ref(&self) -> &T;
+}
+```
+
+Currently, there is also a blanket implementation as follows:
+
+```rust
+impl<'a, T: ?Sized, U: ?Sized> AsRef<U> for &'a T where T: AsRef<U> {
+    fn as_ref(&self) -> &U {
+        <T as AsRef<U>>::as_ref(*self)
+    }
+}
+```
+
+which allows these conversions to "lift" over references, which is in turn
+important for making a number of standard library APIs ergonomic.
+
+On the other hand, we'd also like to provide the following very simple
+blanket implementation:
+
+```rust
+impl<'a, T: ?Sized> AsRef<T> for T {
+    fn as_ref(&self) -> &T {
+        self
+    }
+}
+```
+
+The current coherence rules prevent having both impls, however,
+because they can in principle overlap:
+
+```rust
+AsRef<&'a T> for &'a T where T: AsRef<&'a T>
+```
+
+Another example comes from the `Option` type, which currently provides two
+methods for unwrapping while providing a default value for the `None` case:
+
+```rust
+impl<T> Option<T> {
+    fn unwrap_or(self, def: T) -> T { ... }
+    fn unwrap_or_else<F>(self, f: F) -> T where F: FnOnce() -> T { ..
+    }
+}
+```
+
+The `unwrap_or` method is more ergonomic but `unwrap_or_else` is more efficient
+in the case that the default is expensive to compute. The original
+[collections reform RFC](https://github.com/rust-lang/rfcs/pull/235) proposed a
+`ByNeed` trait that was rendered unworkable after unboxed closures landed:
+
+```rust
+trait ByNeed<T> {
+    fn compute(self) -> T;
+}
+
+impl<T> ByNeed<T> for T {
+    fn compute(self) -> T {
+        self
+    }
+}
+
+impl<F, T> ByNeed<T> for F where F: FnOnce() -> T {
+    fn compute(self) -> T {
+        self()
+    }
+}
+
+impl<T> Option<T> {
+    fn unwrap_or<U>(self, def: U) where U: ByNeed<T> { ... }
+    ...
+}
+```
+
+The trait represents any value that can produce a `T` on demand. But the above
+impls fail to compile in today's Rust, because they overlap: consider
+`ByNeed<F> for F` where `F: FnOnce() -> F`.
+
+There are also some trait hierarchies where a subtrait completely subsumes the
+functionality of a supertrait. For example, consider `PartialOrd` and `Ord`:
+
+```rust
+trait PartialOrd<Rhs: ?Sized = Self>: PartialEq<Rhs> {
+    fn partial_cmp(&self, other: &Rhs) -> Option<Ordering>;
+}
+
+trait Ord: Eq + PartialOrd<Self> {
+    fn cmp(&self, other: &Self) -> Ordering;
+}
+```
+
+In cases like this, it's somewhat annoying to have to provide an impl for *both*
+`Ord` and `PartialOrd`, since the latter can be trivially derived from the
+former.
+So you might want an impl like this:
+
+```rust
+impl<T> PartialOrd<T> for T where T: Ord {
+    fn partial_cmp(&self, other: &T) -> Option<Ordering> {
+        Some(self.cmp(other))
+    }
+}
+```
+
+But this blanket impl would conflict with a number of others that work to "lift"
+`PartialOrd` and `Ord` impls over various type constructors like references and
+tuples, e.g.:
+
+```rust
+impl<'a, A: ?Sized> Ord for &'a A where A: Ord {
+    fn cmp(&self, other: & &'a A) -> Ordering { Ord::cmp(*self, *other) }
+}
+
+impl<'a, 'b, A: ?Sized, B: ?Sized> PartialOrd<&'b B> for &'a A where A: PartialOrd<B> {
+    fn partial_cmp(&self, other: &&'b B) -> Option<Ordering> {
+        PartialOrd::partial_cmp(*self, *other)
+    }
+}
+```
+
+The case where they overlap boils down to:
+
+```rust
+PartialOrd<&'a T> for &'a T where &'a T: Ord
+PartialOrd<&'a T> for &'a T where T: PartialOrd
+```
+
+and there is no implication between either of the where clauses.
+
+There are many other examples along these lines.
+
+Unfortunately, *none* of these examples are permitted by the revised overlap
+rule in this RFC, because in none of these cases is one of the impls fully a
+"subset" of the other; the overlap is always partial.
+
+It's a shame to not be able to address these cases, but the benefit is a
+specialization rule that is very intuitive and accepts only very clear-cut
+cases. The Alternatives section sketches some different rules that are less
+intuitive but do manage to handle cases like those above.
+
+If we allowed "relaxed" partial impls as described above, one could at least use
+that mechanism to avoid having to give a definition directly in most cases. (So
+if you had `T: Ord` you could write `impl PartialOrd for T {}`.)
+
+## Possible extensions
+
+It's worth briefly mentioning a couple of mechanisms that one could consider
+adding on top of specialization.
+
+### Super
+
+Continuing the analogy between specialization and inheritance, one could imagine
+a mechanism like `super` to access and reuse less specialized implementations
+when defining more specialized ones. While there's not a strong need for this
+mechanism as part of this RFC, it's worth checking that the specialization
+approach is at least compatible with `super`.
+
+Fortunately, it is. If we take `super` to mean "the most specific impl
+overlapping with this one", there is always a unique answer to that question,
+because all overlapping impls are totally ordered with respect to each other via
+specialization.
+
+### Extending HRTBs
+
+In the Motivation we mentioned the need to refactor the `Extend` trait to take
+advantage of specialization. It's possible to work around that need by using
+specialization on inherent impls (and having the trait impl defer to the
+inherent one), but of course that's a bit awkward.
+
+For reference, here's the refactoring:
+
+```rust
+// Current definition
+pub trait Extend<A> {
+    fn extend<T>(&mut self, iterable: T) where T: IntoIterator<Item=A>;
+}
+
+// Refactored definition
+pub trait Extend<A, T: IntoIterator<Item=A>> {
+    fn extend(&mut self, iterable: T);
+}
+```
+
+One problem with this kind of refactoring is that you *lose* the ability to say
+that a type `T` is extendable *by an arbitrary iterator*, because every use of
+the `Extend` trait has to say precisely what iterator is supported. But the
+whole point of this exercise is to have a blanket impl of `Extend` for any
+iterator that is then specialized later.
+
+This points to a longstanding limitation: the trait system makes it possible to
+ask for any number of specific impls to exist, but not to ask for a blanket impl
+to exist -- *except* in the limited case of lifetimes, where higher-ranked trait
+bounds allow you to do this:
+
+```rust
+trait Trait { .. }
+impl<'a> Trait for &'a MyType { .. }
+
+fn use_all<T>(t: T) where for<'a> &'a T: Trait { ..
} +``` + +We could extend this mechanism to cover type parameters as well, so that you could write: + +```rust +fn needs_extend_all(t: T) where for> T: Extend { .. } +``` + +Such a mechanism is out of scope for this RFC. + +# Drawbacks + +Many of the more minor tradeoffs have been discussed in detail throughout. We'll +focus here on the big picture. + +As with many new language features, the most obvious drawback of this proposal +is the increased complexity of the language -- especially given the existing +complexity of the trait system. Partly for that reason, the RFC errs on the side +of simplicity in the design wherever possible. + +One aspect of the design that mitigates its complexity somewhat is the fact that +it is entirely opt in: you have to write `default` in an impl in order for +specialization of that item to be possible. That means that all the ways we have +of reasoning about existing code still hold good. When you do opt in to +specialization, the "obviousness" of the specialization rule should mean that +it's easy to tell at a glance which of two impls will be preferred. + +On the other hand, the simplicity of this design has its own drawbacks: + +- You have to lift out trait parameters to enable specialization, as in the + `Extend` example above. The RFC mentions a few ways of dealing with this + limitation -- either by employing inherent item specialization, or by + eventually generalizing HRTBs. + +- You can't use specialization to handle some of the more "exotic" cases of + overlap, as described in the Limitations section above. This is a deliberate + trade, favoring simple rules over maximal expressiveness. + +Finally, if we take it as a given that we want to support some form of +"efficient inheritance" as at least a programming pattern in Rust, the ability +to use specialization to do so, while also getting all of its benefits, is a net +simplifier. The full story there, of course, depends on the forthcoming companion RFC. 
+ +# Alternatives + +The main alternative to specialization in general is an approach based on +negative bounds, such as the one outlined in an +[earlier RFC](https://github.com/rust-lang/rfcs/pull/586). Negative bounds make +it possible to handle many of the examples this proposal can't (the ones in the +Limitations section). But negative bounds are also fundamentally *closed*: they +make it possible to perform a certain amount of specialization up front when +defining a trait, but don't easily support downstream crates further +specializing the trait impls. + +Assuming we want specialization, there are alternative designs for the +specialization rule that allow it to handle more cases, but lose out on +intuition or other important properties. Suppose we want to handle both of the +following examples: + +```rust +trait AsRef { .. } +impl<'a, T, U> AsRef for &'a T where T: AsRef { .. } +impl<'a, T> AsRef for T { .. } + +trait Debug { .. } +trait Display { .. } +impl Debug for T where T: Display { .. } +impl<'a, T> Debug for &'a T where T: Debug { .. } +``` + +In the first example, you might be tempted to say that the first impl is the +"more specific", because *in the case of overlap*, it has a more constrained +where clause. + +In the second example, you might be tempted to say that the second impl is the +"more specific", because in the case of overlap, neither where clause implies +the other, but the second impl imposes more constrained type structure (`&'a T` +rather than `T`). + +These intuitions correspond to an alternative definition of `I <=alt J` where +*either or both* of the following clauses needs to hold: + +- `I` and `J` unify and `I.wc` implies `J.wc` under that unification +- `I` and `J` unify after skolemizing `I.vars` + +The first clause ignores differences in type structure, while the second clause +ignores differences in where clauses. 
Note that `I <=alt J` and `I <= J` are +very similar -- essentially, `I <= J` requires that *both* clauses hold, while +`I <=alt J` permits either of them to hold. + +Using `I <=alt J`, the examples above work out, as do the rest described in +Limitations. But the rule is far less intuitive. And it suffers from some poor +mathematical properties. + +Consider this example: + +```rust +trait Trait1 {} +trait Trait2: Trait1 {} +trait Trait3: Trait2 {} + +trait Foo {} + +impl Foo for () where T: Trait1 {} // impl I +impl Foo for () where U: Trait2 {} // impl J +impl Foo for () where V: Trait3 {} // impl K +``` + +This example is carefully designed so that `K Date: Thu, 9 Jul 2015 12:04:46 -0700 Subject: [PATCH 0703/1195] Add text on explicit ordering alternative --- text/0000-impl-specialization.md | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/text/0000-impl-specialization.md b/text/0000-impl-specialization.md index 2a12837c74b..036515648df 100644 --- a/text/0000-impl-specialization.md +++ b/text/0000-impl-specialization.md @@ -1475,6 +1475,8 @@ simplifier. The full story there, of course, depends on the forthcoming companio # Alternatives +## Alternatives to specialization + The main alternative to specialization in general is an approach based on negative bounds, such as the one outlined in an [earlier RFC](https://github.com/rust-lang/rfcs/pull/586). Negative bounds make @@ -1484,6 +1486,10 @@ make it possible to perform a certain amount of specialization up front when defining a trait, but don't easily support downstream crates further specializing the trait impls. +## Alternative specialization designs + +### Relaxing the rule + Assuming we want specialization, there are alternative designs for the specialization rule that allow it to handle more cases, but lose out on intuition or other important properties. 
Suppose we want to handle both of the @@ -1552,6 +1558,15 @@ something like `<=alt` backwards-compatibly, since the latter allows strictly more specializations (i.e. only changes things in cases that would have previously generated an error). +### Explicit ordering + +Another, perhaps more palatable alternative would be to take the specialization +rule proposed in this RFC, but have some other way of specifying precedence when +that rule can't resolve it -- perhaps by explicit priority numbering. That kind +of mechanism is usually noncompositional, but due to the orphan rule, it's at +least a crate-local concern. Like the alternative rule above, it could be added +backwards compatibly if needed, since it only enables new cases. + # Unresolved questions Finally, there are a few important questions not yet addressed by this RFC: From dcc1875d8a49079218f6ad017fef24ae0814509f Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Thu, 9 Jul 2015 12:05:52 -0700 Subject: [PATCH 0704/1195] Remove dead link --- text/0000-impl-specialization.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/text/0000-impl-specialization.md b/text/0000-impl-specialization.md index 036515648df..29e07ca8bd0 100644 --- a/text/0000-impl-specialization.md +++ b/text/0000-impl-specialization.md @@ -14,9 +14,7 @@ implementations based on specifics about the types involved. Altogether, this relatively small extension to the trait system yields benefits for performance and code reuse, and it lays the groundwork for an "efficient inheritance" scheme that is largely based on the trait system (described in a -[companion RFC][data]). - -[data]: +forthcoming companion RFC).
# Motivation From 6f2b9e46b12b0c43c75c9d4504400ae9c667d0e2 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Mon, 13 Jul 2015 09:53:23 -0700 Subject: [PATCH 0705/1195] A few clarifications throughout --- text/0000-impl-specialization.md | 47 +++++++++++++++++++------------- 1 file changed, 28 insertions(+), 19 deletions(-) diff --git a/text/0000-impl-specialization.md b/text/0000-impl-specialization.md index 29e07ca8bd0..ce87a72892e 100644 --- a/text/0000-impl-specialization.md +++ b/text/0000-impl-specialization.md @@ -102,7 +102,8 @@ pub trait Extend> { // The generic implementation impl Extend for Vec where T: IntoIterator { - fn extend(&mut self, iterable: T) { + // the `default` qualifier allows this method to be specialized below + default fn extend(&mut self, iterable: T) { ... // implementation using push (like today's extend) } } @@ -150,7 +151,8 @@ implementation: ```rust partial impl Add for T { - fn add_assign(&mut self, rhs: R) { + // the `default` qualifier allows further specialization + default fn add_assign(&mut self, rhs: R) { let tmp = self.clone() + rhs; *self = tmp; } @@ -190,10 +192,11 @@ partial impl Iterator for T where T: ExactSizeIterator { ``` As we'll see later, the design of this RFC makes it possible to "lock down" such -method impls, preventing any further refinement (akin to Java's `final` -keyword); that in turn makes it possible to statically enforce the contract that -is supposed to connect the `len` and `size_hint` methods. (Of course, we can't -make *that* particular change, since the relevant APIs are already stable.) +method impls (by not using the `default` qualifier), preventing any further +refinement (akin to Java's `final` keyword); that in turn makes it possible to +statically enforce the contract that is supposed to connect the `len` and +`size_hint` methods. (Of course, we can't make *that* particular change, since +the relevant APIs are already stable.) 
## Supporting efficient inheritance @@ -344,15 +347,18 @@ Furthermore, it doesn't work to say that the compiler can make this kind of assumption *unless* specialization is being used, since we want to allow downstream crates to add specialized impls. We need to know up front. +Another possibility would be to simply disallow specialization of associated +types. But the trouble described above isn't limited to associated types. Every +function/method in a trait has an implicit associated type that implements the +closure types, and similar bad assumptions about blanket impls can crop up +there. It's not entirely clear whether they can be weaponized, however. (That +said, it may be reasonable to stabilize only specialization of functions/methods +to begin with, and wait for strong use cases of associated type specialization +to emerge before stabilizing that.) + The solution proposed in this RFC is instead to treat specialization of items in a trait as a per-item *opt in*, described in the next section. -(As a sidenote, the trouble described above isn't limited to associated -types. Every function/method in a trait has an implicit associated type that -implements the closure types, and similar bad assumptions about blanket impls -can crop up there. It's not entirely clear whether they can be weaponized, -however.) - ## The `default` keyword Many statically-typed languages that allow refinement of behavior in some @@ -473,10 +479,11 @@ because the applicable blanket impl marks the type as `default`. The fact that The main drawbacks of this solution are: - **API evolution**. Adding `default` to an associated type *takes away* some - abilities, which makes it a breaking change. (In principle, this is probably - true for functions/methods as well, but the breakage there is theoretical at - most.) However, given the design constraints discussed so far, this seems like - an inevitable aspect of any simple, backwards-compatible design. 
+ abilities, which makes it a breaking change to a public API. (In principle, + this is probably true for functions/methods as well, but the breakage there is + theoretical at most.) However, given the design constraints discussed so far, + this seems like an inevitable aspect of any simple, backwards-compatible + design. - **Verbosity**. It's possible that certain uses of the trait system will result in typing `default` quite a bit. This RFC takes a conservative approach of @@ -579,7 +586,9 @@ motivating use cases as possible. The basic intuition we've been using for specialization is the idea that one impl is "more specific" than another it overlaps with. Before turning this intuition into a rule, let's go through the previous examples of overlap and -decide which, if any, of the impls is intuitively more specific: +decide which, if any, of the impls is intuitively more specific. **Note that since +we're leaving out the body of the impls, you won't see the `default` keyword +that would be required in practice for the less specialized impls.** ```rust trait Foo {} @@ -1155,7 +1164,7 @@ inherent impls, e.g.: ```rust impl Vec where I: IntoIterator { - fn extend(iter: I) { .. } + default fn extend(iter: I) { .. } } impl Vec { @@ -1177,7 +1186,7 @@ for `extend` look superficially different, although it's clear that the first impl is the more general of the two. For the intended desugaring into "inherent traits" to be coherent, we need to -determine the items signatures. To do this, we apply the following test: +determine the item signatures. To do this, we apply the following test: - Suppose an item of the same kind, named `foo`, occurs in two inherent impl blocks for the same type constructor. 
From 9d0a4de269ac38e715bdfaacf57ce73bfbe394a8 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Mon, 13 Jul 2015 15:00:12 -0700 Subject: [PATCH 0706/1195] Clarify inherent impl desugaring; clarify Felix's alternative --- text/0000-impl-specialization.md | 92 ++++++++++++++++++++++++++------ 1 file changed, 75 insertions(+), 17 deletions(-) diff --git a/text/0000-impl-specialization.md b/text/0000-impl-specialization.md index ce87a72892e..d909139494b 100644 --- a/text/0000-impl-specialization.md +++ b/text/0000-impl-specialization.md @@ -1185,27 +1185,48 @@ match. Thinking about `Vec` above, for example, notice that the two signatures for `extend` look superficially different, although it's clear that the first impl is the more general of the two. -For the intended desugaring into "inherent traits" to be coherent, we need to -determine the item signatures. To do this, we apply the following test: +We propose a very simple-minded conceptual desugaring: each item desugars into a +distinct trait, with type parameters for e.g. each argument and the return +type. All concrete type information then emerges from desugaring into impl +blocks. Thus, for example: -- Suppose an item of the same kind, named `foo`, occurs in two inherent impl - blocks for the same type constructor. +``` +impl Vec where I: IntoIterator { + default fn extend(iter: I) { .. } +} -- If it's possible to unify the two impl headers, then the two signatures for - `foo` must be "equivalent" under that unification. +impl Vec { + fn extend(slice: &[T]) { .. } +} -- Two signatures `S` and `T` are equivalent if: - - After skolemizing all type variables in `S`, it is possible to unify the two signatures, and, - - After that unification, the where clauses in `S` imply those in `T`, and - - Vice versa. +// Desugars to: -Basically, this check ensures that any overlapping inherent items provide -"identical" signatures for the area of overlap. 
That in turn means that callers -can be typechecked against *any* of the definitions that applies, and -typechecking would equally succeed with any other. +trait Vec_extend { + fn extend(Arg) -> Result; +} -Note that this is a breaking change, since examples like the following are -(surprisingly!) allowed today: +impl Vec_extend for Vec where I: IntoIterator { + default fn extend(iter: I) { .. } +} + +impl Vec_extend<&[T], ()> for Vec { + fn extend(slice: &[T]) { .. } +} +``` + +All items of a given name must desugar to the same trait, which means that the +number of arguments must be consistent across all impl blocks for a given `Self` +type. In addition, we require that *all of the impl blocks overlap* (meaning +that there is a single, most general impl). Without these constraints, we would +implicitly be permitting full-blown overloading on both arity and type +signatures. For the time being at least, we want to restrict overloading to +explicit uses of the trait system, as it is today. + +This "desugaring" semantics has the benefits of allowing inherent item +specialization, and also making it *actually* be the case that inherent impls +are really just implicit traits -- unifying the two forms of dispatch. Note that +this is a breaking change, since examples like the following are (surprisingly!) +allowed today: ```rust struct Foo(A, B); @@ -1574,6 +1595,42 @@ of mechanism is usually noncompositional, but due to the orphan rule, it's at least a crate-local concern. Like the alternative rule above, it could be added backwards compatibly if needed, since it only enables new cases. +### Singleton non-default wins + +@pnkfelix suggested the following rule, which allows overlap so long as there is +a unique non-default item. + +> For any given type-based lookup, either: +> +> 0. There are no results (error) +> +> 1. There is only one lookup result, in which case we're done (regardless of +> whether it is tagged as default or not), +> +> 2.
There is a non-empty set of results with defaults, where exactly one +> result is non-default -- and then that non-default result is the answer, +> *or* +> +> 3. There is a non-empty set of results with defaults, where 0 or >1 results +> are non-default (and that is an error). + +This rule is arguably simpler than the one proposed in this RFC, and can +accommodate the examples we've presented throughout. It would also support some +of the cases this RFC cannot, because the default/non-default distinction can be +used to specify an ordering between impls when the subset ordering fails to do +so. For that reason, it is not forward-compatible with the main proposal in this +RFC. + +The downsides are: + +- Because actual dispatch occurs at monomorphization, errors are generated quite + late, and only at use sites, not impl sites. That moves traits much more in + the direction of C++ templates. + +- It's less scalable/compositional: this alternative design forces the + "specialization hierarchy" to be flat, in particular ruling out multiple + levels of increasingly-specialized blanket impls. + # Unresolved questions Finally, there are a few important questions not yet addressed by this RFC: @@ -1591,4 +1648,5 @@ Finally, there are a few important questions not yet addressed by this RFC: all-or-nothing affair, but it would occasionally be useful to say that all further specializations will at least guarantee some additional trait bound on the associated type. This is particularly relevant for the "efficient - inheritance" use case. Such a mechanism can likely be added, if needed, later on. + inheritance" use case. Such a mechanism can likely be added, if needed, later + on. From 63076c9b2d921e0d9154859f96eb43917d4547bf Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 8 Jan 2016 14:22:00 -0800 Subject: [PATCH 0707/1195] Change RFC to use lattice rule. 
--- text/0000-impl-specialization.md | 286 +++++++++++++++++++++++++++---- 1 file changed, 257 insertions(+), 29 deletions(-) diff --git a/text/0000-impl-specialization.md b/text/0000-impl-specialization.md index d909139494b..9b5e49f7403 100644 --- a/text/0000-impl-specialization.md +++ b/text/0000-impl-specialization.md @@ -6,13 +6,13 @@ # Summary This RFC proposes a design for *specialization*, which permits multiple `impl` -blocks to apply to the same type/trait, so long as one of the blocks is clearly -"more specific" than the other. The more specific `impl` block is used in a case -of overlap. The design proposed here also supports refining default trait -implementations based on specifics about the types involved. +blocks to apply to the same type/trait, so long as there is always a clearly +"most specific" impl block that applies. The most specific `impl` block is used +in a case of overlap. The design proposed here also supports refining default +trait implementations based on specifics about the types involved. Altogether, this relatively small extension to the trait system yields benefits -for performance and code reuse, and it lays the groundwork for an "efficient +for performance, expressiveness, and code reuse, and it lays the groundwork for an "efficient inheritance" scheme that is largely based on the trait system (described in a forthcoming companion RFC). @@ -24,6 +24,11 @@ Specialization brings benefits along several different axes: because specialized impls can provide custom high-performance code for particular, concrete cases of an abstraction. +* **Expressiveness**: specialization significantly relaxes the overlapping impl + rules, making it possible to write multiple blanket impls that intersect -- a + desire that has come up over and over again in the standard library and + elsewhere. + * **Reuse**: the design proposed here also supports refining default (but incomplete) implementations of a trait, given details about the types involved. 
@@ -125,6 +130,147 @@ unsafe trait TrustedSizeHint {} that can allow the optimization to apply to a broader set of types than slices, but are still more specific than `T: IntoIterator`. +## Expressiveness + +One frequent motivation for specialization is broader "expressiveness", in +particular providing a larger set of trait implementations than is possible +today. + +For example, the standard library currently includes an `AsRef` trait +for "as-style" conversions: + +```rust +pub trait AsRef where T: ?Sized { + fn as_ref(&self) -> &T; +} +``` + +Currently, there is also a blanket implementation as follows: + +```rust +impl<'a, T: ?Sized, U: ?Sized> AsRef for &'a T where T: AsRef { + fn as_ref(&self) -> &U { + >::as_ref(*self) + } +} +``` + +which allows these conversions to "lift" over references, which is in turn +important for making a number of standard library APIs ergonomic. + +On the other hand, we'd also like to provide the following very simple +blanket implementation: + +```rust +impl<'a, T: ?Sized> AsRef for T { + fn as_ref(&self) -> &T { + self + } +} +``` + +The current coherence rules prevent having both impls, however, +because they can in principle overlap: + +```rust +AsRef<&'a T> for &'a T where T: AsRef<&'a T> +``` + +Another example comes from the `Option` type, which currently provides two +methods for unwrapping while providing a default value for the `None` case: + +```rust +impl Option { + fn unwrap_or(self, def: T) -> T { ... } + fn unwrap_or_else(self, f: F) -> T where F: FnOnce() -> T { .. } +} +``` + +The `unwrap_or` method is more ergonomic but `unwrap_or_else` is more efficient +in the case that the default is expensive to compute. 
The original +[collections reform RFC](https://github.com/rust-lang/rfcs/pull/235) proposed a +`ByNeed` trait that was rendered unworkable after unboxed closures landed: + +```rust +trait ByNeed { + fn compute(self) -> T; +} + +impl ByNeed for T { + fn compute(self) -> T { + self + } +} + +impl ByNeed for F where F: FnOnce() -> T { + fn compute(self) -> T { + self() + } +} + +impl Option { + fn unwrap_or(self, def: U) where U: ByNeed { ... } + ... +} +``` + +The trait represents any value that can produce a `T` on demand. But the above +impls fail to compile in today's Rust, because they overlap: consider `ByNeed +for F` where `F: FnOnce() -> F`. + +There are also some trait hierarchies where a subtrait completely subsumes the +functionality of a supertrait. For example, consider `PartialOrd` and `Ord`: + +```rust +trait PartialOrd: PartialEq { + fn partial_cmp(&self, other: &Rhs) -> Option; +} + +trait Ord: Eq + PartialOrd { + fn cmp(&self, other: &Self) -> Ordering; +} +``` + +In cases like this, it's somewhat annoying to have to provide an impl for *both* +`Ord` and `PartialOrd`, since the latter can be trivially derived from the +former. 
So you might want an impl like this: + +```rust +impl<T> PartialOrd<T> for T where T: Ord { + fn partial_cmp(&self, other: &T) -> Option<Ordering> { + Some(self.cmp(other)) + } +} +``` + +But this blanket impl would conflict with a number of others that work to "lift" +`PartialOrd` and `Ord` impls over various type constructors like references and +tuples, e.g.: + +```rust +impl<'a, A: ?Sized> Ord for &'a A where A: Ord { + fn cmp(&self, other: & &'a A) -> Ordering { Ord::cmp(*self, *other) } +} + +impl<'a, 'b, A: ?Sized, B: ?Sized> PartialOrd<&'b B> for &'a A where A: PartialOrd<B> { + fn partial_cmp(&self, other: &&'b B) -> Option<Ordering> { + PartialOrd::partial_cmp(*self, *other) + } +``` + +The case where they overlap boils down to: + +```rust +PartialOrd<&'a T> for &'a T where &'a T: Ord +PartialOrd<&'a T> for &'a T where T: PartialOrd +``` + +There are many other examples along these lines. + +Specialization as proposed in this RFC greatly relaxes the overlap rules, even +allowing impls to only partially overlap, as long as there is another +yet-more-specialized impl that disambiguates the portion of partial overlap. + ## Reuse Today's default methods in traits are pretty limited: they can assume only the @@ -219,8 +365,12 @@ nicely with an orthogonal mechanism for virtual dispatch to give a complete story for the "efficient inheritance" goal that many previous RFCs targeted. The author is preparing a companion RFC showing how this can be done with a -relatively small further extension to the language. But it should be said that -the design in *this* RFC is fully motivated independently of its companion RFC. +relatively small further extension to the language. In the meantime, you can +find a blog post laying out the basic ideas +[here](http://aturon.github.io/blog/2015/09/18/reuse/). + +But it should be said that the design in *this* RFC is fully motivated +independently of its companion RFC.
# Detailed design @@ -357,7 +507,8 @@ to begin with, and wait for strong use cases of associated type specialization to emerge before stabilizing that.) The solution proposed in this RFC is instead to treat specialization of items in -a trait as a per-item *opt in*, described in the next section. +a trait as a per-item *opt in*, described in the next section. (This opt in, it +should be noted, is desirable for other reasons as well.) ## The `default` keyword @@ -576,7 +727,7 @@ impl<'a, T, U> Bar for &'a U where U: Bar {} ### Permitting overlap The goal of specialization is to allow overlapping impls, but it's not as simple -as permitting *all* overlap. There has to be a way to decide which of two +as permitting *all* overlap. There has to be a way to decide which of several overlapping impls to actually use for a given set of input types. The simpler and more intuitive the rule for deciding, the easier it is to write and reason about code -- and since dispatch is already quite complicated, simplicity here @@ -584,11 +735,14 @@ is a high priority. On the other hand, the design should support as many of the motivating use cases as possible. The basic intuition we've been using for specialization is the idea that one -impl is "more specific" than another it overlaps with. Before turning this -intuition into a rule, let's go through the previous examples of overlap and -decide which, if any, of the impls is intuitively more specific. **Note that since -we're leaving out the body of the impls, you won't see the `default` keyword -that would be required in practice for the less specialized impls.** +impl can be "more specific" than another, and that for any given type there +should be at most one "most specific" impl that applies. + +Before turning this intuition into a rule, let's go through the previous +examples of overlap and decide which, if any, of the impls is intuitively more +specific. 
**Note that since we're leaving out the body of the impls, you won't +see the `default` keyword that would be required in practice for the less +specialized impls.** ```rust trait Foo {} @@ -666,14 +820,45 @@ We'll say that `I < J` if `I <= J` and `!(J <= I)`. In this case, `I` is *more specialized* than `J`. To ensure specialization is coherent, we will ensure that for any two impls `I` -and `J` that overlap, we have either `I < J` or `J < I`. That is, one must be -truly more specific than the other. Specialization chooses the "smallest" impl -in this order -- and the new overlap rule ensures there is a unique smallest -impl among those that apply to a given set of input types. +and `J` that overlap, there must be an impl that is *precisely their +intersection*. That intersecting impl might just *be* one of `I` or `J` -- in +other words, the rule is automatically satisfied if `I < J` or `J < I`. + +For example: + +```rust +trait Foo {} +trait Trait1 {} +trait Trait2 {} +trait Trait3 {} + +// these two impls overlap without one being more specific than the other: +impl Foo for T where T: Trait1 {} +impl Foo for T where T: Trait2 {} -More broadly, while `<=` is not a total order on *all* impls of a given trait, -it will be a total order on any set of impls that all mutually overlap, which is -all we need to determine which impl to use. +// ... but this one gives their intersection: +impl Foo for T where T: Trait1 + Trait2 {} + +// ... and this one is just more specific than all the others +impl Foo for T where T: Trait1 + Trait2 + Trait3 {} +``` + +Note in particular that the intersection of the last two impls *is* the last +impl. + +This rule guarantees that, given any concrete type that has at least one +applicable impl, we'll be able to find a *single most-specific impl* that +applies. 
In other words if you take the set of all applicable impls +`ALL_APPLICABLE_IMPLS`: + +- There will be some `I` in `ALL_APPLICABLE_IMPLS` such that: + - For all `J` in `ALL_APPLICABLE_IMPLS` such that: + - `I <= J` + +And this most-specific impl is what we'll dispatch to. + +(It's worth pausing to think for a moment how this rule can lead to dispatching +to each of the impls in the example above.) We'll start with an abstract/high-level formulation, and then build up toward an algorithm for deciding specialization by introducing a number of building @@ -748,7 +933,32 @@ impl Bar for T {} ``` The same reasoning can be applied to all of the examples we saw earlier, and the -reader is encouraged to do so. We'll look at one of the more subtle cases here: +reader is encouraged to do so. We'll look at some of the more subtle cases here: + +```rust +trait Foo {} +trait Trait1 {} +trait Trait2 {} +trait Trait3 {} + +// impl I +// apply(I) = { T | T: Trait1 } +impl Foo for T where T: Trait1 {} + +// impl J +// apply(J) = { T | T: Trait2 } +impl Foo for T where T: Trait2 {} + +// Neither I < J or J < I, but: + +// impl K +// apply(K) = { T | T: Trait1, T: Trait2 } +impl Foo for T where T: Trait1 + Trait2 {} + +// apply(I) intersect apply(J) = apply(K) +// apply(I) intersect apply(K) = apply(K) +// apply(J) intersect apply(K) = apply(K) +``` ```rust // impl I @@ -761,8 +971,26 @@ impl<'a, T, U> Bar for &'a U where U: Bar {} ``` The claim is that `apply(I)` and `apply(J)` intersect, but neither contains the -other. Thus, these two impls are not permitted to coexist according to this -RFC's design. (We'll revisit this limitation toward the end of the RFC.) +other. 
Thus, for these impls to coexist, there must be a *third* impl for which +`apply` is precisely the intersection: + +```rust +// impl I +// apply(I) = { (T, T) | T any type } +impl Bar for T {} + +// impl J +// apply(J) = { (T, &'a U) | U: Bar, 'a any lifetime } +impl<'a, T, U> Bar for &'a U where U: Bar {} + +// impl K +// apply(K) = { (&'a T, &'a T) | T: Bar<&'a T>, 'a any lifetime } +impl<'a, T> Bar<&'a T> for &'a T where T: Bar<&'a T> {} + +// apply(I) intersect apply(J) = apply(K) +// apply(I) intersect apply(K) = apply(K) +// apply(J) intersect apply(K) = apply(K) +``` #### Algorithmic formulation @@ -929,7 +1157,8 @@ impl Foo for T {} // there's no relationship between the traits he In comparing these two impls in either direction, we make it past unification and must try to prove that one where clause implies another. But `T: Trait1` does not imply `T: Trait2`, nor vice versa, so neither impl is more specific -than the other. Since the impls do overlap, an ambiguity error is reported. +than the other. Since the impls do overlap, there must be a third impl for their +intersection (`T: Trait1 + Trait2`). On the other hand: @@ -950,12 +1179,11 @@ impl Foo for T < impl Foo for T ##### Key properties -Remember that for each pair of impls `I`, `J`, the compiler will check that -exactly one of the following holds: +For each pair of impls `I`, `J`, the compiler will check that exactly one of the +following holds: - `I` and `J` do not overlap (a unification check), or else -- `I < J`, or else -- `J < I` +- There is some impl `K` such that `apply(K) = apply(I) intersect apply(J)` Since `I <= J` ultimately boils down to a subset relationship, we get a lot of nice properties for free (e.g., transitivity: if `I <= J <= K` then `I <= K`). 
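
The pairwise check described in the key properties can be played with in a toy model written in plain, stable Rust. This is only a sketch and is not part of the RFC or of any compiler: it assumes every impl has the shape `impl<T> Foo for T where T: ...`, so an impl is fully described by its set of bounds, any two impls overlap, and `I <= J` reduces to "`I` demands at least the bounds of `J`". The names `Impl`, `imp`, `le`, `lattice_ok`, and `dispatch` are hypothetical and chosen for illustration.

```rust
use std::collections::BTreeSet;

/// Hypothetical model of `impl<T> Foo for T where T: B1 + B2 + ...`:
/// the impl is represented only by the set of bounds it demands.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Impl {
    name: &'static str,
    bounds: BTreeSet<&'static str>,
}

/// Convenience constructor (an assumption of this sketch).
fn imp(name: &'static str, bounds: &[&'static str]) -> Impl {
    Impl { name, bounds: bounds.iter().copied().collect() }
}

/// `i <= j`: `i` applies to a subset of the types `j` applies to.
/// In this model that means `i` demands every bound that `j` demands.
fn le(i: &Impl, j: &Impl) -> bool {
    j.bounds.is_subset(&i.bounds)
}

/// The pairwise rule: for every two impls there must be an impl whose
/// `apply` set is exactly their intersection. Here, that is an impl whose
/// bounds are exactly the union of the two bound sets.
fn lattice_ok(impls: &[Impl]) -> bool {
    impls.iter().all(|i| {
        impls.iter().all(|j| {
            let meet: BTreeSet<&'static str> =
                i.bounds.union(&j.bounds).copied().collect();
            impls.iter().any(|k| k.bounds == meet)
        })
    })
}

/// Pick the single most specific impl that applies to a type implementing
/// exactly the traits in `implemented`, or `None` if no impl applies or
/// no applicable impl is below all the others.
fn dispatch<'a>(impls: &'a [Impl], implemented: &[&'static str]) -> Option<&'a Impl> {
    let have: BTreeSet<&'static str> = implemented.iter().copied().collect();
    let applicable: Vec<&'a Impl> = impls
        .iter()
        .filter(|i| i.bounds.is_subset(&have))
        .collect();
    applicable
        .iter()
        .copied()
        .find(|&i| applicable.iter().all(|&j| le(i, j)))
}
```

In this model, a set holding only `imp("I", &["Trait1"])` and `imp("J", &["Trait2"])` is rejected by `lattice_ok`, while adding `imp("K", &["Trait1", "Trait2"])` makes it pass. `dispatch` then selects `K` for a type implementing both traits and `I` for a type implementing only `Trait1`. Real impls also differ in type structure, which this sketch deliberately ignores.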
From 84776273978061b2ac5b62196267677e3f2c43c4 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 8 Jan 2016 14:22:36 -0800 Subject: [PATCH 0708/1195] Move inherent impl specialization to future enhancements --- text/0000-impl-specialization.md | 316 +++++++++---------------------- 1 file changed, 85 insertions(+), 231 deletions(-) diff --git a/text/0000-impl-specialization.md b/text/0000-impl-specialization.md index 9b5e49f7403..2cfb0cbff65 100644 --- a/text/0000-impl-specialization.md +++ b/text/0000-impl-specialization.md @@ -1340,7 +1340,12 @@ that item. Such a relaxed approach is much more flexible, probably easier to work with, and can enable more code reuse -- but it's also more complicated, and backwards-compatible to add on top of the proposed conservative approach. -## Inherent impls +## Possible extensions + +It's worth briefly mentioning a couple of mechanisms that one could consider +adding on top of specialization. + +### Inherent impls It has long been folklore that inherent impls can be thought of as special, anonymous traits that are: @@ -1400,241 +1405,14 @@ impl Vec { } ``` -This RFC proposes to permit such specialization at the inherent impl level. The -semantics is defined in terms of the folklore desugaring above. +We could permit such specialization at the inherent impl level. The +semantics would be defined in terms of the folklore desugaring above. (Note: this example was chosen purposefully: it's possible to use specialization at the inherent impl level to avoid refactoring the `Extend` trait as described in the Motivation section.) -One tricky aspect here is that, since there is no explicit trait definition, -there is no general signature that each definition of an inherent item must -match. Thinking about `Vec` above, for example, notice that the two signatures -for `extend` look superficially different, although it's clear that the first -impl is the more general of the two. 
- -We propose a very simple-minded conceptual desugaring: each item desugars into a -distinct trait, with type parameters for e.g. each argument and the return -type. All concrete type information then emerges from desugaring into impl -blocks. Thus, for example: - -``` -impl Vec where I: IntoIterator { - default fn extend(iter: I) { .. } -} - -impl Vec { - fn extend(slice: &[T]) { .. } -} - -// Desugars to: - -trait Vec_extend { - fn extend(Arg) -> Result; -} - -impl Vec_extend for Vec where I: IntoIterator { - default fn extend(iter: I) { .. } -} - -impl Vec_extend<&[T], ()> for Vec { - fn extend(slice: &[T]) { .. } -} -``` - -All items of a given name must desugar to the same trait, which means that the -number of arguments must be consistent across all impl blocks for a given `Self` -type. In addition, we require that *all of the impl blocks overlap* (meaning -that there is a single, most general impl). Without these constraints, we would -implicitly be permitting full-blown overloading on both arity and type -signatures. For the time being at least, we want to restrict overloading to -explicit uses of the trait system, as it is today. - -This "desugaring" semantics has the benefits of allowing inherent item -specialization, and also making it *actually* be the case that inherent impls -are really just implicit traits -- unifying the two forms of dispatch. Note that -this is a breaking change, since examples like the following are (surprisingly!) 
-allowed today: - -```rust -struct Foo(A, B); - -impl Foo { - fn foo(&self, _: u32) {} -} - -impl Foo { - fn foo(&self, _: bool) {} -} - -fn use_foo(f: Foo) { - f.foo(true) -} -``` - -As has been proposed -[elsewhere](https://internals.rust-lang.org/t/pre-rfc-adjust-default-object-bounds/2199/), -this "breaking change" would be made available through a feature flag that must -be used even after stabilization (to opt in to specialization of inherent -impls); the full details will depend on pending revisions to -[RFC 1122](https://github.com/rust-lang/rfcs/pull/1122). - -## Limitations - -One frequent motivation for specialization is broader "expressiveness", in -particular providing a larger set of trait implementations than is possible -today. - -For example, the standard library currently includes an `AsRef` trait -for "as-style" conversions: - -```rust -pub trait AsRef where T: ?Sized { - fn as_ref(&self) -> &T; -} -``` - -Currently, there is also a blanket implementation as follows: - -```rust -impl<'a, T: ?Sized, U: ?Sized> AsRef for &'a T where T: AsRef { - fn as_ref(&self) -> &U { - >::as_ref(*self) - } -} -``` - -which allows these conversions to "lift" over references, which is in turn -important for making a number of standard library APIs ergonomic. - -On the other hand, we'd also like to provide the following very simple -blanket implementation: - -```rust -impl<'a, T: ?Sized> AsRef for T { - fn as_ref(&self) -> &T { - self - } -} -``` - -The current coherence rules prevent having both impls, however, -because they can in principle overlap: - -```rust -AsRef<&'a T> for &'a T where T: AsRef<&'a T> -``` - -Another examples comes from the `Option` type, which currently provides two -methods for unwrapping while providing a default value for the `None` case: - -```rust -impl Option { - fn unwrap_or(self, def: T) -> T { ... } - fn unwrap_or_else(self, f: F) -> T where F: FnOnce() -> T { .. 
} -} -``` - -The `unwrap_or` method is more ergonomic but `unwrap_or_else` is more efficient -in the case that the default is expensive to compute. The original -[collections reform RFC](https://github.com/rust-lang/rfcs/pull/235) proposed a -`ByNeed` trait that was rendered unworkable after unboxed closures landed: - -```rust -trait ByNeed { - fn compute(self) -> T; -} - -impl ByNeed for T { - fn compute(self) -> T { - self - } -} - -impl ByNeed for F where F: FnOnce() -> T { - fn compute(self) -> T { - self() - } -} - -impl Option { - fn unwrap_or(self, def: U) where U: ByNeed { ... } - ... -} -``` - -The trait represents any value that can produce a `T` on demand. But the above -impls fail to compile in today's Rust, because they overlap: consider `ByNeed -for F` where `F: FnOnce() -> F`. - -There are also some trait hierarchies where a subtrait completely subsumes the -functionality of a supertrait. For example, consider `PartialOrd` and `Ord`: - -```rust -trait PartialOrd: PartialEq { - fn partial_cmp(&self, other: &Rhs) -> Option; -} - -trait Ord: Eq + PartialOrd { - fn cmp(&self, other: &Self) -> Ordering; -} -``` - -In cases like this, it's somewhat annoying to have to provide an impl for *both* -`Ord` and `PartialOrd`, since the latter can be trivially derived from the -former. 
So you might want an impl like this: - -```rust -impl PartialOrd for T where T: Ord { - fn partial_cmp(&self, other: &T) -> Option { - Some(self.cmp(other)) - } -} -``` - -But this blanket impl would conflict with a number of others that work to "lift" -`PartialOrd` and `Ord` impls over various type constructors like references and -tuples, e.g.: - -```rust -impl<'a, A: ?Sized> Ord for &'a A where A: Ord { - fn cmp(&self, other: & &'a A) -> Ordering { Ord::cmp(*self, *other) } -} - -impl<'a, 'b, A: ?Sized, B: ?Sized> PartialOrd<&'b B> for &'a A where A: PartialOrd { - fn partial_cmp(&self, other: &&'b B) -> Option { - PartialOrd::partial_cmp(*self, *other) - } -``` - -The case where they overlap boils down to: - -```rust -PartialOrd<&'a T> for &'a T where &'a T: Ord -PartialOrd<&'a T> for &'a T where T: PartialOrd -``` - -and there is no implication between either of the where clauses. - -There are many other examples along these lines. - -Unfortunately, *none* of these examples are permitted by the revised overlap -rule in this RFC, because in none of these cases is one of the impls fully a -"subset" of the other; the overlap is always partial. - -It's a shame to not be able to address these cases, but the benefit is a -specialization rule that is very intuitive and accepts only very clear-cut -cases. The Alternatives section sketches some different rules that are less -intuitive but do manage to handle cases like those above. - -If we allowed "relaxed" partial impls as described above, one could at least use -that mechanism to avoid having to give a definition directly in most cases. (So -if you had `T: Ord` you could write `impl PartialOrd for T {}`.) - -## Possible extensions - -It's worth briefly mentioning a couple of mechanisms that one could consider -adding on top of specialization. +There are more details about this idea in the appendix. 
### Super @@ -1878,3 +1656,79 @@ Finally, there are a few important questions not yet addressed by this RFC: the associated type. This is particularly relevant for the "efficient inheritance" use case. Such a mechanism can likely be added, if needed, later on. + +# Appendix + +## More details on inherent impls + +One tricky aspect for specializing inherent impls is that, since there is no +explicit trait definition, there is no general signature that each definition of +an inherent item must match. Thinking about `Vec` above, for example, notice +that the two signatures for `extend` look superficially different, although it's +clear that the first impl is the more general of the two. + +It's workable to use a very simple-minded conceptual desugaring: each item +desugars into a distinct trait, with type parameters for e.g. each argument and +the return type. All concrete type information then emerges from desugaring into +impl blocks. Thus, for example: + +``` +impl Vec where I: IntoIterator { + default fn extend(iter: I) { .. } +} + +impl Vec { + fn extend(slice: &[T]) { .. } +} + +// Desugars to: + +trait Vec_extend { + fn extend(Arg) -> Result; +} + +impl Vec_extend for Vec where I: IntoIterator { + default fn extend(iter: I) { .. } +} + +impl Vec_extend<&[T], ()> for Vec { + fn extend(slice: &[T]) { .. } +} +``` + +All items of a given name must desugar to the same trait, which means that the +number of arguments must be consistent across all impl blocks for a given `Self` +type. In addition, we'd require that *all of the impl blocks overlap* (meaning +that there is a single, most general impl). Without these constraints, we would +implicitly be permitting full-blown overloading on both arity and type +signatures. For the time being at least, we want to restrict overloading to +explicit uses of the trait system, as it is today. 
+ +This "desugaring" semantics has the benefits of allowing inherent item +specialization, and also making it *actually* be the case that inherent impls +are really just implicit traits -- unifying the two forms of dispatch. Note that +this is a breaking change, since examples like the following are (surprisingly!) +allowed today: + +```rust +struct Foo(A, B); + +impl Foo { + fn foo(&self, _: u32) {} +} + +impl Foo { + fn foo(&self, _: bool) {} +} + +fn use_foo(f: Foo) { + f.foo(true) +} +``` + +As has been proposed +[elsewhere](https://internals.rust-lang.org/t/pre-rfc-adjust-default-object-bounds/2199/), +this "breaking change" could be made available through a feature flag that must +be used even after stabilization (to opt in to specialization of inherent +impls); the full details will depend on pending revisions to +[RFC 1122](https://github.com/rust-lang/rfcs/pull/1122). From 9388c846dc95107b10a8d26b80be0cb75144e60b Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 8 Jan 2016 14:23:10 -0800 Subject: [PATCH 0709/1195] Remove old alternative specialization rule --- text/0000-impl-specialization.md | 70 -------------------------------- 1 file changed, 70 deletions(-) diff --git a/text/0000-impl-specialization.md b/text/0000-impl-specialization.md index 2cfb0cbff65..68f3d859bf3 100644 --- a/text/0000-impl-specialization.md +++ b/text/0000-impl-specialization.md @@ -1522,76 +1522,6 @@ specializing the trait impls. ## Alternative specialization designs -### Relaxing the rule - -Assuming we want specialization, there are alternative designs for the -specialization rule that allow it to handle more cases, but lose out on -intuition or other important properties. Suppose we want to handle both of the -following examples: - -```rust -trait AsRef { .. } -impl<'a, T, U> AsRef for &'a T where T: AsRef { .. } -impl<'a, T> AsRef for T { .. } - -trait Debug { .. } -trait Display { .. } -impl Debug for T where T: Display { .. 
} -impl<'a, T> Debug for &'a T where T: Debug { .. } -``` - -In the first example, you might be tempted to say that the first impl is the -"more specific", because *in the case of overlap*, it has a more constrained -where clause. - -In the second example, you might be tempted to say that the second impl is the -"more specific", because in the case of overlap, neither where clause implies -the other, but the second impl imposes more constrained type structure (`&'a T` -rather than `T`). - -These intuitions correspond to an alternative definition of `I <=alt J` where -*either or both* of the following clauses needs to hold: - -- `I` and `J` unify and `I.wc` implies `J.wc` under that unification -- `I` and `J` unify after skolemizing `I.vars` - -The first clause ignores differences in type structure, while the second clause -ignores differences in where clauses. Note that `I <=alt J` and `I <= J` are -very similar -- essentially, `I <= J` requires that *both* clauses hold, while -`I <=alt J` permits either of them to hold. - -Using `I <=alt J`, the examples above work out, as do the rest described in -Limitations. But the rule is far less intuitive. And it suffers from some poor -mathematical properties. - -Consider this example: - -```rust -trait Trait1 {} -trait Trait2: Trait1 {} -trait Trait3: Trait2 {} - -trait Foo {} - -impl Foo for () where T: Trait1 {} // impl I -impl Foo for () where U: Trait2 {} // impl J -impl Foo for () where V: Trait3 {} // impl K -``` - -This example is carefully designed so that `K Date: Fri, 8 Jan 2016 14:29:24 -0800 Subject: [PATCH 0710/1195] Add a bit of text about error messages. --- text/0000-impl-specialization.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/text/0000-impl-specialization.md b/text/0000-impl-specialization.md index 68f3d859bf3..e4e863fe252 100644 --- a/text/0000-impl-specialization.md +++ b/text/0000-impl-specialization.md @@ -860,6 +860,10 @@ And this most-specific impl is what we'll dispatch to. 
(It's worth pausing to think for a moment how this rule can lead to dispatching to each of the impls in the example above.) +One nice thing about this approach is that, if there is an overlap without there +being an intersecting impl, the compiler can tell the programmer *precisely +which impl needs to be written* to disambiguate the overlapping portion. + We'll start with an abstract/high-level formulation, and then build up toward an algorithm for deciding specialization by introducing a number of building blocks. @@ -1185,6 +1189,10 @@ following holds: - `I` and `J` do not overlap (a unification check), or else - There is some impl `K` such that `apply(K) = apply(I) intersect apply(J)` +Recall also that if there is an overlap without there being an intersecting +impl, the compiler can tell the programmer *precisely which impl needs to be +written* to disambiguate the overlapping portion. + Since `I <= J` ultimately boils down to a subset relationship, we get a lot of nice properties for free (e.g., transitivity: if `I <= J <= K` then `I <= K`). Together with the compiler check above, we know that at monomorphization time, From b251507fa06379ffc1057b1772e9e6950d9d5584 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 26 Jan 2016 11:15:46 -0800 Subject: [PATCH 0711/1195] Revert "Change RFC to use lattice rule." This reverts commit 772153c18a9a7a83bc20aef3aa1a24d757e2809b. --- text/0000-impl-specialization.md | 286 ++++--------------------------- 1 file changed, 29 insertions(+), 257 deletions(-) diff --git a/text/0000-impl-specialization.md b/text/0000-impl-specialization.md index e4e863fe252..db6fc21ad70 100644 --- a/text/0000-impl-specialization.md +++ b/text/0000-impl-specialization.md @@ -6,13 +6,13 @@ # Summary This RFC proposes a design for *specialization*, which permits multiple `impl` -blocks to apply to the same type/trait, so long as there is always a clearly -"most specific" impl block that applies. 
The most specific `impl` block is used -in a case of overlap. The design proposed here also supports refining default -trait implementations based on specifics about the types involved. +blocks to apply to the same type/trait, so long as one of the blocks is clearly +"more specific" than the other. The more specific `impl` block is used in a case +of overlap. The design proposed here also supports refining default trait +implementations based on specifics about the types involved. Altogether, this relatively small extension to the trait system yields benefits -for performance, expressiveness, and code reuse, and it lays the groundwork for an "efficient +for performance and code reuse, and it lays the groundwork for an "efficient inheritance" scheme that is largely based on the trait system (described in a forthcoming companion RFC). @@ -24,11 +24,6 @@ Specialization brings benefits along several different axes: because specialized impls can provide custom high-performance code for particular, concrete cases of an abstraction. -* **Expressiveness**: specialization significantly relaxes the overlapping impl - rules, making it possible to write multiple blanket impls that intersect -- a - desire that has come up over and over again in the standard library and - elsewhere. - * **Reuse**: the design proposed here also supports refining default (but incomplete) implementations of a trait, given details about the types involved. @@ -130,147 +125,6 @@ unsafe trait TrustedSizeHint {} that can allow the optimization to apply to a broader set of types than slices, but are still more specific than `T: IntoIterator`. -## Expressiveness - -One frequent motivation for specialization is broader "expressiveness", in -particular providing a larger set of trait implementations than is possible -today. 
- -For example, the standard library currently includes an `AsRef` trait -for "as-style" conversions: - -```rust -pub trait AsRef where T: ?Sized { - fn as_ref(&self) -> &T; -} -``` - -Currently, there is also a blanket implementation as follows: - -```rust -impl<'a, T: ?Sized, U: ?Sized> AsRef for &'a T where T: AsRef { - fn as_ref(&self) -> &U { - >::as_ref(*self) - } -} -``` - -which allows these conversions to "lift" over references, which is in turn -important for making a number of standard library APIs ergonomic. - -On the other hand, we'd also like to provide the following very simple -blanket implementation: - -```rust -impl<'a, T: ?Sized> AsRef for T { - fn as_ref(&self) -> &T { - self - } -} -``` - -The current coherence rules prevent having both impls, however, -because they can in principle overlap: - -```rust -AsRef<&'a T> for &'a T where T: AsRef<&'a T> -``` - -Another example comes from the `Option` type, which currently provides two -methods for unwrapping while providing a default value for the `None` case: - -```rust -impl Option { - fn unwrap_or(self, def: T) -> T { ... } - fn unwrap_or_else(self, f: F) -> T where F: FnOnce() -> T { .. } -} -``` - -The `unwrap_or` method is more ergonomic but `unwrap_or_else` is more efficient -in the case that the default is expensive to compute. The original -[collections reform RFC](https://github.com/rust-lang/rfcs/pull/235) proposed a -`ByNeed` trait that was rendered unworkable after unboxed closures landed: - -```rust -trait ByNeed { - fn compute(self) -> T; -} - -impl ByNeed for T { - fn compute(self) -> T { - self - } -} - -impl ByNeed for F where F: FnOnce() -> T { - fn compute(self) -> T { - self() - } -} - -impl Option { - fn unwrap_or(self, def: U) where U: ByNeed { ... } - ... -} -``` - -The trait represents any value that can produce a `T` on demand. But the above -impls fail to compile in today's Rust, because they overlap: consider `ByNeed -for F` where `F: FnOnce() -> F`. 
- -There are also some trait hierarchies where a subtrait completely subsumes the -functionality of a supertrait. For example, consider `PartialOrd` and `Ord`: - -```rust -trait PartialOrd: PartialEq { - fn partial_cmp(&self, other: &Rhs) -> Option; -} - -trait Ord: Eq + PartialOrd { - fn cmp(&self, other: &Self) -> Ordering; -} -``` - -In cases like this, it's somewhat annoying to have to provide an impl for *both* -`Ord` and `PartialOrd`, since the latter can be trivially derived from the -former. So you might want an impl like this: - -```rust -impl PartialOrd for T where T: Ord { - fn partial_cmp(&self, other: &T) -> Option { - Some(self.cmp(other)) - } -} -``` - -But this blanket impl would conflict with a number of others that work to "lift" -`PartialOrd` and `Ord` impls over various type constructors like references and -tuples, e.g.: - -```rust -impl<'a, A: ?Sized> Ord for &'a A where A: Ord { - fn cmp(&self, other: & &'a A) -> Ordering { Ord::cmp(*self, *other) } -} - -impl<'a, 'b, A: ?Sized, B: ?Sized> PartialOrd<&'b B> for &'a A where A: PartialOrd { - fn partial_cmp(&self, other: &&'b B) -> Option { - PartialOrd::partial_cmp(*self, *other) - } -``` - -The case where they overlap boils down to: - -```rust -PartialOrd<&'a T> for &'a T where &'a T: Ord -PartialOrd<&'a T> for &'a T where T: PartialOrd -``` - -There are many other examples along these lines. - -Specialization as proposed in this RFC greatly relaxes the overlap rules, even -allowing impls to only partially overlap, as long as there is another -yet-more-specialized impl that disambiguates the portion of partial overlap. - ## Reuse Today's default methods in traits are pretty limited: they can assume only the @@ -365,12 +219,8 @@ nicely with an orthogonal mechanism for virtual dispatch to give a complete story for the "efficient inheritance" goal that many previous RFCs targeted. 
The author is preparing a companion RFC showing how this can be done with a -relatively small further extension to the language. In the meantime, you can -find a blog post laying out the basic ideas -[here](http://aturon.github.io/blog/2015/09/18/reuse/). - -But it should be said that the design in *this* RFC is fully motivated -independently of its companion RFC. +relatively small further extension to the language. But it should be said that +the design in *this* RFC is fully motivated independently of its companion RFC. # Detailed design @@ -507,8 +357,7 @@ to begin with, and wait for strong use cases of associated type specialization to emerge before stabilizing that.) The solution proposed in this RFC is instead to treat specialization of items in -a trait as a per-item *opt in*, described in the next section. (This opt in, it -should be noted, is desirable for other reasons as well.) +a trait as a per-item *opt in*, described in the next section. ## The `default` keyword @@ -727,7 +576,7 @@ impl<'a, T, U> Bar for &'a U where U: Bar {} ### Permitting overlap The goal of specialization is to allow overlapping impls, but it's not as simple -as permitting *all* overlap. There has to be a way to decide which of several +as permitting *all* overlap. There has to be a way to decide which of two overlapping impls to actually use for a given set of input types. The simpler and more intuitive the rule for deciding, the easier it is to write and reason about code -- and since dispatch is already quite complicated, simplicity here @@ -735,14 +584,11 @@ is a high priority. On the other hand, the design should support as many of the motivating use cases as possible. The basic intuition we've been using for specialization is the idea that one -impl can be "more specific" than another, and that for any given type there -should be at most one "most specific" impl that applies. 
- -Before turning this intuition into a rule, let's go through the previous -examples of overlap and decide which, if any, of the impls is intuitively more -specific. **Note that since we're leaving out the body of the impls, you won't -see the `default` keyword that would be required in practice for the less -specialized impls.** +impl is "more specific" than another it overlaps with. Before turning this +intuition into a rule, let's go through the previous examples of overlap and +decide which, if any, of the impls is intuitively more specific. **Note that since +we're leaving out the body of the impls, you won't see the `default` keyword +that would be required in practice for the less specialized impls.** ```rust trait Foo {} @@ -820,45 +666,14 @@ We'll say that `I < J` if `I <= J` and `!(J <= I)`. In this case, `I` is *more specialized* than `J`. To ensure specialization is coherent, we will ensure that for any two impls `I` -and `J` that overlap, there must be an impl that is *precisely their -intersection*. That intersecting impl might just *be* one of `I` or `J` -- in -other words, the rule is automatically satisfied if `I < J` or `J < I`. - -For example: - -```rust -trait Foo {} -trait Trait1 {} -trait Trait2 {} -trait Trait3 {} - -// these two impls overlap without one being more specific than the other: -impl Foo for T where T: Trait1 {} -impl Foo for T where T: Trait2 {} - -// ... but this one gives their intersection: -impl Foo for T where T: Trait1 + Trait2 {} - -// ... and this one is just more specific than all the others -impl Foo for T where T: Trait1 + Trait2 + Trait3 {} -``` - -Note in particular that the intersection of the last two impls *is* the last -impl. - -This rule guarantees that, given any concrete type that has at least one -applicable impl, we'll be able to find a *single most-specific impl* that -applies. 
In other words if you take the set of all applicable impls -`ALL_APPLICABLE_IMPLS`: +and `J` that overlap, we have either `I < J` or `J < I`. That is, one must be +truly more specific than the other. Specialization chooses the "smallest" impl +in this order -- and the new overlap rule ensures there is a unique smallest +impl among those that apply to a given set of input types. -- There will be some `I` in `ALL_APPLICABLE_IMPLS` such that: - - For all `J` in `ALL_APPLICABLE_IMPLS` such that: - - `I <= J` - -And this most-specific impl is what we'll dispatch to. - -(It's worth pausing to think for a moment how this rule can lead to dispatching -to each of the impls in the example above.) +More broadly, while `<=` is not a total order on *all* impls of a given trait, +it will be a total order on any set of impls that all mutually overlap, which is +all we need to determine which impl to use. One nice thing about this approach is that, if there is an overlap without there being an intersecting impl, the compiler can tell the programmer *precisely @@ -937,32 +752,7 @@ impl Bar for T {} ``` The same reasoning can be applied to all of the examples we saw earlier, and the -reader is encouraged to do so. We'll look at some of the more subtle cases here: - -```rust -trait Foo {} -trait Trait1 {} -trait Trait2 {} -trait Trait3 {} - -// impl I -// apply(I) = { T | T: Trait1 } -impl Foo for T where T: Trait1 {} - -// impl J -// apply(J) = { T | T: Trait2 } -impl Foo for T where T: Trait2 {} - -// Neither I < J or J < I, but: - -// impl K -// apply(K) = { T | T: Trait1, T: Trait2 } -impl Foo for T where T: Trait1 + Trait2 {} - -// apply(I) intersect apply(J) = apply(K) -// apply(I) intersect apply(K) = apply(K) -// apply(J) intersect apply(K) = apply(K) -``` +reader is encouraged to do so. 
We'll look at one of the more subtle cases here: ```rust // impl I @@ -975,26 +765,8 @@ impl<'a, T, U> Bar for &'a U where U: Bar {} ``` The claim is that `apply(I)` and `apply(J)` intersect, but neither contains the -other. Thus, for these impls to coexist, there must be a *third* impl for which -`apply` is precisely the intersection: - -```rust -// impl I -// apply(I) = { (T, T) | T any type } -impl Bar for T {} - -// impl J -// apply(J) = { (T, &'a U) | U: Bar, 'a any lifetime } -impl<'a, T, U> Bar for &'a U where U: Bar {} - -// impl K -// apply(K) = { (&'a T, &'a T) | T: Bar<&'a T>, 'a any lifetime } -impl<'a, T> Bar<&'a T> for &'a T where T: Bar<&'a T> {} - -// apply(I) intersect apply(J) = apply(K) -// apply(I) intersect apply(K) = apply(K) -// apply(J) intersect apply(K) = apply(K) -``` +other. Thus, these two impls are not permitted to coexist according to this +RFC's design. (We'll revisit this limitation toward the end of the RFC.) #### Algorithmic formulation @@ -1161,8 +933,7 @@ impl Foo for T {} // there's no relationship between the traits he In comparing these two impls in either direction, we make it past unification and must try to prove that one where clause implies another. But `T: Trait1` does not imply `T: Trait2`, nor vice versa, so neither impl is more specific -than the other. Since the impls do overlap, there must be a third impl for their -intersection (`T: Trait1 + Trait2`). +than the other. Since the impls do overlap, an ambiguity error is reported. 
On the other hand: @@ -1183,11 +954,12 @@ impl Foo for T < impl Foo for T ##### Key properties -For each pair of impls `I`, `J`, the compiler will check that exactly one of the -following holds: +Remember that for each pair of impls `I`, `J`, the compiler will check that +exactly one of the following holds: - `I` and `J` do not overlap (a unification check), or else -- There is some impl `K` such that `apply(K) = apply(I) intersect apply(J)` +- `I < J`, or else +- `J < I` Recall also that if there is an overlap without there being an intersecting impl, the compiler can tell the programmer *precisely which impl needs to be From ceb8e84b3136574e69d2b8be4ec502612adc96cf Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 26 Jan 2016 11:15:26 -0800 Subject: [PATCH 0712/1195] Resolve question about lifetime interaction --- text/0000-impl-specialization.md | 506 ++++++++++++++++++++++++++++++- 1 file changed, 497 insertions(+), 9 deletions(-) diff --git a/text/0000-impl-specialization.md b/text/0000-impl-specialization.md index db6fc21ad70..ffa912529a0 100644 --- a/text/0000-impl-specialization.md +++ b/text/0000-impl-specialization.md @@ -995,6 +995,312 @@ sibling crates are unaware of each other, there's no way that they could each provide an impl overlapping with the other, yet be sure that one of those impls is more specific than the other in the overlapping region. +### Interaction with lifetimes + +A hard constraint in the design of the trait system is that *dispatch cannot +depend on lifetime information*. In particular, we both cannot, and should not, allow +specialization based on lifetimes: + +- We can't, because when the compiler goes to actually generate code ("trans"), + lifetime information has been erased -- so we'd have no idea what + specializations would soundly apply. + +- We shouldn't, because lifetime inference is subtle and would often lead to + counterintuitive results. 
For example, you could easily fail to get `'static` + even if it applies, because inference is choosing the smallest lifetime that + matches the other constraints. + +To be more concrete, here are some scenarios which should not be allowed: + +```rust +// Not allowed: trans doesn't know if T: 'static: +trait Bad1 {} +impl<T> Bad1 for T {} +impl<T: 'static> Bad1 for T {} + +// Not allowed: trans doesn't know if two refs have equal lifetimes: +trait Bad2<U> {} +impl<T, U> Bad2<U> for T {} +impl<'a, T, U> Bad2<&'a U> for &'a T {} +``` + +But simply *naming* a lifetime that must exist, without *constraining* it, is fine: + +```rust +// Allowed: specializes based on being *any* reference, regardless of lifetime +trait Good {} +impl<T> Good for T {} +impl<'a, T> Good for &'a T {} +``` + +In addition, it's okay for lifetime constraints to show up as long as +they aren't part of specialization: + +```rust +// Allowed: *all* impls impose the 'static requirement; the dispatch is happening +// purely based on `Clone` +trait MustBeStatic {} +impl<T: 'static> MustBeStatic for T {} +impl<T: 'static + Clone> MustBeStatic for T {} +``` + +#### Going down the rabbit hole + +Unfortunately, we cannot easily rule out the undesirable lifetime-dependent +specializations, because they can be "hidden" behind innocent-looking trait +bounds that can even cross crates: + +```rust +//////////////////////////////////////////////////////////////////////////////// +// Crate marker +//////////////////////////////////////////////////////////////////////////////// + +trait Marker {} +impl Marker for u32 {} + +//////////////////////////////////////////////////////////////////////////////// +// Crate foo +//////////////////////////////////////////////////////////////////////////////// + +extern crate marker; + +trait Foo { + fn foo(&self); +} + +impl<T> Foo for T { + default fn foo(&self) { + println!("Default impl"); + } +} + +impl<T: marker::Marker> Foo for T { + fn foo(&self) { + println!("Marker impl"); + } +} + 
+//////////////////////////////////////////////////////////////////////////////// +// Crate bar +//////////////////////////////////////////////////////////////////////////////// + +extern crate marker; + +pub struct Bar<T>(T); +impl<T: 'static> marker::Marker for Bar<T> {} + +//////////////////////////////////////////////////////////////////////////////// +// Crate client +//////////////////////////////////////////////////////////////////////////////// + +extern crate foo; +extern crate bar; + +fn main() { + // prints: Marker impl + 0u32.foo(); + + // prints: ??? + // the relevant specialization depends on the 'static lifetime + bar::Bar("Activate the marker!").foo(); +} +``` + +The problem here is that all of the crates in isolation look perfectly innocent. +The code in `marker`, `bar` and `client` is accepted today. It's only when these +crates are plugged together that a problem arises -- you end up with a +specialization based on a `'static` lifetime. And the `client` crate may not +even be aware of the existence of the `marker` crate. + +If we make this kind of situation a hard error, we could easily end up with a +scenario in which plugging together otherwise-unrelated crates is *impossible*. + +#### Proposal: ask forgiveness, rather than permission + +So what do we do? There seem to be essentially two avenues: + +1. Be maximally permissive in the impls you can write, and then just ignore + lifetime information in dispatch. We can generate a warning when this is + happening, though in cases like the above, it may be talking about traits + that the client is not even aware of. The assumption here is that these + "missed specializations" will be extremely rare, so better not to impose a + burden on everyone to rule them out. + +2. Try, somehow, to prevent you from writing impls that appear to dispatch based + on lifetimes. The most likely way of doing that is to somehow flag a trait as + "lifetime-dependent".
If a trait is lifetime-dependent, it can have + lifetime-sensitive impls (like ones that apply only to `'static` data), but + it cannot be used when writing specialized impls of another trait. + +The downside of (2) is that it's an additional knob that all trait authors have to +think about. That approach is sketched in more detail in the Alternatives section. + +What this RFC proposes is to follow approach (1), at least during the initial +experimentation phase. That's the easiest way to gain experience with +specialization and see to what extent lifetime-dependent specializations +accidentally arise in practice. If they are indeed rare, it seems much better to +catch them via a lint than to force the entire world of traits to be explicitly +split in half. + +To begin with, this lint should be an error by default; we want to get +feedback as to how often this is happening before any +stabilization. + +##### What this means for the programmer + +Ultimately, the goal of the "just ignore lifetimes for specialization" approach +is to reduce the number of knobs in play. The programmer gets to use both +lifetime bounds and specialization freely. + +The problem, of course, is that when using the two together you can get +surprising dispatch results: + +```rust +trait Foo { + fn foo(&self); +} + +impl<T> Foo for T { + default fn foo(&self) { + println!("Default impl"); + } +} + +impl Foo for &'static str { + fn foo(&self) { + println!("Static string slice: {}", self); + } +} + +fn main() { + // prints "Default impl", but generates a lint saying that + // a specialization was missed due to lifetime dependence. + "Hello, world!".foo(); +} +``` + +Specialization is refusing to consider the second impl because it imposes +lifetime constraints not present in the more general impl. We don't know whether +these constraints hold when we need to generate the code, and we don't want to +depend on them because of the subtleties of region inference.
But we alert the +programmer that this is happening via a lint. + +Sidenote: for such simple intracrate cases, we could consider treating the impls +themselves more aggressively, catching that the `&'static str` impl will never +be used and refusing to compile it. + +In the more complicated multi-crate example we saw above, the line + +```rust +bar::Bar("Activate the marker!").foo(); +``` + +would likewise print `Default impl` and generate a warning. In this case, the +warning may be hard for the `client` crate author to understand, since the trait +relevant for specialization -- `marker::Marker` -- belongs to a crate that +hasn't even been imported in `client`. Nevertheless, this approach seems +friendlier than the alternative (discussed in Alternatives). + +#### An algorithm for ignoring lifetimes in dispatch + +Although approach (1) may seem simple, there are some subtleties in handling +cases like the following: + +```rust +trait Foo { ... } +impl<T: 'static> Foo for T { ... } +impl<T: 'static + Clone> Foo for T { ... } +``` + +In this "ignore lifetimes for specialization" approach, we still want the above +specialization to work, because *all* impls in the specialization family impose +the same lifetime constraints. The dispatch here purely comes down to `T: Clone` +or not. That's in contrast to something like this: + +```rust +trait Foo { ... } +impl<T> Foo for T { ... } +impl<T: 'static + Clone> Foo for T { ... } +``` + +where the difference between the impls includes a nontrivial lifetime constraint +(the `'static` bound on `T`). The second impl should effectively be dead code: +we should never dispatch to it in favor of the first impl, because that depends +on lifetime information that we don't have available in trans (and don't want to +rely on in general, due to the way region inference works). We would instead +lint against it (probably error by default). + +So, how do we tell these two scenarios apart? + +- First, we evaluate the impls normally, winnowing to a list of +applicable impls.
+ +- Then, we attempt to determine specialization. For any pair of applicable impls + `Parent` and `Child` (where `Child` specializes `Parent`), we do the + following: + + - Introduce as assumptions all of the where clauses of `Parent` + + - Attempt to prove that `Child` definitely applies, using these assumptions. + **Crucially**, we do this test in a special mode: lifetime bounds are only + considered to hold if they (1) follow from general well-formedness or (2) are + directly assumed from `Parent`. That is, a constraint in `Child` that `T: + 'static` has to follow either from some basic type assumption (like the type + `&'static T`) or from a similar clause in `Parent`. + + - If the `Child` impl cannot be shown to hold under these more stringent + conditions, then we have discovered a lifetime-sensitive specialization, and + can trigger the lint. + + - Otherwise, the specialization is valid. + +Let's do this for the two examples above. + +**Example 1** + +```rust +trait Foo { ... } +impl<T: 'static> Foo for T { ... } +impl<T: 'static + Clone> Foo for T { ... } +``` + +Here, if we think both impls apply, we'll start by assuming that `T: 'static` +holds, and then we'll evaluate whether `T: 'static` and `T: Clone` hold. The +first evaluation succeeds trivially from our assumption. The second depends on +`T`, as you'd expect. + +**Example 2** + +```rust +trait Foo { ... } +impl<T> Foo for T { ... } +impl<T: 'static + Clone> Foo for T { ... } +``` + +Here, if we think both impls apply, we start with no assumption, and then +evaluate `T: 'static` and `T: Clone`. We'll fail to show the former, because +it's a lifetime-dependent predicate, and we don't have any assumption that +immediately yields it. + +This should scale to less obvious cases, e.g. using `T: Any` rather than `T: +'static` -- because when trying to prove `T: Any`, we'll find we need to prove +`T: 'static`, and then we'll end up using the same logic as above.
It also works +for cases like the following: + +```rust +trait SometimesDep {} + +impl SometimesDep for i32 {} +impl<T: 'static> SometimesDep for T {} + +trait Spec {} +impl<T> Spec for T {} +impl<T: SometimesDep> Spec for T {} +``` + +Using `Spec` on `i32` will not trigger the lint, because the specialization is +justified without any lifetime constraints. + ## Partial impls An interesting consequence of specialization is that impls need not (and in fact @@ -1120,6 +1426,158 @@ that item. Such a relaxed approach is much more flexible, probably easier to work with, and can enable more code reuse -- but it's also more complicated, and backwards-compatible to add on top of the proposed conservative approach. +## Limitations + +One frequent motivation for specialization is broader "expressiveness", in +particular providing a larger set of trait implementations than is possible +today. + +For example, the standard library currently includes an `AsRef` trait +for "as-style" conversions: + +```rust +pub trait AsRef<T> where T: ?Sized { + fn as_ref(&self) -> &T; +} +``` + +Currently, there is also a blanket implementation as follows: + +```rust +impl<'a, T: ?Sized, U: ?Sized> AsRef<U> for &'a T where T: AsRef<U> { + fn as_ref(&self) -> &U { + <T as AsRef<U>>::as_ref(*self) + } +} +``` + +which allows these conversions to "lift" over references, which is in turn +important for making a number of standard library APIs ergonomic. + +On the other hand, we'd also like to provide the following very simple +blanket implementation: + +```rust +impl<'a, T: ?Sized> AsRef<T> for T { + fn as_ref(&self) -> &T { + self + } +} +``` + +The current coherence rules prevent having both impls, however, +because they can in principle overlap: + +```rust +AsRef<&'a T> for &'a T where T: AsRef<&'a T> +``` + +Another example comes from the `Option` type, which currently provides two +methods for unwrapping while providing a default value for the `None` case: + +```rust +impl<T> Option<T> { + fn unwrap_or(self, def: T) -> T { ...
} + fn unwrap_or_else<F>(self, f: F) -> T where F: FnOnce() -> T { .. } +} +``` + +The `unwrap_or` method is more ergonomic but `unwrap_or_else` is more efficient +in the case that the default is expensive to compute. The original +[collections reform RFC](https://github.com/rust-lang/rfcs/pull/235) proposed a +`ByNeed` trait that was rendered unworkable after unboxed closures landed: + +```rust +trait ByNeed<T> { + fn compute(self) -> T; +} + +impl<T> ByNeed<T> for T { + fn compute(self) -> T { + self + } +} + +impl<F, T> ByNeed<T> for F where F: FnOnce() -> T { + fn compute(self) -> T { + self() + } +} + +impl<T> Option<T> { + fn unwrap_or<U>(self, def: U) -> T where U: ByNeed<T> { ... } + ... +} +``` + +The trait represents any value that can produce a `T` on demand. But the above +impls fail to compile in today's Rust, because they overlap: consider `ByNeed<F> +for F` where `F: FnOnce() -> F`. + +There are also some trait hierarchies where a subtrait completely subsumes the +functionality of a supertrait. For example, consider `PartialOrd` and `Ord`: + +```rust +trait PartialOrd<Rhs: ?Sized = Self>: PartialEq<Rhs> { + fn partial_cmp(&self, other: &Rhs) -> Option<Ordering>; +} + +trait Ord: Eq + PartialOrd<Self> { + fn cmp(&self, other: &Self) -> Ordering; +} +``` + +In cases like this, it's somewhat annoying to have to provide an impl for *both* +`Ord` and `PartialOrd`, since the latter can be trivially derived from the +former.
So you might want an impl like this: + +```rust +impl<T> PartialOrd<T> for T where T: Ord { + fn partial_cmp(&self, other: &T) -> Option<Ordering> { + Some(self.cmp(other)) + } +} +``` + +But this blanket impl would conflict with a number of others that work to "lift" +`PartialOrd` and `Ord` impls over various type constructors like references and +tuples, e.g.: + +```rust +impl<'a, A: ?Sized> Ord for &'a A where A: Ord { + fn cmp(&self, other: & &'a A) -> Ordering { Ord::cmp(*self, *other) } +} + +impl<'a, 'b, A: ?Sized, B: ?Sized> PartialOrd<&'b B> for &'a A where A: PartialOrd<B> { + fn partial_cmp(&self, other: &&'b B) -> Option<Ordering> { + PartialOrd::partial_cmp(*self, *other) + } +} +``` + +The case where they overlap boils down to: + +```rust +PartialOrd<&'a T> for &'a T where &'a T: Ord +PartialOrd<&'a T> for &'a T where T: PartialOrd +``` + +and neither of the where clauses implies the other. + +There are many other examples along these lines. + +Unfortunately, *none* of these examples are permitted by the revised overlap +rule in this RFC, because in none of these cases is one of the impls fully a +"subset" of the other; the overlap is always partial. + +It's a shame to not be able to address these cases, but the benefit is a +specialization rule that is very intuitive and accepts only very clear-cut +cases. The Alternatives section sketches some different rules that are less +intuitive but do manage to handle cases like those above. + +If we allowed "relaxed" partial impls as described above, one could at least use +that mechanism to avoid having to give a definition directly in most cases. (So +if you had `T: Ord` you could write `impl PartialOrd for T {}`.) + ## Possible extensions It's worth briefly mentioning a couple of mechanisms that one could consider @@ -1302,6 +1760,45 @@ specializing the trait impls.
## Alternative specialization designs +### The "lattice" rule + +The rule proposed in this RFC essentially says that overlapping impls +must form *chains*, in which each one is strictly more specific than +the last. + +This approach can be generalized to *lattices*, in which partial +overlap between impls is allowed, so long as there is an additional +impl that covers precisely the area of overlap (the intersection). +Such a generalization can support all of the examples mentioned in the +Limitations section. Moving to the lattice rule is backwards compatible. + +Unfortunately, the lattice rule (or really, any generalization beyond +the proposed chain rule) runs into a nasty problem with our lifetime +strategy. Consider the following: + +```rust +trait Foo {} +impl<T, U> Foo for (T, U) where T: 'static {} +impl<T, U> Foo for (T, U) where U: 'static {} +impl<T, U> Foo for (T, U) where T: 'static, U: 'static {} +``` + +The problem is, if we allow this situation to go through typeck, by +the time we actually generate code in trans, *there is no possible +impl to choose*. That is, we do not have enough information to +specialize, but we also don't know which of the (overlapping) +unspecialized impls actually applies. We can address this problem by +making the "lifetime dependent specialization" lint issue a hard error +for such intersection impls, but that means that certain compositions +will simply not be allowed (and, as mentioned before, these +compositions might involve traits, types, and impls that the +programmer is not even aware of). + +The limitations that the lattice rule addresses are fairly secondary +to the main goals of specialization (as laid out in the Motivation), +and so, since the lattice rule can be added later, the RFC sticks with +the simple chain rule for now.
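The difference between the chain rule and the lattice rule can be illustrated with a toy model. This is only a sketch under a strong simplifying assumption: each impl is represented by the finite set of types it applies to (standing in for `apply(I)`), whereas the real compiler reasons by unification and where clauses. The names `Apply`, `set`, `chain_ok`, and `lattice_ok` are hypothetical and do not correspond to anything in the compiler.

```rust
use std::collections::BTreeSet;

/// Toy stand-in for `apply(I)`: the finite set of types an impl covers.
type Apply = BTreeSet<&'static str>;

/// Build an `Apply` set from a list of type names.
fn set(items: &[&'static str]) -> Apply {
    items.iter().copied().collect()
}

/// True when one set is a strict subset of the other.
fn strictly_nested(a: &Apply, b: &Apply) -> bool {
    a != b && (a.is_subset(b) || b.is_subset(a))
}

/// Chain rule: every pair of impls is either disjoint or strictly nested.
fn chain_ok(impls: &[Apply]) -> bool {
    for (i, a) in impls.iter().enumerate() {
        for b in &impls[i + 1..] {
            if !a.is_disjoint(b) && !strictly_nested(a, b) {
                return false;
            }
        }
    }
    true
}

/// Lattice rule: partial overlap is also accepted, provided some impl
/// covers exactly the intersection of the two overlapping impls.
fn lattice_ok(impls: &[Apply]) -> bool {
    for (i, a) in impls.iter().enumerate() {
        for b in &impls[i + 1..] {
            if a.is_disjoint(b) || strictly_nested(a, b) {
                continue;
            }
            if a == b {
                // Two impls covering exactly the same types always conflict.
                return false;
            }
            let meet: Apply = a.intersection(b).copied().collect();
            if !impls.iter().any(|k| *k == meet) {
                return false;
            }
        }
    }
    true
}
```

Modeling the three `Foo` impls above as `{"(S,S)", "(S,N)"}`, `{"(S,S)", "(N,S)"}`, and `{"(S,S)"}` (with `S` for a `'static` component and `N` for a non-`'static` one), `chain_ok` rejects the family because the first two impls overlap only partially, while `lattice_ok` accepts it once the third, intersecting impl is present. The sketch says nothing about the trans-time problem described above; it only shows which overlap shapes each rule admits.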
+ ### Explicit ordering Another, perhaps more palatable alternative would be to take the specialization @@ -1351,15 +1848,6 @@ The downsides are: Finally, there are a few important questions not yet addressed by this RFC: -- Presumably, we do not want to permit specialization based on *lifetime* - parameters, but the algorithm as written does not give them any special - treatment. That needs to be dealt with in the implementation, at least. - -- We've said nothing about the interaction with dropck, which relies on a - parametricity property for generics that don't have bounds. Specialization - could potentially be used to subvert that property. That needs to be addressed - before the RFC can be accepted. - - The design with `default` makes specialization of associated types an all-or-nothing affair, but it would occasionally be useful to say that all further specializations will at least guarantee some additional trait bound on From a1ed017694f4ac64d66ca154d67529449f9fb055 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 26 Jan 2016 12:18:09 -0800 Subject: [PATCH 0713/1195] Move from `partial impl` to `default impl`, clarify semantics --- text/0000-impl-specialization.md | 55 +++++++++++++++----------------- 1 file changed, 25 insertions(+), 30 deletions(-) diff --git a/text/0000-impl-specialization.md b/text/0000-impl-specialization.md index ffa912529a0..379a20152ae 100644 --- a/text/0000-impl-specialization.md +++ b/text/0000-impl-specialization.md @@ -145,21 +145,22 @@ trait Add { In this case, there's no natural way to provide a default implementation of `add_assign`, since we do not want to restrict the `Add` trait to `Clone` data. 
-The specialization design in this RFC also allows for *partial* implementations, -which can provide specialized defaults without actually providing a full trait -implementation: +The specialization design in this RFC also allows for *default impls*, +which can provide specialized defaults without actually providing a +full trait implementation: ```rust -partial impl Add for T { - // the `default` qualifier allows further specialization - default fn add_assign(&mut self, rhs: R) { +// the `default` qualifier here means (1) not all items are impled +// and (2) those that are can be further specialized +default impl Add for T { + fn add_assign(&mut self, rhs: R) { let tmp = self.clone() + rhs; *self = tmp; } } ``` -This partial impl does *not* mean that `Add` is implemented for all `Clone` +This default impl does *not* mean that `Add` is implemented for all `Clone` data, but jut that when you do impl `Add` and `Self: Clone`, you can leave off `add_assign`: @@ -184,20 +185,13 @@ methods. For example, consider the relationship between `size_hint` and `ExactSizeIterator`: ```rust -partial impl Iterator for T where T: ExactSizeIterator { +default impl Iterator for T where T: ExactSizeIterator { fn size_hint(&self) -> (usize, Option) { (self.len(), Some(self.len())) } } ``` -As we'll see later, the design of this RFC makes it possible to "lock down" such -method impls (by not using the `default` qualifier), preventing any further -refinement (akin to Java's `final` keyword); that in turn makes it possible to -statically enforce the contract that is supposed to connect the `len` and -`size_hint` methods. (Of course, we can't make *that* particular change, since -the relevant APIs are already stable.) - ## Supporting efficient inheritance Finally, specialization can be seen as a form of inheritance, since methods @@ -1301,7 +1295,7 @@ impl Spec for T {} Using `Spec` on `i32` will not trigger the lint, because the specialization is justified without any lifetime constraints. 
-## Partial impls +## Default impls An interesting consequence of specialization is that impls need not (and in fact sometimes *cannot*) provide all of the items that a trait specifies. Of course, @@ -1350,7 +1344,7 @@ in traits -- one in which the defaults can become ever more refined as more is known about the input types to the traits (as described in the Motivation section). But to fully realize this goal, we need one other ingredient: the ability for the *blanket* impl itself to leave off some items. We do this by -using the `partial` keyword: +using the `default` keyword at the `impl` level: ```rust trait Add { @@ -1359,7 +1353,7 @@ trait Add { fn add_assign(&mut self, Rhs); } -partial impl Add for T { +default impl Add for T { fn add_assign(&mut self, rhs: R) { let tmp = self.clone() + rhs; *self = tmp; @@ -1374,8 +1368,7 @@ A key point here is that, as the keyword suggests, a `partial` impl may be incomplete: from the above code, you *cannot* assume that `T: Add` for any `T: Clone`, because no such complete impl has been provided. -With partial impls, defaulted items in traits are just sugar for a partial -blanket impl: +Defaulted items in traits are just sugar for a default blanket impl: ```rust trait Iterator { @@ -1397,30 +1390,32 @@ trait Iterator { // ... } -partial impl Iterator for T { - default fn size_hint(&self) -> (usize, Option) { +default impl Iterator for T { + fn size_hint(&self) -> (usize, Option) { (0, None) } // ... } ``` -Partial impls are somewhat akin to abstract base classes in object-oriented +Default impls are somewhat akin to abstract base classes in object-oriented languages; they provide some, but not all, of the materials needed for a fully concrete implementation, and thus enable code reuse but cannot be used concretely. 
-Note that partial impls still need to use `default` to allow for overriding -- -leaving off the qualifier will lock down the implementation in any -more-specialized complete impls, which is actually a useful pattern (as -explained in the Motivation.) +Note that the semantics of `default impl`s and defaulted items in +traits is that both are implicitly marked `default` -- that is, both +are considered specializable. This choice gives a coherent mental +model: when you choose *not* to employ a default, and instead provide +your own definition, you are in effect overriding/specializing that +code. (Put differently, you can think of default impls as abstract base classes). There are a few important details to nail down with the design. This RFC proposes starting with the conservative approach of applying the general overlap -rule to partial impls, same as with complete ones. That ensures that there is +rule to default impls, same as with complete ones. That ensures that there is always a clear definition to use when providing subsequent complete impls. It would be possible, though, to relax this constraint and allow *arbitrary* -overlap between partial impls, requiring then whenever a complete impl overlaps -with them, *for each item*, there is either a unique "most specific" partial +overlap between default impls, requiring that whenever a complete impl overlaps +with them, *for each item*, there is either a unique "most specific" default impl that applies, or else the complete impl provides its own definition for that item.
Such a relaxed approach is much more flexible, probably easier to work with, and can enable more code reuse -- but it's also more complicated, and From be31335ad544de7e60247fff53b0fcc71cb49c30 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 26 Jan 2016 12:22:00 -0800 Subject: [PATCH 0714/1195] Finalize unresolved questions --- text/0000-impl-specialization.md | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/text/0000-impl-specialization.md b/text/0000-impl-specialization.md index 379a20152ae..6ee5ecb76f8 100644 --- a/text/0000-impl-specialization.md +++ b/text/0000-impl-specialization.md @@ -1707,6 +1707,15 @@ fn needs_extend_all(t: T) where for> T: Extend Date: Tue, 26 Jan 2016 13:03:29 -0800 Subject: [PATCH 0715/1195] Add lifetime_dependent alternative --- text/0000-impl-specialization.md | 85 ++++++++++++++++++++++++++++++++ 1 file changed, 85 insertions(+) diff --git a/text/0000-impl-specialization.md b/text/0000-impl-specialization.md index 6ee5ecb76f8..8dab544ff0e 100644 --- a/text/0000-impl-specialization.md +++ b/text/0000-impl-specialization.md @@ -1848,6 +1848,91 @@ The downsides are: "specialization hierarchy" to be flat, in particular ruling out multiple levels of increasingly-specialized blanket impls. +## Alternative handling of lifetimes + +This RFC proposes a *laissez faire* approach to lifetimes: we let you +write whatever impls you like, then warn you if some of them are being +ignored because the specialization is based purely on lifetimes. + +The main alternative approach is to make a more "principled" +distinction between two kinds of traits: those that can be used as +constraints in specialization, and those whose impls can be lifetime +dependent. 
Concretely: + +```rust +#[lifetime_dependent] +trait Foo {} + +// Only allowed to use 'static here because of the lifetime_dependent attribute +impl Foo for &'static str {} + +trait Bar { fn bar(&self); } +impl<T> Bar for T { + // Have to use `default` here to allow specialization + default fn bar(&self) {} +} + +// CANNOT write the following impl, because `Foo` is lifetime_dependent +// and Bar is not. +// +// NOTE: this is what I mean by *using* a trait in specialization; +// we are trying to say a specialization applies when T: Foo holds +impl<T: Foo> Bar for T { + fn bar(&self) { ... } +} + +// CANNOT write the following impl, because `Bar` is not lifetime_dependent +impl Bar for &'static str { + fn bar(&self) { ... } +} +``` + +There are several downsides to this approach: + +* It forces trait authors to consider a rather subtle knob for every + trait they write, choosing between two forms of expressiveness and + dividing the world accordingly. The last thing the trait system + needs is another knob. + +* Worse still, changing the knob in either direction is a breaking change: + + * If a trait gains a `lifetime_dependent` attribute, any impl of a + different trait that used it to specialize would become illegal. + + * If a trait loses its `lifetime_dependent` attribute, any impl of + that trait that was lifetime dependent would become illegal. + +* It hobbles specialization for some existing traits in `std`. + +For the last point, consider `From` (which is tied to `Into`). In +`std`, we have the following important "boxing" impl: + +```rust +impl<'a, E: Error + 'a> From<E> for Box<Error + 'a> +``` + +This impl would necessitate `From` (and therefore, `Into`) being +marked `lifetime_dependent`. But these traits are very likely to be +used to describe specializations (e.g., an impl that applies when `T: +Into`).
+ +There does not seem to be any way to consider such impls as +lifetime-independent, either, because of examples like the following: + +```rust +// If we consider this innocent... +trait Tie {} +impl<'a, T: 'a> Tie for (T, &'a u8) + +// ... we get into trouble here +trait Foo {} +impl<'a, T> Foo for (T, &'a u8) +impl<'a, T> Foo for (T, &'a u8) where (T, &'a u8): Tie +``` + +All told, the proposed *laissez faire* seems a much better bet in +practice, but only experience with the feature can tell us for sure. + # Unresolved questions All questions from the RFC discussion and prototype have been resolved. From da75c9b284504f4b404bb3ac714021a75258cc94 Mon Sep 17 00:00:00 2001 From: Nicholas Mazzuca Date: Tue, 26 Jan 2016 17:51:19 -0800 Subject: [PATCH 0716/1195] Take out `[T]::fill` --- text/0000-slice-copy-fill.md | 83 ------------------------------------ text/0000-slice-copy.md | 58 +++++++++++++++++++++++++ 2 files changed, 58 insertions(+), 83 deletions(-) delete mode 100644 text/0000-slice-copy-fill.md create mode 100644 text/0000-slice-copy.md diff --git a/text/0000-slice-copy-fill.md b/text/0000-slice-copy-fill.md deleted file mode 100644 index 25f554bdd88..00000000000 --- a/text/0000-slice-copy-fill.md +++ /dev/null @@ -1,83 +0,0 @@ -- Feature Name: slice\_copy\_fill -- Start Date: (fill me in with today's date, YYYY-MM-DD) -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) - -# Summary -[summary]: #summary - -Safe `memcpy` from one slice to another of the same type and length, and a safe -`memset` of a slice of type `T: Copy`. - -# Motivation -[motivation]: #motivation - -Currently, the only way to quickly copy from one non-`u8` slice to another is to -use a loop, or unsafe methods like `std::ptr::copy_nonoverlapping`. This allows -us to guarantee a `memcpy` for `Copy` types, and is safe. The only way to -`memset` a slice, currently, is a loop, and we should expose a method to allow -people to do this. 
This also completely gets rid of the point of -`std::slice::bytes`, which means we can remove this deprecated and useless -module. - -# Detailed design -[design]: #detailed-design - -Add two methods to Primitive Type `slice`. - -```rust -impl [T] where T: Copy { - pub fn fill(&mut self, value: T); - pub fn copy_from(&mut self, src: &[T]); -} -``` - -`fill` loops through slice, setting each member to value. This will usually -lower to a memset in optimized builds. It is likely that this is only the -initial implementation, and will be optimized later to be almost as fast as, or -as fast as, memset. It is defined behavior to call `fill` on a slice which has -uninitialized members, and `self` is guaranteed to be fully filled afterwards. - -`copy_from` panics if `src.len() != self.len()`, then `memcpy`s the members into -`self` from `src`. Calling `copy_from` is semantically equivalent to a `memcpy`; -`self` can have uninitialized members, and `self` is guaranteed to be fully filled -afterwards. This means, for example, that the following is fully defined: - -```rust -let s1: [u8; 16] = unsafe { std::mem::uninitialized() }; -let s2: [u8; 16] = unsafe { std::mem::uninitialized() }; -s1.fill(42); -s2.copy_from(&s1); -println!("{}", s2); -``` - -And the program will print 16 '8's. - -# Drawbacks -[drawbacks]: #drawbacks - -Two new methods on `slice`. `[T]::fill` *will not* be lowered to a `memset` in -all cases. - -# Alternatives -[alternatives]: #alternatives - -We could name these functions something else. `fill`, for example, could be -called `set`, `fill_from`, or `fill_with`. - -`copy_from` could be called `copy_to`, and have the order of the arguments -switched around. This would follow `ptr::copy_nonoverlapping` ordering, and not -`dst = src` or `.clone_from()` ordering. - -`copy_from` could panic only if `dst.len() < src.len()`. 
This would be the same -as what came before, but we would also lose the guarantee that an uninitialized -slice would be fully initialized. - -`fill` and `copy_from` could both be free functions, and were in the -original draft of this document. However, overwhelming support for these as -methods has meant that these have become methods. - -# Unresolved questions -[unresolved]: #unresolved-questions - -None, as far as I can tell. diff --git a/text/0000-slice-copy.md b/text/0000-slice-copy.md new file mode 100644 index 00000000000..f49da7f0b95 --- /dev/null +++ b/text/0000-slice-copy.md @@ -0,0 +1,58 @@ +- Feature Name: slice\_copy\_from +- Start Date: (fill me in with today's date, YYYY-MM-DD) +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Safe `memcpy` from one slice to another of the same type and length. + +# Motivation +[motivation]: #motivation + +Currently, the only way to quickly copy from one non-`u8` slice to another is to +use a loop, or unsafe methods like `std::ptr::copy_nonoverlapping`. A `copy_from` method allows +us to guarantee a `memcpy` for `Copy` types, and is safe. + +# Detailed design +[design]: #detailed-design + +Add one method to Primitive Type `slice`. + +```rust +impl<T> [T] where T: Copy { + pub fn copy_from(&mut self, src: &[T]); +} +``` + +`copy_from` asserts that `src.len() == self.len()`, then `memcpy`s the members into +`self` from `src`. Calling `copy_from` is semantically equivalent to a `memcpy`. +`self` shall have exactly the same members as `src` after a call to `copy_from`. + +# Drawbacks +[drawbacks]: #drawbacks + +One new method on `slice`. + +# Alternatives +[alternatives]: #alternatives + +`copy_from` could be known as `copy_from_slice`, which would follow +`clone_from_slice`. + +`copy_from` could be called `copy_to`, and have the order of the arguments +switched around. This would follow `ptr::copy_nonoverlapping` ordering, and not +`dst = src` or `.clone_from()` ordering.
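To make the two argument orderings concrete, here is a small sketch. It assumes the method is spelled `copy_from_slice`, the alternative name mentioned above and the one available on stable Rust today, rather than the `copy_from` name this RFC proposes. The wrapper names `copy_safe` and `copy_raw` are hypothetical and exist only for illustration.

```rust
use std::ptr;

/// Safe copy in `dst = src` order: the destination is the receiver.
/// Panics if the two slices differ in length.
fn copy_safe(dst: &mut [u32], src: &[u32]) {
    dst.copy_from_slice(src);
}

/// The unsafe equivalent. Note that `ptr::copy_nonoverlapping` takes the
/// source pointer first and the destination pointer second.
fn copy_raw(dst: &mut [u32], src: &[u32]) {
    assert_eq!(dst.len(), src.len());
    // SAFETY: both slices are valid for `src.len()` elements, and a `&mut`
    // slice cannot alias a shared slice, so the two regions do not overlap.
    unsafe { ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), src.len()) }
}
```

Both helpers leave `dst` with the same contents as `src`; the only difference a caller sees is whether the destination comes first (method form) or second (raw pointer form).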
+ +`copy_from` could panic only if `dst.len() < src.len()`. This would be the same +as what came before, but we would also lose the guarantee that an uninitialized +slice would be fully initialized. + +`copy_from` could be a free function, as it was in the original draft of this +document. However, there was overwhelming support for it as a method. + +# Unresolved questions +[unresolved]: #unresolved-questions + +None, as far as I can tell. From 3f0c85b140540817db7bbadffce5b3f22c612bea Mon Sep 17 00:00:00 2001 From: Steven Fackler Date: Mon, 25 Jan 2016 21:59:08 -0800 Subject: [PATCH 0717/1195] Unix socket support in the standard library --- text/0000-unix-socket.md | 501 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 501 insertions(+) create mode 100644 text/0000-unix-socket.md diff --git a/text/0000-unix-socket.md b/text/0000-unix-socket.md new file mode 100644 index 00000000000..5e509c3f0ae --- /dev/null +++ b/text/0000-unix-socket.md @@ -0,0 +1,501 @@ +- Feature Name: unix_socket +- Start Date: 2016-01-25 +- RFC PR: +- Rust Issue: + +# Summary +[summary]: #summary + +[Unix domain sockets](https://en.wikipedia.org/wiki/Unix_domain_socket) provide +a commonly used form of IPC on Unix-derived systems. This RFC proposes moving the +[unix_socket](https://crates.io/crates/unix_socket/) nursery crate into the +`std::os::unix` module. + +# Motivation +[motivation]: #motivation + +Unix sockets are a common form of IPC on Unix-like systems. Databases like +PostgreSQL and Redis allow connections via Unix sockets, and Servo uses them to +communicate with subprocesses. Even though Unix sockets are not present on +Windows, their use is sufficiently widespread to warrant inclusion in the +platform-specific sections of the standard library. + +# Detailed design +[design]: #detailed-design + +Unix sockets can be configured with the `SOCK_STREAM`, `SOCK_DGRAM`, and +`SOCK_SEQPACKET` types.
`SOCK_STREAM` creates a connection-oriented socket that +behaves like a TCP socket, `SOCK_DGRAM` creates a packet-oriented socket that +behaves like a UDP socket, and `SOCK_SEQPACKET` provides something of a hybrid +between the other two - a connection-oriented, reliable, ordered stream of +delimited packets. `SOCK_SEQPACKET` support has not yet been implemented in the +unix_socket crate, so only the first two socket types will initially be +supported in the standard library. + +While a TCP or UDP socket would be identified by an IP address and port number, +Unix sockets are typically identified by a filesystem path. For example, a +Postgres server will listen on a Unix socket located at +`/run/postgresql/.s.PGSQL.5432` in some configurations. However, the +`socketpair` function can make a pair of *unnamed* connected Unix sockets not +associated with a filesystem path. In addition, Linux provides a separate +*abstract* namespace not associated with the filesystem. + +A `std::os::unix::net` module will be created with the following contents: + +The `UnixStream` type mirrors `TcpStream`: +```rust +pub struct UnixStream { + ... +} + +impl UnixStream { + /// Connects to the socket named by `path`. + /// + /// Linux provides, as a nonportable extension, a separate "abstract" + /// address namespace as opposed to filesystem-based addressing. If `path` + /// begins with a null byte, it will be interpreted as an "abstract" + /// address. Otherwise, it will be interpreted as a "pathname" address, + /// corresponding to a path on the filesystem. + pub fn connect<P: AsRef<Path>>(path: P) -> io::Result<UnixStream> { + ... + } + + /// Creates an unnamed pair of connected sockets. + /// + /// Returns two `UnixStream`s which are connected to each other. + pub fn pair() -> io::Result<(UnixStream, UnixStream)> { + ... + } + + /// Creates a new independently owned handle to the underlying socket. + /// + /// The returned `UnixStream` is a reference to the same stream that this + /// object references.
Both handles will read and write the same stream of + /// data, and options set on one stream will be propagated to the other + /// stream. + pub fn try_clone(&self) -> io::Result<UnixStream> { + ... + } + + /// Returns the socket address of the local half of this connection. + pub fn local_addr(&self) -> io::Result<SocketAddr> { + ... + } + + /// Returns the socket address of the remote half of this connection. + pub fn peer_addr(&self) -> io::Result<SocketAddr> { + ... + } + + /// Sets the read timeout for the socket. + /// + /// If the provided value is `None`, then `read` calls will block + /// indefinitely. It is an error to pass the zero `Duration` to this + /// method. + pub fn set_read_timeout(&self, timeout: Option<Duration>) -> io::Result<()> { + ... + } + + /// Sets the write timeout for the socket. + /// + /// If the provided value is `None`, then `write` calls will block + /// indefinitely. It is an error to pass the zero `Duration` to this + /// method. + pub fn set_write_timeout(&self, timeout: Option<Duration>) -> io::Result<()> { + ... + } + + /// Returns the read timeout of this socket. + pub fn read_timeout(&self) -> io::Result<Option<Duration>> { + ... + } + + /// Returns the write timeout of this socket. + pub fn write_timeout(&self) -> io::Result<Option<Duration>> { + ... + } + + /// Moves the socket into or out of nonblocking mode. + pub fn set_nonblocking(&self, nonblocking: bool) -> io::Result<()> { + ... + } + + /// Returns the value of the `SO_ERROR` option. + pub fn take_error(&self) -> io::Result<Option<io::Error>> { + ... + } + + /// Shuts down the read, write, or both halves of this connection. + /// + /// This function will cause all pending and future I/O calls on the + /// specified portions to immediately return with an appropriate value + /// (see the documentation of `Shutdown`). + pub fn shutdown(&self, how: Shutdown) -> io::Result<()> { + ... + } +} + +impl Read for UnixStream { + ... +} + +impl<'a> Read for &'a UnixStream { + ... +} + +impl Write for UnixStream { + ... +} + +impl<'a> Write for &'a UnixStream { + ...
+} + +impl FromRawFd for UnixStream { + ... +} + +impl AsRawFd for UnixStream { + ... +} + +impl IntoRawFd for UnixStream { + ... +} +``` + +Differences from `TcpStream`: +* `connect` takes an `AsRef<Path>` rather than a `ToSocketAddrs`. +* The `pair` method creates a pair of connected, unnamed sockets, as this is + commonly used for IPC. +* The `SocketAddr` returned by the `local_addr` and `peer_addr` methods is + different. +* The `set_nonblocking` and `take_error` methods are not currently present on + `TcpStream` but are provided in the `net2` crate and are being proposed for + addition to the standard library in a separate RFC. + +As noted above, a Unix socket can either be unnamed, be associated with a path +on the filesystem, or (on Linux) be associated with an ID in the abstract +namespace. The `SocketAddr` struct is fairly simple: + +```rust +pub struct SocketAddr { + ... +} + +impl SocketAddr { + /// Returns true if the address is unnamed. + pub fn is_unnamed(&self) -> bool { + ... + } + + /// Returns the contents of this address if it corresponds to a filesystem path. + pub fn as_pathname(&self) -> Option<&Path> { + ... + } +} +``` + +A Linux-specific extension trait is provided for the abstract namespace: +```rust +pub trait SocketAddrExt { + /// Returns the contents of this address (without the leading null byte) if + /// it is an abstract address. + fn as_abstract(&self) -> Option<&[u8]>; +} +``` + +The `UnixListener` type mirrors the `TcpListener` type: +```rust +pub struct UnixListener { + ... +} + +impl UnixListener { + /// Creates a new `UnixListener` bound to the specified socket. + /// + /// Linux provides, as a nonportable extension, a separate "abstract" + /// address namespace as opposed to filesystem-based addressing. If `path` + /// begins with a null byte, it will be interpreted as an "abstract" + /// address. Otherwise, it will be interpreted as a "pathname" address, + /// corresponding to a path on the filesystem.
+ pub fn bind<P: AsRef<Path>>(path: P) -> io::Result<UnixListener> { + ... + } + + /// Accepts a new incoming connection to this listener. + /// + /// This function will block the calling thread until a new Unix connection + /// is established. When established, the corresponding `UnixStream` and + /// the remote peer's address will be returned. + pub fn accept(&self) -> io::Result<(UnixStream, SocketAddr)> { + ... + } + + /// Creates a new independently owned handle to the underlying socket. + /// + /// The returned `UnixListener` is a reference to the same socket that this + /// object references. Both handles can be used to accept incoming + /// connections and options set on one listener will affect the other. + pub fn try_clone(&self) -> io::Result<UnixListener> { + ... + } + + /// Returns the local socket address of this listener. + pub fn local_addr(&self) -> io::Result<SocketAddr> { + ... + } + + /// Moves the socket into or out of nonblocking mode. + pub fn set_nonblocking(&self, nonblocking: bool) -> io::Result<()> { + ... + } + + /// Returns the value of the `SO_ERROR` option. + pub fn take_error(&self) -> io::Result<Option<io::Error>> { + ... + } + + /// Returns an iterator over incoming connections. + /// + /// The iterator will never return `None` and will also not yield the + /// peer's `SocketAddr` structure. + pub fn incoming<'a>(&'a self) -> Incoming<'a> { + ... + } +} + +impl FromRawFd for UnixListener { + ... +} + +impl AsRawFd for UnixListener { + ... +} + +impl IntoRawFd for UnixListener { + ... +} +``` + +Differences from `TcpListener`: +* `bind` takes an `AsRef<Path>` rather than a `ToSocketAddrs`. +* The `SocketAddr` type is different. +* The `set_nonblocking` and `take_error` methods are not currently present on + `TcpListener` but are provided in the `net2` crate and are being proposed for + addition to the standard library in a separate RFC. + +Finally, the `UnixDatagram` type mirrors the `UdpSocket` type: +```rust +pub struct UnixDatagram { + ...
+} + +impl UnixDatagram { + /// Creates a Unix datagram socket bound to the given path. + /// + /// Linux provides, as a nonportable extension, a separate "abstract" + /// address namespace as opposed to filesystem-based addressing. If `path` + /// begins with a null byte, it will be interpreted as an "abstract" + /// address. Otherwise, it will be interpreted as a "pathname" address, + /// corresponding to a path on the filesystem. + pub fn bind<P: AsRef<Path>>(path: P) -> io::Result<UnixDatagram> { + ... + } + + /// Creates a Unix Datagram socket which is not bound to any address. + pub fn unbound() -> io::Result<UnixDatagram> { + ... + } + + /// Create an unnamed pair of connected sockets. + /// + /// Returns two `UnixDatagram`s which are connected to each other. + pub fn pair() -> io::Result<(UnixDatagram, UnixDatagram)> { + ... + } + + /// Creates a new independently owned handle to the underlying socket. + /// + /// The returned `UnixDatagram` is a reference to the same stream that this + /// object references. Both handles will read and write the same stream of + /// data, and options set on one stream will be propagated to the other + /// stream. + pub fn try_clone(&self) -> io::Result<UnixDatagram> { + ... + } + + /// Connects the socket to the specified address. + /// + /// The `send` method may be used to send data to the specified address. + /// `recv` and `recv_from` will only receive data from that address. + pub fn connect<P: AsRef<Path>>(&self, path: P) -> io::Result<()> { + ... + } + + /// Returns the address of this socket. + pub fn local_addr(&self) -> io::Result<SocketAddr> { + ... + } + + /// Returns the address of this socket's peer. + /// + /// The `connect` method will connect the socket to a peer. + pub fn peer_addr(&self) -> io::Result<SocketAddr> { + ... + } + + /// Receives data from the socket. + /// + /// On success, returns the number of bytes read and the address from + /// whence the data came. + pub fn recv_from(&self, buf: &mut [u8]) -> io::Result<(usize, SocketAddr)> { + ... + } + + /// Receives data from the socket.
+ /// + /// On success, returns the number of bytes read. + pub fn recv(&self, buf: &mut [u8]) -> io::Result<usize> { + ... + } + + /// Sends data on the socket to the specified address. + /// + /// On success, returns the number of bytes written. + pub fn send_to<P: AsRef<Path>>(&self, buf: &[u8], path: P) -> io::Result<usize> { + ... + } + + /// Sends data on the socket to the socket's peer. + /// + /// The peer address may be set by the `connect` method, and this method + /// will return an error if the socket has not already been connected. + /// + /// On success, returns the number of bytes written. + pub fn send(&self, buf: &[u8]) -> io::Result<usize> { + ... + } + + /// Sets the read timeout for the socket. + /// + /// If the provided value is `None`, then `recv` and `recv_from` calls will + /// block indefinitely. It is an error to pass the zero `Duration` to this + /// method. + pub fn set_read_timeout(&self, timeout: Option<Duration>) -> io::Result<()> { + ... + } + + /// Sets the write timeout for the socket. + /// + /// If the provided value is `None`, then `send` and `send_to` calls will + /// block indefinitely. It is an error to pass the zero `Duration` to this + /// method. + pub fn set_write_timeout(&self, timeout: Option<Duration>) -> io::Result<()> { + ... + } + + /// Returns the read timeout of this socket. + pub fn read_timeout(&self) -> io::Result<Option<Duration>> { + ... + } + + /// Returns the write timeout of this socket. + pub fn write_timeout(&self) -> io::Result<Option<Duration>> { + ... + } + + /// Moves the socket into or out of nonblocking mode. + pub fn set_nonblocking(&self, nonblocking: bool) -> io::Result<()> { + ... + } + + /// Returns the value of the `SO_ERROR` option. + pub fn take_error(&self) -> io::Result<Option<io::Error>> { + ... + } + + /// Shut down the read, write, or both halves of this connection. + /// + /// This function will cause all pending and future I/O calls on the + /// specified portions to immediately return with an appropriate value + /// (see the documentation of `Shutdown`).
+ pub fn shutdown(&self, how: Shutdown) -> io::Result<()> { + ... + } +} + +impl FromRawFd for UnixDatagram { + ... +} + +impl AsRawFd for UnixDatagram { + ... +} + +impl IntoRawFd for UnixDatagram { + ... +} +``` + +Differences from `UdpSocket`: +* `bind` takes an `AsRef<Path>` rather than a `ToSocketAddrs`. +* The `unbound` method creates an unbound socket, as a Unix socket does not need + to be bound to send messages. +* The `pair` method creates a pair of connected, unnamed sockets, as this is + commonly used for IPC. +* The `SocketAddr` returned by the `local_addr` and `peer_addr` methods is + different. +* The `connect`, `send`, `recv`, `set_nonblocking`, and `take_error` methods are + not currently present on `UdpSocket` but are provided in the `net2` crate and + are being proposed for addition to the standard library in a separate RFC. + +## Functionality not present + +Some functionality is notably absent from this proposal: + +* No support for `SOCK_SEQPACKET` sockets is proposed, as it has not yet been + implemented. Since it is connection oriented, there will be a socket type + `UnixSeqPacket` and a listener type `UnixSeqListener`. The naming of the + listener is a bit unfortunate, but use of `SOCK_SEQPACKET` is rare compared to + `SOCK_STREAM` so naming priority can go to that version. +* Unix sockets support file descriptor and credential transfer, but these will + not initially be supported as the `sendmsg`/`recvmsg` interface is complex + and bindings will need some time to prototype. + +These features can bake in the `rust-lang-nursery/unix-socket` as they're +developed. + +# Drawbacks +[drawbacks]: #drawbacks + +While there is precedent for platform specific components in the standard +library, this will be by far the largest platform specific addition. + +# Alternatives +[alternatives]: #alternatives + +Unix socket support could be left out of tree.
+ +The naming convention of `UnixStream` and `UnixDatagram` doesn't perfectly +mirror `TcpStream` and `UdpSocket`, but `UnixStream` and `UnixSocket` seems way +too confusing. + +Constructors for the various socket types take an `AsRef<Path>`, which makes +construction of sockets associated with Linux abstract namespaces somewhat +nonobvious, as the leading null byte has to be explicitly added. However, it is +still possible, either via `&str` for UTF8 names or via `&OsStr` and +`std::os::unix::ffi::OsStrExt` for arbitrary names. Use of the abstract +namespace appears to be very obscure, so it seems best to optimize for +ergonomics of normal pathname addresses. We can add extension traits providing +methods taking `&[u8]` in the future if deemed necessary. + +# Unresolved questions +[unresolved]: #unresolved-questions + +Is `std::os::unix::net` the right name for this module? It's not strictly +"networking" as all communication is local to one machine. `std::os::unix::unix` +is more accurate but weirdly repetitive and the extension trait module +`std::os::linux::unix` is even weirder. `std::os::unix::socket` is an option, +but seems like too general of a name for specifically `AF_UNIX` sockets as +opposed to *all* sockets.
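
To make the proposed surface a little more concrete, here is a minimal sketch of how the unnamed `pair` constructors might be used for in-process IPC. It assumes the module ends up at `std::os::unix::net` with the `UnixStream::pair`, `UnixDatagram::pair`, `send`, and `recv` signatures listed in the detailed design; the helper names `stream_round_trip` and `datagram_round_trip` are hypothetical and exist only for this example. It compiles only on Unix targets.

```rust
use std::io::{self, Read, Write};
use std::os::unix::net::{UnixDatagram, UnixStream};

/// Hypothetical helper: sends `msg` across an unnamed stream pair and
/// returns everything the other end received.
fn stream_round_trip(msg: &[u8]) -> io::Result<Vec<u8>> {
    let (mut writer, mut reader) = UnixStream::pair()?;
    writer.write_all(msg)?;
    // Dropping the writer closes its end, so the reader observes EOF.
    drop(writer);
    let mut received = Vec::new();
    reader.read_to_end(&mut received)?;
    Ok(received)
}

/// Hypothetical helper: sends `msg` as one datagram across an unnamed pair.
fn datagram_round_trip(msg: &[u8]) -> io::Result<Vec<u8>> {
    let (a, b) = UnixDatagram::pair()?;
    // `send` works without `connect` because the pair is already connected.
    a.send(msg)?;
    let mut buf = vec![0u8; msg.len().max(1)];
    let n = b.recv(&mut buf)?;
    buf.truncate(n);
    Ok(buf)
}

fn main() -> io::Result<()> {
    assert_eq!(stream_round_trip(b"ping")?, b"ping".to_vec());
    assert_eq!(datagram_round_trip(b"pong")?, b"pong".to_vec());
    Ok(())
}
```

Neither helper touches the filesystem, which is why `pair` is convenient for tests and parent/child communication; the pathname-based `bind` and `connect` constructors would additionally need a socket path to be chosen and cleaned up.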
From 4cdc21a0b2ecfa5a3c741a3bf490888f9adea326 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 28 Jan 2016 23:11:18 -0800 Subject: [PATCH 0718/1195] RFC 1361 is #[cfg] in Cargo dependencies --- ...-cargo-cfg-dependencies.md => 1361-cargo-cfg-dependencies} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-cargo-cfg-dependencies.md => 1361-cargo-cfg-dependencies} (98%) diff --git a/text/0000-cargo-cfg-dependencies.md b/text/1361-cargo-cfg-dependencies similarity index 98% rename from text/0000-cargo-cfg-dependencies.md rename to text/1361-cargo-cfg-dependencies index f9419f5fe7f..c4eed93edb6 100644 --- a/text/0000-cargo-cfg-dependencies.md +++ b/text/1361-cargo-cfg-dependencies @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2015-11-10 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1361](https://github.com/rust-lang/rfcs/pull/1361) +- Rust Issue: N/A # Summary [summary]: #summary From 531cddfe41bb68dd6aedb4e2cdcd318e3b58232f Mon Sep 17 00:00:00 2001 From: Vadim Petrochenkov Date: Fri, 29 Jan 2016 10:23:28 +0300 Subject: [PATCH 0719/1195] Add extension ".md" to text/1361-cargo-cfg-dependencies Otherwise GitHub doesn't render it properly --- ...1361-cargo-cfg-dependencies => 1361-cargo-cfg-dependencies.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename text/{1361-cargo-cfg-dependencies => 1361-cargo-cfg-dependencies.md} (100%) diff --git a/text/1361-cargo-cfg-dependencies b/text/1361-cargo-cfg-dependencies.md similarity index 100% rename from text/1361-cargo-cfg-dependencies rename to text/1361-cargo-cfg-dependencies.md From bb201c1caee5af080a25117ec06195bfc339294c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Mon, 1 Feb 2016 14:57:30 +0100 Subject: [PATCH 0720/1195] Conservative `impl Trait` --- text/0000-conservative-impl-trait.md | 325 +++++++++++++++++++++++++++ 1 file changed, 325 insertions(+) create mode 100644 text/0000-conservative-impl-trait.md 
diff --git a/text/0000-conservative-impl-trait.md b/text/0000-conservative-impl-trait.md new file mode 100644 index 00000000000..c23071b45b5 --- /dev/null +++ b/text/0000-conservative-impl-trait.md @@ -0,0 +1,325 @@ +- Feature Name: conservative_impl_trait +- Start Date: 2016-01-31 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Add a conservative form of abstract return types, aka `impl Trait`, +that will be compatible with most possible future extensions by +initially being restricted to: + +- Only free-standing or inherent functions. +- Only return type position of a function. + +Abstract return types allow a function to hide a concrete return +type behind a trait interface similar to trait objects, while +still generating the same statically dispatched code as with concrete types: + +```rust +fn foo(n: u32) -> impl Iterator<Item = u32> { + (0..n).map(|x| x * 100) +} +// ^ behaves as if it had return type Map<Range<u32>, Clos> +// where Clos = type of the |x| x * 100 closure. + +for x in foo(10) { + // ... +} + +``` + +# Motivation +[motivation]: #motivation + +> Why are we doing this? What use cases does it support? What is the expected outcome? + +There has been much discussion around the `impl Trait` feature already, with +different proposals extending the core idea into different directions. + +See http://aturon.github.io/blog/2015/09/28/impl-trait/ for detailed motivation, and +https://github.com/rust-lang/rfcs/pull/105 and https://github.com/rust-lang/rfcs/pull/1305 for prior RFCs on this topic. + +It is not yet clear which, if any, of the proposals will end up as the "final form" +of the feature, so this RFC aims to only specify a usable subset that will +be compatible with most of them. + +# Detailed design +[design]: #detailed-design + +> This is the bulk of the RFC. Explain the design in enough detail for somebody familiar +> with the language to understand, and for somebody familiar with the compiler to implement.
+> This should get into specifics and corner-cases, and include examples of how the feature is used. + +#### Syntax + +Let's start with the bikeshed: The proposed syntax is `@Trait` in return type +position, composing like trait objects to forms like `@(Foo+Send+'a)`. + +The reason for choosing a sigil is ergonomics: Whatever the exact final +implementation will be capable of, you'd want it to be as easy to read/write +as trait objects, or else the more performant and idiomatic option would +be the more verbose one, and thus probably less used. + +The argument can be made this decreases the google-ability of Rust syntax +(and this doesn't even talk about the _old_ `@T` pointer semantic the internet is still littered with), +but this would be somewhat mitigated by the feature being supposedly used commonly once it lands, +and can be explained in the docs as being short for `abstract` or `anonym`. + +If there are good reasons against `@`, there is also the choice of `~`. +All points from above still apply, except `~` is a bit rarer in language +syntaxes in general, and depending on keyboard layout somewhat harder to reach. + +Finally, if there is a huge incentive _against_ new (old?) sigils in the language, +there is also the option of using keyword-based syntax like `impl Trait` or +`abstract Trait`, but this would add a verbosity overhead for a feature +that will be used somewhat commonly. + +#### Semantic + +The core semantic of the feature is described below. Note that the sections after +this one go into more detail on some of the design decisions. + +- `@Trait` may only be written at return type position + of a freestanding or inherent-impl function, not in trait definitions, + closure traits, function pointers, or any non-return type position. +- The function body can return values of any type that implements Trait, + but all return values need to be of the same type. +- Outside of the function body, the return type is only known to implement Trait. 
+- As an exception to the above, OIBITs like `Send` and `Sync` leak through an abstract return type. +- The return type is unnameable. +- The return type has an identity based on all generic parameters the + function body is parametrized by, and by the location of the function + in the module system. This means type equality behaves like this: + ```rust + fn foo<T: Trait>(t: T) -> @Trait { + t + } + + fn bar() -> @Trait { + 123 + } + + fn equal_type<T>(a: T, b: T) {} + + equal_type(bar(), bar()) // OK + equal_type(foo::<i32>(0), foo::<i32>(0)) // OK + equal_type(bar(), foo::<i32>(0)) // ERROR, `@Trait {bar}` is not the same type as `@Trait {foo<i32>}` + equal_type(foo::<bool>(false), foo::<i32>(0)) // ERROR, `@Trait {foo<bool>}` is not the same type as `@Trait {foo<i32>}` + ``` +- The function body can not see through its own return type, so code like this + would be forbidden just like on the outside: + ```rust + fn sum_to(n: u32) -> @Display { + if n == 0 { + 0 + } else { + n + sum_to(n - 1) + } + } + ``` +- Abstract return types are considered `Sized`, just like all return types today. + +#### Limitation to only retun type position + +There have been various proposed additional places where abstract types +might be usable. For example, `fn x(y: @Trait)` as shorthand for +`fn x<T: Trait>(y: T)`. +Since the exact semantic and user experience for these +locations are yet unclear +(`@Trait` would effectively behave completely different before and after the `->`), +this has also been excluded from this proposal. + +#### OIBIT semantic + +OIBITs leak through an abstract return type. This might be considered controversial, since +it effectively opens a channel where the result of function-local type inference affects +item-level API, but has been deemed worth it for the following reasons: + +- Ergonomics: Trait objects already have the issue of explicitly needing to + declare `Send`/`Sync`-ability, and not extending this problem to abstract return types + is desirable.
+- Low real change, since the situation already exists with structs with private fields: + - In both cases, a change to the private implementation might change whether a OIBIT is + implemented or not. + - In both cases, the existence of OIBIT impls is not visible without doc tools + - In both cases, you can only assert the existence of OIBIT impls + by adding explicit trait bounds either to the API or to the crate's testsuite. + +This means, however, that it has to be considered a silent breaking change +to change a function with a abstract return type +in a way that removes OIBIT impls, which might be a problem. + +#### Anonymity + +A abstract return type can not be named - this is similar to how closures +and function items are already unnameable types, and might be considered +a problem because it makes it not possible to build explicitly typed API +around the return type of a function. + +The current semantic has been chosen for consistency and simplicity, +since the issue already exists with closures and function items, and +a solution to them will also apply here. + +For example, if named abstract types get added, then existing +abstract return types could get upgraded to having a name transparently. +Likewise, if `typeof` makes it into the language, then you could refer to the +return type of a function without naming it. + +#### Type transparency in recursive functions + +Functions with abstract return types can not see through their own return type, +making code like this not compile: + +```rust +fn sum_to(n: u32) -> @Display { + if n == 0 { + 0 + } else { + n + sum_to(n - 1) + } +} +``` + +This limitation exists because it is not clear how much a function body +can and should know about different instantiations of itself. 
+ +It would be safe to allow recursive calls if the set of generic parameters +is identical, and it might even be safe if the generic parameters are different, +since you would still be inside the private body of the function, just +differently instantiated. + +But variance caused by lifetime parameters and the interaction with +specialization makes it uncertain whether this would be sound. + +In any case, it can be initially worked around by defining a local helper function like this: + +```rust +fn sum_to(n: u32) -> @Display { + fn sum_to_(n: u32) -> u32 { + if n == 0 { + 0 + } else { + n + sum_to_(n - 1) + } + } + sum_to_(n) +} +``` + +#### Not legal in function pointers/closure traits + +Because `@Trait` defines a type tied to the concrete function body, +it does not make much sense to talk about it separately in a function signature, +so the syntax is forbidden there. + +#### Compatibility with conditional trait bounds + +One valid critique for the existing `@Trait` proposal is that it does not +cover more complex scenarios, where the return type would implement +one or more traits depending on whether a type parameter does so with another. + +For example, an iterator adapter might want to implement `Iterator` and +`DoubleEndedIterator`, depending on whether the adapted one does: + +```rust +fn skip_one<I>(i: I) -> SkipOne<I> { ... } +struct SkipOne<I> { ... } +impl<I: Iterator> Iterator for SkipOne<I> { ... } +impl<I: DoubleEndedIterator> DoubleEndedIterator for SkipOne<I> { ... } +``` + +Using just `-> @Iterator`, this would not be possible to reproduce. + +Since there have been no proposals so far that would address this in a way +that would conflict with the fixed-trait-set case, this RFC punts on that issue as well. + +#### Limitation to free/inherent functions + +One important usecase of abstract retutn types is to use them in trait methods. + +However, there is an issue with this, namely that in combinations with generic +trait methods, they are effectively equivalent to higher kinded types.
+Which is an issue because Rust HKT story is not yet figured out, so +any "accidential implementation" might cause uninteded fallout. + +HKT allows you to be generic over a type constructor, aka a +"thing with type parameters", and then instantiate them at some later point to +get the actual type. +For example, given a HK type `T` that takes one type as parameter, you could +write code that uses `T<u32>` or `T<bool>` without caring about +whether `T = Vec`, `T = Box`, etc. + +Now if we look at abstract return types, we have a similar situation: + +```rust +trait Foo { + fn bar<U>() -> impl Baz +} +``` + +Given a `T: Foo`, we could instantiate `T::bar::<u32>` or `T::bar::<bool>`, +and could get arbitrary different return types of `bar` instantiated +with a `u32` or `bool`, +just like `T<u32>` and `T<bool>` might give us `Vec<u32>` or `Box<bool>` +in the example above. + +The problem does not exist with trait method return types today because +they are concrete: + +```rust +trait Foo { + fn bar<U>() -> X<U> +} +``` + +Given the above code, there is no way for `bar` to choose a return type `X` +that could fundamentally differ between instantiations of `Self` +while still being instantiable with an arbitrary `U`. + +At most you could return an associated type, but then you'd lose the generics +from `bar` + +```rust +trait Foo { + type X; + fn bar<U>() -> Self::X // No way to apply U +} +``` + +So, in conclusion, since Rust's HKT story is not yet fleshed out, +and the compatibility of the current compiler with it is unknown, +it is not yet possible to reach a concrete solution here. + +In addition to that, there are also different proposals as to whether +an abstract return type is its own thing or sugar for an associated type, +how it interacts with other associated items and so on, +so forbidding them in traits seems like the best initial course of action. + +# Drawbacks +[drawbacks]: #drawbacks + +> Why should we *not* do this?
+ +As has been elaborated on above, there are various way this feature could be +extended and combined with the language, so implementing it might +cause issues down the road if limitations or incompatibilities become apparent. + +# Alternatives +[alternatives]: #alternatives + +> What other designs have been considered? What is the impact of not doing this? + +See the links in the motivation section for a more detailed analysis. + +But basically, with this feature certain things remain hard or impossible to do +in Rust, like returning a efficiently usable type parametricised by +types private to a function body, like a iterator adapter containing a closure. + +# Unresolved questions +[unresolved]: #unresolved-questions + +> What parts of the design are still TBD? + +None for the core feature proposed here, but many for possible extensions as elaborated on in detailed design. From cc1a6983a15af552bfb4d84c3945692d8b6507fa Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Mon, 1 Feb 2016 16:37:09 +0100 Subject: [PATCH 0721/1195] Fix formating --- text/0000-conservative-impl-trait.md | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/text/0000-conservative-impl-trait.md b/text/0000-conservative-impl-trait.md index c23071b45b5..f46e1dec834 100644 --- a/text/0000-conservative-impl-trait.md +++ b/text/0000-conservative-impl-trait.md @@ -94,34 +94,36 @@ this one go into more detail on some of the design decisions. in the module system. 
This means type equality behaves like this: ```rust fn foo<T: Trait>(t: T) -> @Trait { - t + t } fn bar() -> @Trait { - 123 + 123 } fn equal_type<T>(a: T, b: T) {} - equal_type(bar(), bar()) // OK - equal_type(foo::<i32>(0), foo::<i32>(0)) // OK - equal_type(bar(), foo::<i32>(0)) // ERROR, `@Trait {bar}` is not the same type as `@Trait {foo<i32>}` - equal_type(foo::<bool>(false), foo::<i32>(0)) // ERROR, `@Trait {foo<bool>}` is not the same type as `@Trait {foo<i32>}` + equal_type(bar(), bar()); // OK + equal_type(foo::<i32>(0), foo::<i32>(0)); // OK + equal_type(bar(), foo::<i32>(0)); // ERROR, `@Trait {bar}` is not the same type as `@Trait {foo<i32>}` + equal_type(foo::<bool>(false), foo::<i32>(0)); // ERROR, `@Trait {foo<bool>}` is not the same type as `@Trait {foo<i32>}` ``` + - The function body can not see through its own return type, so code like this would be forbidden just like on the outside: ```rust fn sum_to(n: u32) -> @Display { if n == 0 { - 0 + 0 } else { - n + sum_to(n - 1) + n + sum_to(n - 1) } } ``` + - Abstract return types are considered `Sized`, just like all return types today. -#### Limitation to only retun type position +#### Limitation to only return type position There have been various proposed additional places where abstract types might be usable. For example, `fn x(y: @Trait)` as shorthand for From 7b80b5137e4f1b6142a24733c8d02260dc5a1e05 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Mon, 1 Feb 2016 16:43:26 +0100 Subject: [PATCH 0722/1195] Fix formating --- text/0000-conservative-impl-trait.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-conservative-impl-trait.md b/text/0000-conservative-impl-trait.md index f46e1dec834..4f1ce3dfe5d 100644 --- a/text/0000-conservative-impl-trait.md +++ b/text/0000-conservative-impl-trait.md @@ -239,12 +239,12 @@ that would conflict with the fixed-trait-set case, this RFC punts on that issue #### Limitation to free/inherent functions -One important usecase of abstract retutn types is to use them in trait methods.
+One important usecase of abstract return types is to use them in trait methods. However, there is an issue with this, namely that in combinations with generic trait methods, they are effectively equivalent to higher kinded types. Which is an issue because Rust HKT story is not yet figured out, so -any "accidential implementation" might cause uninteded fallout. +any "accidential implementation" might cause unintended fallout. HKT allows you to be generic over a type constructor, aka a "thing with type parameters", and then instantiate them at some later point to From 1cedd1271f5c200571f1556a9dd74bf10fa6411c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Mon, 1 Feb 2016 16:45:09 +0100 Subject: [PATCH 0723/1195] Fix formating --- text/0000-conservative-impl-trait.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-conservative-impl-trait.md b/text/0000-conservative-impl-trait.md index 4f1ce3dfe5d..d701b9b833b 100644 --- a/text/0000-conservative-impl-trait.md +++ b/text/0000-conservative-impl-trait.md @@ -315,7 +315,7 @@ cause issues down the road if limitations or incompatibilities become apparent. See the links in the motivation section for a more detailed analysis. -But basically, with this feature certain things remain hard or impossible to do +But basically, without this feature certain things remain hard or impossible to do in Rust, like returning a efficiently usable type parametricised by types private to a function body, like a iterator adapter containing a closure. 
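To make the corrected paragraph concrete: an iterator adapter that captures a closure has a type that cannot be written in a signature, so without this feature the function has to box it. The sketch below is not part of the patch. It assumes the `impl Trait` spelling that later Rust releases accept where the RFC text writes the placeholder `@Trait`, and both function names are hypothetical.

```rust
// Workaround without abstract return types: box the adapter and pay for
// dynamic dispatch, because the closure's type cannot be named.
fn multiples_boxed(step: u32) -> Box<dyn Iterator<Item = u32>> {
    Box::new((1..).map(move |i| i * step))
}

// With an abstract return type the caller only learns that the value
// implements `Iterator<Item = u32>`; the concrete adapter type, which
// contains the closure, stays hidden and needs no allocation.
fn multiples_abstract(step: u32) -> impl Iterator<Item = u32> {
    (1..).map(move |i| i * step)
}
```

Both functions can be consumed the same way, for example `multiples_abstract(3).take(4).collect::<Vec<u32>>()`; only the boxed version allocates.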
From 9cef028c23207b2edaefcfee75b1f37345638f23 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Mon, 1 Feb 2016 16:45:36 +0100 Subject: [PATCH 0724/1195] Fix formating --- text/0000-conservative-impl-trait.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-conservative-impl-trait.md b/text/0000-conservative-impl-trait.md index d701b9b833b..8c4d5a883d7 100644 --- a/text/0000-conservative-impl-trait.md +++ b/text/0000-conservative-impl-trait.md @@ -317,7 +317,7 @@ See the links in the motivation section for a more detailed analysis. But basically, without this feature certain things remain hard or impossible to do in Rust, like returning a efficiently usable type parametricised by -types private to a function body, like a iterator adapter containing a closure. +types private to a function body, for example a iterator adapter containing a closure. # Unresolved questions [unresolved]: #unresolved-questions From 3f8fec6f71c75b0e8161d759295f2e54b4bd3299 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Mon, 1 Feb 2016 17:21:52 +0100 Subject: [PATCH 0725/1195] Addressed a few remarks --- text/0000-conservative-impl-trait.md | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/text/0000-conservative-impl-trait.md b/text/0000-conservative-impl-trait.md index 8c4d5a883d7..0d486506043 100644 --- a/text/0000-conservative-impl-trait.md +++ b/text/0000-conservative-impl-trait.md @@ -78,8 +78,11 @@ that will be used somewhat commonly. #### Semantic -The core semantic of the feature is described below. Note that the sections after -this one go into more detail on some of the design decisions. +The core semantic of the feature is described below. + +Note that the sections after this one go into more detail on some of the design +decisions, and that it is likely for most of the mentioned limitations to be +lifted at some point in the future. 
- `@Trait` may only be written at return type position of a freestanding or inherent-impl function, not in trait definitions, @@ -87,7 +90,9 @@ this one go into more detail on some of the design decisions. - The function body can return values of any type that implements Trait, but all return values need to be of the same type. - Outside of the function body, the return type is only known to implement Trait. -- As an exception to the above, OIBITS like `Send` and `Sync` leak through an abstract return type. +- As an exception to the above, OIBITS like `Send` and `Sync` leak through an + abstract return type. This will cause some additional complexity in the + compiler due to some non-local typechecking becoming neccessary. - The return type is unnameable. - The return type has a identity based on all generic parameters the function body is parametrized by, and by the location of the function From d7441bf81582591233040161b7c926bb1db746b4 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Mon, 1 Feb 2016 10:47:39 -0800 Subject: [PATCH 0726/1195] Clarify lifting drawback --- text/0000-impl-specialization.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/text/0000-impl-specialization.md b/text/0000-impl-specialization.md index 8dab544ff0e..644486e5866 100644 --- a/text/0000-impl-specialization.md +++ b/text/0000-impl-specialization.md @@ -1735,10 +1735,12 @@ it's easy to tell at a glance which of two impls will be preferred. On the other hand, the simplicity of this design has its own drawbacks: -- You have to lift out trait parameters to enable specialization, as in the - `Extend` example above. The RFC mentions a few ways of dealing with this - limitation -- either by employing inherent item specialization, or by - eventually generalizing HRTBs. +- You have to lift out trait parameters to enable specialization, as + in the `Extend` example above. 
Of course, this lifting can be hidden + behind an additional trait, so that the end-user interface remains + idiomatic. The RFC mentions a few other extensions for dealing with + this limitation -- either by employing inherent item specialization, + or by eventually generalizing HRTBs. - You can't use specialization to handle some of the more "exotic" cases of overlap, as described in the Limitations section above. This is a deliberate From 34cb52031d2ebc879940a01b2d1aec71fa35defb Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Tue, 2 Feb 2016 14:50:42 +0100 Subject: [PATCH 0727/1195] Elaborate on details about trans, designd ecisions, and OIBIT behavior --- text/0000-conservative-impl-trait.md | 83 +++++++++++++++++++++------- 1 file changed, 64 insertions(+), 19 deletions(-) diff --git a/text/0000-conservative-impl-trait.md b/text/0000-conservative-impl-trait.md index 0d486506043..bfe7a7fc5c5 100644 --- a/text/0000-conservative-impl-trait.md +++ b/text/0000-conservative-impl-trait.md @@ -89,11 +89,21 @@ lifted at some point in the future. closure traits, function pointers, or any non-return type position. - The function body can return values of any type that implements Trait, but all return values need to be of the same type. -- Outside of the function body, the return type is only known to implement Trait. -- As an exception to the above, OIBITS like `Send` and `Sync` leak through an - abstract return type. This will cause some additional complexity in the - compiler due to some non-local typechecking becoming neccessary. -- The return type is unnameable. +- As far as the typesystem and the compiler is concerned, + the return type outside of the function would + not be a entirely "new" type, nor would it be + a simple type alias. Rather, its semantic would be very similar to that of + _generic type paramters_ inside a function, with small differences caused + by being an _output_ rather than an _input_ of the function. 
+ - The type would be known to implement the specified traits. + - The type would not be known to implement any other trait, with + the exception of OIBITS and default traits like `Sized`. + - The type would not be considered equal to the actual underlying type. + - The type would not be allowed to be implemented on. + - The type would be unnameable, just like closures and function items. +- Because OIBITS like `Send` and `Sync` will leak through an + abstract return type, there will be some additional complexity in the + compiler due to some non-local type checking becoming necessary. - The return type has a identity based on all generic parameters the function body is parametrized by, and by the location of the function in the module system. This means type equality behaves like this: @@ -126,17 +136,35 @@ lifted at some point in the future. } ``` -- Abstract return types are considered `Sized`, just like all return types today. - -#### Limitation to only return type position - -There have been various proposed additional places where abstract types -might be usable. For example, `fn x(y: @Trait)` as shorthand for -`fn x(y: T)`. -Since the exact semantic and user experience for these -locations are yet unclear -(`@Trait` would effectively behave completely different before and after the `->`), -this has also been excluded from this proposal. +- The code generation passes of the compiler would + not draw a distinction between the abstract return type and the underlying type, + just like they don't for generic paramters. This means: + - The same trait code would be instantiated, for example, `-> @Any` + would return the type id of the underlying type. + - Specialization would specialize based on the underlying type. + +#### Why this semantic for the return type? 
+ +There has been a lot of discussion about what the semantic of +the return type should be, with the theoretical extremes being "full return type inference" and "fully abstract type that behaves like an autogenerated newtype wrapper". + +The design as chosen in this RFC lies somewhat in between those two, +for the following reasons: + +- Usage of this feature should not imply worse performance + than not using it, so specialization and code generation have to + treat it the same. +- Likewise, there should not be any bad interactions + caused by part of the typesystem treating the return type differently + than other parts, so it should not have its own "identity" + in the sense of allowing additional or different trait or inherent implementations. +- It should not enable return type inference in item signatures, + so the exact underlying type needs to be hidden. +- It should not cause type errors to change the function + body and/or the underlying type as long as the specified trait + bounds are still satisfied. +- As an exception to the above, it should not act as a barrier to OIBITs like + `Send` and `Sync` due to ergonomic reasons. For more details, see next section. #### OIBIT semantic @@ -145,9 +173,11 @@ it effectively opens a channel where the result of function-local type inference item-level API, but has been deemed worth it for the following reasons: - Ergonomics: Trait objects already have the issue of explicitly needing to - declare `Send`/`Sync`-ability, and not extending this problem to abstract return types - is desireable. -- Low real change, since the situation already exists with structs with private fields: + declare `Send`/`Sync`-ability, and not extending this problem to abstract + return types is desireable. In practice, most uses + of this feature would have to add explicit bounds for OIBITS + if they want to be maximally usable.
+- Low real change, since the situation already somewhat exists on structs with private fields: - In both cases, a change to the private implementation might change whether a OIBIT is implemented or not. - In both cases, the existence of OIBIT impls is not visible without doc tools @@ -158,6 +188,10 @@ This means, however, that it has to be considered a silent breaking change to change a function with a abstract return type in a way that removes OIBIT impls, which might be a problem. +But since the number of used OIBITs is relatvly small, +deducing the return type in a function body and reasoning +about whether such a breakage will occur has been deemed as a manageable amount of work. + #### Anonymity A abstract return type can not be named - this is similar to how closures @@ -174,6 +208,17 @@ abstract return types could get upgraded to having a name transparently. Likewise, if `typeof` makes it into the language, then you could refer to the return type of a function without naming it. +#### Limitation to only return type position + +There have been various proposed additional places where abstract types +might be usable. For example, `fn x(y: @Trait)` as shorthand for +`fn x(y: T)`. + +Since the exact semantic and user experience for these +locations are yet unclear +(`@Trait` would effectively behave completely different before and after the `->`), +this has also been excluded from this proposal. 
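The remark that the placeholder would behave differently before and after the `->` can be illustrated with today's generics. This is a minimal sketch and not part of the patch: `Display` stands in for `Trait`, the function names are hypothetical, the generic parameter list is the usual Rust syntax that an argument-position shorthand would presumably abbreviate, and the return type uses the `impl Trait` spelling that current Rust accepts in place of `@Trait`.

```rust
use std::fmt::Display;

// Input position: the *caller* picks `T`, and the body may rely only on
// the `Display` bound.
fn describe<T: Display>(y: T) -> String {
    format!("value: {}", y)
}

// Output position: the *function body* picks the concrete type (here the
// integer literal falls back to `i32`), and the caller may rely only on
// the `Display` bound.
fn answer() -> impl Display {
    42
}
```

Passing one to the other, as in `describe(answer())`, type-checks because the abstract return type is known to implement `Display` and nothing else is required of it.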
+ #### Type transparency in recursive functions Functions with abstract return types can not see through their own return type, From d74cdf78e1cb73ae91ff9e47a5de0c2de8c9a44d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Tue, 2 Feb 2016 15:18:43 +0100 Subject: [PATCH 0728/1195] Raise an unresolved question about specialization --- text/0000-conservative-impl-trait.md | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/text/0000-conservative-impl-trait.md b/text/0000-conservative-impl-trait.md index bfe7a7fc5c5..5cc848a9ddb 100644 --- a/text/0000-conservative-impl-trait.md +++ b/text/0000-conservative-impl-trait.md @@ -176,7 +176,7 @@ item-level API, but has been deemed worth it for the following reasons: declare `Send`/`Sync`-ability, and not extending this problem to abstract return types is desireable. In practice, most uses of this feature would have to add explicit bounds for OIBITS - if they want to be maximally usable. + if they wanted to be maximally usable. - Low real change, since the situation already somewhat exists on structs with private fields: - In both cases, a change to the private implementation might change whether a OIBIT is implemented or not. @@ -374,4 +374,12 @@ types private to a function body, for example a iterator adapter containing a cl > What parts of the design are still TBD? -None for the core feature proposed here, but many for possible extensions as elaborated on in detailed design. +- What happens if you specialize a function with an abstract return type, + and differ in whether the return type implements an OIBIT or not? + - It would mean that specialization choice + has to flow back into typechecking. + - It seems sound, but would mean that different input type combinations + of such a function could cause different OIBIT behavior independent + of the input type parameters themself. 
+ - Which would not necessarily be an issue, since the actual type could not + be observed from the outside anyway. From d975dce592d222adb046b47377003d89c8ba9022 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 3 Feb 2016 16:23:01 -0800 Subject: [PATCH 0729/1195] RFC 1359 is CommandExt::{exec, before_exec} --- text/{0000-process-ext-unix.md => 1359-process-ext-unix.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-process-ext-unix.md => 1359-process-ext-unix.md} (97%) diff --git a/text/0000-process-ext-unix.md b/text/1359-process-ext-unix.md similarity index 97% rename from text/0000-process-ext-unix.md rename to text/1359-process-ext-unix.md index cc67f148dff..ece03ce6a75 100644 --- a/text/0000-process-ext-unix.md +++ b/text/1359-process-ext-unix.md @@ -1,7 +1,7 @@ - Feature Name: `process_exec` - Start Date: 2015-11-09 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1359](https://github.com/rust-lang/rfcs/pull/1359) +- Rust Issue: [rust-lang/rust#31398](https://github.com/rust-lang/rust/issues/31398) # Summary [summary]: #summary From 120c28b13c14a224ba54e519f2dbc418538db3a1 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 5 Feb 2016 13:59:04 -0500 Subject: [PATCH 0730/1195] Update unresolved question about exhaustiveness. --- text/0000-restrict-constants-in-patterns.md | 49 +++++++++++++-------- 1 file changed, 31 insertions(+), 18 deletions(-) diff --git a/text/0000-restrict-constants-in-patterns.md b/text/0000-restrict-constants-in-patterns.md index 6c1ec5f94b9..60a94c1ba9f 100644 --- a/text/0000-restrict-constants-in-patterns.md +++ b/text/0000-restrict-constants-in-patterns.md @@ -39,6 +39,10 @@ design of the RFC means that existing code that uses constant patterns will generally "just work". The justification for this change is that it is clarifying ["underspecified language semantics" clause, as described in RFC 1122][ls]. 
+A [recent crater run][crater] with a prototype implementation found 6 +regressions. + +[crater]: https://gist.github.com/nikomatsakis/e714e4a824527e0ce5c9 **Note:** this was also discussed on an [internals thread]. Major points from that thread are summarized either inline or in @@ -432,6 +436,33 @@ When deriving the `Eq` trait, we will add the `#[structural_match]` to the type in question. Attributes added in this way will be **exempt from the feature gate**. +## Exhaustiveness and dead-code checking + +We will treat user-defined structs "opaquely" for the purpose of +exhaustiveness and dead-code checking. This is required to allow for +semantic equality semantics in the future, since in that case we +cannot rely on `Eq` to be correctly implemented (e.g., it could always +return `false`, no matter values are supplied to it, even though it's +not supposed to). The impact of this change has not been evaluated but +is expected to be **very** small, since in practice it is rather +challenging to successfully make an exhaustive match using +user-defined constants, unless they are something trivial like +newtype'd booleans (and, in that case, you can update the code to use +a more extended pattern). + +Similarly, dead code detection should treat constants in a +conservative fashion. that is, we can recognize that if there are two +arms using the same constant, the second one is dead code, even though +it may be that neither will matches (e.g., `match foo { C => _, C => _ +}`). We will make no assumptions about two distinct constants, even if +we can concretely evaluate them to the same value. + +One **unresolved question** (described below) is what behavior to +adopt for constants that involve no user-defined types. There, the +definition of `Eq` is purely under our control, and we know that it +matches structural equality, so we can retain our current aggressive +analysis if desired. + ### Phasing We will not make this change instantaneously. 
Rather, for at least one @@ -493,24 +524,6 @@ constants for that purpose. # Unresolved questions [unresolved]: #unresolved-questions -**Should we also adjust the exhaustiveness and match analysis -algorithm to be more conservative around user-defined structs and -enums?** This RFC leaves exhaustiveness and dead-code checking -unchanged. If we adopted semantic equality semantics, then we would -have to assume that the `Eq` impls are not buggy in order for the -exhaustiveness checking to continue working like this (that is, we -would have to assume that `x == x` always returned true). That said, -this might be OK, so long as the compiler handles the failure in some -graceful way, rather than generating undefined behavior. Furthermore, -in practice it is rather challenging to successfully make an -exhaustive match using user-defined constants unless they are -something trivial like newtype'd bools. - -Still, for maximum flexibility, the ideal behavior would be to be -conservative around exhaustiveness checking, but still detect and warn -about "dead-code" arms (e.g., `match foo { C => _, C => _ }`). We -would want to determine how possible this is. - **What about exhaustiveness etc on builtin types?** Even if we ignore user-defined types, there are complications around exhaustiveness checking for constants of any kind related to associated constants and From 3855c84457484dd390594f1fda8d4cd3c309ca4f Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 5 Feb 2016 14:02:17 -0500 Subject: [PATCH 0731/1195] Merge RFC #1445: restrict constants in patterns. 
--- text/0000-restrict-constants-in-patterns.md | 8 +- text/1445-restrict-constants-in-patterns.md | 621 ++++++++++++++++++++ 2 files changed, 625 insertions(+), 4 deletions(-) create mode 100644 text/1445-restrict-constants-in-patterns.md diff --git a/text/0000-restrict-constants-in-patterns.md b/text/0000-restrict-constants-in-patterns.md index 60a94c1ba9f..c4cb33b9fcf 100644 --- a/text/0000-restrict-constants-in-patterns.md +++ b/text/0000-restrict-constants-in-patterns.md @@ -1,7 +1,7 @@ -- Feature Name: (fill me in with a unique ident, my_awesome_feature) -- Start Date: (fill me in with today's date, YYYY-MM-DD) -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- Feature Name: structural_match +- Start Date: 2015-02-06 +- RFC PR: [rust-lang/rfcs#1445](https://github.com/rust-lang/rfcs/pull/1445) +- Rust Issue: [rust-lang/rust#31434](https://github.com/rust-lang/rust/issues/31434) # Summary [summary]: #summary diff --git a/text/1445-restrict-constants-in-patterns.md b/text/1445-restrict-constants-in-patterns.md new file mode 100644 index 00000000000..60a94c1ba9f --- /dev/null +++ b/text/1445-restrict-constants-in-patterns.md @@ -0,0 +1,621 @@ +- Feature Name: (fill me in with a unique ident, my_awesome_feature) +- Start Date: (fill me in with today's date, YYYY-MM-DD) +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +The current compiler implements a more expansive semantics for pattern +matching than was originally intended. This RFC introduces several +mechanisms to rein in these semantics without actually breaking +(much, if any) extant code: + +- Introduce a feature-gated attribute `#[structural_match]` which can + be applied to a struct or enum `T` to indicate that constants of + type `T` can be used within patterns. +- Have `#[derive(Eq)]` automatically apply this attribute to + the struct or enum that it decorates.
**Automatically inserted attributes + do not require use of feature-gate.** +- When expanding constants of struct or enum type into equivalent + patterns, require that the struct or enum type is decorated with + `#[structural_match]`. Constants of builtin types are always + expanded. + +The practical effect of these changes will be to prevent the use of +constants in patterns unless the type of those constants is either a +built-in type (like `i32` or `&str`) or a user-defined type for +which `Eq` is **derived** (not merely *implemented*). + +To be clear, this `#[structural_match]` attribute is **never intended +to be stabilized**. Rather, the intention of this change is to +restrict constant patterns to those cases that everyone can agree on +for now. We can then have further discussion to settle the best +semantics in the long term. + +Because the compiler currently accepts arbitrary constant patterns, +this is technically a backwards incompatible change. However, the +design of the RFC means that existing code that uses constant patterns +will generally "just work". The justification for this change is that +it is clarifying +["underspecified language semantics" clause, as described in RFC 1122][ls]. +A [recent crater run][crater] with a prototype implementation found 6 +regressions. + +[crater]: https://gist.github.com/nikomatsakis/e714e4a824527e0ce5c9 + +**Note:** this was also discussed on an [internals thread]. Major +points from that thread are summarized either inline or in +alternatives. + +[ls]: https://github.com/rust-lang/rfcs/blob/master/text/1122-language-semver.md#underspecified-language-semantics +[crater run]: https://gist.github.com/nikomatsakis/26096ec2a2df3c1fb224 +[internals thread]: https://internals.rust-lang.org/t/how-to-handle-pattern-matching-on-constants/2846 + +# Motivation +[motivation]: #motivation + +The compiler currently permits any kind of constant to be used within +a pattern.
However, the *meaning* of such a pattern is somewhat +controversial: the current semantics implemented by the compiler were +[adopted in July of 2014](https://github.com/rust-lang/rust/pull/15650) +and were never widely discussed nor did they go through the RFC +process. Moreover, the discussion at the time was focused primarily on +implementation concerns, and overlooked the potential semantic +hazards. + +### Semantic vs structural equality + +Consider a program like this one, which references a constant value +from within a pattern: + +```rust +struct SomeType { + a: u32, + b: u32, +} + +const SOME_CONSTANT: SomeType = SomeType { a: 22+22, b: 44+44 }; + +fn test(v: SomeType) { + match v { + SOME_CONSTANT => println!("Yes"), + _ => println!("No"), + } +} +``` + +The question at hand is what do we expect this match to do, precisely? +There are two main possibilities: semantic and structural equality. + +**Semantic equality.** Semantic equality states that a pattern +`SOME_CONSTANT` matches a value `v` if `v == SOME_CONSTANT`. In other +words, the `match` statement above would be exactly equivalent to an +`if`: + +```rust +if v == SOME_CONSTANT { + println!("Yes") +} else { + println!("No"); +} +``` + +Under semantic equality, the program above would not compile, because +`SomeType` does not implement the `PartialEq` trait. + +**Structural equality.** Under structural equality, `v` matches the +pattern `SOME_CONSTANT` if all of its fields are (structurally) equal. +Primitive types like `u32` are structurally equal if they represent +the same value (but see below for discussion about floating point +types like `f32` and `f64`). 
This means that the `match` statement +above would be roughly equivalent to the following `if` (modulo +privacy): + +```rust +if v.a == SOME_CONSTANT.a && v.b == SOME_CONSTANT.b { + println!("Yes") +} else { + println!("No"); +} +``` + +Structural equality basically says "two things are structurally equal +if their fields are structurally equal". It is the sort of equality you +would get if everyone used `#[derive(PartialEq)]` on all types. Note +that the equality defined by structural equality is completely +distinct from the `==` operator, which is tied to the `PartialEq` +trait. That is, two values that are *semantically unequal* could be +*structurally equal* (an example where this might occur is the +floating point value `NaN`). + +**Current semantics.** The compiler's current semantics are basically +structural equality, though in the case of floating point numbers they +are arguably closer to semantic equality (details below). In +particular, when a constant appears in a pattern, the compiler first +evaluates that constant to a specific value. So we would reduce the +expression: + +```rust +const SOME_CONSTANT: SomeType = SomeType { a: 22+22, b: 44+44 }; +``` + +to the value `SomeType { a: 44, b: 88 }`. We then expand the pattern +`SOME_CONSTANT` as though you had typed this value in place (well, +almost as though, read on for some complications around privacy). +Thus the match statement above is equivalent to: + +```rust +match v { + SomeType { a: 44, b: 88 } => println!("Yes"), + _ => println!("No"), +} +``` + +### Disadvantages of the current approach + +Given that the compiler already has a defined semantics, it is +reasonable to ask why we might want to change it. There +are two main disadvantages: + +1. **No abstraction boundary.** The current approach does not permit + types to define what equality means for themselves (at least not if + they can be constructed in a constant). +2.
**Scaling to associated constants.** The current approach does not + permit associated constants or generic integers to be used in a + match statement. + +#### Disadvantage: Weakened abstraction boundary + +The single biggest concern with structural equality is that it +introduces two distinct notions of equality: the `==` operator, based +on the `PartialEq` trait, and pattern matching, based on a builtin +structural recursion. This will cause problems for user-defined types +that rely on `PartialEq` to define equality. Put another way, **it is +no longer possible for user-defined types to completely define what +equality means for themselves** (at least not if they can be +constructed in a constant). Furthermore, because the builtin +structural recursion does not consider privacy, `match` statements can +now be used to **observe private fields**. + +**Example: Normalized durations.** Consider a simple duration type: + +```rust +#[derive(Copy, Clone)] +pub struct Duration { + pub seconds: u32, + pub minutes: u32, +} +``` + +Let's say that this `Duration` type wishes to represent a span of +time, but it also wishes to preserve whether that time was expressed +in seconds or minutes. In other words, 60 seconds and 1 minute are +equal values, but we don't want to normalize 60 seconds into 1 minute; +perhaps because it comes from user input and we wish to keep things +just as the user chose to express it.
+ +We might implement `PartialEq` like so (actually the `PartialEq` trait +is slightly different, but you get the idea): + +```rust +impl PartialEq for Duration { + fn eq(&self, other: &Duration) -> bool { + let s1 = (self.seconds as u64) + (self.minutes as u64 * 60); + let s2 = (other.seconds as u64) + (other.minutes as u64 * 60); + s1 == s2 + } +} +``` + +Now imagine I have some constants: + +```rust +const TWENTY_TWO_SECONDS: Duration = Duration { seconds: 22, minutes: 0 }; +const ONE_MINUTE: Duration = Duration { seconds: 0, minutes: 1 }; +``` + +And I write a match statement using those constants: + +```rust +fn detect_some_case_or_other(d: Duration) { + match d { + TWENTY_TWO_SECONDS => /* do something */, + ONE_MINUTE => /* do something else */, + _ => /* do something else again */, + } +} +``` + +Now this code is, in all probability, buggy. Probably I meant to use +the notion of equality that `Duration` defined, where seconds and +minutes are normalized. But that is not the behavior I will see -- +instead I will use a pure structural match. What's worse, this means +the code will probably work in my local tests, since I like to say +"one minute", but it will break when I demo it for my customer, since +she prefers to write "60 seconds". + +**Example: Floating point numbers.** Another example is floating point +numbers. Consider the case of `0.0` and `-0.0`: these two values are +distinct, but they typically behave the same; so much so that they +compare equal (that is, `0.0 == -0.0` is `true`). So it is likely +that code such as: + +```rust +match some_computation() { + 0.0 => ..., + x => ..., +} +``` + +did not intend to discriminate between zero and negative zero. In +fact, in the compiler today, match *will* compare 0.0 and -0.0 as +equal. We simply do not extend that courtesy to user-defined types. + +**Example: observing private fields.** The current constant expansion +code does not consider privacy. 
In other words, constants are expanded +into equivalent patterns, but those patterns may not have been +something the user could have typed because of privacy rules. Consider +a module like: + +```rust +mod foo { + pub struct Foo { b: bool } + pub const V1: Foo = Foo { b: true }; + pub const V2: Foo = Foo { b: false }; +} +``` + +Note that there is an abstraction boundary here: `b` is a private +field. But now if I wrote code from another module that matches on a +value of type `Foo`, that abstraction boundary is pierced: + +```rust +fn bar(f: foo::Foo) { + // rustc knows this is exhaustive because it expanded `V1` into + // equivalent patterns; patterns you could not write by hand! + match f { + foo::V1 => { /* moreover, now we know that f.b is true */ } + foo::V2 => { /* and here we know it is false */ } + } +} +``` + +Note that, because `Foo` does not implement `PartialEq`, just having +access to `V1` would not otherwise allow us to observe the value of +`f.b`. (And even if `Foo` *did* implement `PartialEq`, that +implementation might not read `f.b`, so we still would not be able to +observe its value.) + +**More examples.** There are numerous possible examples here. For +example, strings that compare using case-insensitive comparisons, but +retain the original case for reference, such as those used in +file-systems. Views that extract a subportion of a larger value (and +hence which should only compare that subportion). And so forth. + +#### Disadvantage: Scaling to associated constants and generic integers + +Rewriting constants into patterns requires that we can **fully +evaluate** the constant at the time of exhaustiveness checking. For +associated constants and type-level integers, that is not possible -- +we have to wait until monomorphization time.
Consider: + +```rust +trait SomeTrait { + const A: bool; + const B: bool; +} + +fn foo<T: SomeTrait>(x: bool) { + match x { + T::A => println!("A"), + T::B => println!("B"), + } +} + +impl SomeTrait for i32 { + const A: bool = true; + const B: bool = true; +} + +impl SomeTrait for u32 { + const A: bool = true; + const B: bool = false; +} +``` + +Is this match exhaustive? Does it contain dead code? The answer will +depend on whether `T=i32` or `T=u32`, of course. + +### Advantages of the current approach + +However, structural equality also has a number of advantages: + +**Better optimization.** One of the biggest "pros" is that it can +potentially enable nice optimization. For example, given constants like the following: + +```rust +struct Value { x: u32 } +const V1: Value = Value { x: 0 }; +const V2: Value = Value { x: 1 }; +const V3: Value = Value { x: 2 }; +const V4: Value = Value { x: 3 }; +const V5: Value = Value { x: 4 }; +``` + +and a match pattern like the following: + +```rust +match v { + V1 => ..., + ..., + V5 => ..., +} +``` + +then, because pattern matching is always a process of structurally +extracting values, we can compile this to code that reads the field +`x` (which is a `u32`) and does an appropriate switch on that +value. Semantic equality would potentially force a more conservative +compilation strategy. + +**Better exhaustiveness and dead-code checking.** Similarly, we can do +more thorough exhaustiveness and dead-code checking. So for example if +I have a struct like: + +```rust +struct Value { field: bool } +const TRUE: Value = Value { field: true }; +const FALSE: Value = Value { field: false }; +``` + +and a match pattern like: + +```rust +match v { TRUE => .., FALSE => .. } +``` + +then we can prove that this match is exhaustive. Similarly, we can prove +that the following match contains dead-code: + +```rust +const A: Value = Value { field: true }; +match v { + TRUE => ..., + A => ..., +} +``` + +Again, some of the alternatives might not allow this.
(But note the +cons, which also raise the question of exhaustiveness checking.) + +**Nullary variants and constants are (more) equivalent.** Currently, +there is a sort of equivalence between enum variants and constants, at +least with respect to pattern matching. Consider a C-like enum: + +```rust +enum Modes { + Happy = 22, + Shiny = 44, + People = 66, + Holding = 88, + Hands = 110, +} + +const C: Modes = Modes::Happy; +``` + +Now if I match against `Modes::Happy`, that is matching against an +enum variant, and under *all* the proposals I will discuss below, it +will check the actual variant of the value being matched (regardless +of whether `Modes` implements `PartialEq`, which it does not here). On +the other hand, if matching against `C` were to require a `PartialEq` +impl, then it would be illegal. Therefore matching against an *enum +variant* is distinct from matching against a *constant*. + +# Detailed design +[design]: #detailed-design + +The goal of this RFC is not to decide between semantic and structural +equality. Rather, the goal is to restrict pattern matching to that subset +of types where the two variants behave roughly the same. + +### The structural match attribute + +We will introduce an attribute `#[structural_match]` which can be +applied to struct and enum types. Explicit use of this attribute will +(naturally) be feature-gated. When converting a constant value into a +pattern, if the constant is of struct or enum type, we will check +whether this attribute is present on the struct -- if so, we will +convert the value as we do today. If not, we will report an error that +the struct/enum value cannot be used in a pattern. + +### Behavior of `#[derive(Eq)]` + +When deriving the `Eq` trait, we will add the `#[structural_match]` to +the type in question. Attributes added in this way will be **exempt from +the feature gate**. 
+ +## Exhaustiveness and dead-code checking + +We will treat user-defined structs "opaquely" for the purpose of +exhaustiveness and dead-code checking. This is required to allow for +semantic equality semantics in the future, since in that case we +cannot rely on `Eq` to be correctly implemented (e.g., it could always +return `false`, no matter what values are supplied to it, even though it's +not supposed to). The impact of this change has not been evaluated but +is expected to be **very** small, since in practice it is rather +challenging to successfully make an exhaustive match using +user-defined constants, unless they are something trivial like +newtype'd booleans (and, in that case, you can update the code to use +a more extended pattern). + +Similarly, dead code detection should treat constants in a +conservative fashion. That is, we can recognize that if there are two +arms using the same constant, the second one is dead code, even though +it may be that neither will match (e.g., `match foo { C => _, C => _ +}`). We will make no assumptions about two distinct constants, even if +we can concretely evaluate them to the same value. + +One **unresolved question** (described below) is what behavior to +adopt for constants that involve no user-defined types. There, the +definition of `Eq` is purely under our control, and we know that it +matches structural equality, so we can retain our current aggressive +analysis if desired. + +### Phasing + +We will not make this change instantaneously. Rather, for at least one +release cycle, users who are pattern matching on struct types that +lack `#[structural_match]` will be warned about imminent breakage. + +# Drawbacks +[drawbacks]: #drawbacks + +This is a breaking change, which means some people might have to +change their code. However, that is considered extremely unlikely, +because such users would have to be pattern matching on constants that +are not comparable for equality (this is likely a bug in any case).
+ +# Alternatives +[alternatives]: #alternatives + + **Limit matching to builtin types.** An earlier version of this RFC +limited matching to builtin types like integers (and tuples of +integers). This RFC is a generalization of that which also +accommodates struct types that derive `Eq`. + +**Embrace current semantics (structural equality).** Naturally we +could opt to keep the semantics as they are. The advantages and +disadvantages are discussed above. + +**Embrace semantic equality.** We could opt to just go straight +towards "semantic equality". However, it seems better to reset the +semantics to a base point that everyone can agree on, and then extend +from that base point. Moreover, adopting semantic equality straight +out would be a riskier breaking change, as it could silently change +the semantics of existing programs (whereas the current proposal only +causes compilation to fail, never changes what an existing program +will do). + +# Discussion thread summary + +This section summarizes various points that were raised in the +[internals thread] which are related to patterns but didn't seem to +fit elsewhere. + +**Overloaded patterns.** Some languages, notably Scala, permit +overloading of patterns. This is related to "semantic equality" in +that it involves executing custom, user-provided code at compilation +time. + +**Pattern synonyms.** Haskell offers a feature called "pattern +synonyms" and +[it was argued](https://internals.rust-lang.org/t/how-to-handle-pattern-matching-on-constants/2846/39?u=nikomatsakis) +that the current treatment of patterns can be viewed as a similar +feature. This may be true, but constants-in-patterns are lacking a +number of important features from pattern synonyms, such as bindings, +as +[discussed in this response](https://internals.rust-lang.org/t/how-to-handle-pattern-matching-on-constants/2846/48?u=nikomatsakis). 
+The author feels that pattern synonyms might be a useful feature, but +it would be better to design them as a first-class feature, not adapt +constants for that purpose. + +# Unresolved questions +[unresolved]: #unresolved-questions + +**What about exhaustiveness etc on builtin types?** Even if we ignore +user-defined types, there are complications around exhaustiveness +checking for constants of any kind related to associated constants and +other possible future extensions. For example, the following code +[fails to compile](http://is.gd/PJjNKl) because it contains dead-code: + +```rust +const X: u64 = 0; +const Y: u64 = 0; +fn bar(foo: u64) { + match foo { + X => { } + Y => { } + _ => { } + } +} +``` + +However, we would be unable to perform such an analysis in a more +generic context, such as with an associated constant: + +```rust +trait Trait { + const X: u64; + const Y: u64; +} + +fn bar<T: Trait>(foo: u64) { + match foo { + T::X => { } + T::Y => { } + _ => { } + } +} +``` + +Here, although it may well be that `T::X == T::Y`, we can't know for +sure. So, for consistency, we may wish to treat all constants opaquely +regardless of whether we are in a generic context or not. (However, it +also seems reasonable to make a "best effort" attempt at +exhaustiveness and dead pattern checking, erring on the conservative +side in those cases where constants cannot be fully evaluated.) + +A different argument in favor of treating all constants opaquely is +that the current behavior can leak details that perhaps were intended +to be hidden. For example, imagine that I define a fn `hash` that, +given a previous hash and a value, produces a new hash.
Because I am +lazy and prototyping my system, I decide for now to just ignore the +new value and pass the old hash through: + +```rust +const fn add_to_hash(prev_hash: u64, _value: u64) -> u64 { + prev_hash +} +``` + +Now I have some consumers of my library and they define a few constants: + +```rust +const HASH_OF_ZERO: u64 = add_to_hash(0, 0); +const HASH_OF_ONE: u64 = add_to_hash(0, 1); +``` + +And at some point they write a match statement: + +```rust +fn process_hash(h: u64) { + match h { + HASH_OF_ZERO => /* do something */, + HASH_OF_ONE => /* do something else */, + _ => /* do something else again */, + } +} +``` + +As before, what you get when you [compile this](http://is.gd/u5WtCo) +is a dead-code error, because the compiler can see that `HASH_OF_ZERO` +and `HASH_OF_ONE` are the same value. + +Part of the solution here might be making "unreachable patterns" a +warning and not an error. The author feels this would be a good idea +regardless (though not necessarily as part of this RFC). However, +that's not a complete solution, since -- at least for `bool` constants +-- the same issues arise if you consider exhaustiveness checking. + +On the other hand, it feels very silly for the compiler not to +understand that `match some_bool { true => ..., false => ... }` is +exhaustive. Furthermore, there are other ways for the values of +constants to "leak out", such as when part of a type like +`[u8; SOME_CONSTANT]` (a point made by both [arielb1][arielb1ac] and +[glaebhoerl][gac] on the [internals thread]). Therefore, the proper +way to address this question is perhaps to consider an explicit form +of "abstract constant".
+ +[arielb1ac]: https://internals.rust-lang.org/t/how-to-handle-pattern-matching-on-constants/2846/9?u=nikomatsakis +[gac]: https://internals.rust-lang.org/t/how-to-handle-pattern-matching-on-constants/2846/32?u=nikomatsakis From 9f7dc89ee6181327fe36459c8005913d61e8437f Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 5 Feb 2016 14:31:13 -0500 Subject: [PATCH 0732/1195] Merge and rename RFC #243. --- ...ception-handling.md => 0243-trait-based-exception-handling.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename active/{0000-trait-based-exception-handling.md => 0243-trait-based-exception-handling.md} (100%) diff --git a/active/0000-trait-based-exception-handling.md b/active/0243-trait-based-exception-handling.md similarity index 100% rename from active/0000-trait-based-exception-handling.md rename to active/0243-trait-based-exception-handling.md From 94390a2ce4330959d5a5ede7d69017e1d3480380 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 5 Feb 2016 14:47:00 -0500 Subject: [PATCH 0733/1195] Adjust to narrow to the keyword `catch` --- active/0243-trait-based-exception-handling.md | 233 +++++++----------- 1 file changed, 91 insertions(+), 142 deletions(-) diff --git a/active/0243-trait-based-exception-handling.md b/active/0243-trait-based-exception-handling.md index 507af12c767..d12380ffed5 100644 --- a/active/0243-trait-based-exception-handling.md +++ b/active/0243-trait-based-exception-handling.md @@ -12,7 +12,7 @@ The new constructs are: * An `?` operator for explicitly propagating "exceptions". - * A `try`..`catch` construct for conveniently catching and handling + * A `catch { ... }` expression for conveniently catching and handling "exceptions". The idea for the `?` operator originates from [RFC PR 204][204] by @@ -39,10 +39,11 @@ These constructs are strict additions to the existing language, and apart from the issue of keywords, the legality and behavior of all currently existing Rust programs is entirely unaffected. 
-The most important additions are a postfix `?` operator for propagating -"exceptions" and a `try`..`catch` block for catching and handling them. By an -"exception", for now, we essentially just mean the `Err` variant of a `Result`. - +The most important additions are a postfix `?` operator for +propagating "exceptions" and a `catch {..}` expression for catching +them. By an "exception", for now, we essentially just mean the `Err` +variant of a `Result`, though the Unresolved Questions includes some +discussion of extending to other types. ## `?` operator @@ -112,54 +113,31 @@ forwarding from `From`). The precise requirements for a conversion to be "like" a subtyping coercion are an open question; see the "Unresolved questions" section. - -## `try`..`catch` - -Like most other things in Rust, and unlike other languages that I know of, -`try`..`catch` is an *expression*. If no exception is thrown in the `try` block, -the `try`..`catch` evaluates to the value of `try` block; if an exception is -thrown, it is passed to the `catch` block, and the `try`..`catch` evaluates to -the value of the `catch` block. As with `if`..`else` expressions, the types of -the `try` and `catch` blocks must therefore unify. Unlike other languages, only -a single type of exception may be thrown in the `try` block (a `Result` only has -a single `Err` type); all exceptions are always caught; and there may only be -one `catch` block. This dramatically simplifies thinking about the behavior of -exception-handling code. - -There are two variations on this theme: - - 1. `try { EXPR }` - - In this case the `try` block evaluates directly to a `Result` containing - either the value of `EXPR`, or the exception which was thrown. For instance, - `try { foo()? }` is essentially equivalent to `foo()`. This can be useful if - you want to coalesce *multiple* potential exceptions - - `try { foo()?.bar()?.baz()? }` - into a single `Result`, which you wish to - then e.g. 
pass on as-is to another function, rather than analyze yourself. - - 2. `try { EXPR } catch { PAT => EXPR, PAT => EXPR, ... }` - - For example: - - try { - foo()?.bar()? - } catch { - Red(rex) => baz(rex), - Blue(bex) => quux(bex) - } - - Here the `catch` performs a `match` on the caught exception directly, using - any number of refutable patterns. This form is convenient for handling the - exception in-place. - +## `catch` expressions + +This RFC also introduces an expression form `catch {..}`, which serves +to "scope" the `?` operator. The `catch` operator executes its +associated block. If no exception is thrown, then the result is +`Ok(v)` where `v` is the value of the block. Otherwise, if an +exception is thrown, then the result is `Err(e)`. Note that unlike +other languages, a `catch` block always catches all errors, and they +must all be coercable to a single type, as a `Result` only has a +single `Err` type. This dramatically simplifies thinking about the +behavior of exception-handling code. + +Note that `catch { foo()? }` is essentially equivalent to `foo()`. +`catch` can be useful if you want to coalesce *multiple* potential +exceptions -- `try { foo()?.bar()?.baz()? }` -- into a single +`Result`, which you wish to then e.g. pass on as-is to another +function, rather than analyze yourself. (The last example could also +be expressed using a series of `and_then` calls.) # Detailed design The meaning of the constructs will be specified by a source-to-source -translation. We make use of an "early exit from any block" feature which doesn't -currently exist in the language, generalizes the current `break` and `return` -constructs, and is independently useful. - +translation. We make use of an "early exit from any block" feature +which doesn't currently exist in the language, generalizes the current +`break` and `return` constructs, and is independently useful. ## Early exit from any block @@ -250,42 +228,6 @@ are merely one way. 
}.bar()) } - * Construct: - - try { - foo()?.bar() - } catch { - A(a) => baz(a), - B(b) => quux(b) - } - - Shallow: - - match (try { - foo()?.bar() - }) { - Ok(a) => a, - Err(e) => match e { - A(a) => baz(a), - B(b) => quux(b) - } - } - - Deep: - - match ('here: { - Ok(match foo() { - Ok(a) => a, - Err(e) => break 'here Err(e.into()) - }.bar()) - }) { - Ok(a) => a, - Err(e) => match e { - A(a) => baz(a), - B(b) => quux(b) - } - } - The fully expanded translations get quite gnarly, but that is why it's good that you don't have to write them! @@ -325,9 +267,63 @@ independently. These questions should be satisfactorally resolved before stabilizing the relevant features, at the latest. +## Optional `match` sugar + +Originally, the RFC included the ability to `match` the errors caught +by a `catch` by writing `catch { .. } match { .. }`, which could be translated +as follows: + + * Construct: + + catch { + foo()?.bar() + } match { + A(a) => baz(a), + B(b) => quux(b) + } + + Shallow: + + match (catch { + foo()?.bar() + }) { + Ok(a) => a, + Err(e) => match e { + A(a) => baz(a), + B(b) => quux(b) + } + } + + Deep: + + match ('here: { + Ok(match foo() { + Ok(a) => a, + Err(e) => break 'here Err(e.into()) + }.bar()) + }) { + Ok(a) => a, + Err(e) => match e { + A(a) => baz(a), + B(b) => quux(b) + } + } + +However, it was removed for the following reasons: + +- The `catch` (originally: `try`) keyword adds the real expressive "step up" here, the `match` (originally: `catch`) was just sugar for `unwrap_or`. +- It would be easy to add further sugar in the future, once we see how `catch` is used (or not used) in practice. +- There was some concern about potential user confusion about two aspects: + - `catch { }` yields a `Result` but `catch { } match { }` yields just `T`; + - `catch { } match { }` handles all kinds of errors, unlike `try/catch` in other languages which let you pick and choose. 
+ +It may be worth adding such a sugar in the future, or perhaps a +variant that binds irrefutably and does not immediately lead into a +`match` block. + ## Choice of keywords -The RFC to this point uses the keywords `try`..`catch`, but there are a number +The RFC to this point uses the keyword `catch`, but there are a number of other possibilities, each with different advantages and drawbacks: * `try { ... } catch { ... }` @@ -358,7 +354,6 @@ Among the considerations: * Language-level backwards compatibility when adding new keywords. I'm not sure how this could or should be handled. - ## Semantics for "upcasting" What should the contract for a `From`/`Into` `impl` be? Are these even the right @@ -401,12 +396,20 @@ Some further thoughts and possibilities on this matter, only as brainstorming: (This perhaps ties into the subtyping angle: `Ipv4Addr` is clearly not a supertype of `u32`.) - ## Forwards-compatibility If we later want to generalize this feature to other types such as `Option`, as described below, will we be able to do so while maintaining backwards-compatibility? +## Monadic do notation + +There have been many comparisons drawn between this syntax and monadic +do notation. Before stabilizing, we should determine whether we plan +to make changes to better align this feature with a possible `do` +notation (for example, by removing the implicit `Ok` at the end of a +`catch` block). Note that such a notation would have to extend the +standard monadic bind to accommodate rich control flow like `break`, +`continue`, and `return`. # Drawbacks @@ -466,60 +469,6 @@ described below, will we be able to do so while maintaining backwards-compatibil This RFC doesn't propose doing so at this time, but as it would be an independently useful feature, it could be added as well. -## An additional `catch` form to bind the caught exception irrefutably - -The `catch` described above immediately passes the caught exception into a -`match` block. 
It may sometimes be desirable to instead bind it directly to a -single variable. That might look like this: - - try { EXPR } catch IRR-PAT { EXPR } - -Where `catch` is followed by any irrefutable pattern (as with `let`). - -For example: - - try { - foo()?.bar()? - } catch e { - let x = baz(e); - quux(x, e); - } - -While it may appear to be extravagant to provide both forms, there is reason to -do so: either form on its own leads to unavoidable rightwards drift under some -circumstances. - -The first form leads to rightwards drift if one wishes to do more complex -multi-statement work with the caught exception: - - try { - foo()?.bar()? - } catch { - e => { - let x = baz(e); - quux(x, e); - } - } - -This single case arm is quite redundant and unfortunate. - -The second form leads to rightwards drift if one wishes to `match` on the caught -exception: - - try { - foo()?.bar()? - } catch e { - match e { - Red(rex) => baz(rex), - Blue(bex) => quux(bex) - } - } - -This `match e` is quite redundant and unfortunate. - -Therefore, neither form can be considered strictly superior to the other, and it -may be preferable to simply provide both. 
- ## `throw` and `throws` It is possible to carry the exception handling analogy further and also add From 00e839fe4d5a7dab29db13476fa723166375dead Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 5 Feb 2016 15:54:29 -0500 Subject: [PATCH 0734/1195] Update "metadata" --- active/0243-trait-based-exception-handling.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/active/0243-trait-based-exception-handling.md b/active/0243-trait-based-exception-handling.md index d12380ffed5..d2155cb689c 100644 --- a/active/0243-trait-based-exception-handling.md +++ b/active/0243-trait-based-exception-handling.md @@ -1,6 +1,7 @@ +- Feature-gates: `question_mark`, `try_catch` - Start Date: 2014-09-16 -- RFC PR #: (leave this empty) -- Rust Issue #: (leave this empty) +- RFC PR #: [rust-lang/rfcs#243](https://github.com/rust-lang/rfcs/pull/243) +- Rust Issue #: [rust-lang/rust#31436](https://github.com/rust-lang/rust/issues/31436) # Summary From d9e79e4c57c7dcc1c2116e18912ef93db56ae5ff Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 5 Feb 2016 17:59:03 -0500 Subject: [PATCH 0735/1195] s/try/catch --- active/0243-trait-based-exception-handling.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/active/0243-trait-based-exception-handling.md b/active/0243-trait-based-exception-handling.md index d2155cb689c..f89420e020f 100644 --- a/active/0243-trait-based-exception-handling.md +++ b/active/0243-trait-based-exception-handling.md @@ -128,7 +128,7 @@ behavior of exception-handling code. Note that `catch { foo()? }` is essentially equivalent to `foo()`. `catch` can be useful if you want to coalesce *multiple* potential -exceptions -- `try { foo()?.bar()?.baz()? }` -- into a single +exceptions -- `catch { foo()?.bar()?.baz()? }` -- into a single `Result`, which you wish to then e.g. pass on as-is to another function, rather than analyze yourself. (The last example could also be expressed using a series of `and_then` calls.) 
From ee281d7119bf221f3b3e64c8dba2e7bcbad15e22 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 5 Feb 2016 18:03:29 -0500 Subject: [PATCH 0736/1195] Correct a few more references to `try` in RFC 243 --- active/0243-trait-based-exception-handling.md | 31 ++++++++----------- 1 file changed, 13 insertions(+), 18 deletions(-) diff --git a/active/0243-trait-based-exception-handling.md b/active/0243-trait-based-exception-handling.md index f89420e020f..c26500bee48 100644 --- a/active/0243-trait-based-exception-handling.md +++ b/active/0243-trait-based-exception-handling.md @@ -61,11 +61,11 @@ Naturally, in this case the types of the "exceptions thrown by" `foo()` and `bar()` must unify. Like the current `try!()` macro, the `?` operator will also perform an implicit "upcast" on the exception type. -When used outside of a `try` block, the `?` operator propagates the exception to +When used outside of a `catch` block, the `?` operator propagates the exception to the caller of the current function, just like the current `try!` macro does. (If the return type of the function isn't a `Result`, then this is a type error.) -When used inside a `try` block, it propagates the exception up to the innermost -`try` block, as one would expect. +When used inside a `catch` block, it propagates the exception up to the innermost +`catch` block, as one would expect. Requiring an explicit `?` operator to propagate exceptions strikes a very pleasing balance between completely automatic exception propagation, which most @@ -203,7 +203,7 @@ are merely one way. Err(e) => break 'here Err(e.into()) } - Where `'here` refers to the innermost enclosing `try` block, or to `'fn` if + Where `'here` refers to the innermost enclosing `catch` block, or to `'fn` if there is none. The `?` operator has the same precedence as `.`. @@ -247,21 +247,18 @@ items. Without any attempt at completeness, here are some things which should be true: - * `try { foo() } ` = `Ok(foo())` - * `try { Err(e)? 
} ` = `Err(e.into())` - * `try { try_foo()? } ` = `try_foo().map_err(Into::into)` - * `try { Err(e)? } catch { e => e }` = `e.into()` - * `try { Ok(try_foo()?) } catch { e => Err(e) }` = `try_foo().map_err(Into::into)` + * `catch { foo() } ` = `Ok(foo())` + * `catch { Err(e)? } ` = `Err(e.into())` + * `catch { try_foo()? } ` = `try_foo().map_err(Into::into)` (In the above, `foo()` is a function returning any type, and `try_foo()` is a function returning a `Result`.) ## Feature gates -The two major features here, the `?` syntax and the `try`/`catch` -syntax, will be tracked by independent feature gates. Each of the -features has a distinct motivation, and we should evaluate them -independently. +The two major features here, the `?` syntax and `catch` expressions, +will be tracked by independent feature gates. Each of the features has +a distinct motivation, and we should evaluate them independently. # Unresolved questions @@ -435,11 +432,9 @@ standard monadic bind to accommodate rich control flow like `break`, * Don't. - * Only add the `?` operator, but not `try` and `try`..`catch`. + * Only add the `?` operator, but not `catch` expressions. - * Only add `?` and `try`, but not `try`..`catch`. - - * Instead of a built-in `try`..`catch` construct, attempt to define one using + * Instead of a built-in `catch` construct, attempt to define one using macros. However, this is likely to be awkward because, at least, macros may only have their contents as a single block, rather than two. Furthermore, macros are excellent as a "safety net" for features which we forget to add @@ -477,7 +472,7 @@ It is possible to carry the exception handling analogy further and also add `throw` is very simple: `throw EXPR` is essentially the same thing as `Err(EXPR)?`; in other words it throws the exception `EXPR` to the innermost -`try` block, or to the function's caller if there is none. +`catch` block, or to the function's caller if there is none. 
A `throws` clause on a function: From 2cd24ce2565e5490b3e38e247130640e3ffe750e Mon Sep 17 00:00:00 2001 From: Vadim Petrochenkov Date: Sat, 6 Feb 2016 22:10:28 +0300 Subject: [PATCH 0737/1195] RFC: `..` in patterns --- text/0000-dotdot-in-patterns.md | 124 ++++++++++++++++++++++++++++++++ 1 file changed, 124 insertions(+) create mode 100644 text/0000-dotdot-in-patterns.md diff --git a/text/0000-dotdot-in-patterns.md b/text/0000-dotdot-in-patterns.md new file mode 100644 index 00000000000..c6ae4bc9fa4 --- /dev/null +++ b/text/0000-dotdot-in-patterns.md @@ -0,0 +1,124 @@ +- Feature Name: dotdot_in_patterns +- Start Date: 2016-02-06 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Permit the `..` pattern fragment in more contexts. + +# Motivation +[motivation]: #motivation + +The pattern fragment `..` can be used in some patterns to denote several elements in list contexts. +However, it doesn't always compile when used in such contexts. +One can expect the ability to match tuple variants like `V(u8, u8, u8)` with patterns like +`V(x, ..)` or `V(.., z)`, but the compiler rejects such patterns currently despite accepting +very similar `V(..)`. + +This RFC is intended to "complete" the feature and make it work in all possible list contexts, +making the language a bit more convenient and consistent. + +# Detailed design +[design]: #detailed-design + +Let's list all the patterns currently existing in the language, that contain lists of subpatterns: + +``` +// Struct patterns. +S { field1, field2, ..., fieldN } + +// Tuple struct patterns. +S(field1, field2, ..., fieldN) + +// Tuple patterns. +(field1, field2, ..., fieldN) + +// Slice patterns. +[elem1, elem2, ..., elemN] +``` +In all the patterns above, except for struct patterns, field/element positions are significant. + +Now list all the contexts that currently permit the `..` pattern fragment: +``` +// Struct patterns, the last position. +S { subpat1, subpat2, ..
} + +// Tuple struct patterns, the last and the only position, no extra subpatterns allowed. +S(..) + +// Slice patterns, the last position. +[subpat1, subpat2, ..] +// Slice patterns, the first position. +[.., subpatN-1, subpatN] +// Slice patterns, any other position. +[subpat1, .., subpatN] +// Slice patterns, any of the above with a subslice binding. +// (The binding is not actually a binding, but one more pattern bound to the sublist, but this is +// not important for our discussion.) +[subpat1, binding.., subpatN] +``` +Something is obviously missing, let's fill in the missing parts. + +``` +// Struct patterns, the last position. +S { subpat1, subpat2, .. } +// **NOT PROPOSED**: Struct patterns, any position. +// Since named struct fields are not positional, there's essentially no sense in placing the `..` +// anywhere except for one conventionally chosen position (the last one) or in sublist bindings, +// so we don't propose extensions to struct patterns. +S { subpat1, .., subpatN } +S { subpat1, binding.., subpatN } + +// Tuple struct patterns, the last and the only position, no extra subpatterns allowed. +S(..) +// **NEW**: Tuple struct patterns, any position. +S(subpat1, subpat2, ..) +S(.., subpatN-1, subpatN) +S(subpat1, .., subpatN) +// **NEW**: Tuple struct patterns, any position with a sublist binding. +// The binding has a tuple type. +S(subpat1, binding.., subpatN) + +// **NEW**: Tuple patterns, any position. +(subpat1, subpat2, ..) +(.., subpatN-1, subpatN) +(subpat1, .., subpatN) +// **NEW**: Tuple patterns, any position with a sublist binding. +// The binding has a tuple type. +(subpat1, binding.., subpatN) + +// Slice patterns, the last position. +[subpat1, subpat2, ..] +// Slice patterns, the first position. +[.., subpatN-1, subpatN] +// Slice patterns, any other position. +[subpat1, .., subpatN] +// Slice patterns, any of the above with a subslice binding. 
+[subpat1, binding.., subpatN] +``` + +Trailing comma is not allowed after `..` in the last position by analogy with existing slice and +struct patterns. + +This RFC is not critically important and can be rolled out in parts, for example, bare `..` first, +`..` with a sublist binding eventually. + +# Drawbacks +[drawbacks]: #drawbacks + +None. + +# Alternatives +[alternatives]: #alternatives + +None. + +# Unresolved questions +[unresolved]: #unresolved-questions + +Sublist binding syntax conflicts with possible exclusive range patterns +`begin .. end`/`begin..`/`..end`. This problem already exists for slice patterns and has to be +solved independently from extensions to `..`. +This RFC simply selects the same syntax that slice patterns already have. From 4e9e8433bdf05927f24870f69216eed640707306 Mon Sep 17 00:00:00 2001 From: Vadim Petrochenkov Date: Mon, 8 Feb 2016 21:06:40 +0300 Subject: [PATCH 0738/1195] Restrict sublist bindings in tuple and tuple struct patterns --- text/0000-dotdot-in-patterns.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/text/0000-dotdot-in-patterns.md b/text/0000-dotdot-in-patterns.md index c6ae4bc9fa4..954698d81a3 100644 --- a/text/0000-dotdot-in-patterns.md +++ b/text/0000-dotdot-in-patterns.md @@ -79,6 +79,8 @@ S(.., subpatN-1, subpatN) S(subpat1, .., subpatN) // **NEW**: Tuple struct patterns, any position with a sublist binding. // The binding has a tuple type. +// By ref bindings are not allowed, because layouts of S(A, B, C, D) and (B, C) are not necessarily +// compatible (e.g. field reordering is possible). S(subpat1, binding.., subpatN) // **NEW**: Tuple patterns, any position. @@ -87,6 +89,8 @@ S(subpat1, binding.., subpatN) (subpat1, .., subpatN) // **NEW**: Tuple patterns, any position with a sublist binding. // The binding has a tuple type. +// By ref bindings are not allowed, because layouts of (A, B, C, D) and (B, C) are not necessarily +// compatible (e.g. field reordering is possible). 
(subpat1, binding.., subpatN) // Slice patterns, the last position. @@ -96,6 +100,7 @@ S(subpat1, binding.., subpatN) // Slice patterns, any other position. [subpat1, .., subpatN] // Slice patterns, any of the above with a subslice binding. +// By ref bindings are allowed, slices and subslices always have compatible layouts. [subpat1, binding.., subpatN] ``` @@ -113,7 +118,7 @@ None. # Alternatives [alternatives]: #alternatives -None. +Do not permit sublist bindings in tuples and tuple structs at all. # Unresolved questions [unresolved]: #unresolved-questions From 8de222b6a4013e4f19a70727a3c4e75a5a09f9af Mon Sep 17 00:00:00 2001 From: Alex Burka Date: Mon, 8 Feb 2016 14:51:13 -0500 Subject: [PATCH 0739/1195] Amend RFC 550 with misc. follow set corrections RFC 550 introduced follow sets for future-proofing macro definitions. They have been tweaked since then to reflect the realities of extant Rust syntax, or to bail out crates that were too-big-to-fail. The idea (insofar as I understand it) is that the follow set for a fragment specifier `X` contains any tokens or symbols that could follow `X` in current (or planned) valid Rust syntax. In particular, `FOLLOW(X)` does _not_ contain anything that could be part of `X` (silly example: `+` is not in `FOLLOW(expr)` because obviously `+` may be in the middle of an expression). A macro may not accept syntax that has an `X` followed by something not in `FOLLOW(X)`. That way, if in the future we extend Rust syntax so that `X` can encompass more things (that were previously not in the follow set), the macro doesn't break. The upshot is that if there is something that can _currently_ follow `X` in valid Rust, but it is not in `FOLLOW(X)`, it is simply an unnecessary roadblock to macro writing, because a change that would break a macro would also break regular syntax. This RFC amendment proposes to remove two of those roadblocks. (If there are more that I missed, please suggest them.) 
Specifically: - Allow `ty` (and `path`) fragments to be followed by `block` fragments. Precedent is function and closure declarations, i.e. `fn foo() -> TYPE BLOCK`. Indeed you can already write `$t:ty { $($foo:tt)* }` in a macro rule, so `$t:ty $b:block` is natural. And `FOLLOW(path) = FOLLOW(ty)` because a path can name a type. - Allow `pat` fragments to be followed by `:`. Precedent is let-bindings and function/closure arguments, i.e. `let PAT: TYPE = EXPR`. --- text/0550-macro-future-proofing.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0550-macro-future-proofing.md b/text/0550-macro-future-proofing.md index 705ea1880d0..845e3584822 100644 --- a/text/0550-macro-future-proofing.md +++ b/text/0550-macro-future-proofing.md @@ -414,9 +414,9 @@ specifier for the NT. The current legal fragment specifiers are: `item`, `block`, `stmt`, `pat`, `expr`, `ty`, `ident`, `path`, `meta`, and `tt`. -- `FOLLOW(pat)` = `{FatArrow, Comma, Eq, Or, Ident(if), Ident(in)}` +- `FOLLOW(pat)` = `{FatArrow, Comma, Eq, Or, Ident(if), Ident(in), Colon}` - `FOLLOW(expr)` = `{FatArrow, Comma, Semicolon}` -- `FOLLOW(ty)` = `{OpenDelim(Brace), Comma, FatArrow, Colon, Eq, Gt, Semi, Or, Ident(as), Ident(where), OpenDelim(Bracket)}` +- `FOLLOW(ty)` = `{OpenDelim(Brace), Comma, FatArrow, Colon, Eq, Gt, Semi, Or, Ident(as), Ident(where), OpenDelim(Bracket), Interpolated(NtBlock(_))}` - `FOLLOW(stmt)` = `FOLLOW(expr)` - `FOLLOW(path)` = `FOLLOW(ty)` - `FOLLOW(block)` = any token From c9bdc713ebcec62f19d23693cfda3d4fed80240e Mon Sep 17 00:00:00 2001 From: Nicholas Mazzuca Date: Tue, 9 Feb 2016 17:39:52 -0800 Subject: [PATCH 0740/1195] Move 0243 from active/ to text/ --- {active => text}/0243-trait-based-exception-handling.md | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename {active => text}/0243-trait-based-exception-handling.md (100%) diff --git a/active/0243-trait-based-exception-handling.md b/text/0243-trait-based-exception-handling.md similarity index 
100% rename from active/0243-trait-based-exception-handling.md rename to text/0243-trait-based-exception-handling.md From 4e4c756e6df95167dda553251cc37a75a294f002 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 10 Feb 2016 16:49:16 -0800 Subject: [PATCH 0741/1195] RFC 1317 is the Rust Language Server --- text/{0000-ide.md => 1317-ide.md} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename text/{0000-ide.md => 1317-ide.md} (99%) diff --git a/text/0000-ide.md b/text/1317-ide.md similarity index 99% rename from text/0000-ide.md rename to text/1317-ide.md index 5e41fe2163e..8c31d2703cb 100644 --- a/text/0000-ide.md +++ b/text/1317-ide.md @@ -1,6 +1,6 @@ - Feature Name: n/a - Start Date: 2015-10-13 -- RFC PR: (leave this empty) +- RFC PR: [rust-lang/rfcs#1317](https://github.com/rust-lang/rfcs/pull/1317) - Rust Issue: (leave this empty) # Summary From 82d48b36568286b93fc2ce55f89594024f421a3b Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 10 Feb 2016 16:51:42 -0800 Subject: [PATCH 0742/1195] Add a tracking issue for RFC 1317 --- text/1317-ide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/1317-ide.md b/text/1317-ide.md index 8c31d2703cb..579c4c1d96a 100644 --- a/text/1317-ide.md +++ b/text/1317-ide.md @@ -1,7 +1,7 @@ - Feature Name: n/a - Start Date: 2015-10-13 - RFC PR: [rust-lang/rfcs#1317](https://github.com/rust-lang/rfcs/pull/1317) -- Rust Issue: (leave this empty) +- Rust Issue: [rust-lang/rust#31548](https://github.com/rust-lang/rust/issues/31548) # Summary From e767f331f24f6690ccbb9c5acb275c74a316d1e4 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 10 Feb 2016 16:58:39 -0800 Subject: [PATCH 0743/1195] RFC 1415 is deprecating much of std::os::*::raw --- text/{0000-trim-std-os.md => 1415-trim-std-os.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-trim-std-os.md => 1415-trim-std-os.md} (97%) diff --git a/text/0000-trim-std-os.md b/text/1415-trim-std-os.md similarity index 97% 
rename from text/0000-trim-std-os.md rename to text/1415-trim-std-os.md index 1496aff06f1..6fff1115fb9 100644 --- a/text/0000-trim-std-os.md +++ b/text/1415-trim-std-os.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2015-12-18 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1415](https://github.com/rust-lang/rfcs/pull/1415) +- Rust Issue: [rust-lang/rust#31549](https://github.com/rust-lang/rust/issues/31549) # Summary [summary]: #summary From 8f8d548e981cc9b91b0045be298a354ae31a0f79 Mon Sep 17 00:00:00 2001 From: Alex Burka Date: Wed, 10 Feb 2016 23:57:42 -0500 Subject: [PATCH 0744/1195] revert addition of Colon to FOLLOW(pat) --- text/0550-macro-future-proofing.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0550-macro-future-proofing.md b/text/0550-macro-future-proofing.md index 845e3584822..0d286c991bd 100644 --- a/text/0550-macro-future-proofing.md +++ b/text/0550-macro-future-proofing.md @@ -414,7 +414,7 @@ specifier for the NT. The current legal fragment specifiers are: `item`, `block`, `stmt`, `pat`, `expr`, `ty`, `ident`, `path`, `meta`, and `tt`. -- `FOLLOW(pat)` = `{FatArrow, Comma, Eq, Or, Ident(if), Ident(in), Colon}` +- `FOLLOW(pat)` = `{FatArrow, Comma, Eq, Or, Ident(if), Ident(in)}` - `FOLLOW(expr)` = `{FatArrow, Comma, Semicolon}` - `FOLLOW(ty)` = `{OpenDelim(Brace), Comma, FatArrow, Colon, Eq, Gt, Semi, Or, Ident(as), Ident(where), OpenDelim(Bracket), Interpolated(NtBlock(_))}` - `FOLLOW(stmt)` = `FOLLOW(expr)` From cc1282c9a6aa4909016e5d564b306388832f313f Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Wed, 10 Feb 2016 17:13:27 -0500 Subject: [PATCH 0745/1195] Clarify that new lints are are T-lang changes They therefore require an RFC. 
--- compiler_changes.md | 1 + lang_changes.md | 4 +++- 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/compiler_changes.md b/compiler_changes.md index 75137743041..4b9f8cdf17f 100644 --- a/compiler_changes.md +++ b/compiler_changes.md @@ -21,6 +21,7 @@ submitted later if there is scope for large changes to the language RFC. ## Changes which need an RFC +* New lints (these fall under the lang team) * Large refactorings or redesigns of the compiler * Changing the API presented to syntax extensions or other compiler plugins in non-trivial ways diff --git a/lang_changes.md b/lang_changes.md index 7e7e6a732e7..bc09d9a417e 100644 --- a/lang_changes.md +++ b/lang_changes.md @@ -1,6 +1,8 @@ # RFC policy - language design -Pretty much every change to the language needs an RFC. +Pretty much every change to the language needs an RFC. Note that new +lints (or major changes to an existing lint) are considered changes to +the language. Language RFCs are managed by the language sub-team, and tagged `T-lang`. The language sub-team will do an initial triage of new PRs within a week of From 087bb92ea72bacabcce8fa636680198d4ad1bbe2 Mon Sep 17 00:00:00 2001 From: Simon Sapin Date: Fri, 12 Feb 2016 18:31:29 +0100 Subject: [PATCH 0746/1195] replace_range: rename to splice. --- text/0000-replace-slice.md | 35 +++++++++++++++-------------------- 1 file changed, 15 insertions(+), 20 deletions(-) diff --git a/text/0000-replace-slice.md b/text/0000-replace-slice.md index c6437981df1..fa2834148f5 100644 --- a/text/0000-replace-slice.md +++ b/text/0000-replace-slice.md @@ -1,4 +1,4 @@ -- Feature Name: replace-slice +- Feature Name: splice - Start Date: 2015-12-28 - RFC PR: - Rust Issue: @@ -6,7 +6,7 @@ # Summary [summary]: #summary -Add a `replace_slice` method to `Vec` and `String` removes a range of elements, +Add a `splice` method to `Vec` and `String` removes a range of elements, and replaces it in place with a given sequence of values. 
The new sequence does not necessarily have the same length as the range it replaces. @@ -50,12 +50,12 @@ use collections::range::RangeArgument; use std::ptr; trait ReplaceVecSlice { - fn replace_slice(&mut self, range: R, iterable: I) + fn splice(&mut self, range: R, iterable: I) where R: RangeArgument, I: IntoIterator, I::IntoIter: ExactSizeIterator; } impl ReplaceVecSlice for Vec { - fn replace_slice(&mut self, range: R, iterable: I) + fn splice(&mut self, range: R, iterable: I) where R: RangeArgument, I: IntoIterator, I::IntoIter: ExactSizeIterator { let len = self.len(); @@ -117,11 +117,11 @@ impl ReplaceVecSlice for Vec { } trait ReplaceStringSlice { - fn replace_slice(&mut self, range: R, s: &str) where R: RangeArgument; + fn splice(&mut self, range: R, s: &str) where R: RangeArgument; } impl ReplaceStringSlice for String { - fn replace_slice(&mut self, range: R, s: &str) where R: RangeArgument { + fn splice(&mut self, range: R, s: &str) where R: RangeArgument { if let Some(&start) = range.start() { assert!(self.is_char_boundary(start)); } @@ -130,19 +130,19 @@ impl ReplaceStringSlice for String { } unsafe { self.as_mut_vec() - }.replace_slice(range, s.bytes()) + }.splice(range, s.bytes()) } } #[test] fn it_works() { let mut v = vec![1, 2, 3, 4, 5]; - v.replace_slice(2..4, [10, 11, 12].iter().cloned()); + v.splice(2..4, [10, 11, 12].iter().cloned()); assert_eq!(v, &[1, 2, 10, 11, 12, 5]); - v.replace_slice(1..3, Some(20)); + v.splice(1..3, Some(20)); assert_eq!(v, &[1, 20, 11, 12, 5]); let mut s = "Hello, world!".to_owned(); - s.replace_slice(7.., "世界!"); + s.splice(7.., "世界!"); assert_eq!(s, "Hello, 世界!"); } @@ -150,7 +150,7 @@ fn it_works() { #[should_panic] fn char_boundary() { let mut s = "Hello, 世界!".to_owned(); - s.replace_slice(..8, "") + s.splice(..8, "") } ``` @@ -184,9 +184,9 @@ not every program needs it, and standard library growth has a maintainance cost. 
With `ExactSizeIterator` it only happens when `ExactSizeIterator::len` is incorrect which means that someone is doing something wrong. -* Alternatively, should `replace_slice` panic when `ExactSizeIterator::len` is incorrect? +* Alternatively, should `splice` panic when `ExactSizeIterator::len` is incorrect? -* It would be nice to be able to `Vec::replace_slice` with a slice +* It would be nice to be able to `Vec::splice` with a slice without writing `.iter().cloned()` explicitly. This is possible with the same trick as for the `Extend` trait ([RFC 839](https://github.com/rust-lang/rfcs/blob/master/text/0839-embrace-extend-extinguish.md)): @@ -194,10 +194,10 @@ not every program needs it, and standard library growth has a maintainance cost. ```rust impl<'a, T: 'a> ReplaceVecSlice<&'a T> for Vec where T: Copy { - fn replace_slice(&mut self, range: R, iterable: I) + fn splice(&mut self, range: R, iterable: I) where R: RangeArgument, I: IntoIterator, I::IntoIter: ExactSizeIterator { - self.replace_slice(range, iterable.into_iter().cloned()) + self.splice(range, iterable.into_iter().cloned()) } } ``` @@ -206,11 +206,6 @@ not every program needs it, and standard library growth has a maintainance cost. (By the way, what was the motivation for `Extend` being a trait rather than inherent methods, before RFC 839?) -* Naming. - I accidentally typed `replace_range` instead of `replace_slice` several times - while typing up this RFC. - Update: I’m told `splice` is how this operation is called. - * The method could return an iterator of the replaced elements. Nothing would happen when the method is called, only when the returned iterator is advanced or dropped. 
From ae0b0cd0de07af9d13f5c17ba0d15d0067f67f99 Mon Sep 17 00:00:00 2001 From: Simon Sapin Date: Fri, 12 Feb 2016 18:53:01 +0100 Subject: [PATCH 0747/1195] splice: Return an iterator --- text/0000-replace-slice.md | 127 +++++++++++++------------------------ 1 file changed, 43 insertions(+), 84 deletions(-) diff --git a/text/0000-replace-slice.md b/text/0000-replace-slice.md index fa2834148f5..7e4d0938cfa 100644 --- a/text/0000-replace-slice.md +++ b/text/0000-replace-slice.md @@ -9,6 +9,8 @@ Add a `splice` method to `Vec` and `String` removes a range of elements, and replaces it in place with a given sequence of values. The new sequence does not necessarily have the same length as the range it replaces. +In the `Vec` case, this method returns an iterator of the elements being moved out, like `drain`. + # Motivation [motivation]: #motivation @@ -49,78 +51,44 @@ extern crate collections; use collections::range::RangeArgument; use std::ptr; -trait ReplaceVecSlice { - fn splice(&mut self, range: R, iterable: I) - where R: RangeArgument, I: IntoIterator, I::IntoIter: ExactSizeIterator; +trait VecSplice { + fn splice(&mut self, range: R, iterable: I) -> Splice + where R: RangeArgument, I: IntoIterator; } -impl ReplaceVecSlice for Vec { - fn splice(&mut self, range: R, iterable: I) - where R: RangeArgument, I: IntoIterator, I::IntoIter: ExactSizeIterator +impl VecSplice for Vec { + fn splice(&mut self, range: R, iterable: I) -> Splice + where R: RangeArgument, I: IntoIterator { - let len = self.len(); - let range_start = *range.start().unwrap_or(&0); - let range_end = *range.end().unwrap_or(&len); - assert!(range_start <= range_end); - assert!(range_end <= len); - let mut iter = iterable.into_iter(); - // Overwrite range - for i in range_start..range_end { - if let Some(new_element) = iter.next() { - unsafe { - *self.get_unchecked_mut(i) = new_element - } - } else { - // Iterator shorter than range - self.drain(i..range_end); - return - } - } - // Insert rest - let iter_len 
= iter.len(); - let elements_after = len - range_end; - let free_space_start = range_end; - let free_space_end = free_space_start + iter_len; - - if iter_len > 0 { - // FIXME: merge the reallocating case with the first ptr::copy below? - self.reserve(iter_len); - - let p = self.as_mut_ptr(); - unsafe { - // In case iter.next() panics, leak some elements rather than risk double-freeing them. - self.set_len(free_space_start); - // Shift everything over to make space (duplicating some elements). - ptr::copy(p.offset(free_space_start as isize), - p.offset(free_space_end as isize), - elements_after); - for i in free_space_start..free_space_end { - if let Some(new_element) = iter.next() { - *self.get_unchecked_mut(i) = new_element - } else { - // Iterator shorter than its ExactSizeIterator::len() - ptr::copy(p.offset(free_space_end as isize), - p.offset(i as isize), - elements_after); - self.set_len(i + elements_after); - return - } - } - self.set_len(free_space_end + elements_after); - } - } - // Iterator longer than its ExactSizeIterator::len(), degenerate to quadratic time - for (new_element, i) in iter.zip(free_space_end..) { - self.insert(i, new_element); - } + unimplemented!() // FIXME: Fill in when exact semantics are decided. } } -trait ReplaceStringSlice { +struct Splice { + vec: &mut Vec, + range: Range + iter: I::IntoIter, + // FIXME: Fill in when exact semantics are decided. +} + +impl Iterator for Splice { + type Item = I::Item; + fn next(&mut self) -> Option { + unimplemented!() // FIXME: Fill in when exact semantics are decided. + } +} + +impl Drop for Splice { + fn drop(&mut self) { + unimplemented!() // FIXME: Fill in when exact semantics are decided. 
+ } +} + +trait StringSplice { fn splice(&mut self, range: R, s: &str) where R: RangeArgument; } -impl ReplaceStringSlice for String { +impl StringSplice for String { fn splice(&mut self, range: R, s: &str) where R: RangeArgument { if let Some(&start) = range.start() { assert!(self.is_char_boundary(start)); @@ -154,12 +122,14 @@ fn char_boundary() { } ``` -This implementation defends against `ExactSizeIterator::len()` being incorrect. -If `len()` is too high, it reserves more capacity than necessary -and does more copying than necessary, -but stays in linear time. -If `len()` is too low, the algorithm degenerates to quadratic time -using `Vec::insert` for each additional new element. +The elements of the vector after the range first be moved by an offset of +the lower bound of `Iterator::size_hint` minus the length of the range. +Then, depending on the real length of the iterator: + +* If it’s the same as the lower bound, we’re done. +* If it’s lower than the lower bound (which was then incorrect), the elements will be moved once more. +* If it’s higher, the extra iterator items well be collected into a temporary `Vec` + in order to know exactly how many there are, and the elements after will be moved once more. # Drawbacks [drawbacks]: #drawbacks @@ -178,13 +148,8 @@ not every program needs it, and standard library growth has a maintainance cost. # Unresolved questions [unresolved]: #unresolved-questions -* Should the `ExactSizeIterator` bound be removed? - The lower bound of `Iterator::size_hint` could be used instead of `ExactSizeIterator::len`, - but the degenerate quadratic time case would become “normal”. - With `ExactSizeIterator` it only happens when `ExactSizeIterator::len` is incorrect - which means that someone is doing something wrong. - -* Alternatively, should `splice` panic when `ExactSizeIterator::len` is incorrect? +* Should the input iterator be consumed incrementally at each `Splice::next` call, + or only in `Splice::drop`? 
* It would be nice to be able to `Vec::splice` with a slice without writing `.iter().cloned()` explicitly. @@ -193,9 +158,9 @@ not every program needs it, and standard library growth has a maintainance cost. accept iterators of `&T` as well as iterators of `T`: ```rust - impl<'a, T: 'a> ReplaceVecSlice<&'a T> for Vec where T: Copy { + impl<'a, T: 'a> VecSplice<&'a T> for Vec where T: Copy { fn splice(&mut self, range: R, iterable: I) - where R: RangeArgument, I: IntoIterator, I::IntoIter: ExactSizeIterator + where R: RangeArgument, I: IntoIterator { self.splice(range, iterable.into_iter().cloned()) } @@ -206,12 +171,6 @@ not every program needs it, and standard library growth has a maintainance cost. (By the way, what was the motivation for `Extend` being a trait rather than inherent methods, before RFC 839?) -* The method could return an iterator of the replaced elements. - Nothing would happen when the method is called, - only when the returned iterator is advanced or dropped. - There’s is precedent of this in `Vec::drain`, - though the input iterator being lazily consumed could be surprising. - * If coherence rules and backward-compatibility allow it, this functionality could be added to `Vec::insert` and `String::insert` by overloading them / making them more generic. 
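Reviewer note, not part of the patch series: the `splice` draft in the patch above leaves its bodies as `unimplemented!()`, so the behavior exercised by its `it_works` test is easier to follow with a runnable sketch. The sketch below does not use the draft's `VecSplice`/`StringSplice` traits. It assumes the methods later stabilized in the standard library: `Vec::splice`, which returns an iterator over the removed elements, and `String::replace_range`, standing in for the draft's `String::splice`. The function name `splice_demo` is hypothetical and exists only for illustration.

```rust
/// Replays the values from the RFC draft's `it_works` test using std's
/// `Vec::splice` and `String::replace_range` (an assumption; the draft
/// itself proposes extension traits with a different string method name).
fn splice_demo() -> (Vec<i32>, Vec<i32>, String) {
    let mut v = vec![1, 2, 3, 4, 5];

    // Replace the two elements at indices 2..4 with three new values.
    // The replacement need not have the same length as the range.
    // Collecting the returned iterator yields the removed elements.
    let removed: Vec<i32> = v.splice(2..4, [10, 11, 12].iter().cloned()).collect();

    let mut s = String::from("Hello, world!");
    // Byte offset 7 falls on a char boundary ("Hello, " is 7 bytes long);
    // a range that splits a multi-byte character would panic instead.
    s.replace_range(7.., "世界!");

    (v, removed, s)
}
```

Calling `splice_demo()` from a `main` function or a test returns the modified vector, the removed elements, and the modified string. Collecting the iterator right away avoids depending on when the replacement takes effect, which is the point the draft's unresolved question about incremental consumption in `Splice::next` versus `Splice::drop` leaves open.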
From 0b71d709c8754c3fc533195f63b674281b3dcc36 Mon Sep 17 00:00:00 2001 From: Andrew Ayer Date: Fri, 12 Feb 2016 17:24:48 -0800 Subject: [PATCH 0748/1195] Add RFC for ipv6addr octets interface --- text/0000-ipv6addr-octets.md | 69 ++++++++++++++++++++++++++++++++++++ 1 file changed, 69 insertions(+) create mode 100644 text/0000-ipv6addr-octets.md diff --git a/text/0000-ipv6addr-octets.md b/text/0000-ipv6addr-octets.md new file mode 100644 index 00000000000..d0aa85926d1 --- /dev/null +++ b/text/0000-ipv6addr-octets.md @@ -0,0 +1,69 @@ +- Feature Name: ipv6addr_octets_interface +- Start Date: 2016-02-12 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Add constructor and getter functions to `std::net::Ipv6Addr` that are +oriented around octets. + +# Motivation +[motivation]: #motivation + +Currently, the interface for `std::net::Ipv6Addr` is oriented around 16-bit +"segments". The constructor takes eight 16-bit integers as arguments, +and the sole getter function, `segments`, returns an array of eight +16-bit integers. This interface is unnatural when doing low-level network +programming, where IPv6 addresses are treated as a sequence of 16 octets. +For example, building and parsing IPv6 packets requires doing +bitwise arithmetic with careful attention to byte order in order to convert +between the on-wire format of 16 octets and the eight segments format used +by `std::net::Ipv6Addr`. + +# Detailed design +[design]: #detailed-design + +Two functions would be added to `impl std::net::Ipv6Addr`: + +``` +pub fn from_octets(octets: &[u8; 16]) -> Ipv6Addr { + let mut addr: c::in6_addr = unsafe { std::mem::zeroed() }; + addr.s6_addr = *octets; + Ipv6Addr { inner: addr } +} +pub fn octets(&self) -> &[u8; 16] { + &self.inner.s6_addr +} +``` + +# Drawbacks +[drawbacks]: #drawbacks + +It adds additional functions to the `Ipv6Addr` API, which increases cognitive load +and maintenance burden. 
That said, the functions are conceptually very simple +and their implementations short. + +Returning a reference from `octets` ties the interface to the internal representation +of `Ipv6Addr`, which is currently `[u8; 16]`. It would not be possible to change `Ipv6Addr` +to use a different representation without changing the return type of `octets` to be a non-reference. + +# Alternatives +[alternatives]: #alternatives + +Do nothing. The downside is that developers will need to resort to +bitwise arithmetic, which is awkward and error-prone (particularly with +respect to byte ordering) to convert between `Ipv6Addr` and the on-wire +representation of IPv6 addresses. Or they will use their alternative +implementations of `Ipv6Addr`, fragmenting the ecosystem. + +`octets` could return a non-reference to avoid tying the interface to the +internal representation. However, it seems unlikely that the internal +representation would ever be anything besides a `[u8; 16]`. + +# Unresolved questions +[unresolved]: #unresolved-questions + +Should `octets` return a reference? Pro: avoid a copy. Con: ties the interface to the internal +representation, which is presently `[u8; 16]`. From ea0ad1c79e38ab798f598c198f5713a795d91b34 Mon Sep 17 00:00:00 2001 From: Nicholas Mazzuca Date: Fri, 12 Feb 2016 21:30:39 -0800 Subject: [PATCH 0749/1195] Rename copy_from -> copy_from_slice --- text/0000-slice-copy.md | 30 ++++++++++++++++-------------- 1 file changed, 16 insertions(+), 14 deletions(-) diff --git a/text/0000-slice-copy.md b/text/0000-slice-copy.md index f49da7f0b95..f53f0324abe 100644 --- a/text/0000-slice-copy.md +++ b/text/0000-slice-copy.md @@ -22,13 +22,14 @@ Add one method to Primitive Type `slice`. ```rust impl [T] where T: Copy { - pub fn copy_from(&mut self, src: &[T]); + pub fn copy_from_slice(&mut self, src: &[T]); } ``` -`copy_from` asserts that `src.len() == self.len()`, then `memcpy`s the members into -`self` from `src`. 
Calling `copy_from` is semantically equivalent to a `memcpy`. -`self` shall have exactly the same members as `src` after a call to `copy_from`. +`copy_from_slice` asserts that `src.len() == self.len()`, then `memcpy`s the +members into `self` from `src`. Calling `copy_from_slice` is semantically +equivalent to a `memcpy`. `self` shall have exactly the same members as `src` +after a call to `copy_from_slice`. # Drawbacks [drawbacks]: #drawbacks @@ -38,19 +39,20 @@ One new method on `slice`. # Alternatives [alternatives]: #alternatives -`copy_from` could be known as `copy_from_slice`, which would follow -`clone_from_slice`. - -`copy_from` could be called `copy_to`, and have the order of the arguments +`copy_from_slice` could be called `copy_to`, and have the order of the arguments switched around. This would follow `ptr::copy_nonoverlapping` ordering, and not -`dst = src` or `.clone_from()` ordering. +`dst = src` or `.clone_from_slice()` ordering. + +`copy_from_slice` could panic only if `dst.len() < src.len()`. This would be the +same as what came before, but we would also lose the guarantee that an +uninitialized slice would be fully initialized. -`copy_from` could panic only if `dst.len() < src.len()`. This would be the same -as what came before, but we would also lose the guarantee that an uninitialized -slice would be fully initialized. +`copy_from_slice` could be a free function, as it was in the original draft of +this document. However, there was overwhelming support for it as a method. -`copy_from` could be a free function, as it was in the original draft of this -document. However, there was overwhelming support for it as a method. +`copy_from_slice` could be not merged, and `clone_from_slice` could be +specialized to `memcpy` in cases of `T: Copy`. I think it's good to have a +specific function to do this, however, which asserts that `T: Copy`. 
# Unresolved questions [unresolved]: #unresolved-questions From 449c6a653b53d7eaa4304047539a0292019e6157 Mon Sep 17 00:00:00 2001 From: Alex Burka Date: Mon, 15 Feb 2016 13:37:53 -0500 Subject: [PATCH 0750/1195] change notation --- text/0550-macro-future-proofing.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0550-macro-future-proofing.md b/text/0550-macro-future-proofing.md index 0d286c991bd..881347e07f1 100644 --- a/text/0550-macro-future-proofing.md +++ b/text/0550-macro-future-proofing.md @@ -416,7 +416,7 @@ The current legal fragment specifiers are: `item`, `block`, `stmt`, `pat`, - `FOLLOW(pat)` = `{FatArrow, Comma, Eq, Or, Ident(if), Ident(in)}` - `FOLLOW(expr)` = `{FatArrow, Comma, Semicolon}` -- `FOLLOW(ty)` = `{OpenDelim(Brace), Comma, FatArrow, Colon, Eq, Gt, Semi, Or, Ident(as), Ident(where), OpenDelim(Bracket), Interpolated(NtBlock(_))}` +- `FOLLOW(ty)` = `{OpenDelim(Brace), Comma, FatArrow, Colon, Eq, Gt, Semi, Or, Ident(as), Ident(where), OpenDelim(Bracket), Nonterminal(Block)}` - `FOLLOW(stmt)` = `FOLLOW(expr)` - `FOLLOW(path)` = `FOLLOW(ty)` - `FOLLOW(block)` = any token From 14e49449cf355161982ca7489f09fec49d8ddf2e Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 18 Feb 2016 10:43:40 -0800 Subject: [PATCH 0751/1195] RFC 1419 is <[T]>::copy_from_slice --- text/{0000-slice-copy.md => 1419-slice-copy.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-slice-copy.md => 1419-slice-copy.md} (92%) diff --git a/text/0000-slice-copy.md b/text/1419-slice-copy.md similarity index 92% rename from text/0000-slice-copy.md rename to text/1419-slice-copy.md index f53f0324abe..4e3abedc5e5 100644 --- a/text/0000-slice-copy.md +++ b/text/1419-slice-copy.md @@ -1,7 +1,7 @@ - Feature Name: slice\_copy\_from - Start Date: (fill me in with today's date, YYYY-MM-DD) -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1419](https://github.com/rust-lang/rfcs/pull/1419) +- 
Rust Issue: [rust-lang/rust#31755](https://github.com/rust-lang/rust/issues/31755) # Summary [summary]: #summary From 7fed13a7ed9fff1b349e8997b2bdbf012bb7945a Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 18 Feb 2016 10:45:25 -0800 Subject: [PATCH 0752/1195] Remove duplicate 0000-restrict-constants-in-patterns.md Also update the metadata in 1445 --- text/0000-restrict-constants-in-patterns.md | 621 -------------------- text/1445-restrict-constants-in-patterns.md | 18 +- 2 files changed, 9 insertions(+), 630 deletions(-) delete mode 100644 text/0000-restrict-constants-in-patterns.md diff --git a/text/0000-restrict-constants-in-patterns.md b/text/0000-restrict-constants-in-patterns.md deleted file mode 100644 index c4cb33b9fcf..00000000000 --- a/text/0000-restrict-constants-in-patterns.md +++ /dev/null @@ -1,621 +0,0 @@ -- Feature Name: structural_match -- Start Date: 2015-02-06 -- RFC PR: [rust-lang/rfcs#1445](https://github.com/rust-lang/rfcs/pull/1445) -- Rust Issue: [rust-lang/rust#31434](https://github.com/rust-lang/rust/issues/31434) - -# Summary -[summary]: #summary - -The current compiler implements a more expansive semantics for pattern -matching than was originally intended. This RFC introduces several -mechanisms to reign in these semantics without actually breaking -(much, if any) extant code: - -- Introduce a feature-gated attribute `#[structural_match]` which can - be applied to a struct or enum `T` to indicate that constants of - type `T` can be used within patterns. -- Have `#[derive(Eq)]` automatically apply this attribute to - the struct or enum that it decorates. **Automatically inserted attributes - do not require use of feature-gate.** -- When expanding constants of struct or enum type into equivalent - patterns, require that the struct or enum type is decorated with - `#[structural_match]`. Constants of builtin types are always - expanded. 
- -The practical effect of these changes will be to prevent the use of -constants in patterns unless the type of those constants is either a -built-in type (like `i32` or `&str`) or a user-defined constant for -which `Eq` is **derived** (not merely *implemented*). - -To be clear, this `#[structural_match]` attribute is **never intended -to be stabilized**. Rather, the intention of this change is to -restrict constant patterns to those cases that everyone can agree on -for now. We can then have further discussion to settle the best -semantics in the long term. - -Because the compiler currently accepts arbitrary constant patterns, -this is technically a backwards incompatible change. However, the -design of the RFC means that existing code that uses constant patterns -will generally "just work". The justification for this change is that -it is clarifying -["underspecified language semantics" clause, as described in RFC 1122][ls]. -A [recent crater run][crater] with a prototype implementation found 6 -regressions. - -[crater]: https://gist.github.com/nikomatsakis/e714e4a824527e0ce5c9 - -**Note:** this was also discussed on an [internals thread]. Major -points from that thread are summarized either inline or in -alternatives. - -[ls]: https://github.com/rust-lang/rfcs/blob/master/text/1122-language-semver.md#underspecified-language-semantics -[crater run]: https://gist.github.com/nikomatsakis/26096ec2a2df3c1fb224 -[internals thread]: https://internals.rust-lang.org/t/how-to-handle-pattern-matching-on-constants/2846) - -# Motivation -[motivation]: #motivation - -The compiler currently permits any kind of constant to be used within -a pattern. However, the *meaning* of such a pattern is somewhat -controversial: the current semantics implemented by the compiler were -[adopted in July of 2014](https://github.com/rust-lang/rust/pull/15650) -and were never widely discussed nor did they go through the RFC -process. 
Moreover, the discussion at the time was focused primarily on -implementation concerns, and overlooked the potential semantic -hazards. - -### Semantic vs structural equality - -Consider a program like this one, which references a constant value -from within a pattern: - -```rust -struct SomeType { - a: u32, - b: u32, -} - -const SOME_CONSTANT: SomeType = SomeType { a: 22+22, b: 44+44 }; - -fn test(v: SomeType) { - match v { - SOME_CONSTANT => println!("Yes"), - _ => println!("No"), - } -} -``` - -The question at hand is what do we expect this match to do, precisely? -There are two main possibilities: semantic and structural equality. - -**Semantic equality.** Semantic equality states that a pattern -`SOME_CONSTANT` matches a value `v` if `v == SOME_CONSTANT`. In other -words, the `match` statement above would be exactly equivalent to an -`if`: - -```rust -if v == SOME_CONSTANT { - println!("Yes") -} else { - println!("No"); -} -``` - -Under semantic equality, the program above would not compile, because -`SomeType` does not implement the `PartialEq` trait. - -**Structural equality.** Under structural equality, `v` matches the -pattern `SOME_CONSTANT` if all of its fields are (structurally) equal. -Primitive types like `u32` are structurally equal if they represent -the same value (but see below for discussion about floating point -types like `f32` and `f64`). This means that the `match` statement -above would be roughly equivalent to the following `if` (modulo -privacy): - -```rust -if v.a == SOME_CONSTANT.a && v.b == SOME_CONSTANT.b { - println!("Yes") -} else { - println!("No"); -} -``` - -Structural equality basically says "two things are structurally equal -if their fields are structurally equal". It is sort of equality you -would get if everyone used `#[derive(PartialEq)]` on all types. Note -that the equality defined by structural equality is completely -distinct from the `==` operator, which is tied to the `PartialEq` -traits. 
That is, two values that are *semantically unequal* could be -*structurally equal* (an example where this might occur is the -floating point value `NaN`). - -**Current semantics.** The compiler's current semantics are basically -structural equality, though in the case of floating point numbers they -are arguably closer to semantic equality (details below). In -particular, when a constant appears in a pattern, the compiler first -evaluates that constant to a specific value. So we would reduce the -expression: - -```rust -const SOME_CONSTANT: SomeType = SomeType { a: 22+22, b: 44+44 }; -``` - -to the value `SomeType { a: 44, b: 88 }`. We then expand the pattern -`SOME_CONSTANT` as though you had typed this value in place (well, -almost as though, read on for some complications around privacy). -Thus the match statement above is equivalent to: - -```rust -match v { - SomeType { a: 44, b: 88 } => println!(Yes), - _ => println!("No"), -} -``` - -### Disadvantages of the current approach - -Given that the compiler already has a defined semantics, it is -reasonable to ask why we might want to change it. There -are two main disadvantages: - -1. **No abstraction boundary.** The current approach does not permit - types to define what equality means for themselves (at least not if - they can be constructed in a constant). -2. **Scaling to associated constants.** The current approach does not - permit associated constants or generic integers to be used in a - match statement. - -#### Disadvantage: Weakened abstraction bounary - -The single biggest concern with structural equality is that it -introduces two distinct notions of equality: the `==` operator, based -on the `PartialEq` trait, and pattern matching, based on a builtin -structural recursion. This will cause problems for user-defined types -that rely on `PartialEq` to define equality. 
Put another way, **it is -no longer possible for user-defined types to completely define what -equality means for themselves** (at least not if they can be -constructed in a constant). Furthermore, because the builtin -structural recursion does not consider privacy, `match` statements can -now be used to **observe private fields**. - -**Example: Normalized durations.** Consider a simple duration type: - -```rust -#[derive(Copy, Clone)] -pub struct Duration { - pub seconds: u32, - pub minutes: u32, -} -``` - -Let's say that this `Duration` type wishes to represent a span of -time, but it also wishes to preserve whether that time was expressed -in seconds or minutes. In other words, 60 seconds and 1 minute are -equal values, but we don't want to normalize 60 seconds into 1 minute; -perhaps because it comes from user input and we wish to keep things -just as the user chose to express it. - -We might implement `PartialEq` like so (actually the `PartialEq` trait -is slightly different, but you get the idea): - -```rust -impl PartialEq for Duration { - fn eq(&self, other: &Duration) -> bool { - let s1 = (self.seconds as u64) + (self.minutes as u64 * 60); - let s2 = (other.seconds as u64) + (other.minutes as u64 * 60); - s1 == s2 - } -} -``` - -Now imagine I have some constants: - -```rust -const TWENTY_TWO_SECONDS: Duration = Duration { seconds: 22, minutes: 0 }; -const ONE_MINUTE: Duration = Duration { seconds: 0, minutes: 1 }; -``` - -And I write a match statement using those constants: - -```rust -fn detect_some_case_or_other(d: Duration) { - match d { - TWENTY_TWO_SECONDS => /* do something */, - ONE_MINUTE => /* do something else */, - _ => /* do something else again */, - } -} -``` - -Now this code is, in all probability, buggy. Probably I meant to use -the notion of equality that `Duration` defined, where seconds and -minutes are normalized. But that is not the behavior I will see -- -instead I will use a pure structural match. 
What's worse, this means -the code will probably work in my local tests, since I like to say -"one minute", but it will break when I demo it for my customer, since -she prefers to write "60 seconds". - -**Example: Floating point numbers.** Another example is floating point -numbers. Consider the case of `0.0` and `-0.0`: these two values are -distinct, but they typically behave the same; so much so that they -compare equal (that is, `0.0 == -0.0` is `true`). So it is likely -that code such as: - -```rust -match some_computation() { - 0.0 => ..., - x => ..., -} -``` - -did not intend to discriminate between zero and negative zero. In -fact, in the compiler today, match *will* compare 0.0 and -0.0 as -equal. We simply do not extend that courtesy to user-defined types. - -**Example: observing private fields.** The current constant expansion -code does not consider privacy. In other words, constants are expanded -into equivalent patterns, but those patterns may not have been -something the user could have typed because of privacy rules. Consider -a module like: - -```rust -mod foo { - pub struct Foo { b: bool } - pub const V1: Foo = Foo { b: true }; - pub const V2: Foo = Foo { b: false }; -} -``` - -Note that there is an abstraction boundary here: b is a private -field. But now if I wrote code from another module that matches on a -value of type Foo, that abstraction boundary is pierced: - -```rust -fn bar(f: x::Foo) { - // rustc knows this is exhaustive because if expanded `V1` into - // equivalent patterns; patterns you could not write by hand! - match f { - x::V1 => { /* moreover, now we know that f.b is true */ } - x::V2 => { /* and here we know it is false */ } - } -} -``` - -Note that, because `Foo` does not implement `PartialEq`, just having -access to `V1` would not otherwise allow us to observe the value of -`f.b`. (And even if `Foo` *did* implement `PartialEq`, that -implementation might not read `f.b`, so we still would not be able to -observe its value.) 
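The gap between the two notions of equality in the `Duration` and floating point examples above can be sketched in a few lines. This is an illustrative sketch rather than part of the RFC text: `structurally_equal` is a hypothetical helper that stands in for the field-by-field comparison a structural match would perform, and the `PartialEq` impl is the one shown earlier.

```rust
#[derive(Copy, Clone, Debug)]
pub struct Duration {
    pub seconds: u32,
    pub minutes: u32,
}

// Semantic equality: two durations are equal if they span the same
// total number of seconds, however the user expressed them.
impl PartialEq for Duration {
    fn eq(&self, other: &Duration) -> bool {
        let s1 = (self.seconds as u64) + (self.minutes as u64 * 60);
        let s2 = (other.seconds as u64) + (other.minutes as u64 * 60);
        s1 == s2
    }
}

// Hypothetical helper (not in the RFC): the field-by-field comparison
// that a purely structural match would amount to.
fn structurally_equal(a: Duration, b: Duration) -> bool {
    a.seconds == b.seconds && a.minutes == b.minutes
}

fn main() {
    let one_minute = Duration { seconds: 0, minutes: 1 };
    let sixty_seconds = Duration { seconds: 60, minutes: 0 };

    // `==` goes through `PartialEq` and treats the two as equal...
    assert!(one_minute == sixty_seconds);
    // ...while a structural comparison tells them apart.
    assert!(!structurally_equal(one_minute, sixty_seconds));

    // Floats show the same split: `0.0` and `-0.0` compare equal
    // under `==` even though their bit patterns differ.
    assert!(0.0_f64 == -0.0_f64);
    assert!(0.0_f64.to_bits() != (-0.0_f64).to_bits());
}
```

A `match` that expands `ONE_MINUTE` structurally behaves like `structurally_equal`, which is why the "60 seconds" input falls through to the wildcard arm.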
- -**More examples.** There are numerous possible examples here. For -example, strings that compare using case-insensitive comparisons, but -retain the original case for reference, such as those used in -file-systems. Views that extract a subportion of a larger value (and -hence which should only compare that subportion). And so forth. - -#### Disadvantage: Scaling to associated constants and generic integers - -Rewriting constants into patterns requires that we can **fully -evaluate** the constant at the time of exhaustiveness checking. For -associated constants and type-level integers, that is not possible -- -we have to wait until monomorphization time. Consider: - -```rust -trait SomeTrait { - const A: bool; - const B: bool; -} - -fn foo(x: bool) { - match x { - T::A => println!("A"), - T::B => println!("B"), - } -} - -impl SomeTrait for i32 { - const A: bool = true; - const B: bool = true; -} - -impl SomeTrait for u32 { - const A: bool = true; - const B: bool = false; -} -``` - -Is this match exhaustive? Does it contain dead code? The answer will -depend on whether `T=i32` or `T=u32`, of course. - -### Advantages of the current approach - -However, structural equality also has a number of advantages: - -**Better optimization.** One of the biggest "pros" is that it can -potentially enable nice optimization. For example, given constants like the following: - -```rust -struct Value { x: u32 } -const V1: Value = Value { x: 0 }; -const V2: Value = Value { x: 1 }; -const V3: Value = Value { x: 2 }; -const V4: Value = Value { x: 3 }; -const V5: Value = Value { x: 4 }; -``` - -and a match pattern like the following: - -```rust -match v { - V1 => ..., - ..., - V5 => ..., -} -``` - -then, because pattern matching is always a process of structurally -extracting values, we can compile this to code that reads the field -`x` (which is a `u32`) and does an appropriate switch on that -value. Semantic equality would potentially force a more conservative -compilation strategy. 
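For reference, a match of this shape can be written out in full as below. The sketch assumes the restriction this RFC proposes (the type derives both `PartialEq` and `Eq`), which is also what current stable compilers accept; `describe` is a hypothetical function added for illustration. Whether the compiler actually lowers the match to a single switch on `x` is an optimization detail that this sketch does not demonstrate.

```rust
// Deriving both traits makes `Value` usable in constant patterns.
#[derive(PartialEq, Eq)]
struct Value {
    x: u32,
}

const V1: Value = Value { x: 0 };
const V2: Value = Value { x: 1 };
const V3: Value = Value { x: 2 };

// Hypothetical function: each arm matches against a named constant.
fn describe(v: Value) -> &'static str {
    match v {
        V1 => "V1",
        V2 => "V2",
        V3 => "V3",
        // The wildcard is required: constants are not assumed to
        // cover every possible `Value`.
        _ => "other",
    }
}

fn main() {
    assert_eq!(describe(Value { x: 1 }), "V2");
    assert_eq!(describe(Value { x: 7 }), "other");
}
```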
- -**Better exhautiveness and dead-code checking.** Similarly, we can do -more thorough exhaustiveness and dead-code checking. So for example if -I have a struct like: - -```rust -struct Value { field: bool } -const TRUE: Value { field: true }; -const FALSE: Value { field: false }; -``` - -and a match pattern like: - -```rust -match v { TRUE => .., FALSE => .. } -``` - -then we can prove that this match is exhaustive. Similarly, we can prove -that the following match contains dead-code: - -```rust -const A: Value { field: true }; -match v { - TRUE => ..., - A => ..., -} -``` - -Again, some of the alternatives might not allow this. (But note the -cons, which also raise the question of exhaustiveness checking.) - -**Nullary variants and constants are (more) equivalent.** Currently, -there is a sort of equivalence between enum variants and constants, at -least with respect to pattern matching. Consider a C-like enum: - -```rust -enum Modes { - Happy = 22, - Shiny = 44, - People = 66, - Holding = 88, - Hands = 110, -} - -const C: Modes = Modes::Happy; -``` - -Now if I match against `Modes::Happy`, that is matching against an -enum variant, and under *all* the proposals I will discuss below, it -will check the actual variant of the value being matched (regardless -of whether `Modes` implements `PartialEq`, which it does not here). On -the other hand, if matching against `C` were to require a `PartialEq` -impl, then it would be illegal. Therefore matching against an *enum -variant* is distinct from matching against a *constant*. - -# Detailed design -[design]: #detailed-design - -The goal of this RFC is not to decide between semantic and structural -equality. Rather, the goal is to restrict pattern matching to that subset -of types where the two variants behave roughly the same. - -### The structural match attribute - -We will introduce an attribute `#[structural_match]` which can be -applied to struct and enum types. 
Explicit use of this attribute will -(naturally) be feature-gated. When converting a constant value into a -pattern, if the constant is of struct or enum type, we will check -whether this attribute is present on the struct -- if so, we will -convert the value as we do today. If not, we will report an error that -the struct/enum value cannot be used in a pattern. - -### Behavior of `#[derive(Eq)]` - -When deriving the `Eq` trait, we will add the `#[structural_match]` to -the type in question. Attributes added in this way will be **exempt from -the feature gate**. - -## Exhaustiveness and dead-code checking - -We will treat user-defined structs "opaquely" for the purpose of -exhaustiveness and dead-code checking. This is required to allow for -semantic equality semantics in the future, since in that case we -cannot rely on `Eq` to be correctly implemented (e.g., it could always -return `false`, no matter values are supplied to it, even though it's -not supposed to). The impact of this change has not been evaluated but -is expected to be **very** small, since in practice it is rather -challenging to successfully make an exhaustive match using -user-defined constants, unless they are something trivial like -newtype'd booleans (and, in that case, you can update the code to use -a more extended pattern). - -Similarly, dead code detection should treat constants in a -conservative fashion. that is, we can recognize that if there are two -arms using the same constant, the second one is dead code, even though -it may be that neither will matches (e.g., `match foo { C => _, C => _ -}`). We will make no assumptions about two distinct constants, even if -we can concretely evaluate them to the same value. - -One **unresolved question** (described below) is what behavior to -adopt for constants that involve no user-defined types. 
There, the -definition of `Eq` is purely under our control, and we know that it -matches structural equality, so we can retain our current aggressive -analysis if desired. - -### Phasing - -We will not make this change instantaneously. Rather, for at least one -release cycle, users who are pattern matching on struct types that -lack `#[structural_match]` will be warned about imminent breakage. - -# Drawbacks -[drawbacks]: #drawbacks - -This is a breaking change, which means some people might have to -change their code. However, that is considered extremely unlikely, -because such users would have to be pattern matching on constants that -are not comparable for equality (this is likely a bug in any case). - -# Alternatives -[alternatives]: #alternatives - - **Limit matching to builtin types.** An earlier version of this RFC -limited matching to builtin types like integers (and tuples of -integers). This RFC is a generalization of that which also -accommodates struct types that derive `Eq`. - -**Embrace current semantics (structural equality).** Naturally we -could opt to keep the semantics as they are. The advantages and -disadvantages are discussed above. - -**Embrace semantic equality.** We could opt to just go straight -towards "semantic equality". However, it seems better to reset the -semantics to a base point that everyone can agree on, and then extend -from that base point. Moreover, adopting semantic equality straight -out would be a riskier breaking change, as it could silently change -the semantics of existing programs (whereas the current proposal only -causes compilation to fail, never changes what an existing program -will do). - -# Discussion thread summary - -This section summarizes various points that were raised in the -[internals thread] which are related to patterns but didn't seem to -fit elsewhere. - -**Overloaded patterns.** Some languages, notably Scala, permit -overloading of patterns. 
This is related to "semantic equality" in -that it involves executing custom, user-provided code at compilation -time. - -**Pattern synonyms.** Haskell offers a feature called "pattern -synonyms" and -[it was argued](https://internals.rust-lang.org/t/how-to-handle-pattern-matching-on-constants/2846/39?u=nikomatsakis) -that the current treatment of patterns can be viewed as a similar -feature. This may be true, but constants-in-patterns are lacking a -number of important features from pattern synonyms, such as bindings, -as -[discussed in this response](https://internals.rust-lang.org/t/how-to-handle-pattern-matching-on-constants/2846/48?u=nikomatsakis). -The author feels that pattern synonyms might be a useful feature, but -it would be better to design them as a first-class feature, not adapt -constants for that purpose. - -# Unresolved questions -[unresolved]: #unresolved-questions - -**What about exhaustiveness etc on builtin types?** Even if we ignore -user-defined types, there are complications around exhaustiveness -checking for constants of any kind related to associated constants and -other possible future extensions. For example, the following code -[fails to compile](http://is.gd/PJjNKl) because it contains dead-code: - -```rust -const X: u64 = 0; -const Y: u64 = 0; -fn bar(foo: u64) { - match foo { - X => { } - Y => { } - _ => { } - } -} -``` - -However, we would be unable to perform such an analysis in a more -generic context, such as with an associated constant: - -```rust -trait Trait { - const X: u64; - const Y: u64; -} - -fn bar(foo: u64) { - match foo { - T::X => { } - T::Y => { } - _ => { } - } -} -``` - -Here, although it may well be that `T::X == T::Y`, we can't know for -sure. So, for consistency, we may wish to treat all constants opaquely -regardless of whether we are in a generic context or not. 
(However, it -also seems reasonable to make a "best effort" attempt at -exhaustiveness and dead pattern checking, erring on the conservative -side in those cases where constants cannot be fully evaluated.) - -A different argument in favor of treating all constants opaquely is -that the current behavior can leak details that perhaps were intended -to be hidden. For example, imagine that I define a fn `hash` that, -given a previous hash and a value, produces a new hash. Because I am -lazy and prototyping my system, I decide for now to just ignore the -new value and pass the old hash through: - -```rust -const fn add_to_hash(prev_hash: u64, _value: u64) -> u64 { - prev_hash -} -``` - -Now I have some consumers of my library and they define a few constants: - -```rust -const HASH_OF_ZERO: add_to_hash(0, 0); -const HASH_OF_ONE: add_to_hash(0, 1); -``` - -And at some point they write a match statement: - -```rust -fn process_hash(h: u64) { - match h { - HASH_OF_ZERO => /* do something */, - HASH_OF_ONE => /* do something else */, - _ => /* do something else again */, -} -``` - -As before, what you get when you [compile this](http://is.gd/u5WtCo) -is a dead-code error, because the compiler can see that `HASH_OF_ZERO` -and `HASH_OF_ONE` are the same value. - -Part of the solution here might be making "unreachable patterns" a -warning and not an error. The author feels this would be a good idea -regardless (though not necessarily as part of this RFC). However, -that's not a complete solution, since -- at least for `bool` constants --- the same issues arise if you consider exhaustiveness checking. - -On the other hand, it feels very silly for the compiler not to -understand that `match some_bool { true => ..., false => ... }` is -exhaustive. Furthermore, there are other ways for the values of -constants to "leak out", such as when part of a type like -`[u8; SOME_CONSTANT]` (a point made by both [arielb1][arielb1ac] and -[glaebhoerl][gac] on the [internals thread]). 
Therefore, the proper -way to address this question is perhaps to consider an explicit form -of "abstract constant". - -[arielb1ac]: https://internals.rust-lang.org/t/how-to-handle-pattern-matching-on-constants/2846/9?u=nikomatsakis -[gac]: https://internals.rust-lang.org/t/how-to-handle-pattern-matching-on-constants/2846/32?u=nikomatsakis diff --git a/text/1445-restrict-constants-in-patterns.md b/text/1445-restrict-constants-in-patterns.md index 60a94c1ba9f..74eedb4520b 100644 --- a/text/1445-restrict-constants-in-patterns.md +++ b/text/1445-restrict-constants-in-patterns.md @@ -1,7 +1,7 @@ -- Feature Name: (fill me in with a unique ident, my_awesome_feature) -- Start Date: (fill me in with today's date, YYYY-MM-DD) -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- Feature Name: `structural_match` +- Start Date: 2015-02-06 +- RFC PR: [rust-lang/rfcs#1445](https://github.com/rust-lang/rfcs/pull/1445) +- Rust Issue: [rust-lang/rust#31434](https://github.com/rust-lang/rust/issues/31434) # Summary [summary]: #summary @@ -164,7 +164,7 @@ are two main disadvantages: 2. **Scaling to associated constants.** The current approach does not permit associated constants or generic integers to be used in a match statement. - + #### Disadvantage: Weakened abstraction bounary The single biggest concern with structural equality is that it @@ -185,7 +185,7 @@ now be used to **observe private fields**. pub struct Duration { pub seconds: u32, pub minutes: u32, -} +} ``` Let's say that this `Duration` type wishes to represent a span of @@ -316,12 +316,12 @@ fn foo(x: bool) { impl SomeTrait for i32 { const A: bool = true; const B: bool = true; -} +} impl SomeTrait for u32 { const A: bool = true; const B: bool = false; -} +} ``` Is this match exhaustive? Does it contain dead code? 
The answer will @@ -347,7 +347,7 @@ and a match pattern like the following: ```rust match v { - V1 => ..., + V1 => ..., ..., V5 => ..., } From 2d53b528892a5b5903797c96ca957109ced0bc76 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 18 Feb 2016 10:58:25 -0800 Subject: [PATCH 0753/1195] RFC 1467 is ptr::{read,write}_volatile --- text/{0000-volatile.md => 1467-volatile.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-volatile.md => 1467-volatile.md} (89%) diff --git a/text/0000-volatile.md b/text/1467-volatile.md similarity index 89% rename from text/0000-volatile.md rename to text/1467-volatile.md index a2b9ac833a3..f3e7f6b628c 100644 --- a/text/0000-volatile.md +++ b/text/1467-volatile.md @@ -1,7 +1,7 @@ - Feature Name: volatile - Start Date: 2016-01-18 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1467](https://github.com/rust-lang/rfcs/pull/1467) +- Rust Issue: [rust-lang/rust#31756](https://github.com/rust-lang/rust/issues/31756) # Summary [summary]: #summary From 9638c35e07a2e9b46645c4c5172345c66653bc1c Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Thu, 18 Feb 2016 19:02:16 +0000 Subject: [PATCH 0754/1195] Rename compare_exchange_strong to compare_exchange --- text/0000-extended-compare-and-swap.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/text/0000-extended-compare-and-swap.md b/text/0000-extended-compare-and-swap.md index d8fdcade35d..5c219202aa2 100644 --- a/text/0000-extended-compare-and-swap.md +++ b/text/0000-extended-compare-and-swap.md @@ -24,19 +24,19 @@ While all of these variants are identical on x86, they can allow more efficient # Detailed design [design]: #detailed-design -Since `compare_and_swap` is stable, we can't simply add a second memory ordering parameter to it. 
This RFC proposes deprecating the `compare_and_swap` function and replacing it with `compare_exchange_strong` and `compare_exchange_weak`, which match the names of the equivalent C++11 functions. +Since `compare_and_swap` is stable, we can't simply add a second memory ordering parameter to it. This RFC proposes deprecating the `compare_and_swap` function and replacing it with `compare_exchange` and `compare_exchange_weak`, which match the names of the equivalent C++11 functions (with the `_strong` suffix removed). -## `compare_exchange_strong` +## `compare_exchange` A new method is instead added to atomic types: ```rust -fn compare_exchange_strong(&self, current: T, new: T, success: Ordering, failure: Ordering) -> T; +fn compare_exchange(&self, current: T, new: T, success: Ordering, failure: Ordering) -> T; ``` The restrictions on the failure ordering are the same as C++11: only `SeqCst`, `Acquire` and `Relaxed` are allowed and it must be equal or weaker than the success ordering. Passing an invalid memory ordering will result in a panic, although this can often be optimized away since the ordering is usually statically known. -The documentation for the original `compare_and_swap` is updated to say that it is equivalent to `compare_exchange_strong` with the following mapping for memory orders: +The documentation for the original `compare_and_swap` is updated to say that it is equivalent to `compare_exchange` with the following mapping for memory orders: Original | Success | Failure -------- | ------- | ------- @@ -54,7 +54,7 @@ A new method is instead added to atomic types: fn compare_exchange_weak(&self, current: T, new: T, success: Ordering, failure: Ordering) -> (T, bool); ``` -`compare_exchange_strong` does not need to return a success flag because it can be inferred by checking if the returned value is equal to the expected one. 
This is not possible for `compare_exchange_weak` because it is allowed to fail spuriously, which means that it could fail to perform the swap even though the returned value is equal to the expected one. +`compare_exchange` does not need to return a success flag because it can be inferred by checking if the returned value is equal to the expected one. This is not possible for `compare_exchange_weak` because it is allowed to fail spuriously, which means that it could fail to perform the swap even though the returned value is equal to the expected one. A lock free algorithm using a loop would use the returned bool to determine whether to break out of the loop, and if not, use the returned value for the next iteration of the loop. From b80356a21db04c09f65c61b3bd15b61a8592542a Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 18 Feb 2016 13:49:08 -0800 Subject: [PATCH 0755/1195] RFC 1461 is adding some net2 methods to std --- text/{0000-net2-mutators.md => 1461-net2-mutators.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-net2-mutators.md => 1461-net2-mutators.md} (95%) diff --git a/text/0000-net2-mutators.md b/text/1461-net2-mutators.md similarity index 95% rename from text/0000-net2-mutators.md rename to text/1461-net2-mutators.md index fd095df99df..329412f7558 100644 --- a/text/0000-net2-mutators.md +++ b/text/1461-net2-mutators.md @@ -1,7 +1,7 @@ -- Feature Name: net2_mutators +- Feature Name: `net2_mutators` - Start Date: 2016-01-12 -- RFC PR: -- Rust Issue: +- RFC PR: [rust-lang/rfcs#1461](https://github.com/rust-lang/rfcs/pull/1461) +- Rust Issue: [rust-lang/rust#31766](https://github.com/rust-lang/rust/issues/31766) # Summary [summary]: #summary From c447aa8a6f0b954ce2cd493a34081e5f4e81a594 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 18 Feb 2016 14:18:15 -0800 Subject: [PATCH 0756/1195] RFC 1443 is compare_exchange on atomics --- ...ompare-and-swap.md => 1443-extended-compare-and-swap.md} | 6 +++--- 1 file 
changed, 3 insertions(+), 3 deletions(-) rename text/{0000-extended-compare-and-swap.md => 1443-extended-compare-and-swap.md} (97%) diff --git a/text/0000-extended-compare-and-swap.md b/text/1443-extended-compare-and-swap.md similarity index 97% rename from text/0000-extended-compare-and-swap.md rename to text/1443-extended-compare-and-swap.md index 5c219202aa2..25a41cdc8be 100644 --- a/text/0000-extended-compare-and-swap.md +++ b/text/1443-extended-compare-and-swap.md @@ -1,7 +1,7 @@ -- Feature Name: extended_compare_and_swap +- Feature Name: `extended_compare_and_swap` - Start Date: 2016-1-5 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1443](https://github.com/rust-lang/rfcs/pull/1443) +- Rust Issue: [rust-lang/rust#31767](https://github.com/rust-lang/rust/issues/31767) # Summary [summary]: #summary From c3b4a8d6b4063528a9880cc88c9531f10a911e62 Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Sun, 21 Feb 2016 02:39:06 +0000 Subject: [PATCH 0757/1195] Add support for 128-bit integers --- text/0000-int128.md | 42 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) create mode 100644 text/0000-int128.md diff --git a/text/0000-int128.md b/text/0000-int128.md new file mode 100644 index 00000000000..55db47159ff --- /dev/null +++ b/text/0000-int128.md @@ -0,0 +1,42 @@ +- Feature Name: int128 +- Start Date: 21-02-2016 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +This RFC adds the `i128` and `u128` types to Rust. Because these types are not available on all platforms, a new target flag (`target_has_int128`) is added to allow users to check whether 128-bit integers are supported. The `i128` and `u128` are not added to the prelude, and must instead be explicitly imported with `use core::{i128, u128}`. 
+ +# Motivation +[motivation]: #motivation + +Some algorithms need to work with very large numbers that don't fit in 64 bits, such as certain cryptographic algorithms. One possibility would be to use a BigNum library, but these use heap allocation and tend to have high overhead. LLVM has support for very efficient 128-bit integers, which are exposed by Clang in C as the `__int128` type. + +# Detailed design +[design]: #detailed-design + +From a quick look at Clang's source, 128-bit integers are supported on all 64-bit platforms and a few 32-bit ones (those with 64-bit registers: x32 and MIPS n32). To allow users to determine whether 128-bit integers are available, a `target_has_int128` cfg is added. The `i128` and `u128` types are only available when this flag is set. + +The actual `i128` and `u128` types are not added to the Rust prelude since that would break compatibility. Instead they must be explicitly imported with `use core::{i128, u128}` or `use std::{i128, u128}`. This will also catch attempts to use 128-bit integers when they are not supported by the underlying platform since the import will fail if `target_has_int128` is not defined. + +Implementation-wise, this should just be a matter of adding a new primitive type to the compiler and adding trait implementations for `i128`/`u128` in libcore. A new entry will need to be added to target specifications to specify whether the target supports 128-bit integers. + +One possible complication is that primitive types aren't currently part of the prelude, instead they are directly added to the global namespace by the compiler. The new `i128` and `u128` types will behave differently and will need to be explicitly imported. + +Another possible issue is that a `u128` can hold a very large number that doesn't fit in a `f32`. We need to make sure this doesn't lead to any `undef`s from LLVM. 
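The `u128` to `f32` concern can be made concrete with a small sketch. It assumes a current stable compiler, where `u128` is a built-in type that needs no import (unlike the draft design above) and where integer-to-float casts are defined to round to the nearest float and produce infinity on overflow; `to_f32` is a hypothetical helper, not an API proposed by this RFC.

```rust
// Hypothetical helper: a plain `as` cast from u128 to f32.
fn to_f32(v: u128) -> f32 {
    v as f32
}

fn main() {
    // 2^100 is far beyond u64 but still exactly representable in f32,
    // so the value survives a round trip through the float.
    let big: u128 = 1u128 << 100;
    assert_eq!(to_f32(big) as u128, big);

    // u128::MAX (2^128 - 1) lies above f32::MAX (2^128 - 2^104) by more
    // than half a ULP, so rounding to nearest overflows to infinity
    // instead of producing an undefined value.
    assert!(to_f32(u128::MAX).is_infinite());
}
```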
+ +# Drawbacks +[drawbacks]: #drawbacks + +It adds a type to the language that may or may not be present depending on the target architecture. + +# Alternatives +[alternatives]: #alternatives + +There have been several attempts to create `u128`/`i128` wrappers based on two `u64` values, but these can't match the performance of LLVM's native 128-bit integers. + +# Unresolved questions +[unresolved]: #unresolved-questions + +How should 128-bit literals be handled? The easiest solution would be to limit integer literals to 64 bits, which is what GCC does (no support for `__int128` literals). From ad25344d63ca3a787f6cc739cfcdcba6fe9ac5be Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Mon, 22 Feb 2016 01:00:52 +0000 Subject: [PATCH 0758/1195] Change RFC to supporting 128-bit integers on all architectures --- text/0000-int128.md | 60 ++++++++++++++++++++++++++++++++++++--------- 1 file changed, 48 insertions(+), 12 deletions(-) diff --git a/text/0000-int128.md b/text/0000-int128.md index 55db47159ff..1d12a9f93df 100644 --- a/text/0000-int128.md +++ b/text/0000-int128.md @@ -6,7 +6,7 @@ # Summary [summary]: #summary -This RFC adds the `i128` and `u128` types to Rust. Because these types are not available on all platforms, a new target flag (`target_has_int128`) is added to allow users to check whether 128-bit integers are supported. The `i128` and `u128` are not added to the prelude, and must instead be explicitly imported with `use core::{i128, u128}`. +This RFC adds the `i128` and `u128` types to Rust. The `i128` and `u128` are not added to the prelude, and must instead be explicitly imported with `use core::{i128, u128}`. 
# Motivation [motivation]: #motivation @@ -16,20 +16,56 @@ Some algorithms need to work with very large numbers that don't fit in 64 bits, # Detailed design [design]: #detailed-design -From a quick look at Clang's source, 128-bit integers are supported on all 64-bit platforms and a few 32-bit ones (those with 64-bit registers: x32 and MIPS n32). To allow users to determine whether 128-bit integers are available, a `target_has_int128` cfg is added. The `i128` and `u128` types are only available when this flag is set. - -The actual `i128` and `u128` types are not added to the Rust prelude since that would break compatibility. Instead they must be explicitly imported with `use core::{i128, u128}` or `use std::{i128, u128}`. This will also catch attempts to use 128-bit integers when they are not supported by the underlying platform since the import will fail if `target_has_int128` is not defined. - -Implementation-wise, this should just be a matter of adding a new primitive type to the compiler and adding trait implementations for `i128`/`u128` in libcore. A new entry will need to be added to target specifications to specify whether the target supports 128-bit integers. - -One possible complication is that primitive types aren't currently part of the prelude, instead they are directly added to the global namespace by the compiler. The new `i128` and `u128` types will behave differently and will need to be explicitly imported. - -Another possible issue is that a `u128` can hold a very large number that doesn't fit in a `f32`. We need to make sure this doesn't lead to any `undef`s from LLVM. +The `i128` and `u128` types are not added to the Rust prelude since that would break compatibility. Instead they must be explicitly imported with `use core::{i128, u128}` or `use std::{i128, u128}`. + +Implementation-wise, this should just be a matter of adding a new primitive type to the compiler and adding trait implementations for `i128`/`u128` in libcore. 
Literals will need to be extended to support `i128`/`u128`. + +LLVM fully supports 128-bit integers on all architectures, however it will emit calls to functions in `compiler-rt` for many operations such as multiplication and division (addition and subtraction are implemented natively). However, `compiler-rt` only provides the functions for 128-bit integers on 64-bit platforms (`#ifdef __LP64__`). We will need to provide our own implementations of the following functions to allow `i128`/`u128` to be available on all architectures: + +```c +// si_int = i32 +// su_int = u32 +// ti_int = i128 +// tu_int = u128 +ti_int __absvti2(ti_int a); +ti_int __addvti3(ti_int a, ti_int b); +ti_int __ashlti3(ti_int a, si_int b); +ti_int __ashrti3(ti_int a, si_int b); +si_int __clzti2(ti_int a); +si_int __cmpti2(ti_int a, ti_int b); +si_int __ctzti2(ti_int a); +ti_int __divti3(ti_int a, ti_int b); +si_int __ffsti2(ti_int a); +ti_int __fixdfti(double a); +ti_int __fixsfti(float a); +tu_int __fixunsdfti(double a); +tu_int __fixunssfti(float a); +double __floattidf(ti_int a); +float __floattisf(ti_int a); +double __floatuntidf(tu_int a); +float __floatuntisf(tu_int a); +ti_int __lshrti3(ti_int a, si_int b); +ti_int __modti3(ti_int a, ti_int b); +ti_int __muloti4(ti_int a, ti_int b, int* overflow); +ti_int __multi3(ti_int a, ti_int b); +ti_int __mulvti3(ti_int a, ti_int b); +ti_int __negti2(ti_int a); +ti_int __negvti2(ti_int a); +si_int __parityti2(ti_int a); +si_int __popcountti2(ti_int a); +ti_int __subvti3(ti_int a, ti_int b); +si_int __ucmpti2(tu_int a, tu_int b); +tu_int __udivmodti4(tu_int a, tu_int b, tu_int* rem); +tu_int __udivti3(tu_int a, tu_int b); +tu_int __umodti3(tu_int a, tu_int b); +``` # Drawbacks [drawbacks]: #drawbacks -It adds a type to the language that may or may not be present depending on the target architecture. 
+One possible complication is that primitive types aren't currently part of the prelude, instead they are directly added to the global namespace by the compiler. The new `i128` and `u128` types will behave differently and will need to be explicitly imported. + +Another possible issue is that a `u128` can hold a very large number that doesn't fit in a `f32`. We need to make sure this doesn't lead to any `undef`s from LLVM. See [this comment](https://github.com/rust-lang/rust/issues/10185#issuecomment-110955148), and [this example code](https://gist.github.com/Amanieu/f87da5f0599b343c5500). # Alternatives [alternatives]: #alternatives @@ -39,4 +75,4 @@ There have been several attempts to create `u128`/`i128` wrappers based on two ` # Unresolved questions [unresolved]: #unresolved-questions -How should 128-bit literals be handled? The easiest solution would be to limit integer literals to 64 bits, which is what GCC does (no support for `__int128` literals). +None From 82bb5a0e98eb57790b1ad8cd67d4f13640ef7056 Mon Sep 17 00:00:00 2001 From: Vadim Petrochenkov Date: Mon, 22 Feb 2016 13:32:07 +0300 Subject: [PATCH 0759/1195] RFC: Clarify the relationships between various kinds of structs and variants --- text/0000-adt-kinds.md | 169 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 169 insertions(+) create mode 100644 text/0000-adt-kinds.md diff --git a/text/0000-adt-kinds.md b/text/0000-adt-kinds.md new file mode 100644 index 00000000000..e5894ea5509 --- /dev/null +++ b/text/0000-adt-kinds.md @@ -0,0 +1,169 @@ +- Feature Name: clarified_adt_kinds +- Start Date: 2016-02-07 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Provide a simple model describing three kinds of structs and variants and their relationships. +Provide a way to match on structs/variants in patterns regardless of their kind (`S{..}`). +Permit tuple structs and tuple variants with zero fields (`TS()`). 
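A minimal sketch of what this summary describes, assuming a compiler that implements the proposed extensions (current stable Rust does). The `Shape` enum and the `kind` function are hypothetical names used only for illustration. It shows a zero-field tuple variant, and one `{ .. }` pattern form matching variants of every kind.

```rust
#[allow(dead_code)]
enum Shape {
    Point,                   // unit variant
    Circle(f64),             // tuple variant
    Rect { w: f64, h: f64 }, // braced variant
    Empty(),                 // tuple variant with zero fields
}

/// Classify a shape using only braced `{ .. }` patterns,
/// regardless of how each variant was declared.
fn kind(s: &Shape) -> &'static str {
    match s {
        Shape::Point { .. } => "unit",
        Shape::Circle { .. } => "tuple",
        Shape::Rect { .. } => "braced",
        Shape::Empty { .. } => "empty tuple",
    }
}
```

Because every arm has the same shape, a variant can gain or lose fields, or switch kind, without the `match` needing to change.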
+ +# Motivation +[motivation]: #motivation + +There's some mental model lying under the current implementation of ADTs, but it is not written +out explicitly and not implemented completely consistently. +Writing this model out helps to identify its missing parts. +Some of these missing parts turn out to be practically useful. +This RFC can also serve as a piece of documentation. + +# Detailed design +[design]: #detailed-design + +The text below mostly talks about structures, but almost everything is equally applicable to +variants. + +## Braced structs + +Braced structs are declared with braces (unsurprisingly). + +``` +struct S { + field1: Type1, + field2: Type2, + field3: Type3, +} +``` + +Braced structs are the basic struct kind, other kinds are built on top of them. +Braced structs have 0 or more user-named fields and are defined only in type namespace. + +Braced structs can be used in struct expressions `S{field1: expr, field2: expr}`, including +functional record update (FRU) `S{field1: expr, ..s}`/`S{..s}` and with struct patterns +`S{field1: pat, field2: pat}`/`S{field1: pat, ..}`/`S{..}`. +In all cases the path `S` of the expression or pattern is looked up in the type namespace. +Fields of a braced struct can be accessed with dot syntax `s.field1`. + +Note: struct *variants* are currently defined in the value namespace in addition to type namespace, + there are no particular reasons for this and this is probably temporary. + +## Unit structs + +Unit structs are defined without any fields or brackets. + +``` +struct US; +``` + +Unit structs can be thought of as a single declaration for two things: a basic struct + +``` +struct US {} +``` + +and a constant with the same name + +``` +const US: US = US{}; +``` + +Unit structs have 0 fields and are defined in both type (the type `US`) and value (the +constant `US`) namespaces.
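A minimal sketch of this dual nature, assuming a compiler that accepts braced expressions and patterns for unit structs (current stable Rust does). The function names are hypothetical and exist only for illustration.

```rust
#[derive(Debug, PartialEq)]
struct US;

/// Build the same unit struct through both namespaces.
fn build_both_ways() -> (US, US) {
    let via_value = US;   // value namespace: the constant-like `US`
    let via_type = US {}; // type namespace: struct expression with zero fields
    (via_value, via_type)
}

/// Match the same value with both pattern forms.
fn match_both_ways(u: US) -> bool {
    // Unit struct pattern, resolved in the value namespace.
    match u {
        US => {}
    }
    // Braced struct pattern, resolved in the type namespace.
    match u {
        US { .. } => true,
    }
}
```

Both construction forms produce the same value, and both pattern forms accept it.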
+ +As a basic struct, a unit struct can participate in struct expressions `US{}`, including FRU +`US{..s}` and in struct patterns `US{}`/`US{..}`. In both cases the path `US` of the expression +or pattern is looked up in the type namespace. +Fields of a unit struct could also be accessed with dot syntax, but it doesn't have any fields. + +As a constant, a unit struct can participate in unit struct expressions `US` and unit struct +patterns `US`, both of these are looked up in the value namespace in which the constant `US` is +defined. + +Note 1: the constant is not exactly a `const` item, there are subtle differences, but it's a close +approximation. +Note 2: the constant is pretty weirdly namespaced in case of unit *variants*, constants can't be +defined in "enum modules" manually. + +## Tuple structs + +Tuple structs are declared with parentheses. +``` +struct TS(Type0, Type1, Type2); +``` + +Tuple structs can be thought of as a single declaration for two things: a basic struct + +``` +struct TS { + 0: Type0, + 1: Type1, + 2: Type2, +} +``` + +and a constructor function with the same name + +``` +fn TS(arg0: Type0, arg1: Type1, arg2: Type2) -> TS { + TS{0: arg0, 1: arg1, 2: arg2} +} +``` + +Tuple structs have 0 or more automatically-named fields and are defined in both type (the type `TS`) +and the value (the constructor function `TS`) namespaces. + +As a basic struct, a tuple struct can participate in struct expressions `TS{0: expr, 1: expr}`, +including FRU `TS{0: expr, ..ts}`/`TS{..ts}` and in struct patterns +`TS{0: pat, 1: pat}`/`TS{0: pat, ..}`/`TS{..}`. +In both cases the path `TS` of the expression or pattern is looked up in the type namespace. +Fields of a tuple struct can be accessed with dot syntax `ts.0`. + +As a constructor, a tuple struct can participate in tuple struct expressions `TS(expr, expr)` and +tuple struct patterns `TS(pat, pat)`/`TS(..)`, both of these are looked up in the value namespace +in which the constructor `TS` is defined.
Tuple struct expressions `TS(expr, expr)` are usual +function calls, but the compiler reserves the right to make observable improvements to them based +on the additional knowledge, that `TS` is a constructor. + +Note: the automatically assigned field names are quite interesting, they are not identifiers +lexically (they are integer literals), so such fields can't be defined manually. + +## Summary of the changes. + +Everything related to braced structs and unit structs is already implemented. + +New: Permit tuple structs and tuple variants with 0 fields. This restriction is artificial and can +be lifted trivially. Macro writers dealing with tuple structs/variants will be happy to get rid of +this one special case. + +New: Permit using tuple structs and tuple variants in braced struct patterns and expressions not +requiring naming their fields - `TS{..ts}`/`TS{}`/`TS{..}`. This doesn't require much effort to +implement as well. +This also means that `S{..}` patterns can be used to match structures and variants of any kind. +The desire to have such "match everything" patterns is sometimes expressed given +that number of fields in structures and variants can change from zero to non-zero and back during +development. + +New: Permit using tuple structs and tuple variants in braced struct patterns and expressions +requiring naming their fields - `TS{0: expr}`/`TS{0: pat}`/etc. This change looks a bit worse from +the cost/benefit point of view. There's not much motivation for it besides consistency and probably +shortening patterns like `ItemFn(name, _, _, _, _, _)` into something like `ItemFn{0: name, ..}`. +Automatic code generators (e.g. syntax extensions like `derive`) can probably benefit from the +ability to generate uniform code for all structure kinds as well. +The author of the RFC is ready to postpone or drop this particular extension at any moment. + +# Drawbacks +[drawbacks]: #drawbacks + +None. + +# Alternatives +[alternatives]: #alternatives + +None. 
+ +# Unresolved questions +[unresolved]: #unresolved-questions + +None. From f10dc0214dd92a5813af59c23e79567e57e2fac9 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Tue, 23 Feb 2016 13:17:00 -0500 Subject: [PATCH 0760/1195] Rename and link specialization RFC to tracking issue --- ...000-impl-specialization.md => 1210-impl-specialization.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-impl-specialization.md => 1210-impl-specialization.md} (99%) diff --git a/text/0000-impl-specialization.md b/text/1210-impl-specialization.md similarity index 99% rename from text/0000-impl-specialization.md rename to text/1210-impl-specialization.md index 644486e5866..90a9a197c3a 100644 --- a/text/0000-impl-specialization.md +++ b/text/1210-impl-specialization.md @@ -1,7 +1,7 @@ - Feature Name: specialization - Start Date: 2015-06-17 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1210](https://github.com/rust-lang/rfcs/pull/1210) +- Rust Issue: [rust-lang/rust#31844](https://github.com/rust-lang/rust/issues/31844) # Summary From 132349f42c53cd06600fc4e04a5cdfab851e5ed9 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 23 Feb 2016 16:54:36 -0800 Subject: [PATCH 0761/1195] RFC: Add a new crate-type, rdylib Add a new crate type accepted by the compiler, called `rdylib`, which corresponds to the behavior of `-C prefer-dynamic` plus `--crate-type dylib`. 
--- text/0000-rdylib.md | 151 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 151 insertions(+) create mode 100644 text/0000-rdylib.md diff --git a/text/0000-rdylib.md b/text/0000-rdylib.md new file mode 100644 index 00000000000..b868ef38822 --- /dev/null +++ b/text/0000-rdylib.md @@ -0,0 +1,151 @@ +- Feature Name: N/A +- Start Date: 2016-02-23 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Add a new crate type accepted by the compiler, called `rdylib`, which +corresponds to the behavior of `-C prefer-dynamic` plus `--crate-type dylib`. + +# Motivation +[motivation]: #motivation + +Currently the compiler supports two modes of generating dynamic libraries: + +1. One form of dynamic library is intended for reuse with further compilations. + This kind of library exposes all Rust symbols, links to the standard library + dynamically, etc. I'll refer to this mode as **rdylib** as it's a Rust + dynamic library talking to Rust. +2. Another form of dynamic library is intended for embedding a Rust application + into another. Currently the only difference from the previous kind of dynamic + library is that it favors linking statically to other Rust libraries + (bundling them inside). I'll refer to this as a **cdylib** as it's a Rust + dynamic library exporting a C API. + +Each of these flavors of dynamic libraries has a distinct use case. For examples +rdylibs are used by the compiler itself to implement plugins, and cdylibs are +used whenever Rust needs to be dynamically loaded from another language or +application. + +Unfortunately the balance of features is tilted a little bit too much towards +the smallest use case, rdylibs. In practice because Rust is statically linked by +default and has an unstable ABI, rdylibs are used quite rarely. There are a +number of requirements they impose, however, which aren't necessary for +cdylibs: + +* Metadata is included in all dynamic libraries. 
If you're just loading Rust + into somewhere else, however, you have no need for the metadata! +* *Reachable* symbols are exposed from dynamic libraries, but if you're loading + Rust into somewhere else then, like executables, only *public* non-Rust-ABI + functions need to be exported. This can lead to unnecessarily large Rust + dynamic libraries in terms of object size as well as missed optimization + opportunities from knowing that a function is otherwise private. +* We can't run LTO for dylibs because those are intended for end products, not + intermediate ones like (1) is. + +The purpose of this RFC is to solve these drawbacks with a new crate-type to +represent the more rarely used form of dynamic library (rdylibs). + +# Detailed design +[design]: #detailed-design + +A new crate type will be accepted by the compiler, `rdylib`, which can be passed +as either `--crate-type rdylib` on the command line or via `#![crate_type = +"rdylib"]` in crate attributes. This crate type will conceptually correspond to +the rdylib use case described above, and today's `dylib` crate-type will +correspond to the cdylib use case above. + +The two formats will differ in the parts listed in the motivation above, +specifically: + +* **Metadata** - rdylibs will have a section of the library with metadata, + whereas cdylibs will not. +* **Symbol visibility** - rdylibs will expose all symbols as rlibs do, cdylibs + will expose symbols as executables do. This means that `pub fn foo() {}` will + not be an exported symbol, but `#[no_mangle] pub extern fn foo() {}` will be + an exported symbol. Note that the compiler will also be at liberty to pass + extra flags to the linker to actively hide exported Rust symbols from linked + libraries. +* **LTO** - this will be disallowed for rdylibs, but enabled for cdylibs. +* **Linkage** - rdylibs will link dynamically to one another by default, for + example the standard library will be linked dynamically by default.
On the + other hand, cdylibs will link all Rust dependencies statically by default. + +As is evidenced from many of these changes, however, the reinterpretation of the +`dylib` output format from what it is today is a breaking change. For example +metadata will not be present and symbols will be hidden. As a result, this RFC +has a... + +### Transition Plan + +This RFC is technically a breaking change, but it is expected to not actually +break many work flows in practice because there is only one known user of +rdylibs, the compiler itself. This notably means that plugins will also need to +be compiled differently, but because they are nightly-only we've got some more +leeway around them. + +All other known users of the `dylib` output crate type fall into the cdylib use +case. The "breakage" here would mean: + +* The metadata section no longer exists. In almost all cases this just means + that the output artifacts will get smaller if it isn't present, it's expected + that no one other than the compiler itself is actually consuming this + information. +* Rust symbols will be hidden by default. The symbols, however, have + unpredictable hashes so there's not really any way they can be meaningfully + leveraged today. + +Given that background, it's expected that if there's a smooth migration path for +plugins and the compiler then the "breakage" here won't actually appear in +practice. The proposed implementation strategy and migration path is: + +1. Implement the `rdylib` output type as proposed in this RFC. +2. Change Cargo to use `--crate-type rdylib` when compiling plugins instead of + `--crate-type dylib` + `-C prefer-dynamic`. +3. Implement the changes to the `dylib` output format as proposed in this RFC. + +So long as the steps are spaced apart by a few days it should be the case that +no nightly builds break if they're always using an up-to-date nightly compiler. + +# Drawbacks +[drawbacks]: #drawbacks + +Rust's ephemeral and ill-defined "linkage model" is... 
well... ill defined and +ephemeral. This RFC is an extension of this model, but it's difficult to reason +about extending that which is not well defined. As a result there could be +unforeseen interactions between this output format and where it's used. + +As usual, of course, proposing a breaking change is indeed a drawback. It is +expected that this RFC doesn't break anything in practice, but that'd be difficult to +gauge until it's implemented. + +# Alternatives +[alternatives]: #alternatives + +* Instead of reinterpreting the `dylib` output format as a cdylib, we could + continue interpreting it as an rdylib and add a new dedicated `cdylib` output + format. This would not be a breaking change, but it doesn't come without its + drawbacks. As the most common output type, many projects would have to switch + to `cdylib` from `dylib`, meaning that they no longer support older Rust + compilers. This may also take time to propagate throughout the community. It's + also arguably a "better name", so this RFC proposes an + in-practice-not-a-breaking-change by adding a worse name of `rdylib` for the + less used output format. + +* The compiler could have a longer transition period where `-C prefer-dynamic` + plus `--crate-type dylib` is interpreted as an rdylib. Either that or the + implementation strategy here could be extended by a release or two to give + changes time to propagate throughout the ecosystem. + +# Unresolved questions +[unresolved]: #unresolved-questions + +* This RFC is currently founded upon the assumption that rdylibs are very rarely + used in the ecosystem. An audit has not been performed to determine whether + this is true or not, but is this actually the case? + +* Should the new `rdylib` format be considered unstable? (should it require a + nightly compiler?). The use case for a Rust dynamic library is so limited, and + so volatile, we may want to just gate access to it by default.
From add3ae7c390164f27a20efc272171c94f030a00b Mon Sep 17 00:00:00 2001 From: Ivan Enderlin Date: Wed, 24 Feb 2016 14:29:54 +0100 Subject: [PATCH 0762/1195] Fix markup --- text/0505-api-comment-conventions.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0505-api-comment-conventions.md b/text/0505-api-comment-conventions.md index a0243053ae4..432c82fdcad 100644 --- a/text/0505-api-comment-conventions.md +++ b/text/0505-api-comment-conventions.md @@ -39,7 +39,7 @@ Instead of: */ ``` -Only use inner doc comments //! to write crate and module-level documentation, +Only use inner doc comments `//!` to write crate and module-level documentation, nothing else. When using `mod` blocks, prefer `///` outside of the block: ```rust From f002ab9cd69fc62bb9c62d21e92a4cae6a7d8443 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Wed, 24 Feb 2016 09:28:08 -0800 Subject: [PATCH 0763/1195] Flesh out background and motivation --- text/0000-conservative-impl-trait.md | 101 ++++++++++++++++++++++++--- 1 file changed, 91 insertions(+), 10 deletions(-) diff --git a/text/0000-conservative-impl-trait.md b/text/0000-conservative-impl-trait.md index 5cc848a9ddb..a69fe145359 100644 --- a/text/0000-conservative-impl-trait.md +++ b/text/0000-conservative-impl-trait.md @@ -18,7 +18,7 @@ type behind a trait interface similar to trait objects, while still generating the same statically dispatched code as with concrete types: ```rust -fn foo(n: u32) -> impl Iterator { +fn foo(n: u32) -> @Iterator { (0..n).map(|x| x * 100) } // ^ behaves as if it had return type Map, Clos> @@ -30,20 +30,101 @@ for x in foo(10) { ``` +# Background + +There has been much discussion around the `impl Trait` feature already, with +different proposals extending the core idea into different directions: + +- The [original proposal](https://github.com/rust-lang/rfcs/pull/105). 
+- A [blog post](http://aturon.github.io/blog/2015/09/28/impl-trait/) reviving + the proposal and further exploring the design space. +- A [more recent proposal](https://github.com/rust-lang/rfcs/pull/1305) with a + substantially more ambitious scope. + +This RFC is an attempt to make progress on the feature by proposing a minimal +subset that should be forwards-compatible with a whole range of extensions that +have been discussed (and will be reviewed in this RFC). However, even this small +step requires resolving some of the core questions raised in +[the blog post](http://aturon.github.io/blog/2015/09/28/impl-trait/). + +This RFC is closest in spirit to the +[original RFC](https://github.com/rust-lang/rfcs/pull/105), and we'll repeat +its motivation and some other parts of its text below. + # Motivation [motivation]: #motivation > Why are we doing this? What use cases does it support? What is the expected outcome? -There has been much discussion around the `impl Trait` feature already, with -different proposals extending the core idea into different directions. +In today's Rust, you can write a function signature like + +````rust +fn consume_iter_static>(iter: I) +fn consume_iter_dynamic(iter: Box>) +```` + +In both cases, the function does not depend on the exact type of the argument. +The type is held "abstract", and is assumed only to satisfy a trait bound. + +* In the `_static` version using generics, each use of the function is + specialized to a concrete, statically-known type, giving static dispatch, inline + layout, and other performance wins. + +* In the `_dynamic` version using trait objects, the concrete argument type is + only known at runtime using a vtable.
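The two argument-position forms above can be sketched as complete functions. This is only an illustration under stated assumptions: it uses present-day spelling (`dyn Trait`, `Iterator<Item = u8>`), and the `u8` element type, the `u32` return type and the summing bodies are all invented for the example.

```rust
/// Static dispatch: the compiler generates one copy of this function
/// per concrete iterator type `I` it is called with.
fn consume_iter_static<I: Iterator<Item = u8>>(iter: I) -> u32 {
    iter.map(u32::from).sum()
}

/// Dynamic dispatch: a single compiled body that calls `next`
/// through the trait object's vtable.
fn consume_iter_dynamic(iter: Box<dyn Iterator<Item = u8>>) -> u32 {
    iter.map(u32::from).sum()
}
```

A caller passes `vec![1u8, 2, 3].into_iter()` directly to the static version, but has to wrap it in `Box::new(...)` for the dynamic one. That allocation and the indirect calls are the cost the RFC wants to avoid for return types.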
+ +On the other hand, while you can write + +````rust +fn produce_iter_dynamic() -> Box> +```` + +you _cannot_ write something like + +````rust +fn produce_iter_static() -> Iterator +```` + +That is, in today's Rust, abstract return types can only be written using trait +objects, which can be a significant performance penalty. This RFC proposes +"unboxed abstract types" as a way of achieving signatures like +`produce_iter_static`. Like generics, unboxed abstract types guarantee static +dispatch and inline data layout. + +Here are some problems that unboxed abstract types solve or mitigate: + +* _Returning unboxed closures_. Closure syntax generates an anonymous type + implementing a closure trait. Without unboxed abstract types, there is no way + to use this syntax while returning the resulting closure unboxed, because there + is no way to write the name of the generated type. + +* _Leaky APIs_. Functions can easily leak implementation details in their return + type, when the API should really only promise a trait bound. For example, a + function returning `Rev>` is revealing exactly how the iterator + is constructed, when the function should only promise that it returns _some_ + type implementing `Iterator`. Using newtypes/structs with private fields + helps, but is extra work. Unboxed abstract types make it as easy to promise only + a trait bound as it is to return a concrete type. + +* _Complex types_. Use of iterators in particular can lead to huge types: + + ````rust + Chain>>>, SkipWhile<'a, u16, Map<'a, &u16, u16, slice::Items>>> + ```` + + Even when using newtypes to hide the details, the type still has to be written + out, which can be very painful. Unboxed abstract types only require writing the + trait bound. -See http://aturon.github.io/blog/2015/09/28/impl-trait/ for detailed motivation, and -https://github.com/rust-lang/rfcs/pull/105 and https://github.com/rust-lang/rfcs/pull/1305 for prior RFCs on this topic. +* _Documentation_. 
In today's Rust, reading the documentation for the `Iterator` + trait is needlessly difficult. Many of the methods return new iterators, but + currently each one returns a different type (`Chain`, `Zip`, `Map`, `Filter`, + etc), and it requires drilling down into each of these types to determine what + kind of iterator they produce. -It is not yet clear which, if any, of the proposals will end up as the "final form" -of the feature, so this RFC aims to only specify a usable subset that will -be compatible with most of them. +In short, unboxed abstract types make it easy for a function signature to +promise nothing more than a trait bound, and do not generally require the +function's author to write down the concrete type implementing the bound. # Detailed design [design]: #detailed-design @@ -76,9 +157,9 @@ there is also the option of using keyword-based syntax like `impl Trait` or `abstract Trait`, but this would add a verbosity overhead for a feature that will be used somewhat commonly. -#### Semantic +#### Semantics -The core semantic of the feature is described below. +The core semantics of the feature is described below. Note that the sections after this one go into more detail on some of the design decisions, and that it is likely for most of the mentioned limitations to be From 70221b692bb7ea2024fe84e0fa7dd83406f73138 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Thu, 25 Feb 2016 15:43:10 -0800 Subject: [PATCH 0764/1195] Edit primary rationale, fix up section headers --- text/0000-conservative-impl-trait.md | 230 ++++++++++++++++++--------- 1 file changed, 153 insertions(+), 77 deletions(-) diff --git a/text/0000-conservative-impl-trait.md b/text/0000-conservative-impl-trait.md index a69fe145359..75210208cd2 100644 --- a/text/0000-conservative-impl-trait.md +++ b/text/0000-conservative-impl-trait.md @@ -133,20 +133,29 @@ function's author to write down the concrete type implementing the bound. 
> with the language to understand, and for somebody familiar with the compiler to implement. > This should get into specifics and corner-cases, and include examples of how the feature is used. -#### Syntax +As explained at the start of the RFC, the focus here is a relatively narrow +introduction of abstract types limited to the return type of inherent methods +and free functions. While we still need to resolve some of the core questions +about what an "abstract type" means even in these cases, we avoid some of the +complexities that come along with allowing the feature in other locations or +with other extensions. + +## Syntax Let's start with the bikeshed: The proposed syntax is `@Trait` in return type position, composing like trait objects to forms like `@(Foo+Send+'a)`. -The reason for choosing a sigil is ergonomics: Whatever the exact final -implementation will be capable of, you'd want it to be as easy to read/write -as trait objects, or else the more performant and idiomatic option would -be the more verbose one, and thus probably less used. +The reason for choosing a sigil is ergonomics: Whatever the exact final feature +will be capable of, you'd want it to be as easy to read/write as trait objects, +or else the more performant and idiomatic option would be the more verbose one, +and thus probably less used. -The argument can be made this decreases the google-ability of Rust syntax -(and this doesn't even talk about the _old_ `@T` pointer semantic the internet is still littered with), -but this would be somewhat mitigated by the feature being supposedly used commonly once it lands, -and can be explained in the docs as being short for `abstract` or `anonym`. 
+The argument can be made this decreases the google-ability of Rust syntax (and +this doesn't even talk about the _old_ `@T` pointer semantic the internet is +still littered with), but this would be somewhat mitigated by the feature being +supposedly used commonly once it lands, and can be explained in the docs as +being short for `abstract` or `anonym`. And in any case, it's a problem we +already suffer with `&T` and `&mut T`. If there are good reasons against `@`, there is also the choice of `~`. All points from above still apply, except `~` is a bit rarer in language @@ -157,37 +166,41 @@ there is also the option of using keyword-based syntax like `impl Trait` or `abstract Trait`, but this would add a verbosity overhead for a feature that will be used somewhat commonly. -#### Semantics +## Semantics The core semantics of the feature is described below. Note that the sections after this one go into more detail on some of the design -decisions, and that it is likely for most of the mentioned limitations to be -lifted at some point in the future. - -- `@Trait` may only be written at return type position - of a freestanding or inherent-impl function, not in trait definitions, - closure traits, function pointers, or any non-return type position. -- The function body can return values of any type that implements Trait, - but all return values need to be of the same type. -- As far as the typesystem and the compiler is concerned, - the return type outside of the function would - not be a entirely "new" type, nor would it be - a simple type alias. Rather, its semantic would be very similar to that of - _generic type paramters_ inside a function, with small differences caused - by being an _output_ rather than an _input_ of the function. +decisions, and that **it is likely for many of the mentioned limitations to be +lifted at some point in the future**. 
For clarity, we'll separately categorize the *core +semantics* of the feature (aspects that would stay unchanged with future extensions) +and the *initial limitations* (which are likely to be lifted later). + +**Core semantics**: + +- If a function returns `@Trait`, its body can return values of any type that + implements `Trait`, but all return values need to be of the same type. + +- As far as the typesystem and the compiler are concerned, the return type + outside of the function would not be an entirely "new" type, nor would it be a + simple type alias. Rather, its semantics would be very similar to that of + _generic type parameters_ inside a function, with small differences caused by + being an _output_ rather than an _input_ of the function. + - The type would be known to implement the specified traits. - The type would not be known to implement any other trait, with - the exception of OIBITS and default traits like `Sized`. + the exception of OIBITS (aka "auto traits") and default traits like `Sized`. - The type would not be considered equal to the actual underlying type. - - The type would not be allowed to be implemented on. - - The type would be unnameable, just like closures and function items. -- Because OIBITS like `Send` and `Sync` will leak through an - abstract return type, there will be some additional complexity in the - compiler due to some non-local type checking becoming necessary. -- The return type has a identity based on all generic parameters the + - The type would not be allowed to appear as the Self type for an `impl` block. + +- Because OIBITS like `Send` and `Sync` will leak through an abstract return + type, there will be some additional complexity in the compiler due to some + non-local type checking becoming necessary. + +- The return type has an identity based on all generic parameters the function body is parametrized by, and by the location of the function in the module system.
This means type equality behaves like this: + ```rust fn foo(t: T) -> @Trait { t @@ -205,8 +218,33 @@ lifted at some point in the future. equal_type(foo::(false), foo::(0)); // ERROR, `@Trait {foo}` is not the same type as `@Trait {foo}` ``` -- The function body can not see through its own return type, so code like this +- The code generation passes of the compiler would not draw a distinction + between the abstract return type and the underlying type, just like they don't + for generic parameters. This means: + - The same trait code would be instantiated, for example, `-> @Any` + would return the type id of the underlying type. + - Specialization would specialize based on the underlying type. + +**Initial limitations**: + +- `@Trait` may only be written at return type position of a freestanding or + inherent-impl function, not in trait definitions, closure traits, function + pointers, or any non-return type position. + + - Eventually, we will want to allow the feature to be used within traits, and + likely in argument position as well (as an ergonomic improvement over today's generics). + +- The type produced when a function returns `@Trait` would be effectively + unnameable, just like closures and function items. + + - We will almost certainly want to lift this limitation in the long run, so + that abstract return types can be placed into structs and so on. There are a + few ways we could do so, all related to getting at the "output type" of a + function given all of its generic arguments. + +- The function body cannot see through its own return type, so code like this would be forbidden just like on the outside: + ```rust fn sum_to(n: u32) -> @Display { if n == 0 { @@ -217,37 +255,58 @@ lifted at some point in the future. } ``` -- The code generation passes of the compiler would - not draw a distinction between the abstract return type and the underlying type, - just like they don't for generic paramters.
This means: - - The same trait code would be instantiated, for example, `-> @Any` - would return the type id of the underlying type. - - Specialization would specialize based on the underlying type. + - It's unclear whether we'll want to lift this limitation, but it should be possible to do so. + +## Rationale + +### Why this semantics for the return type? + +There has been a lot of discussion about what the semantics of the return type +should be, with the theoretical extremes being "full return type inference" and +"fully abstract type that behaves like a autogenerated newtype wrapper". (This +was in fact the main focus of the +[blog post](http://aturon.github.io/blog/2015/09/28/impl-trait/) on `impl +Trait`.) + +The design as choosen in this RFC lies somewhat in between those two, since it +allows OIBITs to leak through, and allows specialization to "see" the full type +being returned. That is, `@Trait` does not attempt to be a "tightly sealed" +abstraction boundary. The rationale for this design is a mixture of pragmatics +and principles. + +#### Specialization transparency -#### Why this semantic for the return type? +**Principles for specialization transparency**: -There has been a lot of discussion about what the semantic of -the return type should be, with the theoretical extremes being "full return type inference" and "fully abstract type that behaves like a autogenerated newtype wrapper" +The [specialization RFC](https://github.com/rust-lang/rfcs/pull/1210) has given +us a basic principle for how to understand bounds in function generics: they +represent a *minimum* contract between the caller and the callee, in that the +caller must meet at least those bounds, and the callee must be prepared to work +with any type that meets at least those bounds. However, with specialization, +the callee may choose different behavior when additional bounds hold. 
-The design as choosen in this RFC lies somewhat in between those two, -for the following reasons: +This RFC abides by a similar interpretation for return types: the signature +represents the minimum bound that the callee must satisfy, and the caller must +be prepared to work with any type that meets at least that bound. Again, with +specialization, the caller may dispatch on additional type information beyond +those bounds. -- Usage of this feature should not imply worse performance - than not using it, so specialization and codegeneration has to - treat it the same. -- Likewise, there should not be any bad interactions - caused by part of the typesystem treating the return type different - than other parts, so it should not have its own "identity" - in the sense of allowing additional or different trait or inherent implementations. -- It should not enable return type inference in item signatures, - so the exact underlying type needs to be hidden. -- It should not cause type errors to change the function - body and/or the underlying type as long as the specifed trait - bounds are still satisfied. -- As a exception to the above, it should not act as a barrier to OIBITs like - `Send` and `Sync` due to ergonomic reasons. For more details, see next section. +In other words, to the extent that returning `@Trait` is intended to be +symmetric with taking a generic `T: Trait`, transparency with respect to +specialization maintains that symmetry. -#### OIBIT semantic +**Pragmatics for specialization transparency**: + +The practical reason we want `@Trait` to be transparent to specialization is the +same as the reason we want specialization in the first place: to be able to +break through abstractions with more efficient special-case code. + +This is particularly important for one of the primary intended usecases: +returning `@Iterator`. 
We are very likely to employ specialization for various +iterator types, and making the underlying return type invisible to +specialization would lose out on those efficiency wins. + +#### OIBIT transparency OIBITs leak through an abstract return type. This might be considered controversial, since it effectively opens a channel where the result of function-local type inference affects @@ -255,32 +314,49 @@ item-level API, but has been deemed worth it for the following reasons: - Ergonomics: Trait objects already have the issue of explicitly needing to declare `Send`/`Sync`-ability, and not extending this problem to abstract - return types is desireable. In practice, most uses - of this feature would have to add explicit bounds for OIBITS - if they wanted to be maximally usable. + return types is desireable. In practice, most uses of this feature would have + to add explicit bounds for OIBITS if they wanted to be maximally usable. + - Low real change, since the situation already somewhat exists on structs with private fields: - In both cases, a change to the private implementation might change whether a OIBIT is implemented or not. - In both cases, the existence of OIBIT impls is not visible without doc tools - In both cases, you can only assert the existence of OIBIT impls - by adding explicit trait bounds either to the API or to the crate's testsuite. + by adding explicit trait bounds either to the API or to the crate's testsuite. + +In fact, a large part of the point of OIBITs in the first place was to cut +across abstraction barriers and provide information about a type without the +type's author having to explicitly opt in. + +This means, however, that it has to be considered a silent breaking change to +change a function with a abstract return type in a way that removes OIBIT impls, +which might be a problem. (As noted above, this is already the case for `struct` +definitions.) 
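The leakage described above can be observed with a small sketch. It assumes the `impl Trait` spelling that was later stabilized, standing in for this RFC's placeholder `@Trait`; the names `counter`, `assert_send` and `sum_after_send_check` are hypothetical and chosen only for illustration.

```rust
// The signature promises only `Iterator<Item = u32>`; it says nothing about `Send`.
fn counter() -> impl Iterator<Item = u32> {
    (0..3).map(|n| n + 1)
}

// Compiles only when `T` is `Send`.
fn assert_send<T: Send>(_value: &T) {}

fn sum_after_send_check() -> u32 {
    let it = counter();
    // Accepted because the hidden underlying type happens to be `Send`.
    // If the body of `counter` started capturing, say, an `Rc`, this call
    // would stop compiling even though the signature did not change.
    assert_send(&it);
    it.sum()
}
```

This is the "silent breaking change" risk in miniature: the caller depends on a property that only the function body, not the signature, determines.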
+ +But since the number of used OIBITs is relatvly small, deducing the return type +in a function body and reasoning about whether such a breakage will occur has +been deemed as a manageable amount of work. -This means, however, that it has to be considered a silent breaking change -to change a function with a abstract return type -in a way that removes OIBIT impls, which might be a problem. +#### Wherefore type abstraction? -But since the number of used OIBITs is relatvly small, -deducing the return type in a function body and reasoning -about whether such a breakage will occur has been deemed as a manageable amount of work. +In the [most recent RFC](https://github.com/rust-lang/rfcs/pull/1305) related to +this feature, a more "tightly sealed" abstraction mechanism was +proposed. However, part of the discussion on specialization centered on +precisely the issue of what type abstraction provides and how to achieve it. A +particular salient point there is that, in Rust, *privacy* is already our +primary mechanism for hiding +(["privacy is the new parametricity"](https://github.com/rust-lang/rfcs/pull/1210#issuecomment-181992044)). In +practice, that means that if you want opacity against specialization, you should +use something like a newtype. -#### Anonymity +### Anonymity -A abstract return type can not be named - this is similar to how closures +A abstract return type can not be named -- this is similar to how closures and function items are already unnameable types, and might be considered -a problem because it makes it not possible to build explicitly typed API +a problem because it makes it not possible to build explicitly typed APIs around the return type of a function. -The current semantic has been chosen for consistency and simplicity, +The current semantics has been chosen for consistency and simplicity, since the issue already exists with closures and function items, and a solution to them will also apply here. 
@@ -289,7 +365,7 @@ abstract return types could get upgraded to having a name transparently. Likewise, if `typeof` makes it into the language, then you could refer to the return type of a function without naming it. -#### Limitation to only return type position +### Limitation to only return type position There have been various proposed additional places where abstract types might be usable. For example, `fn x(y: @Trait)` as shorthand for @@ -300,7 +376,7 @@ locations are yet unclear (`@Trait` would effectively behave completely different before and after the `->`), this has also been excluded from this proposal. -#### Type transparency in recursive functions +### Type transparency in recursive functions Functions with abstract return types can not see through their own return type, making code like this not compile: @@ -341,13 +417,13 @@ fn sum_to(n: u32) -> @Display { } ``` -#### Not legal in function pointers/closure traits +### Not legal in function pointers/closure traits Because `@Trait` defines a type tied to the concrete function body, it does not make much sense to talk about it separately in a function signature, so the syntax is forbidden there. -#### Compability with conditional trait bounds +### Compability with conditional trait bounds On valid critique for the existing `@Trait` proposal is that it does not cover more complex scenarios, where the return type would implement @@ -368,7 +444,7 @@ Using just `-> @Iterator`, this would not be possible to reproduce. Since there has been no proposals so far that would address this in a way that would conflict with the fixed-trait-set case, this RFC punts on that issue as well. -#### Limitation to free/inherent functions +### Limitation to free/inherent functions One important usecase of abstract return types is to use them in trait methods. 
From 47a25e89a2af24759351d71cf7c8e90ff44014ed Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Thu, 25 Feb 2016 16:33:34 -0800 Subject: [PATCH 0765/1195] Edits to rationale for anonymity and return type position --- text/0000-conservative-impl-trait.md | 26 +++++++++----------------- 1 file changed, 9 insertions(+), 17 deletions(-) diff --git a/text/0000-conservative-impl-trait.md b/text/0000-conservative-impl-trait.md index 75210208cd2..29d8d839fbf 100644 --- a/text/0000-conservative-impl-trait.md +++ b/text/0000-conservative-impl-trait.md @@ -351,19 +351,12 @@ use something like a newtype. ### Anonymity -A abstract return type can not be named -- this is similar to how closures -and function items are already unnameable types, and might be considered -a problem because it makes it not possible to build explicitly typed APIs -around the return type of a function. - -The current semantics has been chosen for consistency and simplicity, -since the issue already exists with closures and function items, and -a solution to them will also apply here. - -For example, if named abstract types get added, then existing -abstract return types could get upgraded to having a name transparently. -Likewise, if `typeof` makes it into the language, then you could refer to the -return type of a function without naming it. +A abstract return type cannot be named in this proposal, which means that it +cannot be placed into `structs` and so on. This is not a fundamental limitation +in any sense; the limitation is there both to keep this RFC simple, and because +the precise way we might want to allow naming of such types is still a bit +unclear. Some possibilities include a `typeof` operator, or explicit named +abstract types. ### Limitation to only return type position @@ -371,10 +364,9 @@ There have been various proposed additional places where abstract types might be usable. For example, `fn x(y: @Trait)` as shorthand for `fn x<T: Trait>(y: T)`.
-Since the exact semantic and user experience for these -locations are yet unclear -(`@Trait` would effectively behave completely different before and after the `->`), -this has also been excluded from this proposal. +Since the exact semantics and user experience for these locations are yet +unclear (`@Trait` would effectively behave completely different before and after +the `->`), this has also been excluded from this proposal. ### Type transparency in recursive functions From dfc73d9cb34accced1bf40433a4e41b2c65030f7 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Thu, 25 Feb 2016 16:33:41 -0800 Subject: [PATCH 0766/1195] Edits to drawbacks and alternatives --- text/0000-conservative-impl-trait.md | 30 ++++++++++++++++++++++++---- 1 file changed, 26 insertions(+), 4 deletions(-) diff --git a/text/0000-conservative-impl-trait.md b/text/0000-conservative-impl-trait.md index 29d8d839fbf..4a59066bb88 100644 --- a/text/0000-conservative-impl-trait.md +++ b/text/0000-conservative-impl-trait.md @@ -503,20 +503,42 @@ so forbidding them in traits seems like the best initial course of action. > Why should we *not* do this? +## Drawbacks due to the proposal's minimalism + As has been elaborated on above, there are various way this feature could be -extended and combined with the language, so implementing it might -cause issues down the road if limitations or incompatibilities become apparent. +extended and combined with the language, so implementing it might cause issues +down the road if limitations or incompatibilities become apparent. However, +variations of this RFC's proposal have been under discussion for quite a long +time at this point, and this proposal is carefully designed to be +future-compatible with them, while resolving the core issue around transparency. 
+ +A drawback of limiting the feature to return type position (and not arguments) +is that it creates a somewhat inconsistent mental model: it forces you to +understand the feature in a highly special-cased way, rather than as a general +way to talk about unknown-but-bounded types in function signatures. This could +be particularly bewildering to newcomers, who must choose between `T: Trait`, +`Box<Trait>`, and `@Trait`, with the latter only usable in one place. + +## Drawbacks due to partial transparency + +The fact that specialization and OIBITs can "see through" `@Trait` may be +surprising, to the extent that one wants to see `@Trait` as an abstraction +mechanism. However, as the RFC argued in the rationale section, this design is +probably the most consistent with our existing post-specialization abstraction +mechanisms, and lead to the relatively simple story that *privacy* is the way to +achieve hiding in Rust. # Alternatives [alternatives]: #alternatives > What other designs have been considered? What is the impact of not doing this? -See the links in the motivation section for a more detailed analysis. +See the links in the motivation section for detailed analysis that we won't +repeat here. But basically, without this feature certain things remain hard or impossible to do in Rust, like returning a efficiently usable type parametricised by -types private to a function body, for example a iterator adapter containing a closure. +types private to a function body, for example an iterator adapter containing a closure.
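To make the "iterator adapter containing a closure" case concrete, here is a minimal sketch. It assumes the later-stabilized `impl Trait` spelling in place of this RFC's placeholder `@Trait`, and `evens_scaled` is a hypothetical name. The returned type mentions two closure types that cannot be written out by hand, which is why a named return type is not an option here.

```rust
// Returns a `Map<Filter<Range<u32>, {closure}>, {closure}>` without naming it.
// No boxing or dynamic dispatch is involved.
fn evens_scaled(limit: u32, factor: u32) -> impl Iterator<Item = u32> {
    (0..limit)
        .filter(|n| n % 2 == 0)   // closure type private to this body
        .map(move |n| n * factor) // another unnameable closure type
}
```

A caller can use the result like any other iterator, for example `evens_scaled(7, 10).collect::<Vec<_>>()`.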
# Unresolved questions [unresolved]: #unresolved-questions From a7b88b41bd156ad3f1d24ed1511ff42448be746e Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Thu, 25 Feb 2016 16:35:25 -0800 Subject: [PATCH 0767/1195] Edits to unresolved questions --- text/0000-conservative-impl-trait.md | 13 ++++--------- 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/text/0000-conservative-impl-trait.md b/text/0000-conservative-impl-trait.md index 4a59066bb88..c508053621e 100644 --- a/text/0000-conservative-impl-trait.md +++ b/text/0000-conservative-impl-trait.md @@ -545,12 +545,7 @@ types private to a function body, for example an iterator adapter containing a c > What parts of the design are still TBD? -- What happens if you specialize a function with an abstract return type, - and differ in whether the return type implements an OIBIT or not? - - It would mean that specialization choice - has to flow back into typechecking. - - It seems sound, but would mean that different input type combinations - of such a function could cause different OIBIT behavior independent - of the input type parameters themself. - - Which would not necessarily be an issue, since the actual type could not - be observed from the outside anyway. +The precise implementation details for OIBIT transparency are a bit unclear: in +general, it means that type checking may need to proceed in a particular order, +since you cannot get the full type information from the signature alone (you +have to typecheck the function body to determine which OIBITs apply). From a370ca67313fe5887c52d9532fd6bca141335c82 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 25 Feb 2016 17:38:47 -0800 Subject: [PATCH 0768/1195] RFC: Stabilize implementing panics as aborts * Stabilize the `-Z no-landing-pads` flag under the name `-C unwind=val` * Implement a number of unstable features akin to custom allocators to swap out implementations of panic just before a final product is generated. 
* Add a `[profile.dev]` option to Cargo to disable unwinding. --- text/0000-less-unwinding.md | 281 ++++++++++++++++++++++++++++++++++++ 1 file changed, 281 insertions(+) create mode 100644 text/0000-less-unwinding.md diff --git a/text/0000-less-unwinding.md b/text/0000-less-unwinding.md new file mode 100644 index 00000000000..d89194e2326 --- /dev/null +++ b/text/0000-less-unwinding.md @@ -0,0 +1,281 @@ +- Feature Name: `panic_runtime` +- Start Date: 2016-02-25 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Stabilize implementing panics as aborts. + +* Stabilize the `-Z no-landing-pads` flag under the name `-C panic=strategy` +* Implement a number of unstable features akin to custom allocators to swap out + implementations of panic just before a final product is generated. +* Add a `[profile.dev]` option to Cargo to configure how panics are implemented. + +# Motivation +[motivation]: #motivation + +Panics in Rust have long since been implemented with the intention of being +caught at particular boundaries (for example the thread boundary). This is quite +useful for isolating failures in Rust code, for example: + +* Servers can avoid taking down the entire process but can instead just take + down one request. +* Embedded Rust libraries can avoid taking down the entire process and can + instead gracefully inform the caller that an internal logic error occurred. +* Rust applications can isolate failure from various components. The classical + example of this is Servo can display a "red X" for an image which fails to + decode instead of aborting the entire browser or killing an entire page. 
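As a rough illustration of catching a panic at the thread boundary, the sketch below isolates a unit of work on its own thread. The helper name `run_isolated` is hypothetical, and the behaviour shown assumes the default unwinding strategy; with panics compiled as aborts the whole process would terminate instead.

```rust
use std::thread;

// Runs `work` on a separate thread and reports a panic as an `Err`
// instead of letting it take down the caller.
fn run_isolated<F>(work: F) -> Result<u32, String>
where
    F: FnOnce() -> u32 + Send + 'static,
{
    thread::spawn(work)
        .join() // `Err` if the spawned thread unwound due to a panic
        .map_err(|_| "worker panicked".to_string())
}
```

A server could wrap each request this way, so that one failing request yields an `Err` while the rest of the process keeps running.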
+ +While these are examples where a recoverable panic is useful, there are many +applications where recovering panics is undesirable or doesn't lead to anything +productive: + +* Rust applications which use `Result` for error handling typically use `panic!` + to indicate a fatal error, in which case the process *should* be taken down. +* Many applications simply can't recover from an internal assertion failure, so + there's no need trying to recover it. +* To implement a recoverable panic, the compiler and standard library use a + method called stack unwinding. The compiler must generate code to support this + unwinding, however, and this takes time in codegen and optimizers. +* Low-level applications typically don't use unwinding at all as there's no + stack unwinder (e.g. kernels). + +> **Note**: as an idea of the compile-time and object-size savings from +> disabling the extra codegen, compiling Cargo as a library is 11% faster (16s +> from 18s) and 13% smaller (15MB to 13MB). Sizable gains! + +Overall, the ability to recover panics is something that needs to be decided at +the application level rather than at the language level. Currently the compiler +does not support the ability to translate panics to process aborts in a stable +fashion, and the purpose of this RFC is to add such a venue. + +With such an important codegen option, however, as whether or not exceptions can +be caught, it's easy to get into a situation where libraries of mixed +compilation modes are linked together, causing odd or unknown errors. This RFC +proposes a situation similar to the design of custom allocators to alleviate +this situation. + +# Detailed design +[design]: #detailed-design + +The major goal of this RFC is to develop a work flow around managing crates +which wish to disable unwinding. This intends to set forth a complete vision for +how these crates interact with the ecosystem at large. Much of this design will +be similar to the [custom allocator RFC][custom-allocators]. 
+ +[custom-allocators]: https://github.com/rust-lang/rfcs/blob/master/text/1183-swap-out-jemalloc.md + +### High level design + +This section serves as a high-level tour through the design proposed in this +RFC. The linked sections provide more complete explanation as to what each step +entails. + +* The compiler will have a [new stable flag](#new-compiler-flags), `-C panic` + which will configure how unwinding-related code is generated. +* [Two new unstable attributes](#panic-attributes) will be added to the + compiler, `#![needs_panic_runtime]` and `#![panic_runtime]`. The standard + library will need a runtime and will be lazily linked to a crate which has + `#![panic_runtime]`. +* [Two unstable crates](#panic-crates) tagged with `#![panic_runtime]` will be + distributed as the runtime implementation of panicking, `panic_abort` and + `panic_unwind` crates. The former will translate all panics to process + aborts, whereas the latter will be implemented as unwinding is today, via the + system stack unwinder. +* [Cargo will gain](#cargo-changes) a new `panic` option in the `[profile.foo]` + sections to indicate how that profile should compile panic support. + +### New Compiler Flags + +The first component to this design is to have a **stable** flag to the compiler +which configures how panic-related code is generated. This will be +stabilized in the form: + +``` +$ rustc -C help + +Available codegen options: + + ... + -C panic=val -- strategy to compile in for panic related code + ... +``` + +There will currently be two supported strategies: + +* `unwind` - this is what the compiler implements by default today via the + `invoke` LLVM instruction. +* `abort` - this will implement that `-Z no-landing-pads` does today, which is + to disable the `invoke` instruction and use `call` instead everywhere. + +This codegen option will default to `unwind` if not specified (what happens +today), and the value will be encoded into the crate metadata. 
This option is +planned with extensibility in mind to future panic strategies if we ever +implement some (return-based unwinding is at least one other possible option). + +### Panic Attributes + +Very similarly to [custom allocators][allocator-attributes], two new +**unstable** crate attributes will be added to the compiler: + +[allocator-attributes]: https://github.com/rust-lang/rfcs/blob/master/text/1183-swap-out-jemalloc.md#new-attributes + +* `#![needs_panic_runtime]` - indicates that this crate requires a "panic + runtime" to link correctly. This will be attached to the standard library and + is not intended to be attached to any other crate. +* `#![panic_runtime]` - indicates that this crate is a runtime implementation of + panics. + +As with allocators, there are a number of limitations imposed by these +attributes by the compiler: + +* Any crate DAG can only contain at most one instance of `#![panic_runtime]`. +* Implicit dependency edges are drawn from crates tagged with + `#![needs_panic_runtime]` to those tagged with `#![panic_runtime]`. Loops as + usual are forbidden (e.g. a panic runtime can't depend on libstd). +* Complete artifacts which include a crate tagged with `#![needs_panic_runtime]` + must include a panic runtime. This includes executables, dylibs, and + staticlibs. If no panic runtime is explicitly linked, then the compiler will + select an appropriate runtime to inject. +* Finally, the compiler will ensure that panic runtimes and compilation modes + are not mismatched. For a resolved DAG, the panic runtime will have been + compiled with a particular `-C panic` option, let's call it PS (panic + strategy). If PS is "abort", then no validation is performed (doesn't matter + how the rest of the DAG is compiled). Otherwise, all other crates must also + be compiled with the same PS. + +The purpose of these limitations is to solve a number of problems that arise +when switching panic strategies. 
For example with aborting panic crates won't +have to link to runtime support of unwinding, or rustc will disallow mixing +panic strategies by accident. + +The actual API of panic runtimes will not be detailed in this RFC. These new +attributes will be unstable, and consequently the API itself will also be +unstable. It suffices to say, however, that like custom allocators a panic +runtime will implement some public `extern` symbols known to the crates that +need a panic runtime, and that's how they'll communicate/link up. + +### Panic Crates + +Two new **unstable** crates will be added to the distribution for each target: + +* `panic_unwind` - this is an extraction of the current implementation of + panicking from the standard library. It will use the same mechanism of stack + unwinding as is implemented on all current platforms. +* `panic_abort` - this is a new implementation of panicking which will simply + translate unwinding to process aborts. There will be no runtime support + required by this crate. + +The compiler will assume that these crates are distributed for each platform +where the standard library is also distributed (e.g. a crate that has +`#![needs_panic_runtime]`). + +### Compiler defaults + +The compiler will ship with a few defaults which affect how panic runtimes are +selected in Rust programs. Specifically: + +* The `-C panic` option will default to **unwind** as it does today. +* The libtest crate will explicitly link to `panic_unwind`. The test runner that + libtest implements relies on equating panics with failure and cannot work if + panics are translated to aborts. +* If no panic runtime is explicitly selected, the compiler will employ the + following logic to decide what panic runtime to inject: + + 1. If any crate in the DAG is compiled with `-C panic=abort`, then `panic_abort` + will be injected. + 2. If all crates in the DAG are compiled with `-C panic=unwind`, then + `panic_unwind` is injected. 
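The injection and mismatch rules described above can be sketched as plain functions. This is only an illustration of the stated logic, not compiler code: `PanicStrategy`, `runtime_to_inject` and `strategies_compatible` are hypothetical names, and a crate graph is modelled as a slice of per-crate strategies.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PanicStrategy {
    Unwind,
    Abort,
}

// Default runtime choice when none is linked explicitly:
// any crate built with `-C panic=abort` selects `panic_abort`,
// otherwise `panic_unwind` is injected.
fn runtime_to_inject(crates: &[PanicStrategy]) -> &'static str {
    if crates.contains(&PanicStrategy::Abort) {
        "panic_abort"
    } else {
        "panic_unwind"
    }
}

// Mismatch check: an aborting runtime accepts any mix of crates,
// any other runtime requires every crate to use the same strategy.
fn strategies_compatible(runtime: PanicStrategy, others: &[PanicStrategy]) -> bool {
    match runtime {
        PanicStrategy::Abort => true,
        strategy => others.iter().all(|s| *s == strategy),
    }
}
```

Under this model, a graph where only one dependency was built with `-C panic=abort` still ends up with `panic_abort`, which matches the "silent switch" concern raised in the drawbacks.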
+ +### Cargo changes + +In order to export this new feature to Cargo projects, a new option will be +added to the `[profile]` section of manifests: + +```toml +[profile.dev] +panic = 'unwind' +``` + +This will cause Cargo to pass `-C panic=unwind` to all `rustc` invocations for +a crate graph. Cargo will have special knowledge, however, that for `cargo +test` it cannot pass `-C panic=abort`. + +# Drawbacks +[drawbacks]: #drawbacks + +* The implementation of custom allocators was no small feat in the compiler, and + much of this RFC is essentially the same thing. Similar infrastructure can + likely be leveraged to alleviate the implementation complexity, but this is + undeniably a large change to the compiler for albeit a relatively minor + option. The counter point to this, however, is that disabling unwinding in a + principled fashion provides far higher quality error messages, prevents + erroneous situations, and provides an immediate benefit for many Rust users + today. + +* The binary distribution of the standard library will not change from what it + is today. In other words, the standard library (and dependency crates like + libcore) will be compiled with `-C panic=unwind`. This introduces the + opportunity for extra code bloat or missed optimizations in applications that + end up disabling unwinding in the long run. Distribution, however, is *far* + easier because there's only one copy of the standard library and we don't have + to rely on any other form of infrastructure. + +* This represents a proliferation of the `#![needs_foo]` and `#![foo]` style + system that allocators have begun. This may be indicative of a deeper + underlying requirement here of the standard library or perhaps showing how the + strategy in the standard library needs to change. 
If the standard library were + a crates.io crate it would arguably support these options via Cargo features, + but without that option is this the best way to be implementing these switches + for the standard library? + +* Applications may silently revert to the wrong panic runtime given the + heuristics here. For example if an application relies on unwinding panics, if + a dependency is pulled in with an explicit `extern crate panic_abort`, then + the entire application will switch to aborting panics silently. This can be + corrected, however, with an explicit `extern crate panic_unwind` on behalf of + the application. + +# Alternatives +[alternatives]: #alternatives + +* Currently this RFC allows mixing multiple panic runtimes in a crate graph so + long as the actual runtime is compiled with `-C panic=abort`. This is + primarily done to immediately reap benefit from `-C panic=abort` even though + the standard library we distribute will still have unwinding support compiled + in (compiled with `-C panic=unwind`). In the not-too-distant future however, + we will likely be poised to distribute multiple binary copies of the standard + library compiled with different profiles. We may be able to tighten this + restriction on behalf of the compiler, requiring that all crates in a DAG have + the same `-C panic` compilation mode, but there would unfortunately be no + immediate benefit to implementing the RFC from users of our precompiled + nightlies. + + This alternative, additionally, can also be viewed as a drawback. It's unclear + what a future libstd distribution mechanism would look like and how this RFC + might interact with it. Stabilizing disabling unwinding via a compiler switch + or a Cargo profile option may not end up meshing well with the strategy we + pursue with shipping multiple standard libraries. 
+ +* Instead of the panic runtime support in this RFC, we could instead just ship + two different copies of the standard library where one simply translates + panics to abort instead of unwinding. This is unfortunately very difficult + for Cargo or the compiler to track, however, to ensure that the codegen + option of how panics are translated is propagated throughout the rest of + the crate graph. Additionally it may be easy to mix up crates of different + panic strategies. + +# Unresolved questions +[unresolved]: #unresolved-questions + +* One possible implementation of unwinding is via return-based flags. Much of + this RFC is designed with the intention of supporting arbitrary unwinding + implementations, but it's unclear whether it's too heavily biased towards + panic is either unwinding or aborting. + +* The current implementation of Cargo would mean that a native implementation of + the profile option would cause recompiles between `cargo build` and `cargo + test` for projects that specify `panic = 'unwind'`. Is this acceptable? Should + Cargo cache both copies of the crate? From 490d1e35af50ea267f9554f2934b39514bbb3c56 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 26 Feb 2016 15:57:56 -0800 Subject: [PATCH 0769/1195] Typos --- text/0000-less-unwinding.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-less-unwinding.md b/text/0000-less-unwinding.md index d89194e2326..f6caacc191f 100644 --- a/text/0000-less-unwinding.md +++ b/text/0000-less-unwinding.md @@ -275,7 +275,7 @@ test` it cannot pass `-C panic=abort`. implementations, but it's unclear whether it's too heavily biased towards panic is either unwinding or aborting. -* The current implementation of Cargo would mean that a native implementation of +* The current implementation of Cargo would mean that a naive implementation of the profile option would cause recompiles between `cargo build` and `cargo - test` for projects that specify `panic = 'unwind'`. 
Is this acceptable? Should + test` for projects that specify `panic = 'abort'`. Is this acceptable? Should Cargo cache both copies of the crate? From 5c6be3f4c6c739d3077615d820bbc484d4efccd4 Mon Sep 17 00:00:00 2001 From: jethrogb Date: Fri, 26 Feb 2016 19:48:10 -0800 Subject: [PATCH 0770/1195] Stray "try" in RFC 243 --- text/0243-trait-based-exception-handling.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0243-trait-based-exception-handling.md b/text/0243-trait-based-exception-handling.md index c26500bee48..946428d000f 100644 --- a/text/0243-trait-based-exception-handling.md +++ b/text/0243-trait-based-exception-handling.md @@ -210,7 +210,7 @@ are merely one way. * Construct: - try { + catch { foo()?.bar() } From 433edd99f195d013c2d6e3a350fd2bd939cd88da Mon Sep 17 00:00:00 2001 From: Vadim Petrochenkov Date: Sun, 28 Feb 2016 16:19:41 +0300 Subject: [PATCH 0771/1195] Some improvements Mention type aliases Mention differences between unit structs / tuple struct constructors and `const` / `fn` items. Add more use cases for `TS{0: expr}`/`TS{0: pat}` --- text/0000-adt-kinds.md | 50 ++++++++++++++++++++++++++---------------- 1 file changed, 31 insertions(+), 19 deletions(-) diff --git a/text/0000-adt-kinds.md b/text/0000-adt-kinds.md index e5894ea5509..30a4f10678e 100644 --- a/text/0000-adt-kinds.md +++ b/text/0000-adt-kinds.md @@ -43,7 +43,8 @@ Braced structs have 0 or more user-named fields and are defined only in type nam Braced structs can be used in struct expressions `S{field1: expr, field2: expr}`, including functional record update (FRU) `S{field1: expr, ..s}`/`S{..s}` and with struct patterns `S{field1: pat, field2: pat}`/`S{field1: pat, ..}`/`S{..}`. -In all cases the path `S` of the expression or pattern is looked up in the type namespace. +In all cases the path `S` of the expression or pattern is looked up in the type namespace (so these +expressions/patterns can be used with type aliases). 
Fields of a braced struct can be accessed with dot syntax `s.field1`. Note: struct *variants* are currently defined in the value namespace in addition to type namespace, @@ -63,7 +64,7 @@ Unit structs can be thought of as a single declaration for two things: a basic s struct US {} ``` -and a constant with the same name +and a constant with the same name<sup>Note 1</sup> ``` const US: US = US{}; @@ -74,15 +75,16 @@ constant `US`) namespaces. As a basic struct, a unit struct can participate in struct expressions `US{}`, including FRU `US{..s}` and in struct patterns `US{}`/`US{..}`. In both cases the path `US` of the expression -or pattern is looked up in the type namespace. +or pattern is looked up in the type namespace (so these expressions/patterns can be used with type +aliases). Fields of a unit struct could also be accessed with dot syntax, but it doesn't have any fields. As a constant, a unit struct can participate in unit struct expressions `US` and unit struct patterns `US`, both of these are looked up in the value namespace in which the constant `US` is -defined. +defined (so these expressions/patterns cannot be used with type aliases). -Note 1: the constant is not exactly a `const` item, there are subtle differences, but it's a close -approximation. +Note 1: the constant is not exactly a `const` item, there are subtle differences (e.g. with regards +to `match` exhaustiveness), but it's a close approximation. Note 2: the constant is pretty weirdly namespaced in case of unit *variants*, constants can't be defined in "enum modules" manually. @@ -103,7 +105,7 @@ struct TS { } ``` -and a constructor function with the same name +and a constructor function with the same name<sup>Note 2</sup> ``` fn TS(arg0: Type0, arg1: Type1, arg2: Type2) -> TS { @@ -117,17 +119,21 @@ and the value (the constructor function `TS`) namespaces. 
As a basic struct, a tuple struct can participate in struct expressions `TS{0: expr, 1: expr}`, including FRU `TS{0: expr, ..ts}`/`TS{..ts}` and in struct patterns `TS{0: pat, 1: pat}`/`TS{0: pat, ..}`/`TS{..}`. -In both cases the path `TS` of the expression or pattern is looked up in the type namespace. -Fields of a braced tuple can be accessed with dot syntax `ts.0`. +In both cases the path `TS` of the expression or pattern is looked up in the type namespace (so +these expressions/patterns can be used with type aliases). +Fields of a tuple struct can be accessed with dot syntax `ts.0`. As a constructor, a tuple struct can participate in tuple struct expressions `TS(expr, expr)` and tuple struct patterns `TS(pat, pat)`/`TS(..)`, both of these are looked up in the value namespace -in which the constructor `TS` is defined. Tuple struct expressions `TS(expr, expr)` are usual +in which the constructor `TS` is defined (so these expressions/patterns cannot be used with type +aliases). Tuple struct expressions `TS(expr, expr)` are usual function calls, but the compiler reserves the right to make observable improvements to them based on the additional knowledge, that `TS` is a constructor. -Note: the automatically assigned field names are quite interesting, they are not identifiers -lexically (they are integer literals), so such fields can't be defined manually. +Note 1: the automatically assigned field names are quite interesting, they are not identifiers +lexically (they are integer literals), so such fields can't be defined manually. +Note 2: the constructor function is not exactly a `fn` item, there are subtle differences (e.g. with +regards to privacy checks), but it's a close approximation. ## Summary of the changes. @@ -143,15 +149,21 @@ implement as well. This also means that `S{..}` patterns can be used to match structures and variants of any kind. 
The desire to have such "match everything" patterns is sometimes expressed given that number of fields in structures and variants can change from zero to non-zero and back during -development. +development. +An extra benefit is the ability to match/construct tuple structs using their type aliases. New: Permit using tuple structs and tuple variants in braced struct patterns and expressions -requiring naming their fields - `TS{0: expr}`/`TS{0: pat}`/etc. This change looks a bit worse from -the cost/benefit point of view. There's not much motivation for it besides consistency and probably -shortening patterns like `ItemFn(name, _, _, _, _, _)` into something like `ItemFn{0: name, ..}`. -Automatic code generators (e.g. syntax extensions like `derive`) can probably benefit from the -ability to generate uniform code for all structure kinds as well. -The author of the RFC is ready to postpone or drop this particular extension at any moment. +requiring naming their fields - `TS{0: expr}`/`TS{0: pat}`/etc. +While this change is important for consistency, there's not much motivation for it in hand-written +code besides shortening patterns like `ItemFn(_, _, unsafety, _, _, _)` into something like +`ItemFn{2: unsafety, ..}` and the ability to match/construct tuple structs using their type aliases. +However, automatic code generators (e.g. syntax extensions) can get more benefits from the +ability to generate uniform code for all structure kinds. +`#[derive]`, for example, currently has separate code paths for generating expressions and patterns +for braced structs (`ExprStruct`/`PatKind::Struct`), tuple structs +(`ExprCall`/`PatKind::TupleStruct`) and unit structs (`ExprPath`/`PatKind::Path`). With the proposed +changes `#[derive]` could simplify its logic and always generate braced forms for expressions and +patterns. 
# Drawbacks [drawbacks]: #drawbacks From d69cd0a0adafbaffe33f62f3c38f65d272da1847 Mon Sep 17 00:00:00 2001 From: ubsan Date: Mon, 29 Feb 2016 17:41:42 -0800 Subject: [PATCH 0772/1195] First commit --- text/0000-copy-clone-semantics.md | 52 +++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+) create mode 100644 text/0000-copy-clone-semantics.md diff --git a/text/0000-copy-clone-semantics.md b/text/0000-copy-clone-semantics.md new file mode 100644 index 00000000000..af9f342c0c0 --- /dev/null +++ b/text/0000-copy-clone-semantics.md @@ -0,0 +1,52 @@ +- Feature Name: N/A +- Start Date: (fill me in with today's date, YYYY-MM-DD) +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +With specialization on the way, we need to talk about the semantics of +`<T as Clone>::clone() where T: Copy`. + +It's generally been an unspoken rule of Rust that a `clone` of a `Copy` type is +equivalent to a `memcpy` of that type; however, that fact is not documented +anywhere. This fact should be in the documentation for the `Clone` trait, just +like the fact that `T: Eq` should implement `a == b == c == a` rules. + +# Motivation +[motivation]: #motivation + +Currently, `Vec::clone()` is implemented by creating a new `Vec`, and then +cloning all of the elements from one into the other. This is slow in debug mode, +and may not always be optimized (although it often will be). Specialization +would allow us to simply `memcpy` the values from the old `Vec` to the new +`Vec`. However, if we don't actually specify this, we will not be able to do +this. + +# Detailed design +[design]: #detailed-design + +Simply add something like the following sentence to the documentation for the +`Clone` trait: + +"If `T: Copy`, `x: T`, and `y: &T`, then `let x = y.clone()` is equivalent to +`let x = *y`;" + +# Drawbacks +[drawbacks]: #drawbacks + +This is a breaking change, technically, although it breaks code that was +malformed in the first place. 
+ +# Alternatives +[alternatives]: #alternatives + +The alternative is that, for each type and function we would like to specialize +in this way, we document this separately. This is how we started off with +`clone_from_slice`. + +# Unresolved questions +[unresolved]: #unresolved-questions + +What the exact sentence should be. From 5542c80869c290ea7d371f26564fd83ff05fd5d8 Mon Sep 17 00:00:00 2001 From: ubsan Date: Mon, 29 Feb 2016 17:48:48 -0800 Subject: [PATCH 0773/1195] Add some modifications suggested by @durka --- text/0000-copy-clone-semantics.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/text/0000-copy-clone-semantics.md b/text/0000-copy-clone-semantics.md index af9f342c0c0..4bc8e906f9b 100644 --- a/text/0000-copy-clone-semantics.md +++ b/text/0000-copy-clone-semantics.md @@ -21,8 +21,8 @@ Currently, `Vec::clone()` is implemented by creating a new `Vec`, and then cloning all of the elements from one into the other. This is slow in debug mode, and may not always be optimized (although it often will be). Specialization would allow us to simply `memcpy` the values from the old `Vec` to the new -`Vec`. However, if we don't actually specify this, we will not be able to do -this. +`Vec` in the case of `T: Copy`. However, if we don't specify this, we will not +be able to, and we will be stuck looping over every value. # Detailed design [design]: #detailed-design @@ -30,8 +30,8 @@ this. Simply add something like the following sentence to the documentation for the `Clone` trait: -"If `T: Copy`, `x: T`, and `y: &T`, then `let x = y.clone()` is equivalent to -`let x = *y`;" +"If `T: Copy`, `x: T`, and `y: &T`, then `let x = y.clone();` is equivalent to +`let x = *y;`. Manual implementations must be careful to uphold this." 
# Drawbacks [drawbacks]: #drawbacks From 91629f9a72fcf22e04d0eea67a1850b7258453a8 Mon Sep 17 00:00:00 2001 From: ubsan Date: Mon, 29 Feb 2016 17:56:59 -0800 Subject: [PATCH 0774/1195] Fix some language for @aatch --- text/0000-copy-clone-semantics.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/text/0000-copy-clone-semantics.md b/text/0000-copy-clone-semantics.md index 4bc8e906f9b..fbfca12f061 100644 --- a/text/0000-copy-clone-semantics.md +++ b/text/0000-copy-clone-semantics.md @@ -27,7 +27,10 @@ be able to, and we will be stuck looping over every value. # Detailed design [design]: #detailed-design -Simply add something like the following sentence to the documentation for the +Specify that `<T as Clone>::clone(t)` shall be equivalent to `ptr::read(t)` +where `T: Copy, t: &T`. + +Also add something like the following sentence to the documentation for the `Clone` trait: "If `T: Copy`, `x: T`, and `y: &T`, then `let x = y.clone();` is equivalent to From 0feac231b3f82c6d954c3e2a2f7637b099c31ae1 Mon Sep 17 00:00:00 2001 From: ubsan Date: Mon, 29 Feb 2016 21:50:27 -0800 Subject: [PATCH 0775/1195] A poorly defined clone shan't result in UB --- text/0000-copy-clone-semantics.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/text/0000-copy-clone-semantics.md b/text/0000-copy-clone-semantics.md index fbfca12f061..066777860ba 100644 --- a/text/0000-copy-clone-semantics.md +++ b/text/0000-copy-clone-semantics.md @@ -28,7 +28,8 @@ be able to, and we will be stuck looping over every value. [design]: #detailed-design Specify that `<T as Clone>::clone(t)` shall be equivalent to `ptr::read(t)` -where `T: Copy, t: &T`. +where `T: Copy, t: &T`. An implementation that does not uphold this *shall not* +result in undefined behavior; `Clone` is not an `unsafe trait`. Also add something like the following sentence to the documentation for the `Clone` trait: @@ -52,4 +53,4 @@ in this way, we document this separately. 
This is how we started off with # Unresolved questions [unresolved]: #unresolved-questions -What the exact sentence should be. +What the exact wording should be. From ce52e56d5cc6dc54dd031aecba537294496a74c9 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Tue, 1 Mar 2016 20:03:19 +0100 Subject: [PATCH 0776/1195] Touched up the initial syntax example a bit --- text/0000-conservative-impl-trait.md | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/text/0000-conservative-impl-trait.md b/text/0000-conservative-impl-trait.md index c508053621e..158086ded66 100644 --- a/text/0000-conservative-impl-trait.md +++ b/text/0000-conservative-impl-trait.md @@ -15,17 +15,20 @@ initially being restricted to: Abstract return types allow a function to hide a concrete return type behind a trait interface similar to trait objects, while -still generating the same statically dispatched code as with concrete types: +still generating the same statically dispatched code as with concrete types. + +With the placeholder syntax used in discussions so far, +abstract return types would be used roughly like this: ```rust -fn foo(n: u32) -> @Iterator<Item=u32> { +fn foo(n: u32) -> impl Iterator<Item=u32> { (0..n).map(|x| x * 100) } -// ^ behaves as if it had return type Map<Range<u32>, Clos> -// where Clos = type of the |x| x * 100 closure. +// ^ behaves as if it had return type Map<Range<u32>, Closure> +// where Closure = type of the |x| x * 100 closure. for x in foo(10) { - // ... + // x = 0, 100, 200, ... 
} ``` From 74d669e3d89e8b0199a65a245006a7d18d2658c9 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Tue, 1 Mar 2016 20:25:32 +0100 Subject: [PATCH 0777/1195] Fixed typo and unneeded quote of template --- text/0000-conservative-impl-trait.md | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/text/0000-conservative-impl-trait.md b/text/0000-conservative-impl-trait.md index 158086ded66..3eb10107c9f 100644 --- a/text/0000-conservative-impl-trait.md +++ b/text/0000-conservative-impl-trait.md @@ -51,7 +51,7 @@ step requires resolving some of the core questions raised in [the blog post](http://aturon.github.io/blog/2015/09/28/impl-trait/). This RFC is closest in spirit to the -[original RFC]((https://github.com/rust-lang/rfcs/pull/105), and we'll repeat +[original RFC](https://github.com/rust-lang/rfcs/pull/105), and we'll repeat its motivation and some other parts of its text below. # Motivation @@ -132,10 +132,6 @@ function's author to write down the concrete type implementing the bound. # Detailed design [design]: #detailed-design -> This is the bulk of the RFC. Explain the design in enough detail for somebody familiar -> with the language to understand, and for somebody familiar with the compiler to implement. -> This should get into specifics and corner-cases, and include examples of how the feature is used. - As explained at the start of the RFC, the focus here is a relatively narrow introduction of abstract types limited to the return type of inherent methods and free functions. 
While we still need to resolve some of the core questions From 3e6f294fa516d6e2893195ac2ad05a62bf748c0d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Tue, 1 Mar 2016 20:44:18 +0100 Subject: [PATCH 0778/1195] Addressed that @Trait can be used anywhere in a return type --- text/0000-conservative-impl-trait.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-conservative-impl-trait.md b/text/0000-conservative-impl-trait.md index 3eb10107c9f..101176cb759 100644 --- a/text/0000-conservative-impl-trait.md +++ b/text/0000-conservative-impl-trait.md @@ -226,12 +226,14 @@ and the *initial limitations* (which are likely to be lifted later). **Initial limitations**: -- `@Trait` may only be written at return type position of a freestanding or +- `@Trait` may only be written within the return type of a freestanding or inherent-impl function, not in trait definitions, closure traits, function pointers, or any non-return type position. - Eventually, we will want to allow the feature to be used within traits, and like in argument position as well (as an ergonomic improvement over today's generics). + - Using `@Trait` multiple times in the same return type would be valid, + like for example `-> (@Foo, @Bar)`. - The type produced when a function returns `@Trait` would be effectively unnameable, just like closures and function items. 
From 5ee47dea80858a5eb4195a972e3587662f932863 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Tue, 1 Mar 2016 21:21:48 +0100 Subject: [PATCH 0779/1195] Made it clear that function pointers may still include @Trait if used as part of a return type of a freestanding function --- text/0000-conservative-impl-trait.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/text/0000-conservative-impl-trait.md b/text/0000-conservative-impl-trait.md index 101176cb759..2f3f6693c2a 100644 --- a/text/0000-conservative-impl-trait.md +++ b/text/0000-conservative-impl-trait.md @@ -227,13 +227,14 @@ and the *initial limitations* (which are likely to be lifted later). **Initial limitations**: - `@Trait` may only be written within the return type of a freestanding or - inherent-impl function, not in trait definitions, closure traits, function - pointers, or any non-return type position. + inherent-impl function, not in trait definitions or any non-return type position. They may also not appear + in the return type of closure traits or function pointers, + unless these are themselves part of a legal return type. - Eventually, we will want to allow the feature to be used within traits, and likely in argument position as well (as an ergonomic improvement over today's generics). - Using `@Trait` multiple times in the same return type would be valid, - like for example `-> (@Foo, @Bar)`. + like for example in `-> (@Foo, @Bar)`. - The type produced when a function returns `@Trait` would be effectively unnameable, just like closures and function items. 
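As a hedged illustration of the limitation discussed in this patch: the sketch below uses the `impl Trait` spelling that Rust later adopted in place of the placeholder `@Trait`, and shows two abstract types in a single return type, i.e. the `-> (@Foo, @Bar)` case. The traits `Foo` and `Bar`, their methods, the types `Answer` and `Label`, and the functions `pair` and `hundreds` are hypothetical names chosen for this example; they are not part of the RFC text.

```rust
// Hypothetical traits standing in for the RFC's `Foo` and `Bar`.
trait Foo {
    fn foo(&self) -> u32;
}

trait Bar {
    fn bar(&self) -> String;
}

// Concrete types that stay hidden behind the abstract return types.
struct Answer(u32);
struct Label(&'static str);

impl Foo for Answer {
    fn foo(&self) -> u32 {
        self.0
    }
}

impl Bar for Label {
    fn bar(&self) -> String {
        self.0.to_uppercase()
    }
}

// Two abstract types in one return type, the `-> (@Foo, @Bar)` case.
fn pair() -> (impl Foo, impl Bar) {
    (Answer(42), Label("bar"))
}

// The single-abstract-type case, modeled on the RFC's opening example.
fn hundreds(n: u32) -> impl Iterator<Item = u32> {
    (0..n).map(|x| x * 100)
}
```

A caller can use the results of `pair()` only through the methods of `Foo` and `Bar`; the concrete types `Answer` and `Label` cannot be named from the return type alone.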
From df87d91bf8c2d1c38bec66e83fad3f0e57c2036d Mon Sep 17 00:00:00 2001 From: ubsan Date: Tue, 1 Mar 2016 18:51:54 -0800 Subject: [PATCH 0780/1195] Add more motivation --- text/0000-copy-clone-semantics.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/text/0000-copy-clone-semantics.md b/text/0000-copy-clone-semantics.md index 066777860ba..5dacfa31bb0 100644 --- a/text/0000-copy-clone-semantics.md +++ b/text/0000-copy-clone-semantics.md @@ -24,6 +24,13 @@ would allow us to simply `memcpy` the values from the old `Vec` to the new `Vec` in the case of `T: Copy`. However, if we don't specify this, we will not be able to, and we will be stuck looping over every value. +It's always been the intention that `Clone::clone == ptr::read for T: Copy`; see +[issue #23790][issue-copy]: "It really makes sense for `Clone` to be a +supertrait of `Copy` -- `Copy` is a refinement of `Clone` where `memcpy` +suffices, basically." This idea was also implicit in accepting +[rfc #0839][rfc-extend] where "[B]ecause Copy: Clone, it would be backwards +compatible to upgrade to Clone in the future if demand is high enough." + # Detailed design [design]: #detailed-design @@ -54,3 +61,6 @@ in this way, we document this separately. This is how we started off with [unresolved]: #unresolved-questions What the exact wording should be. 
+ +[issue-copy]: https://github.com/rust-lang/rust/issues/23790 +[rfc-extend]: https://github.com/rust-lang/rfcs/blob/master/text/0839-embrace-extend-extinguish.md From b9b78547b0693af126c288adbc51a4fa2240008f Mon Sep 17 00:00:00 2001 From: ubsan Date: Tue, 1 Mar 2016 18:52:31 -0800 Subject: [PATCH 0781/1195] Add a date --- text/0000-copy-clone-semantics.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-copy-clone-semantics.md b/text/0000-copy-clone-semantics.md index 5dacfa31bb0..0df568c4963 100644 --- a/text/0000-copy-clone-semantics.md +++ b/text/0000-copy-clone-semantics.md @@ -1,5 +1,5 @@ - Feature Name: N/A -- Start Date: (fill me in with today's date, YYYY-MM-DD) +- Start Date: 01 March, 2016 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) From 55877e5c02d50d630a3f919ef96cda57f7b367e6 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 2 Mar 2016 17:09:16 -0800 Subject: [PATCH 0782/1195] Verify `-C panic` matches the panic runtime for executables Helps alleviate choosing the wrong panic strategy by accident. --- text/0000-less-unwinding.md | 16 ++++------------ 1 file changed, 4 insertions(+), 12 deletions(-) diff --git a/text/0000-less-unwinding.md b/text/0000-less-unwinding.md index f6caacc191f..dc665b4a9ec 100644 --- a/text/0000-less-unwinding.md +++ b/text/0000-less-unwinding.md @@ -140,11 +140,10 @@ attributes by the compiler: staticlibs. If no panic runtime is explicitly linked, then the compiler will select an appropriate runtime to inject. * Finally, the compiler will ensure that panic runtimes and compilation modes - are not mismatched. For a resolved DAG, the panic runtime will have been - compiled with a particular `-C panic` option, let's call it PS (panic - strategy). If PS is "abort", then no validation is performed (doesn't matter - how the rest of the DAG is compiled). Otherwise, all other crates must also - be compiled with the same PS. + are not mismatched. 
For a final product (outputs that aren't rlibs) the + `-C panic` mode of the panic runtime must match the final product itself. If + the panic mode is `abort`, then no other validation is performed, but + otherwise all crates in the DAG must have the same value of `-C panic`. The purpose of these limitations is to solve a number of problems that arise when switching panic strategies. For example with aborting panic crates won't @@ -231,13 +230,6 @@ test` it cannot pass `-C panic=abort`. but without that option is this the best way to be implementing these switches for the standard library? -* Applications may silently revert to the wrong panic runtime given the - heuristics here. For example if an application relies on unwinding panics, if - a dependency is pulled in with an explicit `extern crate panic_abort`, then - the entire application will switch to aborting panics silently. This can be - corrected, however, with an explicit `extern crate panic_unwind` on behalf of - the application. - # Alternatives [alternatives]: #alternatives From 9deb22665b672d1f8976997377865d2480f27fe9 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 15 Sep 2015 17:47:51 -0700 Subject: [PATCH 0783/1195] RFC: Add workspaces to Cargo Improve Cargo's story around multi-crate single-repo project management by introducing the concept of workspaces. All packages in a workspace will share `Cargo.lock` and an output directory for artifacts. Cargo will infer workspaces where possible, but it will also have knobs for explicitly controlling what crates belong to which workspace. 
--- text/0000-cargo-workspace.md | 179 +++++++++++++++++++++++++++++++++++ 1 file changed, 179 insertions(+) create mode 100644 text/0000-cargo-workspace.md diff --git a/text/0000-cargo-workspace.md b/text/0000-cargo-workspace.md new file mode 100644 index 00000000000..3d3d82ab683 --- /dev/null +++ b/text/0000-cargo-workspace.md @@ -0,0 +1,179 @@ +- Feature Name: N/A +- Start Date: 2015-09-15 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Improve Cargo's story around multi-crate single-repo project management by +introducing the concept of workspaces. All packages in a workspace will share +`Cargo.lock` and an output directory for artifacts. + +Cargo will infer workspaces where possible, but it will also have knobs for +explicitly controlling what crates belong to which workspace. + +# Motivation + +A common method to organize a multi-crate project is to have one +repository which contains all of the crates. Each crate has a corresponding +subdirectory along with a `Cargo.toml` describing how to build it. There are a +number of downsides to this approach, however: + +* Each sub-crate will have its own `Cargo.lock`, so it's difficult to ensure + that the entire project is using the same version of all dependencies. This is + desired as the main crate (often a binary) is often the one that has the + `Cargo.lock` "which counts", but it needs to be kept in sync with all + dependencies. + +* When building or testing sub-crates, all dependencies will be recompiled as + the target directory will be changing as you move around the source tree. This + can be overridden with `build.target-dir` or `CARGO_TARGET_DIR`, but this + isn't always convenient to set. + +Solving these two problems should help ease the development of large Rust +projects by ensuring that all dependencies remain in sync and builds by default +use already-built artifacts if available. 
+ +# Detailed design + +Cargo will grow the concept of a **workspace** for managing repositories of +multiple crates. Workspaces will then have the properties: + +* A workspace can contain multiple local crates. +* Each workspace will have one root crate. +* Whenever any crate in the workspace is compiled, output will be placed in the + `target` directory next to the root crate. +* One `Cargo.lock` for the entire workspace will reside next to the root crate + and encompass the dependencies (and dev-dependencies) for all packages in the + workspace. + +With workspaces, Cargo can now solve the problems set forth in the motivation +section. Next, however, workspaces need to be defined. In the spirit of much of +the rest of Cargo's configuration today this will largely be automatic for +conventional project layouts but will have explicit controls for configuration. + +### New manifest keys + +First, let's look at the new manifest keys which will be added to `Cargo.toml`: + +```toml +[package] + +# ... + +workspace-root = true +workspace = ["relative/path/to/child1", "child2"] +``` + +Here the `workspace-root` key will be used to indicate whether a package is the +root of a workspace, and the `workspace` key will be a list of paths to crates +which should be added to the package's workspace. The paths listed in +`workspace` must be valid paths to crates. + +### Implicit relations + +In addition to the keys above, Cargo will apply a few heuristics to infer the +keys wherever possible: + +* All path dependencies of a crate are considered members of the `workspace` key + implicitly. +* Starting from a package's `Cargo.toml`, Cargo will walk upwards on the + filesystem to find a sibling `Cargo.toml` and VCS directory (e.g. `.git` or + `.svn`). If found, this crate is also implicitly considered a member of the + workspace. +* Crates whose `Cargo.toml` files reside next to VCS directories are implicitly + workspace roots. 
+ +These rules are intended to reflect conventional Cargo project layouts. "Root +crates" typically appear at the root of a repository with lots of path dependencies +to all other crates in a repo. Additionally, we don't want to traverse wildly +across the filesystem so we only go upwards to a fixed point or downwards to +specific locations. + +### Constructing a workspace + +With the explicit and implicit relations defined above, each crate will now have +a flag indicating whether it's the root and a number of outgoing edges to other +crates. Two crates are then in the same workspace if they both transitively have +edges to one another. A valid workspace then only has one crate that is a root. + +While the restriction of one-root-per-workspace may make sense, the restriction +of crates transitively having edges to one another may seem a bit odd. The +intention is to ensure that the set of packages in a workspace is the same +regardless of which package is selected to start discovering a workspace from. + +With the implicit relations defined it's possible for a repository to not have a +root package yet still have path dependencies. In this situation each dependency +would not know how to get back to the "root package", so the workspace from the +point of view of the path dependencies would be different than that of the root +package. This could in turn lead to `Cargo.lock` getting out of sync. + +To alleviate misconfiguration, however, if the `workspace` configuration key +contains a crate which is not a member of the constructed workspace, Cargo will +emit an error indicating such. + +### Workspaces in practice + +The conventional layout for a Rust project is to have a `Cargo.toml` at the root +with the "main project" with dependencies and/or satellite projects underneath. +Consequently the conventional layout will need no extra configuration to benefit +from the workspaces proposed in this RFC. 
+ +Projects like the compiler, however, will likely need explicit configuration. +The `rust` repo conceptually has two workspaces, the standard library and the +compiler, and these would need to be manually configured with `workspace` and +`workspace-root` keys amongst all crates. + +### Future Extensions + +Once Cargo understands a workspace of crates, we could easily extend various +subcommands with a `--all` flag to perform tasks such as: + +* Test all crates within a workspace (run all unit tests, doc tests, etc) +* Build all binaries for a set of crates within a workspace +* Publish all crates in a workspace if necessary to crates.io + +This support isn't proposed to be added in this RFC specifically, but simply to +show that workspaces can be used to solve other existing issues in Cargo. + +# Drawbacks + +* This change is not backwards compatible with older versions of Cargo.lock. For + example if a newer cargo were used to develop a repository which otherwise is + developed with older versions of Cargo, the `Cargo.lock` files generated would + be incompatible. If all maintainers agree on versions of Cargo, however, this + is not a problem. + +* If no crate exists at the root of a repository, it may be the case that an + unduly large amount of configuration is required to setup the workspace + correctly. A minor deviation from the normal conventions should in theory only + require a proportionally minor amount of configuration. + +* As proposed there is no method to disable implicit actions taken by Cargo. + It's unclear what the use case for this is, but it could in theory arise. + +# Alternatives + +* Cargo could attempt to perform more inference of workspace members by simply + walking the entire directory tree starting at `Cargo.toml`. All children found + could implicitly be members of the workspace. Walking entire trees, + unfortunately, isn't always efficient to do and it would be unfortunate to + have to unconditionally do this. 
+ +* Cargo could support "virtual packages" where a `Cargo.toml` is placed at the + root of a repository but only to serve as a global project configuration. No + crate would actually be described by a virtual package, but it would play into + the workspace heuristics described here. This feature could alleviate the "too + much extra configuration" drawback described above, but it's unclear whether + it's needed at this point. + +* Implicit members are currently only path dependencies and a "Cargo.toml next + to VCS" traveling upwards. Instead all Cargo.toml members found traveling + upwards could be implicit members of a workspace. This behavior, however, may + end up picking up too many crates. + +# Unresolved questions + +* Does this approach scale well to repositories with a large number of crates? + For example does the winapi-rs repository experience a slowdown on standard + `cargo build` as a result? From 2dd9b01324433e7f1a72ff1158d5b8f322674c85 Mon Sep 17 00:00:00 2001 From: Steven Fackler Date: Thu, 3 Mar 2016 21:09:27 -0800 Subject: [PATCH 0784/1195] Update to disallow abstract namespaces --- text/0000-unix-socket.md | 47 +++++++++++----------------------------- 1 file changed, 13 insertions(+), 34 deletions(-) diff --git a/text/0000-unix-socket.md b/text/0000-unix-socket.md index 5e509c3f0ae..c72d2d2586d 100644 --- a/text/0000-unix-socket.md +++ b/text/0000-unix-socket.md @@ -38,7 +38,10 @@ Postgres server will listen on a Unix socket located at `/run/postgresql/.s.PGSQL.5432` in some configurations. However, the `socketpair` function can make a pair of *unnamed* connected Unix sockets not associated with a filesystem path. In addition, Linux provides a separate -*abstract* namespace not associated with the filesystem. +*abstract* namespace not associated with the filesystem, indicated by a leading +null byte in the address. 
In the initial implementation, the abstract namespace +will not be supported - the various socket constructors will check for and +reject addresses with interior null bytes. A `std::os::unix::net` module will be created with the following contents: @@ -51,11 +54,7 @@ pub struct UnixStream { impl UnixStream { /// Connects to the socket named by `path`. /// - /// Linux provides, as a nonportable extension, a separate "abstract" - /// address namespace as opposed to filesystem-based addressing. If `path` - /// begins with a null byte, it will be interpreted as an "abstract" - /// address. Otherwise, it will be interpreted as a "pathname" address, - /// corresponding to a path on the filesystem. + /// `path` may not contain any null bytes. pub fn connect<P: AsRef<Path>>(path: P) -> io::Result<UnixStream> { ... } @@ -196,15 +195,6 @@ impl SocketAddr { } ``` -A Linux-specific extension trait is provided for the abstract namespace: -```rust -pub trait SocketAddrExt { - /// Returns the contents of this address (without the leading null byte) if - /// it is an abstract address. - fn as_abstract(&self) -> Option<&[u8]> -} -``` - The `UnixListener` type mirrors the `TcpListener` type: ```rust pub struct UnixListener { @@ -214,11 +204,7 @@ pub struct UnixListener { impl UnixListener { /// Creates a new `UnixListener` bound to the specified socket. /// - /// Linux provides, as a nonportable extension, a separate "abstract" - /// address namespace as opposed to filesystem-based addressing. If `path` - /// begins with a null byte, it will be interpreted as an "abstract" - /// address. Otherwise, it will be interpreted as a "pathname" address, - /// corresponding to a path on the filesystem. + /// `path` may not contain any null bytes. pub fn bind<P: AsRef<Path>>(path: P) -> io::Result<UnixListener> { ... } @@ -294,11 +280,7 @@ pub struct UnixDatagram { impl UnixDatagram { /// Creates a Unix datagram socket bound to the given path. 
/// - /// Linux provides, as a nonportable extension, a separate "abstract" - /// address namespace as opposed to filesystem-based addressing. If `path` - /// begins with a null byte, it will be interpreted as an "abstract" - /// address. Otherwise, it will be interpreted as a "pathname" address, - /// corresponding to a path on the filesystem. + /// `path` may not contain any null bytes. pub fn bind<P: AsRef<Path>>(path: P) -> io::Result<UnixDatagram> { ... } @@ -329,6 +311,8 @@ impl UnixDatagram { /// /// The `send` method may be used to send data to the specified address. /// `recv` and `recv_from` will only receive data from that address. + /// + /// `path` may not contain any null bytes. pub fn connect<P: AsRef<Path>>(&self, path: P) -> io::Result<()> { ... } @@ -363,6 +347,8 @@ impl UnixDatagram { /// Sends data on the socket to the specified address. /// /// On success, returns the number of bytes written. + /// + /// `path` may not contain any null bytes. pub fn send_to<P: AsRef<Path>>(&self, buf: &[u8], path: P) -> io::Result<usize> { ... } @@ -454,6 +440,8 @@ Differences from `UdpSocket`: Some functionality is notably absent from this proposal: +* Linux's abstract namespace is not supported. Functionality may be added in + the future via extension traits in `std::os::linux::net`. * No support for `SOCK_SEQPACKET` sockets is proposed, as it has not yet been implemented. Since it is connection oriented, there will be a socket type `UnixSeqPacket` and a listener type `UnixSeqListener`. The naming of the @@ -481,15 +469,6 @@ The naming convention of `UnixStream` and `UnixDatagram` doesn't perfectly mirror `TcpStream` and `UdpSocket`, but `UnixStream` and `UnixSocket` seems way too confusing. -Constructors for the various socket types take an `AsRef<Path>`, which makes -construction of sockets associated with Linux abstract namespaces somewhat -nonobvious, as the leading null byte has to be explicitly added.
However, it is -still possible, either via `&str` for UTF8 names or via `&OsStr` and -`std::os::unix::ffi::OsStrExt` for arbitrary names. Use of the abstract -namespace appears to be very obscure, so it seems best to optimize for -ergonomics of normal pathname addresses. We can add extension traits providing -methods taking `&[u8]` in the future if deemed necessary. - # Unresolved questions [unresolved]: #unresolved-questions From 356e7b99aeebc5e10eb5cb571db3445c021ebb1e Mon Sep 17 00:00:00 2001 From: Steven Fackler Date: Thu, 3 Mar 2016 22:58:14 -0800 Subject: [PATCH 0785/1195] Remove keepalive methods from net2 RFC Issues came up in the implementation of these methods that caused us to back off from implementing keepalive functionality for now. --- text/1461-net2-mutators.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/text/1461-net2-mutators.md b/text/1461-net2-mutators.md index 329412f7558..cccd2423b32 100644 --- a/text/1461-net2-mutators.md +++ b/text/1461-net2-mutators.md @@ -35,9 +35,6 @@ impl TcpStream { fn set_nodelay(&self, nodelay: bool) -> io::Result<()>; fn nodelay(&self) -> io::Result<bool>; - fn set_keepalive(&self, keepalive: Option<Duration>) -> io::Result<()>; - fn keepalive(&self) -> io::Result<Option<Duration>>; - fn set_ttl(&self, ttl: u32) -> io::Result<()>; fn ttl(&self) -> io::Result<u32>; From 616f6faec55f05d1b4a349ab4969cfd728cd0ba2 Mon Sep 17 00:00:00 2001 From: Gleb Mazovetskiy Date: Fri, 4 Mar 2016 09:26:09 +0000 Subject: [PATCH 0786/1195] Fix a typo in 1210-impl-specialization.md --- text/1210-impl-specialization.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/1210-impl-specialization.md b/text/1210-impl-specialization.md index 90a9a197c3a..32e3696d9e2 100644 --- a/text/1210-impl-specialization.md +++ b/text/1210-impl-specialization.md @@ -817,8 +817,8 @@ simultaneously, but it's not like high school algebra; the equations involved all have the limited form of `type1 = type2`.
One immediate way in which unification is relevant to this RFC is in determining -when two impls "overlap": roughly speaking, they overlap if you can each pair of -input types can be unified simultaneously. For example: +when two impls "overlap": roughly speaking, they overlap if each pair of input +types can be unified simultaneously. For example: ```rust // No overlap: String and bool do not unify From 85276c214b78cbe70552cf0ed26c9e5764ea8927 Mon Sep 17 00:00:00 2001 From: Gleb Mazovetskiy Date: Fri, 4 Mar 2016 09:31:01 +0000 Subject: [PATCH 0787/1195] Fix another typo in 1210-impl-specialization.md --- text/1210-impl-specialization.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/1210-impl-specialization.md b/text/1210-impl-specialization.md index 32e3696d9e2..7e75147900f 100644 --- a/text/1210-impl-specialization.md +++ b/text/1210-impl-specialization.md @@ -992,8 +992,8 @@ is more specific than the other in the overlapping region. ### Interaction with lifetimes A hard constraint in the design of the trait system is that *dispatch cannot -depend on lifetime information*. In particular, we both cannot, and allow -specialization based on lifetimes: +depend on lifetime information*. In particular, we both cannot, and should not +allow specialization based on lifetimes: - We can't, because when the compiler goes to actually generate code ("trans"), lifetime information has been erased -- so we'd have no idea what From 504fc536b48f35f58b52089d330145095f341c53 Mon Sep 17 00:00:00 2001 From: Kamal Marhubi Date: Fri, 4 Mar 2016 17:35:44 -0500 Subject: [PATCH 0788/1195] rfc 1291: Add libutil to scope of libc crate on Linux The initial motivation for adding this library is to get access to `openpty(3)` and `forkpty(3)`. These simplify opening a pseudoterminal master / slave pair. The functions are defined on Linux, OSX, FreeBSD, NetBSD, and OpenBSD. 
On OS X, they are available without linking to any additional libraries; on the other platforms they require linking with `-lutil`. On Linux, libutil is part of glibc, and defines 6 symbols: - forkpty - logwtmp - login_tty - openpty - login - logout libutil is available in the base installs of Debian-, Fedora-, and Arch-derived distributions, as well as openSUSE and Android, though on Android the `pty.h` header that declares the functions seems not to be available. Together, these cover the DistroWatch [0] top 10 distributions. In musl libc, `openpty(3)` and `forkpty(3)` are included in `libc.a`. On NetBSD and OpenBSD, these functions are included in libutil, which defines about 100 functions. On FreeBSD, these functions are included in libutil, which defines about 200 functions. [0]: http://distrowatch.com/ --- text/1291-promote-libc.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/1291-promote-libc.md b/text/1291-promote-libc.md index 92166b753fa..c719e69a02b 100644 --- a/text/1291-promote-libc.md +++ b/text/1291-promote-libc.md @@ -166,9 +166,9 @@ In order to have a well defined scope while satisfying these constraints, this RFC proposes that this crate will have a scope that is defined separately for each platform that it targets. The proposals are: -* Linux (and other unix-like platforms) - the libc, libm, librt, libdl, and - libpthread libraries. Additional platforms can include libraries whose symbols - are found in these libraries on Linux as well. +* Linux (and other unix-like platforms) - the libc, libm, librt, libdl, + libutil, and libpthread libraries. Additional platforms can include libraries + whose symbols are found in these libraries on Linux as well.
* OSX - the common library to link to on this platform is libSystem, but this transitively brings in quite a few dependencies, so this crate will refine what it depends upon from libSystem a little further, specifically: From 261a4f1d4126363cb0407f85b67799f0db316ab1 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 7 Mar 2016 11:00:26 -0800 Subject: [PATCH 0789/1195] Add some examples of workspaces --- text/0000-cargo-workspace.md | 62 ++++++++++++++++++++++++++++++++++-- 1 file changed, 60 insertions(+), 2 deletions(-) diff --git a/text/0000-cargo-workspace.md b/text/0000-cargo-workspace.md index 3d3d82ab683..574fc887b0d 100644 --- a/text/0000-cargo-workspace.md +++ b/text/0000-cargo-workspace.md @@ -117,12 +117,70 @@ emit an error indicating such. The conventional layout for a Rust project is to have a `Cargo.toml` at the root with the "main project" with dependencies and/or satellite projects underneath. Consequently the conventional layout will need no extra configuration to benefit -from the workspaces proposed in this RFC. +from the workspaces proposed in this RFC. For example, all of these project +layouts (with `/` being the root of a repository) will not require any +configuration to have all crates be members of a workspace: + +* An FFI crate with a sub-crate for FFI bindings + + ``` + Cargo.toml + src/ + foo-sys/ + Cargo.toml + src/ + ``` + +* A crate with multiple in-tree dependencies + + ``` + Cargo.toml + src/ + dep1/ + Cargo.toml + src/ + dep2/ + Cargo.toml + src/ + ``` Projects like the compiler, however, will likely need explicit configuration. The `rust` repo conceptually has two workspaces, the standard library and the compiler, and these would need to be manually configured with `workspace` and -`workspace-root` keys amongst all crates.
Some examples of layouts that will +require extra configuration are: + +* Trees without any root crate + + ``` + crate1/ + Cargo.toml + src/ + crate2/ + Cargo.toml + src/ + crate3/ + Cargo.toml + src/ + ``` + +* Trees with multiple workspaces + + ``` + ws1/ + crate1/ + Cargo.toml + src/ + crate2/ + Cargo.toml + src/ + ws2/ + Cargo.toml + src/ + crate3/ + Cargo.toml + src/ + ``` ### Future Extensions From dcafbc119a5a79581709fdd5e01432c95656401e Mon Sep 17 00:00:00 2001 From: Magnus Hoff Date: Tue, 8 Mar 2016 00:11:11 +0100 Subject: [PATCH 0790/1195] Fix typo --- text/0000-unix-socket.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-unix-socket.md b/text/0000-unix-socket.md index c72d2d2586d..69e207e43a8 100644 --- a/text/0000-unix-socket.md +++ b/text/0000-unix-socket.md @@ -445,8 +445,8 @@ Some functionality is notably absent from this proposal: * No support for `SOCK_SEQPACKET` sockets is proposed, as it has not yet been implemented. Since it is connection oriented, there will be a socket type `UnixSeqPacket` and a listener type `UnixSeqListener`. The naming of the - listener is a bit unfortunate, but use `SOCK_SEQPACKET` is rare compared to - `SOCK_STREAM` so naming priority can go to that version. + listener is a bit unfortunate, but use of `SOCK_SEQPACKET` is rare compared + to `SOCK_STREAM` so naming priority can go to that version. * Unix sockets support file descriptor and credential transfer, but these will not initially be supported as the `sendmsg`/`recvmsg` interface is complex and bindings will need some time to prototype. 
From 60926a68527186a3ac510cf422670626753739b9 Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Tue, 8 Mar 2016 04:35:40 -0800 Subject: [PATCH 0791/1195] Remove unnecessary runtime library functions --- text/0000-int128.md | 14 -------------- 1 file changed, 14 deletions(-) diff --git a/text/0000-int128.md b/text/0000-int128.md index 1d12a9f93df..3ebb408ced0 100644 --- a/text/0000-int128.md +++ b/text/0000-int128.md @@ -27,15 +27,9 @@ LLVM fully supports 128-bit integers on all architectures, however it will emit // su_int = u32 // ti_int = i128 // tu_int = u128 -ti_int __absvti2(ti_int a); -ti_int __addvti3(ti_int a, ti_int b); ti_int __ashlti3(ti_int a, si_int b); ti_int __ashrti3(ti_int a, si_int b); -si_int __clzti2(ti_int a); -si_int __cmpti2(ti_int a, ti_int b); -si_int __ctzti2(ti_int a); ti_int __divti3(ti_int a, ti_int b); -si_int __ffsti2(ti_int a); ti_int __fixdfti(double a); ti_int __fixsfti(float a); tu_int __fixunsdfti(double a); @@ -48,14 +42,6 @@ ti_int __lshrti3(ti_int a, si_int b); ti_int __modti3(ti_int a, ti_int b); ti_int __muloti4(ti_int a, ti_int b, int* overflow); ti_int __multi3(ti_int a, ti_int b); -ti_int __mulvti3(ti_int a, ti_int b); -ti_int __negti2(ti_int a); -ti_int __negvti2(ti_int a); -si_int __parityti2(ti_int a); -si_int __popcountti2(ti_int a); -ti_int __subvti3(ti_int a, ti_int b); -si_int __ucmpti2(tu_int a, tu_int b); -tu_int __udivmodti4(tu_int a, tu_int b, tu_int* rem); tu_int __udivti3(tu_int a, tu_int b); tu_int __umodti3(tu_int a, tu_int b); ``` From fb5c86791c6a0d4bb995d973d5393441f1dce51d Mon Sep 17 00:00:00 2001 From: Brian Anderson Date: Wed, 9 Mar 2016 22:04:49 +0000 Subject: [PATCH 0792/1195] Stabilize -C overflow-checks --- 0000-template.md | 36 ------------------- text/0000-stable-overflow-checks.md | 55 +++++++++++++++++++++++++++++ 2 files changed, 55 insertions(+), 36 deletions(-) delete mode 100644 0000-template.md create mode 100644 text/0000-stable-overflow-checks.md diff --git a/0000-template.md 
b/0000-template.md deleted file mode 100644 index a45c6110e58..00000000000 --- a/0000-template.md +++ /dev/null @@ -1,36 +0,0 @@ -- Feature Name: (fill me in with a unique ident, my_awesome_feature) -- Start Date: (fill me in with today's date, YYYY-MM-DD) -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) - -# Summary -[summary]: #summary - -One para explanation of the feature. - -# Motivation -[motivation]: #motivation - -Why are we doing this? What use cases does it support? What is the expected outcome? - -# Detailed design -[design]: #detailed-design - -This is the bulk of the RFC. Explain the design in enough detail for somebody familiar -with the language to understand, and for somebody familiar with the compiler to implement. -This should get into specifics and corner-cases, and include examples of how the feature is used. - -# Drawbacks -[drawbacks]: #drawbacks - -Why should we *not* do this? - -# Alternatives -[alternatives]: #alternatives - -What other designs have been considered? What is the impact of not doing this? - -# Unresolved questions -[unresolved]: #unresolved-questions - -What parts of the design are still TBD? diff --git a/text/0000-stable-overflow-checks.md b/text/0000-stable-overflow-checks.md new file mode 100644 index 00000000000..2c0a27a1707 --- /dev/null +++ b/text/0000-stable-overflow-checks.md @@ -0,0 +1,55 @@ +- Feature Name: (fill me in with a unique ident, my_awesome_feature) +- Start Date: (fill me in with today's date, YYYY-MM-DD) +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Stabilize the `-C overflow-checks` command line argument. + +# Motivation +[motivation]: #motivation + +This is an easy way to turn on overflow checks in release builds +without otherwise turning on debug assertions, via the `-C +debug-assertions` flag. In stable Rust today you can't get one without +the other. 
+ +Users can use the `-C overflow-checks` flag from their Cargo +config to turn on overflow checks for an entire application. + +This flag, which accepts values of 'yes'/'no', 'on'/'off', is being +renamed from `force-overflow-checks` because the `force` doesn't add +anything that the 'yes'/'no' + +# Detailed design +[design]: #detailed-design + +This is a stabilization RFC. The only steps will be to move +`force-overflow-checks` from `-Z` to `-C`, renaming it to +`overflow-checks`, and making it stable. + +# Drawbacks +[drawbacks]: #drawbacks + +It's another rather ad-hoc flag for modifying code generation. + +Like other such flags, this applies to the entire code unit, +regardless of monomorphizations. This means that code generation for a +single function can be different based on which code unit it's +instantiated in. + +# Alternatives +[alternatives]: #alternatives + +The flag could instead be tied to crates such that any time code from +that crate is inlined/monomorphized it turns on overflow checks. + +We might also want a design that provides per-function control over +overflow checks. + +# Unresolved questions +[unresolved]: #unresolved-questions + +None.
\ No newline at end of file +Cargo might also add a profile option like + +```toml +[profile.dev] +overflow-checks = true +``` + +This may also be accomplished by Cargo's pending support for passing +arbitrary flags to rustc. + From f26c41b57084bf9858c0d363c3444c201dc940bb Mon Sep 17 00:00:00 2001 From: Andrew Ayer Date: Wed, 9 Mar 2016 20:56:22 -0800 Subject: [PATCH 0794/1195] Update RFC in response to feedback * `octets` no longer returns a reference, due to bad experiences with returning internal references to libc structures in the past. * Replace `from_octets` with an implementation of `From<[u8; 16]>` for `Ipv6Addr`. * For consistency, also implement `From<[u8; 4]>` for `Ipv4Addr`. --- text/0000-ipv6addr-octets.md | 46 +++++++++++++++++++++--------------- 1 file changed, 27 insertions(+), 19 deletions(-) diff --git a/text/0000-ipv6addr-octets.md b/text/0000-ipv6addr-octets.md index d0aa85926d1..f63141a0d21 100644 --- a/text/0000-ipv6addr-octets.md +++ b/text/0000-ipv6addr-octets.md @@ -6,7 +6,7 @@ # Summary [summary]: #summary -Add constructor and getter functions to `std::net::Ipv6Addr` that are +Add constructor and conversion functions for `std::net::Ipv6Addr` that are oriented around octets. # Motivation @@ -25,30 +25,44 @@ by `std::net::Ipv6Addr`. 
# Detailed design [design]: #detailed-design -Two functions would be added to `impl std::net::Ipv6Addr`: +The following method would be added to `impl std::net::Ipv6Addr`: ``` -pub fn from_octets(octets: &[u8; 16]) -> Ipv6Addr { - let mut addr: c::in6_addr = unsafe { std::mem::zeroed() }; - addr.s6_addr = *octets; - Ipv6Addr { inner: addr } +pub fn octets(&self) -> [u8; 16] { + self.inner.s6_addr } -pub fn octets(&self) -> &[u8; 16] { - &self.inner.s6_addr +``` + +The following `From` trait would be implemented: + +``` +impl From<[u8; 16]> for Ipv6Addr { + fn from(octets: [u8; 16]) -> Ipv6Addr { + let mut addr: c::in6_addr = unsafe { std::mem::zeroed() }; + addr.s6_addr = octets; + Ipv6Addr { inner: addr } + } +} +``` + +For consistency, the following `From` trait would be +implemented for `Ipv4Addr`: + +``` +impl From<[u8; 4]> for Ipv4Addr { + fn from(octets: [u8; 4]) -> Ipv4Addr { + Ipv4Addr::new(octets[0], octets[1], octets[2], octets[3]) + } } ``` # Drawbacks [drawbacks]: #drawbacks -It adds additional functions to the `Ipv6Addr` API, which increases cognitive load +It adds additional functions to the API, which increases cognitive load and maintenance burden. That said, the functions are conceptually very simple and their implementations short. -Returning a reference from `octets` ties the interface to the internal representation -of `Ipv6Addr`, which is currently `[u8; 16]`. It would not be possible to change `Ipv6Addr` -to use a different representation without changing the return type of `octets` to be a non-reference. - # Alternatives [alternatives]: #alternatives @@ -58,12 +72,6 @@ respect to byte ordering) to convert between `Ipv6Addr` and the on-wire representation of IPv6 addresses. Or they will use their alternative implementations of `Ipv6Addr`, fragmenting the ecosystem. -`octets` could return a non-reference to avoid tying the interface to the -internal representation. 
However, it seems unlikely that the internal -representation would ever be anything besides a `[u8; 16]`. - # Unresolved questions [unresolved]: #unresolved-questions -Should `octets` return a reference? Pro: avoid a copy. Con: ties the interface to the internal -representation, which is presently `[u8; 16]`. From 0fede964c112958bb7525efaf9c8e4dcfcfef3fa Mon Sep 17 00:00:00 2001 From: Andrew Ayer Date: Wed, 9 Mar 2016 21:05:37 -0800 Subject: [PATCH 0795/1195] Expand summary and rename feature to account for expanded scope --- text/0000-ipv6addr-octets.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/text/0000-ipv6addr-octets.md b/text/0000-ipv6addr-octets.md index f63141a0d21..2dcdec1f798 100644 --- a/text/0000-ipv6addr-octets.md +++ b/text/0000-ipv6addr-octets.md @@ -1,4 +1,4 @@ -- Feature Name: ipv6addr_octets_interface +- Feature Name: ipaddr_octet_arrays - Start Date: 2016-02-12 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) @@ -6,8 +6,8 @@ # Summary [summary]: #summary -Add constructor and conversion functions for `std::net::Ipv6Addr` that are -oriented around octets. +Add constructor and conversion functions for `std::net::Ipv6Addr` and +`std::net::Ipv4Addr` that are oriented around arrays of octets. # Motivation [motivation]: #motivation @@ -56,6 +56,8 @@ impl From<[u8; 4]> for Ipv4Addr { } ``` +Note: `Ipv4Addr` already has an `octets` method that returns a `[u8; 4]`. + # Drawbacks [drawbacks]: #drawbacks From f1985d2d760a0b6119cb25d337d139919a456fa7 Mon Sep 17 00:00:00 2001 From: Alex Burka Date: Sat, 12 Mar 2016 00:24:22 -0500 Subject: [PATCH 0796/1195] finish updating RFC 1192 WRT single-ended ranges Fixes #1537 --- text/1192-inclusive-ranges.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/text/1192-inclusive-ranges.md b/text/1192-inclusive-ranges.md index 7bef8c643c1..a6f619d47a1 100644 --- a/text/1192-inclusive-ranges.md +++ b/text/1192-inclusive-ranges.md @@ -79,8 +79,9 @@ winner. 
[discuss]: https://internals.rust-lang.org/t/vs-for-inclusive-ranges/1539 -This RFC doesn't propose non-double-ended syntax, like `a...`, `...b` -or `...` since it isn't clear that this is so useful. Maybe it is. +This RFC proposes single-ended syntax with only an end, `...b`, but not +with only a start (`a...`) or unconstrained `...`. This balance could be +reevaluated for usefulness and conflicts with other proposed syntax. The `Empty` variant could be omitted, leaving two options: From adc2547baa84fe5288dce7f94b22e02c98bfe346 Mon Sep 17 00:00:00 2001 From: Steven Fackler Date: Thu, 10 Mar 2016 21:33:59 -0800 Subject: [PATCH 0797/1195] Add TryFrom and TryInto traits --- text/0000-try-from.md | 165 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 165 insertions(+) create mode 100644 text/0000-try-from.md diff --git a/text/0000-try-from.md b/text/0000-try-from.md new file mode 100644 index 00000000000..a08c9e09d46 --- /dev/null +++ b/text/0000-try-from.md @@ -0,0 +1,165 @@ +- Feature Name: try_from +- Start Date: 2016-03-10 +- RFC PR: +- Rust Issue: + +# Summary +[summary]: #summary + +The standard library provides the `From` and `Into` traits as standard ways to +convert between types. However, these traits only support *infallible* +conversions. This RFC proposes the addition of `TryFrom` and `TryInto` traits +to support fallible conversions in a standard way. + +# Motivation +[motivation]: #motivation + +Fallible conversions are fairly common, and a collection of ad-hoc traits has +arisen to support them, both [within the standard library][from-str] and [in +third party crates][into-connect-params]. A standardized set of traits +following the pattern set by `From` and `Into` will ease these APIs by +providing a standardized interface as we expand the set of fallible +conversions. + +One specific avenue of expansion that has been frequently requested is fallible +integer conversion traits.
Conversions between integer types may currently be +performed with the `as` operator, which will silently truncate the value if it +is out of bounds of the target type. Code which needs to down-cast values must +manually check that the cast will succeed, which is both tedious and error +prone. A fallible conversion trait reduces code like this: + +```rust +let value: isize = ...; + +let value: u32 = if value < 0 || value > u32::max_value() as isize { + return Err(BogusCast); +} else { + value as u32 +}; +``` + +to simply: + +```rust +let value: isize = ...; +let value: u32 = try!(value.try_into()); +``` + +# Detailed design +[design]: #detailed-design + +Two traits will be added to the `core::convert` module: + +```rust +pub trait TryFrom<T>: Sized { + type Err; + + fn try_from(t: T) -> Result<Self, Self::Err>; +} + +pub trait TryInto<T>: Sized { + type Err; + + fn try_into(self) -> Result<T, Self::Err>; +} +``` + +In a fashion similar to `From` and `Into`, a blanket implementation of `TryInto` +is provided for all `TryFrom` implementations: + +```rust +impl<T, U> TryInto<U> for T where U: TryFrom<T> { + type Err = U::Err; + + fn try_into(self) -> Result<U, Self::Err> { + U::try_from(self) + } +} +``` + +In addition, implementations of `TryFrom` will be provided to convert between +*all combinations* of integer types: + +```rust +#[derive(Debug)] +pub struct TryFromIntError(()); + +impl fmt::Display for TryFromIntError { + fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result { + fmt.write_str(self.description()) + } +} + +impl Error for TryFromIntError { + fn description(&self) -> &str { + "out of range integral type conversion attempted" + } +} + +impl TryFrom<usize> for u8 { + type Err = TryFromIntError; + + fn try_from(t: usize) -> Result<u8, TryFromIntError> { + // ... + } +} + +// ... +``` + +This notably includes implementations that are actually infallible, including +implementations between a type and itself.
A common use case for these kinds +of conversions is when interacting with a C API and converting, for example, +from a `u64` to a `libc::c_long`. `c_long` may be `u32` on some platforms but +`u64` on others, so having an `impl TryFrom<u64> for u64` ensures that +conversions using these traits will compile on all architectures. Similarly, a +conversion from `usize` to `u32` may or may not be fallible depending on the +target architecture. + +The standard library provides a reflexive implementation of the `From` trait +for all types: `impl<T> From<T> for T`. We could similarly provide a "lifting" +implementation of `TryFrom`: + +```rust +impl<T, U: From<T>> TryFrom<T> for U { + type Err = Void; + + fn try_from(t: T) -> Result<U, Void> { + Ok(U::from(t)) + } +} +``` + +However, this implementation would directly conflict with our goal of having +uniform `TryFrom` implementations between all combinations of integer types. In +addition, it's not clear what value such an implementation would actually +provide, so this RFC does *not* propose its addition. + +# Drawbacks +[drawbacks]: #drawbacks + +It is unclear if existing fallible conversion traits can backwards-compatibly +be subsumed into `TryFrom` and `TryInto`, which may result in an awkward mix of +ad-hoc traits in addition to `TryFrom` and `TryInto`. + +# Alternatives +[alternatives]: #alternatives + +We could avoid general traits and continue making distinct conversion traits for +each use case. + +# Unresolved questions +[unresolved]: #unresolved-questions + +Are `TryFrom` and `TryInto` the right names? There is some precedent for the +`try_` prefix: `TcpStream::try_clone`, `Mutex::try_lock`, etc. + +What should be done about `FromStr`, `ToSocketAddrs`, and other ad-hoc fallible +conversion traits? An upgrade path may exist in the future with specialization, +but it is probably too early to say definitively. + +Should `TryFrom` and `TryInto` be added to the prelude? This would be the first +prelude addition since the 1.0 release.
+ +[from-str]: https://doc.rust-lang.org/1.7.0/std/str/trait.FromStr.html +[into-connect-params]: http://sfackler.github.io/rust-postgres/doc/v0.11.4/postgres/trait.IntoConnectParams.html From fd98dd916a22ca76c849f5d7acfc447e282ff46a Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 14 Mar 2016 09:29:04 -0700 Subject: [PATCH 0798/1195] Output is the same --- text/0000-rdylib.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/text/0000-rdylib.md b/text/0000-rdylib.md index b868ef38822..313667ab152 100644 --- a/text/0000-rdylib.md +++ b/text/0000-rdylib.md @@ -55,7 +55,8 @@ A new crate type will be accepted by the compiler, `rdylib`, which can be passed as either `--crate-type rdylib` on the command line or via `#![crate_type = "rdylib"]` in crate attributes. This crate type will conceptually correspond to the rdylib use case described above, and today's `dylib` crate-type will -correspond to the cdylib use case above. +correspond to the cdylib use case above. Note that the literal output artifacts +of these two crate types (files, file names, etc) will be the same. 
The two formats will differ in the parts listed in the motivation above, specifically: From 450afdd8434d85022adb4b6a0dfc822494f14d05 Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Mon, 14 Mar 2016 19:50:30 +0100 Subject: [PATCH 0799/1195] Add more integer atomic types --- text/0000-integer_atomics.md | 105 +++++++++++++++++++++++++++++++++++ 1 file changed, 105 insertions(+) create mode 100644 text/0000-integer_atomics.md diff --git a/text/0000-integer_atomics.md b/text/0000-integer_atomics.md new file mode 100644 index 00000000000..9a9a330586e --- /dev/null +++ b/text/0000-integer_atomics.md @@ -0,0 +1,105 @@ +- Feature Name: integer_atomics +- Start Date: 2016-03-14 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +This RFC basically changes `core::sync::atomic` to look like this: + +```rust +#[cfg(target_has_atomic = "8")] +struct AtomicBool {} +#[cfg(target_has_atomic = "8")] +struct AtomicI8 {} +#[cfg(target_has_atomic = "8")] +struct AtomicU8 {} +#[cfg(target_has_atomic = "16")] +struct AtomicI16 {} +#[cfg(target_has_atomic = "16")] +struct AtomicU16 {} +#[cfg(target_has_atomic = "32")] +struct AtomicI32 {} +#[cfg(target_has_atomic = "32")] +struct AtomicU32 {} +#[cfg(target_has_atomic = "64")] +struct AtomicI64 {} +#[cfg(target_has_atomic = "64")] +struct AtomicU64 {} +#[cfg(target_has_atomic = "128")] +struct AtomicI128 {} +#[cfg(target_has_atomic = "128")] +struct AtomicU128 {} +#[cfg(target_has_atomic = "ptr")] +struct AtomicIsize {} +#[cfg(target_has_atomic = "ptr")] +struct AtomicUsize {} +#[cfg(target_has_atomic = "ptr")] +struct AtomicPtr<T> {} +``` + +# Motivation +[motivation]: #motivation + +Many lock-free algorithms require a two-value `compare_exchange`, which is effectively twice the size of a `usize`. This would be implemented by atomically swapping a struct containing two members. + +Another use case is to support Linux's futex API.
This API is based on atomic `i32` variables, which currently aren't available on x86_64 because `AtomicIsize` is 64-bit. + +# Detailed design +[design]: #detailed-design + +## New atomic types + +The `AtomicI8`, `AtomicI16`, `AtomicI32`, `AtomicI64` and `AtomicI128` types are added along with their matching `AtomicU*` type. These have the same API as the existing `AtomicIsize` and `AtomicUsize` types. Note that support for 128-bit atomics is dependent on the [i128/u128 RFC](https://github.com/rust-lang/rfcs/pull/1504) being accepted. + +## Target support + +One problem is that it is hard for a user to determine if a certain type `T` can be placed inside an `Atomic<T>`. After a quick survey of the LLVM and Clang code, architectures can be classified into 3 categories: + +- The architecture does not support any form of atomics (mainly microcontroller architectures). +- The architecture supports all atomic operations for integers from i8 to iN (where N is the architecture word/pointer size). +- The architecture supports all atomic operations for integers from i8 to i(N*2). + +A new target cfg is added: `target_has_atomic`. It will have multiple values, one for each atomic size supported by the target. For example: + +```rust +#[cfg(target_has_atomic = "128")] +static ATOMIC: AtomicU128 = AtomicU128::new(mem::transmute((0u64, 0u64))); +#[cfg(not(target_has_atomic = "128"))] +static ATOMIC: Mutex<(u64, u64)> = Mutex::new((0, 0)); + +#[cfg(target_has_atomic = "64")] +static COUNTER: AtomicU64 = AtomicU64::new(0); +#[cfg(not(target_has_atomic = "64"))] +static COUNTER: AtomicU32 = AtomicU32::new(0); +``` + +Note that it is not necessary for an architecture to natively support atomic operations for all sizes (`i8`, `i16`, etc) as long as it is able to perform a `compare_exchange` operation with a larger size. All smaller operations can be emulated using that.
For example a byte atomic can be emulated by using a `compare_exchange` loop that only modifies a single byte of the value. This is actually how LLVM implements byte-level atomics on MIPS, which only supports word-sized atomics natively. Note that the out-of-bounds read is fine here because atomics are aligned and will never cross a page boundary. Since this transformation is performed transparently by LLVM, we do not need to do any extra work to support this. + +## Changes to `AtomicPtr`, `AtomicIsize` and `AtomicUsize` + +These types will have a `#[cfg(target_has_atomic = "ptr")]` bound added to them. Although these types are stable, this isn't a breaking change because all targets currently supported by Rust will have these types available. This would only affect custom targets, which currently fail to link due to missing compiler-rt symbols anyways. + +## Changes to `AtomicBool` + +This type will be changed to use an `AtomicU8` internally instead of an `AtomicUsize`, which will allow it to be safely transmuted to a `bool`. This will make it more consistent with the other atomic types that have the same layout as their underlying type. (For example, futex code will assume that a `&AtomicI32` can be passed as a `&i32` to the system call.) + +# Drawbacks +[drawbacks]: #drawbacks + +Having certain atomic types get enabled/disabled based on the target isn't very nice, but it's unavoidable because support for atomic operations is very architecture-specific. + +This approach doesn't directly support atomic operations on user-defined structs, but this can be emulated using transmutes. + +# Alternatives +[alternatives]: #alternatives + +One alternative that was discussed in a [previous RFC](https://github.com/rust-lang/rfcs/pull/1505) was to add a generic `Atomic<T>` type. However the consensus was that having unsupported atomic types either fail at monomorphization time or fall back to lock-based implementations was undesirable.
+ +Several other designs have been suggested [here](https://internals.rust-lang.org/t/pre-rfc-extended-atomic-types/3068). + +# Unresolved questions +[unresolved]: #unresolved-questions + +None From 808b0ec2c6540bb4f548467f20b02a8bc9616f9a Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 14 Mar 2016 16:14:54 -0700 Subject: [PATCH 0800/1195] Extend with a "virtual manifest" Allow mostly empty manifests to more easily allow for roots. --- text/0000-cargo-workspace.md | 118 +++++++++++++++++++++++++++-------- 1 file changed, 93 insertions(+), 25 deletions(-) diff --git a/text/0000-cargo-workspace.md b/text/0000-cargo-workspace.md index 574fc887b0d..b18f9a13687 100644 --- a/text/0000-cargo-workspace.md +++ b/text/0000-cargo-workspace.md @@ -40,12 +40,12 @@ Cargo will grow the concept of a **workspace** for managing repositories of multiple crates. Workspaces will then have the properties: * A workspace can contain multiple local crates. -* Each workspace will have one root crate. +* Each workspace will have a root. * Whenever any crate in the workspace is compiled, output will be placed in the - `target` directory next to the root crate. -* One `Cargo.lock` for the entire workspace will reside next to the root crate - and encompass the dependencies (and dev-dependencies) for all packages in the - workspace. + `target` directory next to the root. +* One `Cargo.lock` for the entire workspace will reside next to the workspace + root and encompass the dependencies (and dev-dependencies) for all packages + in the workspace. With workspaces, Cargo can now solve the problems set forth in the motivation section. Next, however, workspaces need to be defined. In the spirit of much of @@ -57,18 +57,15 @@ conventional project layouts but will have explicit controls for configuration. First, let's look at the new manifest keys which will be added to `Cargo.toml`: ```toml -[package] - -# ... 
- -workspace-root = true -workspace = ["relative/path/to/child1", "child2"] +[workspace] +root = true +members = ["relative/path/to/child1", "child2"] ``` -Here the `workspace-root` key will be used to indicate whether a package is the -root of a workspace, and the `workspace` key will be a list of paths to crates -which should be added to the package's workspace. The paths listed in -`workspace` must be valid paths to crates. +Here the `workspace.root` key will be used to indicate whether a `Cargo.toml` is +the root of a workspace, and the `members` key will be a list of paths to +crates which should be added to the package's workspace. The paths listed in +`members` must be valid paths to crates. ### Implicit relations @@ -81,14 +78,37 @@ keys wherever possible: filesystem to find a sibling `Cargo.toml` and VCS directory (e.g. `.git` or `.svn`). If found, this crate is also implicitly considered a member of the workspace. -* Crates whose `Cargo.toml` that reside next to VCS directories are implicitly - workspace roots. - -These rules are intended to reflect conventional Cargo project layouts. "Root -crates" typically appear at the root of a repository with lots path dependencies -to all other crates in a repo. Additionally, we don't want to traverse wildly -across the filesystem so we only go upwards to a fixed point or downwards to -specific locations. +* A `Cargo.toml` which resides next to a VCS directory is implicitly a + workspace root. + +These rules are intended to reflect some conventional Cargo project layouts. +"Root crates" typically appear at the root of a repository with lots path +dependencies to all other crates in a repo. Additionally, we don't want to +traverse wildly across the filesystem so we only go upwards to a fixed point or +downwards to specific locations. + +### "Virtual" `Cargo.toml` + +A good number of projects do not have a root `Cargo.toml` at the top of a +repository, however. 
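While the explicit `[workspace]` keys should be enough to +configure the workspace in addition to the implicit relations above, this +directory structure is common enough that it shouldn't require *that* much more +configuration. + +To accommodate this project layout, Cargo will now allow for "virtual manifest" +files. These manifests will currently **only** contain the `[workspace]` key +and will notably be lacking a `[project]` or `[package]` top-level key. + +A virtual manifest does not itself define a crate, but can help when defining a +root. For example a `Cargo.toml` file at the root of a repository with +`workspace.members` keys would suffice for the project configurations in +question. + +Cargo will for the time being disallow many commands against a virtual manifest, +for example `cargo build` will be rejected. Arguments that take a package, +however, such as `cargo test -p foo` will be allowed. Workspaces can eventually +get extended with `--all` flags so in a workspace root you could execute +`cargo build --all` to compile all crates. ### Constructing a workspace @@ -147,8 +167,10 @@ configuration to have all crates be members of a workspace: Projects like the compiler, however, will likely need explicit configuration. The `rust` repo conceptually has two workspaces, the standard library and the compiler, and these would need to be manually configured with `workspace` and -`workspace-root` keys amongst all crates. Some examples of layouts that will -require extra configuration are: * Trees without any root crate @@ -164,6 +186,18 @@ require extra configuration are: src/ ``` + these crates can all join the same workspace via a `Cargo.toml` file at the + root looking like: + + ```toml + [workspace] + members = [ + "crate1", + "crate2", + "crate3", + ] + ``` + * Trees with multiple workspaces ``` @@ -182,6 +216,40 @@ require extra configuration are: src/ ``` + The two workspaces here can be configured by placing the following in the + manifests: + + ```toml + # ws1/Cargo.toml + [workspace] + root = true + members = ["crate1", "crate2"] + ``` + + ```toml + # ws1/crate1/Cargo.toml + [workspace] + members = [".."] + ``` + + ```toml + # ws1/crate2/Cargo.toml + [workspace] + members = [".."] + ``` + + ```toml + # ws2/Cargo.toml + [workspace] + root = true + ``` + + ```toml + # ws2/crate3/Cargo.toml + [workspace] + members = [".."] + ``` + ### Future Extensions Once Cargo understands a workspace of crates, we could easily extend various From d9a9f2d924cc65c2985464717206ea4f5979c7bc Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 16 Mar 2016 12:58:58 +0100 Subject: [PATCH 0801/1195] added discussion of std lib extension. --- text/0000-kinds-of-allocators.md | 73 ++++++++++++++++++++++++++++++++ 1 file changed, 73 insertions(+) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 01aca4a5cf0..417f5988779 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -460,6 +460,79 @@ fn main() { And that's all to the demo, folks. +### What about standard library containers? + +The intention of this RFC is that the Rust standard library will be +extended with parameteric allocator support: `Vec`, `HashMap`, etc +should all eventually be extended with the ability to use an +alternative allocator for their backing storage.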
+ +Some examples of layouts that will require extra configuration, along with the +configuration necessary, are: * Trees without any root crate @@ -164,6 +186,18 @@ require extra configuration are: src/ ``` + these crates can all join the same workspace via a `Cargo.toml` file at the + root looking like: + + ```toml + [workspace] + members = [ + "crate1", + "crate2", + "crate3", + ] + ``` + * Trees with multiple workspaces ``` @@ -182,6 +216,40 @@ require extra configuration are: src/ ``` + The two workspaces here can be configured by placing the following in the + manifests: + + ```toml + # ws1/Cargo.toml + [workspace] + root = true + members = ["crate1", "crate2"] + ``` + + ```toml + # ws1/crate1/Cargo.toml + [workspace] + members = [".."] + ``` + + ```toml + # ws1/crate2/Cargo.toml + [workspace] + members = [".."] + ``` + + ```toml + # ws2/Cargo.toml + [workspace] + root = true + ``` + + ```toml + # ws2/crate3/Cargo.toml + [workspace] + members = [".."] + ``` + ### Future Extensions Once Cargo understands a workspace of crates, we could easily extend various From d9a9f2d924cc65c2985464717206ea4f5979c7bc Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 16 Mar 2016 12:58:58 +0100 Subject: [PATCH 0801/1195] added discussion of std lib extension. --- text/0000-kinds-of-allocators.md | 73 ++++++++++++++++++++++++++++++++ 1 file changed, 73 insertions(+) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 01aca4a5cf0..417f5988779 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -460,6 +460,79 @@ fn main() { And that's all to the demo, folks. +### What about standard library containers? + +The intention of this RFC is that the Rust standard library will be +extended with parameteric allocator support: `Vec`, `HashMap`, etc +should all eventually be extended with the ability to use an +alternative allocator for their backing storage. 
+ +However, this RFC does not prescribe when or how this should happen. + +Under the design of this RFC, Allocators parameters are specified via +a *generic type parameter* on the container type. This strongly +implies that `Vec` and `HashMap` will need to be extended +with an allocator type parameter, i.e.: `Vec` and +`HashMap`. + +There are two reasons why such extension is left to later work, after +this RFC. + +#### Default type parameter fallback + +On its own, such a change would be backwards incompatible (i.e. a huge +breaking change), and also would simply be just plain inconvenient for +typical use cases. Therefore, the newly added type parameters will +almost certainly require a *default type*: `Vec` and +`HashMap`. + +Default type parameters themselves, in the context of type defintions, +are a stable part of the Rust language. + +However, the exact semantics of how default type parameters interact +with inference is still being worked out (in part *because* allocators +are a motivating use case), as one can see by reading the following: + +* RFC 213, "Finalize defaulted type parameters": https://github.com/rust-lang/rfcs/blob/master/text/0213-defaulted-type-params.md + + * Tracking Issue for RFC 213: Default Type Parameter Fallback: https://github.com/rust-lang/rust/issues/27336 + +* Feature gate defaulted type parameters appearing outside of types: https://github.com/rust-lang/rust/pull/30724 + +#### Fully general container integration needs Dropck Eyepatch + +The previous problem was largely one of programmer +ergonomics. However, there is also a subtle soundness issue that +arises due to an current implementation artifact. + +Standard library types like `Vec` and `HashMap` allow +instantiating the generic parameters `T`, `K`, `V` with types holding +lifetimes that do not strictly outlive that of the container itself. +(I will refer to such instantiations of `Vec` and `HashMap` +"same-lifetime instances" as a shorthand in this discussion.) 
+ +Same-lifetime instance support is currently implemented for `Vec` and +`HashMap` via an unstable attribute that is too +coarse-grained. Therefore, we cannot soundly add the allocator +parameter to `Vec` and `HashMap` while also continuing to allow +same-lifetime instances without first addressing this overly coarse +attribute. I have an open RFC to address this, the "Dropck Eyepatch" +RFC; that RFC explains in more detail why this problem arises, using +allocators as a specific motivating use case. + + * Concrete code illustrating this exact example (part of Dropck Eyepatch RFC): + https://github.com/pnkfelix/rfcs/blob/dropck-eyepatch/text/0000-dropck-param-eyepatch.md#example-vect-aallocatordefaultallocator + + * Nonparametric dropck RFC https://github.com/rust-lang/rfcs/blob/master/text/1238-nonparametric-dropck.md + +#### Standard library containers conclusion + +Rather than wait for the above issues to be resolved, this RFC +proposes that we at least stabilize the `Allocator` trait interface; +then we will at least have a starting point upon which to prototype +standard library integration. + ## Allocators and lifetimes [lifetimes]: #allocators-and-lifetimes From fe88acf2dbb857cdf59c8dff77b35ae1050960d6 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 16 Mar 2016 13:27:04 +0100 Subject: [PATCH 0802/1195] fix a typo. --- text/0000-kinds-of-allocators.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 417f5988779..2c52605e4d2 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -704,7 +704,7 @@ An instance of an allocator has many methods, but an implementor of the trait need only provide two method bodies: [alloc and dealloc][]. (This is only *somewhat* analogous to the `Iterator` trait in Rust. 
It -is currently very uncommon to override any methods of `Iterator` ecept +is currently very uncommon to override any methods of `Iterator` except for `fn next`. However, I expect it will be much more common for `Allocator` to override at least some of the other methods, like `fn realloc`.) From 2cdb57563e1c81cb25f9e4dc75b3bcea47bf459b Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 16 Mar 2016 13:28:18 +0100 Subject: [PATCH 0803/1195] Expand the walk-through with some inlining of names of methods being discussed. --- text/0000-kinds-of-allocators.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 2c52605e4d2..83d87db09e2 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -730,7 +730,7 @@ Of course, real-world allocation often needs more than just `alloc`/`dealloc`: in particular, one often wants to avoid extra copying if the existing block of memory can be conceptually expanded in place to meet new allocation needs. In other words, we want -`realloc`, plus alternatives to it that allow clients to avoid +`realloc`, plus alternatives to it (`alloc_excess`) that allow clients to avoid round-tripping through the allocator API. For this, the [memory reuse][] family of methods is appropriate. @@ -743,7 +743,8 @@ let my clients choose how the backing memory is chosen! Why do I have to wrestle with this `Kind` business?" I agree with the sentiment; that's why the `Allocator` trait provides -a family of methods capturing [common usage patterns][]. +a family of methods capturing [common usage patterns][], +for example, `a.alloc_one::<T>()` will return a `Unique<T>` (or error). ## Unchecked variants @@ -758,7 +759,8 @@ via local invariants in their container type). For these clients, the `Allocator` trait provides ["unchecked" variants][unchecked variants] of nearly all of its -methods.
+methods; so `a.alloc_unchecked(kind)` will return an `Option<Address>
` +(where `None` corresponds to allocation failure). The idea here is that `Allocator` implementors are encouraged to streamline the implmentations of such methods by assuming that all From b6c00503cd8981a292ba0c7121001eb59e579678 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 16 Mar 2016 13:32:36 +0100 Subject: [PATCH 0804/1195] Make `dealloc` infallibe (i.e. remove Result return-type from `fn dealloc`). --- text/0000-kinds-of-allocators.md | 45 +++++--------------------------- 1 file changed, 6 insertions(+), 39 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 83d87db09e2..dd7185a00a7 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -360,9 +360,8 @@ impl<'a> Allocator for &'a DumbBumpPool { } } - unsafe fn dealloc(&mut self, _ptr: Address, _kind: &Self::Kind) -> Result<(), Self::Error> { + unsafe fn dealloc(&mut self, _ptr: Address, _kind: &Self::Kind) { // this bump-allocator just no-op's on dealloc - Ok(()) } unsafe fn oom(&mut self) -> ! { @@ -1092,13 +1091,6 @@ few motivating examples that *are* clearly feasible and useful. (In fact, most of the "Variations correspond to potentially unresolved questions.) - * Should `dealloc` return a `Result` or not? (Under what - circumstances would we expect `dealloc` to fail in a manner worth - signalling? The main one I can think of is a transient failure, - which was in a previous version of the API but has since been removed. - Still, if errors *can* happen, maybe its best to provide *some* way - for a client to catch them and report them in context.) - * Are the type definitions for `Size`, `Capacity`, `Alignment`, and `Address` an abuse of the `NonZero` type? (Or do we just need some constructor for `NonZero` that asserts that the input is non-zero)? 
@@ -1772,17 +1764,7 @@ pub unsafe trait Allocator { /// `ptr` must have previously been provided via this allocator, /// and `kind` must *fit* the provided block (see above); /// otherwise yields undefined behavior. - /// - /// Returns `Err` only if deallocation fails in some fashion. - /// In this case callers must assume that ownership of the block has - /// been unrecoverably lost (memory may have been leaked). - /// - /// Note: Implementors are encouraged to avoid `Err`-failure from - /// `dealloc`; most memory allocation APIs do not support - /// signalling failure in their `free` routines, and clients are - /// likely to incorporate that assumption into their own code and - /// just `unwrap` the result of this call. - unsafe fn dealloc(&mut self, ptr: Address, kind: Kind) -> Result<(), Self::Error>; + unsafe fn dealloc(&mut self, ptr: Address, kind: Kind); /// Allocator-specific method for signalling an out-of-memory /// condition. @@ -1913,22 +1895,7 @@ pub unsafe trait Allocator { let result = self.alloc(new_kind); if let Ok(new_ptr) = result { ptr::copy(*ptr as *const u8, *new_ptr, cmp::min(*kind.size(), *new_kind.size())); - if let Err(_) = self.dealloc(ptr, kind) { - // all we can do from the realloc abstraction - // is either: - // - // 1. free the block we just finished copying - // into and pass the error up, - // 2. panic (same as if we had called `unwrap`), - // 3. try to dealloc again, or - // 4. ignore the dealloc error. - // - // They are all terrible; (1.) and (2.) seem unjustifiable, - // and (3.) seems likely to yield an infinite loop (unless - // we add back in some notion of a transient error - // into the API). - // So we choose (4.): ignore the dealloc error. - } + self.dealloc(ptr, kind); } result } @@ -1980,9 +1947,9 @@ pub unsafe trait Allocator { /// Deallocates a block suitable for holding an instance of `T`. /// /// Captures a common usage pattern for allocators. 
- unsafe fn dealloc_one(&mut self, mut ptr: Unique) -> Result<(), Self::Error> { + unsafe fn dealloc_one(&mut self, mut ptr: Unique) { let raw_ptr = NonZero::new(ptr.get_mut() as *mut T as *mut u8); - self.dealloc(raw_ptr, Kind::new::().unwrap()) + self.dealloc(raw_ptr, Kind::new::().unwrap()); } /// Allocates a block suitable for holding `n` instances of `T`. @@ -2062,7 +2029,7 @@ pub unsafe trait Allocator { /// Otherwise yields undefined behavior. unsafe fn dealloc_unchecked(&mut self, ptr: Address, kind: Kind) { // (default implementation carries checks, but impl's are free to omit them.) - self.dealloc(ptr, kind).unwrap() + self.dealloc(ptr, kind).unwrap(); } /// Returns a pointer suitable for holding data described by From 5138a3214731d401abb9418cf4d477ef75bebc7c Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 16 Mar 2016 13:33:34 +0100 Subject: [PATCH 0805/1195] some minor rephrasing in the text. --- text/0000-kinds-of-allocators.md | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index dd7185a00a7..dd009b648f6 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -803,7 +803,7 @@ to both the allocation and deallocation call sites. a simple value or array type is being allocated. * The `alloc_array_unchecked` and `dealloc_array_unchecked` likewise - capture a similar pattern, but are "less safe" in that they put more + capture a common pattern, but are "less safe" in that they put more of an onus on the caller to validate the input parameters before calling the methods. @@ -854,8 +854,7 @@ out-of-memory (OOM) conditions on allocation failure. 
However, since I also suspect that some programs would benefit from contextual information about *which* allocator is reporting memory exhaustion, I have made `oom` a method of the `Allocator` trait, so -that allocator clients can just call that on error (assuming they want -to trust the failure behavior of the allocator). +that allocator clients have the option of calling that on error. ### Why is `usable_size` ever needed? Why not call `kind.size()` directly, as is done in the default implementation? @@ -1755,8 +1754,14 @@ pub unsafe trait Allocator { /// initialized. (Extension subtraits might restrict this /// behavior, e.g. to ensure initialization.) /// - /// Returns `Err` if allocation fails or if `kind` does + /// Returning `Err` indicates that either memory is exhausted or `kind` does /// not meet allocator's size or alignment constraints. + /// + /// Implementations are encouraged to return `Err` on memory + /// exhaustion rather than panicking or aborting, but this is + /// not a strict requirement. (Specifically: it is *legal* to use + /// this trait to wrap an underlying native allocation library + /// that aborts on memory exhaustion.) unsafe fn alloc(&mut self, kind: Kind) -> Result; /// Deallocate the memory referenced by `ptr`. @@ -1774,13 +1779,12 @@ pub unsafe trait Allocator { /// practice this means implementors should eschew allocating, /// especially from `self` (directly or indirectly). /// - /// Implementors of this trait are discouraged from panicking or - /// aborting from other methods in the event of memory exhaustion; + /// Implementions of this trait's allocation methods are discouraged + /// from panicking (or aborting) in the event of memory exhaustion; /// instead they should return an appropriate error from the /// invoked method, and let the client decide whether to invoke /// this `oom` method. unsafe fn oom(&mut self) -> ! 
{ ::core::intrinsics::abort() } - ``` ### Allocator-specific quantities and limits From 4aa94d92d67d0b410f1f36ada85909b3a4eea634 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 16 Mar 2016 13:41:19 +0100 Subject: [PATCH 0806/1195] alpha-rename `Kind` to `Layout`. (This really was an improvement, if only because it forced me to realize that in some contexts in the text I was using the word "kind" to mean something different than the layout structures being passed around... which sometimes can seem like a nice pun, but overall I suspect it was just a net increase in potential confusion.) --- text/0000-kinds-of-allocators.md | 366 ++++++++++++++++--------------- 1 file changed, 185 insertions(+), 181 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index dd009b648f6..35f7a08eb41 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -115,24 +115,24 @@ individual sections of code.) allocators can inject more specific error types to indicate why an allocation failed. - * The metadata for any allocation is captured in a `Kind` + * The metadata for any allocation is captured in a `Layout` abstraction. This type carries (at minimum) the size and alignment requirements for a memory request. - * The `Kind` type provides a large family of functional construction + * The `Layout` type provides a large family of functional construction methods for building up the description of how memory is laid out. - * Any sized type `T` can be mapped to its `Kind`, via `Kind::new::()`, + * Any sized type `T` can be mapped to its `Layout`, via `Layout::new::()`, - * Heterogenous structure; e.g. `kind1.extend(kind2)`, + * Heterogenous structure; e.g. `layout1.extend(layout2)`, - * Homogenous array types: `kind.repeat(n)` (for `n: usize`), + * Homogenous array types: `layout.repeat(n)` (for `n: usize`), * There are packed and unpacked variants for the latter two methods. 
* Helper `Allocator` methods like `fn alloc_one` and `fn alloc_array` allow client code to interact with an allocator - without ever directly constructing a `Kind`. + without ever directly constructing a `Layout`. * Once an `Allocator` implementor has the `fn alloc` and `fn dealloc` methods working, it can provide overrides of the other methods, @@ -335,14 +335,14 @@ Here is the demo implementation of `Allocator` for the type. ```rust impl<'a> Allocator for &'a DumbBumpPool { - type Kind = alloc::Kind; + type Layout = alloc::Layout; type Error = BumpAllocError; - unsafe fn alloc(&mut self, kind: &Self::Kind) -> Result { + unsafe fn alloc(&mut self, layout: &Self::Layout) -> Result { let curr = self.avail.load(Ordering::Relaxed) as usize; - let align = *kind.align(); + let align = *layout.align(); let curr_aligned = (curr.overflowing_add(align - 1)) & !(align - 1); - let size = *kind.size(); + let size = *layout.size(); let remaining = (self.end as usize) - curr_aligned; if remaining <= size { return Err(BumpAllocError::MemoryExhausted); @@ -360,7 +360,7 @@ impl<'a> Allocator for &'a DumbBumpPool { } } - unsafe fn dealloc(&mut self, _ptr: Address, _kind: &Self::Kind) { + unsafe fn dealloc(&mut self, _ptr: Address, _layout: &Self::Layout) { // this bump-allocator just no-op's on dealloc } @@ -711,17 +711,17 @@ realloc`.) The `alloc` method returns an `Address` when it succeeds, and `dealloc` takes such an address as its input. But the client must also provide metadata for the allocated block like its size and alignment. -This is encapsulated in the `Kind` argument to `alloc` and `dealloc`. +This is encapsulated in the `Layout` argument to `alloc` and `dealloc`. -### Kinds of allocations +### Memory layouts -A `Kind` just carries the metadata necessary for satisfying an +A `Layout` just carries the metadata necessary for satisfying an allocation request. Its (current, private) representation is just a size and alignment. 
-The more interesting thing about `Kind` is the -family of public methods associated with it for building new kinds via -composition; these are shown in the [kind api][]. +The more interesting thing about `Layout` is the +family of public methods associated with it for building new layouts via +composition; these are shown in the [layout api][]. ### Reallocation Methods @@ -736,10 +736,10 @@ For this, the [memory reuse][] family of methods is appropriate. ### Type-based Helper Methods -Some readers might skim over the `Kind` API and immediately say "yuck, +Some readers might skim over the `Layout` API and immediately say "yuck, all I wanted to do was allocate some nodes for a tree-structure and let my clients choose how the backing memory is chosen! Why do I have -to wrestle with this `Kind` business?" +to wrestle with this `Layout` business?" I agree with the sentiment; that's why the `Allocator` trait provides a family of methods capturing [common usage patterns][], @@ -758,7 +758,7 @@ via local invariants in their container type). For these clients, the `Allocator` trait provides ["unchecked" variants][unchecked variants] of nearly all of its -methods; so `a.alloc_unchecked(kind)` will return an `Option<Address>
` +methods; so `a.alloc_unchecked(layout)` will return an `Option<Address>
` (where `None` corresponds to allocation failure). The idea here is that `Allocator` implementors are encouraged @@ -811,12 +811,12 @@ to both the allocation and deallocation call sites. callers who can make use of excess memory to avoid unnecessary calls to `realloc`. -### Why the `Kind` abstraction? +### Why the `Layout` abstraction? While we do want to require clients to hand the allocator the size and alignment, we have found that the code to compute such things follows regular patterns. It makes more sense to factor those patterns out -into a common abstraction; this is what `Kind` provides: a high-level +into a common abstraction; this is what `Layout` provides: a high-level API for describing the memory layout of a composite structure by composing the layout of its subparts. @@ -856,14 +856,14 @@ contextual information about *which* allocator is reporting memory exhaustion, I have made `oom` a method of the `Allocator` trait, so that allocator clients have the option of calling that on error. -### Why is `usable_size` ever needed? Why not call `kind.size()` directly, as is done in the default implementation? +### Why is `usable_size` ever needed? Why not call `layout.size()` directly, as is done in the default implementation? -`kind.size()` returns the minimum required size that the client needs. +`layout.size()` returns the minimum required size that the client needs. In a block-based allocator, this may be less than the *actual* size -that the allocator would ever provide to satisfy that `kind` of +that the allocator would ever provide to satisfy that kind of request. Therefore, `usable_size` provides a way for clients to observe what the minimum actual size of an allocated block for -that`kind` would be, for a given allocator. +that`layout` would be, for a given allocator. (Note that the documentation does say that in general it is better for clients to use `alloc_excess` and `realloc_excess` instead, if they @@ -899,19 +899,23 @@ designs come around. 
But the same is not true for user-defined allocators: we want to ensure that adding support for them does not inadvertantly kill any chance for adding GC later. -### The inspiration for Kind +### The inspiration for Layout Some aspects of the design of this RFC were selected in the hopes that it would make such integration easier. In particular, the introduction of the relatively high-level `Kind` abstraction was developed, in part, as a way that a GC-aware allocator would build up a tracing -method associated with a kind. +method associated with a layout. Then I realized that the `Kind` abstraction may be valuable on its own, without GC: It encapsulates important patterns when working with representing data as memory records. -So, this RFC offers the `Kind` abstraction without promising that it +(Later we decided to rename `Kind` to `Layout`, in part to avoid +confusion with the use of the word "kind" in the context of +higher-kinded types (HKT).) + +So, this RFC offers the `Layout` abstraction without promising that it solves the GC problem. (It might, or it might not; we don't know yet.) ### Forwards-compatibility @@ -964,7 +968,7 @@ from [RFC PR 39][]. While that is true, it seems like it would be a little short-sighted. In particular, I have neither proven *nor* disproven the value of -`Kind` system described here with respect to GC integration. +`Layout` system described here with respect to GC integration. As far as I know, it is the closest thing we have to a workable system for allowing client code of allocators to accurately describe the @@ -972,15 +976,15 @@ layout of values they are planning to allocate, which is the main ingredient I believe to be necessary for the kind of dynamic reflection that a GC will require of a user-defined allocator. 
-## Make `Kind` an associated type of `Allocator` trait +## Make `Layout` an associated type of `Allocator` trait -I explored making an `AllocKind` bound and then having +I explored making an `AllocLayout` bound and then having ```rust pub unsafe trait Allocator { /// Describes the sort of records that this allocator can /// construct. - type Kind: AllocKind; + type Layout: AllocLayout; ... } @@ -994,28 +998,28 @@ But the question is: What benefit does it bring? The main one I could imagine is that it might allow us to introduce a division, at the type-system level, between two kinds of allocators: those that are integrated with the GC (i.e., have an associated -`Allocator::Kind` that ensures that all allocated blocks are scannable +`Allocator::Layout` that ensures that all allocated blocks are scannable by a GC) and allocators that are *not* integrated with the GC (i.e., -have an associated `Allocator::Kind` that makes no guarantees about +have an associated `Allocator::Layout` that makes no guarantees about one will know how to scan the allocated blocks. However, no such design has proven itself to be "obviously feasible to -implement," and therefore it would be unreasonable to make the `Kind` +implement," and therefore it would be unreasonable to make the `Layout` an associated type of the `Allocator` trait without having at least a few motivating examples that *are* clearly feasible and useful. -## Variations on the `Kind` API +## Variations on the `Layout` API - * Should `Kind` offer a `fn resize(&self, new_size: usize) -> Kind` constructor method? - (Such a method would rule out deriving GC tracers from kinds; but we could + * Should `Layout` offer a `fn resize(&self, new_size: usize) -> Layout` constructor method? + (Such a method would rule out deriving GC tracers from layouts; but we could maybe provide it as an `unsafe` method.) 
- * Should `Kind` ensure an invariant that its associated size is + * Should `Layout` ensure an invariant that its associated size is always a multiple of its alignment? * Doing this would allow simplifying a small part of the API, - namely the distinct `Kind::repeat` (returns both a kind and an - offset) versus `Kind::array` (where the offset is derivable from + namely the distinct `Layout::repeat` (returns both a layout and an + offset) versus `Layout::array` (where the offset is derivable from the input `T`). * Such a constraint would have precedent; in particular, the @@ -1027,21 +1031,21 @@ few motivating examples that *are* clearly feasible and useful. invariant implies a certain loss of expressiveness over what we already provide today. - * Should `Kind` ensure an invariant that its associated size is always positive? + * Should `Layout` ensure an invariant that its associated size is always positive? * Pro: Removes something that allocators would need to check about - input kinds (the backing memory allocators will tend to require + input layouts (the backing memory allocators will tend to require that the input sizes are positive). * Con: Requiring positive size means that zero-sized types do not have an associated - `Kind`. That's not the end of the world, but it does make the `Kind` API slightly - less convenient (e.g. one cannot use `extend` with a zero-sized kind to - forcibly inject padding, because zero-sized kinds do not exist). + `Layout`. That's not the end of the world, but it does make the `Layout` API slightly + less convenient (e.g. one cannot use `extend` with a zero-sized layout to + forcibly inject padding, because zero-sized layouts do not exist). - * Should `Kind::align_to` add padding to the associated size? (Probably not; this would + * Should `Layout::align_to` add padding to the associated size? (Probably not; this would make it impossible to express certain kinds of patterns.)
- * Should the `Kind` methods that might "fail" return `Result` instead of `Option`? + * Should the `Layout` methods that might "fail" return `Result` instead of `Option`? ## Variations on the `Allocator` API @@ -1050,15 +1054,15 @@ few motivating examples that *are* clearly feasible and useful. * Clearly `fn dealloc` and `fn realloc` need to be `unsafe`, since feeding in improper inputs could cause unsound behavior. But is there any analogous input to `fn alloc` that could cause - unsoundness (assuming that the `Kind` struct enforces invariants + unsoundness (assuming that the `Layout` struct enforces invariants like "the associated size is non-zero")? * (I left it as `unsafe fn alloc` just to keep the API uniform with `dealloc` and `realloc`.) - * Should `Allocator::realloc` not require that `new_kind.align()` - evenly divide `kind.align()`? In particular, it is not too - expensive to check if the two kinds are not compatible, and fall + * Should `Allocator::realloc` not require that `new_layout.align()` + evenly divide `layout.align()`? In particular, it is not too + expensive to check if the two layouts are not compatible, and fall back on `alloc`/`dealloc` in that case. * Should `Allocator` not provide unchecked variants on `fn alloc`, @@ -1085,7 +1089,7 @@ few motivating examples that *are* clearly feasible and useful. * Since we cannot do `RefCell` (see FIXME above), what is our standard recommendation for what to do instead? - * Should `Kind` be an associated type of `Allocator` (see + * Should `Layout` be an associated type of `Allocator` (see [alternatives][] section for discussion). (In fact, most of the "Variations correspond to potentially unresolved questions.) @@ -1323,25 +1327,25 @@ fn size_align() -> (usize, usize) { ``` -### Kind API -[kind api]: #kind-api +### Layout API +[layout api]: #layout-api ```rust /// Category for a memory record. /// -/// An instance of `Kind` describes a particular layout of memory. 
-/// You build a `Kind` up as an input to give to an allocator. +/// An instance of `Layout` describes a particular layout of memory. +/// You build a `Layout` up as an input to give to an allocator. /// -/// All kinds have an associated positive size; note that this implies -/// zero-sized types have no corresponding kind. +/// All layouts have an associated positive size; note that this implies +/// zero-sized types have no corresponding layout. #[derive(Copy, Clone, Debug, PartialEq, Eq)] -pub struct Kind { +pub struct Layout { // size of the requested block of memory, measured in bytes. size: Size, // alignment of the requested block of memory, measured in bytes. // we ensure that this is always a power-of-two, because API's ///like `posix_memalign` require it and it is a reasonable - // constraint to impose on Kind constructors. + // constraint to impose on Layout constructors. // // (However, we do not analogously require `align >= sizeof(void*)`, // even though that is *also* a requirement of `posix_memalign`.) @@ -1353,59 +1357,59 @@ pub struct Kind { // (potentially switching to overflowing_add and // overflowing_mul as necessary). -impl Kind { +impl Layout { // (private constructor) - fn from_size_align(size: usize, align: usize) -> Kind { + fn from_size_align(size: usize, align: usize) -> Layout { assert!(align.is_power_of_two()); let size = unsafe { assert!(size > 0); NonZero::new(size) }; let align = unsafe { assert!(align > 0); NonZero::new(align) }; - Kind { size: size, align: align } + Layout { size: size, align: align } } - /// The minimum size in bytes for a memory block of this kind. + /// The minimum size in bytes for a memory block of this layout. pub fn size(&self) -> NonZero { self.size } - /// The minimum byte alignment for a memory block of this kind. + /// The minimum byte alignment for a memory block of this layout. pub fn align(&self) -> NonZero { self.align } - /// Constructs a `Kind` suitable for holding a value of type `T`. 
- /// Returns `None` if no such kind exists (e.g. for zero-sized `T`). + /// Constructs a `Layout` suitable for holding a value of type `T`. + /// Returns `None` if no such layout exists (e.g. for zero-sized `T`). pub fn new() -> Option { let (size, align) = size_align::(); - if size > 0 { Some(Kind::from_size_align(size, align)) } else { None } + if size > 0 { Some(Layout::from_size_align(size, align)) } else { None } } - /// Produces kind describing a record that could be used to + /// Produces layout describing a record that could be used to /// allocate backing structure for `T` (which could be a trait /// or other unsized type like a slice). /// - /// Returns `None` when no such kind exists; for example, when `x` + /// Returns `None` when no such layout exists; for example, when `x` /// is a reference to a zero-sized type. pub fn for_value(t: &T) -> Option { let (size, align) = (mem::size_of_val(t), mem::align_of_val(t)); if size > 0 { - Some(Kind::from_size_align(size, align)) + Some(Layout::from_size_align(size, align)) } else { None } } - /// Creates a kind describing the record that can hold a value - /// of the same kind as `self`, but that also is aligned to + /// Creates a layout describing the record that can hold a value + /// of the same layout as `self`, but that also is aligned to /// alignment `align` (measured in bytes). /// /// If `self` already meets the prescribed alignment, then returns /// `self`. /// /// Note that this method does not add any padding to the overall - /// size, regardless of whether the returned kind has a different + /// size, regardless of whether the returned layout has a different /// alignment. In other words, if `K` has size 16, `K.align_to(32)` /// will *still* have size 16. pub fn align_to(&self, align: Alignment) -> Self { if align > self.align { let pow2_align = align.checked_next_power_of_two().unwrap(); debug_assert!(pow2_align > 0); // (this follows from self.align > 0...) 
- Kind { align: unsafe { NonZero::new(pow2_align) }, + Layout { align: unsafe { NonZero::new(pow2_align) }, ..*self } } else { *self @@ -1430,11 +1434,11 @@ impl Kind { return len_rounded_up - len; } - /// Creates a kind describing the record for `n` instances of + /// Creates a layout describing the record for `n` instances of /// `self`, with a suitable amount of padding between each to /// ensure that each instance is given its requested size and /// alignment. On success, returns `(k, offs)` where `k` is the - /// kind of the array and `offs` is the distance between the start + /// layout of the array and `offs` is the distance between the start /// of each element in the array. /// /// On zero `n` or arithmetic overflow, returns `None`. @@ -1448,15 +1452,15 @@ impl Kind { None => return None, Some(alloc_size) => alloc_size, }; - Some((Kind::from_size_align(alloc_size, *self.align), padded_size)) + Some((Layout::from_size_align(alloc_size, *self.align), padded_size)) } - /// Creates a kind describing the record for `self` followed by + /// Creates a layout describing the record for `self` followed by /// `next`, including any necessary padding to ensure that `next` - /// will be properly aligned. Note that the result kind will + /// will be properly aligned. Note that the result layout will /// satisfy the alignment properties of both `self` and `next`. /// - /// Returns `Some((k, offset))`, where `k` is kind of the concatenated + /// Returns `Some((k, offset))`, where `k` is layout of the concatenated /// record and `offset` is the relative location, in bytes, of the /// start of the `next` embedded witnin the concatenated record /// (assuming that the record itself starts at offset 0). @@ -1464,14 +1468,14 @@ impl Kind { /// On arithmetic overflow, returns `None`. 
pub fn extend(&self, next: Self) -> Option<(Self, usize)> { let new_align = unsafe { NonZero::new(cmp::max(*self.align, *next.align)) }; - let realigned = Kind { align: new_align, ..*self }; + let realigned = Layout { align: new_align, ..*self }; let pad = realigned.padding_needed_for(new_align); let offset = *self.size() + pad; let new_size = offset + *next.size(); - Some((Kind::from_size_align(new_size, *new_align), offset)) + Some((Layout::from_size_align(new_size, *new_align), offset)) } - /// Creates a kind describing the record for `n` instances of + /// Creates a layout describing the record for `n` instances of /// `self`, with no padding between each instance. /// /// On zero `n` or overflow, returns `None`. @@ -1481,15 +1485,15 @@ impl Kind { Some(scaled) => scaled, }; let size = unsafe { assert!(scaled > 0); NonZero::new(scaled) }; - Some(Kind { size: size, align: self.align }) + Some(Layout { size: size, align: self.align }) } - /// Creates a kind describing the record for `self` followed by + /// Creates a layout describing the record for `self` followed by /// `next` with no additional padding between the two. Since no /// padding is inserted, the alignment of `next` is irrelevant, - /// and is not incoporated *at all* into the resulting kind. + /// and is not incoporated *at all* into the resulting layout. /// - /// Returns `(k, offset)`, where `k` is kind of the concatenated + /// Returns `(k, offset)`, where `k` is layout of the concatenated /// record and `offset` is the relative location, in bytes, of the /// start of the `next` embedded witnin the concatenated record /// (assuming that the record itself starts at offset 0). 
@@ -1505,7 +1509,7 @@ impl Kind { Some(new_size) => new_size, }; let new_size = unsafe { NonZero::new(new_size) }; - Some((Kind { size: new_size, ..*self }, *self.size())) + Some((Layout { size: new_size, ..*self }, *self.size())) } // Below family of methods *assume* inputs are pre- or @@ -1513,23 +1517,23 @@ impl Kind { ///do indirectly validate, but that is not part of their /// specification.) // - // Since invalid inputs could yield ill-formed kinds, these + // Since invalid inputs could yield ill-formed layouts, these // methods are `unsafe`. - /// Creates kind describing the record for a single instance of `T`. + /// Creates layout describing the record for a single instance of `T`. /// Requires `T` has non-zero size. pub unsafe fn new_unchecked() -> Self { let (size, align) = size_align::(); - Kind::from_size_align(size, align) + Layout::from_size_align(size, align) } - /// Creates a kind describing the record for `self` followed by + /// Creates a layout describing the record for `self` followed by /// `next`, including any necessary padding to ensure that `next` - /// will be properly aligned. Note that the result kind will + /// will be properly aligned. Note that the result layout will /// satisfy the alignment properties of both `self` and `next`. /// - /// Returns `(k, offset)`, where `k` is kind of the concatenated + /// Returns `(k, offset)`, where `k` is layout of the concatenated /// record and `offset` is the relative location, in bytes, of the /// start of the `next` embedded witnin the concatenated record /// (assuming that the record itself starts at offset 0). @@ -1539,7 +1543,7 @@ impl Kind { self.extend(next).unwrap() } - /// Creates a kind describing the record for `n` instances of + /// Creates a layout describing the record for `n` instances of /// `self`, with a suitable amount of padding between each. /// /// Requires non-zero `n` and no arithmetic overflow from inputs. 
@@ -1548,7 +1552,7 @@ impl Kind { self.repeat(n).unwrap() } - /// Creates a kind describing the record for `n` instances of + /// Creates a layout describing the record for `n` instances of /// `self`, with no padding between each instance. /// /// Requires non-zero `n` and no arithmetic overflow from inputs. @@ -1557,12 +1561,12 @@ impl Kind { self.repeat_packed(n).unwrap() } - /// Creates a kind describing the record for `self` followed by + /// Creates a layout describing the record for `self` followed by /// `next` with no additional padding between the two. Since no /// padding is inserted, the alignment of `next` is irrelevant, - /// and is not incoporated *at all* into the resulting kind. + /// and is not incoporated *at all* into the resulting layout. /// - /// Returns `(k, offset)`, where `k` is kind of the concatenated + /// Returns `(k, offset)`, where `k` is layout of the concatenated /// record and `offset` is the relative location, in bytes, of the /// start of the `next` embedded witnin the concatenated record /// (assuming that the record itself starts at offset 0). @@ -1577,11 +1581,11 @@ impl Kind { self.extend_packed(next).unwrap() } - /// Creates a kind describing the record for a `[T; n]`. + /// Creates a layout describing the record for a `[T; n]`. /// /// On zero `n`, zero-sized `T`, or arithmetic overflow, returns `None`. pub fn array(n: usize) -> Option { - Kind::new::() + Layout::new::() .and_then(|k| k.repeat(n)) .map(|(k, offs)| { debug_assert!(offs == mem::size_of::()); @@ -1589,12 +1593,12 @@ impl Kind { }) } - /// Creates a kind describing the record for a `[T; n]`. + /// Creates a layout describing the record for a `[T; n]`. /// /// Requires nonzero `n`, nonzero-sized `T`, and no arithmetic /// overflow; otherwise behavior undefined. 
pub fn array_unchecked(n: usize) -> Self { - Kind::array::(n).unwrap() + Layout::array::(n).unwrap() } } @@ -1709,17 +1713,17 @@ impl AllocError for AllocErr { ```rust /// An implementation of `Allocator` can allocate, reallocate, and -/// deallocate arbitrary blocks of data described via `Kind`. +/// deallocate arbitrary blocks of data described via `Layout`. /// -/// Some of the methods require that a kind *fit* a memory block. -/// What it means for a kind to "fit" a memory block means is that +/// Some of the methods require that a layout *fit* a memory block. +/// What it means for a layout to "fit" a memory block means is that /// the following two conditions must hold: /// -/// 1. The block's starting address must be aligned to `kind.align()`. +/// 1. The block's starting address must be aligned to `layout.align()`. /// /// 2. The block's size must fall in the range `[use_min, use_max]`, where: /// -/// * `use_min` is `self.usable_size(kind).0`, and +/// * `use_min` is `self.usable_size(layout).0`, and /// /// * `use_max` is the capacity that was (or would have been) /// returned when (if) the block was allocated via a call to @@ -1727,7 +1731,7 @@ impl AllocError for AllocErr { /// /// Note that: /// -/// * the size of the kind most recently used to allocate the block +/// * the size of the layout most recently used to allocate the block /// is guaranteed to be in the range `[use_min, use_max]`, and /// /// * a lower-bound on `use_max` can be safely approximated by a call to @@ -1748,13 +1752,13 @@ pub unsafe trait Allocator { ```rust /// Returns a pointer suitable for holding data described by - /// `kind`, meeting its size and alignment guarantees. + /// `layout`, meeting its size and alignment guarantees. /// /// The returned block of storage may or may not have its contents /// initialized. (Extension subtraits might restrict this /// behavior, e.g. to ensure initialization.) 
/// - /// Returning `Err` indicates that either memory is exhausted or `kind` does + /// Returning `Err` indicates that either memory is exhausted or `layout` does /// not meet allocator's size or alignment constraints. /// /// Implementations are encouraged to return `Err` on memory @@ -1762,14 +1766,14 @@ pub unsafe trait Allocator { /// not a strict requirement. (Specifically: it is *legal* to use /// this trait to wrap an underlying native allocation library /// that aborts on memory exhaustion.) - unsafe fn alloc(&mut self, kind: Kind) -> Result; + unsafe fn alloc(&mut self, layout: Layout) -> Result; /// Deallocate the memory referenced by `ptr`. /// /// `ptr` must have previously been provided via this allocator, - /// and `kind` must *fit* the provided block (see above); + /// and `layout` must *fit* the provided block (see above); /// otherwise yields undefined behavior. - unsafe fn dealloc(&mut self, ptr: Address, kind: Kind); + unsafe fn dealloc(&mut self, ptr: Address, layout: Layout); /// Allocator-specific method for signalling an out-of-memory /// condition. @@ -1812,10 +1816,10 @@ pub unsafe trait Allocator { fn max_align(&self) -> Option { None } /// Returns bounds on the guaranteed usable size of a successful - /// allocation created with the specified `kind`. + /// allocation created with the specified `layout`. /// - /// In particular, for a given kind `k`, if `usable_size(k)` returns - /// `(l, m)`, then one can use a block of kind `k` as if it has any + /// In particular, for a given layout `k`, if `usable_size(k)` returns + /// `(l, m)`, then one can use a block of layout `k` as if it has any /// size in the range `[l, m]` (inclusive). /// /// (All implementors of `fn usable_size` must ensure that @@ -1835,8 +1839,8 @@ pub unsafe trait Allocator { /// However, for clients that do not wish to track the capacity /// returned by `alloc_excess` locally, this method is likely to /// produce useful results. 
- unsafe fn usable_size(&self, kind: Kind) -> (Capacity, Capacity) { - (kind.size(), kind.size()) + unsafe fn usable_size(&self, layout: Layout) -> (Capacity, Capacity) { + (layout.size(), layout.size()) } ``` @@ -1849,21 +1853,21 @@ pub unsafe trait Allocator { // realloc. alloc_excess, realloc_excess /// Returns a pointer suitable for holding data described by - /// `new_kind`, meeting its size and alignment guarantees. To + /// `new_layout`, meeting its size and alignment guarantees. To /// accomplish this, this may extend or shrink the allocation - /// referenced by `ptr` to fit `new_kind`. + /// referenced by `ptr` to fit `new_layout`. /// /// * `ptr` must have previously been provided via this allocator. /// - /// * `kind` must *fit* the `ptr` (see above). (The `new_kind` + /// * `layout` must *fit* the `ptr` (see above). (The `new_layout` /// argument need not fit it.) /// /// Behavior undefined if either of latter two constraints are unmet. /// - /// In addition, `new_kind` should not impose a stronger alignment - /// constraint than `kind`. (In other words, `new_kind.align()` - /// must evenly divide `kind.align()`; note this implies the - /// alignment of `new_kind` must not exceed that of `kind`.) + /// In addition, `new_layout` should not impose a stronger alignment + /// constraint than `layout`. (In other words, `new_layout.align()` + /// must evenly divide `layout.align()`; note this implies the + /// alignment of `new_layout` must not exceed that of `layout`.) /// However, behavior is well-defined (though underspecified) when /// this constraint is violated; further discussion below. /// @@ -1874,12 +1878,12 @@ pub unsafe trait Allocator { /// transferred back to the caller again via the return value of /// this method). 
/// - /// Returns `Err` only if `new_kind` does not meet the allocator's + /// Returns `Err` only if `new_layout` does not meet the allocator's /// size and alignment constraints of the allocator or the - /// alignment of `kind`, or if reallocation otherwise fails. (Note + /// alignment of `layout`, or if reallocation otherwise fails. (Note /// that did not say "if and only if" -- in particular, an /// implementation of this method *can* return `Ok` if - /// `new_kind.align() > old_kind.align()`; or it can return `Err` + /// `new_layout.align() > old_layout.align()`; or it can return `Err` /// in that scenario.) /// /// If this method returns `Err`, then ownership of the memory @@ -1887,40 +1891,40 @@ pub unsafe trait Allocator { /// contents of the memory block are unaltered. unsafe fn realloc(&mut self, ptr: Address, - kind: Kind, - new_kind: Kind) -> Result { - let (min, max) = self.usable_size(kind); - let s = new_kind.size(); - // All Kind alignments are powers of two, so a comparison + layout: Layout, + new_layout: Layout) -> Result { + let (min, max) = self.usable_size(layout); + let s = new_layout.size(); + // All Layout alignments are powers of two, so a comparison // suffices here (rather than resorting to a `%` operation). - if min <= s && s <= max && new_kind.align() <= kind.align() { + if min <= s && s <= max && new_layout.align() <= layout.align() { return Ok(ptr); } else { - let result = self.alloc(new_kind); + let result = self.alloc(new_layout); if let Ok(new_ptr) = result { - ptr::copy(*ptr as *const u8, *new_ptr, cmp::min(*kind.size(), *new_kind.size())); - self.dealloc(ptr, kind); + ptr::copy(*ptr as *const u8, *new_ptr, cmp::min(*layout.size(), *new_layout.size())); + self.dealloc(ptr, layout); } result } } /// Behaves like `fn alloc`, but also returns the whole size of - /// the returned block. For some `kind` inputs, like arrays, this + /// the returned block. 
For some `layout` inputs, like arrays, this /// may include extra storage usable for additional data. - unsafe fn alloc_excess(&mut self, kind: Kind) -> Result { - self.alloc(kind).map(|p| Excess(p, self.usable_size(kind).1)) + unsafe fn alloc_excess(&mut self, layout: Layout) -> Result { + self.alloc(layout).map(|p| Excess(p, self.usable_size(layout).1)) } /// Behaves like `fn realloc`, but also returns the whole size of - /// the returned block. For some `kind` inputs, like arrays, this + /// the returned block. For some `layout` inputs, like arrays, this /// may include extra storage usable for additional data. unsafe fn realloc_excess(&mut self, ptr: Address, - kind: Kind, - new_kind: Kind) -> Result { - self.realloc(ptr, kind, new_kind) - .map(|p| Excess(p, self.usable_size(new_kind).1)) + layout: Layout, + new_layout: Layout) -> Result { + self.realloc(ptr, layout, new_layout) + .map(|p| Excess(p, self.usable_size(new_layout).1)) } ``` @@ -1939,7 +1943,7 @@ pub unsafe trait Allocator { /// The returned block is suitable for passing to the /// `alloc`/`realloc` methods of this allocator. unsafe fn alloc_one(&mut self) -> Result, Self::Error> { - if let Some(k) = Kind::new::() { + if let Some(k) = Layout::new::() { self.alloc(k).map(|p|Unique::new(*p as *mut T)) } else { // (only occurs for zero-sized T) @@ -1953,7 +1957,7 @@ pub unsafe trait Allocator { /// Captures a common usage pattern for allocators. unsafe fn dealloc_one(&mut self, mut ptr: Unique) { let raw_ptr = NonZero::new(ptr.get_mut() as *mut T as *mut u8); - self.dealloc(raw_ptr, Kind::new::().unwrap()); + self.dealloc(raw_ptr, Layout::new::().unwrap()); } /// Allocates a block suitable for holding `n` instances of `T`. @@ -1963,8 +1967,8 @@ pub unsafe trait Allocator { /// The returned block is suitable for passing to the /// `alloc`/`realloc` methods of this allocator. 
unsafe fn alloc_array(&mut self, n: usize) -> Result, Self::Error> { - match Kind::array::(n) { - Some(kind) => self.alloc(kind).map(|p|Unique::new(*p as *mut T)), + match Layout::array::(n) { + Some(layout) => self.alloc(layout).map(|p|Unique::new(*p as *mut T)), None => Err(Self::Error::invalid_input()), } } @@ -1981,7 +1985,7 @@ pub unsafe trait Allocator { ptr: Unique, n_old: usize, n_new: usize) -> Result, Self::Error> { - let old_new_ptr = (Kind::array::(n_old), Kind::array::(n_new), *ptr); + let old_new_ptr = (Layout::array::(n_old), Layout::array::(n_new), *ptr); if let (Some(k_old), Some(k_new), ptr) = old_new_ptr { self.realloc(NonZero::new(ptr as *mut u8), k_old, k_new) .map(|p|Unique::new(*p as *mut T)) @@ -1995,7 +1999,7 @@ pub unsafe trait Allocator { /// Captures a common usage pattern for allocators. unsafe fn dealloc_array(&mut self, ptr: Unique, n: usize) -> Result<(), Self::Error> { let raw_ptr = NonZero::new(*ptr as *mut u8); - if let Some(k) = Kind::array::(n) { + if let Some(k) = Layout::array::(n) { self.dealloc(raw_ptr, k) } else { Err(Self::Error::invalid_input()) @@ -2011,7 +2015,7 @@ pub unsafe trait Allocator { // UNCHECKED METHOD VARIANTS /// Returns a pointer suitable for holding data described by - /// `kind`, meeting its size and alignment guarantees. + /// `layout`, meeting its size and alignment guarantees. /// /// The returned block of storage may or may not have its contents /// initialized. (Extension subtraits might restrict this @@ -2021,25 +2025,25 @@ pub unsafe trait Allocator { /// /// Behavior undefined if input does not meet size or alignment /// constraints of this allocator. - unsafe fn alloc_unchecked(&mut self, kind: Kind) -> Option
{ + unsafe fn alloc_unchecked(&mut self, layout: Layout) -> Option<Address>
{ // (default implementation carries checks, but impl's are free to omit them.) - self.alloc(kind).ok() + self.alloc(layout).ok() } /// Deallocate the memory referenced by `ptr`. /// /// `ptr` must have previously been provided via this allocator, - /// and `kind` must *fit* the provided block (see above). + /// and `layout` must *fit* the provided block (see above). /// Otherwise yields undefined behavior. - unsafe fn dealloc_unchecked(&mut self, ptr: Address, kind: Kind) { + unsafe fn dealloc_unchecked(&mut self, ptr: Address, layout: Layout) { // (default implementation carries checks, but impl's are free to omit them.) - self.dealloc(ptr, kind).unwrap(); + self.dealloc(ptr, layout).unwrap(); } /// Returns a pointer suitable for holding data described by - /// `new_kind`, meeting its size and alignment guarantees. To + /// `new_layout`, meeting its size and alignment guarantees. To /// accomplish this, may extend or shrink the allocation - /// referenced by `ptr` to fit `new_kind`. + /// referenced by `ptr` to fit `new_layout`. //// /// (In other words, ownership of the memory block associated with /// `ptr` is first transferred back to this allocator, but the @@ -2048,12 +2052,12 @@ pub unsafe trait Allocator { /// /// * `ptr` must have previously been provided via this allocator. /// - /// * `kind` must *fit* the `ptr` (see above). (The `new_kind` + /// * `layout` must *fit* the `ptr` (see above). (The `new_layout` /// argument need not fit it.) /// - /// * `new_kind` must meet the allocator's size and alignment - /// constraints. In addition, `new_kind.align()` must equal - /// `kind.align()`. (Note that this is a stronger constraint + /// * `new_layout` must meet the allocator's size and alignment + /// constraints. In addition, `new_layout.align()` must equal + /// `layout.align()`. (Note that this is a stronger constraint /// that that imposed by `fn realloc`.) /// /// Behavior undefined if any of latter three constraints are unmet. 
@@ -2065,25 +2069,25 @@ pub unsafe trait Allocator { /// original memory block referenced by `ptr` is unaltered. unsafe fn realloc_unchecked(&mut self, ptr: Address, - kind: Kind, - new_kind: Kind) -> Option
{ + layout: Layout, + new_layout: Layout) -> Option<Address>
{ // (default implementation carries checks, but impl's are free to omit them.) - self.realloc(ptr, kind, new_kind).ok() + self.realloc(ptr, layout, new_layout).ok() } /// Behaves like `fn alloc_unchecked`, but also returns the whole /// size of the returned block. - unsafe fn alloc_excess_unchecked(&mut self, kind: Kind) -> Option { - self.alloc_excess(kind).ok() + unsafe fn alloc_excess_unchecked(&mut self, layout: Layout) -> Option { + self.alloc_excess(layout).ok() } /// Behaves like `fn realloc_unchecked`, but also returns the /// whole size of the returned block. unsafe fn realloc_excess_unchecked(&mut self, ptr: Address, - kind: Kind, - new_kind: Kind) -> Option { - self.realloc_excess(ptr, kind, new_kind).ok() + layout: Layout, + new_layout: Layout) -> Option { + self.realloc_excess(ptr, layout, new_layout).ok() } @@ -2095,8 +2099,8 @@ pub unsafe trait Allocator { /// overflow, and `T` is not zero sized; otherwise yields /// undefined behavior. unsafe fn alloc_array_unchecked(&mut self, n: usize) -> Option> { - let kind = Kind::array_unchecked::(n); - self.alloc_unchecked(kind).map(|p|Unique::new(*p as *mut T)) + let layout = Layout::array_unchecked::(n); + self.alloc_unchecked(layout).map(|p|Unique::new(*p as *mut T)) } /// Reallocates a block suitable for holding `n_old` instances of `T`, @@ -2111,8 +2115,8 @@ pub unsafe trait Allocator { ptr: Unique, n_old: usize, n_new: usize) -> Option> { - let (k_old, k_new, ptr) = (Kind::array_unchecked::(n_old), - Kind::array_unchecked::(n_new), + let (k_old, k_new, ptr) = (Layout::array_unchecked::(n_old), + Layout::array_unchecked::(n_new), *ptr); self.realloc_unchecked(NonZero::new(ptr as *mut u8), k_old, k_new) .map(|p|Unique::new(*p as *mut T)) @@ -2126,8 +2130,8 @@ pub unsafe trait Allocator { /// overflow, and `T` is not zero sized; otherwise yields /// undefined behavior. 
unsafe fn dealloc_array_unchecked(&mut self, ptr: Unique, n: usize) { - let kind = Kind::array_unchecked::(n); - self.dealloc_unchecked(NonZero::new(*ptr as *mut u8), kind); + let layout = Layout::array_unchecked::(n); + self.dealloc_unchecked(NonZero::new(*ptr as *mut u8), layout); } } ``` From e2d461cdffe306cf2b62da94c1456319c28f2a95 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 16 Mar 2016 14:15:27 +0100 Subject: [PATCH 0807/1195] revised `fn oom` interface to also take the `Self::Error` as input, so that contextual information can be fed back into the allocator itself. Though on further review, the comment that inspired this, https://github.com/rust-lang/rfcs/pull/1398#issuecomment-169203800 also wanted a `FormatArgs` argument too, so that the client code could feed back in arbitrary info. --- text/0000-kinds-of-allocators.md | 25 ++++++++++--------------- 1 file changed, 10 insertions(+), 15 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 35f7a08eb41..ec8cdf9f435 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -307,7 +307,7 @@ will expose: ```rust #[derive(Copy, Clone, PartialEq, Eq, Debug)] -enum BumpAllocError { Invalid, MemoryExhausted, Interference } +enum BumpAllocError { Invalid, MemoryExhausted(alloc::Layout), Interference } impl BumpAllocError { fn is_transient(&self) { *self == BumpAllocError::Interference } @@ -315,7 +315,7 @@ impl BumpAllocError { impl alloc::AllocError for BumpAllocError { fn invalid_input() -> Self { BumpAllocError::MemoryExhausted } - fn is_memory_exhausted(&self) -> bool { *self == BumpAllocError::MemoryExhausted } + fn is_memory_exhausted(&self) -> bool { if let BumpAllocError::MemoryExhausted(_) = *self { true } else { false } } fn is_request_unsupported(&self) -> bool { false } } ``` @@ -335,17 +335,16 @@ Here is the demo implementation of `Allocator` for the type. 
```rust impl<'a> Allocator for &'a DumbBumpPool { - type Layout = alloc::Layout; type Error = BumpAllocError; - unsafe fn alloc(&mut self, layout: &Self::Layout) -> Result { + unsafe fn alloc(&mut self, layout: &alloc::Layout) -> Result { let curr = self.avail.load(Ordering::Relaxed) as usize; let align = *layout.align(); let curr_aligned = (curr.overflowing_add(align - 1)) & !(align - 1); let size = *layout.size(); let remaining = (self.end as usize) - curr_aligned; if remaining <= size { - return Err(BumpAllocError::MemoryExhausted); + return Err(BumpAllocError::MemoryExhausted(layout.clone())); } let curr = curr as *mut u8; @@ -360,12 +359,12 @@ impl<'a> Allocator for &'a DumbBumpPool { } } - unsafe fn dealloc(&mut self, _ptr: Address, _layout: &Self::Layout) { + unsafe fn dealloc(&mut self, _ptr: Address, _layout: &alloc::Layout) { // this bump-allocator just no-op's on dealloc } - unsafe fn oom(&mut self) -> ! { - panic!("exhausted memory in {}", self.name); + fn oom(&mut self, err: Self::Error) -> ! { + panic!("exhausted memory in {} on request {:?}", self.name, err); } } @@ -1098,10 +1097,6 @@ few motivating examples that *are* clearly feasible and useful. `Address` an abuse of the `NonZero` type? (Or do we just need some constructor for `NonZero` that asserts that the input is non-zero)? - * Should `fn oom(&self)` take in more arguments (e.g. to allow the - client to provide more contextual information about the OOM - condition)? - * Should we get rid of the `AllocError` bound entirely? Is the given set of methods actually worth providing to all generic clients? @@ -1778,8 +1773,8 @@ pub unsafe trait Allocator { /// Allocator-specific method for signalling an out-of-memory /// condition. /// - /// Any activity done by the `oom` method should ensure that it - /// does not infinitely regress in nested calls to `oom`. In + /// Implementations of the `oom` method are discouraged from + /// infinitely regressing in nested calls to `oom`. 
In /// practice this means implementors should eschew allocating, /// especially from `self` (directly or indirectly). /// @@ -1788,7 +1783,7 @@ pub unsafe trait Allocator { /// instead they should return an appropriate error from the /// invoked method, and let the client decide whether to invoke /// this `oom` method. - unsafe fn oom(&mut self) -> ! { ::core::intrinsics::abort() } + fn oom(&mut self, _: Self::Error) -> ! { ::core::intrinsics::abort() } ``` ### Allocator-specific quantities and limits From 05c963505048e3141e8c54b64811364f13f7d2b1 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 16 Mar 2016 14:22:40 +0100 Subject: [PATCH 0808/1195] Expanded docs for `AllocError` trait's methods. Added unresolved Q about the new `fn oom` method. --- text/0000-kinds-of-allocators.md | 39 +++++++++++++++++++++----------- 1 file changed, 26 insertions(+), 13 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index ec8cdf9f435..6d535f0822b 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -1118,6 +1118,11 @@ few motivating examples that *are* clearly feasible and useful. low-level allocators will behave well for large alignments. See https://github.com/rust-lang/rust/issues/30170 + * Should `Allocator::oom` also take a `std::fmt::Arguments<'a>` parameter + so that clients can feed in context-specific information that is not + part of the original input `Layout` argument? (I have not done this + mainly because I do not want to introduce a dependency on `libstd`.) + # Change History * Changed `fn usable_size` to return `(l, m)` rathern than just `m`. @@ -1609,15 +1614,19 @@ pub trait AllocError { /// Construct an error that indicates operation failure due to /// invalid input values for the request. /// - /// This can be used, for example, to signal an overflow occurred - /// during arithmetic computation. 
(However, since overflows - /// frequently represent an allocation attempt that would exhaust - /// memory, clients are alternatively allowed to constuct an error - /// representing memory exhaustion in such scenarios.) + /// This can be used, for example, to signal that allocation of + /// a zero-sized type was requested. + /// + /// As another example, it might be used to signal that an overflow + /// occurred during arithmetic computation with the input. (However, + /// since overflows can also occur during large allocation requests + /// that would exhaust memory if arbitrary-precision arithmetic were + /// used, clients are alternatively allowed to constuct an error + /// representing memory exhaustion in this scenario.) fn invalid_input() -> Self; /// Returns true if the error is due to hitting some resource - /// limit or otherwise running out of memory. This condition + /// limit, or otherwise running out of memory. This condition /// serves as a hint that some series of deallocations *might* /// allow a subsequent reissuing of the original allocation /// request to succeed. @@ -1626,23 +1635,27 @@ pub trait AllocError { /// e.g. usually when `malloc` returns `null`, it is because of /// hitting a user resource limit or system memory exhaustion. /// - /// Note that the resource exhaustion could be specific to the + /// Note that the resource exhaustion could be internal to the /// original allocator (i.e. the only way to free up memory is by /// deallocating memory attached to that allocator), or it could - /// be associated with some other state outside of the original - /// alloactor. The `AllocError` trait does not distinguish between - /// the two scenarios. + /// be associated with some other state external to the original + /// allocator (e.g. freeing up memory or reducing fragmentation + /// globally might allow a call to the system `malloc` to succeed). 
+ /// The `AllocError` trait does not distinguish between the two + /// scenarios (but instances of the associated `Allocator::Error` + /// type might provide ways to distinguish them). /// /// Finally, error responses to allocation input requests that are /// *always* illegal for *any* allocator (e.g. zero-sized or /// arithmetic-overflowing requests) are allowed to respond `true` - /// here. (This is to allow `MemoryExhausted` as a valid error type - /// for an allocator that can handle all "sane" requests.) + /// here. (This is to allow `MemoryExhausted` as a valid + /// zero-sized error type for an allocator that can handle all + /// "sane" requests.) fn is_memory_exhausted(&self) -> bool; /// Returns true if the allocator is fundamentally incapable of /// satisfying the original request. This condition implies that - /// such an allocation request will never succeed on this + /// such an allocation request would never succeed on *this* /// allocator, regardless of environment, memory pressure, or /// other contextual condtions. /// From a93abd7defe72526e182cdcf0b06136b57292654 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 16 Mar 2016 14:24:24 +0100 Subject: [PATCH 0809/1195] made `Layout` (formerly `Kind`) non-`Copy`. --- text/0000-kinds-of-allocators.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 6d535f0822b..e8ca9a40178 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -1338,7 +1338,7 @@ fn size_align() -> (usize, usize) { /// /// All layouts have an associated positive size; note that this implies /// zero-sized types have no corresponding layout. -#[derive(Copy, Clone, Debug, PartialEq, Eq)] +#[derive(Clone, Debug, PartialEq, Eq)] pub struct Layout { // size of the requested block of memory, measured in bytes. size: Size, From cad7d43997dcc77092e512d886bd2528f630c246 Mon Sep 17 00:00:00 2001 From: "Felix S. 
Klock II" Date: Wed, 16 Mar 2016 14:24:54 +0100 Subject: [PATCH 0810/1195] expanded change history to note latest changes. --- text/0000-kinds-of-allocators.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index e8ca9a40178..7cc94e6ef17 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -1130,6 +1130,13 @@ few motivating examples that *are* clearly feasible and useful. * Removed `fn is_transient` from `trait AllocError`, and removed discussion of transient errors from the API. +* Made `fn dealloc` method infallible (i.e. removed its `Result` return type). + +* Alpha-renamed `alloc::Kind` type to `alloc::Layout`, and made it non-`Copy`. + +* Revised `fn oom` method to take the `Self::Error` as an input (so that the + allocator can, indirectly, feed itself information about what went wrong). + # Appendices ## Bibliography From 5ae80387569178808fba81a4ce01666750e2e1e5 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 16 Mar 2016 15:03:34 +0100 Subject: [PATCH 0811/1195] added discussion of why API uses `&mut self` (rather than `&self` or `self`). --- text/0000-kinds-of-allocators.md | 39 ++++++++++++++++++++++++++++++++ 1 file changed, 39 insertions(+) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 7cc94e6ef17..8391e04e023 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -1048,6 +1048,45 @@ few motivating examples that *are* clearly feasible and useful. ## Variations on the `Allocator` API + * Should the allocator methods take `&self` or `self` rather than `&mut self`. + + As noted during in the RFC comments, nearly every trait goes through a bit + of an identity crisis in terms of deciding what kind of `self` parameter is + appropriate. 
+ + The justification for `&mut self` is this: + + * It does not restrict allocator implementors from making sharable allocators: + to do so, just do `impl<'a> Allocator for &'a MySharedAlloc`, as illustrated + in the `DumbBumpPool` example. + + * `&mut self` is better than `&self` for simple allocators that are *not* sharable. + `&mut self` ensures that the allocation methods have exclusive + access to the underlying allocator state, without resorting to a + lock. (Another way of looking at it: It moves the onus of using a + lock outward, to the allocator clients.) + + * One might think that the points made + above apply equally well to `self` (i.e., if you want to implement an allocator + that wants to take itself via a `&mut`-reference when the methods take `self`, + then do `impl<'a> Allocator for &'a mut MyUniqueAlloc`). + + However, the problem with `self` is that if you want to use an + allocator for *more than one* allocation, you will need to call + `clone()` (or make the allocator parameter implement + `Copy`). This means in practice all allocators will need to + support `Clone` (and thus support sharing in general, as + discussed in the [Allocators and lifetimes][lifetimes] section). + + Put more simply, requiring that allocators implement `Clone` means + that it will *not* be pratical to do + `impl<'a> Allocator for &'a mut MyUniqueAlloc`. + + By using `&mut self` for the allocation methods, we can encode + the expected use case of an *unshared* allocator that is used + repeatedly in a linear fashion (e.g. vector that needs to + reallocate its backing storage). + * Should `Allocator::alloc` be safe instead of `unsafe fn`? * Clearly `fn dealloc` and `fn realloc` need to be `unsafe`, since From dd485fd688b8b55afb2971ee681ff1250452cb76 Mon Sep 17 00:00:00 2001 From: "Felix S. 
Klock II" Date: Wed, 16 Mar 2016 15:14:19 +0100 Subject: [PATCH 0812/1195] amend discussion of `&mut self` with explicit note about why `impl Allocator for &mut MyUniqAlloc` cannot just rely on reborrows to handle satisfying `self` parameters. --- text/0000-kinds-of-allocators.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 8391e04e023..8a8b3bceaf0 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -1078,6 +1078,11 @@ few motivating examples that *are* clearly feasible and useful. support `Clone` (and thus support sharing in general, as discussed in the [Allocators and lifetimes][lifetimes] section). + (Remember, I'm thinking about allocator-parametric code like + `Vec`, which does not know if the `A` is a + `&mut`-reference. In that context, therefore one cannot assume + that reborrowing machinery is available to the client code.) + Put more simply, requiring that allocators implement `Clone` means that it will *not* be pratical to do `impl<'a> Allocator for &'a mut MyUniqueAlloc`. From d9232de6c6b1983ef4095cbb449489954cb83013 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 16 Mar 2016 15:16:22 +0100 Subject: [PATCH 0813/1195] added very short discussion of why there's no lifetime-enriched `Address<'a>`. --- text/0000-kinds-of-allocators.md | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 8a8b3bceaf0..f5498a37721 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -1092,6 +1092,18 @@ few motivating examples that *are* clearly feasible and useful. repeatedly in a linear fashion (e.g. vector that needs to reallocate its backing storage). + * Should the types representing allocated storage have lifetimes attached? + (E.g. `fn alloc<'a>(&mut self, layout: &alloc::Layout) -> Address<'a>`.) 
+ + I think Gankro [put it best](https://github.com/rust-lang/rfcs/pull/1398#issuecomment-164003160): + + > This is a low-level unsafe interface, and the expected usecases make it + > both quite easy to avoid misuse, and impossible to use lifetimes + > (you want a struct to store the allocator and the allocated elements). + > Any time we've tried to shove more lifetimes into these kinds of + > interfaces have just been an annoying nuisance necessitating + > copy-lifetime/transmute nonsense. + * Should `Allocator::alloc` be safe instead of `unsafe fn`? * Clearly `fn dealloc` and `fn realloc` need to be `unsafe`, since From 2f3034a7c401f98f1c33a74de7e24e3a32b97d74 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 16 Mar 2016 18:49:43 +0100 Subject: [PATCH 0814/1195] Added extensive discussion of zero-sized allocations to the alternatives section. --- text/0000-kinds-of-allocators.md | 49 ++++++++++++++++++++++++++++++-- 1 file changed, 47 insertions(+), 2 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index f5498a37721..0c7719ba31d 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -1138,6 +1138,42 @@ few motivating examples that *are* clearly feasible and useful. (But the resulting uniformity of the whole API might shift the balance to "worth it".) + * Should the precondition of allocation methods be loosened to + accept zero-sized types? + + Right now, there is a requirement that the allocation requests + denote non-zero sized types (this requirement is encoded in two + ways: for `Layout`-consuming methods like `alloc`, it is enforced + via the invariant that the `Size` is a `NonZero`, and this is + enforced by checks in the `Layout` construction code; for the + convenience methods like `alloc_one`, they will return `Err` if the + allocation request is zero-sized). 
+ + The main motivation for this restriction is some underlying system + allocators, like `jemalloc`, explicitly disallow zero-sized + inputs. Therefore, to remove all unnecessary control-flow branches + between the client and the underlying allocator, the `Allocator` + trait is bubbling that restriction up and imposing it onto the + clients, who will presumably enforce this invariant via + container-specific means. + + But: pre-existing container types (like `Vec`) already + *allow* zero-sized `T`. Therefore, there is an unfortunate mismatch + between the ideal API those container would prefer for their + allocators and the actual service that this `Allocator` trait is + providing. + + So: Should we lift this precondition of the allocation methods, and allow + zero-sized requests (which might be handled by a global sentinel value, or + by an allocator-specific sentinel value, or via some other means -- this + would have to be specified as part of the Allocator API)? + + (As a middle ground, we could lift the precondition solely for the convenience + methods like `fn alloc_one` and `fn alloc_array`; that way, the most low-level + methods like `fn alloc` would continue to minimize the overhead they add + over the underlying system allocator, while the convenience methods would truly + be convenient.) + # Unresolved questions [unresolved]: #unresolved-questions @@ -2000,8 +2036,8 @@ pub unsafe trait Allocator { ``` -### Allocator common usage patterns -[common usage patterns]: #allocator-common-usage-patterns +### Allocator convenience methods for common usage patterns +[common usage patterns]: #allocator-convenience-methods-for-common-usage-patterns ```rust // == COMMON USAGE PATTERNS == @@ -2013,6 +2049,8 @@ pub unsafe trait Allocator { /// /// The returned block is suitable for passing to the /// `alloc`/`realloc` methods of this allocator. + /// + /// Returns `Err` for zero-sized `T`. 
unsafe fn alloc_one(&mut self) -> Result, Self::Error> { if let Some(k) = Layout::new::() { self.alloc(k).map(|p|Unique::new(*p as *mut T)) @@ -2025,6 +2063,11 @@ pub unsafe trait Allocator { /// Deallocates a block suitable for holding an instance of `T`. /// + /// The given block must have been produced by this allocator, + /// and must be suitable for storing a `T` (in terms of alignment + /// as well as minimum and maximum size); otherwise yields + /// undefined behavior. + /// /// Captures a common usage pattern for allocators. unsafe fn dealloc_one(&mut self, mut ptr: Unique) { let raw_ptr = NonZero::new(ptr.get_mut() as *mut T as *mut u8); @@ -2037,6 +2080,8 @@ pub unsafe trait Allocator { /// /// The returned block is suitable for passing to the /// `alloc`/`realloc` methods of this allocator. + /// + /// Returns `Err` for zero-sized `T` or `n == 0`. unsafe fn alloc_array(&mut self, n: usize) -> Result, Self::Error> { match Layout::array::(n) { Some(layout) => self.alloc(layout).map(|p|Unique::new(*p as *mut T)), From a110ef5877ab72b3e1e4ef53e98d9965960180c0 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 17 Mar 2016 09:24:16 -0700 Subject: [PATCH 0815/1195] RFC 1432 is {Vec,String}::splice --- text/{0000-replace-slice.md => 1432-replace-slice.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-replace-slice.md => 1432-replace-slice.md} (97%) diff --git a/text/0000-replace-slice.md b/text/1432-replace-slice.md similarity index 97% rename from text/0000-replace-slice.md rename to text/1432-replace-slice.md index 7e4d0938cfa..2dbe69feee2 100644 --- a/text/0000-replace-slice.md +++ b/text/1432-replace-slice.md @@ -1,7 +1,7 @@ - Feature Name: splice - Start Date: 2015-12-28 -- RFC PR: -- Rust Issue: +- RFC PR: [rust-lang/rfcs#1432](https://github.com/rust-lang/rfcs/pull/1432) +- Rust Issue: [rust-lang/rust#32310](https://github.com/rust-lang/rust/issues/32310) # Summary [summary]: #summary From 
d9e4e8f6bdec1c215b7c23193ae89d71bf319e41 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 17 Mar 2016 09:27:00 -0700 Subject: [PATCH 0816/1195] RFC 1434 is `contains` for ranges --- ...hod-for-ranges.md => 1434-contains-method-for-ranges.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-contains-method-for-ranges.md => 1434-contains-method-for-ranges.md} (92%) diff --git a/text/0000-contains-method-for-ranges.md b/text/1434-contains-method-for-ranges.md similarity index 92% rename from text/0000-contains-method-for-ranges.md rename to text/1434-contains-method-for-ranges.md index 0ee3d1cc91f..2c4a2d39b91 100644 --- a/text/0000-contains-method-for-ranges.md +++ b/text/1434-contains-method-for-ranges.md @@ -1,7 +1,7 @@ -- Feature Name: contains_method +- Feature Name: `contains_method` - Start Date: 2015-12-28 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1434](https://github.com/rust-lang/rfcs/pull/1434) +- Rust Issue: [rust-lang/rust#32311](https://github.com/rust-lang/rust/issues/32311) # Summary [summary]: #summary From 00fb96df2eabaee808e799a566b43c5106077245 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 17 Mar 2016 09:29:30 -0700 Subject: [PATCH 0817/1195] RFC 1479 is Unix sockets --- text/{0000-unix-socket.md => 1479-unix-socket.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-unix-socket.md => 1479-unix-socket.md} (98%) diff --git a/text/0000-unix-socket.md b/text/1479-unix-socket.md similarity index 98% rename from text/0000-unix-socket.md rename to text/1479-unix-socket.md index 69e207e43a8..96e26df72ee 100644 --- a/text/0000-unix-socket.md +++ b/text/1479-unix-socket.md @@ -1,7 +1,7 @@ -- Feature Name: unix_socket +- Feature Name: `unix_socket` - Start Date: 2016-01-25 -- RFC PR: -- Rust Issue: +- RFC PR: [rust-lang/rfcs#1479](https://github.com/rust-lang/rfcs/pull/1479) +- Rust Issue: 
[rust-lang/rust#32312](https://github.com/rust-lang/rust/issues/32312) # Summary [summary]: #summary From 1dd5f1d9568bb5c91e05694e868608efe03dd7b2 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 17 Mar 2016 09:33:23 -0700 Subject: [PATCH 0818/1195] RFC 1498 is Ipv6Addr octet methods --- text/{0000-ipv6addr-octets.md => 1498-ipv6addr-octets.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-ipv6addr-octets.md => 1498-ipv6addr-octets.md} (91%) diff --git a/text/0000-ipv6addr-octets.md b/text/1498-ipv6addr-octets.md similarity index 91% rename from text/0000-ipv6addr-octets.md rename to text/1498-ipv6addr-octets.md index 2dcdec1f798..5e77166aa6c 100644 --- a/text/0000-ipv6addr-octets.md +++ b/text/1498-ipv6addr-octets.md @@ -1,7 +1,7 @@ -- Feature Name: ipaddr_octet_arrays +- Feature Name: `ipaddr_octet_arrays` - Start Date: 2016-02-12 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1498](https://github.com/rust-lang/rfcs/pull/1498) +- Rust Issue: [rust-lang/rust#32313](https://github.com/rust-lang/rust/issues/32313) # Summary [summary]: #summary From 3a597f92854f194621f57282bd03a37ccc72cd3d Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Thu, 17 Mar 2016 18:08:14 +0100 Subject: [PATCH 0819/1195] Fix code to reflect that `fn dealloc` method no longer returns `Result`. Remove `fn dealloc_unchecked` method since it no longer provides any benefit over `fn dealloc`, now that `fn dealloc` no longer has any preconditions to check (*). (*) Or at least, if it *does* choose to check preconditions (like "was the address part of my set of memory blocks?"), there is no longer a way for `fn dealloc` to signal an error condition besides `panic`, and I do not think trying to prepare for that hypothetical scenario is worth adding the `fn dealloc_unchecked` method to the API.
--- text/0000-kinds-of-allocators.md | 19 +++++-------------- 1 file changed, 5 insertions(+), 14 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 0c7719ba31d..cb08fdb8ca3 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -746,14 +746,14 @@ for example, `a.alloc_one::()` will return a `Unique` (or error). ## Unchecked variants -Finally, all of the methods above return `Result`, and guarantee some +Finally, almost all of the methods above return `Result`, and guarantee some amount of input validation. (This is largely because I observed code duplication doing such validation on the client side; or worse, such validation accidentally missing.) However, some clients will want to bypass such checks (and do it -without risking undefined behavior by ensuring the preconditions hold -via local invariants in their container type). +without risking undefined behavior, namely by ensuring the method preconditions +hold via local invariants in their container type). For these clients, the `Allocator` trait provides ["unchecked" variants][unchecked variants] of nearly all of its @@ -2116,7 +2116,8 @@ pub unsafe trait Allocator { unsafe fn dealloc_array(&mut self, ptr: Unique, n: usize) -> Result<(), Self::Error> { let raw_ptr = NonZero::new(*ptr as *mut u8); if let Some(k) = Layout::array::(n) { - self.dealloc(raw_ptr, k) + self.dealloc(raw_ptr, k); + Ok(()) } else { Err(Self::Error::invalid_input()) } @@ -2146,16 +2147,6 @@ pub unsafe trait Allocator { self.alloc(layout).ok() } - /// Deallocate the memory referenced by `ptr`. - /// - /// `ptr` must have previously been provided via this allocator, - /// and `layout` must *fit* the provided block (see above). - /// Otherwise yields undefined behavior. - unsafe fn dealloc_unchecked(&mut self, ptr: Address, layout: Layout) { - // (default implementation carries checks, but impl's are free to omit them.) 
- self.dealloc(ptr, layout).unwrap(); - } - /// Returns a pointer suitable for holding data described by /// `new_layout`, meeting its size and alignment guarantees. To /// accomplish this, may extend or shrink the allocation From 40c84c2664a08cb6b208371d2c9ac3c15cf79de1 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Thu, 17 Mar 2016 18:09:41 +0100 Subject: [PATCH 0820/1195] fixed oversight (should have been part of previous commit). --- text/0000-kinds-of-allocators.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index cb08fdb8ca3..35d6060a6c9 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -2238,7 +2238,7 @@ pub unsafe trait Allocator { /// undefined behavior. unsafe fn dealloc_array_unchecked(&mut self, ptr: Unique, n: usize) { let layout = Layout::array_unchecked::(n); - self.dealloc_unchecked(NonZero::new(*ptr as *mut u8), layout); + self.dealloc(NonZero::new(*ptr as *mut u8), layout); } } ``` From 06e226374b1736d6dff2ad642390382c7778283e Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Thu, 17 Mar 2016 18:13:49 +0100 Subject: [PATCH 0821/1195] "fixed" `oom` default method impl (the `abort` intrinsic requires `unsafe`) (Though to be honest revisiting this made me wonder if I should just require clients to implement `fn oom` themselves.) --- text/0000-kinds-of-allocators.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 35d6060a6c9..6da03b15dc7 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -1895,7 +1895,9 @@ pub unsafe trait Allocator { /// instead they should return an appropriate error from the /// invoked method, and let the client decide whether to invoke /// this `oom` method. - fn oom(&mut self, _: Self::Error) -> !
{ ::core::intrinsics::abort() } + fn oom(&mut self, _: Self::Error) -> ! { + unsafe { ::core::intrinsics::abort() } + } ``` ### Allocator-specific quantities and limits From 34d019d19e021a1506aee3933ddffbf14a858e1d Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Thu, 17 Mar 2016 18:23:28 +0100 Subject: [PATCH 0822/1195] Updated implementation to reflect that `Layout` (nee `Kind`) is no longer `Copy`. (I should have done this as part of the earlier commit that removed `deriving(Copy)` from `Layout`.) --- text/0000-kinds-of-allocators.md | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 6da03b15dc7..885f6c505ea 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -1511,7 +1511,7 @@ impl Layout { Layout { align: unsafe { NonZero::new(pow2_align) }, ..*self } } else { - *self + self.clone() } } @@ -1948,7 +1948,7 @@ pub unsafe trait Allocator { /// However, for clients that do not wish to track the capacity /// returned by `alloc_excess` locally, this method is likely to /// produce useful results. - unsafe fn usable_size(&self, layout: Layout) -> (Capacity, Capacity) { + unsafe fn usable_size(&self, layout: &Layout) -> (Capacity, Capacity) { (layout.size(), layout.size()) } @@ -2002,16 +2002,18 @@ pub unsafe trait Allocator { ptr: Address, layout: Layout, new_layout: Layout) -> Result { - let (min, max) = self.usable_size(layout); + let (min, max) = self.usable_size(&layout); let s = new_layout.size(); // All Layout alignments are powers of two, so a comparison // suffices here (rather than resorting to a `%` operation). 
if min <= s && s <= max && new_layout.align() <= layout.align() { return Ok(ptr); } else { + let new_size = new_layout.size(); + let old_size = layout.size(); let result = self.alloc(new_layout); if let Ok(new_ptr) = result { - ptr::copy(*ptr as *const u8, *new_ptr, cmp::min(*layout.size(), *new_layout.size())); + ptr::copy(*ptr as *const u8, *new_ptr, cmp::min(*old_size, *new_size)); self.dealloc(ptr, layout); } result @@ -2022,7 +2024,8 @@ pub unsafe trait Allocator { /// the returned block. For some `layout` inputs, like arrays, this /// may include extra storage usable for additional data. unsafe fn alloc_excess(&mut self, layout: Layout) -> Result { - self.alloc(layout).map(|p| Excess(p, self.usable_size(layout).1)) + let usable_size = self.usable_size(&layout); + self.alloc(layout).map(|p| Excess(p, usable_size.1)) } /// Behaves like `fn realloc`, but also returns the whole size of @@ -2032,8 +2035,9 @@ pub unsafe trait Allocator { ptr: Address, layout: Layout, new_layout: Layout) -> Result { + let usable_size = self.usable_size(&new_layout); self.realloc(ptr, layout, new_layout) - .map(|p| Excess(p, self.usable_size(new_layout).1)) + .map(|p| Excess(p, usable_size.1)) } ``` From ede39d06f38f214f03d3178042147190e9c0a8db Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Thu, 17 Mar 2016 18:38:39 +0100 Subject: [PATCH 0823/1195] Updated the demo allocator implementation to reflect changes to the API. --- text/0000-kinds-of-allocators.md | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 885f6c505ea..5f1218da753 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -306,15 +306,15 @@ will expose: method. 
```rust -#[derive(Copy, Clone, PartialEq, Eq, Debug)] -enum BumpAllocError { Invalid, MemoryExhausted(alloc::Layout), Interference } +#[derive(Clone, PartialEq, Eq, Debug)] +pub enum BumpAllocError { Invalid, MemoryExhausted(alloc::Layout), Interference } impl BumpAllocError { - fn is_transient(&self) { *self == BumpAllocError::Interference } + fn is_transient(&self) -> bool { *self == BumpAllocError::Interference } } impl alloc::AllocError for BumpAllocError { - fn invalid_input() -> Self { BumpAllocError::MemoryExhausted } + fn invalid_input() -> Self { BumpAllocError::Invalid } fn is_memory_exhausted(&self) -> bool { if let BumpAllocError::MemoryExhausted(_) = *self { true } else { false } } fn is_request_unsupported(&self) -> bool { false } } @@ -337,13 +337,14 @@ Here is the demo implementation of `Allocator` for the type. impl<'a> Allocator for &'a DumbBumpPool { type Error = BumpAllocError; - unsafe fn alloc(&mut self, layout: &alloc::Layout) -> Result { + unsafe fn alloc(&mut self, layout: alloc::Layout) -> Result { let curr = self.avail.load(Ordering::Relaxed) as usize; let align = *layout.align(); - let curr_aligned = (curr.overflowing_add(align - 1)) & !(align - 1); + let (sum, oflo) = curr.overflowing_add(align - 1); + let curr_aligned = sum & !(align - 1); let size = *layout.size(); let remaining = (self.end as usize) - curr_aligned; - if remaining <= size { + if oflo || remaining <= size { return Err(BumpAllocError::MemoryExhausted(layout.clone())); } @@ -359,7 +360,7 @@ impl<'a> Allocator for &'a DumbBumpPool { } } - unsafe fn dealloc(&mut self, _ptr: Address, _layout: &alloc::Layout) { + unsafe fn dealloc(&mut self, _ptr: Address, _layout: alloc::Layout) { // this bump-allocator just no-op's on dealloc } From ffdf71ee71e6bdbae22fbf2d8185be924b3fa71b Mon Sep 17 00:00:00 2001 From: "Felix S. 
Klock II" Date: Thu, 17 Mar 2016 19:05:30 +0100 Subject: [PATCH 0824/1195] Added missing `Sync` impl for `DumbBumpPool`, and also fixed some privacy oversights from earlier. The demo code is still not perfect (it currently presumes `Vec::new_in` addition, which is not great since another part of the RFC now says that standard library integration is specifically not addressed by this RFC). I'm working on revising the demo but I don't think that should hold up overall discussion. --- text/0000-kinds-of-allocators.md | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 5f1218da753..98d2feab31b 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -248,7 +248,7 @@ For this demo I want to try to minimize cleverness, so we will use ```rust impl DumbBumpPool { - fn new(name: &'static str, + pub fn new(name: &'static str, size_in_bytes: usize, start_align: usize) -> DumbBumpPool { unsafe { @@ -310,7 +310,7 @@ will expose: pub enum BumpAllocError { Invalid, MemoryExhausted(alloc::Layout), Interference } impl BumpAllocError { - fn is_transient(&self) -> bool { *self == BumpAllocError::Interference } + pub fn is_transient(&self) -> bool { *self == BumpAllocError::Interference } } impl alloc::AllocError for BumpAllocError { @@ -331,6 +331,20 @@ With that out of the way, here are some other design choices of note: (lifetime-scoped) threads, we will implement the `Allocator` interface as a *handle* pointing to the pool; in this case, a simple reference. + * Since the whole point of this particular bump-allocator is to be + shared across threads (otherwise there would be no need to use + `AtomicPtr` for the `avail` field), we will want to implement the + (unsafe) `Sync` trait on it (doing this signals that it is safe to + send `&DumbBumpPool` to other threads). + +Here is that `impl Sync`.
+ +```rust +/// Note of course that this impl implies we must review all other +/// code for DumbBumpPool even more carefully. +unsafe impl Sync for DumbBumpPool { } +``` + Here is the demo implementation of `Allocator` for the type. ```rust From 9eae82ddbdc21969281801d93d31f559760a4710 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 18 Mar 2016 14:09:04 +0100 Subject: [PATCH 0825/1195] lifted the `fmt::Debug` bound from associated type up to `AllocError` trait itself. (The only place where `AllocError` is used is as a bound on that associated type, so does not present any additional burden on clients of `Allocator` itself, though it does rule out certain pathological programmtic constructions that I'm not worried about.) --- text/0000-kinds-of-allocators.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 98d2feab31b..5191b1c9ef8 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -1724,7 +1724,7 @@ impl Layout { ```rust /// `AllocError` instances provide feedback about the cause of an allocation failure. -pub trait AllocError { +pub trait AllocError: fmt::Debug { /// Construct an error that indicates operation failure due to /// invalid input values for the request. /// @@ -1865,7 +1865,7 @@ pub unsafe trait Allocator { /// /// Many allocators will want to use the zero-sized /// `MemoryExhausted` type for this. - type Error: AllocError + fmt::Debug; + type Error: AllocError; ``` From 59ed824c1f906e655d239dd2472e9354806602ae Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 18 Mar 2016 14:15:26 +0100 Subject: [PATCH 0826/1195] Extended `AllocErr` enum variants with more contextual info about allocation failure cause. The important one: for `Unsupported` operations, lets have the allocator be able to provide more context about what was unsupported. 
(I chose `&'static str` here because that avoids allocation integration for the error message, but I would love to be convinced that we could employ `Cow<'static, str>` here instead, since that would be much more general purpose.) And since I was adding contextual information anyway, I decided to have the memory exhausted variant carry along the particular `Layout` straw that broke the camel's back. (Note that the `MemoryExhausted` *struct* still remains zero-sized.) --- text/0000-kinds-of-allocators.md | 30 ++++++++++++++++++------------ 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 5191b1c9ef8..3700496ffa0 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -1737,7 +1737,7 @@ pub trait AllocError: fmt::Debug { /// that would exhaust memory if arbitrary-precision arithmetic were /// used, clients are alternatively allowed to constuct an error /// representing memory exhaustion in this scenario.) - fn invalid_input() -> Self; + fn invalid_input(details: &'static str) -> Self where Self: Sized; /// Returns true if the error is due to hitting some resource /// limit, or otherwise running out of memory. This condition @@ -1800,32 +1800,38 @@ pub struct MemoryExhausted; /// Allocators that only support certain classes of inputs might choose this /// as their associated error type, so that clients can respond appropriately /// to specific error failure scenarios. -#[derive(Copy, Clone, PartialEq, Eq, Debug)] +#[derive(Clone, PartialEq, Eq, Debug)] pub enum AllocErr { /// Error due to hitting some resource limit or otherwise running /// out of memory. This condition strongly implies that *some* /// series of deallocations would allow a subsequent reissuing of /// the original allocation request to succeed. - Exhausted, + Exhausted { request: Layout }, /// Error due to allocator being fundamentally incapable of /// satisfying the original request. 
This condition implies that /// such an allocation request will never succeed on the given /// allocator, regardless of environment, memory pressure, or /// other contextual condtions. - Unsupported, + Unsupported { details: &'static str }, } impl AllocError for MemoryExhausted { - fn invalid_input() -> Self { MemoryExhausted } + fn invalid_input(_details: &'static str) -> Self { MemoryExhausted } fn is_memory_exhausted(&self) -> bool { true } fn is_request_unsupported(&self) -> bool { false } } impl AllocError for AllocErr { - fn invalid_input() -> Self { AllocErr::Unsupported } - fn is_memory_exhausted(&self) -> bool { *self == AllocErr::Exhausted } - fn is_request_unsupported(&self) -> bool { *self == AllocErr::Unsupported } + fn invalid_input(details: &'static str) -> Self { + AllocErr::Unsupported { details: details } + } + fn is_memory_exhausted(&self) -> bool { + if let AllocErr::Exhausted { .. } = *self { true } else { false } + } + fn is_request_unsupported(&self) -> bool { + if let AllocErr::Unsupported { .. 
} = *self { true } else { false } + } } ``` @@ -2078,7 +2084,7 @@ pub unsafe trait Allocator { } else { // (only occurs for zero-sized T) debug_assert!(mem::size_of::() == 0); - Err(Self::Error::invalid_input()) + Err(Self::Error::invalid_input("zero-sized type invalid for alloc_one")) } } @@ -2106,7 +2112,7 @@ pub unsafe trait Allocator { unsafe fn alloc_array(&mut self, n: usize) -> Result, Self::Error> { match Layout::array::(n) { Some(layout) => self.alloc(layout).map(|p|Unique::new(*p as *mut T)), - None => Err(Self::Error::invalid_input()), + None => Err(Self::Error::invalid_input("invalid layout for alloc_array")), } } @@ -2127,7 +2133,7 @@ pub unsafe trait Allocator { self.realloc(NonZero::new(ptr as *mut u8), k_old, k_new) .map(|p|Unique::new(*p as *mut T)) } else { - Err(Self::Error::invalid_input()) + Err(Self::Error::invalid_input("invalid layout for realloc_array")) } } @@ -2140,7 +2146,7 @@ pub unsafe trait Allocator { self.dealloc(raw_ptr, k); Ok(()) } else { - Err(Self::Error::invalid_input()) + Err(Self::Error::invalid_input("invalid layout for dealloc_array")) } } From fe9a9b254a49e1e1f0533fae52493cb182cc3122 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 18 Mar 2016 14:44:06 +0100 Subject: [PATCH 0827/1195] Added section on allocator trait objects. This includes the changes that were absolutely necessary to support them in the `Allocator` trait itself, as well as an (opinionated) type alias, `AllocatorObj`, for defining allocator trait objects. --- text/0000-kinds-of-allocators.md | 82 ++++++++++++++++++++++++++++---- 1 file changed, 73 insertions(+), 9 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 3700496ffa0..6b147287479 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -761,7 +761,7 @@ for example, `a.alloc_one::()` will return a `Unique` (or error). 
## Unchecked variants -Finally, almost all of the methods above return `Result`, and guarantee some +Almost all of the methods above return `Result`, and guarantee some amount of input validation. (This is largely because I observed code duplication doing such validation on the client side; or worse, such validation accidentally missing.) @@ -787,6 +787,45 @@ of the preconditions hold. offered to impl's; but there is no guarantee that an arbitrary impl takes advantage of the privilege.) +## Object-oriented Allocators + +Finally, we get to object-oriented programming. + +Since the `Allocator` trait has an associated error type, one +cannot just encode virtually-dispatched allocator objects with +`Box` or `&Allocator`; trait objects need to have +their associated types specified as part of the object trait. + +In general, we expect allocator-parametric code to opt *not* to use +trait objects to generalize over allocators, but instead to use +generic types and instantiate those types with specific concrete +allocators. + +Nonetheless, it *is* an option to write `Box>`, or +`&Allocator`, when working with allocators that +use each corresponding error type. + + * (The allocator methods that are not object-safe, like + `fn alloc_one(&mut self)`, have a clause `where Self: Sized` to + ensure that their presence does not cause the `Allocator` trait as + a whole to become non-object-safe.) + +To encourage client code that chooses to use trait objects for their +allocators to try to standardize on one choice of associated `Error` +type, we provide a convenience `type` definition for +[allocator objects][], `AllocatorObj`, which makes an opinionated +decision about which one of the "standard error types" is the "right +one" for such general purpose objects: namely, `AllocErr`, since it is +both cheap to construct but also can provide some amount of +context-sensitive information about the original cause of an +allocation error. 
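The interaction between an associated error type and trait objects can be sketched in isolation. The block below is a minimal illustration written in current Rust syntax (the `dyn` keyword postdates this draft). `MiniAlloc`, `MiniAllocErr`, `Bump`, and `MiniAllocObj` are hypothetical stand-ins invented for this example, not the RFC's `Allocator` API, and the "allocation" here is only an offset counter.

```rust
/// Hypothetical error type standing in for the RFC's `AllocErr`.
#[derive(Debug, PartialEq)]
enum MiniAllocErr {
    Exhausted { requested: usize },
}

/// Hypothetical, heavily simplified stand-in for the `Allocator` trait.
trait MiniAlloc {
    type Error;

    /// Object-safe method: callable through `dyn MiniAlloc<Error = ...>`.
    fn alloc(&mut self, size: usize) -> Result<usize, Self::Error>;

    /// Generic method, so not callable on a trait object. The
    /// `where Self: Sized` clause keeps the trait usable as an object.
    fn alloc_one<T>(&mut self) -> Result<usize, Self::Error>
    where
        Self: Sized,
    {
        self.alloc(std::mem::size_of::<T>())
    }
}

/// Toy bump allocator that hands out offsets into a fixed capacity.
struct Bump {
    used: usize,
    cap: usize,
}

impl MiniAlloc for Bump {
    type Error = MiniAllocErr;

    fn alloc(&mut self, size: usize) -> Result<usize, MiniAllocErr> {
        if self.cap - self.used < size {
            return Err(MiniAllocErr::Exhausted { requested: size });
        }
        let offset = self.used;
        self.used += size;
        Ok(offset)
    }
}

/// Analogue of the `AllocatorObj` alias: the associated type is pinned
/// down once, so clients can simply write `Box<MiniAllocObj>`.
type MiniAllocObj = dyn MiniAlloc<Error = MiniAllocErr>;

fn main() {
    let mut boxed: Box<MiniAllocObj> = Box::new(Bump { used: 0, cap: 16 });
    println!("{:?}", boxed.alloc(8));
    println!("{:?}", boxed.alloc(16));

    // The generic convenience method is still available on concrete types.
    let mut concrete = Bump { used: 0, cap: 16 };
    println!("{:?}", concrete.alloc_one::<u32>());
}
```

Writing `Box<dyn MiniAlloc>` without `Error = ...` is rejected by the compiler, which is the constraint that motivates providing a single opinionated alias.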
+ +However, the main point remains that we expect this object-oriented +usage of allocators to be rare. If this assumption turns out to be +incorrect, we should revisit these decisions before stabilizing the +allocator API (that would be the time to e.g. remove the associated +error type). + ## Why this API [Why this API]: #why-this-api @@ -2078,7 +2117,8 @@ pub unsafe trait Allocator { /// `alloc`/`realloc` methods of this allocator. /// /// Returns `Err` for zero-sized `T`. - unsafe fn alloc_one<T>(&mut self) -> Result<Unique<T>, Self::Error> { + unsafe fn alloc_one<T>(&mut self) -> Result<Unique<T>, Self::Error> + where Self: Sized { if let Some(k) = Layout::new::<T>() { self.alloc(k).map(|p|Unique::new(*p as *mut T)) } else { @@ -2096,7 +2136,8 @@ pub unsafe trait Allocator { /// undefined behavior. /// /// Captures a common usage pattern for allocators. - unsafe fn dealloc_one<T>(&mut self, mut ptr: Unique<T>) { + unsafe fn dealloc_one<T>(&mut self, mut ptr: Unique<T>) + where Self: Sized { let raw_ptr = NonZero::new(ptr.get_mut() as *mut T as *mut u8); self.dealloc(raw_ptr, Layout::new::<T>().unwrap()); } @@ -2109,7 +2150,8 @@ pub unsafe trait Allocator { /// `alloc`/`realloc` methods of this allocator. /// /// Returns `Err` for zero-sized `T` or `n == 0`.
- unsafe fn alloc_array<T>(&mut self, n: usize) -> Result<Unique<T>, Self::Error> { + unsafe fn alloc_array<T>(&mut self, n: usize) -> Result<Unique<T>, Self::Error> + where Self: Sized { match Layout::array::<T>(n) { Some(layout) => self.alloc(layout).map(|p|Unique::new(*p as *mut T)), None => Err(Self::Error::invalid_input("invalid layout for alloc_array")), @@ -2127,7 +2169,8 @@ pub unsafe trait Allocator { unsafe fn realloc_array<T>(&mut self, ptr: Unique<T>, n_old: usize, - n_new: usize) -> Result<Unique<T>, Self::Error> { + n_new: usize) -> Result<Unique<T>, Self::Error> + where Self: Sized { let old_new_ptr = (Layout::array::<T>(n_old), Layout::array::<T>(n_new), *ptr); if let (Some(k_old), Some(k_new), ptr) = old_new_ptr { self.realloc(NonZero::new(ptr as *mut u8), k_old, k_new) @@ -2140,7 +2183,8 @@ pub unsafe trait Allocator { /// Deallocates a block suitable for holding `n` instances of `T`. /// /// Captures a common usage pattern for allocators. - unsafe fn dealloc_array<T>(&mut self, ptr: Unique<T>, n: usize) -> Result<(), Self::Error> { + unsafe fn dealloc_array<T>(&mut self, ptr: Unique<T>, n: usize) -> Result<(), Self::Error> + where Self: Sized { let raw_ptr = NonZero::new(*ptr as *mut u8); if let Some(k) = Layout::array::<T>(n) { self.dealloc(raw_ptr, k); @@ -2232,7 +2276,8 @@ pub unsafe trait Allocator { /// Requires inputs are non-zero and do not cause arithmetic /// overflow, and `T` is not zero sized; otherwise yields /// undefined behavior.
- unsafe fn alloc_array_unchecked<T>(&mut self, n: usize) -> Option<Unique<T>> { + unsafe fn alloc_array_unchecked<T>(&mut self, n: usize) -> Option<Unique<T>> + where Self: Sized { let layout = Layout::array_unchecked::<T>(n); self.alloc_unchecked(layout).map(|p|Unique::new(*p as *mut T)) } @@ -2248,7 +2293,8 @@ pub unsafe trait Allocator { unsafe fn realloc_array_unchecked<T>(&mut self, ptr: Unique<T>, n_old: usize, - n_new: usize) -> Option<Unique<T>> { + n_new: usize) -> Option<Unique<T>> + where Self: Sized { let (k_old, k_new, ptr) = (Layout::array_unchecked::<T>(n_old), Layout::array_unchecked::<T>(n_new), *ptr); @@ -2263,9 +2309,27 @@ pub unsafe trait Allocator { /// Requires inputs are non-zero and do not cause arithmetic /// overflow, and `T` is not zero sized; otherwise yields /// undefined behavior. - unsafe fn dealloc_array_unchecked<T>(&mut self, ptr: Unique<T>, n: usize) { + unsafe fn dealloc_array_unchecked<T>(&mut self, ptr: Unique<T>, n: usize) + where Self: Sized { let layout = Layout::array_unchecked::<T>(n); self.dealloc(NonZero::new(*ptr as *mut u8), layout); } } ``` + +### Allocator trait objects +[allocator objects]: #allocator-trait-objects + +```rust +/// `AllocatorObj` is a convenience for making allocator trait objects +/// such as `Box<AllocatorObj>` or `&AllocatorObj`. (One cannot just +/// write `Box<Allocator>` because one must specify the associated +/// error type as part of the trait object.) +/// +/// Since one pays the cost of virtual function dispatch when +/// calling methods on trait objects, this definition uses `AllocErr` +/// to encode more information when signalling errors in these +/// objects, rather than using the content-impoverished +/// `MemoryExhausted` error type for the associated error type. +pub type AllocatorObj = Allocator<Error=AllocErr>; +``` From 792eb3186e57261e605a63cda47b68e7177c8f8a Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 18 Mar 2016 14:46:08 +0100 Subject: [PATCH 0828/1195] bug fixes to `DumbBumpPool` example.
--- text/0000-kinds-of-allocators.md | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 6b147287479..762b57d6121 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -249,8 +249,8 @@ For this demo I want to try to minimize cleverness, so we will use ```rust impl DumbBumpPool { pub fn new(name: &'static str, - size_in_bytes: usize, - start_align: usize) -> DumbBumpPool { + size_in_bytes: usize, + start_align: usize) -> DumbBumpPool { unsafe { let ptr = heap::allocate(size_in_bytes, start_align); if ptr.is_null() { panic!("allocation failed."); } @@ -307,14 +307,18 @@ will expose: ```rust #[derive(Clone, PartialEq, Eq, Debug)] -pub enum BumpAllocError { Invalid, MemoryExhausted(alloc::Layout), Interference } +pub enum BumpAllocError { + Invalid(&'static str), + MemoryExhausted(alloc::Layout), + Interference +} impl BumpAllocError { pub fn is_transient(&self) -> bool { *self == BumpAllocError::Interference } } impl alloc::AllocError for BumpAllocError { - fn invalid_input() -> Self { BumpAllocError::Invalid } + fn invalid_input(details: &'static str) -> Self { BumpAllocError::Invalid(details) } fn is_memory_exhausted(&self) -> bool { if let BumpAllocError::MemoryExhausted(_) = *self { true } else { false } } fn is_request_unsupported(&self) -> bool { false } } @@ -358,7 +362,7 @@ impl<'a> Allocator for &'a DumbBumpPool { let curr_aligned = sum & !(align - 1); let size = *layout.size(); let remaining = (self.end as usize) - curr_aligned; - if oflo || remaining <= size { + if oflo || remaining < size { return Err(BumpAllocError::MemoryExhausted(layout.clone())); } From 428446c234e925b8995f2b0b47d1a959d0f9cc89 Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Fri, 18 Mar 2016 16:58:37 +0100 Subject: [PATCH 0829/1195] Add global_asm! 
for module-level inline assembly --- text/0000-global-asm.md | 61 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 61 insertions(+) create mode 100644 text/0000-global-asm.md diff --git a/text/0000-global-asm.md b/text/0000-global-asm.md new file mode 100644 index 00000000000..f30731af9ca --- /dev/null +++ b/text/0000-global-asm.md @@ -0,0 +1,61 @@ +- Feature Name: global_asm +- Start Date: 2016-03-18 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +This RFC exposes LLVM's support for [module-level inline assembly](http://llvm.org/docs/LangRef.html#module-level-inline-assembly) by adding a `global_asm!` macro. The syntax is very simple: it just takes a string literal containing the assembly code. + +Example: +```rust +global_asm!(r#" +.globl my_asm_func +my_asm_func: + ret +"#); + +extern { + fn my_asm_func(); +} +``` + +# Motivation +[motivation]: #motivation + +There are two main use cases for this feature. The first is that it allows functions to be written completely in assembly, which mostly eliminates the need for a `naked` attribute. This is mainly useful for functions that use a custom calling convention, such as interrupt handlers. + +Another important use case is that it allows external assembly files to be used in a Rust module without needing hacks in the build system: + +```rust +global_asm!(include_str!("my_asm_file.s")); +``` + +Assembly files can also be preprocessed or generated by `build.rs` (for example using the C preprocessor), which will produce output files in the Cargo output directory: + +```rust +global_asm!(include_str!(concat!(env!("OUT_DIR"), "/preprocessed_asm.s"))); +``` + +# Detailed design +[design]: #detailed-design + +See description above, not much to add. The macro will map directly to LLVM's `module asm`. + +# Drawbacks +[drawbacks]: #drawbacks + +Like `asm!`, this feature depends on LLVM's integrated assembler.
+ +# Alternatives +[alternatives]: #alternatives + +The current way of including external assembly is to compile the assembly files using gcc in `build.rs` and link them into the Rust program as a static library. + +An alternative for functions written entirely in assembly is to add a [`#[naked]` function attribute](https://github.com/rust-lang/rfcs/pull/1201). + +# Unresolved questions +[unresolved]: #unresolved-questions + +None From 0d45992eca4375c89b8f3653c0418ae1758638dd Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 21 Mar 2016 15:38:49 -0400 Subject: [PATCH 0830/1195] Update RFC 1201 --- text/{0000-naked-fns.md => 1201-naked-fns.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-naked-fns.md => 1201-naked-fns.md} (98%) diff --git a/text/0000-naked-fns.md b/text/1201-naked-fns.md similarity index 98% rename from text/0000-naked-fns.md rename to text/1201-naked-fns.md index c86679ddfa1..2e340df4463 100644 --- a/text/0000-naked-fns.md +++ b/text/1201-naked-fns.md @@ -1,7 +1,7 @@ - Feature Name: `naked_fns` - Start Date: 2015-07-10 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1201 +- Rust Issue: https://github.com/rust-lang/rust/issues/32408 # Summary From f39ec216ec8b6b593d8ee2d0249f89a04c310679 Mon Sep 17 00:00:00 2001 From: Lukas Kalbertodt Date: Tue, 22 Mar 2016 00:03:56 +0100 Subject: [PATCH 0831/1195] Add RFC 'contains_method_for_various_collections' --- ...contains-method-for-various-collections.md | 95 +++++++++++++++++++ 1 file changed, 95 insertions(+) create mode 100644 text/0000-contains-method-for-various-collections.md diff --git a/text/0000-contains-method-for-various-collections.md b/text/0000-contains-method-for-various-collections.md new file mode 100644 index 00000000000..3a2d97f7395 --- /dev/null +++ b/text/0000-contains-method-for-various-collections.md @@ -0,0 +1,95 @@ +- Feature Name: contains_method_for_various_collections +- Start 
Date: 2016-03-16 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Add a `contains` method to `VecDeque` and `LinkedList` that checks if the +collection contains a given item. + +# Motivation +[motivation]: #motivation + +A `contains` method exists for the slice type `[T]` and for `Vec` through +`Deref`, but there is no easy way to check if a `VecDeque` or `LinkedList` +contains a specific item. Currently, the shortest way to do it is something +like: + +```rust +vec_deque.iter().any(|e| e == item) +``` + +While this is not insanely verbose, a `contains` method has the following +advantages: + +- the name `contains` expresses the programmer's intent... +- ... and thus is more idiomatic +- it's as short as it can get +- programmers that are used to call `contains` on a `Vec` are confused by the + non-existence of the method for `VecDeque` or `LinkedList` + +# Detailed design +[design]: #detailed-design + +Add the following method to `std::collections::VecDeque`: + +```rust +impl VecDeque { + /// Returns `true` if the `VecDeque` contains an element equal to the + /// given value. + pub fn contains(&self, x: &T) -> bool + where T: PartialEq + { + // implementation with a result equivalent to the result + // of `self.iter().any(|e| e == x)` + } +} +``` + +Add the following method to `std::collections::LinkedList`: + +```rust +impl LinkedList { + /// Returns `true` if the `LinkedList` contains an element equal to the + /// given value. + pub fn contains(&self, x: &T) -> bool + where T: PartialEq + { + // implementation with a result equivalent to the result + // of `self.iter().any(|e| e == x)` + } +} +``` + +The new methods should probably be marked as unstable initially and be +stabilized later. + +# Drawbacks +[drawbacks]: #drawbacks + +Obviously more methods increase the complexity of the standard library, but in +case of this RFC the increase is rather tiny. 
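For reference, the proposed methods can be compared with the iterator-based spelling from the Motivation section in a small sketch. It assumes a toolchain on which `VecDeque::contains` and `LinkedList::contains` are available (both exist in current stable Rust); `has_item_via_iter` is a helper name introduced only for this example.

```rust
use std::collections::{LinkedList, VecDeque};

/// The spelling available without a dedicated `contains` method.
fn has_item_via_iter(deque: &VecDeque<i32>, item: &i32) -> bool {
    deque.iter().any(|e| e == item)
}

fn main() {
    let deque: VecDeque<i32> = vec![1, 2, 3].into_iter().collect();
    let list: LinkedList<i32> = vec![1, 2, 3].into_iter().collect();

    // Both spellings give the same answer; `contains` states the intent.
    assert_eq!(deque.contains(&2), has_item_via_iter(&deque, &2));
    assert_eq!(deque.contains(&7), has_item_via_iter(&deque, &7));

    // The linked-list version is a linear scan over the nodes.
    assert!(list.contains(&3));
    assert!(!list.contains(&7));
}
```

Both calls are linear in the length of the collection; the method only changes how the search is written, not its cost.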
+ +While `VecDeque::contains` should be (nearly) as fast as `[T]::contains`, +`LinkedList::contains` will probably be much slower due to the cache +inefficient nature of a linked list. Offering a method that is short to +write and convenient to use could lead to excessive use of said method +without knowing about the problems mentioned above. + +# Alternatives +[alternatives]: #alternatives + +There are a few alternatives: + +- add `VecDeque::contains` only and do not add `LinkedList::contains` +- do nothing, because -- technically -- the same functionality is offered + through iterators +- also add `BinaryHeap::contains`, since it could be convenient for some use + cases, too + +# Unresolved questions +[unresolved]: #unresolved-questions + +None so far. From cf38675860c9a1e44f8235000570533726a99fae Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Tue, 22 Mar 2016 13:28:32 +0100 Subject: [PATCH 0832/1195] update the [impl item example], incorporating oli-obk feedback from 2015dec21. --- text/0000-pub-restricted.md | 37 ++++++++++++++----------------------- 1 file changed, 14 insertions(+), 23 deletions(-) diff --git a/text/0000-pub-restricted.md b/text/0000-pub-restricted.md index 47882d45b8c..274d8791785 100644 --- a/text/0000-pub-restricted.md +++ b/text/0000-pub-restricted.md @@ -474,13 +474,22 @@ feature is making this code cleaner or easier to reason about). [impl item example]: #impl-item-example ```rust -pub struct S; +pub struct S(i32); mod a { - pub fn call_foo(s: &S) { s.foo(); } + pub fn call_foo(s: &super::S) { s.foo(); } - impl S { - pub(a) fn foo(&self) { println!("only callable within `a`"); } + mod b { + fn some_method_private_to_b() { + println!("inside some_method_private_to_b"); + } + + impl super::super::S { + pub(a) fn foo(&self) { + some_method_private_to_b(); + println!("only callable within `a`: {}", self.0); + } + } } } @@ -807,25 +816,7 @@ itself not accessible in `mod b`? 
pnkfelix is personally inclined to make this sort of thing illegal, mainly because he finds it totally unintuitive, but is interested in -hearing counter-arguments. Certainly the earlier [impl item example][] -would look prettier as: - -```rust -pub struct S; - -impl S { - pub(a) fn foo(&self) { println!("only callable within `a`"); } -} - -mod a { - pub fn call_foo(s: &S) { s.foo(); } - -} - -fn rejected(s: &S) { - s.foo(); //~ ERROR: `S::foo` not visible outside of module `a` -} -``` +hearing counter-arguments. ## Implicit Restriction Satisfaction (IRS:PUNPM) From 7646c4405175b9da68f1fdf4dd93b3c9c480cd69 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Tue, 22 Mar 2016 13:36:07 +0100 Subject: [PATCH 0833/1195] add a few more alternatives based on review of comment thread so far. --- text/0000-pub-restricted.md | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/text/0000-pub-restricted.md b/text/0000-pub-restricted.md index 274d8791785..cff2a118230 100644 --- a/text/0000-pub-restricted.md +++ b/text/0000-pub-restricted.md @@ -796,6 +796,30 @@ tree for all of its re-exports. do the inline refactoring, rewriting each `pub(crate)` as `pub(A1)` as necessary. +## Be more ambitious! + +This feature could be extended in various ways. + +For example: + + * As mentioned on the RFC comment thread, + we could allow multiple paths in the restriction-specification: + `pub(path1, path2, path3)`. + + This, for better or worse, would start + to look a lot like `friend` declarations from C++. + + * Also as mentioned on the RFC comment thread, the + `pub(restricted)` form does not have any variant where the + restrction-specification denotes the whole universe. + In other words, there's no current way to get the same effect + as `pub item` via `pub(restricted) item`. + + Some future syntaxes to support this have been proposed in the + RFC comment thread, such as `pub(::)`. 
But this RFC is leaving the + actual choice to add such an extension (and what syntax to use + for it) up to a later amendment in the future. + # Unresolved questions [unresolved]: #unresolved-questions From e3170dd408edc250e6242ffca81a25bfbe0c0b28 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Tue, 22 Mar 2016 13:37:33 +0100 Subject: [PATCH 0834/1195] updated comment to reflect feedback from mdinger on 2015dec24. --- text/0000-pub-restricted.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/text/0000-pub-restricted.md b/text/0000-pub-restricted.md index cff2a118230..40535f2812b 100644 --- a/text/0000-pub-restricted.md +++ b/text/0000-pub-restricted.md @@ -36,7 +36,7 @@ within a submodule of the tree, then `X` *cannot* be put at the root of the module tree. Illustration: ```rust -// Intent: `a` exports `I` and `foo`, but nothing else. +// Intent: `a` exports `I`, `bar`, and `foo`, but nothing else. pub mod a { pub const I: i32 = 3; @@ -44,8 +44,8 @@ pub mod a { // is not meant to be exposed outside of `a`. fn semisecret(x: i32) -> i32 { use self::b::c::J; x + J } - pub fn foo(y: i32) -> i32 { semisecret(I) + y } pub fn bar(z: i32) -> i32 { semisecret(I) * z } + pub fn foo(y: i32) -> i32 { semisecret(I) + y } mod b { mod c { @@ -68,7 +68,7 @@ accessed within the items of `a`, and then re-exporting `semisecret` as necessary up the module tree. ```rust -// Intent: `a` exports `I` and `foo`, but nothing else. +// Intent: `a` exports `I`, `bar`, and `foo`, but nothing else. pub mod a { pub const I: i32 = 3; @@ -77,8 +77,8 @@ pub mod a { // (If we put `pub use` here, then *anyone* could access it.) use self::b::semisecret; - pub fn foo(y: i32) -> i32 { semisecret(I) + y } pub fn bar(z: i32) -> i32 { semisecret(I) * z } + pub fn foo(y: i32) -> i32 { semisecret(I) + y } mod b { pub use self::c::semisecret; @@ -269,7 +269,7 @@ some manner. 
In the running example, one could instead write: ```rust -// Intent: `a` exports `I` and `foo`, but nothing else. +// Intent: `a` exports `I`, `bar`, and `foo`, but nothing else. pub mod a { pub const I: i32 = 3; @@ -278,8 +278,8 @@ pub mod a { // (`pub use` would be *rejected*; see Note 1 below) use self::b::semisecret; - pub fn foo(y: i32) -> i32 { semisecret(I) + y } pub fn bar(z: i32) -> i32 { semisecret(I) * z } + pub fn foo(y: i32) -> i32 { semisecret(I) + y } mod b { pub(a) use self::c::semisecret; From cd19173451b200d1bf2cfb4e387a78c007c60f6a Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Tue, 22 Mar 2016 13:38:03 +0100 Subject: [PATCH 0835/1195] spelling typo. --- text/0000-pub-restricted.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-pub-restricted.md b/text/0000-pub-restricted.md index 40535f2812b..41d43e9ccc9 100644 --- a/text/0000-pub-restricted.md +++ b/text/0000-pub-restricted.md @@ -220,7 +220,7 @@ VISIBILITY ::= | `pub` | `pub` `(` USE_PATH `)` | `pub` `(` `crate` `)` One can use these `pub(restriction)` forms anywhere that one can currently use `pub`. In particular, one can use them on item -defintions, methods in an impl, the fields of a struct +definitions, methods in an impl, the fields of a struct definition, and on `pub use` re-exports. ## Semantics From 51daa98feeb17917530d2a3d227222a047600c0e Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Tue, 22 Mar 2016 14:05:54 +0100 Subject: [PATCH 0836/1195] add the glob issue to unresolved Qs. --- text/0000-pub-restricted.md | 47 ++++++++++++++++++++++++++++++++++++- 1 file changed, 46 insertions(+), 1 deletion(-) diff --git a/text/0000-pub-restricted.md b/text/0000-pub-restricted.md index 41d43e9ccc9..8641acfee5f 100644 --- a/text/0000-pub-restricted.md +++ b/text/0000-pub-restricted.md @@ -813,7 +813,9 @@ For example: `pub(restricted)` form does not have any variant where the restrction-specification denotes the whole universe. 
In other words, there's no current way to get the same effect - as `pub item` via `pub(restricted) item`. + as `pub item` via `pub(restricted) item`; you cannot say + `pub(universe) item` (even though I do so in a tongue-in-cheek + manner elsewhere in this RFC). Some future syntaxes to support this have been proposed in the RFC comment thread, such as `pub(::)`. But this RFC is leaving the @@ -906,6 +908,49 @@ even in the context of a non-pub module like `mod b`. In particular, `pub(super) use item` may be imposing a new restriction on the re-exported name that was not part of its original definition.) +## Interaction with Globs + +Glob re-exports +currently only re-export `pub` (as in `pub(universe)`) items. + +What should glob re-exports do with respect to `pub(restricted)`? + +Here is an illustrative example pointed out by petrochenkov in the +comment thread: + +```rust +mod m { + /*priv*/ pub(m) struct S1; + pub(super) struct S2; + pub(foo::bar) struct S3; + pub struct S4; + + mod n { + + // What is reexported here? + // Just `S4`? + // Anything in `m` visible + // to `n` (which is not consistent with the current treatment of + // `pub` by globs). + + pub use m::*; + } +} + +// What is reexported here? +pub use m::*; +pub(baz::qux) use m::*; +``` + +This remains an unresolved question, but my personal inclination, at +least for the initial implementation, is to make globs only import +purely `pub` items; no non-`pub`, and no `pub(restricted)`. + +After we get more experience with `pub(restricted)` (and perhaps make +other changes that may come in future RFCs), we will be in a better +position to evaluate what to do here. + + # Appendices ## Associated Items Digression From 4b4fd5146c04c9c284094aad8f54ca5c2093c7f2 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Tue, 22 Mar 2016 14:09:43 +0100 Subject: [PATCH 0837/1195] update RFC 1422.
--- text/{0000-pub-restricted.md => 1422-pub-restricted.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-pub-restricted.md => 1422-pub-restricted.md} (99%) diff --git a/text/0000-pub-restricted.md b/text/1422-pub-restricted.md similarity index 99% rename from text/0000-pub-restricted.md rename to text/1422-pub-restricted.md index 8641acfee5f..85f973d6286 100644 --- a/text/0000-pub-restricted.md +++ b/text/1422-pub-restricted.md @@ -1,7 +1,7 @@ - Feature Name: pub_restricted - Start Date: 2015-12-18 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1422 +- Rust Issue: https://github.com/rust-lang/rust/issues/32409 # Summary [summary]: #summary From 9951aae5b7d1d7b5eea932227a154952f729077e Mon Sep 17 00:00:00 2001 From: jethrogb Date: Thu, 24 Mar 2016 13:09:29 -0700 Subject: [PATCH 0838/1195] Typo in link in RFC 1415 --- text/1415-trim-std-os.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/1415-trim-std-os.md b/text/1415-trim-std-os.md index 6fff1115fb9..5a2e8d5cd58 100644 --- a/text/1415-trim-std-os.md +++ b/text/1415-trim-std-os.md @@ -36,7 +36,7 @@ This strategy, however, runs into a few problems: this, however, would [involve changing the `stat` structure][libc-stat-change] and may be difficult to do. * Trait extensions in the `raw` module attempt to return the `libc` aliased type - on all platforms, for example [`DirEntryExt::ino`][std-nio] returns a type of + on all platforms, for example [`DirEntryExt::ino`][std-ino] returns a type of `ino_t`. The `ino_t` type is billed as being FFI compatible with the libc `ino_t` type, but not all platforms store the `d_ino` field in `dirent` with the `ino_t` type. 
For example on Android the [definition of From 73ee34aada32725bda1401dbd1f251e72f5d00d6 Mon Sep 17 00:00:00 2001 From: archshift Date: Fri, 25 Mar 2016 15:45:51 -0700 Subject: [PATCH 0839/1195] Add RFC 'closure_to_fn_coercion' --- text/0000-closure-to-fn-coercion.md | 155 ++++++++++++++++++++++++++++ text/0401-coercions.md | 12 ++- 2 files changed, 165 insertions(+), 2 deletions(-) create mode 100644 text/0000-closure-to-fn-coercion.md diff --git a/text/0000-closure-to-fn-coercion.md b/text/0000-closure-to-fn-coercion.md new file mode 100644 index 00000000000..8f9f95eb003 --- /dev/null +++ b/text/0000-closure-to-fn-coercion.md @@ -0,0 +1,155 @@ +- Feature Name: closure_to_fn_coercion +- Start Date: 2016-03-25 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +A non-capturing closure (that is, one that does not `Clone` or `move` any local variables) should be +coercible to a function pointer (`fn`). + +# Motivation +[motivation]: #motivation + +Currently in Rust, it is impossible to bind anything but a pre-defined function +as a function pointer. When dealing with closures, one must either rely upon +Rust's type-inference capabilities, or use the `Fn` trait to abstract over any +closure with a certain type signature. + +What is not possible, though, is to define a function while at the same time +binding it to a function pointer. + +This is mainly used for convenience purposes, but in certain situations +the lack of ability to do so creates a significant amount of boilerplate code.
+For example, when attempting to create an array of small, simple, but unique functions, +it would be necessary to pre-define each and every function beforehand: + +```rust +fn inc_0(var: &mut u32) {} +fn inc_1(var: &mut u32) { *var += 1; } +fn inc_2(var: &mut u32) { *var += 2; } +fn inc_3(var: &mut u32) { *var += 3; } + +const foo: [fn(&mut u32); 4] = [ + inc_0, + inc_1, + inc_2, + inc_3, +]; +``` + +This is a trivial example, and one that might not seem too consequential, but the +code doubles with every new item added to the array. With very many elements, +the duplication begins to seem unwarranted. + +Another option, of course, is to use an array of `Fn` instead of `fn`: + +```rust +const foo: [&'static Fn(&mut u32); 4] = [ + &|var: &mut u32| {}, + &|var: &mut u32| *var += 1, + &|var: &mut u32| *var += 2, + &|var: &mut u32| *var += 3, +]; +``` + +And this seems to fix the problem. Unfortunately, however, looking closely one +can see that because we use the `Fn` trait, an extra layer of indirection +is added when attempting to run `foo[n](&mut bar)`. + +Rust must use dynamic dispatch because a closure is secretly a struct that +contains references to captured variables, and the code within that closure +must be able to access those references stored in the struct. + +In the above example, though, no variables are captured by the closures, +so in theory nothing would stop the compiler from treating them as anonymous +functions. By doing so, unnecessary indirection would be avoided. In situations +where this function pointer array is particularly hot code, the optimization +would be appreciated. + +# Detailed design +[design]: #detailed-design + +In C++, non-capturing lambdas (the C++ equivalent of closures) "decay" into function pointers +when they do not need to capture any variables. 
This is used, for example, to pass a lambda +into a C function: + +```cpp +void foo(void (*foobar)(void)) { + // impl +} +void bar() { + foo([]() { /* do something */ }); +} +``` + +With this proposal, rust users would be able to do the same: + +```rust +fn foo(foobar: fn()) { + // impl +} +fn bar() { + foo(|| { /* do something */ }); +} +``` + +Using the examples within ["Motivation"](#motivation), the code array would +be simplified to no performance detriment: + +```rust +const foo: [fn(&mut u32); 4] = [ + |var: &mut u32| {}, + |var: &mut u32| *var += 1, + |var: &mut u32| *var += 2, + |var: &mut u32| *var += 3, +]; +``` + +# Drawbacks +[drawbacks]: #drawbacks + +To a rust user, there is no drawback to this new coercion from closures to `fn` types. + +The only drawback is that it would add some amount of complexity to the type system. + +# Alternatives +[alternatives]: #alternatives + +## Anonymous function syntax + +With this alternative, rust users would be able to directly bind a function +to a variable, without needing to give the function a name. + +```rust +let foo = fn() { /* do something */ }; +foo(); +``` + +```rust +const foo: [fn(&mut u32); 4] = [ + fn(var: &mut u32) {}, + fn(var: &mut u32) { *var += 1 }, + fn(var: &mut u32) { *var += 2 }, + fn(var: &mut u32) { *var += 3 }, +]; +``` + +This isn't ideal, however, because it would require giving new semantics +to `fn` syntax. + +## Aggressive optimization + +This is possibly unrealistic, but an alternative would be to continue encouraging +the use of closures with the `Fn` trait, but conduct heavy optimization to determine +when the used closure is "trivial" and does not need indirection. + +Of course, this would probably significantly complicate the optimization process, and +would have the detriment of not being easily verifiable by the programmer without +checking the disassembly of their program. 
+ +# Unresolved questions +[unresolved]: #unresolved-questions + +None diff --git a/text/0401-coercions.md b/text/0401-coercions.md index 554ac61e11c..816879f4f89 100644 --- a/text/0401-coercions.md +++ b/text/0401-coercions.md @@ -154,6 +154,9 @@ Coercion is allowed between the following types: * `&mut T` to `*mut T` +* `T` to `fn` if `T` is a closure that does not capture any local variables + in its environment. + * `T` to `U` if `T` implements `CoerceUnsized<U>` (see below) and `T = Foo<...>` and `U = Foo<...>` (for any `Foo`, when we get HKT I expect this could be a constraint on the `CoerceUnsized` trait, rather than being checked here) @@ -338,7 +341,7 @@ and where unsize_kind(`T`) is the kind of the unsize info in `T` - the vtable for a trait definition (e.g. `fmt::Display` or `Iterator`, not `Iterator<Item=u8>`) or a length (or `()` if `T: Sized`). -Note that lengths are not adjusted when casting raw slices - +Note that lengths are not adjusted when casting raw slices - `T: *const [u16] as *const [u8]` creates a slice that only includes half of the original memory. @@ -441,4 +444,9 @@ Specifically for the DST custom coercions, the compiler could throw an error if it finds a user-supplied implementation of the `Unsize` trait, rather than silently ignoring them. -# Unresolved questions +# Amendments + +* Updated by [#1558](https://github.com/rust-lang/rfcs/pull/1558), which allows + coercions from a non-capturing closure to a function pointer.
+ +# Unresolved questions \ No newline at end of file From c60da1f9656294b09d06d6256ee2155bc738e44e Mon Sep 17 00:00:00 2001 From: Josh Triplett Date: Sat, 26 Mar 2016 14:00:10 -0700 Subject: [PATCH 0840/1195] Switch to recognizing `union` as a contextual keyword --- text/0000-union.md | 79 +++++++++++++++++++++++++--------------------- 1 file changed, 43 insertions(+), 36 deletions(-) diff --git a/text/0000-union.md b/text/0000-union.md index 00c7301db49..21b413b1b9a 100644 --- a/text/0000-union.md +++ b/text/0000-union.md @@ -6,8 +6,9 @@ # Summary [summary]: #summary -Provide native support for C-compatible unions, defined via a built-in syntax -macro `union!`. +Provide native support for C-compatible unions, defined via a new "contextual +keyword" `union`, without breaking any existing code that uses `union` as an +identifier. # Motivation [motivation]: #motivation @@ -28,14 +29,11 @@ space-efficient or cache-efficient structures relying on value representation, such as machine-word-sized unions using the least-significant bits of aligned pointers to distinguish cases. -The syntax proposed here avoids reserving a new keyword (such as `union`), and -thus will not break any existing code. This syntax also avoids adding a pragma -to some existing keyword that doesn't quite fit, such as `struct` or `enum`, -which avoids attaching any of the semantic significance of those keywords to -this new construct. Rust does not produce an error or warning about the -redefinition of a macro already defined in the standard library, so the -proposed syntax will not even break code that currently defines a macro named -`union!`. +The syntax proposed here recognizes `union` as though it were a keyword when +used to introduce a union declaration, *without* breaking any existing code +that uses `union` as an identifier. Experiments by Niko Matsakis demonstrate +that recognizing `union` in this manner works unambiguously with zero conflicts +in the Rust grammar. 
To preserve memory safety, accesses to union fields may only occur in unsafe code. Commonly, code using unions will provide safe wrappers around unsafe @@ -47,16 +45,25 @@ union field accesses. ## Declaring a union type A union declaration uses the same field declaration syntax as a struct -declaration, except with `union!` in place of `struct`. +declaration, except with `union` in place of `struct`. ```rust -union! MyUnion { +union MyUnion { f1: u32, f2: f32, } ``` -`union!` implies `#[repr(C)]` as the default representation. +`union` implies `#[repr(C)]` as the default representation. + +## Contextual keyword + +Rust normally prevents the use of a keyword as an identifier; for instance, a +declaration `fn struct() {}` will produce an error "expected identifier, found +keyword `struct`". However, to avoid breaking existing declarations that use +`union` as an identifier, Rust will only recognize `union` as a keyword when +used to introduce a union declaration. A declaration `fn union() {}` will not +produce such an error. ## Instantiating a union @@ -132,7 +139,7 @@ allows matching on the tag and the corresponding field simultaneously: #[repr(u32)] enum Tag { I, F } -union! U { +union U { i: i32, f: f32, } @@ -168,7 +175,7 @@ entire union, such that any borrow conflicting with a borrow of the union containing the union) will produce an error. ```rust -union! U { +union U { f1: u32, f2: f32, } @@ -194,7 +201,7 @@ struct S { y: u32, } -union! U { +union U { s: S, both: u64, } @@ -252,7 +259,7 @@ size of any of its fields, and the maximum alignment of any of its fields. Note that those maximums may come from different fields; for instance: ```rust -union! U { +union U { f1: u16, f2: [u8; 4], } @@ -275,26 +282,26 @@ of unsafe code. # Alternatives [alternatives]: #alternatives -This proposal has a substantial history, with many variants and alternatives -prior to the current macro-based syntax. 
Thanks to many people in the Rust -community for helping to refine this RFC. +Proposals for unions in Rust have a substantial history, with many variants and +alternatives prior to the syntax proposed here with a `union` pseudo-keyword. +Thanks to many people in the Rust community for helping to refine this RFC. -As an alternative to the macro syntax, Rust could support unions via a new -keyword instead. However, any introduction of a new keyword will necessarily +The most obvious path to introducing unions in Rust would introduce `union` as +a new keyword. However, any introduction of a new keyword will necessarily break some code that previously compiled, such as code using the keyword as an -identifier. Using `union` as the keyword would break the substantial volume of -existing Rust code using `union` for other purposes, including [multiple -functions in the standard -library](https://doc.rust-lang.org/std/?search=union). Another keyword such as -`untagged_union` would reduce the likelihood of breaking code in practice; -however, in the absence of an explicit policy for introducing new keywords, -this RFC opts to not propose a new keyword. - -To avoid breakage caused by a new reserved keyword, Rust could use a compound -keyword like `unsafe union` (currently not legal syntax in any context), while -not reserving `union` on its own as a keyword, to avoid breaking use of `union` -as an identifier. This provides equally reasonable syntax, but potentially -introduces more complexity in the Rust parser. +identifier. Making `union` a keyword in the standard way would break the +substantial volume of existing Rust code using `union` for other purposes, +including [multiple functions in the standard +library](https://doc.rust-lang.org/std/?search=union). The approach proposed +here, recognizing `union` to introduce a union declaration without prohibiting +`union` as an identifier, provides the most natural declaration syntax and +avoids breaking any existing code. 
+ +Proposals for unions in Rust have extensively explored possible variations on +declaration syntax, including longer keywords (`untagged_union`), built-in +syntax macros (`union!`), compound keywords (`unsafe union`), pragmas +(`#[repr(union)] struct`), and combinations of existing keywords (`unsafe +enum`). In the absence of a new keyword, since unions represent unsafe, untagged sum types, and enum represents safe, tagged sum types, Rust could base unions on @@ -321,7 +328,7 @@ pattern matching, and field access, the original version of this RFC used a pragma modifying the `struct` keyword: `#[repr(union)] struct`. However, while the proposed unions match struct syntax, they do not share the semantics of struct; most notably, unions represent a sum type, while structs represent a -product type. The new construct `union!` avoids the semantics attached to +product type. The new construct `union` avoids the semantics attached to existing keywords. In the absence of any native support for unions, developers of existing Rust From 748a14973a9b0b9ad108f0e74b315296e6d2839f Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 28 Mar 2016 11:05:05 -0700 Subject: [PATCH 0841/1195] Update with a number of recent discussion points * There is no longer a vector for a "zero configuration workspace", but the configuration needed is very small. * Workspace roots are now defined by `[workspace]` * Explicit edges are now `workspace.members` or `package.workspace`, no others. * Lockfile interactions are explained, especially wrt overrides --- text/0000-cargo-workspace.md | 155 ++++++++++++++++++++++------------- 1 file changed, 98 insertions(+), 57 deletions(-) diff --git a/text/0000-cargo-workspace.md b/text/0000-cargo-workspace.md index b18f9a13687..f0c1f4a9429 100644 --- a/text/0000-cargo-workspace.md +++ b/text/0000-cargo-workspace.md @@ -57,29 +57,34 @@ conventional project layouts but will have explicit controls for configuration. 
First, let's look at the new manifest keys which will be added to `Cargo.toml`: ```toml +[package] +workspace = "../foo" + +# or ... + [workspace] -root = true -members = ["relative/path/to/child1", "child2"] +members = ["relative/path/to/child1", "../child2"] ``` -Here the `workspace.root` key will be used to indicate whether a `Cargo.toml` is -the root of a workspace, and the `members` key will be a list of paths to -crates which should be added to the package's workspace. The paths listed in -`members` must be valid paths to crates. +Here the `package.workspace` key is used to point at a workspace root. For +example this Cargo.toml indicates that the Cargo.toml in `../foo` is the +workspace that this package is a member of. + +The root of a workspace, indicated by the presence of `[workspace]`, may also +explicitly specify some members of the workspace as well via the +`workspace.members` key. This example here means that two extra crates will be a +member of the workspace. ### Implicit relations In addition to the keys above, Cargo will apply a few heuristics to infer the keys wherever possible: -* All path dependencies of a crate are considered members of the `workspace` key - implicitly. -* Starting from a package's `Cargo.toml`, Cargo will walk upwards on the - filesystem to find a sibling `Cargo.toml` and VCS directory (e.g. `.git` or - `.svn`). If found, this crate is also implicitly considered a member of the +* All `path` dependencies of a crate are considered members of the same workspace. -* A `Cargo.toml` which resides next to a VCS directory is implicitly a - workspace root. +* If `package.workspace` isn't specified, then Cargo will walk upwards on the + filesystem until either a `Cargo.toml` with `[workspace]` is found or a VCS + root is found. These rules are intended to reflect some conventional Cargo project layouts. "Root crates" typically appear at the root of a repository with lots path @@ -90,19 +95,18 @@ downwards to specific locations. 
### "Virtual" `Cargo.toml` A good number of projects do not have a root `Cargo.toml` at the top of a -repository, however. While the explicit `[workspace]` keys should be enough to -configure the workspace in addition to the implicit relations above, this -directory structure is common enough that it shouldn't require *that* much more -configuration. +repository, however. While the explicit `package.workspace` and +`workspace.members` keys should be enough to configure the workspace in addition +to the implicit relations above, this directory structure is common enough that +it shouldn't require *that* much more configuration. -To accomodate this project layout, Cargo will now allow for "virtual manifest" +To accommodate this project layout, Cargo will now allow for "virtual manifest" files. These manifests will currently **only** contains the `[workspace]` key and will notably be lacking a `[project]` or `[package]` top level key. A virtual manifest does not itself define a crate, but can help when defining a -root. For example a `Cargo.toml` file at the root of a repository with -`workspace.members` keys would suffice for the project configurations in -question. +root. For example a `Cargo.toml` file at the root of a repository with a +`[workspace]` key would suffice for the project configurations in question. Cargo will for the time being disallow many commands against a virtual manifest, for example `cargo build` will be rejected. Arguments that take a package, @@ -112,10 +116,11 @@ get extended with `--all` flags so in a workspace root you could execute ### Constructing a workspace -With the explicit and implicit relations defined above, each crate will now have -a flag indicating whether it's the root and a number of outgoing edges to other -crates. Two crates are then in the same workspace if they both transitively have -edges to one another. A valid workspace then only has one crate that is a root. 
+With the explicit and implicit relations defined above, each crate will have a +number of outgoing edges to other crates via `workspace.members`, path +dependencies, and `package.workspace`. Two crates are then in the same workspace +if they both transitively have edges to one another. A valid workspace then has +exactly one root crate with a `[workspace]` key. While the restriction of one-root-per workspace may make sense, the restriction of crates transitively having edges to one another may seem a bit odd. The @@ -128,20 +133,20 @@ would not know how to get back to the "root package", so the workspace from the point of view of the path dependencies would be different than that of the root package. This could in turn lead to `Cargo.lock` getting out of sync. -To alleviate misconfiguration, however, if the `workspace` configuration key -contains a crate which is not a member of the constructed workspace, Cargo will -emit an error indicating such. +To alleviate misconfiguration, however, if the `workspace.members` +configuration key contains a crate which is not a member of the constructed +workspace, Cargo will emit an error indicating as such. ### Workspaces in practice -The conventional layout for a Rust project is to have a `Cargo.toml` at the root +A conventional layout for a Rust project is to have a `Cargo.toml` at the root with the "main project" with dependencies and/or satellite projects underneath. -Consequently the conventional layout will need no extra configuration to benefit -from the workspaces proposed in this RFC. For example, all of these project -layouts (with `/` being the root of a repository) will not require any -configuration to have all crates be members of a workspace: +Consequently the conventional layout will only need a `[workspace]` key added to +the root to benefit from the workspaces proposed in this RFC. 
For example, all +of these project layouts (with `/` being the root of a repository) will not +require any configuration to have all crates be members of a workspace: -* An FFI crate with a sub-scrate for FFI bindings +* An FFI crate with a sub-crate for FFI bindings ``` Cargo.toml @@ -166,8 +171,8 @@ configuration to have all crates be members of a workspace: Projects like the compiler, however, will likely need explicit configuration. The `rust` repo conceptually has two workspaces, the standard library and the -compiler, and these would need to be manually configured with `workspace` and -`workspace-root` keys amongst all crates. +compiler, and these would need to be manually configured with +`workspace.members` and `package.workspace` keys amongst all crates. Some examples of layouts that will require extra configuration, along with the configuration necessary, are: @@ -191,11 +196,6 @@ configuration necessary, are: ```toml [workspace] - members = [ - "crate1", - "crate2", - "crate3", - ] ``` * Trees with multiple workspaces @@ -222,34 +222,77 @@ configuration necessary, are: ```toml # ws1/Cargo.toml [workspace] - root = true members = ["crate1", "crate2"] ``` ```toml - # ws1/crate1/Cargo.toml + # ws2/Cargo.toml [workspace] - members = [".."] ``` +* Trees with non-hierarchical workspaces + + ``` + root/ + Cargo.toml + src/ + crates/ + crate1/ + Cargo.toml + src/ + crate2/ + Cargo.toml + src/ + ``` + + The workspace here can be configured by placing the following in the + manifests: + ```toml - # ws1/crate2/Cargo.toml + # root/Cargo.toml + # + # Note that `members` aren't necessary if these are otherwise path + # dependencies. 
[workspace] - members = [".."] + members = ["../crates/crate1", "../crates/crate2"] ``` ```toml - # ws2/Cargo.toml - [workspace] - root = true + # crates/crate1/Cargo.toml + [package] + workspace = "../root" ``` ```toml - # ws2/crate3/Cargo.toml - [workspace] - members = [".."] + # crates/crate2/Cargo.toml + [package] + workspace = "../root" ``` +### Lockfile and override interactions + +One of the main features of a workspace is that only one `Cargo.lock` is +generated for the entire workspace. This lock file can be affected, however, +with both [`[replace]` overrides][replace] as well as `paths` overrides. + +[replace]: https://github.com/rust-lang/cargo/pull/2385 + +Primarily, the `Cargo.lock` generated will not simply be the concatenation of the +lock files from each project. Instead the entire workspace will be resolved +together all at once, minimizing versions of crates used and sharing +dependencies as much as possible. For example one `path` dependency will always +have the same set of dependencies no matter which crate is being compiled. + +When interacting with overrides, workspaces will be modified to only allow +`[replace]` to exist in the workspace root. This Cargo.toml will affect lock +file generation, but no other workspace members will be allowed to have a +`[replace]` directive (with an informative error message being produced). + +Finally, the `paths` overrides will be applied as usual, and they'll continue to +be applied relative to whatever crate is being compiled (not the workspace +root). These are intended for much more local testing, so no restriction of +"must be in the root" should be necessary. + +### Future Extensions Once Cargo understands a workspace of crates, we could easily extend various @@ -270,14 +313,12 @@ show that workspaces can be used to solve other existing issues in Cargo. be incompatible. If all maintainers agree on versions of Cargo, however, this is not a problem.
-* If no crate exists at the root of a repository, it may be the case that an - unduly large amount of configuration is required to setup the workspace - correctly. A minor deviation from the normal conventions should in theory only - require a proportionally minor amount of configuration. - * As proposed there is no method to disable implicit actions taken by Cargo. It's unclear what the use case for this is, but it could in theory arise. +* No crate will implicitly benefit from workspaces after this is implemented. + Existing crates must opt-in with a `[workspace]` key somewhere at least. + # Alternatives * Cargo could attempt to perform more inference of workspace members by simply From 8768d9fc10ec7e4cce002279e0861bea2b98bb0b Mon Sep 17 00:00:00 2001 From: Sergio Benitez Date: Mon, 28 Mar 2016 17:31:34 -0700 Subject: [PATCH 0842/1195] Initial attribute with literals RFC. --- text/0000-attributes-with-literals.md | 130 ++++++++++++++++++++++++++ 1 file changed, 130 insertions(+) create mode 100644 text/0000-attributes-with-literals.md diff --git a/text/0000-attributes-with-literals.md b/text/0000-attributes-with-literals.md new file mode 100644 index 00000000000..405e8d144a7 --- /dev/null +++ b/text/0000-attributes-with-literals.md @@ -0,0 +1,130 @@ +- Feature Name: attributes_with_literals +- Start Date: 2016-03-28 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +This RFC proposes accepting literals in attributes by defining the grammar of attributes as: + +```ebnf +attr : '#' '[' meta_item ']' ; + +meta_item : IDENT ( '=' LIT | '(' meta_item_inner? ')' )? ; + +meta_item_inner : (meta_item | LIT) (',' meta_item_inner)? ; +``` + +Note that `LIT` is a valid Rust literal and `IDENT` is a valid Rust identifier. 
The following +attributes, among others, would be accepted by this grammar: + +```rust +#[attr] +#[attr()] +#[attr(ident)] +#[attr(ident, ident = 100, ident = "hello", ident(100))] +#[attr(100)] +#[attr("hello")] +#[repr(C, align = 4)] +#[repr(C, align(4))] +``` + +# Motivation +[motivation]: #motivation + +At present, literals are only accepted as the value of a key-value pair in attributes. What's more, +only _string_ literals are accepted. This means that literals can only appear in forms of +`#[attr(name = "value")]` or `#[attr = "value"]`. + +This forces non-string literal values to be awkwardly stringified. For example, while it is clear +that something like alignment should be an integer value, the following are disallowed: +`#[align(4)]`, `#[align = 4]`. Instead, we must use something akin to `#[align = "4"]`. Even +`#[align("4")]` and `#[name("name")]` are disallowed, forcing identifiers or key-values to be used +instead: `#[align(size = "4")]` or `#[name(name)]`. + +In short, the current design forces users to use values of the wrong type in attributes. + +### Cleaner Attributes + +Implementation of this RFC can clean up the following attributes in the standard library: + +* `#![recursion_limit = "64"]` **=>** `#![recursion_limit = 64]` or `#![recursion_limit(64)]` +* `#[cfg(all(unix, target_pointer_width = "32"))]` **=>** `#[cfg(all(unix, target_pointer_width = 32))]` + +If `align` were to be added as an attribute, the following are now valid options for its syntax: + +* `#[repr(align(4))]` +* `#[repr(align = 4)]` +* `#[align = 4]` +* `#[align(4)]` + +### Syntax Extensions + +As syntax extensions mature and become more widely used, being able to use literals in a variety of +positions becomes more important. + +# Detailed design +[design]: #detailed-design + +1. 
The `MetaItemKind` structure would need to allow literals as top-level entities: + + ```rust + pub enum MetaItemKind { + Word(InternedString), + List(InternedString, Vec<P<MetaItem>>), + NameValue(InternedString, Lit), + Lit, + } + ``` + +2. `libsyntax` (`libsyntax/parse/attr.rs`) would need to be modified to allow literals as values in + k/v pairs and as top-level entities of a list. + +3. Crate metadata encoding/decoding would need to encode and decode literals in attributes. + +# Drawbacks +[drawbacks]: #drawbacks + +This RFC requires a change to the AST and is likely to break syntax extensions using attributes in +the wild. + +# Alternatives +[alternatives]: #alternatives + +### Token Trees + +An alternative is to allow any tokens inside of an attribute. That is, the grammar could be: + +```ebnf +attr : '#' '[' TOKEN+ ']' ; +``` + +where `TOKEN` is any valid Rust token. The drawback to this approach is that attributes lose any +sense of structure. This results in more difficult and verbose attribute parsing, although this +could be ameliorated through libraries. Further, this would require almost all of the existing +attribute parsing code to change. + +The advantage, of course, is that it allows any syntax and is rather future-proof. It is also more +in line with `macro!`s. + +### Only Allow Literals as Values in K/V Pairs + +Instead of allowing literals in top-level positions, i.e. `#[attr(4)]`, only allow them as values in +key value pairs: `#[attr = 4]` or `#[attr(ident = 4)]`. This has the nice advantage that it was the +initial idea for attributes, and so the AST types already reflect this. As such, no changes would +have to be made to existing code. The drawback, of course, is the lack of flexibility. `#[repr(C, +align(4))]` would no longer be valid. + +### Do Nothing + +Of course, the current design could be kept. Although it seems that the initial intention was for a +form of literals to be allowed.
Unfortunately, this idea was [scrapped due to release pressure] and +never revisited. Even the manual alludes to allowing all literals. + + [scrapped due to release pressure]: https://github.com/rust-lang/rust/issues/623 + +# Unresolved questions +[unresolved]: #unresolved-questions + +None that I can think of. From aac02932e3c3b5749d1c15b83813ca1c3ef00281 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Thu, 11 Feb 2016 12:20:44 +1300 Subject: [PATCH 0843/1195] Changes to name resolution Some internal and language-level changes to name resolution. Internally, name resolution will be split into two parts - import resolution and name lookup. Import resolution is moved forward in time to happen in the same phase as parsing and macro expansion. Name lookup remains where name resolution currently takes place (that may change in the future, but is outside the scope of this RFC). However, name lookup can be done earlier if required (importantly it can be done during macro expansion to allow using the module system for macros, also outside the scope of this RFC). Import resolution will use a new algorithm. The observable effects of this RFC (i.e., language changes) are some increased flexibility in the name resolution rules, especially around globs and shadowing. --- text/0000-name-resolution.md | 420 +++++++++++++++++++++++++++++++++++ 1 file changed, 420 insertions(+) create mode 100644 text/0000-name-resolution.md diff --git a/text/0000-name-resolution.md b/text/0000-name-resolution.md new file mode 100644 index 00000000000..120ee265f12 --- /dev/null +++ b/text/0000-name-resolution.md @@ -0,0 +1,420 @@ +- Feature Name: N/A +- Start Date: 2016-02-09 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Some internal and language-level changes to name resolution. + +Internally, name resolution will be split into two parts - import resolution and +name lookup. 
Import resolution is moved forward in time to happen in the same +phase as parsing and macro expansion. Name lookup remains where name resolution +currently takes place (that may change in the future, but is outside the scope +of this RFC). However, name lookup can be done earlier if required (importantly +it can be done during macro expansion to allow using the module system for +macros, also outside the scope of this RFC). Import resolution will use a new +algorithm. + +The observable effects of this RFC (i.e., language changes) are some increased +flexibility in the name resolution rules, especially around globs and shadowing. + +There is an implementation of the language changes in +[PR #32213](https://github.com/rust-lang/rust/pull/32213). + +# Motivation +[motivation]: #motivation + +Naming and importing macros currently works very differently to naming and +importing any other item. It would be impossible to use the same rules, +since macro expansion happens before name resolution in the compilation process. +Implementing this RFC means that macro expansion and name resolution can happen +in the same phase, thus allowing macros to use the Rust module system properly. + +At the same time, we should be able to accept more Rust programs by tweaking the +current rules around imports and name shadowing. This should make programming +using imports easier. + +# Detailed design +[design]: #detailed-design + +## Changes to name resolution rules + +### Multiple unused imports + +A name may be imported multiple times, it is only a name resolution error if +that name is used. E.g., + +``` +mod foo { + pub struct Qux; +} + +mod bar { + pub struct Qux; +} + +mod baz { + use foo::*; + use bar::*; // Ok, no name conflict. +} +``` + +In this example, adding a use of `Qux` in `baz` would cause a name resolution +error. + +### Multiple imports of the same binding + +A name may be imported multiple times and used if both names bind to the same +item. 
E.g., + +``` +mod foo { + pub struct Qux; +} + +mod bar { + pub use foo::Qux; +} + +mod baz { + use foo::*; + use bar::*; + + fn f(q: Qux) {} +} +``` + +### non-public imports + +Currently `use` and `pub use` items are treated differently. Non-public imports +will be treated in the same way as public imports, so they may be referenced +from modules which have access to them. E.g., + +``` +mod foo { + pub struct Qux; +} + +mod bar { + use foo::Qux; + + mod baz { + use bar::Qux; // Ok + } +} +``` + + +### Glob imports of accessible but not public names + +Glob imports will import all accessible names, not just public ones. E.g., + +``` +struct Qux; + +mod foo { + use super::*; + + fn f(q: Qux) {} // Ok +} +``` + +This change is backwards incompatible. However, the second rule above should +address most cases, e.g., + +``` +struct Qux; + +mod foo { + use super::*; + use super::Qux; // Legal due to the second rule above. + + fn f(q: Qux) {} // Ok +} +``` + +The below rule (though more controversial) should make this change entirely +backwards compatible. + +Note that in combination with the above rule, this means non-public imports are +imported by globs where they are private but accessible. + + +### Globs and explicit names + +An explicit name may shadow a glob imported name without causing a name +resolution error. E.g., + +``` +mod foo { + pub struct Qux; +} + +mod bar { + pub struct Qux; +} + +mod baz { + use foo::*; + + struct Qux; // Shadows foo::Qux. +} + +mod boz { + use foo::*; + use bar::Qux; // Shadows foo::Qux; note, ordering is not important. +} +``` + +Note that shadowing is namespace specific. I believe this is consistent with our +general approach to name spaces. E.g., + +``` +mod foo { + pub struct Qux; +} + +mod bar { + pub trait Qux; +} + +mod boz { + use foo::*; + use bar::Qux; // Shadows only in the type name space. + + fn f(x: &Qux) { // bound to bar::Qux. + let _ = Qux; // bound to foo::Qux. 
+ } +} +``` + +This change is discussed in [issue 31337](https://github.com/rust-lang/rust/issues/31337). + + +## Changes to the implementation + +Note: below I talk about "the binding table", this is sort of hand-waving. I'm +envisaging a sets-of-scopes system where there is effectively a single, global +binding table. However, the details of that are beyond the scope of this RFC. +One can imagine "the binding table" means one binding table per scope, as in the +current system. + +Currently, parsing and macro expansion happen in the same phase. With this +proposal, we add import resolution to that mix too. Binding tables as well as +the AST will be produced by libsyntax. Name lookup will continue to be done +where name resolution currently takes place. + +To resolve imports, the algorithm proceeds as follows: we start by parsing as +much of the program as we can; like today we don't parse macros. When we find +items which bind a name, we add the name to the binding table. When we find an +import which can't be resolved, we add it to a work list. When we find a glob +import, we have to record a 'back link', so that when a public name is added for +the supplying module, we can add it for the importing module. + +We then loop over the work list and try to lookup names. If a name has exactly +one best binding then we use it (and record the binding on a list of resolved +names). If there are zero, or more than one possible binding, then we put it +back on the work list. When we reach a fixed point, i.e., the work list no +longer changes, then we are done. If the work list is empty, then +expansion/import resolution succeeded, otherwise there are names not found, or +ambiguous names, and we failed. + +As we are looking up names, we record the resolutions in the binding table. If +the name we are looking up is for a glob import, we add bindings for every +accessible name currently known. + +To expand a macro use, we try to resolve the macro's name. 
If that fails, we put +it on the work list. Otherwise, we expand that macro by parsing the arguments, +pattern matching, and doing hygienic expansion. We then parse the generated code +in the same way as we parsed the original program. We add new names to the +binding table, and expand any new macro uses. + +If we add names for a module which has back links, we must follow them and add +these names to the importing module (if they are accessible). When following +these back links, we check for cycles, signaling an error if one is found. + +In pseudo-code: + +``` +// Assumes parsing is already done, but the two things could be done in the same +// pass. +fn parse_expand_and_resolve() { + loop until fixed point { + loop until fixed point { + process_names() + process_work_list() + } + expand_macros() + } + + for item in work_list { + report_error() + } else { + success!() + } +} + +fn process_names() { + // 'module' includes `mod`s, top level of the crate, function bodies + for each unseen item in any module { + if item is a definition { + // struct, trait, type, local variable def, etc. 
+ bindings.insert(item.name, module, item) + populate_back_links(module, item) + } else { + try_to_resolve_import(module, item) + } + record_macro_uses() + } +} + +fn try_to_resolve_import(module, item) { + if item is an explicit use { + // item is use a::b::c as d; + match try_to_resolve(item) { + Ok(r) => { + add(bindings.insert(d, module, r, Priority::Explicit)) + populate_back_links(module, item) + } + Err() => work_list.push(module, item) + } + } else if item is a glob { + // use a::b::*; + match try_to_resolve(a::b) { + Ok(n) => + for binding in n { + bindings.insert_if_no_higher_priority_binding(binding.name, module, binding, Priority::Glob) + populate_back_links(module, binding) + } + add_back_link(n to module) + work_list.remove() + Err(_) => work_list.push(module, item) + } + } +} + +fn process_work_list() { + for each (module, item) in work_list { + work_list.remove() + try_to_resolve_import(module, item) + } +} +``` + +In order to keep macro expansion comprehensible to programmers, we must enforce +that all macro uses resolve to the same binding at the end of resolution as they +do when they were resolved. + +We rely on a monotonicity property in macro expansion - once an item exists in a +certain place, it will always exist in that place. It will never disappear and +never change. Note that for the purposes of this property, I do not consider +code annotated with a macro to exist until it has been fully expanded. + +A consequence of this is that if the compiler resolves a name, then does some +expansion and resolves it again, the first resolution will still be valid. +However, another resolution may appear, so the resolution of a name may change +as we expand. It can also change from a good resolution to an ambiguity. It is +also possible to change from good to ambiguous to good again. There is even an +edge case where we go from good to ambiguous to the same good resolution (but +via a different route). 
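The work-list iteration described above can be sketched in Rust as follows. This is a toy model under stated assumptions, not the compiler's implementation: modules and names are plain strings, items are integer ids, and the names `Priority`, `Import`, `Bindings`, `insert`, and `resolve_imports` are all hypothetical. It models only definitions, explicit imports, and glob imports, with explicit bindings taking priority over glob bindings. It does not model macros, privacy, name spaces, ambiguity between two globs, or the final re-resolution check.

```rust
use std::collections::HashMap;

// Toy model only: modules and names are strings, items are integer ids.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Priority {
    Glob,     // binding brought in by `use m::*`
    Explicit, // a definition, or a binding brought in by `use m::name`
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Import {
    Explicit { into: &'static str, from: &'static str, name: &'static str },
    Glob { into: &'static str, from: &'static str },
}

// (module, name) -> (item id, priority of the binding)
type Bindings = HashMap<(&'static str, &'static str), (u32, Priority)>;

/// Adds a binding unless one of equal or higher priority already exists.
/// Returns `true` when the table changed.
fn insert(b: &mut Bindings, module: &'static str, name: &'static str, item: u32, p: Priority) -> bool {
    match b.get(&(module, name)).copied() {
        Some((_, existing)) if existing >= p => false,
        _ => {
            b.insert((module, name), (item, p));
            true
        }
    }
}

/// Iterates to a fixed point; returns the table and the unresolved explicit imports.
fn resolve_imports(
    defs: &[(&'static str, &'static str, u32)],
    imports: &[Import],
) -> (Bindings, Vec<Import>) {
    let mut bindings = Bindings::new();
    for &(module, name, item) in defs {
        insert(&mut bindings, module, name, item, Priority::Explicit);
    }
    let mut work_list: Vec<Import> = imports.to_vec();
    loop {
        let mut changed = false;
        let mut still_pending = Vec::new();
        for import in work_list {
            match import {
                Import::Explicit { into, from, name } => match bindings.get(&(from, name)).copied() {
                    Some((item, _)) => changed |= insert(&mut bindings, into, name, item, Priority::Explicit),
                    None => still_pending.push(import), // try again next round
                },
                Import::Glob { into, from } => {
                    // Copy every name currently known in the supplying module.
                    let visible: Vec<_> = bindings
                        .iter()
                        .filter(|((m, _), _)| *m == from)
                        .map(|((_, n), (i, _))| (*n, *i))
                        .collect();
                    for (name, item) in visible {
                        changed |= insert(&mut bindings, into, name, item, Priority::Glob);
                    }
                    // A glob stays on the list; this plays the role of a back link.
                    still_pending.push(import);
                }
            }
        }
        work_list = still_pending;
        if !changed {
            break; // fixed point reached
        }
    }
    let unresolved = work_list
        .into_iter()
        .filter(|i| matches!(i, Import::Explicit { .. }))
        .collect();
    (bindings, unresolved)
}
```

Re-running each glob on every round is a simplification chosen for brevity; the back links described in the text avoid that repeated scan by pushing new names to importing modules as they appear.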
+ +If import resolution succeeds, then we check our record of name resolutions. We +re-resolve and check we get the same result. We can also check for un-used +macros at this point. + +### Privacy + +In order to resolve imports (and in the future for macro privacy), we must be +able to decide if names are accessible. This requires doing privacy checking as +required during parsing/expansion/import resolution. We can keep the current +algorithm, but check accessibility on demand, rather than as a separate pass. + +During macro expansion, once a name is resolvable, then we can safely perform +privacy checking, because parsing and macro expansion will never remove items, +nor change the module structure of an item once it has been expanded. + +### Metadata + +When a crate is packed into metadata, we must also include the binding table. We +must include private entries due to macros that the crate might export. We don't +need data for function bodies. For functions which are serialised for +inlining/monomorphisation, we should include local data (although it's probably +better to serialise the HIR or MIR, then the local bindings are unnecessary). + + +# Drawbacks +[drawbacks]: #drawbacks + +It's a lot of work and name resolution is complex, therefore there is scope for +introducing bugs. + +The macro changes are not backwards compatible, which means having a macro +system 2.0. If users are reluctant to use that, we will have two macro systems +forever. + +# Alternatives +[alternatives]: #alternatives + +## Naming rules + +We could take a subset of the shadowing changes (or none at all), whilst still +changing the implementation of name resolution. In particular, we might want to +discard the explicit/glob shadowing rule change, or only allow items, not +imported names to shadow. + +We could also consider different shadowing rules around namespacing. 
In the +'globs and explicit names' rule change, we could consider an explicit name to +shadow both name spaces and emit a custom error. The example becomes: + + +``` +mod foo { + pub struct Qux; +} + +mod bar { + pub trait Qux; +} + +mod boz { + use foo::*; + use bar::Qux; // Shadows both name spaces. + + fn f(x: &Qux) { // bound to bar::Qux. + let _ = Qux; // ERROR, unresolved name Qux; the compiler would emit a + // note about shadowing and namespaces. + } +} +``` + +## Import resolution algorithm + +Rather than lookup names for imports during the fixpoint iteration, one could +save links between imports and definitions. When lookup is required (for macros, +or later in the compiler), these links are followed to find a name, rather than +having the name being immediately available. + + +# Unresolved questions +[unresolved]: #unresolved-questions + +## Name lookup + +The name resolution phase would be replaced by a cut-down name lookup phase, +where the binding tables generated during expansion are used to lookup names in +the AST. + +We could go further, two appealing possibilities are merging name lookup with +the lowering from AST to HIR, so the HIR is a name-resolved data structure. Or, +name lookup could be done lazily (probably with some caching) so no tables +binding names to definitions are kept. I prefer the first option, but this is +not really in scope for this RFC. + + +# References + +* [Niko's prototype](https://github.com/nikomatsakis/rust-name-resolution-algorithm) +* [Blog post](http://ncameron.org/blog/name-resolution/), includes details about + how the name resolution algorithm interacts with sets of scopes hygiene. 
From 53a973a6b422564c3904dedb203106f65d3f0d53 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 29 Mar 2016 09:31:02 -0700 Subject: [PATCH 0844/1195] Fix a bug in [workspace] definitions --- text/0000-cargo-workspace.md | 20 +++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/text/0000-cargo-workspace.md b/text/0000-cargo-workspace.md index f0c1f4a9429..f1e83281ce0 100644 --- a/text/0000-cargo-workspace.md +++ b/text/0000-cargo-workspace.md @@ -106,7 +106,8 @@ and will notably be lacking a `[project]` or `[package]` top level key. A virtual manifest does not itself define a crate, but can help when defining a root. For example a `Cargo.toml` file at the root of a repository with a -`[workspace]` key would suffice for the project configurations in question. +`[workspace]` key plus `workspace.members` configuration would suffice for the +project configurations in question. Cargo will for the time being disallow many commands against a virtual manifest, for example `cargo build` will be rejected. Arguments that take a package, @@ -143,8 +144,9 @@ A conventional layout for a Rust project is to have a `Cargo.toml` at the root with the "main project" with dependencies and/or satellite projects underneath. Consequently the conventional layout will only need a `[workspace]` key added to the root to benefit from the workspaces proposed in this RFC. 
For example, all -of these project layouts (with `/` being the root of a repository) will not -require any configuration to have all crates be members of a workspace: +of these project layouts (with `/` being the root of a repository) will only +require the addition of `[workspace]` in the root to have all crates be members +of a workspace: * An FFI crate with a sub-crate for FFI bindings @@ -196,6 +198,7 @@ configuration necessary, are: ```toml [workspace] + members = ["crate1", "crate2", "crate3"] ``` * Trees with multiple workspaces @@ -321,6 +324,17 @@ show that workspaces can be used to solve other existing issues in Cargo. # Alternatives +* The `workspace.members` key could support globs to define a number of + directories at once. For example one could imagine: + + ```toml + [workspace] + members = ["crates/*"] + ``` + + as an ergonomic method of slurping up all sub-folders in the `crates` folder + as crates. + * Cargo could attempt to perform more inference of workspace members by simply walking the entire directory tree starting at `Cargo.toml`. All children found could implicitly be members of the workspace. Walking entire trees, From 773f05e400e36ce52e9606d6a9f2382b4d7b3e62 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 29 Mar 2016 09:45:02 -0700 Subject: [PATCH 0845/1195] Clarify [workspace] without members --- text/0000-cargo-workspace.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-cargo-workspace.md b/text/0000-cargo-workspace.md index f1e83281ce0..662d72f5783 100644 --- a/text/0000-cargo-workspace.md +++ b/text/0000-cargo-workspace.md @@ -107,7 +107,9 @@ and will notably be lacking a `[project]` or `[package]` top level key. A virtual manifest does not itself define a crate, but can help when defining a root. For example a `Cargo.toml` file at the root of a repository with a `[workspace]` key plus `workspace.members` configuration would suffice for the -project configurations in question. 
+project configurations in question. Note that omitting `workspace.members` would +not be useful as there are no outgoing edges (no `path` dependencies), so Cargo +will emit an error in cases like this. Cargo will for the time being disallow many commands against a virtual manifest, for example `cargo build` will be rejected. Arguments that take a package, From 573bd83aaac0896842f136262458c3b3bf671cd7 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Thu, 11 Feb 2016 14:28:06 +1300 Subject: [PATCH 0846/1195] Macro naming and modularisation This RFC proposes making macros a first-class citizen in the Rust module system. Both macros by example (`macro_rules` macros) and procedural macros (aka syntax extensions) would use the same naming and modularisation scheme as other items in Rust. --- text/0000-macro-naming.md | 165 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 165 insertions(+) create mode 100644 text/0000-macro-naming.md diff --git a/text/0000-macro-naming.md b/text/0000-macro-naming.md new file mode 100644 index 00000000000..669b9176174 --- /dev/null +++ b/text/0000-macro-naming.md @@ -0,0 +1,165 @@ +- Feature Name: N/A (part of other unstable features) +- Start Date: 2016-02-11 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Naming and modularisation for macros. + +This RFC proposes making macros a first-class citizen in the Rust module system. +Both macros by example (`macro_rules` macros) and procedural macros (aka syntax +extensions) would use the same naming and modularisation scheme as other items +in Rust. + +For procedural macros, this RFC could be implemented immediately or as part of a +larger effort to reform procedural macros. For macros by example, this would be +part of a macros 2.0 feature, the rest of which will be described in a separate +RFC. This RFC depends on the changes to name resolution described in +[RFC 1560](https://github.com/rust-lang/rfcs/pull/1560). 
+ +# Motivation +[motivation]: #motivation + +Currently, procedural macros are not modularised at all (beyond the crate +level). Macros by example have a [custom modularisation +scheme](https://github.com/rust-lang/rfcs/blob/master/text/0453-macro-reform.md) +which involves modules to some extent, but relies on source ordering and +attributes which are not used for other items. Macros cannot be imported or +named using the usual syntax. It is confusing that macros use their own system +for modularisation. It would be far nicer if they were a more regular feature of +Rust in this respect. + + +# Detailed design +[design]: #detailed-design + +## Defining macros + +This RFC does not propose changes to macro definitions. It is envisaged that +definitions of procedural macros will change, see [this blog post](http://ncameron.org/blog/macro-plans-syntax/) +for some rough ideas. I'm assuming that procedural macros will be defined in +some function-like way and that these functions will be defined in modules in +their own crate (to start with). + +Ordering of macro definitions in the source text will no longer be significant. +A macro may be used before it is defined, as long as it can be named. That is, +macros follow the same rules regarding ordering as other items. E.g., this will +work: + +``` +foo!(); + +macro_rules! foo { ... } +``` + +Macro expansion order is also not defined by source order. E.g., in `foo!(); bar!();`, +`bar` may be expanded before `foo`. Ordering is only guaranteed as far as it is +necessary. E.g., if `bar` is only defined by expanding `foo`, then `foo` must be +expanded before `bar`. + +## Function-like macro uses + +A function-like macro use (c.f., attribute-like macro use) is a macro use which +uses `foo!(...)` or `foo! ident (...)` syntax (where `()` may also be `[]` or `{}`). + +Macros may be named by using a `::`-separated path. Naming follows the same +rules as other items in Rust. 
+ +If a macro `baz` (by example or procedural) is defined in a module `bar` which +is nested in `foo`, then it may be used anywhere in the crate using an +absolute path: `::foo::bar::baz!(...)`. It can be used via relative paths in the +usual way, e.g., inside `foo` as `bar::baz!()`. + +Macros declared inside a function body can only be used inside that function +body. + +For procedural macros, the path must point to the function defining the macro. + +The grammar for macros is changed: anywhere we currently parse `name "!"`, we +now parse `path "!"`. I don't think this introduces any issues. + +Name lookup follows the same name resolution rules as other items. See [RFC +1560](https://github.com/rust-lang/rfcs/pull/1560) for details on how name +resolution could be adapted to support this. + +## Attribute-like macro uses + +Attribute macros may also be named using a `::`-separated path. Other than +appearing in an attribute, these also follow the usual Rust naming rules. + +E.g., `#[::foo::bar::baz(...)]` and `#[bar::baz(...)]` are uses of absolute and +relative paths, respectively. + + +## Importing macros + +Importing macros is done using `use` in the same way as other items. An `!` is +not necessary in an import item. Macros are imported into their own namespace +and do not shadow or overlap items with the same name in the type or value +namespaces. + +E.g., `use foo::bar::baz;` imports the macro `baz` from the module `::foo::bar`. +Macro imports may be used in import lists (with other macro imports and with +non-macro imports). + +Where a glob import (`use ...::*;`) imports names from a module including macro +definitions, the names of those macros are also imported. E.g., `use +foo::bar::*;` would import `baz` along with any other items in `foo::bar`. + +Where macros are defined in a separate crate, these are imported in the same way +as other items by an `extern crate` item. + +No `#[macro_use]` or `#[macro_export]` annotations are required.
+ + +# Drawbacks +[drawbacks]: #drawbacks + +If the new macro system is not well adopted by users, we could be left with two +very different schemes for naming macros depending on whether a macro is defined +by example or procedurally. That would be inconsistent and annoying. However, I +hope we can make the new macro system appealing enough and close enough to the +existing system that migration is both desirable and easy. + + +# Alternatives +[alternatives]: #alternatives + +We could adopt the proposed scheme for procedural macros only and keep the +existing scheme for macros by example. + +We could adapt the current macros by example scheme to procedural macros. + +We could require the `!` in macro imports to distinguish them from other names. +I don't think this is necessary or helpful. + +We could continue to require `macro_export` annotations on top of this scheme. +However, I prefer moving to a scheme using the same privacy system as the rest +of Rust, see below. + + +# Unresolved questions +[unresolved]: #unresolved-questions + +## Privacy for macros + +I would like that macros follow the same rules for privacy as other Rust items, +i.e., they are private by default and may be marked as `pub` to make them +public. This is not as straightforward as it sounds as it requires parsing `pub +macro_rules! foo` as a macro definition, etc. I leave this for a separate RFC. + +## Scoped attributes + +It would be nice for tools to use scoped attributes as well as procedural +macros, e.g., `#[rustfmt::skip]` or `#[rust::new_attribute]`. I believe this +should be straightforward syntactically, but there are open questions around +when attributes are ignored or seen by tools and the compiler. Again, I leave it +for a future RFC. + +## Inline procedural macros + +Some day, I hope that procedural macros may be defined in the same crate in +which they are used. 
I leave the details of this for later, however, I don't +think this affects the design of naming - it should all Just Work. From 5a1be048c36bf6ef86707412adc5547e1e22bad5 Mon Sep 17 00:00:00 2001 From: Sergio Benitez Date: Tue, 29 Mar 2016 18:12:09 -0700 Subject: [PATCH 0847/1195] Fixes and clarifications: '!' in grammar, "literal" definition, more examples. --- text/0000-attributes-with-literals.md | 59 +++++++++++++++++++++------ 1 file changed, 47 insertions(+), 12 deletions(-) diff --git a/text/0000-attributes-with-literals.md b/text/0000-attributes-with-literals.md index 405e8d144a7..e1b9312e3f3 100644 --- a/text/0000-attributes-with-literals.md +++ b/text/0000-attributes-with-literals.md @@ -9,7 +9,7 @@ This RFC proposes accepting literals in attributes by defining the grammar of attributes as: ```ebnf -attr : '#' '[' meta_item ']' ; +attr : '#' '!'? '[' meta_item ']' ; meta_item : IDENT ( '=' LIT | '(' meta_item_inner? ')' )? ; @@ -21,10 +21,12 @@ attributes, among others, would be accepted by this grammar: ```rust #[attr] -#[attr()] +#[attr(true)] #[attr(ident)] -#[attr(ident, ident = 100, ident = "hello", ident(100))] +#[attr(ident, 100, true, "true", ident = 100, ident = "hello", ident(100))] #[attr(100)] +#[attr(enabled = true)] +#[enabled(true)] #[attr("hello")] #[repr(C, align = 4)] #[repr(C, align(4))] @@ -40,10 +42,11 @@ only _string_ literals are accepted. This means that literals can only appear in This forces non-string literal values to be awkwardly stringified. For example, while it is clear that something like alignment should be an integer value, the following are disallowed: `#[align(4)]`, `#[align = 4]`. Instead, we must use something akin to `#[align = "4"]`. Even -`#[align("4")]` and `#[name("name")]` are disallowed, forcing identifiers or key-values to be used -instead: `#[align(size = "4")]` or `#[name(name)]`. 
+`#[align("4")]` and `#[name("name")]` are disallowed, forcing key-value pairs or identifiers to be +used instead: `#[align(size = "4")]` or `#[name(name)]`. -In short, the current design forces users to use values of the wrong type in attributes. +In short, the current design forces users to use values of a single type, and thus occasionally the +_wrong_ type, in attributes. ### Cleaner Attributes @@ -67,6 +70,23 @@ positions becomes more important. # Detailed design [design]: #detailed-design +To clarify, _literals_ are: + + * **Strings:** `"foo"`, `r##"foo"##` + * **Byte Strings:** `b"foo"` + * **Byte Characters:** `b'f'` + * **Characters:** `'a'` + * **Integers:** `1`, `1{i,u}{8,16,32,64,size}` + * **Floats:** `1.0`, `1.0f{32,64}` + * **Booleans:** `true`, `false` + +They are defined in the [manual] and by implementation in the [AST]. + + [manual]: https://doc.rust-lang.org/reference.html#literals + [AST]: http://manishearth.github.io/rust-internals-docs/syntax/ast/enum.LitKind.html + +Implementation of this RFC requires the following changes: + 1. The `MetaItemKind` structure would need to allow literals as top-level entities: ```rust @@ -74,7 +94,7 @@ positions becomes more important. Word(InternedString), List(InternedString, Vec<P<MetaItem>>), NameValue(InternedString, Lit), - Lit, + Literal(Lit), } ``` @@ -92,12 +112,12 @@ the wild. # Alternatives [alternatives]: #alternatives -### Token Trees +### Token trees An alternative is to allow any tokens inside of an attribute. That is, the grammar could be: ```ebnf -attr : '#' '[' TOKEN+ ']' ; +attr : '#' '!'? '[' TOKEN+ ']' ; ``` where `TOKEN` is any valid Rust token. The drawback to this approach is that attributes lose any @@ -108,7 +128,21 @@ attribute parsing code to change. The advantage, of course, is that it allows any syntax and is rather future proof. It is also more inline with `macro!`s.
-### Only Allow Literals as Values in K/V Pairs +### Allow only unsuffixed literals + +This RFC proposes allowing _any_ valid Rust literals in attributes. Instead, the use of literals +could be restricted to only those that are unsuffixed. That is, only the following literals could be +allowed: + + * **Strings:** `"foo"` + * **Characters:** `'a'` + * **Integers:** `1` + * **Floats:** `1.0` + * **Booleans:** `true`, `false` + +This cleans up the appearance of attributes while still increasing flexibility. + +### Allow literals only as values in k/v pairs Instead of allowing literals in top-level positions, i.e. `#[attr(4)]`, only allow them as values in key value pairs: `#[attr = 4]` or `#[attr(ident = 4)]`. This has the nice advantage that it was the @@ -116,13 +150,14 @@ initial idea for attributes, and so the AST types already reflect this. As such, have to be made to existing code. The drawback, of course, is the lack of flexibility. `#[repr(C, align(4))]` would no longer be valid. -### Do Nothing +### Do nothing Of course, the current design could be kept. Although it seems that the initial intention was for a form of literals to be allowed. Unfortunately, this idea was [scrapped due to release pressure] and -never revisited. Even the manual alludes to allowing all literals. +never revisited. Even [the reference] alludes to allowing all literals as values in k/v pairs. [scrapped due to release pressure]: https://github.com/rust-lang/rust/issues/623 + [the manual]: https://doc.rust-lang.org/reference.html#attributes # Unresolved questions [unresolved]: #unresolved-questions From e7c214fce533eae4a97a5232517ebaf4e35232f9 Mon Sep 17 00:00:00 2001 From: Sergio Benitez Date: Tue, 29 Mar 2016 18:30:12 -0700 Subject: [PATCH 0848/1195] Fixed a mislabeled link.
--- text/0000-attributes-with-literals.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-attributes-with-literals.md b/text/0000-attributes-with-literals.md index e1b9312e3f3..47958767092 100644 --- a/text/0000-attributes-with-literals.md +++ b/text/0000-attributes-with-literals.md @@ -157,7 +157,7 @@ form of literals to be allowed. Unfortunately, this idea was [scrapped due to re never revisited. Even [the reference] alludes to allowing all literals as values in k/v pairs. [scrapped due to release pressure]: https://github.com/rust-lang/rust/issues/623 - [the manual]: https://doc.rust-lang.org/reference.html#attributes + [the reference]: https://doc.rust-lang.org/reference.html#attributes # Unresolved questions [unresolved]: #unresolved-questions From 1785c85c00c8b57ed0639ff6407ae5203bcb8174 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 29 Mar 2016 21:20:30 -0700 Subject: [PATCH 0849/1195] Remove alternatives that are folded in --- text/0000-cargo-workspace.md | 12 ------------ 1 file changed, 12 deletions(-) diff --git a/text/0000-cargo-workspace.md b/text/0000-cargo-workspace.md index 662d72f5783..8b5aeb4331e 100644 --- a/text/0000-cargo-workspace.md +++ b/text/0000-cargo-workspace.md @@ -343,18 +343,6 @@ show that workspaces can be used to solve other existing issues in Cargo. unfortunately, isn't always efficient to do and it would be unfortunate to have to unconditionally do this. -* Cargo could support "virtual packages" where a `Cargo.toml` is placed at the - root of a repository but only to serve as a global project configuration. No - crate would actually be described by a virtual package, but it would play into - the workspace heuristics described here. This feature could alleviate the "too - much extra configuration" drawback described above, but it's unclear whether - it's needed at this point. - -* Implicit members are currently only path dependencies and a "Cargo.toml next - to VCS" traveling upwards. 
Instead all Cargo.toml members found traveling - upwards could be implicit members of a workspace. This behavior, however, may - end up picking up too many crates. - # Unresolved questions * Does this approach scale well to repositories with a large number of crates? From e019825d6b2b7bb4c8537bced61bd9d180cf6c2b Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Wed, 30 Mar 2016 22:20:01 +0100 Subject: [PATCH 0850/1195] Update RFC based on feedback --- text/0000-int128.md | 63 +++++++++++++++++++++++++++++++++++++++------ 1 file changed, 55 insertions(+), 8 deletions(-) diff --git a/text/0000-int128.md b/text/0000-int128.md index 3ebb408ced0..7882f528b23 100644 --- a/text/0000-int128.md +++ b/text/0000-int128.md @@ -6,7 +6,7 @@ # Summary [summary]: #summary -This RFC adds the `i128` and `u128` types to Rust. The `i128` and `u128` are not added to the prelude, and must instead be explicitly imported with `use core::{i128, u128}`. +This RFC adds the `i128` and `u128` primitive types to Rust. # Motivation [motivation]: #motivation @@ -16,11 +16,29 @@ Some algorithms need to work with very large numbers that don't fit in 64 bits, # Detailed design [design]: #detailed-design -The `i128` and `u128` types are not added to the Rust prelude since that would break compatibility. Instead they must be explicitly imported with `use core::{i128, u128}` or `use std::{i128, u128}`. +## Compiler support -Implementation-wise, this should just be a matter of adding a new primitive type to the compiler and adding trait implementations for `i128`/`u128` in libcore. Literals will need to be extended to support `i128`/`u128`. +The first step for implementing this feature is to add support for the `i128`/`u128` primitive types to the compiler. This will require changes to many parts of the compiler, from libsyntax to trans.
-LLVM fully supports 128-bit integers on all architectures, however it will emit calls to functions in `compiler-rt` for many operations such as multiplication and division (addition and subtraction are implemented natively). However, `compiler-rt` only provides the functions for 128-bit integers on 64-bit platforms (`#ifdef __LP64__`). We will need to provide our own implementations of the following functions to allow `i128`/`u128` to be available on all architectures: +The compiler will need to be bootstrapped from an older compiler which does not support `i128`/`u128`, but rustc will want to use these types internally for things like literal parsing and constant propagation. This can be solved by using a "software" implementation of these types, similar to the one in the [extprim](https://github.com/kennytm/extprim) crate. Once stage1 is built, stage2 can be compiled using the native LLVM `i128`/`u128` types. + +## Runtime library support + +The LLVM code generator supports 128-bit integers on all architectures; however, it will lower some operations to runtime library calls. This is similar to how we currently handle `u64` and `i64` on 32-bit platforms: "complex" operations such as multiplication or division are lowered by LLVM backends into calls to functions in the `compiler-rt` runtime library. + +Here is a rough breakdown of which operations are handled natively instead of through a library call: +- Add/Sub/Neg: native, including checked overflow variants +- Compare (eq/ne/gt/ge/lt/le): native +- Bitwise and/or/xor/not: native +- Shift left/right: native on most architectures (some use libcalls instead) +- Bit counting, parity, leading/trailing ones/zeroes: native +- Byte swapping: native +- Mul/Div/Mod: libcall (including checked overflow multiplication) +- Conversion to/from f32/f64: libcall + +The `compiler-rt` library that comes with LLVM only implements runtime library functions for 128-bit integers on 64-bit platforms (`#ifdef __LP64__`).
We will need to provide our own implementations of the relevant functions to allow `i128`/`u128` to be available on all architectures. Note that this can only be done with a compiler that already supports `i128`/`u128` to match the calling convention that LLVM is expecting. + +Here is the list of functions that need to be implemented: ```c // si_int = i32 @@ -46,17 +64,46 @@ tu_int __udivti3(tu_int a, tu_int b); tu_int __umodti3(tu_int a, tu_int b); ``` +Implementations of these functions will be written in Rust and will be included in libcore. + +## Modifications to libcore + +Several changes need to be made to libcore: +- `src/libcore/num/i128.rs`: Define `MIN` and `MAX`. +- `src/libcore/num/u128.rs`: Define `MIN` and `MAX`. +- `src/libcore/num/mod.rs`: Implement inherent methods, `Zero`, `One`, `From` and `FromStr` for `u128` and `i128`. +- `src/libcore/num/wrapping.rs`: Implement methods for `Wrapping<u128>` and `Wrapping<i128>`. +- `src/libcore/fmt/num.rs`: Implement `Binary`, `Octal`, `LowerHex`, `UpperHex`, `Debug` and `Display` for `u128` and `i128`. +- `src/libcore/cmp.rs`: Implement `Eq`, `PartialEq`, `Ord` and `PartialOrd` for `u128` and `i128`. +- `src/libcore/nonzero.rs`: Implement `NonZero` for `u128` and `i128`. +- `src/libcore/iter.rs`: Implement `Step` for `u128` and `i128`. +- `src/libcore/clone.rs`: Implement `Clone` for `u128` and `i128`. +- `src/libcore/default.rs`: Implement `Default` for `u128` and `i128`. +- `src/libcore/hash/mod.rs`: Implement `Hash` for `u128` and `i128` and add `write_i128` and `write_u128` to `Hasher`. +- `src/libcore/lib.rs`: Add the `u128` and `i128` modules. + +## Modifications to libstd + +A few minor changes are required in libstd: +- `src/libstd/lib.rs`: Re-export `core::{i128, u128}`. +- `src/libstd/primitive_docs.rs`: Add documentation for `i128` and `u128`.
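As a rough illustration of what the libcore and libstd changes listed above would enable, the sketch below exercises a few of the named pieces (`MAX`, `FromStr`, `LowerHex`, `Wrapping`, `From`, `Default`). It assumes a compiler in which `i128`/`u128` are already available as primitives; `int128_surface_demo` is a hypothetical name used only for this example and is not part of the proposal.

```rust
use std::num::Wrapping;

/// Hypothetical helper: exercises a few of the trait impls listed above.
fn int128_surface_demo() -> (u128, String, u128, u128, i128) {
    // `FromStr`: parse a decimal value that does not fit in 64 bits.
    let parsed: u128 = "340282366920938463463374607431768211455"
        .parse()
        .expect("valid u128 literal");
    // `LowerHex`: format the value in hexadecimal.
    let hex = format!("{:x}", parsed);
    // `Wrapping<u128>`: addition wraps around instead of overflowing.
    let wrapped = (Wrapping(u128::MAX) + Wrapping(1u128)).0;
    // `From<u64>`: lossless widening conversion.
    let widened = u128::from(u64::MAX);
    // `Default`: the default value is zero.
    let zero = i128::default();
    (parsed, hex, wrapped, widened, zero)
}

fn main() {
    let (parsed, hex, wrapped, widened, zero) = int128_surface_demo();
    println!("{} {} {} {} {}", parsed, hex, wrapped, widened, zero);
}
```

The point of the sketch is only that the 128-bit types should behave like the existing integer primitives once the listed impls exist; it does not show how those impls are written.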
+ +## Modifications to other crates + +A few external crates will need to be updated to support the new types: +- `rustc-serialize`: Add the ability to serialize `i128` and `u128`. +- `serde`: Add the ability to serialize `i128` and `u128`. +- `rand`: Add the ability to generate random `i128`s and `u128`s. + # Drawbacks [drawbacks]: #drawbacks -One possible complication is that primitive types aren't currently part of the prelude, instead they are directly added to the global namespace by the compiler. The new `i128` and `u128` types will behave differently and will need to be explicitly imported. - -Another possible issue is that a `u128` can hold a very large number that doesn't fit in a `f32`. We need to make sure this doesn't lead to any `undef`s from LLVM. See [this comment](https://github.com/rust-lang/rust/issues/10185#issuecomment-110955148), and [this example code](https://gist.github.com/Amanieu/f87da5f0599b343c5500). +One possible issue is that a `u128` can hold a very large number that doesn't fit in a `f32`. We need to make sure this doesn't lead to any `undef`s from LLVM. See [this comment](https://github.com/rust-lang/rust/issues/10185#issuecomment-110955148), and [this example code](https://gist.github.com/Amanieu/f87da5f0599b343c5500). # Alternatives [alternatives]: #alternatives -There have been several attempts to create `u128`/`i128` wrappers based on two `u64` values, but these can't match the performance of LLVM's native 128-bit integers. +There have been several attempts to create `u128`/`i128` wrappers based on two `u64` values, but these can't match the performance of LLVM's native 128-bit integers. For example LLVM is able to lower a 128-bit add into just 2 instructions on 64-bit platforms and 4 instructions on 32-bit platforms. 
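To make the comparison with two-`u64` wrappers concrete, here is a minimal sketch of such a wrapper and its addition. `U128Pair` and its methods are hypothetical names invented for this example; they are not taken from any existing crate. The addition is the usual "add the low halves, then add the high halves plus the carry" scheme, which is presumably what the two-instruction lowering on 64-bit platforms corresponds to.

```rust
/// Hypothetical wrapper: a 128-bit unsigned integer stored as two 64-bit halves.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct U128Pair {
    hi: u64,
    lo: u64,
}

impl U128Pair {
    /// Split a native `u128` into its high and low halves.
    fn from_native(v: u128) -> Self {
        U128Pair { hi: (v >> 64) as u64, lo: v as u64 }
    }

    /// Reassemble the native `u128` value.
    fn to_native(self) -> u128 {
        ((self.hi as u128) << 64) | (self.lo as u128)
    }

    /// Wrapping addition: add the low halves, then carry into the high halves.
    fn wrapping_add(self, other: Self) -> Self {
        let (lo, carry) = self.lo.overflowing_add(other.lo);
        let hi = self.hi.wrapping_add(other.hi).wrapping_add(carry as u64);
        U128Pair { hi, lo }
    }
}

fn main() {
    let a = U128Pair::from_native(u64::MAX as u128);
    let b = U128Pair::from_native(1);
    // The carry out of the low half must land in the high half.
    println!("{:?}", a.wrapping_add(b));
}
```

A library wrapper has to spell the carry out by hand like this, and depends on the optimiser to recognise the pattern; a native type lets the backend pick the instruction sequence directly.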
# Unresolved questions [unresolved]: #unresolved-questions From 4c4423f98e96362652decf13040fc1401ac5cbff Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Wed, 30 Mar 2016 22:26:19 +0100 Subject: [PATCH 0851/1195] Change function prototypes to Rust --- text/0000-int128.md | 40 ++++++++++++++++++---------------------- 1 file changed, 18 insertions(+), 22 deletions(-) diff --git a/text/0000-int128.md b/text/0000-int128.md index 7882f528b23..1e20368a5fc 100644 --- a/text/0000-int128.md +++ b/text/0000-int128.md @@ -40,28 +40,24 @@ The `compiler-rt` library that comes with LLVM only implements runtime library f Here is the list of functions that need to be implemented: -```c -// si_int = i32 -// su_int = u32 -// ti_int = i128 -// tu_int = u128 -ti_int __ashlti3(ti_int a, si_int b); -ti_int __ashrti3(ti_int a, si_int b); -ti_int __divti3(ti_int a, ti_int b); -ti_int __fixdfti(double a); -ti_int __fixsfti(float a); -tu_int __fixunsdfti(double a); -tu_int __fixunssfti(float a); -double __floattidf(ti_int a); -float __floattisf(ti_int a); -double __floatuntidf(tu_int a); -float __floatuntisf(tu_int a); -ti_int __lshrti3(ti_int a, si_int b); -ti_int __modti3(ti_int a, ti_int b); -ti_int __muloti4(ti_int a, ti_int b, int* overflow); -ti_int __multi3(ti_int a, ti_int b); -tu_int __udivti3(tu_int a, tu_int b); -tu_int __umodti3(tu_int a, tu_int b); +```rust +fn __ashlti3(a: i128, b: i32) -> i128; +fn __ashrti3(a: i128, b: i32) -> i128; +fn __divti3(a: i128, b: i128) -> i128; +fn __fixdfti(a: f64) -> i128; +fn __fixsfti(a: f32) -> i128; +fn __fixunsdfti(a: f64) -> u128; +fn __fixunssfti(a: f32) -> u128; +fn __floattidf(a: i128) -> f64; +fn __floattisf(a: i128) -> f32; +fn __floatuntidf(a: u128) -> f64; +fn __floatuntisf(a: u128) -> f32; +fn __lshrti3(a: i128, b: i32) -> i128; +fn __modti3(a: i128, b: i128) -> i128; +fn __muloti4(a: i128, b: i128, overflow: &mut i32) -> i128; +fn __multi3(a: i128, b: i128) -> i128; +fn __udivti3(a: u128, b: u128) -> u128; +fn __umodti3(a: 
u128, b: u128) -> u128; ``` Implementations of these functions will be written in Rust and will be included in libcore. From cbe92f4a3fbc956bd75dacf1ea2e14ef84d2bdb1 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 30 Mar 2016 17:50:47 -0700 Subject: [PATCH 0852/1195] RFC 1552 is {VecDeque,LinkedList}::contains --- ...s.md => 1552-contains-method-for-various-collections.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-contains-method-for-various-collections.md => 1552-contains-method-for-various-collections.md} (92%) diff --git a/text/0000-contains-method-for-various-collections.md b/text/1552-contains-method-for-various-collections.md similarity index 92% rename from text/0000-contains-method-for-various-collections.md rename to text/1552-contains-method-for-various-collections.md index 3a2d97f7395..07dab257fe7 100644 --- a/text/0000-contains-method-for-various-collections.md +++ b/text/1552-contains-method-for-various-collections.md @@ -1,7 +1,7 @@ -- Feature Name: contains_method_for_various_collections +- Feature Name: `contains_method_for_various_collections` - Start Date: 2016-03-16 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1552](https://github.com/rust-lang/rfcs/pull/1552) +- Rust Issue: [rust-lang/rust#32630](https://github.com/rust-lang/rust/issues/32630) # Summary [summary]: #summary From a4b68d4704088971afce23e83549212fde941f06 Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Thu, 31 Mar 2016 12:56:07 +0100 Subject: [PATCH 0853/1195] Explain why we can't use the compiler-rt functions or write them in C --- text/0000-int128.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-int128.md b/text/0000-int128.md index 1e20368a5fc..f5e5b57942e 100644 --- a/text/0000-int128.md +++ b/text/0000-int128.md @@ -60,7 +60,7 @@ fn __udivti3(a: u128, b: u128) -> u128; fn __umodti3(a: u128, b: u128) -> u128; ``` -Implementations of these functions will be 
written in Rust and will be included in libcore. +Implementations of these functions will be written in Rust and will be included in libcore. Note that it is not possible to write these functions in C or use the existing implementations in `compiler-rt` since the `__int128` type is not available in C on 32-bit platforms. ## Modifications to libcore From efa2e50e3a489e9ef2350a01066980d15bcd602f Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Thu, 31 Mar 2016 18:22:15 -0400 Subject: [PATCH 0854/1195] Rough draft --- ...0000-more-api-documentation-conventions.md | 266 ++++++++++++++++++ 1 file changed, 266 insertions(+) create mode 100644 text/0000-more-api-documentation-conventions.md diff --git a/text/0000-more-api-documentation-conventions.md b/text/0000-more-api-documentation-conventions.md new file mode 100644 index 00000000000..47b52584f43 --- /dev/null +++ b/text/0000-more-api-documentation-conventions.md @@ -0,0 +1,266 @@ +- Feature Name: More API Documentation Conventions +- Start Date: 2016-03-31 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +[RFC 505] introduced certain conventions around documenting Rust projects. This RFC supersedes +that one, thought it has the same aims: to describe how the Rust project should be documented, +and provide guidance for other Rust projects as well. + +This RFC will contain some similar text as RFC 505, so that we can have one RFC with the full +conventions. + +[RFC 505]: https://github.com/rust-lang/rfcs/blob/master/text/0505-api-comment-conventions.md + +# Motivation +[motivation]: #motivation + +Documentation is an extremely important part of any project. It’s important +that we have consistency in our documentation. + +For the most part, the RFC proposes guidelines that are already followed today, +but it tries to motivate and clarify them. + +# Detailed design +[design]: #detailed-design + +This RFC is large. 
Here’s a table of contents: + +* [Content](#content) + * [Summary sentence](#summary-sentence) + * [English](#english) +* [Form](#form) + * [Use line comments](#use-line-comments) + * [Using Markdown](#using-markdown) +* [Example](#example) + +## Content +[content]: #content + +These conventions relate to the contents of the documentation, the words themselves. + +### Summary sentence +[summary-sentence]: #summary-sentence + +In API documentation, the first line should be a single-line short sentence +providing a summary of the code. This line is used as a summary description +throughout Rustdoc’s output, so it’s a good idea to keep it short. + +The summary line should be written in third person singular present indicative +form. Basically, this means write “Returns” instead of “Return.” + +### English +[english]: #english + +This section applies to `rustc` and the standard library. + +All documentation is standardized on American English, with regards to +spelling, grammar, and punctuation conventions. Language changes over time, +so this doesn’t mean that there is always a correct answer to every grammar +question, but there is often some kind of formal consensus. + +One specific rule that comes up often: when quoting something for emphasis, +use a single quote, and put punctuation outside the quotes, ‘this’. When +quoting something at length, “use double quotes and put the punctuation +inside of the quote.” Most documentation will end up using single quotes, +so if you’re not sure, just stick with them. + +## Form +[form]: #form + +These conventions relate to the formatting of the documentation, how they +appear in source code. + +### Use line comments +[use-line-comments]: #use-line-comments + +Avoid block comments. Use line comments instead: + +```rust +// Wait for the main task to return, and set the process error code +// appropriately. +``` + +Instead of: + +```rust +/* + * Wait for the main task to return, and set the process error code + * appropriately. 
+ */ +``` + +Only use inner doc comments `//!` to write crate and module-level documentation, +nothing else. When using `mod` blocks, prefer `///` outside of the block: + +```rust +/// This module contains tests +mod test { + // ... +} +``` + +over + +```rust +mod test { + //! This module contains tests + + // ... +} +``` + +### Using Markdown +[using-markdown]: #using-markdown + +Within doc comments, use Markdown to format your documentation. + +Use top level headings # to indicate sections within your comment. Common headings: + +* Examples +* Panics +* Errors +* Safety +* Aborts +* Undefined Behavior + +Even if you only include one example, use the plural form: ‘Examples’ rather +than ‘Example’. Future tooling is easier this way. + +Use graves (`) to denote a code fragment within a sentence. + +Use triple graves (```) to write longer examples, like this: + + This code does something cool. + + ```rust + let x = foo(); + + x.bar(); + ``` + +When appropriate, make use of Rustdoc’s modifiers. Annotate triple grave blocks with +the appropriate formatting directive. + + ```rust + println!("Hello, world!"); + ``` + + ```ruby + puts "Hello" + ``` + +In API documentation, feel free to rely on the default being ‘rust’: + + /// For example: + /// + /// ``` + /// let x = 5; + /// ``` + +In long-form documentation, always be explicit: + + For example: + + ```rust + let x = 5; + ``` + +This will highlight syntax in places that do not default to ‘rust’, like GitHub. + +Rustdoc is able to test all Rust examples embedded inside of documentation, so +it’s important to mark what is not Rust so your tests don’t fail. + +References and citation should be linked ‘reference style.’ Prefer + +``` +[Rust website] + +[Rust website]: http://www.rust-lang.org +``` + +to + +``` +[Rust website](http://www.rust-lang.org) +``` + +### Examples in API docs +[examples-in-api-docs]: #examples-in-api-docs + +Everything should have examples. 
Here is an example of how to do examples: + +``` +/// # Examples +/// +/// Basic usage: +/// +/// ``` +/// use op; +/// +/// let s = "foo"; +/// let answer = op::compare(s, "bar"); +/// ``` +/// +/// Passing a closure to compare with, rather than a string: +/// +/// ``` +/// use op; +/// +/// let s = "foo"; +/// let answer = op::compare(s, |a| a.chars().is_whitespace().all()); +/// ``` +``` + +For particularly simple APIs, still say “Examples” and “Basic usage:” for +consistency’s sake. + +### Referring to types +[referring-to-types]: #referring-to-types + +When talking about a type, use its full name. In other words, if the type is generic, +say `Option`, not `Option`. An exception to this is lengthy bounds. Write `Cow<'a, B>` +rather than `Cow<'a, B> where B: 'a + ToOwned + ?Sized`. + +Another possibility is to write in lower case using a more generic term. In other words, +‘string’ can refer to a `String` or an `&str`, and ‘an option’ can be ‘an `Option`’. + +### Link all the things +[link-all-the-things]: #link-all-the-things + +A major drawback of Markdown is that it cannot automatically link types in API documentation. +Do this yourself with the reference-style syntax, for ease of reading: + +``` +/// The [`String`] passed in lorum ipsum... +/// +/// [`String`]: ../string/struct.String.html +``` + +## Example +[example]: #example + +Below is a full crate, with documentation following these rules: + +```rust +``` + +## Formatting + +# Drawbacks +[drawbacks]: #drawbacks + +It’s possible that RFC 505 went far enough, and something this detailed is inappropriate. + +# Alternatives +[alternatives]: #alternatives + +We could stick with the more minimal conventions of the previous RFC. + +# Unresolved questions +[unresolved]: #unresolved-questions + +None. 
From c3d44cd6bf91da54827e7c00206e9f5a51b8230f Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Thu, 31 Mar 2016 18:34:45 -0400 Subject: [PATCH 0855/1195] typo --- text/0000-more-api-documentation-conventions.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-more-api-documentation-conventions.md b/text/0000-more-api-documentation-conventions.md index 47b52584f43..9bb20a3b523 100644 --- a/text/0000-more-api-documentation-conventions.md +++ b/text/0000-more-api-documentation-conventions.md @@ -7,7 +7,7 @@ [summary]: #summary [RFC 505] introduced certain conventions around documenting Rust projects. This RFC supersedes -that one, thought it has the same aims: to describe how the Rust project should be documented, +that one, though it has the same aims: to describe how the Rust project should be documented, and provide guidance for other Rust projects as well. This RFC will contain some similar text as RFC 505, so that we can have one RFC with the full From 9c42f45387d979c3af1bf1afe4b33659ca7e1c63 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Fri, 1 Apr 2016 22:12:51 +1300 Subject: [PATCH 0856/1195] Procedural macros This RFC proposes an evolution of Rust's procedural macro system (aka syntax extensions, aka compiler plugins). This RFC specifies syntax for the definition of procedural macros, a high-level view of their implementation in the compiler, and outlines how they interact with the compilation process. At the highest level, macros are defined by implementing functions marked with a `#[macro]` attribute. Macros operate on a list of tokens provided by the compiler and return a list of tokens that the macro use is replaced by. We provide low-level facilities for operating on these tokens. Higher level facilities (e.g., for parsing tokens to an AST) should exist as library crates. 
--- text/0000-proc-macros.md | 417 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 417 insertions(+) create mode 100644 text/0000-proc-macros.md diff --git a/text/0000-proc-macros.md b/text/0000-proc-macros.md new file mode 100644 index 00000000000..429b47881c5 --- /dev/null +++ b/text/0000-proc-macros.md @@ -0,0 +1,417 @@ +- Feature Name: procedural_macros +- Start Date: 2016-02-15 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +This RFC proposes an evolution of Rust's procedural macro system (aka syntax +extensions, aka compiler plugins). This RFC specifies syntax for the definition +of procedural macros, a high-level view of their implementation in the compiler, +and outlines how they interact with the compilation process. + +This RFC specifies the architecture of the procedural macro system. It relies on +[RFC 1561](https://github.com/rust-lang/rfcs/pull/1561) which specifies the +naming and modularisation of macros. It leaves many of the details for further +RFCs, in particular the details of the APIs available to macro authors +(tentatively called `libmacro`). See this [blog post](http://ncameron.org/blog/libmacro/) +for some ideas of how that might look. + +At the highest level, macros are defined by implementing functions marked with +a `#[macro]` attribute. Macros operate on a list of tokens provided by the +compiler and return a list of tokens that the macro use is replaced by. We +provide low-level facilities for operating on these tokens. Higher level +facilities (e.g., for parsing tokens to an AST) should exist as library crates. + + +# Motivation +[motivation]: #motivation + +Procedural macros have long been a part of Rust and have been used for diverse +and interesting purposes, for example [compile-time regexes](https://github.com/rust-lang-nursery/regex), +[serialisation](https://github.com/serde-rs/serde), and +[design by contract](https://github.com/nrc/libhoare). 
They allow the ultimate +flexibility in syntactic abstraction, and offer possibilities for efficiently +using Rust in novel ways. + +Procedural macros are currently unstable and are awkward to define. We would +like to remedy this by implementing a new, simpler system for procedural macros, +and for this new system to be on the usual path to stabilisation. + +One major problem with the current system is that since it is based on ASTs, if +we change the Rust language (even in a backwards compatible way) we can easily +break procedural macros. Therefore, offering the usual backwards compatibility +guarantees to procedural macros would inhibit our ability to evolve the +language. By switching to a token-based (rather than AST-based) system, we hope +to avoid this problem. + +# Detailed design +[design]: #detailed-design + +There are two kinds of procedural macro: function-like and attribute-like. These two +kinds exist today, and other than naming (see +[RFC 1561](https://github.com/rust-lang/rfcs/pull/1561)) the syntax for using +these macros remains unchanged. If the macro is called `foo`, then a function- +like macro is used with syntax `foo!(...)`, and an attribute-like macro with +`#[foo(...)] ...`. Macros may be used in the same places as `macro_rules` macros +and this remains unchanged. + +To define a procedural macro, the programmer must write a function with a +specific signature and attribute. Where `foo` is the name of a function-like +macro: + +``` +#[macro] +pub fn foo(TokenStream, &mut MacroContext) -> TokenStream; +``` + +The first argument is the tokens between the delimiters in the macro use. +For example in `foo!(a, b, c)`, the first argument would be `[Ident(a), Comma, +Ident(b), Comma, Ident(c)]`. + +The value returned replaces the macro use. + +Attribute-like: + +``` +#[macro_attribute] +pub fn foo(Option<TokenStream>, TokenStream, &mut MacroContext) -> TokenStream; +``` + +The first argument is a list of the tokens between the delimiters in the macro +use.
Examples: + +* `#[foo]` => `None` +* `#[foo()]` => `Some([])` +* `#[foo(a, b, c)]` => `Some([Ident(a), Comma, Ident(b), Comma, Ident(c)])` + +The second argument is the tokens for the AST node the attribute is placed on. +Note that in order to compute the tokens to pass here, the compiler must be able +to parse the code the attribute is applied to. However, the AST for the node +passed to the macro is discarded, it is not passed to the macro nor used by the +compiler (in practice, this might not be 100% true due to optimisiations). If +the macro wants an AST, it must parse the tokens itself. + +The attribute and the AST node it is applied to are both replaced by the +returned tokens. In most cases, the tokens returned by a procedural macro will +be parsed by the compiler. It is the procedural macro's responsibility to ensure +that the tokens parse without error. In some cases, the tokens will be consumed +by another macro without parsing, in which case they do not need to parse. The +distinction is not statically enforced. It could be, but I don't think the +overhead would be justified. + +We also introduce a special configuration option: `#[cfg(macro)]`. Items with +this configuration are not macros themselves but are compiled only for macro +uses. + +Initially, it will only be legal to apply `#[cfg(macro)]` to a whole crate and +the `#[macro]` and `#[macro_attribute]` attributes may only appear within a +`#[cfg(macro)]` crate. This has the effect of partitioning crates into macro- +defining and non-macro defining crates. Macros may not be used in the crate in +which they are defined, although they may be called as regular functions. In the +future, I hope we can relax these restrictions so that macro and non-macro code +can live in the same crate. + +Importing macros for use means using `extern crate` to make the crate available +and then using `use` imports or paths to name macros, just like other items. 
+Again, see [RFC 1561](https://github.com/rust-lang/rfcs/pull/1561) for more +details. + +When a `#[cfg(macro)]` crate is `extern crate`ed, its items (even public ones) +are not available to the importing crate; only macros declared in that crate are. +The crate is dynamically linked with the compiler at compile-time, rather +than with the importing crate at runtime. + + +## Writing procedural macros + +Procedural macro authors should not use the compiler crates (libsyntax, etc.). +Using these will remain unstable. We will make available a new crate, libmacro, +which will follow the usual path to stabilisation, will be part of the Rust +distribution, and will be required to be used by procedural macros (because, at +the least, it defines the types used in the required signatures). + +The details of libmacro will be specified in a future RFC. In the meantime, this +[blog post](http://ncameron.org/blog/libmacro/) gives an idea of what it might +contain. + +The philosophy here is that libmacro will contain low-level tools for +constructing macros, dealing with tokens, hygiene, pattern matching, quasi- +quoting, interactions with the compiler, etc. For higher level abstractions +(such as parsing and an AST), macros should use external libraries (there are no +restrictions on `#[cfg(macro)]` crates using other crates). + +The `MacroContext` is an object passed to all procedural macro definitions. It +is the main entry point to the libmacro API and for interaction with the +compiler. Via the `MacroContext`, a procedural macro can access information +about the context in which it is used and defined, and perform operations which +rely on the state of the compiler. It will be more fully defined in the upcoming +RFC proposing libmacro. + +Rust macros are hygienic by default. Hygiene is a large and complex subject, but +to summarise: effectively, naming takes place in the context of the macro +definition, not the expanded macro.
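The hygiene summary above can be seen in miniature with an ordinary `macro_rules` macro, which is already hygienic for local variables. This is only an illustration of the naming rule, not of the proposed procedural macro API; `add_ten` and `hygiene_demo` are hypothetical names made up for the example.

```rust
// A macro that introduces its own local variable named `x`.
macro_rules! add_ten {
    ($e:expr) => {{
        // This `x` is named in the context of the macro definition.
        let x = 10;
        // Any `x` inside `$e` was named at the use site, so it still refers
        // to the caller's variable rather than to the `x` declared above.
        $e + x
    }};
}

fn hygiene_demo() -> i32 {
    let x = 1;
    // Without hygiene this would expand to `{ let x = 10; x + x }`.
    add_ten!(x)
}

fn main() {
    println!("{}", hygiene_demo());
}
```

Here the result is `1 + 10` rather than `10 + 10`, because the two identifiers spelled `x` carry different hygiene information even though they look identical in the expanded source.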
+ +Procedural macros often want to bend the rules around macro hygiene, for example +to make items or variables more widely nameable than they would be by default. +Procedural macros will be able to take part in the application of the hygiene +algorithm via libmacro. Again, full details must wait for the libmacro RFC and a +sketch is available in this [blog post](http://ncameron.org/blog/libmacro/). + + +## Tokens + +Procedural macros will primarily operate on tokens. There are two main benefits +to this principle: flexibility and future proofing. By operating on tokens, code +passed to procedural macros does not need to satisfy the Rust parser, only the +lexer. Stabilising an interface based on tokens means we need only commit to +not changing the rules around those tokens, not the whole grammar. I.e., it +allows us to change the Rust grammar without breaking procedural macros. + +In order to make the token-based interface even more flexible and future-proof, +I propose a simpler token abstraction than is currently used in the compiler. +The proposed system may be used directly in the compiler or may be an interface +wrapper over a more efficient representation. + +Since macro expansion will not operate purely on tokens, we must keep hygiene +information on tokens, rather than on `Ident` AST nodes (we might be able to +optimise by not keeping such info for all tokens, but that is an implementation +detail). We will also keep span information for each token, since that is where +a record of macro expansion is maintained (and it will make life easier for +tools. Again, we might optimise internally). + +A token is a single lexical element, for example, a numeric literal, a word +(which could be an identifier or keyword), a string literal, or a comment. + +A token stream is a sequence of tokens, e.g., `a b c;` is a stream of four +tokens - `['a', 'b', 'c', ';']`.
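The `a b c;` example can be reproduced with a toy lexer. The sketch below is deliberately simplified and is not the proposed libmacro interface: `SimpleToken` and `lex` are hypothetical names, only words and single-character punctuation are recognised, and spans, hygiene information, literals and comments are all left out.

```rust
/// Hypothetical, heavily simplified token type (no spans, no hygiene).
#[derive(Debug, Clone, PartialEq)]
enum SimpleToken {
    Word(String),
    Punctuation(char),
}

/// Split `src` into words and single-character punctuation, skipping whitespace.
fn lex(src: &str) -> Vec<SimpleToken> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            // Whitespace only separates tokens; it is not a token itself.
            chars.next();
        } else if c.is_alphanumeric() || c == '_' {
            // Greedily collect a run of word characters.
            let mut word = String::new();
            while let Some(&w) = chars.peek() {
                if w.is_alphanumeric() || w == '_' {
                    word.push(w);
                    chars.next();
                } else {
                    break;
                }
            }
            tokens.push(SimpleToken::Word(word));
        } else {
            // Everything else is treated as one punctuation character.
            tokens.push(SimpleToken::Punctuation(c));
            chars.next();
        }
    }
    tokens
}

fn main() {
    // `a b c;` lexes to three words followed by one punctuation token.
    println!("{:?}", lex("a b c;"));
}
```

Note that the lexer accepts input such as `a b c;` even though it is not a valid Rust statement, which is the flexibility argument made above: the input only has to satisfy the lexer, not the parser.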
+ +A token tree is a tree structure where each leaf node is a token and each +interior node is a token stream. I.e., a token stream which can contain nested +token streams. A token tree can be delimited, e.g., `a (b c);` will give +`TT(None, ['a', TT(Some('()'), ['b', 'c']), ';'])`. An undelimited token tree +is useful for grouping tokens due to expansion, without representation in the +source code. That could be used for unsafety hygiene, or to affect precedence +and parsing without affecting scoping. They also replace the interpolated AST +tokens currently in the compiler. + +In code: + +``` +// We might optimise this representation +pub struct TokenStream(Vec<TokenTree>); + +// A borrowed TokenStream +pub struct TokenSlice<'a>(&'a [TokenTree]); + +// A token or token tree. +pub struct TokenTree { + pub kind: TokenKind, + pub span: Span, + pub hygiene: HygieneObject, +} + +pub enum TokenKind { + Sequence(Delimiter, Vec<TokenTree>), + + // The content of the comment can be found from the span. + Comment(CommentKind), + // The Span is the span of the string itself, without delimiters. + String(Span, StringKind), + + // These tokens are treated specially since they are used for macro + // expansion or delimiting items. + Exclamation, // `!` + Dollar, // `$` + // Not actually sure if we need this or if semicolons can be treated like + // other punctuation. + Semicolon, // `;` + Eof, + + // Word is defined by Unicode Standard Annex 31 - + // [Unicode Identifier and Pattern Syntax](http://unicode.org/reports/tr31/) + Word(InternedString), + Punctuation(char), +} + +pub enum Delimiter { + None, + // { } + Brace, + // ( ) + Parenthesis, + // [ ] + Bracket, +} + +pub enum CommentKind { + Regular, + InnerDoc, + OuterDoc, +} + +pub enum StringKind { + Regular, + // usize is for the count of `#`s. + Raw(usize), + Byte, + RawByte(usize), +} +``` + + +## Staging + +1. Implement [RFC 1561](https://github.com/rust-lang/rfcs/pull/1561). +2.
Implement `#[macro]` and `#[cfg(macro)]` and the function approach to + defining macros. However, pass the existing data structures to the macros, + rather than tokens and `MacroContext`. +3. Implement libmacro and make this available to macros. At this stage both old + and new macros are available (functions with different signatures). This will + require an RFC and considerable refactoring of the compiler. +4. Implement some high-level macro facilities in external crates on top of + libmacro. It is hoped that much of this work will be community-led. +5. After some time to allow conversion, deprecate the old-style macros. Later, + remove old macros completely. + + +# Drawbacks +[drawbacks]: #drawbacks + +Procedural macros are a somewhat unpleasant corner of Rust at the moment. It is +hard to argue that some kind of reform is unnecessary. One could find fault with +this proposed reform in particular (see below for some alternatives). Some +drawbacks that come to mind: + +* providing such a low-level API risks never seeing good high-level libraries; +* the design is complex and thus will take some time to implement and stabilise, + meanwhile unstable procedural macros are a major pain point in current Rust; +* dealing with tokens and hygiene may discourage macro authors due to complexity, + hopefully that is addressed by library crates. + +The actual concept of procedural macros also has drawbacks: executing arbitrary +code in the compiler makes it vulnerable to crashes and possibly security issues, +macros can introduce hard-to-debug errors, macros can make a program hard to +comprehend, it risks creating de facto dialects of Rust and thus fragmentation +of the ecosystem, etc. + +# Alternatives +[alternatives]: #alternatives + +We could keep the existing system or remove procedural macros from Rust. + +We could have an AST-based (rather than token-based) system. This has major +backwards compatibility issues.
+ +We could allow plugging in at later stages of compilation, giving macros access +to type information, etc. This would allow some really interesting tools. +However, it has some large downsides - it complicates the whole compilation +process (not just the macro system), it pollutes the whole compiler with macro +knowledge, rather than containing it in the frontend, it complicates the design +of the interface between the compiler and macro, and (I believe) the use cases +are better addressed by compiler plug-ins or tools based on the compiler (the +latter can be written today, the former require more work on an interface to the +compiler to be practical). + +We could have a dedicated syntax for procedural macros, similar to the +`macro_rules` syntax for macros by example. Since a procedural macro is really +just a Rust function, I believe using a function is better. I have also not been +able to come up with (or seen suggestions for) a good alternative syntax. It +seems reasonable to expect to write Rust macros in Rust (although there is +nothing stopping a macro author from using FFI and some other language to write +part or all of a macro). + +For attribute-like macros on items, it would be nice if we could skip parsing +the annotated item until after macro expansion. That would allow for more +flexible macros, since the input would not be constrained to Rust syntax. However, +this would require identifying items from tokens, rather than from the AST, which +would require additional rules on token trees and may not be possible. + + +# Unresolved questions +[unresolved]: #unresolved-questions + +### macros with an extra identifier + +We currently allow procedural macros to take an extra ident after the macro name +and before the arguments, e.g., `foo! bar(...)` where `foo` is the macro name +and `bar` is the extra identifier. This is used for `macro_rules` and is useful +for macros which define classes of items, rather than instances of items.
E.g., +a `struct!` macro might be used similarly to the `struct` keyword. + +My feeling is that this macro form is not used enough to justify its existence. +From a design perspective, it encourages uses of macros for language extension, +rather than syntactic abstraction. I feel that such macros are at higher risk of +making programs incomprehensible and of fragmenting the ecosystem. + +Therefore, I would like to remove them from the language. Alternatively, they +could be incorporated into the new design by having another kind of macro +function: + +``` +#[macro_with_ident] +pub fn foo(&Token, TokenStream, &mut MacroContext) -> TokenStream; +``` + +where the first argument is the extra identifier. + + +### Linking model + +Currently, procedural macros are dynamically linked with the compiler. This +prevents the compiler being statically linked, which is sometimes desirable. An +alternative architecture would have procedural macros compiled as independent +programs and have them communicate with the compiler via IPC. + +This would have the advantage of allowing static linking for the compiler and +would prevent procedural macros from crashing the main compiler process. +However, designing a good IPC interface is complicated because there is a lot of +data that might be exchanged between the compiler and the macro. + +I think we could first design the syntax, interfaces, etc. and later evolve into +a process-separated model (if desired). However, if this is considered an +essential feature of macro reform, then we might want to consider the interfaces +more thoroughly with this in mind. + + +### Interactions with constant evaluation + +Both procedural macros and constant evaluation are mechanisms for running Rust +code at compile time. Currently, and under the proposed design, they are +considered completely separate features. There might be some benefit in letting +them interact.
+ + +### Inline procedural macros + +It would be nice to allow procedural macros to be defined in the crate in which +they are used, as well as in separate crates (mentioned above). This complicates +things since it breaks the invariant that a crate is designed to be used at +either compile-time or runtime. I leave it for the future. + + +### Specification of the macro definition function signatures + +As proposed, the signatures of functions used as macro definitions are hard- +wired into the compiler. It would be more flexible to allow them to be specified +by a lang-item. I'm not sure how beneficial this would be, since a change to the +signature would require changing much of the procedural macro system. I propose +leaving them hard-wired, unless there is a good use case for the more flexible +approach. + + +### Specifying delimiters + +Under this RFC, a function-like macro use may use either parentheses, braces, or +square brackets. The choice of delimiter does not affect the semantics of the +macro (the rules requiring braces or a semi-colon for macro uses in item position +still apply). + +Which delimiter was used should be available to the macro implementation via the +`MacroContext`. I believe this is maximally flexible - the macro implementation +can throw an error if it doesn't like the delimiters used. + +We might want to allow the compiler to restrict the delimiters. Alternatively, +we might want to hide the information about the delimiter from the macro author, +so as not to allow errors regarding delimiter choice to affect the user. From 5a3abd22df2a2fa940a4c5f96d328e3e99fd9c10 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 1 Apr 2016 14:21:00 +0200 Subject: [PATCH 0857/1195] Most of the changes suggested by feedback during FCP period. * Removed associated `Error` type from `Allocator` trait; all methods now use `AllocErr` for error type. Removed `AllocError` trait and `MemoryExhausted` error.
* Removed `fn max_size` and `fn max_align` methods; we can put them back later if someone demonstrates a need for them. * Added `fn realloc_in_place`. --- text/0000-kinds-of-allocators.md | 355 +++++++++---------------------- 1 file changed, 101 insertions(+), 254 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 762b57d6121..61b1b784c01 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -227,7 +227,8 @@ much we have allocated from the backing storage. sharing this demo allocator across scoped threads.) ```rust -struct DumbBumpPool { +#[derive(Debug)] +pub struct DumbBumpPool { name: &'static str, ptr: *mut u8, end: *mut u8, @@ -283,48 +284,7 @@ impl Drop for DumbBumpPool { } ``` -Now, before we get into the trait implementation itself, here is an -interesting simple design choice: - - * To show-off the error abstraction in the API, we make a special - error type that covers a third case that is not part of the - standard `enum AllocErr`. - -Specifically, our bump allocator has *three* error conditions that we -will expose: - - 1. the inputs could be invalid, - - 2. the memory could be exhausted, or, - - 3. there could be *interference* between two threads. - This latter scenario means that this allocator failed - on this memory request, but the client might - quite reasonably just *retry* the request. This is - an error condition specific to this allocator, so we - will identify it via a separate `fn is_transient` inherent - method. 
- -```rust -#[derive(Clone, PartialEq, Eq, Debug)] -pub enum BumpAllocError { - Invalid(&'static str), - MemoryExhausted(alloc::Layout), - Interference -} - -impl BumpAllocError { - pub fn is_transient(&self) -> bool { *self == BumpAllocError::Interference } -} - -impl alloc::AllocError for BumpAllocError { - fn invalid_input(details: &'static str) -> Self { BumpAllocError::Invalid(details) } - fn is_memory_exhausted(&self) -> bool { if let BumpAllocError::MemoryExhausted(_) = *self { true } else { false } } - fn is_request_unsupported(&self) -> bool { false } -} -``` - -With that out of the way, here are some other design choices of note: +Here are some other design choices of note: * Our Bump Allocator is going to use a most simple-minded deallocation policy: calls to `fn dealloc` are no-ops. Instead, every request takes @@ -352,29 +312,31 @@ unsafe impl Sync for DumbBumpPool { } Here is the demo implementation of `Allocator` for the type. ```rust -impl<'a> Allocator for &'a DumbBumpPool { - type Error = BumpAllocError; - - unsafe fn alloc(&mut self, layout: alloc::Layout) -> Result { - let curr = self.avail.load(Ordering::Relaxed) as usize; +unsafe impl<'a> Allocator for &'a DumbBumpPool { + unsafe fn alloc(&mut self, layout: alloc::Layout) -> Result { let align = *layout.align(); - let (sum, oflo) = curr.overflowing_add(align - 1); - let curr_aligned = sum & !(align - 1); let size = *layout.size(); - let remaining = (self.end as usize) - curr_aligned; - if oflo || remaining < size { - return Err(BumpAllocError::MemoryExhausted(layout.clone())); - } - let curr = curr as *mut u8; - let curr_aligned = curr_aligned as *mut u8; - let new_curr = curr_aligned.offset(size as isize); + loop { + let curr = self.avail.load(Ordering::Relaxed) as usize; + let (sum, oflo) = curr.overflowing_add(align - 1); + let curr_aligned = sum & !(align - 1); + let remaining = (self.end as usize) - curr_aligned; + if oflo || remaining < size { + return Err(AllocErr::Exhausted { request: 
layout.clone() }); + } - if curr != self.avail.compare_and_swap(curr, new_curr, Ordering::Relaxed) { - return Err(BumpAllocError::Interference); - } else { - println!("alloc finis ok: 0x{:x} size: {}", curr_aligned as usize, size); - return Ok(NonZero::new(curr_aligned)); + let curr = curr as *mut u8; + let curr_aligned = curr_aligned as *mut u8; + let new_curr = curr_aligned.offset(size as isize); + + // If the allocation attempt hits interference ... + if curr != self.avail.compare_and_swap(curr, new_curr, Ordering::Relaxed) { + continue; // .. then try again + } else { + // println!("alloc finis ok: 0x{:x} size: {}", curr_aligned as usize, size); + return Ok(NonZero::new(curr_aligned)); + } } } @@ -382,8 +344,10 @@ impl<'a> Allocator for &'a DumbBumpPool { // this bump-allocator just no-op's on dealloc } - fn oom(&mut self, err: Self::Error) -> ! { - panic!("exhausted memory in {} on request {:?}", self.name, err); + fn oom(&mut self, err: AllocErr) -> ! { + let remaining = self.end as usize - self.avail.load(Ordering::Relaxed) as usize; + panic!("exhausted memory in {} on request {:?} with avail: {}; self: {:?}", + self.name, err, remaining, self); } } @@ -795,40 +759,18 @@ of the preconditions hold. Finally, we get to object-oriented programming. -Since the `Allocator` trait has an associated error type, one -cannot just encode virtually-dispatched allocator objects with -`Box` or `&Allocator`; trait objects need to have -their associated types specified as part of the object trait. - In general, we expect allocator-parametric code to opt *not* to use trait objects to generalize over allocators, but instead to use generic types and instantiate those types with specific concrete allocators. -Nonetheless, it *is* an option to write `Box>`, or -`&Allocator`, when working with allocators that -use each corresponding error type. +Nonetheless, it *is* an option to write `Box` or `&Allocator`. 
* (The allocator methods that are not object-safe, like `fn alloc_one(&mut self)`, have a clause `where Self: Sized` to ensure that their presence does not cause the `Allocator` trait as a whole to become non-object-safe.) -To encourage client code that chooses to use trait objects for their -allocators to try to standardize on one choice of associated `Error` -type, we provide a convenience `type` definition for -[allocator objects][], `AllocatorObj`, which makes an opinionated -decision about which one of the "standard error types" is the "right -one" for such general purpose objects: namely, `AllocErr`, since it is -both cheap to construct but also can provide some amount of -context-sensitive information about the original cause of an -allocation error. - -However, the main point remains that we expect this object-oriented -usage of allocators to be rare. If this assumption turns out to be -incorrect, we should revisit these decisions before stabilizing the -allocator API (that would be the time to e.g. remove the associated -error type). ## Why this API [Why this API]: #why-this-api @@ -883,21 +825,12 @@ My hypothesis is that the standard allocator API should embrace `Result` as the standard way for describing local error conditions in Rust. -In principle, we can use `Result` without adding *any* additional -overhead (at least in terms of the size of the values being returned -from the allocation calls), because the error type for the `Result` -can be zero-sized if so desired. That is why the error is an -associated type of the `Allocator`: allocators that want to ensure the -results have minimum size can use the zero-sized `MemoryExhausted` type -as their associated `Self::Error`. - - * `MemoryExhausted` is a specific error type meant for allocators - that could in principle handle *any* sane input request, if there - were sufficient memory available. 
(By "sane" we mean for example - that the input arguments do not cause an arithmetic overflow during - computation of the size of the memory block -- if they do, then it - is reasonable for an allocator with this error type to respond that - insufficent memory was available, rather than e.g. panicking.) + * A previous version of this RFC attempted to ensure that the use of + the `Result` type could avoid any additional overhead over a raw + pointer return value, by using a `NonZero` address type and a + zero-sized error type attached to the trait via an associated + `Error` type. But during the RFC process we decided that this + was not necessary. ### Why return `Result` rather than directly `oom` on failure @@ -1247,13 +1180,6 @@ few motivating examples that *are* clearly feasible and useful. `Address` an abuse of the `NonZero` type? (Or do we just need some constructor for `NonZero` that asserts that the input is non-zero)? - * Should we get rid of the `AllocError` bound entirely? Is the given set - of methods actually worth providing to all generic clients? - - (Keeping it seems very low cost to me; implementors can always opt - to use the `MemoryExhausted` error type, which is cheap. But my - intuition may be wrong.) - * Do we need `Allocator::max_size` and `Allocator::max_align` ? * Should default impl of `Allocator::max_align` return `None`, or is @@ -1287,6 +1213,14 @@ few motivating examples that *are* clearly feasible and useful. * Revised `fn oom` method to take the `Self::Error` as an input (so that the allocator can, indirectly, feed itself information about what went wrong). +* Removed associated `Error` type from `Allocator` trait; all methods now use `AllocErr` + for error type. Removed `AllocError` trait and `MemoryExhausted` error. + +* Removed `fn max_size` and `fn max_align` methods; we can put them back later if + someone demonstrates a need for them. + +* Added `fn realloc_in_place`. 
+ # Appendices ## Bibliography @@ -1457,7 +1391,6 @@ sub-divided roughly accordingly to functionality. issue = "27700")] use core::cmp; -use core::fmt; use core::mem; use core::nonzero::NonZero; use core::ptr::{self, Unique}; @@ -1762,87 +1695,14 @@ impl Layout { ``` -### AllocError API -[error api]: #allocerror-api +### AllocErr API +[error api]: #allocerr-api ```rust -/// `AllocError` instances provide feedback about the cause of an allocation failure. -pub trait AllocError: fmt::Debug { - /// Construct an error that indicates operation failure due to - /// invalid input values for the request. - /// - /// This can be used, for example, to signal that allocation of - /// a zero-sized type was requested. - /// - /// As another example, it might be used to signal that an overflow - /// occurred during arithmetic computation with the input. (However, - /// since overflows can also occur during large allocation requests - /// that would exhaust memory if arbitrary-precision arithmetic were - /// used, clients are alternatively allowed to constuct an error - /// representing memory exhaustion in this scenario.) - fn invalid_input(details: &'static str) -> Self where Self: Sized; - - /// Returns true if the error is due to hitting some resource - /// limit, or otherwise running out of memory. This condition - /// serves as a hint that some series of deallocations *might* - /// allow a subsequent reissuing of the original allocation - /// request to succeed. - /// - /// Exhaustion is a common interpretation of an allocation failure; - /// e.g. usually when `malloc` returns `null`, it is because of - /// hitting a user resource limit or system memory exhaustion. - /// - /// Note that the resource exhaustion could be internal to the - /// original allocator (i.e. the only way to free up memory is by - /// deallocating memory attached to that allocator), or it could - /// be associated with some other state external to the original - /// allocator (e.g. 
freeing up memory or reducing fragmentation - /// globally might allow a call to the system `malloc` to succeed). - /// The `AllocError` trait does not distinguish between the two - /// scenarios (but instances of the associated `Allocator::Error` - /// type might provide ways to distinguish them). - /// - /// Finally, error responses to allocation input requests that are - /// *always* illegal for *any* allocator (e.g. zero-sized or - /// arithmetic-overflowing requests) are allowed to respond `true` - /// here. (This is to allow `MemoryExhausted` as a valid - /// zero-sized error type for an allocator that can handle all - /// "sane" requests.) - fn is_memory_exhausted(&self) -> bool; - - /// Returns true if the allocator is fundamentally incapable of - /// satisfying the original request. This condition implies that - /// such an allocation request would never succeed on *this* - /// allocator, regardless of environment, memory pressure, or - /// other contextual condtions. - /// - /// An example where this might arise: A block allocator that only - /// supports satisfying memory requests where each allocated block - /// is at most `K` bytes in size. - fn is_request_unsupported(&self) -> bool; -} - -/// The `MemoryExhausted` error represents a blanket condition -/// that the given request was not satisifed for some reason beyond -/// any particular limitations of a given allocator. -/// -/// It roughly corresponds to getting `null` back from a call to `malloc`: -/// you've probably exhausted memory (though there might be some other -/// explanation; see discussion with `AllocError::is_memory_exhausted`). -/// -/// Allocators that can in principle allocate any kind of legal input -/// might choose this as their associated error type. 
-#[derive(Copy, Clone, PartialEq, Eq, Debug)] -pub struct MemoryExhausted; - /// The `AllocErr` error specifies whether an allocation failure is /// specifically due to resource exhaustion or if it is due to /// something wrong when combining the given input arguments with this /// allocator. - -/// Allocators that only support certain classes of inputs might choose this -/// as their associated error type, so that clients can respond appropriately -/// to specific error failure scenarios. #[derive(Clone, PartialEq, Eq, Debug)] pub enum AllocErr { /// Error due to hitting some resource limit or otherwise running @@ -1859,24 +1719,23 @@ pub enum AllocErr { Unsupported { details: &'static str }, } -impl AllocError for MemoryExhausted { - fn invalid_input(_details: &'static str) -> Self { MemoryExhausted } - fn is_memory_exhausted(&self) -> bool { true } - fn is_request_unsupported(&self) -> bool { false } -} - -impl AllocError for AllocErr { - fn invalid_input(details: &'static str) -> Self { +impl AllocErr { + pub fn invalid_input(details: &'static str) -> Self { AllocErr::Unsupported { details: details } } - fn is_memory_exhausted(&self) -> bool { + pub fn is_memory_exhausted(&self) -> bool { if let AllocErr::Exhausted { .. } = *self { true } else { false } } - fn is_request_unsupported(&self) -> bool { + pub fn is_request_unsupported(&self) -> bool { if let AllocErr::Unsupported { .. } = *self { true } else { false } } } +/// The `CannotReallocInPlace` error is used when `fn realloc_in_place` +/// was unable to reuse the given memory block for a requested layout. +#[derive(Clone, PartialEq, Eq, Debug)] +pub struct CannotReallocInPlace; + ``` ### Allocator trait header @@ -1909,12 +1768,6 @@ impl AllocError for AllocErr { /// `usable_size`. /// pub unsafe trait Allocator { - /// When allocation requests cannot be satisified, an instance of - /// this error is returned. 
- /// - /// Many allocators will want to use the zero-sized - /// `MemoryExhausted` type for this. - type Error: AllocError; ``` @@ -1937,7 +1790,7 @@ pub unsafe trait Allocator { /// not a strict requirement. (Specifically: it is *legal* to use /// this trait to wrap an underlying native allocation library /// that aborts on memory exhaustion.) - unsafe fn alloc(&mut self, layout: Layout) -> Result; + unsafe fn alloc(&mut self, layout: Layout) -> Result; /// Deallocate the memory referenced by `ptr`. /// @@ -1959,7 +1812,7 @@ pub unsafe trait Allocator { /// instead they should return an appropriate error from the /// invoked method, and let the client decide whether to invoke /// this `oom` method. - fn oom(&mut self, _: Self::Error) -> ! { + fn oom(&mut self, _: AllocErr) -> ! { unsafe { ::core::intrinsics::abort() } } ``` @@ -1969,24 +1822,7 @@ pub unsafe trait Allocator { ```rust // == ALLOCATOR-SPECIFIC QUANTITIES AND LIMITS == - // max_size, max_align, usable_size - - /// The maximum requestable size in bytes for memory blocks - /// managed by this allocator. - /// - /// Returns `None` if this allocator has no explicit maximum size. - /// (Note that such allocators may well still have an *implicit* - /// maximum size; i.e. allocation requests can always fail.) - fn max_size(&self) -> Option { None } - - /// The maximum requestable alignment in bytes for memory blocks - /// managed by this allocator. - /// - /// Returns `None` if this allocator has no assigned maximum - /// alignment. (Note that such allocators may well still have an - /// *implicit* maximum alignment; i.e. allocation requests can - /// always fail.) - fn max_align(&self) -> Option { None } + // usable_size /// Returns bounds on the guaranteed usable size of a successful /// allocation created with the specified `layout`. @@ -2037,10 +1873,9 @@ pub unsafe trait Allocator { /// /// Behavior undefined if either of latter two constraints are unmet. 
/// - /// In addition, `new_layout` should not impose a stronger alignment + /// In addition, `new_layout` should not impose a different alignment /// constraint than `layout`. (In other words, `new_layout.align()` - /// must evenly divide `layout.align()`; note this implies the - /// alignment of `new_layout` must not exceed that of `layout`.) + /// should equal `layout.align()`.) /// However, behavior is well-defined (though underspecified) when /// this constraint is violated; further discussion below. /// @@ -2056,8 +1891,9 @@ pub unsafe trait Allocator { /// alignment of `layout`, or if reallocation otherwise fails. (Note /// that did not say "if and only if" -- in particular, an /// implementation of this method *can* return `Ok` if - /// `new_layout.align() > old_layout.align()`; or it can return `Err` - /// in that scenario.) + /// `new_layout.align() != old_layout.align()`; or it can return `Err` + /// in that scenario, depending on whether this allocator + /// can dynamically adjust the alignment constraint for the block.) /// /// If this method returns `Err`, then ownership of the memory /// block has not been transferred to this allocator, and the @@ -2065,7 +1901,7 @@ pub unsafe trait Allocator { unsafe fn realloc(&mut self, ptr: Address, layout: Layout, - new_layout: Layout) -> Result { + new_layout: Layout) -> Result { let (min, max) = self.usable_size(&layout); let s = new_layout.size(); // All Layout alignments are powers of two, so a comparison @@ -2087,7 +1923,7 @@ pub unsafe trait Allocator { /// Behaves like `fn alloc`, but also returns the whole size of /// the returned block. For some `layout` inputs, like arrays, this /// may include extra storage usable for additional data. 
- unsafe fn alloc_excess(&mut self, layout: Layout) -> Result { + unsafe fn alloc_excess(&mut self, layout: Layout) -> Result { let usable_size = self.usable_size(&layout); self.alloc(layout).map(|p| Excess(p, usable_size.1)) } @@ -2098,12 +1934,40 @@ pub unsafe trait Allocator { unsafe fn realloc_excess(&mut self, ptr: Address, layout: Layout, - new_layout: Layout) -> Result { + new_layout: Layout) -> Result { let usable_size = self.usable_size(&new_layout); self.realloc(ptr, layout, new_layout) .map(|p| Excess(p, usable_size.1)) } + /// Attempts to extend the allocation referenced by `ptr` to fit `new_layout`. + /// + /// * `ptr` must have previously been provided via this allocator. + /// + /// * `layout` must *fit* the `ptr` (see above). (The `new_layout` + /// argument need not fit it.) + /// + /// Behavior undefined if either of latter two constraints are unmet. + /// + /// If this returns `Ok`, then the allocator has asserted that the + /// memory block referenced by `ptr` now fits `new_layout`, and thus can + /// be used to carry data of that layout. (The allocator is allowed to + /// expend effort to accomplish this, such as extending the memory block to + /// include successor blocks, or virtual memory tricks.) + /// + /// If this returns `Err`, then the allocator has made no assertion + /// about whether the memory block referenced by `ptr` can or cannot + /// fit `new_layout`. + /// + /// In either case, ownership of the memory block referenced by `ptr` + /// has not been transferred, and the contents of the memory block + /// are unaltered. + unsafe fn realloc_in_place(&mut self, + ptr: Address, + layout: Layout, + new_layout: Layout) -> Result<(), CannotReallocInPlace> { + Err(CannotReallocInPlace) + } ``` ### Allocator convenience methods for common usage patterns @@ -2121,14 +1985,14 @@ pub unsafe trait Allocator { /// `alloc`/`realloc` methods of this allocator. /// /// Returns `Err` for zero-sized `T`. 
- unsafe fn alloc_one(&mut self) -> Result, Self::Error> + unsafe fn alloc_one(&mut self) -> Result, AllocErr> where Self: Sized { if let Some(k) = Layout::new::() { self.alloc(k).map(|p|Unique::new(*p as *mut T)) } else { // (only occurs for zero-sized T) debug_assert!(mem::size_of::() == 0); - Err(Self::Error::invalid_input("zero-sized type invalid for alloc_one")) + Err(AllocErr::invalid_input("zero-sized type invalid for alloc_one")) } } @@ -2154,11 +2018,11 @@ pub unsafe trait Allocator { /// `alloc`/`realloc` methods of this allocator. /// /// Returns `Err` for zero-sized `T` or `n == 0`. - unsafe fn alloc_array(&mut self, n: usize) -> Result, Self::Error> + unsafe fn alloc_array(&mut self, n: usize) -> Result, AllocErr> where Self: Sized { match Layout::array::(n) { Some(layout) => self.alloc(layout).map(|p|Unique::new(*p as *mut T)), - None => Err(Self::Error::invalid_input("invalid layout for alloc_array")), + None => Err(AllocErr::invalid_input("invalid layout for alloc_array")), } } @@ -2173,28 +2037,28 @@ pub unsafe trait Allocator { unsafe fn realloc_array(&mut self, ptr: Unique, n_old: usize, - n_new: usize) -> Result, Self::Error> + n_new: usize) -> Result, AllocErr> where Self: Sized { let old_new_ptr = (Layout::array::(n_old), Layout::array::(n_new), *ptr); if let (Some(k_old), Some(k_new), ptr) = old_new_ptr { self.realloc(NonZero::new(ptr as *mut u8), k_old, k_new) .map(|p|Unique::new(*p as *mut T)) } else { - Err(Self::Error::invalid_input("invalid layout for realloc_array")) + Err(AllocErr::invalid_input("invalid layout for realloc_array")) } } /// Deallocates a block suitable for holding `n` instances of `T`. /// /// Captures a common usage pattern for allocators. 
- unsafe fn dealloc_array(&mut self, ptr: Unique, n: usize) -> Result<(), Self::Error> + unsafe fn dealloc_array(&mut self, ptr: Unique, n: usize) -> Result<(), AllocErr> where Self: Sized { let raw_ptr = NonZero::new(*ptr as *mut u8); if let Some(k) = Layout::array::(n) { self.dealloc(raw_ptr, k); Ok(()) } else { - Err(Self::Error::invalid_input("invalid layout for dealloc_array")) + Err(AllocErr::invalid_input("invalid layout for dealloc_array")) } } @@ -2320,20 +2184,3 @@ pub unsafe trait Allocator { } } ``` - -### Allocator trait objects -[allocator objects]: #allocator-trait-objects - -```rust -/// `AllocatorObj` is a convenience for making allocator trait objects -/// such as `Box` or `&AllocatorObj`. (One cannot just -/// write `Box` because the one must specify the associated -/// error type as part of the trait object. -/// -/// Since one is pays the cost of virtual function dispatch when -/// calling methods on trait objects, this definition uses `AllocErr` -/// to encode more information when signalling errors in these -/// objects, rather than using the content-impoverished -/// `MemoryExhausted` error type for the associated error type. -pub type AllocatorObj = Allocator; -``` From 7c2c4446129d1874e6d86a3125b31bcc8821e271 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 1 Apr 2016 15:25:28 +0200 Subject: [PATCH 0858/1195] Allow `Layout` to represent zero-sized layouts. Removed uses of `NonZero`. Revised specifications of (hopefully all) relevant methods to indicate that they may or may not support allocation of zero-sized layouts (but they should return an appropriate `Err` when given such). (Now, the requirement to return an `Err` does imply a branch that arguably we would like to avoid. It would be good to double-check this, and potentially try to inline the initial checks into the call site.) 
--- text/0000-kinds-of-allocators.md | 166 ++++++++++++++++--------------- 1 file changed, 86 insertions(+), 80 deletions(-) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 61b1b784c01..4146f6ceb2c 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -69,8 +69,8 @@ following: ## Allocators should feel "rustic" In addition, for Rust we want an allocator API design that leverages -the core type machinery and language idioms (e.g. using `Result`, with -a `NonZero` okay variant and a zero-sized error variant), and provides +the core type machinery and language idioms (e.g. using `Result` to +propagate dynamic error conditions), and provides premade functions for common patterns for allocator clients (such as allocating either single instances of a type, or arrays of some types of dynamically-determined length). @@ -314,11 +314,12 @@ Here is the demo implementation of `Allocator` for the type. ```rust unsafe impl<'a> Allocator for &'a DumbBumpPool { unsafe fn alloc(&mut self, layout: alloc::Layout) -> Result { - let align = *layout.align(); - let size = *layout.size(); + let align = layout.align(); + let size = layout.size(); + let mut curr_addr = self.avail.load(Ordering::Relaxed); loop { - let curr = self.avail.load(Ordering::Relaxed) as usize; + let curr = curr_addr as usize; let (sum, oflo) = curr.overflowing_add(align - 1); let curr_aligned = sum & !(align - 1); let remaining = (self.end as usize) - curr_aligned; @@ -326,16 +327,17 @@ unsafe impl<'a> Allocator for &'a DumbBumpPool { return Err(AllocErr::Exhausted { request: layout.clone() }); } - let curr = curr as *mut u8; let curr_aligned = curr_aligned as *mut u8; let new_curr = curr_aligned.offset(size as isize); + let attempt = self.avail.compare_and_swap(curr_addr, new_curr, Ordering::Relaxed); // If the allocation attempt hits interference ... 
- if curr != self.avail.compare_and_swap(curr, new_curr, Ordering::Relaxed) { + if curr_addr != attempt { + curr_addr = attempt; continue; // .. then try again } else { - // println!("alloc finis ok: 0x{:x} size: {}", curr_aligned as usize, size); - return Ok(NonZero::new(curr_aligned)); + println!("alloc finis ok: 0x{:x} size: {}", curr_aligned as usize, size); + return Ok(curr_aligned); } } } @@ -674,11 +676,6 @@ worth hard-coding into the method signatures. * Therefore, I made [type aliases][] for `Size`, `Capacity`, `Alignment`, and `Address`. -Furthermore, all values of the above types must be non-zero for any -allocation action to make sense. - - * Therefore, I made them instances of the `NonZero` type. - ### Basic implementation An instance of an allocator has many methods, but an implementor of @@ -1221,6 +1218,9 @@ few motivating examples that *are* clearly feasible and useful. * Added `fn realloc_in_place`. +* Removed uses of `NonZero`. Made `Layout` able to represent zero-sized layouts. + A given `Allocator` may or may not support zero-sized layouts. + # Appendices ## Bibliography @@ -1401,11 +1401,11 @@ use core::ptr::{self, Unique}; [type aliases]: #type-aliases ```rust -pub type Size = NonZero; -pub type Capacity = NonZero; -pub type Alignment = NonZero; +pub type Size = usize; +pub type Capacity = usize; +pub type Alignment = usize; -pub type Address = NonZero<*mut u8>; +pub type Address = *mut u8; /// Represents the combination of a starting address and /// a total capacity of the returned block. @@ -1426,8 +1426,7 @@ fn size_align() -> (usize, usize) { /// An instance of `Layout` describes a particular layout of memory. /// You build a `Layout` up as an input to give to an allocator. /// -/// All layouts have an associated positive size; note that this implies -/// zero-sized types have no corresponding layout. +/// All layouts have an associated non-negative size and positive alignment. 
#[derive(Clone, Debug, PartialEq, Eq)] pub struct Layout { // size of the requested block of memory, measured in bytes. @@ -1450,38 +1449,29 @@ pub struct Layout { impl Layout { // (private constructor) fn from_size_align(size: usize, align: usize) -> Layout { - assert!(align.is_power_of_two()); - let size = unsafe { assert!(size > 0); NonZero::new(size) }; - let align = unsafe { assert!(align > 0); NonZero::new(align) }; + assert!(align.is_power_of_two()); + assert!(align > 0); Layout { size: size, align: align } } /// The minimum size in bytes for a memory block of this layout. - pub fn size(&self) -> NonZero { self.size } + pub fn size(&self) -> usize { self.size } /// The minimum byte alignment for a memory block of this layout. - pub fn align(&self) -> NonZero { self.align } + pub fn align(&self) -> usize { self.align } /// Constructs a `Layout` suitable for holding a value of type `T`. - /// Returns `None` if no such layout exists (e.g. for zero-sized `T`). - pub fn new() -> Option { + pub fn new() -> Self { let (size, align) = size_align::(); - if size > 0 { Some(Layout::from_size_align(size, align)) } else { None } + Layout::from_size_align(size, align) } /// Produces layout describing a record that could be used to /// allocate backing structure for `T` (which could be a trait /// or other unsized type like a slice). - /// - /// Returns `None` when no such layout exists; for example, when `x` - /// is a reference to a zero-sized type. - pub fn for_value(t: &T) -> Option { + pub fn for_value(t: &T) -> Self { let (size, align) = (mem::size_of_val(t), mem::align_of_val(t)); - if size > 0 { - Some(Layout::from_size_align(size, align)) - } else { - None - } + Layout::from_size_align(size, align) } /// Creates a layout describing the record that can hold a value @@ -1499,8 +1489,8 @@ impl Layout { if align > self.align { let pow2_align = align.checked_next_power_of_two().unwrap(); debug_assert!(pow2_align > 0); // (this follows from self.align > 0...) 
- Layout { align: unsafe { NonZero::new(pow2_align) }, - ..*self } + Layout { align: pow2_align, + ..*self } } else { self.clone() } @@ -1518,9 +1508,9 @@ impl Layout { /// whole record, because `self.align` would not provide /// sufficient constraint. pub fn padding_needed_for(&self, align: Alignment) -> usize { - debug_assert!(*align <= *self.align()); - let len = *self.size(); - let len_rounded_up = (len + *align - 1) & !(*align - 1); + debug_assert!(align <= self.align()); + let len = self.size(); + let len_rounded_up = (len + align - 1) & !(align - 1); return len_rounded_up - len; } @@ -1531,9 +1521,8 @@ impl Layout { /// layout of the array and `offs` is the distance between the start /// of each element in the array. /// - /// On zero `n` or arithmetic overflow, returns `None`. + /// On arithmetic overflow, returns `None`. pub fn repeat(&self, n: usize) -> Option<(Self, usize)> { - if n == 0 { return None; } let padded_size = match self.size.checked_add(self.padding_needed_for(self.align)) { None => return None, Some(padded_size) => padded_size, @@ -1542,7 +1531,7 @@ impl Layout { None => return None, Some(alloc_size) => alloc_size, }; - Some((Layout::from_size_align(alloc_size, *self.align), padded_size)) + Some((Layout::from_size_align(alloc_size, self.align), padded_size)) } /// Creates a layout describing the record for `self` followed by @@ -1557,24 +1546,24 @@ impl Layout { /// /// On arithmetic overflow, returns `None`. 
pub fn extend(&self, next: Self) -> Option<(Self, usize)> { - let new_align = unsafe { NonZero::new(cmp::max(*self.align, *next.align)) }; + let new_align = cmp::max(self.align, next.align); let realigned = Layout { align: new_align, ..*self }; let pad = realigned.padding_needed_for(new_align); - let offset = *self.size() + pad; - let new_size = offset + *next.size(); - Some((Layout::from_size_align(new_size, *new_align), offset)) + let offset = self.size() + pad; + let new_size = offset + next.size(); + Some((Layout::from_size_align(new_size, new_align), offset)) } /// Creates a layout describing the record for `n` instances of /// `self`, with no padding between each instance. /// - /// On zero `n` or overflow, returns `None`. + /// On arithmetic overflow, returns `None`. pub fn repeat_packed(&self, n: usize) -> Option { let scaled = match self.size().checked_mul(n) { None => return None, Some(scaled) => scaled, }; - let size = unsafe { assert!(scaled > 0); NonZero::new(scaled) }; + let size = { assert!(scaled > 0); scaled }; Some(Layout { size: size, align: self.align }) } @@ -1594,12 +1583,11 @@ impl Layout { /// /// On arithmetic overflow, returns `None`. pub fn extend_packed(&self, next: Self) -> Option<(Self, usize)> { - let new_size = match self.size().checked_add(*next.size()) { + let new_size = match self.size().checked_add(next.size()) { None => return None, Some(new_size) => new_size, }; - let new_size = unsafe { NonZero::new(new_size) }; - Some((Layout { size: new_size, ..*self }, *self.size())) + Some((Layout { size: new_size, ..*self }, self.size())) } // Below family of methods *assume* inputs are pre- or @@ -1611,7 +1599,6 @@ impl Layout { // methods are `unsafe`. /// Creates layout describing the record for a single instance of `T`. - /// Requires `T` has non-zero size. 
pub unsafe fn new_unchecked() -> Self { let (size, align) = size_align::(); Layout::from_size_align(size, align) @@ -1676,7 +1663,7 @@ impl Layout { /// On zero `n`, zero-sized `T`, or arithmetic overflow, returns `None`. pub fn array(n: usize) -> Option { Layout::new::() - .and_then(|k| k.repeat(n)) + .repeat(n) .map(|(k, offs)| { debug_assert!(offs == mem::size_of::()); k @@ -1716,6 +1703,9 @@ pub enum AllocErr { /// such an allocation request will never succeed on the given /// allocator, regardless of environment, memory pressure, or /// other contextual condtions. + /// + /// For example, an allocator that does not support zero-sized + /// blocks can return this error variant. Unsupported { details: &'static str }, } @@ -1913,7 +1903,7 @@ pub unsafe trait Allocator { let old_size = layout.size(); let result = self.alloc(new_layout); if let Ok(new_ptr) = result { - ptr::copy(*ptr as *const u8, *new_ptr, cmp::min(*old_size, *new_size)); + ptr::copy(ptr as *const u8, new_ptr, cmp::min(old_size, new_size)); self.dealloc(ptr, layout); } result @@ -1966,6 +1956,7 @@ pub unsafe trait Allocator { ptr: Address, layout: Layout, new_layout: Layout) -> Result<(), CannotReallocInPlace> { + let (_, _, _) = (ptr, layout, new_layout); Err(CannotReallocInPlace) } ``` @@ -1984,14 +1975,13 @@ pub unsafe trait Allocator { /// The returned block is suitable for passing to the /// `alloc`/`realloc` methods of this allocator. /// - /// Returns `Err` for zero-sized `T`. + /// May return `Err` for zero-sized `T`. unsafe fn alloc_one(&mut self) -> Result, AllocErr> where Self: Sized { - if let Some(k) = Layout::new::() { + let k = Layout::new::(); + if k.size() > 0 { self.alloc(k).map(|p|Unique::new(*p as *mut T)) } else { - // (only occurs for zero-sized T) - debug_assert!(mem::size_of::() == 0); Err(AllocErr::invalid_input("zero-sized type invalid for alloc_one")) } } @@ -2006,8 +1996,8 @@ pub unsafe trait Allocator { /// Captures a common usage pattern for allocators. 
unsafe fn dealloc_one(&mut self, mut ptr: Unique) where Self: Sized { - let raw_ptr = NonZero::new(ptr.get_mut() as *mut T as *mut u8); - self.dealloc(raw_ptr, Layout::new::().unwrap()); + let raw_ptr = ptr.get_mut() as *mut T as *mut u8; + self.dealloc(raw_ptr, Layout::new::()); } /// Allocates a block suitable for holding `n` instances of `T`. @@ -2017,12 +2007,20 @@ pub unsafe trait Allocator { /// The returned block is suitable for passing to the /// `alloc`/`realloc` methods of this allocator. /// - /// Returns `Err` for zero-sized `T` or `n == 0`. + /// May return `Err` for zero-sized `T` or `n == 0`. + /// + /// Always returns `Err` on arithmetic overflow. unsafe fn alloc_array(&mut self, n: usize) -> Result, AllocErr> where Self: Sized { match Layout::array::(n) { - Some(layout) => self.alloc(layout).map(|p|Unique::new(*p as *mut T)), - None => Err(AllocErr::invalid_input("invalid layout for alloc_array")), + Some(ref layout) if layout.size() > 0 => { + self.alloc(layout.clone()) + .map(|p| { + println!("alloc_array layout: {:?} yielded p: {:?}", layout, p); + Unique::new(p as *mut T) + }) + } + _ => Err(AllocErr::invalid_input("invalid layout for alloc_array")), } } @@ -2034,17 +2032,23 @@ pub unsafe trait Allocator { /// /// The returned block is suitable for passing to the /// `alloc`/`realloc` methods of this allocator. + /// + /// May return `Err` for zero-sized `T` or `n == 0`. + /// + /// Always returns `Err` on arithmetic overflow. 
unsafe fn realloc_array(&mut self, ptr: Unique, n_old: usize, n_new: usize) -> Result, AllocErr> where Self: Sized { - let old_new_ptr = (Layout::array::(n_old), Layout::array::(n_new), *ptr); - if let (Some(k_old), Some(k_new), ptr) = old_new_ptr { - self.realloc(NonZero::new(ptr as *mut u8), k_old, k_new) - .map(|p|Unique::new(*p as *mut T)) - } else { - Err(AllocErr::invalid_input("invalid layout for realloc_array")) + match (Layout::array::(n_old), Layout::array::(n_new), *ptr) { + (Some(ref k_old), Some(ref k_new), ptr) if k_old.size() > 0 && k_new.size() > 0 => { + self.realloc(ptr as *mut u8, k_old.clone(), k_new.clone()) + .map(|p|Unique::new(p as *mut T)) + } + _ => { + Err(AllocErr::invalid_input("invalid layout for realloc_array")) + } } } @@ -2053,12 +2057,14 @@ pub unsafe trait Allocator { /// Captures a common usage pattern for allocators. unsafe fn dealloc_array(&mut self, ptr: Unique, n: usize) -> Result<(), AllocErr> where Self: Sized { - let raw_ptr = NonZero::new(*ptr as *mut u8); - if let Some(k) = Layout::array::(n) { - self.dealloc(raw_ptr, k); - Ok(()) - } else { - Err(AllocErr::invalid_input("invalid layout for dealloc_array")) + let raw_ptr = *ptr as *mut u8; + match Layout::array::(n) { + Some(ref k) if k.size() > 0 => { + Ok(self.dealloc(raw_ptr, k.clone())) + } + _ => { + Err(AllocErr::invalid_input("invalid layout for dealloc_array")) + } } } @@ -2166,7 +2172,7 @@ pub unsafe trait Allocator { let (k_old, k_new, ptr) = (Layout::array_unchecked::(n_old), Layout::array_unchecked::(n_new), *ptr); - self.realloc_unchecked(NonZero::new(ptr as *mut u8), k_old, k_new) + self.realloc_unchecked(ptr as *mut u8, k_old, k_new) .map(|p|Unique::new(*p as *mut T)) } @@ -2180,7 +2186,7 @@ pub unsafe trait Allocator { unsafe fn dealloc_array_unchecked(&mut self, ptr: Unique, n: usize) where Self: Sized { let layout = Layout::array_unchecked::(n); - self.dealloc(NonZero::new(*ptr as *mut u8), layout); + self.dealloc(*ptr as *mut u8, layout); } } ``` From 
117e5fca0988a1b0c4b6e4de83b633f92b0c9c84 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Fri, 1 Apr 2016 15:57:47 +0200 Subject: [PATCH 0859/1195] Add mention of where `fn oom` should go. --- text/0000-kinds-of-allocators.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/text/0000-kinds-of-allocators.md b/text/0000-kinds-of-allocators.md index 4146f6ceb2c..441ecf761e3 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/0000-kinds-of-allocators.md @@ -1162,6 +1162,12 @@ few motivating examples that *are* clearly feasible and useful. over the underlying system allocator, while the convenience methods would truly be convenient.) + * Should `oom` be a free-function rather than a method on `Allocator`? + (The reason I want it on `Allocator` is so that it can provide feedback + about the allocator's state at the time of the OOM. Zoxc has argued + on the RFC thread that some forms of static analysis, to prove `oom` is + never invoked, would prefer it to be a free function.) + # Unresolved questions [unresolved]: #unresolved-questions From 80265328e86aaae96d2dca96c6090ce6b1a54098 Mon Sep 17 00:00:00 2001 From: Josh Triplett Date: Sat, 2 Apr 2016 15:46:17 -0700 Subject: [PATCH 0860/1195] Make unions that want C layout use #[repr(C)] explicitly --- text/0000-union.md | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/text/0000-union.md b/text/0000-union.md index 21b413b1b9a..ab7ef1b117b 100644 --- a/text/0000-union.md +++ b/text/0000-union.md @@ -54,7 +54,8 @@ union MyUnion { } ``` -`union` implies `#[repr(C)]` as the default representation. +By default, a union uses an unspecified binary layout. A union declared with +the `#[repr(C)]` attribute will have the same layout as an equivalent C union. ## Contextual keyword @@ -139,6 +140,7 @@ allows matching on the tag and the corresponding field simultaneously: #[repr(u32)] enum Tag { I, F } +#[repr(C)] union U { i: i32, f: f32, @@ -253,12 +255,14 @@ invalid value. 
## Union size and alignment -A union must have the same size and alignment as an equivalent C union -declaration for the target platform. Typically, a union would have the maximum -size of any of its fields, and the maximum alignment of any of its fields. -Note that those maximums may come from different fields; for instance: +A union declared with `#[repr(C)]` must have the same size and alignment as an +equivalent C union declaration for the target platform. Typically, a union +would have the maximum size of any of its fields, and the maximum alignment of +any of its fields. Note that those maximums may come from different fields; +for instance: ```rust +#[repr(C)] union U { f1: u16, f2: [u8; 4], From 7f40a6bd5858c035b859f0003ae2eed37f744905 Mon Sep 17 00:00:00 2001 From: Josh Triplett Date: Sat, 2 Apr 2016 15:49:48 -0700 Subject: [PATCH 0861/1195] Mention impl syntax explicitly --- text/0000-union.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/text/0000-union.md b/text/0000-union.md index ab7ef1b117b..fdffc2558be 100644 --- a/text/0000-union.md +++ b/text/0000-union.md @@ -237,7 +237,8 @@ a field, should cause the compiler to treat the entire union as initialized. ## Unions and traits -A union may have trait implementations, using the same syntax as a struct. +A union may have trait implementations, using the same `impl` syntax as a +struct. The compiler should provide a lint if a union field has a type that implements the `Drop` trait. 
The compiler may optionally provide a pragma to disable that From b2e030930356b31d2d9439f9e9cfde4b4b88fc84 Mon Sep 17 00:00:00 2001 From: Josh Triplett Date: Sat, 2 Apr 2016 16:07:41 -0700 Subject: [PATCH 0862/1195] Discuss generic union --- text/0000-union.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/text/0000-union.md b/text/0000-union.md index fdffc2558be..10da05a9146 100644 --- a/text/0000-union.md +++ b/text/0000-union.md @@ -246,6 +246,17 @@ lint, for code that intentionally stores a type with Drop in a union. The compiler must never implicitly generate a Drop implementation for the union itself, though Rust code may explicitly implement Drop for a union type. +## Generic unions + +A union may have a generic type, with one or more type parameters or lifetime +parameters. As with a generic enum, the types within the union must make use +of all the parameters; however, not all fields within the union must use all +parameters. + +Type inference works on generic union types. In some cases, the compiler may +not have enough information to infer the parameters of a generic type, and may +require explicitly specifying them. + ## Unions and undefined behavior Rust code must not use unions to invoke [undefined From 3f456838f01fce474b27309bb29ce3653f25d081 Mon Sep 17 00:00:00 2001 From: Josh Triplett Date: Sat, 2 Apr 2016 16:08:46 -0700 Subject: [PATCH 0863/1195] Prohibit empty union declarations --- text/0000-union.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/text/0000-union.md b/text/0000-union.md index 10da05a9146..1c264b3fb8d 100644 --- a/text/0000-union.md +++ b/text/0000-union.md @@ -57,6 +57,9 @@ union MyUnion { By default, a union uses an unspecified binary layout. A union declared with the `#[repr(C)]` attribute will have the same layout as an equivalent C union. +A union must have at least one field; an empty union declaration produces a +syntax error. 
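As a rough illustration of the declaration and layout rules described above, the following sketch declares a `#[repr(C)]` union and inspects its size and alignment. The type name `IntOrBytes`, its fields, and the helper functions are hypothetical and chosen only for this example; the sketch assumes the `union` syntax proposed in this RFC, and the concrete numbers assume a typical target where `u16` is 2-byte aligned.

```rust
use std::mem;

// Hypothetical example type; the name and fields are illustrative only.
// `#[repr(C)]` requests the same layout as the equivalent C union.
#[repr(C)]
pub union IntOrBytes {
    pub int: u16,
    pub bytes: [u8; 4],
}

/// Returns `(size, alignment)` of the union, in bytes.
/// The size comes from the largest field and the alignment from the
/// most strictly aligned field; these may be different fields.
pub fn union_layout() -> (usize, usize) {
    (mem::size_of::<IntOrBytes>(), mem::align_of::<IntOrBytes>())
}

/// Writes the `int` field and reads the same field back.
pub fn round_trip(value: u16) -> u16 {
    let u = IntOrBytes { int: value };
    // Reading a union field is unsafe; here we only read the field
    // that was just written, so no reinterpretation takes place.
    unsafe { u.int }
}
```

Calling `union_layout()` on such a target should report the size of `bytes` together with the alignment of `int`, and `round_trip(v)` returns `v` unchanged.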
+ ## Contextual keyword Rust normally prevents the use of a keyword as an identifier; for instance, a From f73d084b617e2e08703cefa2db12ab354fdedf1f Mon Sep 17 00:00:00 2001 From: "Ryan Scheel (Havvy)" Date: Sun, 3 Apr 2016 02:24:42 +0000 Subject: [PATCH 0864/1195] Explain that RFCs can be changed based on feedback in Process. --- README.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/README.md b/README.md index a46080d6075..cc04fd9bb25 100644 --- a/README.md +++ b/README.md @@ -182,6 +182,11 @@ stakeholders to discuss the issues in greater detail. * The sub-team will discuss the RFC PR, as much as possible in the comment thread of the PR itself. Offline discussion will be summarized on the PR comment thread. +* RFCs rarely go through this process unchanged, especially as alternatives and +drawbacks are shown. You can make edits, big and small, to the RFC to +clarify or change the design, but make changes as new commits to the PR, and +leave a comment on the PR explaining your changes. Specifically, do not squash +or rebase commits after they are visible on the PR. * Once both proponents and opponents have clarified and defended positions and the conversation has settled, the RFC will enter its *final comment period* (FCP). This is a final opportunity for the community to comment on the PR and is From 2af2202a235ce8659265c020d94890d616682d95 Mon Sep 17 00:00:00 2001 From: Robin Stocker Date: Mon, 4 Apr 2016 10:46:52 +1000 Subject: [PATCH 0865/1195] Fix typo in 0230-remove-runtime.md Remove a superfluous `)`. --- text/0230-remove-runtime.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0230-remove-runtime.md b/text/0230-remove-runtime.md index 20310dc716d..d852d475e43 100644 --- a/text/0230-remove-runtime.md +++ b/text/0230-remove-runtime.md @@ -66,7 +66,7 @@ tasks vary along several important dimensions: applies to long-running loops or page faults.) M:N models can deal with blocking in a couple of ways. 
The approach taken in - Java's [fork/join](http://gee.cs.oswego.edu/dl/papers/fj.pdf)) framework, for + Java's [fork/join](http://gee.cs.oswego.edu/dl/papers/fj.pdf) framework, for example, is to dynamically spin up/down worker threads. Alternatively, special task-aware blocking operations (including I/O) can be provided, which are mapped under the hood to nonblocking operations, allowing the worker thread to From f5eb7dce635cc8cb5dcea5abde7d311c9f051f7b Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 4 Apr 2016 13:59:22 -0700 Subject: [PATCH 0866/1195] Wordsmith a bit --- text/0000-cargo-workspace.md | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/text/0000-cargo-workspace.md b/text/0000-cargo-workspace.md index 8b5aeb4331e..5b32490b249 100644 --- a/text/0000-cargo-workspace.md +++ b/text/0000-cargo-workspace.md @@ -126,15 +126,16 @@ if they both transitively have edges to one another. A valid workspace then has exactly one root crate with a `[workspace]` key. While the restriction of one-root-per workspace may make sense, the restriction -of crates transitively having edges to one another may seem a bit odd. The -intention is to ensure that the set of packages in a workspace is the same -regardless of which package is selected to start discovering a workspace from. - -With the implicit relations defined it's possible for a repository to not have a -root package yet still have path dependencies. In this situation each dependency -would not know how to get back to the "root package", so the workspace from the -point of view of the path dependencies would be different than that of the root -package. This could in turn lead to `Cargo.lock` getting out of sync. +of crates transitively having edges to one another may seem a bit odd. If, +however, this restriction were not in place then the set of crates in a +workspace may differ depending on which crate it was viewed from. 
For example if +crate A has a path dependency on B then it will think B is in A's workspace. If, +however, A was not in B's filesystem hierarchy, then B would not think that A +was in its workspace. This would in turn cause the set of crates in each +workspace to be different, further causing `Cargo.lock` to get out of sync if it +were allowed. By ensuring that all crates have edges to each other in a +workspace Cargo can prevent this situation and guarantee robust builds no matter +where they're executed in the workspace. To alleviate misconfiguration, however, if the `workspace.members` configuration key contains a crate which is not a member of the constructed From 98dd9e4954de96861e96b49c28fe7ab759fe5e62 Mon Sep 17 00:00:00 2001 From: Simonas Kazlauskas Date: Tue, 5 Apr 2016 02:55:48 +0300 Subject: [PATCH 0867/1195] Fix nit --- text/1228-placement-left-arrow.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/1228-placement-left-arrow.md b/text/1228-placement-left-arrow.md index 71f47e520e5..c5a76276328 100644 --- a/text/1228-placement-left-arrow.md +++ b/text/1228-placement-left-arrow.md @@ -182,7 +182,7 @@ lowest precedence) to highest in the language. The most prominent choices are: 3. More than assignment and binop-assignment, but less than any other operator: - This is what currently this RFC proposes. This allows for various + This is what this RFC currently proposes. This allows for various expressions involving equality symbols and `<-` to be parsed reasonably and consistently. For example `x = y <- z += a <- b <- c` would get parsed as `x = ((y <- z) += (a <- (b <- c)))`. From ffbb2f7f3acf08ebd23011d86b6bf70f4a42663a Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Tue, 5 Apr 2016 14:08:18 +1200 Subject: [PATCH 0868/1195] Change some uses of macro_rules! to macro! 
--- text/0000-macro-naming.md | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/text/0000-macro-naming.md b/text/0000-macro-naming.md index 669b9176174..680b4b5459b 100644 --- a/text/0000-macro-naming.md +++ b/text/0000-macro-naming.md @@ -51,9 +51,13 @@ work: ``` foo!(); -macro_rules! foo { ... } +macro! foo { ... } ``` +(Note, I'm using a hypothetical `macro!` definition which I will define in a future +RFC. The reader can assume it works much like `macro_rules!`, but with the new +naming scheme). + Macro expansion order is also not defined by source order. E.g., in `foo!(); bar!();`, `bar` may be expanded before `foo`. Ordering is only guaranteed as far as it is necessary. E.g., if `bar` is only defined by expanding `foo`, then `foo` must be @@ -148,7 +152,7 @@ of Rust, see below. I would like that macros follow the same rules for privacy as other Rust items, i.e., they are private by default and may be marked as `pub` to make them public. This is not as straightforward as it sounds as it requires parsing `pub -macro_rules! foo` as a macro definition, etc. I leave this for a separate RFC. +macro! foo` as a macro definition, etc. I leave this for a separate RFC. ## Scoped attributes From e28f7a9b6ef664e8ddf596bf9d299bca27f661d7 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Tue, 5 Apr 2016 16:45:02 +1200 Subject: [PATCH 0869/1195] Some mostly minor changes --- text/0000-proc-macros.md | 61 +++++++++++++++++++++++++++++++--------- 1 file changed, 48 insertions(+), 13 deletions(-) diff --git a/text/0000-proc-macros.md b/text/0000-proc-macros.md index 429b47881c5..1ab3299f755 100644 --- a/text/0000-proc-macros.md +++ b/text/0000-proc-macros.md @@ -49,8 +49,8 @@ to avoid this problem. # Detailed design [design]: #detailed-design -There are two kinds of procedural macro: function-like and macro-like. These two -kinds exist today, and other than naming (see +There are two kinds of procedural macro: function-like and attribute-like. 
These +two kinds exist today, and other than naming (see [RFC 1561](https://github.com/rust-lang/rfcs/pull/1561)) the syntax for using these macros remains unchanged. If the macro is called `foo`, then a function- like macro is used with syntax `foo!(...)`, and an attribute-like macro with @@ -120,8 +120,9 @@ details. When a `#[cfg(macro)]` crate is `extern crate`ed, it's items (even public ones) are not available to the importing crate; only macros declared in that crate. -The crate is dynamically linked with the compiler at compile-time, rather -than with the importing crate at runtime. +There should be a lint to warn about public items which will not be visible due +to `#[cfg(macro)]`. The crate is dynamically linked with the compiler at +compile-time, rather than with the importing crate at runtime. ## Writing procedural macros @@ -163,7 +164,7 @@ sketch is available in this [blog post](http://ncameron.org/blog/libmacro/). ## Tokens Procedural macros will primarily operate on tokens. There are two main benefits -to this principal: flexibility and future proofing. By operating on tokens, code +to this principle: flexibility and future proofing. By operating on tokens, code passed to procedural macros does not need to satisfy the Rust parser, only the lexer. Stabilising an interface based on tokens means we need only commit to not changing the rules around those tokens, not the whole grammar. I.e., it @@ -213,12 +214,20 @@ pub struct TokenTree { } pub enum TokenKind { - Sequence(Delimiter, Vec), + Sequence(Delimiter, TokenStream), // The content of the comment can be found from the span. Comment(CommentKind), - // The Span is the span of the string itself, without delimiters. - String(Span, StringKind), + + // Symbol is the string contents, not including delimiters. It would be nice + // to avoid an allocation in the common case that the string is in the + // source code. We might be able to use `&'Codemap str` or something. 
+ // `Option<usize>` is for the count of `#`s if the string is a raw string. If + // the string is not raw, then it will be `None`. + String(Symbol, Option<usize>, StringKind), + + // char literal, span includes the `'` delimiters. + Char(char), // These tokens are treated specially since they are used for macro // expansion or delimiting items. @@ -227,11 +236,11 @@ pub enum TokenKind { // Not actually sure if we need this or if semicolons can be treated like // other punctuation. Semicolon, // `;` - Eof, + Eof, // Do we need this? // Word is defined by Unicode Standard Annex 31 - // [Unicode Identifier and Pattern Syntax](http://unicode.org/reports/tr31/) - Word(InternedString), + Word(Symbol), Punctuation(char), } @@ -253,13 +262,34 @@ pub enum CommentKind { pub enum StringKind { Regular, - // usize is for the count of `#`s. - Raw(usize), Byte, - RawByte(usize), } + +// A Symbol is a possibly-interned string. +pub struct Symbol { ... } ``` +### Open question: `Punctuation(char)` and multi-char operators. + +Rust has many compound operators, e.g., `<<`. It's not clear how best to deal +with them. If the source code contains "`+ =`", it would be nice to distinguish +this in the token stream from "`+=`". On the other hand, if we represent `<<` as +a single token, then the macro may need to split them into `<`, `<` in generic +position. + +I had hoped to represent each character as a separate token. However, to make +pattern matching backwards compatible, we would need to combine some tokens. In +fact, if we want to be completely backwards compatible, we probably need to keep +the same set of compound operators as are defined at the moment. + +Some solutions: + +* `Punctuation(char)` with special rules for pattern matching tokens, +* `Punctuation([char])` with a facility for macros to split tokens. Tokenising + could match the maximum number of punctuation characters, or use the rules for + the current token set. The former would have issues with pattern matching. 
The + latter is a bit hacky, there would be backwards compatibility issues if we + wanted to add new compound operators in the future. ## Staging @@ -314,6 +344,9 @@ are better addressed by compiler plug-ins or tools based on the compiler (the latter can be written today, the former require more work on an interface to the compiler to be practical). +We could use the `macro` keyword rather than the `fn` keyword to declare a +macro. We would then not require a `#[macro]` attribute. + We could have a dedicated syntax for procedural macros, similar to the `macro_rules` syntax for macros by example. Since a procedural macro is really just a Rust function, I believe using a function is better. I have also not been @@ -374,6 +407,8 @@ a process-separated model (if desired). However, if this is considered an essential feature of macro reform, then we might want to consider the interfaces more thoroughly with this in mind. +A step in this direction might be to run the macro in its own thread, but in the +compiler's process. ### Interactions with constant evaluation From 1330d7b6beafa21e6aab4e49d6ecff1cbbcb4e06 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Mon, 4 Apr 2016 17:31:41 -0400 Subject: [PATCH 0870/1195] example docs --- ...0000-more-api-documentation-conventions.md | 145 +++++++++++++++++- 1 file changed, 144 insertions(+), 1 deletion(-) diff --git a/text/0000-more-api-documentation-conventions.md b/text/0000-more-api-documentation-conventions.md index 9bb20a3b523..f2a81aa8157 100644 --- a/text/0000-more-api-documentation-conventions.md +++ b/text/0000-more-api-documentation-conventions.md @@ -243,9 +243,152 @@ Do this yourself with the reference-style syntax, for ease of reading: ## Example [example]: #example -Below is a full crate, with documentation following these rules: +Below is a full crate, with documentation following these rules. I am loosely basing +this off of my [ref_slice] crate, because it’s small, but I’m not claiming the code +is good here. 
It’s about the docs, not the code. + +[ref_slice]: https://crates.io/crates/ref_slice + +In lib.rs: + +```rust +//! Turning references into slices +//! +//! This crate contains several utility functions for taking various kinds +//! of references and producing slices out of them. In this case, only full +//! slices, not ranges for sub-slices. +//! +//! # Layout +//! +//! At the top level, we have functions for working with references, `&T`. +//! There are two submodules for dealing with other types: `option`, for +//! &[`Option`], and `mut`, for `&mut T`. +//! +//! [`Option`]: http://doc.rust-lang.org/std/option/enum.Option.html + +pub mod option; + +/// Converts a reference to `T` into a slice of length 1. +/// +/// This will not copy the data, only create the new slice. +/// +/// # Panics +/// +/// In this case, the code won’t panic, but if it did, the circumstances +/// in which it would would be included here. +/// +/// # Examples +/// +/// Basic usage: +/// +/// ``` +/// extern crate ref_slice; +/// use ref_slice::ref_slice; +/// +/// let x = &5; +/// +/// let slice = ref_slice(x); +/// +/// assert_eq!(&[5], slice); +/// ``` +/// +/// A more complex example. In this case, it’s the same example, because this +/// is a pretty trivial function, but use your imagination. +/// +/// ``` +/// extern crate ref_slice; +/// use ref_slice::ref_slice; +/// +/// let x = &5; +/// +/// let slice = ref_slice(x); +/// +/// assert_eq!(&[5], slice); +/// ``` +pub fn ref_slice<T>(s: &T) -> &[T] { + unimplemented!() +} + +/// Functions that operate on mutable references. +/// +/// This submodule mirrors the parent module, but instead of dealing with `&T`, +/// they’re for `&mut T`. +mod mut { + /// Converts a reference to `&mut T` into a mutable slice of length 1. + /// + /// This will not copy the data, only create the new slice. 
+ /// + /// # Safety + /// + /// In this case, the code doesn’t need to be marked as unsafe, but if it + /// did, the invariants you’re expected to uphold would be documented here. + /// + /// # Examples + /// + /// Basic usage: + /// + /// ``` + /// extern crate ref_slice; + /// use ref_slice::mut; + /// + /// let x = &mut 5; + /// + /// let slice = mut::ref_slice(x); + /// + /// assert_eq!(&mut [5], slice); + /// ``` + pub fn ref_slice(s: &mut T) -> &mut [T] { + unimplemented!() + } +} +``` + +in `option.rs`: ```rust +//! Functions that operate on references to [`Option`]s. +//! +//! This submodule mirrors the parent module, but instead of dealing with `&T`, +//! they’re for `&`[`Option`]. +//! +//! [`Option`]: http://doc.rust-lang.org/std/option/enum.Option.html + +/// Converts a reference to `Option` into a slice of length 0 or 1. +/// +/// [`Option`]: http://doc.rust-lang.org/std/option/enum.Option.html +/// +/// This will not copy the data, only create the new slice. +/// +/// # Examples +/// +/// Basic usage: +/// +/// ``` +/// extern crate ref_slice; +/// use ref_slice::option; +/// +/// let x = &Some(5); +/// +/// let slice = option::ref_slice(x); +/// +/// assert_eq!(&[5], slice); +/// ``` +/// +/// `None` will result in an empty slice: +/// +/// ``` +/// extern crate ref_slice; +/// use ref_slice::option; +/// +/// let x: &Option = &None; +/// +/// let slice = option::ref_slice(x); +/// +/// assert_eq!(&[], slice); +/// ``` +pub fn ref_slice(opt: &Option) -> &[T] { + unimplemented!() +} ``` ## Formatting From 7727fb38171a15e9cedcb529ec749c605906a1e9 Mon Sep 17 00:00:00 2001 From: Josh Triplett Date: Tue, 5 Apr 2016 22:17:54 -0700 Subject: [PATCH 0871/1195] Document limitations on unions declared without #[repr(C)] --- text/0000-union.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/text/0000-union.md b/text/0000-union.md index 1c264b3fb8d..1d501872c8e 100644 --- a/text/0000-union.md +++ b/text/0000-union.md @@ -268,6 +268,11 @@ In particular, 
Rust code must not use unions to break the pointer aliasing rules with raw pointers, or access a field containing a primitive type with an invalid value. +In addition, since a union declared without `#[repr(C)]` uses an unspecified +binary layout, code reading fields of such a union or pattern-matching such a +union must not read from a field other than the one written to. This includes +pattern-matching a specific value in a union field. + ## Union size and alignment A union declared with `#[repr(C)]` must have the same size and alignment as an From 311c9141e5480c5c37a38730225a1c95de037991 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 8 Apr 2016 09:48:55 -0700 Subject: [PATCH 0872/1195] Fix a typo --- text/0000-rdylib.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-rdylib.md b/text/0000-rdylib.md index 313667ab152..d3ca2153142 100644 --- a/text/0000-rdylib.md +++ b/text/0000-rdylib.md @@ -39,7 +39,7 @@ cdylibs: into somewhere else, however, you have no need for the metadata! * *Reachable* symbols are exposed from dynamic libraries, but if you're loading Rust into somewhere else then, like executables, only *public* non-Rust-ABI - function sneed to be exported. This can lead to unnecessarily large Rust + functions need to be exported. This can lead to unnecessarily large Rust dynamic libraries in terms of object size as well as missed optimization opportunities from knowing that a function is otherwise private. 
* We can't run LTO for dylibs because those are intended for end products, not From 0a46c3372ea23b9e2f5cab52c098de3b2803a053 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 8 Apr 2016 16:38:52 -0400 Subject: [PATCH 0873/1195] update the edit history for rfc 550 --- text/0550-macro-future-proofing.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/text/0550-macro-future-proofing.md b/text/0550-macro-future-proofing.md index 881347e07f1..3cec600ab8c 100644 --- a/text/0550-macro-future-proofing.md +++ b/text/0550-macro-future-proofing.md @@ -485,6 +485,9 @@ reasonable freedom and can be extended in the future. - Updated by https://github.com/rust-lang/rfcs/pull/1462, which added open square bracket into the follow set for types. + +- Updated by https://github.com/rust-lang/rfcs/pull/1494, which adjusted + the follow set for types to include block nonterminals. # Appendices From 2317bd1f6e8b58b642ec088f3447e7ba217a864f Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 8 Apr 2016 16:42:46 -0400 Subject: [PATCH 0874/1195] rename #1444, create tracking issue --- text/{0000-union.md => 1444-union.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-union.md => 1444-union.md} (99%) diff --git a/text/0000-union.md b/text/1444-union.md similarity index 99% rename from text/0000-union.md rename to text/1444-union.md index 1d501872c8e..82a809cefec 100644 --- a/text/0000-union.md +++ b/text/1444-union.md @@ -1,7 +1,7 @@ - Feature Name: `union` - Start Date: 2015-12-29 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1444 +- Rust Issue: https://github.com/rust-lang/rust/issues/32836 # Summary [summary]: #summary From e72daa013a59d21ace2c4e8a44f7f996e05ae736 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 8 Apr 2016 16:55:00 -0400 Subject: [PATCH 0875/1195] merge RFC #1513 --- text/{0000-less-unwinding.md => 1513-less-unwinding.md} | 4 ++-- 1 file changed, 2
insertions(+), 2 deletions(-) rename text/{0000-less-unwinding.md => 1513-less-unwinding.md} (99%) diff --git a/text/0000-less-unwinding.md b/text/1513-less-unwinding.md similarity index 99% rename from text/0000-less-unwinding.md rename to text/1513-less-unwinding.md index dc665b4a9ec..a46c736e077 100644 --- a/text/0000-less-unwinding.md +++ b/text/1513-less-unwinding.md @@ -1,7 +1,7 @@ - Feature Name: `panic_runtime` - Start Date: 2016-02-25 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1513 +- Rust Issue: https://github.com/rust-lang/rust/issues/32837 # Summary [summary]: #summary From bdbe73d1be948dd925c6b3583d40879cde130bcf Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 8 Apr 2016 16:57:58 -0400 Subject: [PATCH 0876/1195] merge rfc #1398 --- ...000-kinds-of-allocators.md => 1398-kinds-of-allocators.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-kinds-of-allocators.md => 1398-kinds-of-allocators.md} (99%) diff --git a/text/0000-kinds-of-allocators.md b/text/1398-kinds-of-allocators.md similarity index 99% rename from text/0000-kinds-of-allocators.md rename to text/1398-kinds-of-allocators.md index 441ecf761e3..720e6fdcde4 100644 --- a/text/0000-kinds-of-allocators.md +++ b/text/1398-kinds-of-allocators.md @@ -1,7 +1,7 @@ - Feature Name: allocator_api - Start Date: 2015-12-01 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1398 +- Rust Issue: https://github.com/rust-lang/rust/issues/32838 # Summary [summary]: #summary From 0c1cc7bb5ac60adcbb2be0d15b3d2f3dc1b40beb Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 14 Apr 2016 16:55:07 -0700 Subject: [PATCH 0877/1195] RFC 1543 is more integer atomics --- text/{0000-integer_atomics.md => 1543-integer_atomics.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-integer_atomics.md => 1543-integer_atomics.md} 
(96%) diff --git a/text/0000-integer_atomics.md b/text/1543-integer_atomics.md similarity index 96% rename from text/0000-integer_atomics.md rename to text/1543-integer_atomics.md index 9a9a330586e..d1d0dce3f64 100644 --- a/text/0000-integer_atomics.md +++ b/text/1543-integer_atomics.md @@ -1,7 +1,7 @@ -- Feature Name: integer_atomics +- Feature Name: `integer_atomics` - Start Date: 2016-03-14 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1543](https://github.com/rust-lang/rfcs/pull/1543) +- Rust Issue: [rust-lang/rust#32976](https://github.com/rust-lang/rust/issues/32976) # Summary [summary]: #summary From 7056161346ee8133d2d28a3e2802dcf99b6c8295 Mon Sep 17 00:00:00 2001 From: mirandadam Date: Fri, 15 Apr 2016 09:23:15 -0300 Subject: [PATCH 0878/1195] Add line breaks --- text/1211-mir.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/text/1211-mir.md b/text/1211-mir.md index 547b9b7b27b..38e2e4ade11 100644 --- a/text/1211-mir.md +++ b/text/1211-mir.md @@ -67,8 +67,10 @@ there are also a number of distinct downsides. intermediate form improves the situation because: a. In some cases, we can do the optimizations in the MIR itself before translation. + b. In other cases, we can do analyses on the MIR to easily determine when the optimization would be safe. + c. In all cases, whatever we can do on the MIR will be helpful for other targets beyond LLVM (see next bullet). From 1fd4a15e938a743c9afe59631535c7292a135e1b Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Fri, 15 Apr 2016 12:38:07 -0400 Subject: [PATCH 0879/1195] `FusedIterator` marker trait and `iter::Fuse` specialization This RFC adds a `FusedIterator` marker trait and specializes `iter::Fuse` to do nothing when the underlying iterator already provides the `Fuse` guarantee. 
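As a rough illustration of the guarantee described above (a sketch only, not the RFC's implementation): `Alternate`, `Countdown` and `poll` are hypothetical names invented for this example, and `std::iter::FusedIterator` is the marker trait as it exists in current stable Rust, which may differ in detail from what this draft proposes.

```rust
use std::iter::FusedIterator;

/// Hypothetical iterator that is *not* fused: every odd call returns `None`,
/// but later calls can return `Some` again.
struct Alternate {
    calls: u32,
}

impl Iterator for Alternate {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        let n = self.calls;
        self.calls += 1;
        if n % 2 == 0 { Some(n) } else { None }
    }
}

/// Hypothetical iterator that is already fused: once exhausted, it stays
/// exhausted.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            None
        } else {
            self.remaining -= 1;
            Some(self.remaining)
        }
    }
}

// Marker impl: a promise that `next` keeps returning `None` after the
// first `None`.
impl FusedIterator for Countdown {}

/// Calls `next` exactly `n` times and records every result, including the
/// ones that come after the first `None`.
fn poll<I: Iterator>(mut iter: I, n: usize) -> Vec<Option<I::Item>> {
    (0..n).map(|_| iter.next()).collect()
}
```

Polling `Alternate { calls: 0 }` four times through `poll` gives `[Some(0), None, Some(2), None]`, while the `.fuse()`d version gives `[Some(0), None, None, None]`. `Countdown` already behaves that way on its own, which is the case where this RFC wants `Fuse` to become a pass-through.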
--- text/0000-fused-iterator.md | 247 ++++++++++++++++++++++++++++++++++++ 1 file changed, 247 insertions(+) create mode 100644 text/0000-fused-iterator.md diff --git a/text/0000-fused-iterator.md b/text/0000-fused-iterator.md new file mode 100644 index 00000000000..4a57bde77c1 --- /dev/null +++ b/text/0000-fused-iterator.md @@ -0,0 +1,247 @@ +- Feature Name: fused +- Start Date: 2016-04-15 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Add a marker trait `FusedIterator` to `std::iter` and implement it on `Fuse` and +applicable iterators and adapters. By implementing `FusedIterator`, an iterator +promises to behave as if `Iterator::fuse()` had been called on it (i.e. return +`None` forever after returning `None` once). Then, specialize `Fuse` to be a +no-op iff `I` implements `FusedIterator`. + +# Motivation +[motivation]: #motivation + +Iterators are allowed to return whatever they want after returning `None` once. +However, assuming that an iterator continues to return `None` can make +implementing some algorithms/adapters easier. Therefore, `Fused` and +`Iterator::fuse` exist. Unfortunately, the `Fused` iterator adapter introduces a +noticeable overhead. Furthermore, many iterators (most if not all iterators in +std) already act as if they were fused (this is considered to be the "polite" +behavior). Therefore, it would be nice to be able to pay the `Fused` overhead +iff necessary. + +Microbenchmarks: + +```text +test fuse ... bench: 200 ns/iter (+/- 13) +test fuse_fuse ... bench: 250 ns/iter (+/- 10) +test myfuse ... bench: 48 ns/iter (+/- 4) +test myfuse_myfuse ... bench: 48 ns/iter (+/- 3) +test range ... 
bench: 48 ns/iter (+/- 2) +``` + +```rust +#![feature(test, specialization)] +extern crate test; + +use std::ops::Range; + +#[derive(Clone, Debug)] +#[must_use = "iterator adaptors are lazy and do nothing unless consumed"] +pub struct MyFuse<I> { + iter: I, + done: bool +} + +pub trait Fused: Iterator {} + +trait IterExt: Iterator + Sized { + fn myfuse(self) -> MyFuse<Self> { + MyFuse { + iter: self, + done: false, + } + } +} + +impl<I> Fused for MyFuse<I> where MyFuse<I>: Iterator {} +impl<T> Fused for Range<T> where Range<T>: Iterator {} + +impl<T: Iterator> IterExt for T {} + +impl<I> Iterator for MyFuse<I> where I: Iterator { + type Item = <I as Iterator>::Item; + + #[inline] + default fn next(&mut self) -> Option<<I as Iterator>::Item> { + if self.done { + None + } else { + let next = self.iter.next(); + self.done = next.is_none(); + next + } + } +} + +impl<I> Iterator for MyFuse<I> where I: Iterator + Fused { + #[inline] + fn next(&mut self) -> Option<<I as Iterator>::Item> { + self.iter.next() + } +} + +impl<I> ExactSizeIterator for MyFuse<I> where I: ExactSizeIterator {} + +#[bench] +fn myfuse(b: &mut test::Bencher) { + b.iter(|| { + for i in (0..100).myfuse() { + test::black_box(i); + } + }) +} + +#[bench] +fn myfuse_myfuse(b: &mut test::Bencher) { + b.iter(|| { + for i in (0..100).myfuse().myfuse() { + test::black_box(i); + } + }); +} + + +#[bench] +fn fuse(b: &mut test::Bencher) { + b.iter(|| { + for i in (0..100).fuse() { + test::black_box(i); + } + }) +} + +#[bench] +fn fuse_fuse(b: &mut test::Bencher) { + b.iter(|| { + for i in (0..100).fuse().fuse() { + test::black_box(i); + } + }); +} + +#[bench] +fn range(b: &mut test::Bencher) { + b.iter(|| { + for i in 0..100 { + test::black_box(i); + } + }) +} +``` + +# Detailed Design +[design]: #detailed-design + +``` +trait FusedIterator: Iterator {} + +impl<I: Iterator> FusedIterator for Fuse<I> {} + +impl<A> FusedIterator for Range<A> {} +// ...and for most std/core iterators...
+ + +// Existing implementation of Fuse repeated for convenience +pub struct Fuse<I> { + iterator: I, + done: bool, +} + +impl<I> Iterator for Fuse<I> where I: Iterator { + type Item = I::Item; + + #[inline] + fn next(&mut self) -> Option<Self::Item> { + if self.done { + None + } else { + let next = self.iterator.next(); + self.done = next.is_none(); + next + } + } +} + +// Then, specialize Fuse... +impl<I> Iterator for Fuse<I> where I: FusedIterator { + type Item = I::Item; + + #[inline] + fn next(&mut self) -> Option<Self::Item> { + // Ignore the done flag and pass through. + // Note: this means that the done flag should *never* be exposed to the + // user. + self.iterator.next() + } +} + +``` + +# Drawbacks +[drawbacks]: #drawbacks + +1. Yet another special iterator trait. +2. There is a useless done flag on no-op `Fuse` adapters. +3. Fuse isn't used very often anyway. However, I would argue that it should be + used more often and people are just playing fast and loose. I'm hoping that + making `Fuse` free when unneeded will encourage people to use it when they should. + +# Alternatives + +## Do Nothing + +Just pay the overhead on the rare occasions when fused is actually used. + +## Associated Type + +Use an associated type (and set it to `Self` for iterators that already provide +the fused guarantee) and an `IntoFused` trait: + +```rust +#![feature(specialization)] +use std::iter::Fuse; + +trait FusedIterator: Iterator {} + +trait IntoFused: Iterator + Sized { + type Fused: Iterator<Item = Self::Item>; + fn into_fused(self) -> Self::Fused; +} + +impl<T> IntoFused for T where T: Iterator { + default type Fused = Fuse<Self>; + default fn into_fused(self) -> Self::Fused { + // Currently complains about a mismatched type but I think that's a + // specialization bug.
+ self.fuse() + } +} + +impl<T> IntoFused for T where T: FusedIterator { + type Fused = Self; + + fn into_fused(self) -> Self::Fused { + self + } +} +``` + +For now, this doesn't actually compile because Rust believes that the associated +type `Fused` could be specialized independently of the `into_fused` function. + +While this method gets rid of the memory overhead of a no-op `Fuse` wrapper, it adds +complexity, needs to be implemented as a separate trait (because adding +associated types is a breaking change), and can't be used to optimize the +iterators returned from `Iterator::fuse` (users would *have* to call +`IntoFused::into_fused`). + +# Unresolved questions +[unresolved]: #unresolved-questions + +Should this trait be unsafe? I can't think of any way generic unsafe code could +end up relying on the guarantees of `Fused`. From 4b24c5d541b843110e339223c582b0cb41653012 Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Fri, 15 Apr 2016 15:09:56 -0400 Subject: [PATCH 0880/1195] note associated type alternative --- text/0000-fused-iterator.md | 26 +++++++++++++++++++++++++- 1 file changed, 25 insertions(+), 1 deletion(-) diff --git a/text/0000-fused-iterator.md b/text/0000-fused-iterator.md index 4a57bde77c1..c7a8570c3a8 100644 --- a/text/0000-fused-iterator.md +++ b/text/0000-fused-iterator.md @@ -197,7 +197,7 @@ impl<I> Iterator for Fuse<I> where I: FusedIterator { Just pay the overhead on the rare occasions when fused is actually used. -## Associated Type +## IntoFused Use an associated type (and set it to `Self` for iterators that already provide the fused guarantee) and an `IntoFused` trait: @@ -240,6 +240,30 @@ associated types is a breaking change), and can't be used to optimize the iterators returned from `Iterator::fuse` (users would *have* to call `IntoFused::into_fused`).
+## Associated Type + +If we add the ability to condition associated types on `Self: Sized`, I believe +we can add them without it being a breaking change (associated types only need +to be fully specified on DSTs). If so (after fixing the bug in specialization +noted above), we could do the following: + +```rust +trait Iterator { + type Item; + type Fuse: Iterator<Item = Self::Item> where Self: Sized = Fuse<Self>; + fn fuse(self) -> Self::Fuse where Self: Sized { + Fuse { + done: false, + iter: self, + } + } + // ... +} +``` + +However, changing an iterator to take advantage of this would be a breaking +change. + # Unresolved questions [unresolved]: #unresolved-questions From 2f7a03010fd31d5c18bbd3835cb79cd9a8b72819 Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Fri, 15 Apr 2016 15:20:56 -0400 Subject: [PATCH 0881/1195] remove potentially confusing iffs --- text/0000-fused-iterator.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-fused-iterator.md b/text/0000-fused-iterator.md index c7a8570c3a8..88093c97f3c 100644 --- a/text/0000-fused-iterator.md +++ b/text/0000-fused-iterator.md @@ -10,7 +10,7 @@ Add a marker trait `FusedIterator` to `std::iter` and implement it on `Fuse` applicable iterators and adapters. By implementing `FusedIterator`, an iterator promises to behave as if `Iterator::fuse()` had been called on it (i.e. return `None` forever after returning `None` once). Then, specialize `Fuse` to be a -no-op iff `I` implements `FusedIterator`. +no-op if `I` implements `FusedIterator`. # Motivation [motivation]: #motivation @@ -22,7 +22,7 @@ implementing some algorithms/adapters easier. Therefore, `Fused` and noticeable overhead. Furthermore, many iterators (most if not all iterators in std) already act as if they were fused (this is considered to be the "polite" behavior). Therefore, it would be nice to be able to pay the `Fused` overhead -iff necessary. +only when necessary.
Microbenchmarks: From ec97a6e0edf75022baecdfbed70223491e7b57f4 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 21 Apr 2016 09:18:02 -0700 Subject: [PATCH 0882/1195] Change to introduce cdylib instead of rdylib --- text/0000-rdylib.md | 87 ++++++++++----------------------------------- 1 file changed, 18 insertions(+), 69 deletions(-) diff --git a/text/0000-rdylib.md b/text/0000-rdylib.md index d3ca2153142..4d39155da04 100644 --- a/text/0000-rdylib.md +++ b/text/0000-rdylib.md @@ -6,8 +6,8 @@ # Summary [summary]: #summary -Add a new crate type accepted by the compiler, called `rdylib`, which -corresponds to the behavior of `-C prefer-dynamic` plus `--crate-type dylib`. +Add a new crate type accepted by the compiler, called `cdylib`, which +corresponds to exporting a C interface from a Rust dynamic library. # Motivation [motivation]: #motivation @@ -51,12 +51,13 @@ represent the more rarely used form of dynamic library (rdylibs). # Detailed design [design]: #detailed-design -A new crate type will be accepted by the compiler, `rdylib`, which can be passed -as either `--crate-type rdylib` on the command line or via `#![crate_type = -"rdylib"]` in crate attributes. This crate type will conceptually correspond to -the rdylib use case described above, and today's `dylib` crate-type will -correspond to the cdylib use case above. Note that the literal output artifacts -of these two crate types (files, file names, etc) will be the same. +A new crate type will be accepted by the compiler, `cdylib`, which can be passed +as either `--crate-type cdylib` on the command line or via `#![crate_type = +"cdylib"]` in crate attributes. This crate type will conceptually correspond to +the cdylib use case described above, and today's `dylib` crate-type will +continue to correspond to the rdylib use case above. Note that the literal +output artifacts of these two crate types (files, file names, etc) will be the +same. 
The two formats will differ in the parts listed in the motivation above, specifically: @@ -74,42 +75,6 @@ specifically: example the standard library will be linked dynamically by default. On the other hand, cdylibs will link all Rust dependencies statically by default. -As is evidenced from many of these changes, however, the reinterpretation of the -`dylib` output format from what it is today is a breaking change. For example -metadata will not be present and symbols will be hidden. As a result, this RFC -has a... - -### Transition Plan - -This RFC is technically a breaking change, but it is expected to not actually -break many work flows in practice because there is only one known user of -rdylibs, the compiler itself. This notably means that plugins will also need to -be compiled differently, but because they are nightly-only we've got some more -leeway around them. - -All other known users of the `dylib` output crate type fall into the cdylib use -case. The "breakage" here would mean: - -* The metadata section no longer exists. In almost all cases this just means - that the output artifacts will get smaller if it isn't present, it's expected - that no one other than the compiler itself is actually consuming this - information. -* Rust symbols will be hidden by default. The symbols, however, have - unpredictable hashes so there's not really any way they can be meaningfully - leveraged today. - -Given that background, it's expected that if there's a smooth migration path for -plugins and the compiler then the "breakage" here won't actually appear in -practice. The proposed implementation strategy and migration path is: - -1. Implement the `rdylib` output type as proposed in this RFC. -2. Change Cargo to use `--crate-type rdylib` when compiling plugins instead of - `--crate-type dylib` + `-C prefer-dynamic`. -3. Implement the changes to the `dylib` output format as proposed in this RFC. 
- -So long as the steps are spaced apart by a few days it should be the case that -no nightly builds break if they're always using an up-to-date nightly compiler. - # Drawbacks [drawbacks]: #drawbacks @@ -118,35 +83,19 @@ ephemeral. This RFC is an extension of this model, but it's difficult to reason about extending that which is not well defined. As a result there could be unforseen interactions between this output format and where it's used. -As usual, of course, proposing a breaking change is indeed a drawback. It is -expected that RFC doesn't break anything in practice, but that'd be difficult to -gauge until it's implemented. - # Alternatives [alternatives]: #alternatives -* Instead of reinterpreting the `dylib` output format as a cdylib, we could - continue interpreting it as an rdylib and add a new dedicated `cdylib` output - format. This would not be a breaking change, but it doesn't come without its - drawbacks. As the most common output type, many projects would have to switch - to `cdylib` from `dylib`, meaning that they no longer support older Rust - compilers. This may also take time to propagate throughout the community. It's - also arguably a "better name", so this RFC proposes an - in-practice-not-a-breaking-change by adding a worse name of `rdylib` for the - less used output format. - -* The compiler could have a longer transition period where `-C prefer-dynamic` - plus `--crate-type dylib` is interpreted as an rdylib. Either that or the - implementation strategy here could be extended by a release or two to let - changes time to propagate throughout the ecosystem. +* Originally this RFC proposed adding a new crate type, `rdylib`, instead of + adding a new crate type, `cdylib`. The existing `dylib` output type would be + reinterpreted as a cdylib use-case. This is unfortunately, however, a breaking + change and requires a somewhat complicated transition plan in Cargo for + plugins. 
In the end it didn't seem worth it for the benefit of "cdylib is + probably what you want". # Unresolved questions [unresolved]: #unresolved-questions -* This RFC is currently founded upon the assumption that rdylibs are very rarely - used in the ecosystem. An audit has not been performed to determine whether - this is true or not, but is this actually the case? - -* Should the new `rdylib` format be considered unstable? (should it require a - nightly compiler?). The use case for a Rust dynamic library is so limited, and - so volatile, we may want to just gate access to it by default. +* Should the existing `dylib` format be considered unstable? (should it require + a nightly compiler?). The use case for a Rust dynamic library is so limited, + and so volatile, we may want to just gate access to it by default. From 6e7712dc977225fef94b164ad2c580cf4f60dfc5 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 21 Apr 2016 09:19:26 -0700 Subject: [PATCH 0883/1195] RFC 1510 is cdylibs --- text/{0000-rdylib.md => 1510-rdylib.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-rdylib.md => 1510-rdylib.md} (96%) diff --git a/text/0000-rdylib.md b/text/1510-rdylib.md similarity index 96% rename from text/0000-rdylib.md rename to text/1510-rdylib.md index 4d39155da04..7961262594c 100644 --- a/text/0000-rdylib.md +++ b/text/1510-rdylib.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2016-02-23 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1510](https://github.com/rust-lang/rfcs/pull/1510) +- Rust Issue: [rust-lang/rust#33132](https://github.com/rust-lang/rust/issues/33132) # Summary [summary]: #summary From b835447258900dd961870468a94a253422028656 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 21 Apr 2016 09:22:01 -0700 Subject: [PATCH 0884/1195] RFC 1535 is stable overflow checks --- ...-overflow-checks.md => 1535-stable-overflow-checks.md} | 8 ++++---- 1 file changed, 4 insertions(+), 4 
deletions(-) rename text/{0000-stable-overflow-checks.md => 1535-stable-overflow-checks.md} (88%) diff --git a/text/0000-stable-overflow-checks.md b/text/1535-stable-overflow-checks.md similarity index 88% rename from text/0000-stable-overflow-checks.md rename to text/1535-stable-overflow-checks.md index 1762bc29325..eb66764c103 100644 --- a/text/0000-stable-overflow-checks.md +++ b/text/1535-stable-overflow-checks.md @@ -1,7 +1,7 @@ -- Feature Name: (fill me in with a unique ident, my_awesome_feature) -- Start Date: (fill me in with today's date, YYYY-MM-DD) -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- Feature Name: N/A +- Start Date: 2016-03-09 +- RFC PR: [rust-lang/rfcs#1535](https://github.com/rust-lang/rfcs/pull/1535) +- Rust Issue: [rust-lang/rust#33134](https://github.com/rust-lang/rust/issues/33134) # Summary [summary]: #summary From 43efcc6ee641ffa1faf5a413543e15a1e63fb639 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 21 Apr 2016 09:22:22 -0700 Subject: [PATCH 0885/1195] Add back the template --- 0000-template.md | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) create mode 100644 0000-template.md diff --git a/0000-template.md b/0000-template.md new file mode 100644 index 00000000000..a45c6110e58 --- /dev/null +++ b/0000-template.md @@ -0,0 +1,36 @@ +- Feature Name: (fill me in with a unique ident, my_awesome_feature) +- Start Date: (fill me in with today's date, YYYY-MM-DD) +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +One para explanation of the feature. + +# Motivation +[motivation]: #motivation + +Why are we doing this? What use cases does it support? What is the expected outcome? + +# Detailed design +[design]: #detailed-design + +This is the bulk of the RFC. Explain the design in enough detail for somebody familiar +with the language to understand, and for somebody familiar with the compiler to implement. 
+This should get into specifics and corner-cases, and include examples of how the feature is used. + +# Drawbacks +[drawbacks]: #drawbacks + +Why should we *not* do this? + +# Alternatives +[alternatives]: #alternatives + +What other designs have been considered? What is the impact of not doing this? + +# Unresolved questions +[unresolved]: #unresolved-questions + +What parts of the design are still TBD? From ec1a4b7d5bd4047380e976ab06b19e645bb97573 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 22 Apr 2016 06:59:38 -0400 Subject: [PATCH 0886/1195] draft --- text/0000-rustc-bug-fix-procedure.md | 273 +++++++++++++++++++++++++++ 1 file changed, 273 insertions(+) create mode 100644 text/0000-rustc-bug-fix-procedure.md diff --git a/text/0000-rustc-bug-fix-procedure.md b/text/0000-rustc-bug-fix-procedure.md new file mode 100644 index 00000000000..5a9f532c1ff --- /dev/null +++ b/text/0000-rustc-bug-fix-procedure.md @@ -0,0 +1,273 @@ +- Feature Name: N/A +- Start Date: 2016-04-22 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Defines a best practices procedure for making bug fixes or soundness +corrections in the compiler that can cause existing code to stop +compiling. + +# Motivation +[motivation]: #motivation + +From time to time, we encounter the need to make a bug fix, soundness +correction, or other change in the compiler which will cause existing +code to stop compiling. When this happens, it is important that we +handle the change in a way that gives users of Rust a smooth +transition. What we want to avoid is that existing programs suddenly +stop compiling with opaque error messages: we would prefer to have a +gradual period of warnings, with clear guidance as to what the problem +is, how to fix it, and why the change was made. This RFC describes the +procedure that we have been developing for handling breaking changes +that aims to achieve that kind of smooth transition. 
+ +One of the key points of this policy is that (a) warnings should be +issued initially rather than hard errors if at all possible and (b) +every change that causes existing code to stop compiling will have an +associated tracking issue. This issue provides a point to collect +feedback on the results of that change. Sometimes changes have +unexpectedly large consequences or there may be a way to avoid the +change that was not considered. In those cases, we may decide to +change course and roll back the change, or find another solution (if +warnings are being used, this is particularly easy to do). + +### What qualifies as a bug fix? + +Note that this RFC does not try to define when a breaking change is +permitted. That is already covered under [RFC 1122][]. This document +assumes that the change being made is in accordance with those +policies. Here is a summary of the conditions from RFC 1122: + +- **Soundness changes:** Fixes to holes uncovered in the type system. +- **Compiler bugs:** Places where the compiler is not implementing the + specified semantics found in an RFC or lang-team decision. +- **Underspecified language semantics:** Clarifications to grey areas + where the compiler behaves inconsistently and no formal behavior had + been previously decided. + +Please see [the RFC][RFC 1122] for full details! + +# Detailed design +[design]: #detailed-design + +The procedure for making a breaking change is as follows (each of +these steps is described in more detail below): + +0. Do a **crater run** to assess the impact of the change. +1. Make a **special tracking issue** dedicated to the change. +2. Do not report an error right away. Instead, **issue + forwards-compatibility lint warnings**. + - Sometimes this is not straightforward. See the text below for + suggestions on different techniques we have employed in the past. 
+ - For cases where warnings are infeasible: + - Report errors, but make every effort to give a targeted error + message that directs users to the tracking issue + - Submit PRs to all known affected crates that fix the issue + - or, at minimum, alert the owners of those crates to the problem + and direct them to the tracking issue +3. Once the change has been in the wild for at least one cycle, we can + **stabilize the change**, converting those warnings into errors. + +Finally, for changes to libsyntax that will affect plugins, the +general policy is to batch these changes. That is discussed below in +more detail. + +### Tracking issue + +Every breaking change should be accompanied by a **dedicated tracking +issue** for that change. The main text of this issue should describe +the change being made, with a focus on what users must do to fix their +code. The issue should be approachable and practical; it may make +sense to direct users to an RFC or some other issue for the full +details. The issue also serves as a place where users can comment with +questions or other concerns. + +A template for these breaking-change tracking issues can be found +below. An example of how such an issue should look can be +[found here][breaking-change-issue]. + +The issue should be tagged with (at least) `B-unstable` and +`T-compiler`. + +### Tracking issue template + +What follows is a template for tracking issues. + +--------------------------------------------------------------------------- + +This is the **summary issue** for the `YOUR_LINT_NAME_HERE` +future-compatibility warning and other related errors. The goal of +this page is to describe why this change was made and how you can fix +code that is affected by it. It also provides a place to ask questions +or register a complaint if you feel the change should not be made. For +more information on the policy around future-compatibility warnings, +see our [breaking change policy guidelines][guidelines].
+ +[guidelines]: LINK_TO_THIS_RFC + +#### What is the warning for? + +*Describe the conditions that trigger the warning and how they can be +fixed. Also explain why the change was made.* + +#### When will this warning become a hard error? + +At the beginning of each 6-week release cycle, the Rust compiler team +will review the set of outstanding future compatibility warnings and +nominate some of them for **Final Comment Period**. Toward the end of +the cycle, we will review any comments and make a final determination +whether to convert the warning into a hard error or remove it +entirely. + +--------------------------------------------------------------------------- + +### Issuing future compatibility warnings + +The best way to handle a breaking change is to begin by issuing +future-compatibility warnings. These are a special category of lint +warning. Adding a new future-compatibility warning can be done as +follows. + +```rust +// 1. Define the lint in `src/librustc/lint/builtin.rs`: +declare_lint! { + pub YOUR_ERROR_HERE, + Warn, + "illegal use of foo bar baz" +} + +// 2. Add to the list of HardwiredLints in the same file: +impl LintPass for HardwiredLints { + fn get_lints(&self) -> LintArray { + lint_array!( + .., + YOUR_ERROR_HERE + ) + } +} + +// 3. Register the lint in `src/librustc_lint/lib.rs`: +store.register_future_incompatible(sess, vec![ + ..., + FutureIncompatibleInfo { + id: LintId::of(YOUR_ERROR_HERE), + reference: "issue #1234", // your tracking issue here! + }, +]); + +// 4. Report the lint: +tcx.sess.add_lint( + lint::builtin::YOUR_ERROR_HERE, + path_id, + binding.span, + format!("some helper message here")); +``` + +#### Helpful techniques + +It can often be challenging to filter out new warnings from older, +pre-existing errors. One technique that has been used in the past is +to run the older code unchanged and collect the errors it would have +reported.
You can then issue warnings for any errors you would give +which do not appear in that original set. Another option is to abort +compilation after the original code completes if errors are reported: +then you know that your new code will only execute when there were no +errors before. + +#### Crater and crates.io + +We should always do a crater run to assess impact. It is polite and +considerate to at least notify the authors of affected crates of the +breaking change. If we can submit PRs to fix the problem, so much the +better. + +#### What if issuing a warning is too hard? + +It does happen from time to time that it is nigh impossible to isolate +the breaking change so that you can issue warnings. In such cases, the best +strategy is to mitigate: + +1. Issue warnings for subparts of the problem, and reserve the new errors for + the smallest set of cases you can. +2. Try to give a very precise error message that suggests how to fix + the problem and directs users to the tracking issue. +3. It may also make sense to layer the fix: + - First, add warnings where possible and let those land before proceeding + to issue errors. + - Work with authors of affected crates to ensure that corrected + versions are available *before* the fix lands, so that downstream + users can use them. + +If you will be issuing a new hard warning, then it is mandatory to at +least notify authors of affected crates which we know +about. Submitting PRs to fix the problem is strongly recommended. If +the impact is too large to make that practical, then we should try +harder to issue warnings or find a way to avoid making the change at +all. + +### Stabilization + +After a change is made, we will **stabilize** the change using the same +process that we use for unstable features: + +- After a new release is made, we will go through the outstanding tracking + issues corresponding to breaking changes and nominate some of them for + **final comment period** (FCP).
+- The FCP for such issues lasts for one cycle. In the final week or two of the cycle, + we will review comments and make a final determination: + - Convert to error: the change should be made into a hard error. + - Revert: we should remove the warning and continue to allow the older code to compile. + - Defer: can't decide yet, wait longer, or try other strategies. + +### Batching breaking changes to libsyntax + +Due to the lack of stable plugins, making changes to libsyntax can +currently be quite disruptive to the ecosystem that relies on plugins. +In an effort to ease this pain, we generally try to batch up such +changes so that they occur all at once, rather than occurring in a +piecemeal fashion. In practice, this means that you should add: + + cc #31645 @Manishearth + +to the PR and avoid directly merging it. In the future we may develop +a more polished procedure here, but the hope is that this is a +relatively temporary state of affairs. + +# Drawbacks +[drawbacks]: #drawbacks + +Following this policy can require substantial effort and slows the +time it takes for a change to become final. However, this is far +outweighed by the benefits of avoiding sharp disruptions in the +ecosystem. + +# Alternatives +[alternatives]: #alternatives + +There are obviously many points that we could tweak in this policy: + +- Eliminate the tracking issue. +- Change the stabilization schedule. +- + +Two other obvious (and rather extreme) alternatives are not having a +policy and not making any sort of breaking change at all: + +- Not having a policy at all (as is the case today) encourages + inconsistent treatment of issues. +- Not making any sorts of breaking changes would mean that Rust simply + has to stop evolving, or else would issue new major versions quite + frequently, causing undue disruption.
+ +# Unresolved questions +[unresolved]: #unresolved-questions + +N/A + + + +[RFC 1122]: https://github.com/rust-lang/rfcs/blob/master/text/1122-language-semver.md +[breaking-change-issue]: https://gist.github.com/nikomatsakis/631ec8b4af9a18b5d062d9d9b7d3d967 From 08a98b4a65f8d577ab7815e4ca6733781f3a49d9 Mon Sep 17 00:00:00 2001 From: Sean Griffin Date: Fri, 22 Apr 2016 14:58:17 -0600 Subject: [PATCH 0887/1195] Add a `lifetime` specifier to `macro_rules!` --- text/0000-macro-lifetimes.md | 49 ++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) create mode 100644 text/0000-macro-lifetimes.md diff --git a/text/0000-macro-lifetimes.md b/text/0000-macro-lifetimes.md new file mode 100644 index 00000000000..91facfb8435 --- /dev/null +++ b/text/0000-macro-lifetimes.md @@ -0,0 +1,49 @@ +- Feature Name: Allow `lifetime` specifiers to be passed to macros +- Start Date: 2016-04-22 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Add a `lifetime` specifier for `macro_rules!` patterns, that matches any valid +lifetime. + +# Motivation +[motivation]: #motivation + +Certain classes of macros are completely impossible without the ability to pass +lifetimes. Specifically, anything that wants to implement a trait from inside of +a macro is going to need to deal with lifetimes eventually. They're also +commonly needed for any macros that need to deal with types in a more granular +way than just `ty`. + +Since a lifetime is a single token, there is currently no way to accept one +without an explicit matcher. Something like `'$lifetime:ident` will fail to +compile. + +# Detailed design +[design]: #detailed-design + +This RFC proposes adding `lifetime` as an additional specifier to +`macro_rules!` (alternatively: `life` or `lt`). Since a lifetime acts very much +like an identifier, and can appear in almost as many places, it can be handled +almost identically. 
A preliminary implementation can be found at +https://github.com/rust-lang/rust/pull/33135 + +# Drawbacks +[drawbacks]: #drawbacks + +None + +# Alternatives +[alternatives]: #alternatives + +A more general specifier, such as a "type parameter list", which would roughly +map to `ast::Generics` would cover most of the cases that matching lifetimes +individually would cover. + +# Unresolved questions +[unresolved]: #unresolved-questions + +None From 82a969b5e0f8d6e75a4d581b06ddb6ddd9c5dee7 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 22 Apr 2016 17:02:14 -0400 Subject: [PATCH 0888/1195] merge RFC #1440: drop in static, const-fn --- ...000-drop-types-in-const.md => 1440-drop-types-in-const.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-drop-types-in-const.md => 1440-drop-types-in-const.md} (94%) diff --git a/text/0000-drop-types-in-const.md b/text/1440-drop-types-in-const.md similarity index 94% rename from text/0000-drop-types-in-const.md rename to text/1440-drop-types-in-const.md index 92f79a19e7b..4455d580b38 100644 --- a/text/0000-drop-types-in-const.md +++ b/text/1440-drop-types-in-const.md @@ -1,7 +1,7 @@ - Feature Name: `drop_types_in_const` - Start Date: 2016-01-01 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1440](https://github.com/rust-lang/rfcs/pull/1440) +- Rust Issue: [rust-lang/rust#33156](https://github.com/rust-lang/rust/issues/33156) # Summary [summary]: #summary From 968a8c00042b4038e9f72190e8c3d254acfe04e3 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 22 Apr 2016 19:27:43 -0400 Subject: [PATCH 0889/1195] merge RFC #1399 also adjust from `#[repr(pack = "N")]` to `#[repr(packed = "N")]` --- text/{0000-repr-pack.md => 1399-repr-pack.md} | 25 ++++++++----------- 1 file changed, 10 insertions(+), 15 deletions(-) rename text/{0000-repr-pack.md => 1399-repr-pack.md} (82%) diff --git a/text/0000-repr-pack.md b/text/1399-repr-pack.md similarity index 82% rename 
from text/0000-repr-pack.md rename to text/1399-repr-pack.md index 447137bdf8f..c165ace4bce 100644 --- a/text/0000-repr-pack.md +++ b/text/1399-repr-pack.md @@ -1,12 +1,12 @@ -- Feature Name: `repr_pack` +- Feature Name: `repr_packed` - Start Date: 2015-12-06 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1399](https://github.com/rust-lang/rfcs/pull/1399) +- Rust Issue: [rust-lang/rust#33158](https://github.com/rust-lang/rust/issues/33158) # Summary [summary]: #summary -Extend the existing `#[repr]` attribute on structs with a `pack = "N"` option to +Extend the existing `#[repr]` attribute on structs with a `packed = "N"` option to specify a custom packing for `struct` types. # Motivation @@ -19,7 +19,7 @@ C/C++ libraries (such as Windows API which uses it pervasively making writing Rust libraries such as `winapi` challenging). At the moment the only way to work around the lack of a proper -`#[repr(pack = "N")]` attribute is to use `#[repr(packed)]` and then manually +`#[repr(packed = "N")]` attribute is to use `#[repr(packed)]` and then manually fill in padding which is a burdensome task. Even then that isn't quite right because the overall alignment of the struct would end up as 1 even though it needs to be N (or the default if that is smaller than N), so this fills in a gap @@ -31,7 +31,7 @@ which is impossible to do in Rust at the moment. The `#[repr]` attribute on `struct`s will be extended to include a form such as: ```rust -#[repr(pack = "2")] +#[repr(packed = "2")] struct LessAligned(i16, i32); ``` @@ -42,7 +42,7 @@ alignment of 4 and a size of 8, and the second field would have an offset of 4 from the base of the struct. Syntactically, the `repr` meta list will be extended to accept a meta item -name/value pair with the name "pack" and the value as a string which can be +name/value pair with the name "packed" and the value as a string which can be parsed as a `u64`. 
The restrictions on where this attribute can be placed along with the accepted values are: @@ -61,25 +61,22 @@ struct, then the alignment and layout of the struct should be unaffected. When combined with `#[repr(C)]` the size alignment and layout of the struct should match the equivalent struct in C. -`#[repr(packed)]` and `#[repr(pack = "1")]` should have identical behavior. +`#[repr(packed)]` and `#[repr(packed = "1")]` should have identical behavior. Because this lowers the effective alignment of fields in the same way that `#[repr(packed)]` does (which caused [issue #27060][gh27060]), while accessing a field should be safe, borrowing a field should be unsafe. -Specifying `#[repr(packed)]` and `#[repr(pack = "N")]` where N is not 1 should +Specifying `#[repr(packed)]` and `#[repr(packed = "N")]` where N is not 1 should result in an error. -Specifying `#[repr(pack = "A")]` and `#[repr(align = "B")]` should still pack +Specifying `#[repr(packed = "A")]` and `#[repr(align = "B")]` should still pack together fields with the packing specified, but then increase the overall alignment to the alignment specified. Depends on [RFC #1358][rfc1358] landing. # Drawbacks [drawbacks]: #drawbacks -Duplication in the language where `#[repr(packed)]` and `#[repr(pack = "1")]` -have identical behavior. - # Alternatives [alternatives]: #alternatives @@ -87,8 +84,6 @@ have identical behavior. `#[repr(packed)]` with manual padding, although such structs would always have an alignment of 1 which is often wrong. * Alternatively a new attribute could be used such as `#[pack]`. -* `#[repr(packed)]` could be extended as either `#[repr(packed(N))]` or - `#[repr(packed = "N")]`. 
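The layout claims made for `LessAligned` in the packing RFC (alignment 2, size 6) can be checked with a small sketch. It assumes the `#[repr(packed(N))]` spelling accepted by current stable compilers, not the `packed = "N"` form written in the RFC text, and the `main` function is only there to make the sketch runnable.

```rust
use std::mem::{align_of, size_of};

// Assumption: `packed(2)` is the modern spelling of the RFC's `packed = "2"`.
// Fields are packed to at most 2-byte alignment, so no padding is needed
// between the `i16` and the `i32`.
#[allow(dead_code)]
#[repr(packed(2))]
struct LessAligned(i16, i32);

// The same fields without any packing, for comparison.
#[allow(dead_code)]
struct Natural(i16, i32);

fn main() {
    // Packed to 2: alignment 2, and 2 + 4 = 6 bytes in total.
    assert_eq!(align_of::<LessAligned>(), 2);
    assert_eq!(size_of::<LessAligned>(), 6);

    // Unpacked: the `i32` forces alignment 4, which pads the size to 8.
    assert_eq!(align_of::<Natural>(), 4);
    assert_eq!(size_of::<Natural>(), 8);
}
```

The comparison type shows the padding that the packing attribute removes; nothing here touches the borrowing restrictions discussed in the RFC.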
# Unresolved questions [unresolved]: #unresolved-questions From bf9fca6c712eb5685fe3babcf9333cd2d1631ed3 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 22 Apr 2016 19:38:52 -0400 Subject: [PATCH 0890/1195] add unresolved question and change log to reflect rust-lang/rfcs#1319 --- text/1228-placement-left-arrow.md | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/text/1228-placement-left-arrow.md b/text/1228-placement-left-arrow.md index c5a76276328..ffeba08d696 100644 --- a/text/1228-placement-left-arrow.md +++ b/text/1228-placement-left-arrow.md @@ -196,4 +196,17 @@ lowest precedence) to highest in the language. The most prominent choices are: # Unresolved questions -None +**What should the precedence of the `<-` operator be?** In particular, +it may make sense for it to have the same precedence of `=`, as argued +in [these][huon1] [comments][huon2]. The ultimate answer here will +probably depend on whether the result of `a <- b` is commonly composed +and how, so it was decided to hold off on a final decision until there +was more usage in the wild. + +[huon1]: https://github.com/rust-lang/rfcs/pull/1319#issuecomment-206627750 +[huon2]: https://github.com/rust-lang/rfcs/pull/1319#issuecomment-207090495 + +# Change log + +**2016.04.22.** Amended by [rust-lang/rfcs#1319](https://github.com/rust-lang/rfcs/pull/1319) +to adjust the precedence. 
From e247a63eab1b8180c0e3fe028e3a489500782af7 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 22 Apr 2016 14:45:20 -0700 Subject: [PATCH 0891/1195] Reorganize RFC, hopefully clarify many points --- text/0000-cargo-workspace.md | 162 +++++++++++++++++------------------ 1 file changed, 78 insertions(+), 84 deletions(-) diff --git a/text/0000-cargo-workspace.md b/text/0000-cargo-workspace.md index 5b32490b249..817ff58937d 100644 --- a/text/0000-cargo-workspace.md +++ b/text/0000-cargo-workspace.md @@ -9,9 +9,6 @@ Improve Cargo's story around multi-crate single-repo project management by introducing the concept of workspaces. All packages in a workspace will share `Cargo.lock` and an output directory for artifacts. -Cargo will infer workspaces where possible, but it will also have knobs for -explicitly controlling what crates belong to which workspace. - # Motivation A common method to organize a multi-crate project is to have one @@ -57,59 +54,37 @@ conventional project layouts but will have explicit controls for configuration. First, let's look at the new manifest keys which will be added to `Cargo.toml`: ```toml -[package] -workspace = "../foo" - -# or ... - [workspace] members = ["relative/path/to/child1", "../child2"] -``` - -Here the `package.workspace` key is used to point at a workspace root. For -example this Cargo.toml indicates that the Cargo.toml in `../foo` is the -workspace that this package is a member of. -The root of a workspace, indicated by the presence of `[workspace]`, may also -explicitly specify some members of the workspace as well via the -`workspace.members` key. This example here means that two extra crates will be a -member of the workspace. +# or ... 
-### Implicit relations +[package] +workspace = "../foo" +``` -In addition to the keys above, Cargo will apply a few heuristics to infer the -keys wherever possible: +The root of a workspace, indicated by the presence of `[workspace]`, is +responsible for defining the entire workspace (listing all members). +This example here means that two extra crates will members of the workspace +(which also includes the root). -* All `path` dependencies of a crate are considered members of the same - workspace. -* If `package.workspace` isn't specified, then Cargo will walk upwards on the - filesystem until either a `Cargo.toml` with `[workspace]` is found or a VCS - root is found. +The `package.workspace` key is used to point at a workspace root. For +example this Cargo.toml indicates that the Cargo.toml in `../foo` is the +workspace root that this package is a member of. -These rules are intended to reflect some conventional Cargo project layouts. -"Root crates" typically appear at the root of a repository with lots path -dependencies to all other crates in a repo. Additionally, we don't want to -traverse wildly across the filesystem so we only go upwards to a fixed point or -downwards to specific locations. +These keys are mutually exclusive when applied in `Cargo.toml`. A crate may +*either* specify `package.workspace` or specify `[workspace]`. That is, a +crate cannot both be a root in a workspace (contain `[workspace]`) and also be +member of another workspace (contain `package.workspace`). ### "Virtual" `Cargo.toml` -A good number of projects do not have a root `Cargo.toml` at the top of a -repository, however. While the explicit `package.workspace` and -`workspace.members` keys should be enough to configure the workspace in addition -to the implicit relations above, this directory structure is common enough that -it shouldn't require *that* much more configuration. - -To accommodate this project layout, Cargo will now allow for "virtual manifest" -files. 
These manifests will currently **only** contains the `[workspace]` key -and will notably be lacking a `[project]` or `[package]` top level key. - -A virtual manifest does not itself define a crate, but can help when defining a -root. For example a `Cargo.toml` file at the root of a repository with a -`[workspace]` key plus `workspace.members` configuration would suffice for the -project configurations in question. Note that omitting `workspace.members` would -not be useful as there are no outgoing edges (no `path` dependencies), so Cargo -will emit an error in cases like this. +A good number of projects do not necessarily have a "root `Cargo.toml`" which is +an appropriate root for a workspace. To accommodate these projects and allow for +the output of a workspace to be configured regardless of where crates are +located, Cargo will now allow for "virtual manifest" files. These manifests will +currently **only** contains the `[workspace]` table and will notably be lacking +a `[project]` or `[package]` top level key. Cargo will for the time being disallow many commands against a virtual manifest, for example `cargo build` will be rejected. Arguments that take a package, @@ -117,39 +92,57 @@ however, such as `cargo test -p foo` will be allowed. Workspaces can eventually get extended with `--all` flags so in a workspace root you could execute `cargo build --all` to compile all crates. -### Constructing a workspace +### Validating a workspace -With the explicit and implicit relations defined above, each crate will have a -number of outgoing edges to other crates via `workspace.members`, path -dependencies, and `package.workspace`. Two crates are then in the same workspace -if they both transitively have edges to one another. A valid workspace then has -exactly one root crate with a `[workspace]` key. +A workspace is valid if these two properties hold: + +1. A workspace has only one root crate (that with `[workspace]` in + `Cargo.toml`). +2. 
All workspace crates defined in `workspace.members` point back to the + workspace root with `package.workspace`. While the restriction of one-root-per workspace may make sense, the restriction -of crates transitively having edges to one another may seem a bit odd. If, -however, this restriction were not in place then the set of crates in a -workspace may differ depending on which crate it was viewed from. For example if -crate A has a path dependency on B then it will think B is in A's workspace. If, -however, A was not in B's filesystem hierarchy, then B would not think that A -was in its workspace. This would in turn cause the set of crates in each -workspace to be different, futher causing `Cargo.lock` to get out of sync if it -were allowed. By ensuring that all crates have edges to each other in a -workspace Cargo can prevent this situation and guarantee robust builds no matter -where they're executed in the workspace. - -To alleviate misconfiguration, however, if the `workspace.members` -configuration key contains a crate which is not a member of the constructed -workspace, Cargo will emit an error indicating as such. +of crates pointing back to the root may not. If, however, this restriction were +not in place then the set of crates in a workspace may differ depending on +which crate it was viewed from. For example if workspace root A includes B then +it will think B is in A's workspace. If, however, B does ont point back to A, +then B would not think that A was in its workspace. This would in turn cause the +set of crates in each workspace to be different, futher causing `Cargo.lock` to +get out of sync if it were allowed. By ensuring that all crates have edges to +each other in a workspace Cargo can prevent this situation and guarantee robust +builds no matter where they're executed in the workspace. + +To alleviate misconfiguration Cargo will emit an error if the two properties +above hold for any crate attempting to be part of a workspace. 
For example, if +the `package.workspace` key is specified, but the crate is not a workspace root +or doesn't point back to the original crate an error is emitted. + +### Implicit relations + +The combination of the `package.workspace` key and `[workspace]` table is enough +to specify any workspace in Cargo. Having to annotate all crates with a +`package.workspace` parent or a `workspace.members` list can get quite tedious, +however! To alleviate this configuration burden Cargo will allow these keys to +be implicitly defined in some situations. + +The `package.workspace` can be omitted if it would only contain `../` (or some +repetition of it). That is, if the root of a workspace is hierarchically the +first `Cargo.toml` with `[workspace]` above a crate in the filesystem, then that +crate can omit the `package.workspace` key. + +Next, a crate which specifies `[workspace]` **without a `members` key** will +transitively crawl `path` dependencies to fill in this key. This way all `path` +dependencies (and recursively their own `path` dependencies) will inherently +become the default value for `workspace.members`. + +Note that these implicit relations will be subject to the same validations +mentioned above for all of the explicit configuration as well. ### Workspaces in practice -A conventional layout for a Rust project is to have a `Cargo.toml` at the root -with the "main project" with dependencies and/or satellite projects underneath. -Consequently the conventional layout will only need a `[workspace]` key added to -the root to benefit from the workspaces proposed in this RFC. For example, all -of these project layouts (with `/` being the root of a repository) will only -require the addition of `[workspace]` in the root to have all crates be members -of a workspace: +Many Rust projects today already have `Cargo.toml` at the root of a repository, +and with the small addition of `[workspace]` in the root a workspace will be +ready for all crates in that repository. 
For example: * An FFI crate with a sub-crate for FFI bindings @@ -174,11 +167,6 @@ of a workspace: src/ ``` -Projects like the compiler, however, will likely need explicit configuration. -The `rust` repo conceptually has two workspaces, the standard library and the -compiler, and these would need to be manually configured with -`workspace.members` and `package.workspace` keys amongst all crates. - Some examples of layouts that will require extra configuration, along with the configuration necessary, are: @@ -275,6 +263,11 @@ configuration necessary, are: workspace = "../root" ``` +Projects like the compiler will likely need exhaustively explicit configuration. +The `rust` repo conceptually has two workspaces, the standard library and the +compiler, and these would need to be manually configured with +`workspace.members` and `package.workspace` keys amongst all crates. + ### Lockfile and override interactions One of the main features of a workspace is that only one `Cargo.lock` is @@ -299,6 +292,13 @@ be applied relative to whatever crate is being compiled (not the workspace root). These are intended for much more local testing, so no restriction of "must be in the root" should be necessary. +Note that this change to the lockfile format is technically incompatible with +older versions of Cargo.lock, but the entire workspaces feature is also +incompatible with older versions of Cargo. This will require projects that wish +to work with workspaces and multiple versions of Cargo to check in multiple +`Cargo.lock` files, but if projects avoid workspaces then Cargo will remain +forwards and backwards compatible. + ### Future Extensions Once Cargo understands a workspace of crates, we could easily extend various @@ -313,12 +313,6 @@ show that workspaces can be used to solve other existing issues in Cargo. # Drawbacks -* This change is not backwards compatible with older versions of Cargo.lock. 
For - example if a newer cargo were used to develop a repository which otherwise is - developed with older versions of Cargo, the `Cargo.lock` files generated would - be incompatible. If all maintainers agree on versions of Cargo, however, this - is not a problem. - * As proposed there is no method to disable implicit actions taken by Cargo. It's unclear what the use case for this is, but it could in theory arise. From 942c40ed55af07930ded53a523c5571a74bcdf16 Mon Sep 17 00:00:00 2001 From: Sean Griffin Date: Sat, 23 Apr 2016 08:49:13 -0600 Subject: [PATCH 0892/1195] Note that `tt` can be used to match a lifetime --- text/0000-macro-lifetimes.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/text/0000-macro-lifetimes.md b/text/0000-macro-lifetimes.md index 91facfb8435..6b29deb19fa 100644 --- a/text/0000-macro-lifetimes.md +++ b/text/0000-macro-lifetimes.md @@ -18,9 +18,11 @@ a macro is going to need to deal with lifetimes eventually. They're also commonly needed for any macros that need to deal with types in a more granular way than just `ty`. -Since a lifetime is a single token, there is currently no way to accept one -without an explicit matcher. Something like `'$lifetime:ident` will fail to -compile. +Since a lifetime is a single token, the only way to match against a lifetime is +by capturing it as `tt`. Something like `'$lifetime:ident` would fail to +compile. This is extremely limiting, as it becomes difficult to sanitize input, +and `tt` is extremely difficult to use in a sequence without using awkward +separators. 
# Detailed design [design]: #detailed-design From 5ce14f3d5e7391d7413c47def089f4016ccc979b Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Sun, 24 Apr 2016 10:52:37 -0700 Subject: [PATCH 0893/1195] Typos and such --- text/0000-cargo-workspace.md | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/text/0000-cargo-workspace.md b/text/0000-cargo-workspace.md index 817ff58937d..27516a4185a 100644 --- a/text/0000-cargo-workspace.md +++ b/text/0000-cargo-workspace.md @@ -65,7 +65,7 @@ workspace = "../foo" The root of a workspace, indicated by the presence of `[workspace]`, is responsible for defining the entire workspace (listing all members). -This example here means that two extra crates will members of the workspace +This example here means that two extra crates will be members of the workspace (which also includes the root). The `package.workspace` key is used to point at a workspace root. For @@ -75,7 +75,7 @@ workspace root that this package is a member of. These keys are mutually exclusive when applied in `Cargo.toml`. A crate may *either* specify `package.workspace` or specify `[workspace]`. That is, a crate cannot both be a root in a workspace (contain `[workspace]`) and also be -member of another workspace (contain `package.workspace`). +a member of another workspace (contain `package.workspace`). ### "Virtual" `Cargo.toml` @@ -105,17 +105,17 @@ While the restriction of one-root-per workspace may make sense, the restriction of crates pointing back to the root may not. If, however, this restriction were not in place then the set of crates in a workspace may differ depending on which crate it was viewed from. For example if workspace root A includes B then -it will think B is in A's workspace. If, however, B does ont point back to A, +it will think B is in A's workspace. If, however, B does not point back to A, then B would not think that A was in its workspace. 
This would in turn cause the -set of crates in each workspace to be different, futher causing `Cargo.lock` to +set of crates in each workspace to be different, further causing `Cargo.lock` to get out of sync if it were allowed. By ensuring that all crates have edges to each other in a workspace Cargo can prevent this situation and guarantee robust builds no matter where they're executed in the workspace. To alleviate misconfiguration Cargo will emit an error if the two properties -above hold for any crate attempting to be part of a workspace. For example, if -the `package.workspace` key is specified, but the crate is not a workspace root -or doesn't point back to the original crate an error is emitted. +above do not hold for any crate attempting to be part of a workspace. For +example, if the `package.workspace` key is specified, but the crate is not a +workspace root or doesn't point back to the original crate an error is emitted. ### Implicit relations @@ -308,6 +308,9 @@ subcommands with a `--all` flag to perform tasks such as: * Build all binaries for a set of crates within a workspace * Publish all crates in a workspace if necessary to crates.io +Furthermore, workspaces could start to deduplicate metadata among crates like +version numbers, URL information, authorship, etc. + This support isn't proposed to be added in this RFC specifically, but simply to show that workspaces can be used to solve other existing issues in Cargo. 
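The two validation properties from the workspace RFC (a single root carrying `[workspace]`, and every listed member pointing back to that root through `package.workspace`) can be sketched as a small check. This is a minimal model under stated assumptions, not Cargo's implementation: `Manifest`, `validate_workspace`, and the string identifiers standing in for manifest paths are all hypothetical, and the implicit relations (omitted keys, `path` crawling) are deliberately left out.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified view of one `Cargo.toml`; not Cargo's data model.
#[derive(Debug, Clone, Default)]
struct Manifest {
    /// `Some(members)` when the manifest contains a `[workspace]` table.
    workspace_members: Option<Vec<String>>,
    /// `Some(root)` when `package.workspace` is set.
    package_workspace: Option<String>,
}

/// Checks the explicit configuration only: one root, and every member
/// pointing back to it.
fn validate_workspace(
    root: &str,
    manifests: &BTreeMap<String, Manifest>,
) -> Result<(), String> {
    let root_manifest = manifests
        .get(root)
        .ok_or_else(|| format!("no manifest found for root `{}`", root))?;
    let members = root_manifest
        .workspace_members
        .as_ref()
        .ok_or_else(|| format!("`{}` has no [workspace] table", root))?;
    // The two keys are mutually exclusive within one manifest.
    if root_manifest.package_workspace.is_some() {
        return Err(format!(
            "`{}` sets both [workspace] and package.workspace",
            root
        ));
    }
    for member in members {
        let manifest = manifests
            .get(member)
            .ok_or_else(|| format!("member `{}` not found", member))?;
        // Property 1: only one root per workspace.
        if manifest.workspace_members.is_some() {
            return Err(format!("member `{}` is itself a workspace root", member));
        }
        // Property 2: members must point back to the root.
        if manifest.package_workspace.as_deref() != Some(root) {
            return Err(format!(
                "member `{}` does not point back to `{}`",
                member, root
            ));
        }
    }
    Ok(())
}

fn main() {
    let mut manifests = BTreeMap::new();
    manifests.insert(
        "root".to_string(),
        Manifest {
            workspace_members: Some(vec!["child1".to_string()]),
            package_workspace: None,
        },
    );
    manifests.insert(
        "child1".to_string(),
        Manifest {
            workspace_members: None,
            package_workspace: Some("root".to_string()),
        },
    );
    assert!(validate_workspace("root", &manifests).is_ok());

    // A member that does not point back to the root is rejected.
    manifests.insert("child1".to_string(), Manifest::default());
    assert!(validate_workspace("root", &manifests).is_err());
}
```

Requiring the back-pointer is what makes the set of workspace crates the same from every crate's point of view, which is the property the RFC relies on to keep a single `Cargo.lock` consistent.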
From b2c857738c04bf687a51692052dd9174b07b74f2 Mon Sep 17 00:00:00 2001 From: Ariel Ben-Yehuda Date: Wed, 27 Apr 2016 16:56:03 +0300 Subject: [PATCH 0894/1195] all but the last field of a tuple must be Sized --- text/1214-projections-lifetimes-and-wf.md | 30 ++++++++++++++--------- 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/text/1214-projections-lifetimes-and-wf.md b/text/1214-projections-lifetimes-and-wf.md index 9364d3dfe04..f1da694a143 100644 --- a/text/1214-projections-lifetimes-and-wf.md +++ b/text/1214-projections-lifetimes-and-wf.md @@ -366,7 +366,7 @@ For example, in practice, many iterator implementation break due to region relationships: ```rust -impl<'a, T> IntoIterator for &'a LinkedList { +impl<'a, T> IntoIterator for &'a LinkedList { type Item = &'a T; ... } @@ -402,14 +402,14 @@ types: | T // Type O = for TraitId // Object type fragment r = 'x // Region name - + We'll use this to describe the rules in detail. A quick note on terminology: an "object type fragment" is part of an object type: so if you have `Box`, `FnMut()` and `Send` are object type fragments. Object type fragments are identical to full trait references, except that they do not have a self type (no `P0`). - + ### Syntactic definition of the outlives relation The outlives relation is defined in purely syntactic terms as follows. @@ -454,8 +454,8 @@ or projections are involved: OutlivesFragment: ∀i. R,r.. ⊢ Pi: 'a -------------------------------------------------- - R ⊢ for TraitId: 'a - + R ⊢ for TraitId: 'a + #### Outlives for lifetimes The outlives relation for lifetimes depends on whether the lifetime in @@ -487,7 +487,7 @@ lifetime is not yet known. This means for example that `for<'a> fn(&'a i32): 'x` holds, even though we do not yet know what region `'a` is (and in fact it may be instantiated many times with different values on each call to the fn). 
- + OutlivesRegionBound: 'x ∈ R // bound region -------------------------------------------------- @@ -525,7 +525,7 @@ but reflects the behavior of my prototype implementation.) <> ⊢ >::Id: 'a OutlivesProjectionTraitDef: - WC = [Xi => Pi] WhereClauses(Trait) + WC = [Xi => Pi] WhereClauses(Trait) >::Id: 'b in WC <> ⊢ 'b: 'a -------------------------------------------------- @@ -643,7 +643,7 @@ form: ``` C = r0: r1 | C AND C -``` +``` This is convenient because a simple fixed-point iteration suffices to find the minimal regions which satisfy the constraints. @@ -719,6 +719,7 @@ declare one), but we'll take those basic conditions for granted. WfTuple: ∀i. R ⊢ Ti WF + ∀i TraitId - + Note that we don't check the where clauses declared on the trait itself. These are checked when the object is created. The reason not to check them here is because the `Self` type is not known (this is an @@ -1024,15 +1025,15 @@ that a projection outlives `'a` if its inputs outlive `'a`. To start, let's specify the projection `` as: >::Id - + where `P` can be a lifetime or type parameter as appropriate. - + Then we know that there exists some impl of the form: ```rust impl Trait for Q0 { type Id = T; -} +} ``` Here again, `X` can be a lifetime or type parameter name, and `Q` can @@ -1105,6 +1106,11 @@ then `R ⊢ P': 'a`. Proceed by induction and by cases over the form of `P`: in a type outlive `'a`, then the type outlives `'a`. Follows by inspection of the outlives rules. 
+# Edit History + +[RFC1592] - amend to require that tuple fields be sized + [crater-errors]: https://gist.github.com/nikomatsakis/2f851e2accfa7ba2830d#root-regressions-sorted-by-rank [crater-all]: https://gist.github.com/nikomatsakis/364fae49de18268680f2#root-regressions-sorted-by-rank [#21953]: https://github.com/rust-lang/rust/issues/21953 +[RFC1592]: https://github.com/rust-lang/rfcs/pull/1592 \ No newline at end of file From 17d68e5f0054d1e809c3f9c1ec64fadf64bc9be8 Mon Sep 17 00:00:00 2001 From: Sean Griffin Date: Wed, 27 Apr 2016 18:13:24 -0600 Subject: [PATCH 0895/1195] Add an explicit mention of the follow rules for the specifier --- text/0000-macro-lifetimes.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/text/0000-macro-lifetimes.md b/text/0000-macro-lifetimes.md index 6b29deb19fa..0580cd0ab31 100644 --- a/text/0000-macro-lifetimes.md +++ b/text/0000-macro-lifetimes.md @@ -28,9 +28,12 @@ separators. [design]: #detailed-design This RFC proposes adding `lifetime` as an additional specifier to -`macro_rules!` (alternatively: `life` or `lt`). Since a lifetime acts very much +`macro_rules!` (alternatively: `life` or `lt`). As it is a single token, it is +able to be followed by any other specifier. Since a lifetime acts very much like an identifier, and can appear in almost as many places, it can be handled -almost identically. A preliminary implementation can be found at +almost identically. 
+ +A preliminary implementation can be found at https://github.com/rust-lang/rust/pull/33135 # Drawbacks From 1b9be7363d8f91db6e60a9189a78cefdd5c41986 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 28 Apr 2016 10:20:51 -0700 Subject: [PATCH 0896/1195] Update with a simple interaction with #[repr(packed)] --- text/0000-repr-align.md | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/text/0000-repr-align.md b/text/0000-repr-align.md index 16d6e734957..3dfc81f9f74 100644 --- a/text/0000-repr-align.md +++ b/text/0000-repr-align.md @@ -61,16 +61,20 @@ with the accepted values are: should be a backwards-compatible extension. * Alignment values must be a power of two. -A custom alignment cannot *decrease* the alignment of a structure unless it is -also declared with `#[repr(packed)]` (to mirror what C does in this regard), but -it can increase the alignment (and hence size) of a structure (as shown -above). +Multiple `#[repr(align = "..")]` directives are accepted on a struct +declaration, and the actual alignment of the structure will be the maximum of +all `align` directives and the natural alignment of the struct itself. Semantically, it will be guaranteed (modulo `unsafe` code) that custom alignment will always be respected. If a pointer to a non-aligned structure exists and is used then it is considered unsafe behavior. Local variables, objects in arrays, statics, etc, will all respect the custom alignment specified for a type. +The `#[repr(align)]` attribute will not interact with `#[repr(packed)]`. That +is, the `#[repr(packed)]` controls the orthogonal attribute of a structure of +how the fields are packed, and the `#[repr(align)]` attribute only controls the +alignment of the overall structure. 
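A minimal runnable sketch of the "maximum of the requested and natural alignment" rule described above. It assumes the `#[repr(align(N))]` spelling that stable Rust accepts today, not the `align = ".."` form proposed in this draft. It also assumes a target where `i32` is 4-byte aligned, and `layout_of` is a hypothetical helper introduced only for illustration:

```rust
use std::mem::{align_of, size_of};

// Raising alignment: 16 exceeds the natural alignment of `i32` (4 on
// common targets), so both the alignment and the padded size become 16.
#[allow(dead_code)]
#[repr(align(16))]
struct Align16(i32);

// Requesting less than the natural alignment has no effect.
#[allow(dead_code)]
#[repr(align(1))]
struct Align1(i32);

// Raising alignment need not change the size: the fields occupy 13 bytes,
// which the natural alignment of 4 already pads out to 16.
#[allow(dead_code)]
#[repr(align(8))]
struct Align8Many {
    a: i32,
    b: i32,
    c: i32,
    d: u8,
}

/// Returns `(alignment, size)` of `T` in bytes.
fn layout_of<T>() -> (usize, usize) {
    (align_of::<T>(), size_of::<T>())
}

fn main() {
    println!("Align16:    {:?}", layout_of::<Align16>());
    println!("Align1:     {:?}", layout_of::<Align1>());
    println!("Align8Many: {:?}", layout_of::<Align8Many>());
}
```

The interaction with `#[repr(packed)]` is deliberately left out of this sketch, since it depends on the semantics this RFC is still settling.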
+ # Drawbacks [drawbacks]: #drawbacks From 996891437567f3b90867cdd03d2ea7e53f8d31f6 Mon Sep 17 00:00:00 2001 From: Sean Griffin Date: Sat, 30 Apr 2016 14:04:12 -0600 Subject: [PATCH 0897/1195] Update the alternative section for #1268 This RFC was written at a time where the specialization RFC appeared to include the lattice rule. Since the RFC was accepted without it, this RFC implies that it is superseded by specialization, but it is not. --- text/1268-allow-overlapping-impls-on-marker-traits.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/text/1268-allow-overlapping-impls-on-marker-traits.md b/text/1268-allow-overlapping-impls-on-marker-traits.md index 0df66d3faa7..9ae0b4cb450 100644 --- a/text/1268-allow-overlapping-impls-on-marker-traits.md +++ b/text/1268-allow-overlapping-impls-on-marker-traits.md @@ -112,9 +112,10 @@ probably be considered an acceptable breakage. # Alternatives -Once specialization lands, there does not appear to be a case that is impossible -to write, albeit with some additional boilerplate, as you'll have to manually -specify the empty impl for any overlap that might occur. +If the lattice rule for specialization is eventually accepted, there does not +appear to be a case that is impossible to write, albeit with some additional +boilerplate, as you'll have to manually specify the empty impl for any overlap +that might occur. 
# Unresolved questions From 75c895fd71875d29e30af664fd75a8e05b1e9348 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 2 May 2016 11:06:50 -0700 Subject: [PATCH 0898/1195] Clarify some interactions with #[repr(packed)] --- text/0000-repr-align.md | 71 +++++++++++++++++++++++++++++++++++------ 1 file changed, 62 insertions(+), 9 deletions(-) diff --git a/text/0000-repr-align.md b/text/0000-repr-align.md index 3dfc81f9f74..2a048398f6f 100644 --- a/text/0000-repr-align.md +++ b/text/0000-repr-align.md @@ -65,15 +65,68 @@ Multiple `#[repr(align = "..")]` directives are accepted on a struct declaration, and the actual alignment of the structure will be the maximum of all `align` directives and the natural alignment of the struct itself. -Semantically, it will be guaranteed (modulo `unsafe` code) that custom alignment -will always be respected. If a pointer to a non-aligned structure exists and is -used then it is considered unsafe behavior. Local variables, objects in arrays, -statics, etc, will all respect the custom alignment specified for a type. - -The `#[repr(align)]` attribute will not interact with `#[repr(packed)]`. That -is, the `#[repr(packed)]` controls the orthogonal attribute of a structure of -how the fields are packed, and the `#[repr(align)]` attribute only controls the -alignment of the overall structure. +Semantically, it will be guaranteed (modulo `unsafe` code and `#[repr(packed)]`) +that custom alignment will always be respected. If a pointer to a non-aligned +structure exists and is used then it is considered unsafe behavior. Local +variables, objects in arrays, statics, etc, will all respect the custom +alignment specified for a type. + +The `#[repr(align)]` attribute will not interact with `#[repr(packed)]` in the +sense that the `packed` attribute only affects *field alignment* whereas `align` +affects the *struct alignment*.
The `packed` attribute may indirectly lower struct +alignment by lowering the alignment of fields, and then `align` may raise the +overall struct alignment. + +Some examples of `#[repr(align)]` are: + +```rust +// Raising alignment +#[repr(align = "16")] +struct Align16(i32); + +assert_eq!(mem::align_of::<Align16>(), 16); +assert_eq!(mem::size_of::<Align16>(), 16); + +// Lowering has no effect +#[repr(align = "1")] +struct Align1(i32); + +assert_eq!(mem::align_of::<Align1>(), 4); +assert_eq!(mem::size_of::<Align1>(), 4); + +// Multiple attributes take the max +#[repr(align = "8", align = "4")] +#[repr(align = "16")] +struct AlignMany(i32); + +assert_eq!(mem::align_of::<AlignMany>(), 16); +assert_eq!(mem::size_of::<AlignMany>(), 16); + +// Raising alignment may not alter size. +#[repr(align = "8")] +struct Align8Many { + a: i32, + b: i32, + c: i32, + d: u8, +} + +assert_eq!(mem::align_of::<Align8Many>(), 8); +assert_eq!(mem::size_of::<Align8Many>(), 16); + +// Raising alignment beyond the packed value +#[repr(align = "4", packed = "2")] +struct AlignAndPacked { + a: u16, + b: i32, +} + +assert_eq!(mem::align_of::<AlignAndPacked>(), 4); +assert_eq!(mem::size_of::<AlignAndPacked>(), 8); +assert_eq!(offset_of!(AlignAndPacked, a), 0); +assert_eq!(offset_of!(AlignAndPacked, b), 2); +``` + # Drawbacks [drawbacks]: #drawbacks From 73b4897dbe8c573600f3e16f37739060a99bf198 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Thu, 28 Apr 2016 15:31:55 +1200 Subject: [PATCH 0899/1195] This RFC proposes a process for deciding detailed guidelines for code formatting, and default settings for Rustfmt. The outcome of the process should be an approved formatting style defined by a style guide and enforced by Rustfmt. This RFC proposes creating a new repository under the [rust-lang](https://github.com/rust-lang) organisation called fmt-rfcs. It will be operated in a similar manner to the [RFCs repository](https://github.com/rust-lang/rfcs), but restricted to formatting issues.
A new [sub-team](https://github.com/rust-lang/rfcs/blob/master/text/1068-rust-governance.md#subteams) will be created to deal with those RFCs. Both the team and repository are expected to be temporary. Once the style guide is complete, the team can be disbanded and the repository frozen. --- text/0000-style-rfcs.md | 323 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 323 insertions(+) create mode 100644 text/0000-style-rfcs.md diff --git a/text/0000-style-rfcs.md b/text/0000-style-rfcs.md new file mode 100644 index 00000000000..d82896ba861 --- /dev/null +++ b/text/0000-style-rfcs.md @@ -0,0 +1,323 @@ +- Feature Name: N/A +- Start Date: 2016-04-21 +- RFC PR: (leave this empty) +- Rust Issue: N/A + + +# Summary +[summary]: #summary + +This RFC proposes a process for deciding detailed guidelines for code +formatting, and default settings for Rustfmt. The outcome of the process should +be an approved formatting style defined by a style guide and enforced by +Rustfmt. + +This RFC proposes creating a new repository under the [rust-lang](https://github.com/rust-lang) +organisation called fmt-rfcs. It will be operated in a similar manner to the +[RFCs repository](https://github.com/rust-lang/rfcs), but restricted to +formatting issues. A new [sub-team](https://github.com/rust-lang/rfcs/blob/master/text/1068-rust-governance.md#subteams) +will be created to deal with those RFCs. Both the team and repository are +expected to be temporary. Once the style guide is complete, the team can be +disbanded and the repository frozen. + + +# Motivation +[motivation]: #motivation + +There is a need to decide on detailed guidelines for the format of Rust code. A +uniform, language-wide formatting style makes comprehending new code-bases +easier and forestalls bikeshedding arguments in teams of Rust users. The utility +of such guidelines has been proven by Go, amongst other languages. 
+ +The [Rustfmt](https://github.com/rust-lang-nursery/rustfmt) tool is +[reaching maturity](https://users.rust-lang.org/t/please-help-test-rustfmt/5386) +and currently enforces a somewhat arbitrary, lightly discussed style, with many +configurable options. + +If Rustfmt is to become a widely accepted tool, there needs to be a process for +the Rust community to decide on the default style, and how configurable that +style should be. + +These discussions should happen in the open and be highly visible. It is +important that the Rust community has significant input to the process. The RFC +repository would be an ideal place to have this discussion because it exists to +satisfy these goals, and is tried and tested. However, the discussion is likely +to be a high-bandwidth one (code style is a contentious and often subjective +topic, and syntactic RFCs tend to be the highest traffic ones). Therefore, +having the discussion on the RFCs repository could easily overwhelm it and make +it less useful for other important discussions. + +There currently exists a [style guide](https://github.com/rust-lang/rust/tree/master/src/doc/style) +as part of the Rust documentation. This is far more wide-reaching than just +formatting style, but also not detailed enough to specify Rustfmt. This was +originally developed in its [own repository](https://github.com/rust-lang/rust-guidelines), +but is now part of the main Rust repository. That seems like a poor venue for +discussion of these guidelines due to visibility. + + +# Detailed design +[design]: #detailed-design + +## Process + +The process for style RFCs will mostly follow the [process for other RFCs](https://github.com/rust-lang/rfcs). +Anyone may submit an RFC. An overview of the process is: + +* If there is no single, obvious style, then open a GitHub issue on the + fmt-rfcs repo for initial discussion. This initial discussion should identify + which Rustfmt options are required to enforce the guideline. 
+* Implement the style in rustfmt (behind an option if it is not the current + default). In exceptional circumstances (such as where the implementation would + require very deep changes to rustfmt), this step may be skipped. +* Write an RFC formalising the formatting convention and referencing the + implementation, submit as a PR to fmt-rfcs. The RFC should include the default + values for options to enforce the guideline and which non-default options + should be kept. +* The RFC PR will be triaged by the style team and either assigned to a team + member for [shepherding](https://github.com/rust-lang/rfcs#the-role-of-the-shepherd), + or closed. +* When discussion has reached a fixed point, the RFC PR will be put into a final + comment period (FCP). +* After FCP, the RFC will either be accepted and merged or closed. +* Implementation in Rustfmt can then be finished (including any changes due to + discussion of the RFC), and defaults are set. + + +### Scope of the process + +This process is specifically limited to formatting style guidelines which can be +enforced by Rustfmt with its current architecture. Guidelines that cannot be +enforced by Rustfmt without a large amount of work are out of scope, even if +they only pertain to formatting. + +Note whether Rustfmt should be configurable at all, and if so how configurable +is a decision that should be dealt with using the formatting RFC process. That +will be a rather exceptional RFC. + +### Size of RFCs + +RFCs should be self-contained and coherent, whilst being as small as possible to +keep discussion focused. For example, an RFC on 'arithmetic and logic +expressions' is about the right size; 'expressions' would be too big, and +'addition' would be too small. + + +### When is a guideline ready for RFC? + +The purpose of the style RFC process is to foster an open discussion about style +guidelines. Therefore, RFC PRs should be made early rather than late. 
It is +expected that there may be more discussion and changes to style RFCs than is +typical for Rust RFCs. However, at submission, RFC PRs should be completely +developed and explained to the level where they can be used as a specification. + +A guideline should usually be implemented in Rustfmt **before** an RFC PR is +submitted. The RFC should be used to select an option to be the default +behaviour, rather than to identify a range of options. An RFC can propose a +combination of options (rather than a single one) as default behaviour. An RFC +may propose some reorganisation of options. + +Usually a style should be widely used in the community before it is submitted as +an RFC. Where multiple styles are used, they should be covered as alternatives +in the RFC, rather than being submitted as multiple RFCs. In some cases, a style +may be proposed without wide use (we don't want to discourage innovation), +however, it should have been used in *some* real code, rather than just being +sketched out. + + +### Triage + +RFC PRs are triaged by the style team. An RFC may be closed during triage (with +feedback for the author) if the style team think it is not specified in enough +detail, has too narrow or broad scope, or is not appropriate in some way (e.g., +applies to more than just formatting). Otherwise, the PR will be assigned a +shepherd as for other RFCs. + + +### FCP + +FCP will last for two weeks (assuming the team decide to meet every two weeks) +and will be announced in the style team sub-team report. + + +### Decision and post-decision process + +The style team will make the ultimate decision on accepting or closing a style +RFC PR. Decisions should be by consensus. Most discussion should take place on +the PR comment thread, a decision should ideally be made when consensus is +reached on the thread. Any additional discussion amongst the style team will be +summarised on the thread. + +If an RFC PR is accepted, it will be merged. 
An issue for implementation will be +filed in the appropriate place (usually the Rustfmt repository) referencing the +RFC. If the style guide needs to be updated, then an issue for that should be +filed on the Rust repository. + +The author of an RFC is not required to implement the guideline. If you are +interested in working on the implementation for an 'active' RFC, but cannot +determine if someone else is already working on it, feel free to ask (e.g. by +leaving a comment on the associated issue). + + +## The fmt-rfcs repository + +The form of the fmt-rfcs repository will follow the rfcs repository. Accepted +RFCs will live in a `text` directory, the `README.md` will include information +taken from this RFC, there will be an RFC template in the root of the +repository. Issues on the repository can be used for placeholders for future +RFCs and for preliminary discussion. + +The RFC format will be illustrated by the RFC template. It will have the +following sections: + +* summary +* details +* implementation +* rationale +* alternatives +* unresolved questions + +The 'details' section should contain examples of both what should and shouldn't +be done, cover simple and complex cases, and the interaction with other style +guidelines. + +The 'implementation' section should specify how options must be set to enforce +the guideline, and what further changes (including additional options) are +required. It should specify any renaming, reorganisation, or removal of options. + +The 'rationale' section should motivate the choices behind the RFC. It should +reference existing code bases which use the proposed style. 'Alternatives' +should cover alternative possible guidelines, if appropriate. + +Guidelines may include more than one acceptable rule, but should offer +guidance for when to use each rule (which should be formal enough to be used by +a tool). 
+ +For example: "a struct literal must be formatted either on a single line (with +spaces after the opening brace and before the closing brace, and with fields +separated by commas and spaces), or on multiple lines (with one field per line +and newlines after the opening brace and before the closing brace). The former +approach should be used for short struct literals, the latter for longer struct +literals. For tools, the first approach should be used when the width of the +fields (excluding commas and braces) is 16 characters. E.g., + +``` +let x = Foo { a: 42, b: 34 }; +let y = Foo { + a: 42, + b: 34, + c: 1000 +}; +``` +" + +(Note this is just an example, not a proposed guideline). + +The repository in embryonic form lives at [nrc/fmt-rfcs](https://github.com/nrc/fmt-rfcs). +It illustrates what [issues](https://github.com/nrc/fmt-rfcs/issues/1) and +[PRs](https://github.com/nrc/fmt-rfcs/pull/2) might look like, as well as +including the RFC template. Note that typically there should be more discussion +on an issue before submitting an RFC PR. + +The repository should be updated as this RFC develops, and moved to the rust-lang +GitHub organisation if this RFC is accepted. + + +## The style team + +The style [sub-team](https://github.com/rust-lang/rfcs/blob/master/text/1068-rust-governance.md#subteams) +will be responsible for handling style RFCs and making decisions related to +code style and formatting. + +Per the [governance RFC](https://github.com/rust-lang/rfcs/blob/master/text/1068-rust-governance.md), +the core team would pick a leader who would then pick the rest of the team. I +propose that the team should include members representative of the following +areas: + +* Rustfmt, +* the language, tools, and libraries sub-teams (since each has a stake in code style), +* large Rust projects. + +Because activity such as this hasn't been done before in the Rust community, it +is hard to identify suitable candidates for the team ahead of time.
The team +will probably start small and consist of core members of the Rust community. I +expect that once the process gets underway the team can be rapidly expanded with +community members who are active in the fmt-rfcs repository (i.e., submitting +and constructively commenting on RFCs). + +There will be a dedicated irc channel for discussion on formatting issues: +`#rust-style`. + + +## Style guide + +The [existing style guide](https://github.com/rust-lang/rust/tree/master/src/doc/style) +will be split into two guides: one dealing with API design and similar issues +which will be managed by the libs team, and one dealing with formatting issues +which will be managed by the style team. Note that the formatting part of the +guide may include guidelines which are not enforced by Rustfmt. Those are outside +the scope of the process defined in this RFC, but still belong in that part of +the style guide. + +When RFCs are accepted the style guide may need to be updated. Towards the end +of the process, the style team should audit and edit the guide to ensure it is a +coherent document. + + +## Material goals + +Hopefully, the style guideline process will have limited duration, one year +seems reasonable. After that time, style guidelines for new syntax could be +included with regular RFCs, or the fmt-rfcs repository could be maintained in a +less active fashion. + +At the end of the process, the fmt-rfcs repository should be a fairly complete +guide for formatting Rust code, and useful as a specification for Rustfmt and +tools with similar goals, such as IDEs. In particular, there should be a +decision made on how configurable Rustfmt should be, and an agreed set of +default options. The formatting style guide in the Rust repository should be a +more human-friendly source of formatting guidelines, and should be in sync with +the fmt-rfcs repo. 
+ + +# Drawbacks +[drawbacks]: #drawbacks + +This RFC introduces more process and bureaucracy, and requires more meetings for +some core Rust contributors. Precious time and energy will need to be devoted to +discussions. + + +# Alternatives +[alternatives]: #alternatives + +Benevolent dictator - a single person dictates style rules which will be +followed without question by the community. This seems to work for Go, I suspect +it will not work for Rust. + +Parliamentary 'democracy' - the community 'elects' a style team (via the usual +RFC consensus process, rather than actual voting). The style team decides on +style issues without an open process. This would be more efficient, but doesn't +fit very well with the open ethos of the Rust community. + +Use the RFCs repo, rather than a new repo. This would have the benefit that +style RFCs would get more visibility, and it is one less place to keep track of +for Rust community members. However, it risks overwhelming the RFC repo with +style debate. + +Use issues on Rustfmt. I feel that the discussions would not have enough +visibility in this fashion, but perhaps that can be addressed by wide and +regular announcement. + +Use a book format for the style repo, rather than a collection of RFCs. This +would make it easier to see how the 'final product' style guide would look. +However, I expect there will be many issues that are important to be aware of +while discussing an RFC, that are not important to include in a final guide. + +Have an existing team handle the process, rather than create a new style team. +Saves on a little bureaucracy. Candidate teams would be language and tools. +However, the language team has very little free bandwidth, and the tools team is +probably not broad enough to effectively handle the style decisions. 
+ + +# Unresolved questions +[unresolved]: #unresolved-questions From 062342588a0cf1b527a0b6240cd0b96ba5b52d11 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 4 May 2016 10:09:14 -0700 Subject: [PATCH 0900/1195] RFC 1525 is Cargo workspaces --- text/{0000-cargo-workspace.md => 1525-cargo-workspace.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-cargo-workspace.md => 1525-cargo-workspace.md} (98%) diff --git a/text/0000-cargo-workspace.md b/text/1525-cargo-workspace.md similarity index 98% rename from text/0000-cargo-workspace.md rename to text/1525-cargo-workspace.md index 27516a4185a..e07c3907ea7 100644 --- a/text/0000-cargo-workspace.md +++ b/text/1525-cargo-workspace.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2015-09-15 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1525](https://github.com/rust-lang/rfcs/pull/1525) +- Rust Issue: [rust-lang/cargo#2122](https://github.com/rust-lang/cargo/issues/2122) # Summary From c3fa4ba8b610be55d5646614489bfe3a2503891c Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 4 May 2016 16:27:06 -0700 Subject: [PATCH 0901/1195] RFC 1521 is Copy/Clone semantics --- ...copy-clone-semantics.md => 1521-copy-clone-semantics.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-copy-clone-semantics.md => 1521-copy-clone-semantics.md} (93%) diff --git a/text/0000-copy-clone-semantics.md b/text/1521-copy-clone-semantics.md similarity index 93% rename from text/0000-copy-clone-semantics.md rename to text/1521-copy-clone-semantics.md index 0df568c4963..6a79314d156 100644 --- a/text/0000-copy-clone-semantics.md +++ b/text/1521-copy-clone-semantics.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 01 March, 2016 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1521](https://github.com/rust-lang/rfcs/pull/1521) +- Rust Issue: 
[rust-lang/rust#33416](https://github.com/rust-lang/rust/issues/33416) # Summary [summary]: #summary @@ -24,7 +24,7 @@ would allow us to simply `memcpy` the values from the old `Vec` to the new `Vec` in the case of `T: Copy`. However, if we don't specify this, we will not be able to, and we will be stuck looping over every value. -It's always been the intention that `Clone::clone == ptr::read for T: Copy`; see +It's always been the intention that `Clone::clone == ptr::read for T: Copy`; see [issue #23790][issue-copy]: "It really makes sense for `Clone` to be a supertrait of `Copy` -- `Copy` is a refinement of `Clone` where `memcpy` suffices, basically." This idea was also implicit in accepting From a153c5362a36b487fa1b6d059afbe4a21f7f1d7e Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 4 May 2016 16:29:45 -0700 Subject: [PATCH 0902/1195] RFC 1542 is TryFrom/TryInto traits --- text/{0000-try-from.md => 1542-try-from.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-try-from.md => 1542-try-from.md} (96%) diff --git a/text/0000-try-from.md b/text/1542-try-from.md similarity index 96% rename from text/0000-try-from.md rename to text/1542-try-from.md index a08c9e09d46..affee80057b 100644 --- a/text/0000-try-from.md +++ b/text/1542-try-from.md @@ -1,7 +1,7 @@ -- Feature Name: try_from +- Feature Name: `try_from` - Start Date: 2016-03-10 -- RFC PR: -- Rust Issue: +- RFC PR: [rust-lang/rfcs#1542](https://github.com/rust-lang/rfcs/pull/1542) +- Rust Issue: [rust-lang/rfcs#33147](https://github.com/rust-lang/rust/issues/33417) # Summary [summary]: #summary From efa734d7bf4489ba9be3ac6f1a026a6b6bc2cab8 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Fri, 6 May 2016 11:18:14 +1200 Subject: [PATCH 0903/1195] Add a caveat to the 'explicit names shadow glob names' rule --- text/0000-name-resolution.md | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) diff --git a/text/0000-name-resolution.md 
b/text/0000-name-resolution.md index 120ee265f12..84bb8149bf9 100644 --- a/text/0000-name-resolution.md +++ b/text/0000-name-resolution.md @@ -190,6 +190,34 @@ mod boz { } ``` +Caveat: an explicit name which is defined by the expansion of a macro does **not** +shadow glob imports. Example: + +``` +macro_rules! foo { + () => { + fn foo() {} + } +} + +mod a { + fn foo() {} +} + +mod b { + use a::*; + + foo!(); // Expands to `fn foo() {}`, this `foo` does not shadow the `foo` + // imported from `a` and therefore there is a duplicate name error. +} +``` + +The rationale for this caveat is so that during import resolution, if we have a +glob import we can be sure that any imported names will not be shadowed, either +the name will continue to be valid, or there will be an error. Without this +caveat, a name could be valid, and then after further expansion, become shadowed +by a higher priority name. + This change is discussed in [issue 31337](https://github.com/rust-lang/rust/issues/31337). @@ -303,6 +331,12 @@ fn process_work_list() { } ``` +Note that this pseudo-code elides some details: that names are imported into +distinct namespaces (the type and value namespaces, and with changes to macro +naming, also the macro namespace), and that we must record whether a name is due +to macro expansion or not to abide by the caveat to the 'explicit names shadow +glob names' rule. + In order to keep macro expansion comprehensible to programmers, we must enforce that all macro uses resolve to the same binding at the end of resolution as they do when they were resolved. From a5a36ec3543e33941db07bd3e9b4e8de418263b9 Mon Sep 17 00:00:00 2001 From: Liigo Zhuang Date: Sat, 7 May 2016 12:38:57 +0800 Subject: [PATCH 0904/1195] clarify 'root' with 'root crate' or 'root `Cargo.toml`' Currently, it is not clear if the 'root' is a file or directory, and so the meaning of "the `target` directory next to the root" is ambiguous. 
Note that 'root crate' is not a new term; it appeared in the original RFC text. --- text/1525-cargo-workspace.md | 32 +++++++++++++++++--------------- 1 file changed, 17 insertions(+), 15 deletions(-) diff --git a/text/1525-cargo-workspace.md b/text/1525-cargo-workspace.md index e07c3907ea7..0cd37ca8517 100644 --- a/text/1525-cargo-workspace.md +++ b/text/1525-cargo-workspace.md @@ -36,13 +36,15 @@ use already-built artifacts if available. Cargo will grow the concept of a **workspace** for managing repositories of multiple crates. Workspaces will then have the properties: -* A workspace can contain multiple local crates. -* Each workspace will have a root. +* A workspace can contain multiple local crates: one 'root crate', and any + number of 'member crates'. +* The root crate of a workspace has a `Cargo.toml` file containing the + `[workspace]` key, which we call the 'root `Cargo.toml`'. * Whenever any crate in the workspace is compiled, output will be placed in the - `target` directory next to the root. -* One `Cargo.lock` for the entire workspace will reside next to the workspace - root and encompass the dependencies (and dev-dependencies) for all packages - in the workspace. + `target` directory next to the root `Cargo.toml`. +* One `Cargo.lock` file for the entire workspace will reside next to the root + `Cargo.toml` and encompass the dependencies (and dev-dependencies) for all + crates in the workspace. With workspaces, Cargo can now solve the problems set forth in the motivation section. Next, however, workspaces need to be defined. In the spirit of much of @@ -63,19 +65,19 @@ members = ["relative/path/to/child1", "../child2"] workspace = "../foo" ``` -The root of a workspace, indicated by the presence of `[workspace]`, is -responsible for defining the entire workspace (listing all members). +The root `Cargo.toml` of a workspace, indicated by the presence of `[workspace]`, +is responsible for defining the entire workspace (listing all members).
This example here means that two extra crates will be members of the workspace (which also includes the root). -The `package.workspace` key is used to point at a workspace root. For -example this Cargo.toml indicates that the Cargo.toml in `../foo` is the -workspace root that this package is a member of. +The `package.workspace` key is used to point at a workspace's root crate. For +example this Cargo.toml indicates that the Cargo.toml in `../foo` is the root +Cargo.toml of the root crate of the workspace that this package is a member of. These keys are mutually exclusive when applied in `Cargo.toml`. A crate may *either* specify `package.workspace` or specify `[workspace]`. That is, a -crate cannot both be a root in a workspace (contain `[workspace]`) and also be -a member of another workspace (contain `package.workspace`). +crate cannot both be a root crate in a workspace (contain `[workspace]`) and +also be a member crate of another workspace (contain `package.workspace`). ### "Virtual" `Cargo.toml` @@ -141,8 +143,8 @@ mentioned above for all of the explicit configuration as well. ### Workspaces in practice Many Rust projects today already have `Cargo.toml` at the root of a repository, -and with the small addition of `[workspace]` in the root a workspace will be -ready for all crates in that repository. For example: +and with the small addition of `[workspace]` in the root `Cargo.toml`, a +workspace will be ready for all crates in that repository.
For example: * An FFI crate with a sub-crate for FFI bindings From 49453a74cd84a04ff0e2c8231b4954dc0b7faedd Mon Sep 17 00:00:00 2001 From: Andrew Cann Date: Sat, 7 May 2016 13:03:23 +0800 Subject: [PATCH 0905/1195] Small improvements based on thread comments --- text/0000-bang-type.md | 53 +++++++++++++++++++++++++++--------------- 1 file changed, 34 insertions(+), 19 deletions(-) diff --git a/text/0000-bang-type.md b/text/0000-bang-type.md index cea8750f8b6..247ed8fb5d8 100644 --- a/text/0000-bang-type.md +++ b/text/0000-bang-type.md @@ -12,10 +12,11 @@ Promote `!` to be a full-fledged type equivalent to an `enum` with no variants. To understand the motivation for this it's necessary to understand the concept of empty types. An empty type is a type with no inhabitants, ie. a type for which there is nothing of that type. For example consider the type `enum Never -{}`. This type has no constructors and therefore can never be instantiated. It is empty, in the sense that there are no values of type `Never`. Note -that `Never` is not equivalent to `()` or `struct Foo {}` each of which have -exactly one inhabitant. Empty types have some interesting properties that may -be unfamiliar to programmers who have not encountered them before. +{}`. This type has no constructors and therefore can never be instantiated. It +is empty, in the sense that there are no values of type `Never`. Note that +`Never` is not equivalent to `()` or `struct Foo {}` each of which have exactly +one inhabitant. Empty types have some interesting properties that may be +unfamiliar to programmers who have not encountered them before. * They never exist at runtime. Because there is no way to create one. @@ -42,8 +43,9 @@ be unfamiliar to programmers who have not encountered them before. * They represent the return type of functions that don't return. For a function that never returns, such as `exit`, the set of all values it may return is the empty set. 
That is to say, the type of all values it may - return is the type of no inhabitants, ie. `Never` or anything equivalent to - it. + return is the type of no inhabitants, ie. `Never` or anything isomorphic to + it. Similarly, they are the logical type for expressions that never return + to their caller such as `break`, `continue` and `return`. * They can be converted to any other type. To specify a function `A -> B` we need to specify a return value in `B` for @@ -98,8 +100,8 @@ fn wrap_exit() -> Never { // we can use a `Never` value to diverge without using unsafe code or calling // any diverging intrinsics fn diverge_from_never(n: Never) -> ! { - match n { - } + match n { + } } fn main() { @@ -263,6 +265,20 @@ So why do this? AFAICS there are 3 main reasons use cases. Doing so would standardise the concept and prevent different people reimplementing it under different names. + * **Better dead code detection** + + Consider the following code: + + ``` + let t = std::thread::spawn(|| panic!("nope")); + t.join().unwrap(); + println!("hello"); + + ``` + Under this RFC: the closure body gets typed `!` instead of `()`, the `unwrap()` + gets typed `!`, and the `println!` will raise a dead code warning. There's no + way current rust can detect cases like that. + * **Because it's the correct thing to do.** The empty type is such a fundamental concept that - given that it already @@ -326,6 +342,12 @@ written with this RFC's `!` can already be written by swapping out `!` with issues for the language (such as making it unsound or complicating the compiler) then these issues would already exist for `Never`. +It's also worth noting that the `!` proposed here is *not* the bottom type that +used to exist in Rust in the very early days. Making `!` a subtype of all types +would greatly complicate things as it would require, for example, `Vec<!>` be a +subtype of `Vec<T>`.
This `!` is simply an empty type (albeit one that can be +cast to any other type) + # Detailed design Add a type `!` to Rust. `!` behaves like an empty enum except that it can be @@ -352,10 +374,11 @@ match break { () => 23, // matching with a `()` forces the match argument to be cast to type `()` } ``` +These casts can be implemented by having the compiler assign a fresh, diverging +type variable to any expression of type `!`. -In the compiler, remove the distinctions that treat diverging and converging -expressions as two different kinds of things (eg. stuff like `FnConverging` vs -`FnDiverging`). Use the type system to do things like reachability analysis. +In the compiler, remove the distinction between diverging and converging +functions. Use the type system to do things like reachability analysis. Add an implementation for `!` of any trait that it can trivially implement. Add methods to `Result` and `Result` for safely extracting the inner @@ -379,14 +402,6 @@ Someone would have to implement this. # Unresolved questions -Apparently, rust used to have something similar to this but it was removed. -There are still a few references to `ty_bot` in the compiler. Why was this -taken out? Note that if there any arguments for not having type `!` in the -language they should apply equally well to `Never`/`Void` so I assume the old -`ty_bot` was trying to be something crazier than this RFC's `!` (such as a -subtype of all types, given the name). Could someone who was around back then -clarify this? - `!` has a unique impl of any trait whose only items are non-static methods. It would be nice if there was a way a to automate the creation of these impls. Should `!` automatically satisfy any such trait? 
Alternatively we could do this From ae4f3454f9c4f2366785276a4d752d62f837c381 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 9 May 2016 09:29:18 -0700 Subject: [PATCH 0906/1195] Disallow mixing align/packed --- text/0000-repr-align.md | 35 +++++++++++------------------------ 1 file changed, 11 insertions(+), 24 deletions(-) diff --git a/text/0000-repr-align.md b/text/0000-repr-align.md index 2a048398f6f..efaba0b9185 100644 --- a/text/0000-repr-align.md +++ b/text/0000-repr-align.md @@ -65,17 +65,17 @@ Multiple `#[repr(align = "..")]` directives are accepted on a struct declaration, and the actual alignment of the structure will be the maximum of all `align` directives and the natural alignment of the struct itself. -Semantically, it will be guaranteed (modulo `unsafe` code and `#[repr(packed)`) -that custom alignment will always be respected. If a pointer to a non-aligned -structure exists and is used then it is considered unsafe behavior. Local -variables, objects in arrays, statics, etc, will all respect the custom -alignment specified for a type. - -The `#[repr(align)]` attribute will not interact with `#[repr(packed)]` in the -sense that the `packed` attribute only affects *field alignment* whereas `align` -affects the *struct alignment*. The `packed` may indirectly lower struct -alignment by lowering the alignment of fields, and then `align` may raise the -overal struct alignment. +Semantically, it will be guaranteed (modulo `unsafe` code) that custom alignment +will always be respected. If a pointer to a non-aligned structure exists and is +used then it is considered unsafe behavior. Local variables, objects in arrays, +statics, etc, will all respect the custom alignment specified for a type. + +For now, it will be illegal to mix `#[repr(align)]` and `#[repr(packed)]` in +structs. 
Specifically, both attributes cannot be applied on the same struct, and +a `#[repr(packed)]` struct cannot transitively contain another struct with +`#[repr(align)]` or vice versa. The behavior of MSVC and gcc differ in how these +properties interact, and for now we'll just yield an error while we get +experience with the two attributes. Some examples of `#[repr(align)]` are: @@ -113,21 +113,8 @@ struct Align8Many { assert_eq!(mem::align_of::<Align8Many>(), 8); assert_eq!(mem::size_of::<Align8Many>(), 16); - -// Raising alignment beyond the packed value -#[repr(align = "4", packed = "2")] -struct AlignAndPacked { - a: u16, - b: i32, -} - -assert_eq!(mem::align_of::<AlignAndPacked>(), 4); -assert_eq!(mem::size_of::<AlignAndPacked>(), 8); -assert_eq!(offset_of!(AlignAndPacked, a), 0); -assert_eq!(offset_of!(AlignAndPacked, b), 2); ``` - # Drawbacks [drawbacks]: #drawbacks From a43706c7610a1498db5385a87df5361a14d9b273 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 11 May 2016 10:43:04 -0700 Subject: [PATCH 0907/1195] Allow #[repr(packed)] inside #[repr(align)] --- text/0000-repr-align.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/text/0000-repr-align.md b/text/0000-repr-align.md index efaba0b9185..843d43719c7 100644 --- a/text/0000-repr-align.md +++ b/text/0000-repr-align.md @@ -70,12 +70,13 @@ will always be respected. If a pointer to a non-aligned structure exists and is used then it is considered unsafe behavior. Local variables, objects in arrays, statics, etc, will all respect the custom alignment specified for a type. -For now, it will be illegal to mix `#[repr(align)]` and `#[repr(packed)]` in -structs. Specifically, both attributes cannot be applied on the same struct, and -a `#[repr(packed)]` struct cannot transitively contain another struct with -`#[repr(align)]` or vice versa. The behavior of MSVC and gcc differ in how these -properties interact, and for now we'll just yield an error while we get -experience with the two attributes.
+For now, it will be illegal for any `#[repr(packed)]` struct to transitively +contain a struct with `#[repr(align)]`. Specifically, both attributes cannot be +applied on the same struct, and a `#[repr(packed)]` struct cannot transitively +contain another struct with `#[repr(align)]`. The flip side, including a +`#[repr(packed)]` structure inside of a `#[repr(align)]` one, will be allowed. +The behavior of MSVC and gcc differs in how these properties interact, and for +now we'll just yield an error while we get experience with the two attributes. Some examples of `#[repr(align)]` are: From ca63eb92da882eae556ba94a5a7cd9b4c706b40d Mon Sep 17 00:00:00 2001 From: Andrew Cann Date: Fri, 13 May 2016 00:36:15 +0800 Subject: [PATCH 0908/1195] Add remark about allow explicit casts from --- text/0000-bang-type.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/text/0000-bang-type.md b/text/0000-bang-type.md index 247ed8fb5d8..d70e3556c65 100644 --- a/text/0000-bang-type.md +++ b/text/0000-bang-type.md @@ -380,6 +380,9 @@ type variable to any expression of type `!`. In the compiler, remove the distinction between diverging and converging functions. Use the type system to do things like reachability analysis. +Allow expressions of type `!` to be explicitly cast to any other type (eg. +`let x: u32 = break as u32;`) + Add an implementation for `!` of any trait that it can trivially implement. Add methods to `Result<T, !>` and `Result<!, E>` for safely extracting the inner value.
Name these methods along the lines of `unwrap_nopanic`, `safe_unwrap` or From a84bfbf711eb75907d70e15b9e49c2f3203467e1 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 13 May 2016 16:13:51 -0700 Subject: [PATCH 0909/1195] RFC 1358 is repr-align --- text/{0000-repr-align.md => 1358-repr-align.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename text/{0000-repr-align.md => 1358-repr-align.md} (100%) diff --git a/text/0000-repr-align.md b/text/1358-repr-align.md similarity index 100% rename from text/0000-repr-align.md rename to text/1358-repr-align.md From 78077879eb5b5e222ed0bcfd5ee74052da6d974b Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 13 May 2016 16:20:37 -0700 Subject: [PATCH 0910/1195] RFC 1492: `..` in patterns --- ...n-patterns.md => 1492-dotdot-in-patterns.md} | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) rename text/{0000-dotdot-in-patterns.md => 1492-dotdot-in-patterns.md} (87%) diff --git a/text/0000-dotdot-in-patterns.md b/text/1492-dotdot-in-patterns.md similarity index 87% rename from text/0000-dotdot-in-patterns.md rename to text/1492-dotdot-in-patterns.md index 954698d81a3..c7861672222 100644 --- a/text/0000-dotdot-in-patterns.md +++ b/text/1492-dotdot-in-patterns.md @@ -1,6 +1,6 @@ - Feature Name: dotdot_in_patterns - Start Date: 2016-02-06 -- RFC PR: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1492 - Rust Issue: (leave this empty) # Summary @@ -69,6 +69,7 @@ S { subpat1, subpat2, .. } // anywhere except for one conventionally chosen position (the last one) or in sublist bindings, // so we don't propose extensions to struct patterns. S { subpat1, .., subpatN } +// **NOT PROPOSED**: Struct patterns with bindings S { subpat1, binding.., subpatN } // Tuple struct patterns, the last and the only position, no extra subpatterns allowed. @@ -77,22 +78,20 @@ S(..) S(subpat1, subpat2, ..) 
S(.., subpatN-1, subpatN) S(subpat1, .., subpatN) -// **NEW**: Tuple struct patterns, any position with a sublist binding. -// The binding has a tuple type. -// By ref bindings are not allowed, because layouts of S(A, B, C, D) and (B, C) are not necessarily -// compatible (e.g. field reordering is possible). +// **NOT PROPOSED**: Struct patterns with bindings S(subpat1, binding.., subpatN) // **NEW**: Tuple patterns, any position. (subpat1, subpat2, ..) (.., subpatN-1, subpatN) (subpat1, .., subpatN) -// **NEW**: Tuple patterns, any position with a sublist binding. -// The binding has a tuple type. -// By ref bindings are not allowed, because layouts of (A, B, C, D) and (B, C) are not necessarily -// compatible (e.g. field reordering is possible). +// **NOT PROPOSED**: Tuple patterns with bindings (subpat1, binding.., subpatN) +``` + +Slice patterns are not covered in this RFC, but here is the syntax for reference: +``` // Slice patterns, the last position. [subpat1, subpat2, ..] // Slice patterns, the first position. From a4a22d7c5dd71724bb2cd0fe2db5026338d0b270 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 13 May 2016 16:21:59 -0700 Subject: [PATCH 0911/1195] add links for rfc 1358 --- text/1358-repr-align.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/1358-repr-align.md b/text/1358-repr-align.md index 843d43719c7..d3c2d8f0004 100644 --- a/text/1358-repr-align.md +++ b/text/1358-repr-align.md @@ -1,7 +1,7 @@ - Feature Name: `repr_align` - Start Date: 2015-11-09 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1358 +- Rust Issue: https://github.com/rust-lang/rust/issues/33626 # Summary [summary]: #summary From a0d5e4afdaa40a4abb3610173fde93344766db3b Mon Sep 17 00:00:00 2001 From: Wang Xuerui Date: Tue, 17 May 2016 15:49:55 +0800 Subject: [PATCH 0912/1195] Ergonomic format_args! 
--- text/0000-ergonomic-format-args.md | 186 +++++++++++++++++++++++++++++ 1 file changed, 186 insertions(+) create mode 100644 text/0000-ergonomic-format-args.md diff --git a/text/0000-ergonomic-format-args.md b/text/0000-ergonomic-format-args.md new file mode 100644 index 00000000000..32c741cc7da --- /dev/null +++ b/text/0000-ergonomic-format-args.md @@ -0,0 +1,186 @@ +- Feature Name: `ergonomic_format_args` +- Start Date: 2016-05-17 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Removes the one-type-only restriction on `format_args!` arguments. + +# Motivation +[motivation]: #motivation + +The `format_args!` macro and its friends historically only allowed a single +type per argument, such that trivial format strings like `"{0:?} == {0:x}"` or +`"rgb({r}, {g}, {b}) is #{r:02x}{g:02x}{b:02x}"` are illegal. This is +massively inconvenient and counter-intuitive, especially considering the +formatting syntax is borrowed from Python where such things are perfectly +valid. + +Upon closer investigation, the restriction is in fact an artificial +implementation detail. For mapping format placeholders to macro arguments the +`format_args!` implementation did not bother to record type information for +all the placeholders sequentially, but rather chose to remember only one type +per argument. Also the formatting logic has not received significant attention +since after its conception, but the uses have greatly expanded over the years, +so the mechanism as a whole certainly needs more love. + +# Detailed design +[design]: #detailed-design + +## Overview + +Formatting is done during both compile-time (expansion-time to be pedantic) +and runtime in Rust. As we are concerned with format string parsing, not +outputting, this RFC only touches the compile-time side of the existing +formatting mechanism which is `libsyntax_ext` and `libfmt_macros`. 
+ +Before continuing with the details, it is worth noting that the core flow of +current Rust formatting is *mapping arguments to placeholders to format specs*. +For clarity, we distinguish among *placeholders*, *macro arguments* and +*generated `ArgumentV1` objects*. They are all *italicized* to provide some +visual hint for distinction. + +To implement the proposed design, first we resolve all implicit references to +the next argument (*next-references* for short) during parse; then we modify +the macro expansion to make use of the now explicit argument references, +preserving the mapping. + +## Parse-time next-reference resolution + +Currently two forms of next-references exist: `ArgumentNext` and +`CountIsNextParam`. Both take a positional *macro argument* and advance the +same internal pointer, but format is parsed before position, as shown in +format strings like `"{foo:.*} {} {:.*}"` which is in every way equivalent to +`"{foo:.0$} {1} {3:.2$}"`. + +As the rule is already known even at compile-time, and does not require the +whole format string to be known beforehand, the resolution can happen just +inside the parser after a *placeholder* is successfully parsed. As a natural +consequence, both forms of next-reference can be removed from the rest of the +compiler, simplifying work later. + +## Expansion-time argument mapping + +There are two kinds of *macro arguments*, positional and named. Because of the +apparent type difference, two maps are needed to track *placeholder* types +(known as `ArgumentType`s in the code). In the current implementation, +`Vec<Option<ArgumentType>>` is for positional *macro arguments* and +`HashMap<String, ArgumentType>` is for named *macro arguments*, apparently +neither of which supports multiple types for one *macro argument*. Also, for +constructing the `__STATIC_FMTARGS` we need to first figure out the order for +every *placeholder* in the list of *generated `ArgumentV1` objects*.
So we +first classify *placeholders* according to their associated *macro arguments*, +which are all explicit now, then assign each of them a correct index. + +### Placeholder type collection + +In the proposed design, lists of `ArgumentType`s are used to store +*placeholder* types for each *macro argument* in order. During verification +the *placeholder* type seen for a *macro argument* is simply pushed into the +respective list. This does not remove the ability to sense unused +*macro arguments*, as the list would simply be empty when checked later, just +as it would be `None` in the old `Option` version. + +### Mapping construction + +For consistency with the current implementation, named *macro arguments* are +still put at the end of *generated `ArgumentV1` objects*. Which means we have +to consume all of format string in order to know how many *placeholders* there +are referencing to positional *macro arguments*. As such, the verification +and translation of pieces are now separated with mapping construction in +between. + +Obviously, the orders used during mapping and actual expansion must agree, but +fortunately the rules are very simple now only explicit references remain. +We iterate over the list of known positional *macro arguments*, recording the +index every bunch of *generated `ArgumentV1` objects* would begin for each +positional *macro argument*. After that, we also record the total number for +mapping the named *macro arguments*, as the relative offsets of named +*placeholders* are already recorded during verification. + +### Expansion + +With mapping between *placeholders* and *generated `ArgumentV1` objects* +ready at hand, it is easy to emit correct `Argument`s. Scratch space is +provided to `trans_piece` for remembering how many *placeholders* for a given +*macro argument* have been processed. 
This information is then used to rewrite +all references from using *macro argument* indices to +*generated `ArgumentV1` object* indices, namely: + +* `ArgumentIs(i)` +* `ArgumentNamed(n)` +* `CountIsParam(i)` +* `CountIsName(n)` + +For the count references, some may suggest that they are now potentially +ambiguous. However, considering the implementation of `verify_count`, the +parameter used by each `Count` is individually injected into the list of +*generated `ArgumentV1` objects* as if it were explicitly specified. Also, it +is *macro arguments* that are referenced, not the potentially multiple +*placeholders*, so there are in fact no ambiguities. + +# Drawbacks +[drawbacks]: #drawbacks + +Due to the added data structures and processing, time and memory costs of +compilation may slightly increase. However, this is mere speculation without +actual profiling and benchmarks. Also, the ergonomic benefits alone justify +the additional costs. + +# Alternatives +[alternatives]: #alternatives + +## Do nothing + +One can always write a little more code to simulate the proposed behavior, +and this is what people have most likely been doing under today's constraints.
+As in: + +```rust +fn main() { + let r = 0x66; + let g = 0xcc; + let b = 0xff; + + // rgb(102, 204, 255) == #66ccff + // println!("rgb({r}, {g}, {b}) == #{r:02x}{g:02x}{b:02x}", r=r, g=g, b=b); + println!("rgb({}, {}, {}) == #{:02x}{:02x}{:02x}", r, g, b, r, g, b); +} +``` + +Or slightly more verbose when side effects are in play: + +```rust +fn do_something(i: &mut usize) -> usize { + let result = *i; + *i += 1; + result +} + +fn main() { + let mut i = 0x1234usize; + + // 0b1001000110100 0o11064 0x1234 + // 0x1235 + // println!("{0:#b} {0:#o} {0:#x}", do_something(&mut i)); + // println!("{:#x}", i); + + // need to consider side effects, hence a temp var + { + let r = do_something(&mut i); + println!("{:#b} {:#o} {:#x}", r, r, r); + println!("{:#x}", i); + } +} +``` + +While the effects are the same and nothing requires modification, the +ergonomics is simply bad and the code becomes unnecessarily convoluted. + +# Unresolved questions +[unresolved]: #unresolved-questions + +* Does the *generated `ArgumentV1` objects* need deduplication? +* Will it break the ABI if handling of next-references in `libcore/fmt` is removed as well? From 4b02fa03e508052c249bf3deb19e05bfd456666e Mon Sep 17 00:00:00 2001 From: Wang Xuerui Date: Tue, 17 May 2016 17:11:20 +0800 Subject: [PATCH 0913/1195] small English nitpick --- text/0000-ergonomic-format-args.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/0000-ergonomic-format-args.md b/text/0000-ergonomic-format-args.md index 32c741cc7da..03438277ab5 100644 --- a/text/0000-ergonomic-format-args.md +++ b/text/0000-ergonomic-format-args.md @@ -95,9 +95,9 @@ between. Obviously, the orders used during mapping and actual expansion must agree, but fortunately the rules are very simple now only explicit references remain. We iterate over the list of known positional *macro arguments*, recording the -index every bunch of *generated `ArgumentV1` objects* would begin for each -positional *macro argument*. 
After that, we also record the total number for -mapping the named *macro arguments*, as the relative offsets of named +index at which every bunch of *generated `ArgumentV1` objects* would begin for +each positional *macro argument*. After that, we also record the total number +for mapping the named *macro arguments*, as the relative offsets of named *placeholders* are already recorded during verification. ### Expansion From 9bf85eb2bf0656ad0c801c7e975896c7df1dcf21 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Tue, 17 May 2016 14:52:50 -0400 Subject: [PATCH 0914/1195] update a few things Move this to a clear "diff vs unified" style. Fix up graves vs backticks Make note of [long link style][website] for links --- ...0000-more-api-documentation-conventions.md | 70 +++++++++---------- 1 file changed, 35 insertions(+), 35 deletions(-) diff --git a/text/0000-more-api-documentation-conventions.md b/text/0000-more-api-documentation-conventions.md index f2a81aa8157..c474883f886 100644 --- a/text/0000-more-api-documentation-conventions.md +++ b/text/0000-more-api-documentation-conventions.md @@ -6,12 +6,9 @@ # Summary [summary]: #summary -[RFC 505] introduced certain conventions around documenting Rust projects. This RFC supersedes -that one, though it has the same aims: to describe how the Rust project should be documented, -and provide guidance for other Rust projects as well. - -This RFC will contain some similar text as RFC 505, so that we can have one RFC with the full -conventions. +[RFC 505] introduced certain conventions around documenting Rust projects. This +RFC augments that one, and a full text of the older one combined with these +modifications is provided below. [RFC 505]: https://github.com/rust-lang/rfcs/blob/master/text/0505-api-comment-conventions.md @@ -27,15 +24,24 @@ but it tries to motivate and clarify them. # Detailed design [design]: #detailed-design -This RFC is large.
Here’s a table of contents: +# Drawbacks +[drawbacks]: #drawbacks + +It’s possible that RFC 505 went far enough, and something this detailed is inappropriate. + +# Alternatives +[alternatives]: #alternatives + +We could stick with the more minimal conventions of the previous RFC. -* [Content](#content) - * [Summary sentence](#summary-sentence) - * [English](#english) -* [Form](#form) - * [Use line comments](#use-line-comments) - * [Using Markdown](#using-markdown) -* [Example](#example) +# Unresolved questions +[unresolved]: #unresolved-questions + +None. + +# Appendix A: Full conventions text + +Below is a combination of RFC 505 + this RFC’s modifications, for convenience. ## Content [content]: #content @@ -50,17 +56,17 @@ providing a summary of the code. This line is used as a summary description throughout Rustdoc’s output, so it’s a good idea to keep it short. The summary line should be written in third person singular present indicative -form. Basically, this means write “Returns” instead of “Return.” +form. Basically, this means write ‘Returns’ instead of ‘Return’. ### English [english]: #english This section applies to `rustc` and the standard library. -All documentation is standardized on American English, with regards to -spelling, grammar, and punctuation conventions. Language changes over time, -so this doesn’t mean that there is always a correct answer to every grammar -question, but there is often some kind of formal consensus. +All documentation for the standard library is standardized on American English, +with regards to spelling, grammar, and punctuation conventions. Language +changes over time, so this doesn’t mean that there is always a correct answer +to every grammar question, but there is often some kind of formal consensus. One specific rule that comes up often: when quoting something for emphasis, use a single quote, and put punctuation outside the quotes, ‘this’. 
When @@ -130,9 +136,9 @@ Use top level headings # to indicate sections within your comment. Common headin Even if you only include one example, use the plural form: ‘Examples’ rather than ‘Example’. Future tooling is easier this way. -Use graves (`) to denote a code fragment within a sentence. +Use backticks (`) to denote a code fragment within a sentence. -Use triple graves (```) to write longer examples, like this: +Use backticks (```) to write longer examples, like this: This code does something cool. @@ -188,6 +194,14 @@ to [Rust website](http://www.rust-lang.org) ``` +If the text is very long, feel free to use the shortened form: + +``` +This link [is very long and links to the Rust website][website]. + +[website]: http://www.rust-lang.org +``` + ### Examples in API docs [examples-in-api-docs]: #examples-in-api-docs @@ -393,17 +407,3 @@ pub fn ref_slice(opt: &Option) -> &[T] { ## Formatting -# Drawbacks -[drawbacks]: #drawbacks - -It’s possible that RFC 505 went far enough, and something this detailed is inappropriate. - -# Alternatives -[alternatives]: #alternatives - -We could stick with the more minimal conventions of the previous RFC. - -# Unresolved questions -[unresolved]: #unresolved-questions - -None. From 5e233fa177f129dbdcb61814f927519c1e02c10a Mon Sep 17 00:00:00 2001 From: Andrew Gallant Date: Wed, 11 May 2016 20:45:44 -0400 Subject: [PATCH 0915/1195] An RFC for regex 1.0. 
--- text/0000-regex-1.0.md | 983 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 983 insertions(+) create mode 100644 text/0000-regex-1.0.md diff --git a/text/0000-regex-1.0.md b/text/0000-regex-1.0.md new file mode 100644 index 00000000000..3fcac13c152 --- /dev/null +++ b/text/0000-regex-1.0.md @@ -0,0 +1,983 @@ +- Feature Name: regex-1.0 +- Start Date: 2016-05-11 +- RFC PR: +- Rust Issue: + +# Table of contents + +* [Summary][summary] +* [Motivation][motivation] +* [Detailed design][design] + * [Syntax][syntax] + * [Evolution][evolution] + * [Concrete syntax][concrete-syntax] + * [Expansion concerns][expansion-concerns] + * [Core API][core-api] + * [RegexBuilder][regexbuilder] + * [Replacer][replacer] + * [quote][quote] + * [RegexSet][regexset] + * [The `bytes` submodule][the-bytes-submodule] +* [Drawbacks][drawbacks] + * [Guaranteed linear time matching][guaranteed-linear-time-matching] + * [Allocation][allocation] + * [Synchronization is implicit][synchronization-is-implicit] + * [The implementation is complex][the-implementation-is-complex] +* [Alternatives][alternatives] + * [Big picture][big-picture] + * [`bytes::Regex`][bytesregex] + * [A regex trait][a-regex-trait] + * [Reuse some types][reuse-some-types] +* [Unresolved questions][unresolved] + * [`regex-syntax`][regex-syntax] + * [`regex-capi`][regex-capi] + * [`regex_macros`][regex_macros] + * [Dependencies][dependencies] + * [Exposing more internals][exposing-more-internals] +* [Breaking changes][breaking-changes] + +# Summary +[summary]: #summary + +This RFC proposes a 1.0 API for the `regex` crate and therefore a move out of +the `rust-lang-nursery` organization and into the `rust-lang` organization. +Since the API of `regex` has largely remained unchanged since its inception +[2 years ago](https://github.com/rust-lang/rfcs/blob/master/text/0042-regexps.md), +significant emphasis is placed on retaining the existing API. Some minor +breaking changes are proposed. 
+ +# Motivation +[motivation]: #motivation + +Regular expressions are a widely used tool and most popular programming +languages either have an implementation of regexes in their standard library, +or there exists at least one widely used third party implementation. It +therefore seems reasonable for Rust to do something similar. + +The `regex` crate specifically serves many use cases, most of which are somehow +related to searching strings for patterns. Describing regular expressions in +detail is beyond the scope of this RFC, but briefly, these core use cases are +supported in the main API: + +1. Testing whether a pattern matches some text. +2. Finding the location of a match of a pattern in some text. +3. Finding the location of a match of a pattern---and locations of all its + capturing groups---in some text. +4. Iterating over successive non-overlapping matches of (2) and (3). + +The expected outcome is that the `regex` crate should be the preferred default +choice for matching regular expressions when writing Rust code. This is already +true today; this RFC formalizes it. + +# Detailed design +[design]: #detailed-design + +## Syntax +[syntax]: #syntax + +### Evolution +[evolution]: #evolution + +The public API of a `regex` library *includes* the syntax of a regular +expression. A change in the semantics of the syntax can cause otherwise working +programs to break, yet, we'd still like the option to expand the syntax if +necessary. Thus, this RFC proposes: + +1. Any change that causes a previously invalid regex pattern to become valid is + *not* a breaking change. For example, the escape sequence `\y` is not a + valid pattern, but could become one in a future release without a major + version bump. +2. Any change that causes a previously valid regex pattern to become invalid + *is* a breaking change. +3. Any change that causes a valid regex pattern to change its matching + semantics *is* a breaking change. 
(For example, changing `\b` from "word + boundary assertion" to "backspace character.") + +Bug fixes are exceptions to both (2) and (3). + +Another interesting exception to (2) is that compiling a regex can fail if the +entire compiled object would exceed some pre-defined, user-configurable size. +In particular, future changes to the compiler could cause certain instructions +to use more memory, or indeed, the representation of the compiled regex could +change completely. This could cause a regex that fit under the size limit to +no longer fit, and therefore fail to compile. These cases are expected to be +extremely rare in practice. Notably, the default size limit is `10MB`. + +### Concrete syntax +[concrete-syntax]: #concrete-syntax + +The syntax is exhaustively documented in the current public API documentation: +http://doc.rust-lang.org/regex/regex/index.html#syntax + +To my knowledge, the evolution as proposed in this RFC has been followed since +`regex` was created. The syntax has largely remained unchanged with few +additions. + +### Expansion concerns +[expansion-concerns]: #expansion-concerns + +There are a few possible avenues for expansion, and we take measures to make +sure they are possible with respect to API evolution. + +* Escape sequences are often blessed with special semantics. For example, `\d` + is a Unicode character class that matches any digit and `\b` is a word + boundary assertion. We may one day like to add more escape sequences with + special semantics. For this reason, any unrecognized escape sequence makes a + pattern invalid. +* If we wanted to expand the syntax with various look-around operators, then it + would be possible since most common syntax is considered an invalid pattern + today. In particular, all of the [syntactic forms listed + here](http://www.regular-expressions.info/refadv.html) are invalid patterns + in `regex`. +* Character class sets are another potentially useful feature that may be worth + adding. 
Currently, [various forms of set + notation](http://www.regular-expressions.info/refcharclass.html) are treated + as valid patterns, but this RFC proposes making them invalid patterns before + `1.0`. +* Additional named Unicode classes or codepoints may be desirable to add. + Today, any pattern of the form `\p{NAME}` where `NAME` is unrecognized is + considered invalid, which leaves room for expansion. +* If all else fails, we can introduce new flags that enable new features that + conflict with stable syntax. This is possible because using an unrecognized + flag results in an invalid pattern. + +## Core API +[core-api]: #core-api + +The core API of the `regex` crate is the `Regex` type: + +```rust +pub struct Regex(_); +``` + +It has one primary constructor: + +```rust +impl Regex { + /// Creates a new regular expression. If the pattern is invalid or otherwise + /// fails to compile, this returns an error. + pub fn new(pattern: &str) -> Result<Regex, Error>; +} +``` + +And five core search methods. All searching completes in worst case linear time +with respect to the search text (the size of the regex is taken as a constant). + +```rust +impl Regex { + /// Returns true if and only if the text matches this regex. + pub fn is_match(&self, text: &str) -> bool; + + /// Returns the leftmost-first match of this regex in the text given. If no + /// match exists, then None is returned. + /// + /// The leftmost-first match is defined as the first match that is found + /// by a backtracking search. + pub fn find(&self, text: &str) -> Option<(usize, usize)>; + + /// Returns an iterator of successive non-overlapping matches of this regex + /// in the text given. + pub fn find_iter<'r, 't>(&'r self, text: &'t str) -> FindIter<'r, 't>; + + /// Returns the leftmost-first match of this regex in the text given with + /// locations for all capturing groups that participated in the match. 
+ pub fn captures(&self, text: &str) -> Option<Captures>; + + /// Returns an iterator of successive non-overlapping matches with capturing + /// group information in the text given. + pub fn captures_iter<'r, 't>(&'r self, text: &'t str) -> CapturesIter<'r, 't>; +} +``` + +(N.B. The `captures` method can technically replace all uses of `find` and +`is_match`, but is potentially slower. Namely, the API reflects a performance +trade off: the more you ask for, the harder the regex engine has to work.) + +There is one additional, but idiosyncratic, search method: + +```rust +impl Regex { + /// Returns the end location of a match if one exists in text. + /// + /// This may return a location preceding the end of a proper leftmost-first + /// match. In particular, it may return the location at which a match is + /// determined to exist. For example, matching `a+` against `aaaaa` will + /// return `1` while the end of the leftmost-first match is actually `5`. + /// + /// This has the same performance characteristics as `is_match`. + pub fn shortest_match(&self, text: &str) -> Option<usize>; +} +``` + +And two methods for splitting: + +```rust +impl Regex { + /// Returns an iterator of substrings of `text` delimited by a match of + /// this regular expression. Each element yielded by the iterator corresponds + /// to text that *isn't* matched by this regex. + pub fn split<'r, 't>(&'r self, text: &'t str) -> SplitsIter<'r, 't>; + + /// Returns an iterator of at most `limit` substrings of `text` delimited by + /// a match of this regular expression. Each element yielded by the iterator + /// corresponds to text that *isn't* matched by this regex. The remainder of + /// `text` that is not split will be the last element yielded by the + /// iterator. + pub fn splitn<'r, 't>(&'r self, text: &'t str, limit: usize) -> SplitsNIter<'r, 't>; +} +``` + +And three methods for replacement. Replacement is discussed in more detail in a +subsequent section. 
+ +```rust +impl Regex { + /// Replaces matches of this regex in `text` with `rep`. If no matches were + /// found, then the given string is returned unchanged, otherwise a new + /// string is allocated. + /// + /// `replace` replaces the first match only. `replace_all` replaces all + /// matches. `replacen` replaces at most `limit` matches. + fn replace<'t, R: Replacer>(&self, text: &'t str, rep: R) -> Cow<'t, str>; + fn replace_all<'t, R: Replacer>(&self, text: &'t str, rep: R) -> Cow<'t, str>; + fn replacen<'t, R: Replacer>(&self, text: &'t str, limit: usize, rep: R) -> Cow<'t, str>; +} +``` + +And lastly, three simple accessors: + +```rust +impl Regex { + /// Returns the original pattern string. + pub fn as_str(&self) -> &str; + + /// Returns an iterator over all capturing groups in the pattern in the order + /// they were defined (by position of the leftmost parenthesis). The name of + /// the group is yielded if it has a name, otherwise None is yielded. + pub fn capture_names(&self) -> CaptureNamesIter; + + /// Returns the total number of capturing groups in the pattern. This + /// includes the implicit capturing group corresponding to the entire + /// pattern. + pub fn captures_len(&self) -> usize; +} +``` + +Finally, `Regex` impls the `Send`, `Sync`, `Display`, `Debug`, `Clone` and +`FromStr` traits from the standard library. + +## Error + +The `Error` enum is an *extensible* enum, similar to `std::io::Error`, +corresponding to the different ways that regex compilation can fail. In +particular, this means that adding a new variant to this enum is not a breaking +change. (Removing or changing an existing variant is still a breaking change.) + +```rust +pub enum Error { + /// A syntax error. + Syntax(String), + /// The compiled program exceeded the set size limit. + /// The argument is the size limit imposed. + CompiledTooBig(usize), + /// Hints that destructuring should not be exhaustive. 
+ /// + /// This enum may grow additional variants, so this makes sure clients + /// don't count on exhaustive matching. (Otherwise, adding a new variant + /// could break existing code.) + #[doc(hidden)] + __Nonexhaustive, +} +``` + +Note that the `Syntax` variant could contain the `Error` type from the +`regex-syntax` crate, but this couples `regex-syntax` to the public API of +`regex`. We sidestep this hazard by converting `regex-syntax`'s error to a +`String`. + +## RegexBuilder +[regexbuilder]: #regexbuilder + +In most cases, the construction of a regex is done with `Regex::new`. There are, +however, some options one might want to tweak. This can be done with a +`RegexBuilder`: + +```rust +impl RegexBuilder { + /// Creates a new builder from the given pattern. + pub fn new(pattern: &str) -> RegexBuilder; + + /// Compiles the pattern and all set options. If successful, a Regex is + /// returned. Otherwise, if compilation failed, an Error is returned. + /// + /// N.B. `RegexBuilder::new("...").compile()` is equivalent to + /// `Regex::new("...")`. + pub fn compile(self) -> Result<Regex, Error>; + + /// Set the case insensitive flag (i). + pub fn case_insensitive(self, yes: bool) -> RegexBuilder; + + /// Set the multi line flag (m). + pub fn multi_line(self, yes: bool) -> RegexBuilder; + + /// Set the dot-matches-any-character flag (s). + pub fn dot_matches_new_line(self, yes: bool) -> RegexBuilder; + + /// Set the swap-greedy flag (U). + pub fn swap_greed(self, yes: bool) -> RegexBuilder; + + /// Set the ignore whitespace flag (x). + pub fn ignore_whitespace(self, yes: bool) -> RegexBuilder; + + /// Set the Unicode flag (u). + pub fn unicode(self, yes: bool) -> RegexBuilder; + + /// Set the approximate size limit (in bytes) of the compiled regular + /// expression. + /// + /// If compiling a pattern would approximately exceed this size, then + /// compilation will fail. 
+ pub fn size_limit(self, limit: usize) -> RegexBuilder; + + /// Set the approximate size limit (in bytes) of the cache used by the DFA. + /// + /// This is a per thread limit. Once the DFA fills the cache, it will be + /// wiped and refilled again. If the cache is wiped too frequently, the + /// DFA will quit and fall back to another matching engine. + pub fn dfa_size_limit(self, limit: usize) -> RegexBuilder; +} +``` + +## Captures + +A `Captures` value stores the locations of all matching capturing groups for +a single match. It provides convenient access to those locations indexed by +either number, or, if available, name. + +The first capturing group (index `0`) is always unnamed and always corresponds +to the entire match. Other capturing groups correspond to groups in the +pattern. Capturing groups are indexed by the position of their leftmost +parenthesis in the pattern. + +Note that `Captures` is a type constructor with a single parameter: the +lifetime of the text searched by the corresponding regex. In particular, the +lifetime of `Captures` is not tied to the lifetime of a `Regex`. + +```rust +impl<'t> Captures<'t> { + /// Returns the start and end location of the ith capturing group. + /// + /// If group i did not participate in the match, then None is returned. + pub fn pos(&self, i: usize) -> Option<(usize, usize)>; + + /// Returns the text matched by the ith capturing group. + /// + /// If group i did not participate in the match, then None is returned. + pub fn at(&self, i: usize) -> Option<&'t str>; + + /// Returns the text matched by the named capturing group. + /// + /// If the named group did not participate in the match, then None is + /// returned. + pub fn name(&self, name: &str) -> Option<&'t str>; + + /// Returns an iterator for all text matched by each of the capturing groups + /// in order of appearance in the pattern. If a capturing group did not + /// participate in a match, then None is yielded in its place. 
+ pub fn iter<'c>(&'c self) -> SubCapturesIter<'c, 't>; + + /// Returns an iterator for all match locations by each of the capturing + /// groups in order of appearance in the pattern. If a capturing group did + /// not participate in a match, then None is yielded in its place. + pub fn iter_pos(&self) -> SubCapturesPosIter; + + /// Returns an iterator of tuples, where each tuple is the name of the + /// capturing group (if it exists) and the matched text. If a capturing group + /// did not participate in a match, then the second element of the tuple is + /// None. + pub fn iter_named<'c>(&'c self) -> SubCapturesNamedIter<'c, 't>; + + /// Returns the number of captured groups. This is always at least 1, since + /// the first unnamed capturing group corresponding to the entire match + /// always exists. + pub fn len(&self) -> usize; + + /// Expands all instances of $name in the text given to the value of the + /// corresponding named capture group. The expanded string is written to + /// dst. + /// + /// The name in $name may be an integer corresponding to the index of a capture + /// group or it can be the name of a capture group. If the name isn't a valid + /// capture group, then it is replaced with an empty string. + /// + /// The longest possible name is used. e.g., $1a looks up the capture group + /// named 1a and not the capture group at index 1. To exert more precise + /// control over the name, use braces, e.g., ${1}a. + /// + /// To write a literal $, use $$. + pub fn expand(&self, replacement: &str, dst: &mut String); +} +``` + +The `Captures` type impls `Debug`, `Index<usize>` (for numbered capture groups) +and `Index<str>` (for named capture groups). A downside of the `Index` impls is +that the return value is bound to the lifetime of `Captures` instead of the +lifetime of the actual text searched because of how the `Index` trait is +defined. Callers can work around that limitation if necessary by using an +explicit method such as `at` or `name`. 
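
The name-resolution rules documented for `expand` above (longest possible name, braces for precise control, `$$` for a literal dollar sign, empty string for an unknown group) can be sketched with the standard library alone. This is an illustrative model, not the crate's implementation: `Groups`, `demo_groups` and the free function `expand` are hypothetical stand-ins, and two behaviours the text does not specify are assumptions here (a lone `$` and an unclosed `${` are copied through literally).

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the capture data held by a `Captures` value.
struct Groups<'t> {
    by_index: Vec<Option<&'t str>>,
    by_name: HashMap<&'static str, usize>,
}

impl<'t> Groups<'t> {
    /// All-digit names are indices; anything else is a group name.
    /// Unknown or non-participating groups resolve to the empty string.
    fn get(&self, name: &str) -> &'t str {
        let index = if !name.is_empty() && name.bytes().all(|b| b.is_ascii_digit()) {
            name.parse::<usize>().ok()
        } else {
            self.by_name.get(name).copied()
        };
        index
            .and_then(|i| self.by_index.get(i).copied().flatten())
            .unwrap_or("")
    }
}

/// Appends `template` to `dst`, expanding `$name`, `${name}` and `$$`.
fn expand(groups: &Groups, template: &str, dst: &mut String) {
    let mut rest = template;
    while let Some(pos) = rest.find('$') {
        dst.push_str(&rest[..pos]);
        rest = &rest[pos + 1..]; // skip the `$`
        if let Some(after) = rest.strip_prefix('$') {
            // `$$` is a literal dollar sign.
            dst.push('$');
            rest = after;
        } else if let Some(after) = rest.strip_prefix('{') {
            match after.find('}') {
                Some(end) => {
                    dst.push_str(groups.get(&after[..end]));
                    rest = &after[end + 1..];
                }
                // Assumption: an unclosed brace is copied through literally.
                None => {
                    dst.push_str("${");
                    rest = after;
                }
            }
        } else {
            // Longest possible name: consume every `[0-9A-Za-z_]`.
            let len = rest
                .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
                .unwrap_or(rest.len());
            if len == 0 {
                // Assumption: a lone `$` is copied through literally.
                dst.push('$');
            } else {
                dst.push_str(groups.get(&rest[..len]));
                rest = &rest[len..];
            }
        }
    }
    dst.push_str(rest);
}

/// Groups as they might look after matching `(?P<year>\d+)-(?P<month>\d+)`
/// against `2016-05`.
fn demo_groups() -> Groups<'static> {
    let mut by_name = HashMap::new();
    by_name.insert("year", 1);
    by_name.insert("month", 2);
    Groups {
        by_index: vec![Some("2016-05"), Some("2016"), Some("05")],
        by_name,
    }
}

fn main() {
    let groups = demo_groups();
    for template in ["$month/$year", "${1}a", "$1a", "$$5"] {
        let mut dst = String::new();
        expand(&groups, template, &mut dst);
        println!("{:?} -> {:?}", template, dst);
    }
}
```

In this model `$1a` expands to the empty string because no group is named `1a`, while `${1}a` expands to the text of group 1 followed by a literal `a`.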
+ +## Replacer +[replacer]: #replacer + +The `Replacer` trait is a helper trait to make the various `replace` methods on +`Regex` more ergonomic. In particular, it makes it possible to use either a +standard string as a replacement, or a closure with more explicit access to a +`Captures` value. + +```rust +pub trait Replacer { + /// Appends text to dst to replace the current match. + /// + /// The current match is represented by caps, which is guaranteed to have a + /// match at capture group 0. + /// + /// For example, a no-op replacement would be + /// dst.extend(caps.at(0).unwrap()). + fn replace_append(&mut self, caps: &Captures, dst: &mut String); + + /// Return a fixed unchanging replacement string. + /// + /// When doing replacements, if access to Captures is not needed, then + /// it can be beneficial from a performance perspective to avoid finding + /// sub-captures. In general, this is called once for every call to replacen. + fn no_expansion<'r>(&'r mut self) -> Option<Cow<'r, str>> { + None + } +} +``` + +Along with this trait, there is also a helper type, `NoExpand`, that implements +`Replacer` like so: + +```rust +pub struct NoExpand<'t>(pub &'t str); + +impl<'t> Replacer for NoExpand<'t> { + fn replace_append(&mut self, _: &Captures, dst: &mut String) { + dst.push_str(self.0); + } + + fn no_expansion<'r>(&'r mut self) -> Option<Cow<'r, str>> { + Some(self.0.into()) + } +} +``` + +This permits callers to use `NoExpand` with the `replace` methods to guarantee +that the replacement string is never searched for `$group` replacement syntax. + +We also provide two more implementations of the `Replacer` trait: `&str` and +`FnMut(&Captures) -> String`. + +## quote +[quote]: #quote + +There is one free function in `regex`: + +```rust +/// Escapes all regular expression meta characters in `text`. +/// +/// The string returned may be safely used as a literal in a regex. 
+pub fn quote(text: &str) -> String; +``` + +## RegexSet +[regexset]: #regexset + +A `RegexSet` represents the union of zero or more regular expressions. It is a +specialized machine that can match multiple regular expressions simultaneously. +Conceptually, it is similar to joining multiple regexes as alternates, e.g., +`re1|re2|...|reN`, with one crucial difference: in a `RegexSet`, multiple +expressions can match. This means that each pattern can be reasoned about +independently. A `RegexSet` is ideal for building simpler lexers or an HTTP +router. + +Because of their specialized nature, they can only report which regexes match. +They do not report match locations. In theory, this could be added in the +future, but is difficult. + +```rust +pub struct RegexSet(_); + +impl RegexSet { + /// Constructs a new RegexSet from the given sequence of patterns. + /// + /// The order of the patterns given is used to assign increasing integer + /// ids starting from 0. Namely, matches are reported in terms of these ids. + pub fn new<I, S>(patterns: I) -> Result<RegexSet, Error> + where S: AsRef<str>, I: IntoIterator<Item=S>; + + /// Returns the total number of regexes in this set. + pub fn len(&self) -> usize; + + /// Returns true if and only if one or more regexes in this set match + /// somewhere in the given text. + pub fn is_match(&self, text: &str) -> bool; + + /// Returns the set of regular expressions that match somewhere in the given + /// text. + pub fn matches(&self, text: &str) -> SetMatches; +} +``` + +`RegexSet` impls the `Debug` and `Clone` traits. + +The `SetMatches` type is queryable and implements `IntoIterator`. + +```rust +pub struct SetMatches(_); + +impl SetMatches { + /// Returns true if this set contains 1 or more matches. + pub fn matched_any(&self) -> bool; + + /// Returns true if and only if the regex identified by the given id is in + /// this set of matches. + /// + /// This panics if the id given is >= the number of regexes in the set that + /// these matches came from. 
+ pub fn matched(&self, id: usize) -> bool; + + /// Returns the total number of regexes in the set that created these + /// matches. + pub fn len(&self) -> usize; + + /// Returns an iterator over the ids in the set that correspond to a match. + pub fn iter(&self) -> SetMatchesIter; +} +``` + +`SetMatches` impls the `Debug` and `Clone` traits. + +Note that a builder is not proposed for `RegexSet` in this RFC; however, it is +likely one will be added at some point in a backwards compatible way. + +## The `bytes` submodule +[the-bytes-submodule]: #the-bytes-submodule + +All of the above APIs have thus far been explicitly for searching `text` where +`text` has type `&str`. While this author believes that suits most use cases, +it should also be possible to search a regex on *arbitrary* bytes, i.e., +`&[u8]`. One particular use case is quickly searching a file via a memory map. +If regexes could only search `&str`, then one would have to verify it was UTF-8 +first, which could be costly. Moreover, if the file isn't valid UTF-8, then you +either can't search it, or you have to allocate a new string and lossily copy +the contents. Neither case is particularly ideal. It would instead be nice to +just search the `&[u8]` directly. + +This RFC proposes including a `bytes` submodule in the crate. The API of this submodule +is a clone of the API described so far, except with `&str` replaced by `&[u8]` +for the search text (patterns are still `&str`). The clone includes `Regex` +itself, along with all supporting types and traits such as `Captures`, +`Replacer`, `FindIter`, `RegexSet`, `RegexBuilder` and so on. (This RFC +describes some alternative designs in a subsequent section.) + +Since the API is a clone of what has been seen so far, it is not written out +again. Instead, we'll discuss the key differences. + +Again, the first difference is that a `bytes::Regex` can search `&[u8]` +while a `Regex` can search `&str`. 
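
To make the cost of a `&str`-only API concrete, here is a small sketch that uses only the standard library. It shows the two options described above for a caller holding raw bytes: strict validation, which fails on invalid UTF-8, and lossy conversion, which allocates a copy. The helper name `as_searchable_text` is hypothetical and is not part of `regex`.

```rust
use std::borrow::Cow;

/// Hypothetical helper (not part of `regex`): the conversion a caller would
/// need if only `&str` could be searched.
fn as_searchable_text(bytes: &[u8]) -> Cow<'_, str> {
    // Borrows when the input is valid UTF-8; otherwise allocates a copy in
    // which each invalid sequence is replaced by U+FFFD.
    String::from_utf8_lossy(bytes)
}

fn main() {
    let valid: &[u8] = b"foo bar";
    let invalid: &[u8] = b"foo\xFFbar";

    // Strict validation scans the input and rejects the second buffer.
    assert!(std::str::from_utf8(valid).is_ok());
    assert!(std::str::from_utf8(invalid).is_err());

    // Lossy conversion only avoids a copy for valid input.
    assert!(matches!(as_searchable_text(valid), Cow::Borrowed(_)));
    let lossy = as_searchable_text(invalid);
    assert!(matches!(lossy, Cow::Owned(_)));

    // Offsets into the lossy copy no longer line up with the original bytes:
    // the single byte 0xFF became the three-byte encoding of U+FFFD.
    let in_original = invalid.windows(3).position(|w| w == b"bar");
    let in_copy = lossy.find("bar");
    assert_eq!(in_original, Some(4));
    assert_eq!(in_copy, Some(6));
}
```

The last two assertions point at a further cost of the lossy route: match locations found in the copy cannot be used directly as offsets into the original buffer.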
+ +The second difference is that a `bytes::Regex` can completely disable Unicode +support and explicitly match arbitrary bytes. The details: + +1. The `u` flag can be disabled even when disabling it might cause the regex to +match invalid UTF-8. When the `u` flag is disabled, the regex is said to be in +"ASCII compatible" mode. +2. In ASCII compatible mode, neither Unicode codepoints nor Unicode character +classes are allowed. +3. In ASCII compatible mode, Perl character classes (`\w`, `\d` and `\s`) +revert to their typical ASCII definition. `\w` maps to `[[:word:]]`, `\d` maps +to `[[:digit:]]` and `\s` maps to `[[:space:]]`. +4. In ASCII compatible mode, word boundaries use the ASCII compatible `\w` to +determine whether a byte is a word byte or not. +5. Hexadecimal notation can be used to specify arbitrary bytes instead of +Unicode codepoints. For example, in ASCII compatible mode, `\xFF` matches the +literal byte `\xFF`, while in Unicode mode, `\xFF` is a Unicode codepoint that +matches its UTF-8 encoding of `\xC3\xBF`. Similarly for octal notation. +6. `.` matches any byte except for `\n` instead of any Unicode codepoint. When +the `s` flag is enabled, `.` matches any byte. + +An interesting property of the above is that while the Unicode flag is enabled, +a `bytes::Regex` is *guaranteed* to match only valid UTF-8 in a `&[u8]`. Like +`Regex`, the Unicode flag is enabled by default. + +N.B. The Unicode flag can also be selectively disabled in a `Regex`, but not in +a way that permits matching invalid UTF-8. + +# Drawbacks +[drawbacks]: #drawbacks + +## Guaranteed linear time matching +[guaranteed-linear-time-matching]: #guaranteed-linear-time-matching + +A significant contract in the API of the `regex` crate is that all searching +has worst case `O(n)` complexity, where `n ~ length(text)`. (The size of the +regular expression is taken as a constant.) 
This contract imposes significant +restrictions on both the implementation and the set of features exposed in the +pattern language. A full analysis is beyond the scope of this RFC, but here are +the highlights: + +1. Unbounded backtracking can't be used to implement matching. Backtracking can + be quite fast in practice (indeed, the current implementation uses bounded + backtracking in some cases), but has worst case exponential time. +2. Permitting backreferences in the pattern language can cause matching to + become NP-complete, which (probably) can't be solved in linear time. +3. Arbitrary look around is probably difficult to fit into a linear time + guarantee *in practice*. + +The benefit to the linear time guarantee is just that: no matter what, all +searching completes in linear time with respect to the search text. This is a +valuable guarantee to make, because it means that one can execute arbitrary +regular expressions over arbitrary input and be absolutely sure that it will +finish in some "reasonable" time. + +Of course, in practice, constants that are omitted from complexity analysis +*actually matter*. For this reason, the `regex` crate takes a number of steps +to keep constants low. For example, by placing a limit on the size of the +regular expression or choosing an appropriate matching engine when another +might result in higher constant factors. + +This particular drawback segregates Rust's regular expression library from most +other regular expression libraries that programmers may be familiar with. +Languages such as Java, Python, Perl, Ruby, PHP and C++ support more flavorful +regexes by default. Go is the only language this author knows of whose standard +regex implementation guarantees linear time matching. Of course, RE2 +is also worth mentioning, which is a C++ regex library that guarantees linear +time matching. 
There are other implementations of regexes that guarantee linear +time matching (TRE, for example), but none of them are particularly popular. + +It is also worth noting that since Rust's FFI is zero cost, one can bind to +existing regex implementations that provide more features (bindings for both +PCRE1 and Oniguruma exist today). + +## Allocation +[allocation]: #allocation + +The `regex` API assumes that the implementation can dynamically allocate +memory. Indeed, the current implementation takes advantage of this. A `regex` +library that has no requirement on dynamic memory allocation would look +significantly different than the one that exists today. Dynamic memory +allocation is utilized pervasively in the parser, compiler and even during +search. + +The benefit of permitting dynamic memory allocation is that it makes the +implementation *and* API simpler. This does make use of the `regex` crate in +environments that don't have dynamic memory allocation impossible. + +This author isn't aware of any `regex` library that can work without dynamic +memory allocation. + +With that said, `regex` may want to grow custom allocator support when the +corresponding traits stabilize. + +## Synchronization is implicit +[synchronization-is-implicit]: #synchronization-is-implicit + +Every `Regex` value can be safely used from multiple threads simultaneously. +Since a `Regex` has interior mutable state, this implies that it must do some +kind of synchronization in order to be safe. + +There are some reasons why we might want to do synchronization +automatically: + +1. `Regex` exposes an *immutable API*. That is, from looking at its set of + methods, none of them borrow the `Regex` mutably (or otherwise claim to + mutate the `Regex`). This author claims that since there is no *observable + mutation* of a `Regex`, it *not* being thread safe would violate the + principle of least surprise. +2. 
Often, a `Regex` should be compiled once and reused repeatedly in multiple + searches. To facilitate this, `lazy_static!` can be used to guarantee that + compilation happens exactly once. `lazy_static!` requires its types to be + `Sync`. A user of `Regex` could work around this by wrapping a `Regex` in a + `Mutex`, but this would make misuse too easy. For example, locking a `Regex` + in one thread would prevent simultaneous searching in another thread. + +Synchronization has overhead, although it is extremely small (and dwarfed +by general matching overhead). The author has *ad hoc* benchmarked the +`regex` implementation against GNU Grep, and per match overhead is comparable in +single threaded use. It is this author's opinion that it is good enough. If +synchronization overhead across multiple threads is too much, callers may elect +to clone the `Regex` so that each thread gets its own copy. Cloning a `Regex` +is no more expensive than what would be done internally automatically, but it +does eliminate contention. + +An alternative is to increase the API surface and have types that are +synchronized by default and types that aren't synchronized. This was discussed +at length in +[this +thread](https://users.rust-lang.org/t/help-me-reduce-overhead-of-regex-matching/5220/1). +My conclusion from this thread is that we either expand the surface of the API, +or we break the current API or we keep implicit synchronization as-is. In this +author's opinion, neither expanding the API nor breaking the API is worth +avoiding negligible synchronization overhead. + +## The implementation is complex +[the-implementation-is-complex]: #the-implementation-is-complex + +Regular expression engines have a lot of moving parts and it often requires +quite a bit of context on how the whole library is organized in order to make +significant contributions. Therefore, moving `regex` into `rust-lang` is a +*maintenance hazard*. 
This author has tried to mitigate this hazard somewhat by +doing the following: + +1. Offering to mentor contributions. Significant contributions have thus far + fizzled, but minor contributions---even to complex code like the DFA---have + been successful. +2. Documenting not just the API, but the *internals*. The DFA is, for example, + heavily documented. +3. Wrote a `HACKING.md` guide that gives a sweeping overview of the design. +4. Significant test and benchmark suites. + +With that said, there is still a lot more that could be done to mitigate the +maintenance hazard. In this author's opinion, the interaction between the three +parts of the implementation (parsing, compilation, searching) is not documented +clearly enough. + +# Alternatives +[alternatives]: #alternatives + +## Big picture +[big-picture]: #big-picture + +The most important alternative is to decide *not* to bless a particular +implementation of regular expressions. We might want to go this route for any +number of reasons (see: Drawbacks). However, the `regex` crate is already +widely used, which provides at least some evidence that some set of programmers +find it good enough for general purpose regex searching. + +The impact of not moving `regex` into `rust-lang` is, plainly, that Rust won't +have an "officially blessed" regex implementation. Many programmers may +appreciate the complexity of a regex implementation, and therefore might insist +that one be officially maintained. However, to be honest, it isn't quite clear +what would happen in practice. This author is speculating. + +## `bytes::Regex` +[bytesregex]: #bytesregex + +This RFC proposes stabilizing the `bytes` sub-module of the `regex` crate in +its entirety. The `bytes` sub-module is a near clone of the API at the crate +level with one important difference: it searches `&[u8]` instead of `&str`. +This design was motivated by a similar split in `std`, but there are +alternatives. 
+ +### A regex trait +[a-regex-trait]: #a-regex-trait + +One alternative is designing a trait that looks something like this: + +```rust +trait Regex { + type Text: ?Sized; + + fn is_match(&self, text: &Self::Text) -> bool; + fn find(&self, text: &Self::Text) -> Option<(usize, usize)>; + fn find_iter<'r, 't>(&'r self, text: &'t Self::Text) -> FindIter<'r, 't, Self::Text>; + // and so on +} +``` + +However, there are a couple problems with this approach. First and foremost, +the use cases of such a trait aren't exactly clear. It does make writing +generic code that searches either a `&str` or a `&[u8]` possible, but the +semantics of searching `&str` (always valid UTF-8) or `&[u8]` are quite a bit +different with respect to the original `Regex`. Secondly, the trait isn't +obviously implementable by others. For example, some of the methods return +iterator types such as `FindIter` that are typically implemented with a +lower level API that isn't exposed. This suggests that a straight-forward +traitification of the current API probably isn't appropriate, and perhaps, +a better trait needs to be more fundamental to regex searching. + +Perhaps the strongest reason to not adopt this design for regex `1.0` is that +we don't have any experience with it and there hasn't been any demand for it. +In particular, it could be prototyped in another crate. + +### Reuse some types +[reuse-some-types]: #reuse-some-types + +In the current proposal, the `bytes` submodule completely duplicates the +top-level API, including all iterator types, `Captures` and even the `Replacer` +trait. We could parameterize many of those types over the type of the text +searched. 
For example, the proposed `Replacer` trait looks like this: + +```rust +trait Replacer { + fn replace_append(&mut self, caps: &Captures, dst: &mut String); + + fn no_expansion<'r>(&'r mut self) -> Option<Cow<'r, str>> { + None + } +} +``` + +We might add an associated type like so: + +```rust +trait Replacer { + type Text: ToOwned + ?Sized; + + fn replace_append( + &mut self, + caps: &Captures<Self::Text>, + dst: &mut <Self::Text as ToOwned>::Owned, + ); + + fn no_expansion<'r>(&'r mut self) -> Option<Cow<'r, Self::Text>> { + None + } +} +``` + +But parameterizing the `Captures` type is a little bit tricky. Namely, methods +like `at` want to slice the text at match offsets, but this can't be done +safely in generic code without introducing another public trait. + +The final death knell for this idea is that these two implementations cannot +co-exist: + +```rust +impl<F> Replacer for F where F: FnMut(&Captures) -> String { + type Text = str; + + fn replace_append(&mut self, caps: &Captures, dst: &mut String) { + dst.push_str(&(*self)(caps)); + } +} + +impl<F> Replacer for F where F: FnMut(&Captures) -> Vec<u8> { + type Text = [u8]; + + fn replace_append(&mut self, caps: &Captures, dst: &mut Vec<u8>) { + dst.extend(&(*self)(caps)); + } +} +``` + +Perhaps there is a path through this using yet more types or more traits, but +without a really strong motivating reason to find it, I'm not convinced it's +worth it. Duplicating all of the types is unfortunate, but it's *simple*. + + +# Unresolved questions +[unresolved]: #unresolved-questions + +The `regex` repository has more than just the `regex` crate. + +## `regex-syntax` +[regex-syntax]: #regex-syntax + +This crate exposes a regular expression parser and abstract syntax that is +completely divorced from compilation or searching. It is not part of `regex` +proper since it may experience more frequent breaking changes and is far less +frequently used. It is not clear whether this crate will ever see `1.0`, and if +it does, what criteria would be used to judge it suitable for `1.0`. 
+Nevertheless, it is a useful public API, but it is not part of this RFC. + +## `regex-capi` +[regex-capi]: #regex-capi + +Recently, `regex-capi` was built to provide a C API to this regex library. It +has been used to build [cgo bindings to this library for +Go](https://github.com/BurntSushi/rure-go). Given its young age, it is not part +of this proposal but will be maintained as a pre-1.0 crate in the same +repository. + +## `regex_macros` +[regex_macros]: #regex_macros + +The `regex!` compiler plugin is a macro that can compile regular expressions +when your Rust program compiles. Stated differently, `regex!("...")` is +transformed into Rust code that executes a search of the given pattern +directly. It was written two years ago and largely hasn't changed since. When +it was first written, it had two major benefits: + +1. If there was a syntax error in your regex, your Rust program would not + compile. +2. It was faster. + +Today, (1) can be simulated in practice with the use of a Clippy lint and (2) +is no longer true. In fact, `regex!` is at least one order of magnitude faster +than the standard `Regex` implementation. + +The future of `regex_macros` is not clear. In one sense, since it is a +compiler plugin, there hasn't been much interest in developing it further since +its audience is necessarily limited. In another sense, it's not entirely clear +what its implementation path is. It would take considerable work for it to beat +the current `Regex` implementation (if it's even possible). More discussion on +this is out of scope. + +## Dependencies +[dependencies]: #dependencies + +As of now, `regex` has several dependencies: + +* `aho-corasick` +* `memchr` +* `thread_local` +* `regex-syntax` +* `utf8-ranges` + +All of them except for `thread_local` were written by this author, and were +primarily motivated for use in the `regex` crate. They were split out because +they seem generally useful. 
+ +There may be other things in `regex` (today or in the future) that may also be +helpful to others outside the strict context of `regex`. Is it beneficial to +split such things out and create a longer list of dependencies? Or should we +keep `regex` as tight as possible? + +## Exposing more internals +[exposing-more-internals]: #exposing-more-internals + +It is conceivable that others might find interest in the regex compiler or more +lower level access to the matching engines. We could do something similar to +`regex-syntax` and expose some internals in a separate crate. However, there +isn't a pressing desire to do this at the moment, and would probably require a +good deal of work. + +# Breaking changes +[breaking-changes]: #breaking-changes + +This section of the RFC lists all breaking changes between `regex 0.1` and the +API proposed in this RFC. + +* `bytes::Regex` enables the Unicode flag by default. Previously, it disabled + it by default. The flag can be disabled in the pattern with `(?-u)`. +* The definition of the `Replacer` trait was completely re-worked. Namely, its + API inverts control of allocation so that the caller must provide a `String` + to write to. Previous implementors will need to examine the new API. Moving + to the new API should be straight-forward. +* The `is_empty` method on `Captures` was removed since it always returns + `false` (because every `Captures` has at least one capture group + corresponding to the entire match). +* The `PartialEq` and `Eq` impls on `Regex` were removed. If you need this + functionality, add a newtype around `Regex` and write the corresponding + `PartialEq` and `Eq` impls. +* The lifetime parameters for the `iter` and `iter_named` methods on + `Captures` were fixed. The corresponding iterator types, `SubCaptures` and + `SubCapturesNamed`, grew an additional lifetime parameter. +* The constructor, `Regex::with_size_limit`, was removed. It can be replaced + with use of `RegexBuilder`. 
+* The `is_match` free function was removed. Instead, compile a `Regex` + explicitly and call the `is_match` method. +* Many iterator types were renamed. (e.g., `RegexSplits` to `SplitsIter`.) +* Replacements now return a `Cow` instead of a `String`. Namely, the + subject text doesn't need to be copied if there are no replacements. Callers + may need to add `into_owned()` calls to convert the `Cow` to a proper + `String`. +* The `Error` type no longer has the `InvalidSet` variant, since the error is + no longer possible. Its `Syntax` variant was also modified to wrap a `String` + instead of a `regex_syntax::Error`. If you need access to specific parse + error information, use the `regex-syntax` crate directly. +* To allow future growth, some character classes may no longer compile to make + room for possibly adding class set notation in the future. From 5329e0248a3e2281b9f9892ed161a74466aab284 Mon Sep 17 00:00:00 2001 From: Andrew Gallant Date: Wed, 18 May 2016 19:59:23 -0400 Subject: [PATCH 0916/1195] update Error type --- text/0000-regex-1.0.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/text/0000-regex-1.0.md b/text/0000-regex-1.0.md index 3fcac13c152..9f969e32985 100644 --- a/text/0000-regex-1.0.md +++ b/text/0000-regex-1.0.md @@ -276,7 +276,7 @@ change. (Removing or changing an existing variant is still a breaking change.) ```rust pub enum Error { /// A syntax error. - Syntax(String), + Syntax(SyntaxError), /// The compiled program exceeded the set size limit. /// The argument is the size limit imposed. CompiledTooBig(usize), @@ -291,9 +291,10 @@ pub enum Error { ``` Note that the `Syntax` variant could contain the `Error` type from the -`regex-syntax` crate, but this couples `regex-syntax` to the public API of -`regex`. We sidestep this hazard by converting `regex-syntax`'s error to a -`String`. +`regex-syntax` crate, but this couples `regex-syntax` to the public API +of `regex`. 
We sidestep this hazard by defining a newtype in `regex` that +internally wraps `regex_syntax::Error`. This also enables us to selectively +expose more information in the future. ## RegexBuilder [regexbuilder]: #regexbuilder From 03a99550c9234d98422fadf07f683fdde9052810 Mon Sep 17 00:00:00 2001 From: Andrew Gallant Date: Thu, 19 May 2016 06:34:33 -0400 Subject: [PATCH 0917/1195] regex! is slower, not faster --- text/0000-regex-1.0.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-regex-1.0.md b/text/0000-regex-1.0.md index 9f969e32985..8acc5d39f4f 100644 --- a/text/0000-regex-1.0.md +++ b/text/0000-regex-1.0.md @@ -907,7 +907,7 @@ it was first written, it had two major benefits: 2. It was faster. Today, (1) can be simulated in practice with the use of a Clippy lint and (2) -is no longer true. In fact, `regex!` is at least one order of magnitude faster +is no longer true. In fact, `regex!` is at least one order of magnitude slower than the standard `Regex` implementation. The future of `regex_macros` is not clear. In one sense, since it is a From 25f59dc863db3da16a08127cf1e8708c5713529a Mon Sep 17 00:00:00 2001 From: Andrew Gallant Date: Thu, 19 May 2016 06:35:27 -0400 Subject: [PATCH 0918/1195] s/remain/remained --- text/0000-regex-1.0.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-regex-1.0.md b/text/0000-regex-1.0.md index 8acc5d39f4f..e2e102d5390 100644 --- a/text/0000-regex-1.0.md +++ b/text/0000-regex-1.0.md @@ -110,7 +110,7 @@ The syntax is exhaustively documented in the current public API documentation: http://doc.rust-lang.org/regex/regex/index.html#syntax To my knowledge, the evolution as proposed in this RFC has been followed since -`regex` was created. The syntax has largely remain unchanged with few +`regex` was created. The syntax has largely remained unchanged with few additions. 
### Expansion concerns From a3316b957d96001507e57fcab2d8edd90e54db8d Mon Sep 17 00:00:00 2001 From: Diggory Hardy Date: Fri, 20 May 2016 16:08:35 +0100 Subject: [PATCH 0919/1195] loop-break-value RFC (see #961) --- text/0000-loop-break-value.md | 161 ++++++++++++++++++++++++++++++++++ 1 file changed, 161 insertions(+) create mode 100644 text/0000-loop-break-value.md diff --git a/text/0000-loop-break-value.md b/text/0000-loop-break-value.md new file mode 100644 index 00000000000..f659e517d19 --- /dev/null +++ b/text/0000-loop-break-value.md @@ -0,0 +1,161 @@ +- Feature Name: loop_break_value +- Start Date: 2016-05-20 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Let a `loop { ... }` expression return a value via `break my_value;`. + +# Motivation +[motivation]: #motivation + +This pattern is currently hard to implement without resorting to a function or +closure wrapping the loop: + +```rust +fn f() { + let outcome = loop { + // get and process some input, e.g. from the user or from a list of + // files + let result = get_result(); + + if successful() { + break result; + } + // otherwise keep trying + }; + + use_the_result(outcome); +} +``` + +In some cases, one can simply move `use_the_result(outcome)` into the loop, but +sometimes this is undesirable and sometimes impossible due to lifetimes. + +# Detailed design +[design]: #detailed-design + +This proposal does two things: let `break` take a value, and let `loop` have a +result type other than `()`. + +### break syntax + +Four forms of `break` will be supported: + +1. `break;` +2. `break 'label;` +3. `break EXPR;` +4. `break 'label EXPR;` + +where `'label` is the name of a looping construct and `EXPR` is an evaluable +expression. + +### result type + +Currently the result-type of a 'loop' without 'break' is `!` (never returns), +which may be coerced to `()`. 
This is important since a loop may appear as +the last expression of a function: + +```rust +fn f() { + loop { + do_something(); + // never breaks + } +} +fn g() -> () { + loop { + do_something(); + if Q() { break; } + } +} +fn h() -> ! { + loop { + do_something(); + // this loop is not allowed to break due to `!` result type + } +} +``` + +This proposal changes the result type to `T`, where: + +* a loop which is never "broken" via `break` has result-type `!` (coercible to `()`) +* a loop's return type may be deduced from its context, e.g. `let x: T = loop { ... };` +* where a loop is "broken" via `break;` or `break 'label;`, its result type is `()` +* where a loop is "broken" via `break EXPR;` or `break 'label EXPR;`, `EXPR` must evaluate to type `T` + +It is an error if these types do not agree. Examples: + +```rust +// error: loop type must be ! or () not i32 +let a: i32 = loop {}; +// error: loop type must be i32 and must be &str +let b: i32 = loop { break "I am not an integer."; }; +// error: loop type must be Option<_> and must be &str +let c = loop { + if Q() { + "answer" + } else { + None + } +}; +fn z() -> ! { + // function does not return + // error: loop may break (same behaviour as before) + loop { + if Q() { break; } + } +} +``` + +### result value + +A loop only yields a value if broken via some form of `break ...;` statement, +in which case it yields the value resulting from the evaulation of the +statement's expression (`EXPR` above), or `()` if there is no `EXPR` +expression. + +Examples: + +```rust +assert_eq!(loop { break; }, ()); +assert_eq!(loop { break 5; }, 5); +let x = 'a loop { + 'b loop { + break 'a 1; + } + break 'a 2; +}; +assert_eq!(x, 1); +``` + +# Drawbacks +[drawbacks]: #drawbacks + +The proposal changes the syntax of `break` statements, requiring updates to +parsers and possibly syntax highlighters. + +The type of `loop` expressions is no longer fixed and cannot be explicitly +typed. 
+ +# Alternatives +[alternatives]: #alternatives + +No alternatives to the design have been suggested. It has been suggested that +the feature itself is unnecessary, and indeed much Rust code already exists +without it, however the pattern solves some cases which are difficult to handle +otherwise and allows more flexibility in code layout. + +# Unresolved questions +[unresolved]: #unresolved-questions + +It would be possible to allow `for`, `while` and `while let` expressions return +values in a similar way; however, these expressions may also terminate +"naturally" (not via break), and no consensus has been reached on how the +result value should be determined in this case, or even the result type. +It is thus proposed not to change these expressions at this time. + +It should be noted that `for`, `while` and `while let` can all be emulated via +`loop`, so perhaps allowing the former to return values is less important. From 7fda498e3db5c6a1d343647c5f19d4ceb55f6870 Mon Sep 17 00:00:00 2001 From: Diggory Hardy Date: Fri, 20 May 2016 16:23:22 +0100 Subject: [PATCH 0920/1195] Reference existing work --- text/0000-loop-break-value.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/text/0000-loop-break-value.md b/text/0000-loop-break-value.md index f659e517d19..1f38940800a 100644 --- a/text/0000-loop-break-value.md +++ b/text/0000-loop-break-value.md @@ -6,6 +6,10 @@ # Summary [summary]: #summary +(This is a result of discussion of issue #961 and related to RFCs +[352](https://github.com/rust-lang/rfcs/pull/352) and +[955](https://github.com/rust-lang/rfcs/pull/955).) + Let a `loop { ... }` expression return a value via `break my_value;`. # Motivation @@ -159,3 +163,5 @@ It is thus proposed not to change these expressions at this time. It should be noted that `for`, `while` and `while let` can all be emulated via `loop`, so perhaps allowing the former to return values is less important. + +See discussion of #961 for more on this topic. 
From 75b82f4c00f76b3bae2d67c07176cb1ba271f8a7 Mon Sep 17 00:00:00 2001 From: Diggory Hardy Date: Fri, 20 May 2016 16:24:37 +0100 Subject: [PATCH 0921/1195] Link issue since document is in a different repo --- text/0000-loop-break-value.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/text/0000-loop-break-value.md b/text/0000-loop-break-value.md index 1f38940800a..b2ea7569bc7 100644 --- a/text/0000-loop-break-value.md +++ b/text/0000-loop-break-value.md @@ -6,7 +6,8 @@ # Summary [summary]: #summary -(This is a result of discussion of issue #961 and related to RFCs +(This is a result of discussion of +[issue #961](https://github.com/rust-lang/rfcs/issues/961) and related to RFCs [352](https://github.com/rust-lang/rfcs/pull/352) and [955](https://github.com/rust-lang/rfcs/pull/955).) @@ -164,4 +165,5 @@ It is thus proposed not to change these expressions at this time. It should be noted that `for`, `while` and `while let` can all be emulated via `loop`, so perhaps allowing the former to return values is less important. -See discussion of #961 for more on this topic. +See [discussion of #961](https://github.com/rust-lang/rfcs/issues/961) +for more on this topic. From b7ca757a78651a310ca532672f54537eeb69feab Mon Sep 17 00:00:00 2001 From: John Ericson Date: Fri, 20 May 2016 08:39:01 -0700 Subject: [PATCH 0922/1195] A few suggestions (feel free to reject! :)) I can't make line edits without an open PR, so I did this instead. --- text/0000-loop-break-value.md | 47 ++++++++++++++++++++++++----------- 1 file changed, 32 insertions(+), 15 deletions(-) diff --git a/text/0000-loop-break-value.md b/text/0000-loop-break-value.md index b2ea7569bc7..e5e53d0b83d 100644 --- a/text/0000-loop-break-value.md +++ b/text/0000-loop-break-value.md @@ -43,9 +43,9 @@ sometimes this is undesirable and sometimes impossible due to lifetimes. 
[design]: #detailed-design This proposal does two things: let `break` take a value, and let `loop` have a -result type other than `()`. +type other than `()`. -### break syntax +## Break Syntax Four forms of `break` will be supported: @@ -54,12 +54,11 @@ Four forms of `break` will be supported: 3. `break EXPR;` 4. `break 'label EXPR;` -where `'label` is the name of a looping construct and `EXPR` is an evaluable -expression. +where `'label` is the name of a loop and `EXPR` is an expression. -### result type +### Type of Loop -Currently the result-type of a 'loop' without 'break' is `!` (never returns), +Currently the type of a 'loop' without 'break' is `!` (never returns), which may be coerced to `()`. This is important since a loop may appear as the last expression of a function: @@ -79,17 +78,17 @@ fn g() -> () { fn h() -> ! { loop { do_something(); - // this loop is not allowed to break due to `!` result type + // this loop is not allowed to break due to inferred `!` type } } ``` -This proposal changes the result type to `T`, where: +This proposal changes the type to `T`, where: -* a loop which is never "broken" via `break` has result-type `!` (coercible to `()`) -* a loop's return type may be deduced from its context, e.g. `let x: T = loop { ... };` -* where a loop is "broken" via `break;` or `break 'label;`, its result type is `()` -* where a loop is "broken" via `break EXPR;` or `break 'label EXPR;`, `EXPR` must evaluate to type `T` +* a loop which is never "broken" via `break` has type `!` (which is coercible to anything, as today) +* where a loop is "broken" via `break;` or `break 'label;`, its type is `()` +* where a loop is "broken" via `break EXPR;` or `break 'label EXPR;`, and `EXPR: T`, the loop has type `T` +* all "breaks" out of of a loop must have the same type It is an error if these types do not agree. 
Examples: @@ -100,7 +99,7 @@ let a: i32 = loop {}; let b: i32 = loop { break "I am not an integer."; }; // error: loop type must be Option<_> and must be &str let c = loop { - if Q() { + break if Q() { "answer" } else { None @@ -115,14 +114,14 @@ fn z() -> ! { } ``` -### result value +## Result Value A loop only yields a value if broken via some form of `break ...;` statement, in which case it yields the value resulting from the evaulation of the statement's expression (`EXPR` above), or `()` if there is no `EXPR` expression. -Examples: +## Examples ```rust assert_eq!(loop { break; }, ()); @@ -135,6 +134,24 @@ let x = 'a loop { }; assert_eq!(x, 1); ``` +```rust +fn y() -> () { + loop { + if coin_flip() { + break; + } else { + break (); + } + } +} +``` +```rust +fn z() -> ! { + loop { + break panic!(); + } +} +``` # Drawbacks [drawbacks]: #drawbacks From 57a00c09829ec31bd7b7e918465c192398e2f19f Mon Sep 17 00:00:00 2001 From: Diggory Hardy Date: Fri, 20 May 2016 16:52:35 +0100 Subject: [PATCH 0923/1195] Address comments, specifically from glaebhoerl --- text/0000-loop-break-value.md | 50 ++++++++++++++++++++++++++--------- 1 file changed, 38 insertions(+), 12 deletions(-) diff --git a/text/0000-loop-break-value.md b/text/0000-loop-break-value.md index b2ea7569bc7..1ea3033af00 100644 --- a/text/0000-loop-break-value.md +++ b/text/0000-loop-break-value.md @@ -59,8 +59,9 @@ expression. ### result type -Currently the result-type of a 'loop' without 'break' is `!` (never returns), -which may be coerced to `()`. This is important since a loop may appear as +Currently the result-type of a 'loop' without 'break' is `!` (never returns, +and may be coerced to any type), and the result type of a 'loop' with 'break' +is `()`. This is important since a loop may appear as the last expression of a function: ```rust @@ -86,24 +87,24 @@ fn h() -> ! 
{ This proposal changes the result type to `T`, where: -* a loop which is never "broken" via `break` has result-type `!` (coercible to `()`) -* a loop's return type may be deduced from its context, e.g. `let x: T = loop { ... };` +* a loop which is never "broken" via `break` has result-type `!` (coercible) * where a loop is "broken" via `break;` or `break 'label;`, its result type is `()` * where a loop is "broken" via `break EXPR;` or `break 'label EXPR;`, `EXPR` must evaluate to type `T` +* a loop's return type may be deduced from its context, e.g. `let x: T = loop { ... };` It is an error if these types do not agree. Examples: ```rust -// error: loop type must be ! or () not i32 -let a: i32 = loop {}; +// error: loop type must be () and must be i32 +let a: i32 = loop { break; }; // error: loop type must be i32 and must be &str let b: i32 = loop { break "I am not an integer."; }; // error: loop type must be Option<_> and must be &str let c = loop { if Q() { - "answer" + break "answer"; } else { - None + break None; } }; fn z() -> ! { @@ -115,6 +116,19 @@ fn z() -> ! { } ``` +Where a loop does not break, the return type is coercible: + +```rust +fn f() -> () { + // ! coerces to () + loop {} +} +fn g() -> u32 { + // ! coerces to u32 + loop {} +} +``` + ### result value A loop only yields a value if broken via some form of `break ...;` statement, @@ -142,9 +156,6 @@ assert_eq!(x, 1); The proposal changes the syntax of `break` statements, requiring updates to parsers and possibly syntax highlighters. -The type of `loop` expressions is no longer fixed and cannot be explicitly -typed. - # Alternatives [alternatives]: #alternatives @@ -164,6 +175,21 @@ It is thus proposed not to change these expressions at this time. It should be noted that `for`, `while` and `while let` can all be emulated via `loop`, so perhaps allowing the former to return values is less important. 
+Alternatively, a new keyword such as `default` or `else` could be used to +specify the other exit value as in: + +```rust +fn first(list: Iterator) -> Option { + for x in list { + break Some(x); + } default { + None + } +} +``` -See [discussion of #961](https://github.com/rust-lang/rfcs/issues/961) +The exact syntax is disputed. It is suggested that this RFC should not be +blocked on this issue since break-with-value can still be implemented in the +manner above after this RFC. See the +[discussion of #961](https://github.com/rust-lang/rfcs/issues/961) for more on this topic. From ba6b0cc181023eb0833d08194b8b512c330b37b9 Mon Sep 17 00:00:00 2001 From: Diggory Hardy Date: Fri, 20 May 2016 17:19:48 +0100 Subject: [PATCH 0924/1195] Changes resulting from Ericson2314's observations, in particular `break panic!();` --- text/0000-loop-break-value.md | 56 +++++++++++++++++++---------------- 1 file changed, 30 insertions(+), 26 deletions(-) diff --git a/text/0000-loop-break-value.md b/text/0000-loop-break-value.md index cc80ad64886..8237c391203 100644 --- a/text/0000-loop-break-value.md +++ b/text/0000-loop-break-value.md @@ -84,14 +84,17 @@ fn h() -> ! { } ``` -This proposal changes the type to `T`, where: +This proposal changes the result type of 'loop' to `T`, where: -* a loop which is never "broken" via `break` has result-type `!` (which is coercible to anything, as of today) -* where a loop is "broken" via `break;` or `break 'label;`, its result type is `()` -* where a loop is "broken" via `break EXPR;` or `break 'label EXPR;`, `EXPR` must evaluate to type `T` -* a loop's return type may be deduced from its context, e.g. `let x: T = loop { ... 
};` +* if a loop is "broken" via `break;` or `break 'label;`, the loop's result type must be `()` +* if a loop is "broken" via `break EXPR;` or `break 'label EXPR;`, `EXPR` must evaluate to type `T` +* as a special case, if a loop is "broken" via `break EXPR;` or `break 'label EXPR;` where `EXPR` evaluates to type `!` (does not return), this does not place a constraint on the type of the loop +* if external constaint on the loop's result type exist (e.g. `let x: S = loop { ... };`), then `T` must be coercible to this type -It is an error if these types do not agree. Examples: +It is an error if these types do not agree or if the compiler's type deduction +rules do not yield a concrete type. + +Examples of errors: ```rust // error: loop type must be () and must be i32 @@ -115,7 +118,7 @@ fn z() -> ! { } ``` -Where a loop does not break, the return type is coercible: +Examples involving `!`: ```rust fn f() -> () { @@ -126,6 +129,25 @@ fn g() -> u32 { // ! coerces to u32 loop {} } +fn z() -> ! { + loop { + break panic!(); + } +} +``` + +Example showing the equivalence of `break;` and `break ();`: + +```rust +fn y() -> () { + loop { + if coin_flip() { + break; + } else { + break (); + } + } +} ``` ### Result value @@ -135,7 +157,7 @@ in which case it yields the value resulting from the evaulation of the statement's expression (`EXPR` above), or `()` if there is no `EXPR` expression. -## Examples +Examples: ```rust assert_eq!(loop { break; }, ()); @@ -148,24 +170,6 @@ let x = 'a loop { }; assert_eq!(x, 1); ``` -```rust -fn y() -> () { - loop { - if coin_flip() { - break; - } else { - break (); - } - } -} -``` -```rust -fn z() -> ! 
{ - loop { - break panic!(); - } -} -``` # Drawbacks [drawbacks]: #drawbacks From 48ac001e8d8cf473489ba66e4353d8ae545da30d Mon Sep 17 00:00:00 2001 From: Andre Bogus Date: Fri, 20 May 2016 18:30:41 +0200 Subject: [PATCH 0925/1195] new RFC: static_lifetime_in_statics --- text/0000-static.md | 70 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 70 insertions(+) create mode 100644 text/0000-static.md diff --git a/text/0000-static.md b/text/0000-static.md new file mode 100644 index 00000000000..76e51b75132 --- /dev/null +++ b/text/0000-static.md @@ -0,0 +1,70 @@ +- Feature Name: static_lifetime_in_statics +- Start Date: 2016-05-20 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Let's default lifetimes in static and const declarations to `'static`. + +# Motivation +[motivation]: #motivation + +Currently, having references in `static` and `const` declarations is cumbersome +due to having to explicitly write `&'static ..`. On the other hand anything but +static is likely either useless, unsound or both. Also the long lifetime name +causes substantial rightwards drift, which makes it hard to format the code +to be visually appealing. + +For example, having a `'static` default for lifetimes would turn this: +``` +static my_awesome_tables: &'static [&'static HashMap, u32>] = .. +``` +into this: +``` +static my_awesome_table: &[&HashMap, u32>] = .. +``` + +The type declaration still causes some rightwards drift, but at least all the +contained information is useful. + +# Detailed design +[design]: #detailed-design + +The same default that RFC #599 sets up for trait object is to be used for +statics and const declarations. In those declarations, the compiler will assume +`'static` when a lifetime is not explicitly given in both refs and generics. + +Note that this RFC does not forbid writing the lifetimes, it only sets a +default when no is given. 
Thus the change is unlikely to cause any breakage and +should be deemed backwards-compatible. It's also very unlikely that +implementing this RFC will restrict our design space for `static` and `const` +definitions down the road. + +# Drawbacks +[drawbacks]: #drawbacks + +There are no known drawbacks to this change. + +# Alternatives +[alternatives]: #alternatives + +* Leave everything as it is. Everyone using static references is annoyed by +having to add `'static` without any value to readability. People will resort to +writing macros if they have many resources. +* Write the aforementioned macro. This is inferior in terms of UX. Depending on +the implementation it may or may not be possible to default lifetimes in +generics. +* Infer types for statics. The absence of types makes it harder to reason about +the code, so even if type inference for statics was to be implemented, +defaulting lifetimes would have the benefit of pulling the cost-benefit +relation in the direction of more explicit code. Thus it is advisable to +implement this change even with the possibility of implementing type inference +later. + +# Unresolved questions +[unresolved]: #unresolved-questions + +* Does this change requires changing the grammar? +* Are there other Rust-code handling programs that need to be updated? From 2c875d74ef8d993ff3a26c4c9215a8db49f3d4a0 Mon Sep 17 00:00:00 2001 From: Andre Bogus Date: Fri, 20 May 2016 19:11:06 +0200 Subject: [PATCH 0926/1195] include @nikomatsakis' examples, clarify lifetime elision precedence --- text/0000-static.md | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/text/0000-static.md b/text/0000-static.md index 76e51b75132..1bfd6f507fc 100644 --- a/text/0000-static.md +++ b/text/0000-static.md @@ -27,7 +27,8 @@ static my_awesome_table: &[&HashMap, u32>] = .. ``` The type declaration still causes some rightwards drift, but at least all the -contained information is useful. +contained information is useful. 
There is one exception to the rule: lifetime +elision for function signatures will work as it does now (see example below). # Detailed design [design]: #detailed-design @@ -42,6 +43,24 @@ should be deemed backwards-compatible. It's also very unlikely that implementing this RFC will restrict our design space for `static` and `const` definitions down the road. +The `'static` default does *not* override lifetime elision in function +signatures, but work alongside it: + +```rust +static foo: fn(&u32) -> &u32 = ...; // for<'a> fn(&'a u32) -> &'a u32 +static bar: &Fn(&u32) -> &u32 = ...; // &'static for<'a> Fn(&'a u32) -> &'a u32 +``` + +With generics, it will work as anywhere else. Notably, writing out the lifetime +is still possible. + +``` +trait SomeObject<'a> { ... } +static foo: &SomeObject = ...; // &'static SomeObject<'static> +static bar: &for<'a> SomeObject<'a> = ...; // &'static for<'a> SomeObject<'a> +static baz: &'static [u8] = ...; +``` + # Drawbacks [drawbacks]: #drawbacks From 315101a0c21b715c9ad019faf483fb5d2399173c Mon Sep 17 00:00:00 2001 From: Andre Bogus Date: Fri, 20 May 2016 19:12:33 +0200 Subject: [PATCH 0927/1195] drop confusing sentence in motivation --- text/0000-static.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/text/0000-static.md b/text/0000-static.md index 1bfd6f507fc..902dc4fd331 100644 --- a/text/0000-static.md +++ b/text/0000-static.md @@ -12,8 +12,7 @@ Let's default lifetimes in static and const declarations to `'static`. [motivation]: #motivation Currently, having references in `static` and `const` declarations is cumbersome -due to having to explicitly write `&'static ..`. On the other hand anything but -static is likely either useless, unsound or both. Also the long lifetime name +due to having to explicitly write `&'static ..`. Also the long lifetime name causes substantial rightwards drift, which makes it hard to format the code to be visually appealing. 
From 93b685e9e4b029ac7a24d52d6d1be24581a99020 Mon Sep 17 00:00:00 2001 From: Andre Bogus Date: Fri, 20 May 2016 19:14:58 +0200 Subject: [PATCH 0928/1195] clarify zero breakage, remove answered question --- text/0000-static.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/text/0000-static.md b/text/0000-static.md index 902dc4fd331..7b5c82df034 100644 --- a/text/0000-static.md +++ b/text/0000-static.md @@ -37,10 +37,10 @@ statics and const declarations. In those declarations, the compiler will assume `'static` when a lifetime is not explicitly given in both refs and generics. Note that this RFC does not forbid writing the lifetimes, it only sets a -default when no is given. Thus the change is unlikely to cause any breakage and -should be deemed backwards-compatible. It's also very unlikely that -implementing this RFC will restrict our design space for `static` and `const` -definitions down the road. +default when no is given. Thus the change will not cause any breakage and is +thus backwards-compatible. It's also very unlikely that implementing this RFC +will restrict our design space for `static` and `const` definitions down the +road. The `'static` default does *not* override lifetime elision in function signatures, but work alongside it: @@ -84,5 +84,5 @@ later. # Unresolved questions [unresolved]: #unresolved-questions -* Does this change requires changing the grammar? -* Are there other Rust-code handling programs that need to be updated? +* Are there third party Rust-code handling programs that need to be updated to +deal with this change? 
From 36517812becfc395f346020f91efeab4598afe44 Mon Sep 17 00:00:00 2001 From: Diggory Hardy Date: Fri, 20 May 2016 20:06:51 +0100 Subject: [PATCH 0929/1195] Require break expr to converge --- text/0000-loop-break-value.md | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-) diff --git a/text/0000-loop-break-value.md b/text/0000-loop-break-value.md index 8237c391203..33df7c2f157 100644 --- a/text/0000-loop-break-value.md +++ b/text/0000-loop-break-value.md @@ -54,7 +54,7 @@ Four forms of `break` will be supported: 3. `break EXPR;` 4. `break 'label EXPR;` -where `'label` is the name of a loop and `EXPR` is an expression. +where `'label` is the name of a loop and `EXPR` is an converging expression. ### Result type of loop @@ -88,7 +88,6 @@ This proposal changes the result type of 'loop' to `T`, where: * if a loop is "broken" via `break;` or `break 'label;`, the loop's result type must be `()` * if a loop is "broken" via `break EXPR;` or `break 'label EXPR;`, `EXPR` must evaluate to type `T` -* as a special case, if a loop is "broken" via `break EXPR;` or `break 'label EXPR;` where `EXPR` evaluates to type `!` (does not return), this does not place a constraint on the type of the loop * if external constaint on the loop's result type exist (e.g. `let x: S = loop { ... };`), then `T` must be coercible to this type It is an error if these types do not agree or if the compiler's type deduction @@ -129,11 +128,6 @@ fn g() -> u32 { // ! coerces to u32 loop {} } -fn z() -> ! 
{ - loop { - break panic!(); - } -} ``` Example showing the equivalence of `break;` and `break ();`: From 5254944e7db9b968193f806e8fc919b4e6d1142c Mon Sep 17 00:00:00 2001 From: Diggory Hardy Date: Mon, 23 May 2016 11:37:13 +0100 Subject: [PATCH 0930/1195] loop-break-value: update motivation --- text/0000-loop-break-value.md | 66 +++++++++++++++++++++++++---------- 1 file changed, 48 insertions(+), 18 deletions(-) diff --git a/text/0000-loop-break-value.md b/text/0000-loop-break-value.md index 33df7c2f157..7b3462c5227 100644 --- a/text/0000-loop-break-value.md +++ b/text/0000-loop-break-value.md @@ -16,28 +16,58 @@ Let a `loop { ... }` expression return a value via `break my_value;`. # Motivation [motivation]: #motivation -This pattern is currently hard to implement without resorting to a function or -closure wrapping the loop: +> Rust is an expression-oriented language. Currently loop constructs don't +> provide any useful value as expressions, they are run only for their +> side-effects. But there clearly is a "natural-looking", practical case, +> described in [this thread](https://github.com/rust-lang/rfcs/issues/961) +> and [this] RFC, where the loop expressions could have +> meaningful values. I feel that not allowing that case runs against the +> expression-oriented conciseness of Rust. +> [comment by golddranks](https://github.com/rust-lang/rfcs/issues/961#issuecomment-220820787) + +Some examples which can be much more concisely written with this RFC: ```rust -fn f() { - let outcome = loop { - // get and process some input, e.g. from the user or from a list of - // files - let result = get_result(); - - if successful() { - break result; +// without break-with-value: +let x = { + let temp_bar; + loop { + ... + if ... { + temp_bar = bar; + break; } - // otherwise keep trying - }; - - use_the_result(outcome); -} -``` + } + foo(temp_bar) +}; + +// with break-with-value: +let x = foo(loop { + ... + if ... 
{ break bar; } + }); -In some cases, one can simply move `use_the_result(outcome)` into the loop, but -sometimes this is undesirable and sometimes impossible due to lifetimes. +// without break-with-value: +let computation = { + let result; + loop { + if let Some(r) = self.do_something() { + result = r; + break; + } + } + result.do_computation() +}; +self.use(computation); + +// with break-with-value: +let computation = loop { + if let Some(r) = self.do_something() { + break r; + } + }.do_computation(); +self.use(computation); +``` # Detailed design [design]: #detailed-design From 894ae7c80781ab1049ec3e330a74690056f9e1ed Mon Sep 17 00:00:00 2001 From: Diggory Hardy Date: Mon, 23 May 2016 11:48:14 +0100 Subject: [PATCH 0931/1195] loop-break-value: improve consitency and update syntax for for loops --- text/0000-loop-break-value.md | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/text/0000-loop-break-value.md b/text/0000-loop-break-value.md index 7b3462c5227..b420beb6f65 100644 --- a/text/0000-loop-break-value.md +++ b/text/0000-loop-break-value.md @@ -28,7 +28,7 @@ Let a `loop { ... }` expression return a value via `break my_value;`. Some examples which can be much more concisely written with this RFC: ```rust -// without break-with-value: +// without loop-break-value: let x = { let temp_bar; loop { @@ -41,13 +41,13 @@ let x = { foo(temp_bar) }; -// with break-with-value: +// with loop-break-value: let x = foo(loop { ... if ... { break bar; } }); -// without break-with-value: +// without loop-break-value: let computation = { let result; loop { @@ -60,7 +60,7 @@ let computation = { }; self.use(computation); -// with break-with-value: +// with loop-break-value: let computation = loop { if let Some(r) = self.do_something() { break r; @@ -88,7 +88,7 @@ where `'label` is the name of a loop and `EXPR` is an converging expression. 
### Result type of loop -Currently the result-type of a 'loop' without 'break' is `!` (never returns), +Currently the result type of a 'loop' without 'break' is `!` (never returns), which may be coerced to any type), and the result type of a 'loop' with 'break' is `()`. This is important since a loop may appear as the last expression of a function: @@ -220,21 +220,21 @@ It is thus proposed not to change these expressions at this time. It should be noted that `for`, `while` and `while let` can all be emulated via `loop`, so perhaps allowing the former to return values is less important. -Alternatively, a new keyword such as `default` or `else` could be used to +Alternatively, a keyword such as `else default` could be used to specify the other exit value as in: ```rust fn first(list: Iterator) -> Option { for x in list { break Some(x); - } default { + } else default { None } } ``` -The exact syntax is disputed. It is suggested that this RFC should not be -blocked on this issue since break-with-value can still be implemented in the -manner above after this RFC. See the -[discussion of #961](https://github.com/rust-lang/rfcs/issues/961) -for more on this topic. +The exact syntax is disputed; (JelteF has some suggestions which should work +without infinite parser lookahead) +[https://github.com/rust-lang/rfcs/issues/961#issuecomment-220728894]. +It is suggested that this RFC should not be blocked on this issue since +loop-break-value can still be implemented in the manner above after this RFC. 
From 0496c698b055e899b764f7604b74ca235bf35e8b Mon Sep 17 00:00:00 2001 From: Peter Marheine Date: Mon, 23 May 2016 10:22:21 -0600 Subject: [PATCH 0932/1195] Correct compile-time errors in RFC 1201 example --- text/1201-naked-fns.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/1201-naked-fns.md b/text/1201-naked-fns.md index 2e340df4463..2249726fec1 100644 --- a/text/1201-naked-fns.md +++ b/text/1201-naked-fns.md @@ -144,7 +144,7 @@ use std::sync::atomic::{self, AtomicUsize, Ordering}; #[naked] #[cfg(target_arch="x86")] -unsafe fn isr_3() { +unsafe extern "C" fn isr_3() { asm!("pushad call increment_breakpoint_count popad @@ -159,7 +159,7 @@ pub fn increment_breakpoint_count() { bp_count.fetch_add(1, Ordering::Relaxed); } -fn register_isr(vector: u8, handler: fn() -> ()) { /* ... */ } +fn register_isr(vector: u8, handler: unsafe extern "C" fn() -> ()) { /* ... */ } fn main() { register_isr(3, isr_3); From 2526483e8fff9d3cca3d092f7d5b49a46c67c9e3 Mon Sep 17 00:00:00 2001 From: Andre Bogus Date: Tue, 24 May 2016 18:45:41 +0200 Subject: [PATCH 0933/1195] clarify interaction with elisions --- text/0000-static.md | 48 ++++++++++++++++++++++++++++++++++++--------- 1 file changed, 39 insertions(+), 9 deletions(-) diff --git a/text/0000-static.md b/text/0000-static.md index 7b5c82df034..1ac65277aac 100644 --- a/text/0000-static.md +++ b/text/0000-static.md @@ -17,11 +17,11 @@ causes substantial rightwards drift, which makes it hard to format the code to be visually appealing. For example, having a `'static` default for lifetimes would turn this: -``` +```rust static my_awesome_tables: &'static [&'static HashMap, u32>] = .. ``` into this: -``` +```rust static my_awesome_table: &[&HashMap, u32>] = .. ``` @@ -34,13 +34,14 @@ elision for function signatures will work as it does now (see example below). The same default that RFC #599 sets up for trait object is to be used for statics and const declarations. 
In those declarations, the compiler will assume -`'static` when a lifetime is not explicitly given in both refs and generics. +`'static` when a lifetime is not explicitly given in all reference lifetimes, +including reference lifetimes obtained via generic substitution. Note that this RFC does not forbid writing the lifetimes, it only sets a default when no is given. Thus the change will not cause any breakage and is -thus backwards-compatible. It's also very unlikely that implementing this RFC -will restrict our design space for `static` and `const` definitions down the -road. +therefore backwards-compatible. It's also very unlikely that implementing this +RFC will restrict our design space for `static` and `const` definitions down +the road. The `'static` default does *not* override lifetime elision in function signatures, but work alongside it: @@ -50,16 +51,40 @@ static foo: fn(&u32) -> &u32 = ...; // for<'a> fn(&'a u32) -> &'a u32 static bar: &Fn(&u32) -> &u32 = ...; // &'static for<'a> Fn(&'a u32) -> &'a u32 ``` -With generics, it will work as anywhere else. Notably, writing out the lifetime +With generics, it will work as anywhere else, also differentiating between +function lifetimes and reference lifetimes. Notably, writing out the lifetime is still possible. -``` -trait SomeObject<'a> { ... } +```rust +trait SomeObject<'a> { .. } static foo: &SomeObject = ...; // &'static SomeObject<'static> static bar: &for<'a> SomeObject<'a> = ...; // &'static for<'a> SomeObject<'a> static baz: &'static [u8] = ...; + +struct SomeStruct<'a, 'b> { + foo: &'a Foo, + bar: &'a Bar, + f: for<'b> Fn(&'b Foo) -> &'b Bar +} + +static blub: &SomeStruct = ...; // &'static SomeStruct<'static, 'b> for any 'b ``` +It will still be an error to omit lifetimes in function types *not* eligible +for elision, e.g. 
+ +```rust +static blobb: FnMut(&Foo, &Bar) -> &Baz = ...; //~ ERROR: missing lifetimes for + //^ &Foo, &Bar, &Baz +``` + +This ensures that the really hairy cases that need the full type documented +aren't unduly abbreviated. + +It should also be noted that since statics and constants have no `self` type, +elision will only work with distinct input lifetimes or one input+output +lifetime. + # Drawbacks [drawbacks]: #drawbacks @@ -74,6 +99,11 @@ writing macros if they have many resources. * Write the aforementioned macro. This is inferior in terms of UX. Depending on the implementation it may or may not be possible to default lifetimes in generics. +* Make all non-elided lifetimes `'static`. This has the drawback of creating +hard-to-spot errors (that would also probably occur in the wrong place) and +confusing users. +* Make all non-declared lifetimes `'static`. This would not be backwards +compatible due to interference with lifetime elision. * Infer types for statics. The absence of types makes it harder to reason about the code, so even if type inference for statics was to be implemented, defaulting lifetimes would have the benefit of pulling the cost-benefit From d6e15942febeb45a8fd2d6aac5c95054105e18a1 Mon Sep 17 00:00:00 2001 From: Diggory Hardy Date: Thu, 26 May 2016 12:50:56 +0100 Subject: [PATCH 0934/1195] Allow break values to diverge again This reverts commit 36517812becfc395f346020f91efeab4598afe44. --- text/0000-loop-break-value.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/text/0000-loop-break-value.md b/text/0000-loop-break-value.md index b420beb6f65..761b6b5128e 100644 --- a/text/0000-loop-break-value.md +++ b/text/0000-loop-break-value.md @@ -84,7 +84,7 @@ Four forms of `break` will be supported: 3. `break EXPR;` 4. `break 'label EXPR;` -where `'label` is the name of a loop and `EXPR` is an converging expression. +where `'label` is the name of a loop and `EXPR` is an expression. 
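The forms of `break` listed above can be sketched as follows. This is a minimal illustration rather than text from the RFC: the function names `first_square_above` and `first_factor_pair` are hypothetical, and the code assumes a compiler in which `break` with a value is accepted for `loop`. Form 2 (`break 'label;`) is not shown; it behaves like form 1 but targets a labelled loop.

```rust
/// Form 3, `break EXPR;`: the loop evaluates to the first square above `limit`.
fn first_square_above(limit: u32) -> u32 {
    let mut n = 0;
    loop {
        n += 1;
        if n * n > limit {
            // The value of the whole `loop` expression is `n * n`.
            break n * n;
        }
    }
}

/// Form 4, `break 'label EXPR;`: leave the outer loop from inside the inner
/// one, carrying a value. Assumes `target >= 1`, otherwise it never returns.
fn first_factor_pair(target: u32) -> (u32, u32) {
    let mut a = 1;
    'outer: loop {
        let mut b = 1;
        loop {
            if a * b == target {
                // Gives the *outer* loop the result type `(u32, u32)`.
                break 'outer (a, b);
            }
            if b >= target {
                // Form 1, plain `break;`: the inner loop's result type is `()`.
                break;
            }
            b += 1;
        }
        a += 1;
    }
}
```

Calling `first_square_above(10)` walks `n` through 1, 2, 3, 4 and stops at the first square larger than 10. Both functions use the loop as their tail expression, which is the situation the result-type rules are meant to cover.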
### Result type of loop @@ -118,6 +118,7 @@ This proposal changes the result type of 'loop' to `T`, where: * if a loop is "broken" via `break;` or `break 'label;`, the loop's result type must be `()` * if a loop is "broken" via `break EXPR;` or `break 'label EXPR;`, `EXPR` must evaluate to type `T` +* as a special case, if a loop is "broken" via `break EXPR;` or `break 'label EXPR;` where `EXPR` evaluates to type `!` (does not return), this does not place a constraint on the type of the loop * if external constaint on the loop's result type exist (e.g. `let x: S = loop { ... };`), then `T` must be coercible to this type It is an error if these types do not agree or if the compiler's type deduction @@ -158,6 +159,11 @@ fn g() -> u32 { // ! coerces to u32 loop {} } +fn z() -> ! { + loop { + break panic!(); + } +} ``` Example showing the equivalence of `break;` and `break ();`: From 32bdf90b8266c2c50ec3e79c6c8e87d95fb22028 Mon Sep 17 00:00:00 2001 From: Diggory Hardy Date: Thu, 26 May 2016 13:13:32 +0100 Subject: [PATCH 0935/1195] Add discussion of for/while/while let --- text/0000-loop-break-value.md | 72 ++++++++++++++++++++++++++++------- 1 file changed, 58 insertions(+), 14 deletions(-) diff --git a/text/0000-loop-break-value.md b/text/0000-loop-break-value.md index 761b6b5128e..2e28a0c20dd 100644 --- a/text/0000-loop-break-value.md +++ b/text/0000-loop-break-value.md @@ -218,16 +218,50 @@ otherwise and allows more flexibility in code layout. # Unresolved questions [unresolved]: #unresolved-questions -It would be possible to allow `for`, `while` and `while let` expressions return -values in a similar way; however, these expressions may also terminate -"naturally" (not via break), and no consensus has been reached on how the -result value should be determined in this case, or even the result type. -It is thus proposed not to change these expressions at this time. 
+### Extension to for, while, while let -It should be noted that `for`, `while` and `while let` can all be emulated via -`loop`, so perhaps allowing the former to return values is less important. -Alternatively, a keyword such as `else default` could be used to -specify the other exit value as in: +A frequently discussed issue is extension of this concept to allow `for`, +`while` and `while let` expressions to return values in a similar way. There is +however a complication: these expressions may also terminate "naturally" (not +via break), and no consensus has been reached on how the result value should +be determined in this case, or even the result type. + +There are three options: + +1. Do not adjust `for`, `while` or `while let` at this time +2. Adjust these control structures to return an `Option`, returning `None` + in the default case +3. Specify the default return value via some extra syntax + +#### Via `Option` + +Unfortunately, option (2) is not possible to implement cleanly without breaking +a lot of existing code: many functions use one of these control structures in +tail position, where the current "value" of the expression, `()`, is implicitly +used: + +```rust +// function returns `()` +fn print_my_values(v: &Vec) { + for x in v { + println!("Value: {}", x); + } + // loop exits with `()` which is implicitly "returned" from the function +} +``` + +Two variations of option (2) are possible: + +* Only adjust the control structures where they contain a `break EXPR;` or + `break 'label EXPR;` statement. This may work but would necessitate that + `break;` and `break ();` mean different things. +* As a special case, make `break ();` return `()` instead of `Some(())`, + while for other values `break x;` returns `Some(x)`. + +#### Via extra syntax for the default value + +Several syntaxes have been proposed for how a control structure's default value +is set. 
For example: ```rust fn first(list: Iterator) -> Option { @@ -239,8 +273,18 @@ fn first(list: Iterator) -> Option { } ``` -The exact syntax is disputed; (JelteF has some suggestions which should work -without infinite parser lookahead) -[https://github.com/rust-lang/rfcs/issues/961#issuecomment-220728894]. -It is suggested that this RFC should not be blocked on this issue since -loop-break-value can still be implemented in the manner above after this RFC. +or: + +```rust +let x = for thing in things default "nope" { + if thing.valid() { break "found it!"; } +} +``` + +There are two things to bear in mind when considering new syntax: + +* It is undesirable to add a new keyword to the list of Rust's keywords +* It is strongly desirable that unbounded lookahead is not required while syntax + parsing Rust code + +For more discussion on this topic, see [issue #961](https://github.com/rust-lang/rfcs/issues/961). From e8290fab1366d3e8df6dbc68efdc57c130002211 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Fri, 27 May 2016 16:24:57 -0400 Subject: [PATCH 0936/1195] Address various nits --- text/0000-more-api-documentation-conventions.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/0000-more-api-documentation-conventions.md b/text/0000-more-api-documentation-conventions.md index c474883f886..707e8c90925 100644 --- a/text/0000-more-api-documentation-conventions.md +++ b/text/0000-more-api-documentation-conventions.md @@ -7,7 +7,7 @@ [summary]: #summary [RFC 505] introduced certain conventions around documenting Rust projects. This -RFC aguments that one, and a full text of the older one combined with these +RFC augments that one, and a full text of the older one combined with these modfications is provided below. [RFC 505]: https://github.com/rust-lang/rfcs/blob/master/text/0505-api-comment-conventions.md @@ -148,7 +148,7 @@ Use backticks (```) to write longer examples, like this: x.bar(); ``` -When appropriate, make use of Rustdoc’s modifiers. 
Annotate triple grave blocks with +When appropriate, make use of Rustdoc’s modifiers. Annotate triple backtick blocks with the appropriate formatting directive. ```rust @@ -236,7 +236,7 @@ consistency’s sake. [referring-to-types]: #referring-to-types When talking about a type, use its full name. In other words, if the type is generic, -say `Option`, not `Option`. An exception to this is lengthy bounds. Write `Cow<'a, B>` +say `Option`, not `Option`. An exception to this is bounds. Write `Cow<'a, B>` rather than `Cow<'a, B> where B: 'a + ToOwned + ?Sized`. Another possibility is to write in lower case using a more generic term. In other words, From 2a378af684c34d577bc2adf688e02a0ac35ef894 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Fri, 27 May 2016 16:30:54 -0400 Subject: [PATCH 0937/1195] Transition to the diff style Now, the RFC proper only includes the new conventions, rather than all of the RFC 505 stuff, in an effort to make what is under discussion more clear. --- ...0000-more-api-documentation-conventions.md | 277 ++++++++++++++++++ 1 file changed, 277 insertions(+) diff --git a/text/0000-more-api-documentation-conventions.md b/text/0000-more-api-documentation-conventions.md index 707e8c90925..ac755884a78 100644 --- a/text/0000-more-api-documentation-conventions.md +++ b/text/0000-more-api-documentation-conventions.md @@ -24,6 +24,283 @@ but it tries to motivate and clarify them. # Detailed design [design]: #detailed-design +## Content +[content]: #content + +These conventions relate to the contents of the documentation, the words themselves. + +### English +[english]: #english + +This section applies to `rustc` and the standard library. + +An additional suggestion over RFC 505: One specific rule that comes up often: +when quoting something for emphasis, use a single quote, and put punctuation +outside the quotes, ‘this’. 
When quoting something at length, “use double +quotes and put the punctuation inside of the quote.” Most documentation will +end up using single quotes, so if you’re not sure, just stick with them. + +## Form +[form]: #form + +These conventions relate to the formatting of the documentation, how they +appear in source code. + +### Using Markdown +[using-markdown]: #using-markdown + +The updated list of common headings is: + +* Examples +* Panics +* Errors +* Safety +* Aborts +* Undefined Behavior + +RFC 505 suggests that one should always use the `rust` formatting directive: + + ```rust + println!("Hello, world!"); + ``` + + ```ruby + puts "Hello" + ``` + +But, in API documentation, feel free to rely on the default being ‘rust’: + + /// For example: + /// + /// ``` + /// let x = 5; + /// ``` + +Other places do not know how to highlight this anyway, so it's not important to +be explicit. + +RFC 505 suggests that references and citation should be linked ‘reference +style.’ This is still recommended, but prefer to leave off the second `[]`: + +``` +[Rust website] + +[Rust website]: http://www.rust-lang.org +``` + +to + +``` +[Rust website][website] + +[website]: http://www.rust-lang.org +``` + +But, if the text is very long, it is okay to use this form. + +### Examples in API docs +[examples-in-api-docs]: #examples-in-api-docs + +Everything should have examples. Here is an example of how to do examples: + +``` +/// # Examples +/// +/// Basic usage: +/// +/// ``` +/// use op; +/// +/// let s = "foo"; +/// let answer = op::compare(s, "bar"); +/// ``` +/// +/// Passing a closure to compare with, rather than a string: +/// +/// ``` +/// use op; +/// +/// let s = "foo"; +/// let answer = op::compare(s, |a| a.chars().is_whitespace().all()); +/// ``` +``` + +For particularly simple APIs, still say “Examples” and “Basic usage:” for +consistency’s sake. + +### Referring to types +[referring-to-types]: #referring-to-types + +When talking about a type, use its full name. 
In other words, if the type is generic, +say `Option`, not `Option`. An exception to this is bounds. Write `Cow<'a, B>` +rather than `Cow<'a, B> where B: 'a + ToOwned + ?Sized`. + +Another possibility is to write in lower case using a more generic term. In other words, +‘string’ can refer to a `String` or an `&str`, and ‘an option’ can be ‘an `Option`’. + +### Link all the things +[link-all-the-things]: #link-all-the-things + +A major drawback of Markdown is that it cannot automatically link types in API documentation. +Do this yourself with the reference-style syntax, for ease of reading: + +``` +/// The [`String`] passed in lorum ipsum... +/// +/// [`String`]: ../string/struct.String.html +``` + +## Example +[example]: #example + +Below is a full crate, with documentation following these rules. I am loosely basing +this off of my [ref_slice] crate, because it’s small, but I’m not claiming the code +is good here. It’s about the docs, not the code. + +[ref_slice]: https://crates.io/crates/ref_slice + +In lib.rs: + +```rust +//! Turning references into slices +//! +//! This crate contains several utility functions for taking various kinds +//! of references and producing slices out of them. In this case, only full +//! slices, not ranges for sub-slices. +//! +//! # Layout +//! +//! At the top level, we have functions for working with references, `&T`. +//! There are two submodules for dealing with other types: `option`, for +//! &[`Option`], and `mut`, for `&mut T`. +//! +//! [`Option`]: http://doc.rust-lang.org/std/option/enum.Option.html + +pub mod option; + +/// Converts a reference to `T` into a slice of length 1. +/// +/// This will not copy the data, only create the new slice. +/// +/// # Panics +/// +/// In this case, the code won’t panic, but if it did, the circumstances +/// in which it would would be included here. 
+/// +/// # Examples +/// +/// Basic usage: +/// +/// ``` +/// extern crate ref_slice; +/// use ref_slice::ref_slice; +/// +/// let x = &5; +/// +/// let slice = ref_slice(x); +/// +/// assert_eq!(&[5], slice); +/// ``` +/// +/// A more complex example. In this case, it’s the same example, because this +/// is a pretty trivial function, but use your imagination. +/// +/// ``` +/// extern crate ref_slice; +/// use ref_slice::ref_slice; +/// +/// let x = &5; +/// +/// let slice = ref_slice(x); +/// +/// assert_eq!(&[5], slice); +/// ``` +pub fn ref_slice<T>(s: &T) -> &[T] { + unimplemented!() +} + +/// Functions that operate on mutable references. +/// +/// This submodule mirrors the parent module, but instead of dealing with `&T`, +/// they’re for `&mut T`. +mod mut { + /// Converts a reference to `&mut T` into a mutable slice of length 1. + /// + /// This will not copy the data, only create the new slice. + /// + /// # Safety + /// + /// In this case, the code doesn’t need to be marked as unsafe, but if it + /// did, the invariants you’re expected to uphold would be documented here. + /// + /// # Examples + /// + /// Basic usage: + /// + /// ``` + /// extern crate ref_slice; + /// use ref_slice::mut; + /// + /// let x = &mut 5; + /// + /// let slice = mut::ref_slice(x); + /// + /// assert_eq!(&mut [5], slice); + /// ``` + pub fn ref_slice<T>(s: &mut T) -> &mut [T] { + unimplemented!() + } +} +``` + +in `option.rs`: + +```rust +//! Functions that operate on references to [`Option`]s. +//! +//! This submodule mirrors the parent module, but instead of dealing with `&T`, +//! they’re for `&`[`Option`]. +//! +//! [`Option`]: http://doc.rust-lang.org/std/option/enum.Option.html + +/// Converts a reference to `Option` into a slice of length 0 or 1. +/// +/// [`Option`]: http://doc.rust-lang.org/std/option/enum.Option.html +/// +/// This will not copy the data, only create the new slice. 
+/// +/// # Examples +/// +/// Basic usage: +/// +/// ``` +/// extern crate ref_slice; +/// use ref_slice::option; +/// +/// let x = &Some(5); +/// +/// let slice = option::ref_slice(x); +/// +/// assert_eq!(&[5], slice); +/// ``` +/// +/// `None` will result in an empty slice: +/// +/// ``` +/// extern crate ref_slice; +/// use ref_slice::option; +/// +/// let x: &Option = &None; +/// +/// let slice = option::ref_slice(x); +/// +/// assert_eq!(&[], slice); +/// ``` +pub fn ref_slice(opt: &Option) -> &[T] { + unimplemented!() +} +``` + # Drawbacks [drawbacks]: #drawbacks From b48a3c4ed1bce8ab82db29b11338e2b19491d934 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Fri, 27 May 2016 16:41:33 -0400 Subject: [PATCH 0938/1195] Add some things @carols10cents brought up --- ...0000-more-api-documentation-conventions.md | 62 +++++++++++++++++++ 1 file changed, 62 insertions(+) diff --git a/text/0000-more-api-documentation-conventions.md b/text/0000-more-api-documentation-conventions.md index ac755884a78..baee94fa430 100644 --- a/text/0000-more-api-documentation-conventions.md +++ b/text/0000-more-api-documentation-conventions.md @@ -138,6 +138,23 @@ rather than `Cow<'a, B> where B: 'a + ToOwned + ?Sized`. Another possibility is to write in lower case using a more generic term. In other words, ‘string’ can refer to a `String` or an `&str`, and ‘an option’ can be ‘an `Option`’. +### Use parentheses for functions +[use-parentheses-for-functions]: #use-parentheses-for-functions + +When referring to function names, include the `()`s after the name. For example, do this: + +```rust +/// This behavior is similar to the way that `mem::replace()` works. +``` + +Not this: + +```rust +/// This behavior is similar to the way that `mem::replace` works. +``` + +This helps visually differentiate it in the text. 
+ ### Link all the things [link-all-the-things]: #link-all-the-things @@ -150,6 +167,20 @@ Do this yourself with the reference-style syntax, for ease of reading: /// [`String`]: ../string/struct.String.html ``` +### Module-level vs type-level docs +[module-level-vs-type-level-docs]: #module-level-vs-type-level-docs + +There has often been a tension between module-level and type-level +documentation. For example, in today's standard library, the various +`*Cell` docs say, in the pages for each type, to "refer to the module-level +documentation for more details." + +Instead, module-level documentation should show a high-level summary of +everything in the module, and each type should document itself fully. It is +okay if there is some small amount of duplication here. Module-level +documentation should be broad, and not go into a lot of detail, which is left +to the type's documentation. + ## Example [example]: #example @@ -519,6 +550,23 @@ rather than `Cow<'a, B> where B: 'a + ToOwned + ?Sized`. Another possibility is to write in lower case using a more generic term. In other words, ‘string’ can refer to a `String` or an `&str`, and ‘an option’ can be ‘an `Option`’. +### Use parentheses for functions +[use-parentheses-for-functions]: #use-parentheses-for-functions + +When referring to function names, include the `()`s after the name. For example, do this: + +```rust +/// This behavior is similar to the way that `mem::replace()` works. +``` + +Not this: + +```rust +/// This behavior is similar to the way that `mem::replace` works. +``` + +This helps visually differentiate it in the text. 
+ ### Link all the things [link-all-the-things]: #link-all-the-things @@ -531,6 +579,20 @@ Do this yourself with the reference-style syntax, for ease of reading: /// [`String`]: ../string/struct.String.html ``` +### Module-level vs type-level docs +[module-level-vs-type-level-docs]: #module-level-vs-type-level-docs + +There has often been a tension between module-level and type-level +documentation. For example, in today's standard library, the various +`*Cell` docs say, in the pages for each type, to "refer to the module-level +documentation for more details." + +Instead, module-level documentation should show a high-level summary of +everything in the module, and each type should document itself fully. It is +okay if there is some small amount of duplication here. Module-level +documentation should be broad, and not go into a lot of detail, which is left +to the type's documentation. + ## Example [example]: #example From f30df3042f8b7109b42a58eb61fccd43cf699d84 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Fri, 27 May 2016 16:44:22 -0400 Subject: [PATCH 0939/1195] remove excess headers --- ...0000-more-api-documentation-conventions.md | 22 ------------------- 1 file changed, 22 deletions(-) diff --git a/text/0000-more-api-documentation-conventions.md b/text/0000-more-api-documentation-conventions.md index baee94fa430..25a8aaba3e1 100644 --- a/text/0000-more-api-documentation-conventions.md +++ b/text/0000-more-api-documentation-conventions.md @@ -24,11 +24,6 @@ but it tries to motivate and clarify them. # Detailed design [design]: #detailed-design -## Content -[content]: #content - -These conventions relate to the contents of the documentation, the words themselves. - ### English [english]: #english @@ -40,12 +35,6 @@ outside the quotes, ‘this’. When quoting something at length, “use double quotes and put the punctuation inside of the quote.” Most documentation will end up using single quotes, so if you’re not sure, just stick with them. 
-## Form -[form]: #form - -These conventions relate to the formatting of the documentation, how they -appear in source code. - ### Using Markdown [using-markdown]: #using-markdown @@ -351,11 +340,6 @@ None. Below is a combination of RFC 505 + this RFC’s modifications, for convenience. -## Content -[content]: #content - -These conventions relate to the contents of the documentation, the words themselves. - ### Summary sentence [summary-sentence]: #summary-sentence @@ -382,12 +366,6 @@ quoting something at length, “use double quotes and put the punctuation inside of the quote.” Most documentation will end up using single quotes, so if you’re not sure, just stick with them. -## Form -[form]: #form - -These conventions relate to the formatting of the documentation, how they -appear in source code. - ### Use line comments [use-line-comments]: #use-line-comments From 4164dfafbcd399b6bc1d96915a15d466fe978875 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Wed, 1 Jun 2016 16:07:51 -0400 Subject: [PATCH 0940/1195] remove random extra formatting header --- text/0000-more-api-documentation-conventions.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/text/0000-more-api-documentation-conventions.md b/text/0000-more-api-documentation-conventions.md index 25a8aaba3e1..d4f525337ca 100644 --- a/text/0000-more-api-documentation-conventions.md +++ b/text/0000-more-api-documentation-conventions.md @@ -722,5 +722,3 @@ pub fn ref_slice(opt: &Option) -> &[T] { } ``` -## Formatting - From 8f8d56075068eab8167f4b166f43b65c321fe7dc Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Wed, 1 Jun 2016 23:17:49 +0200 Subject: [PATCH 0941/1195] Update dropck-eyepatch RFC to use `#[may_dangle] ARG` attribute (where `ARG` is either a formal lifetime parameter `'a` or a formal type parameter `T`). 
--- text/0000-dropck-param-eyepatch.md | 247 +++++++++++++++-------------- 1 file changed, 130 insertions(+), 117 deletions(-) diff --git a/text/0000-dropck-param-eyepatch.md b/text/0000-dropck-param-eyepatch.md index 4cadb2b88fa..c5e5c6eedb0 100644 --- a/text/0000-dropck-param-eyepatch.md +++ b/text/0000-dropck-param-eyepatch.md @@ -9,9 +9,16 @@ Refine the unguarded-escape-hatch from [RFC 1238][] (nonparametric dropck) so that instead of a single attribute side-stepping *all* dropck constraints for a type's destructor, we instead have a more -focused attribute that specifies exactly which type and/or lifetime +focused system that specifies exactly which type and/or lifetime parameters the destructor is guaranteed not to access. +Specifically, this RFC proposes adding the capability to attach +attributes to the binding sites for generic parameters (i.e. lifetime +and type parameters). Atop that capability, this RFC proposes adding a +`#[may_dangle]` attribute that indicates that a given lifetime or type +holds data that must not be accessed during the dynamic extent of that +`drop` invocation. + [RFC 1238]: https://github.com/rust-lang/rfcs/blob/master/text/1238-nonparametric-dropck.md [RFC 769]: https://github.com/rust-lang/rfcs/blob/master/text/0769-sound-generic-drop.md @@ -130,33 +137,127 @@ storage for [cyclic graph structures][dropck_legal_cycles.rs]). # Detailed design [detailed design]: #detailed-design - 1. Add a new fine-grained attribute, `unsafe_destructor_blind_to` - (which this RFC will sometimes call the "eyepatch", since it does - not make dropck totally blind; just blind on one "side"). + 1. Add the ability to attach attributes to syntax that binds formal + lifetime or type parameters. For the purposes of this RFC, the only + place in the syntax that requires such attributes is `impl` + blocks, as in `impl<T> Drop for Type<T> { ... }` + + 2. 
Add a new fine-grained attribute, `may_dangle`, which is attached + to the binding sites for lifetime or type parameters on a `Drop` + implementation. + This RFC will sometimes call this attribute the "eyepatch", + since it does + not make dropck totally blind; just blind on one "side". + + 3. Add a new requirement that any `Drop` implementation that uses the + `#[may_dangle]` attribute must be declared as an `unsafe impl`. + This reflects the fact that such `Drop` implementations have + an additional constraint on their behavior (namely that they cannot + access certain kinds of data) that will not be verified by the + compiler and thus must be verified by the programmer. + + 4. Remove `unsafe_destructor_blind_to_params`, since all uses of it + should be expressible via `#[may_dangle]`. + +## Attributes on lifetime or type parameters + +This is a simple extension to the syntax. + +Constructions like the following will now become legal. + +Example of eyepatch attribute on a single type parameter: +```rust +unsafe impl<'a, #[may_dangle] X, Y> Drop for Foo<'a, X, Y> { + ... +} +``` + +Example of eyepatch attribute on a lifetime parameter: +```rust +unsafe impl<#[may_dangle] 'a, X, Y> Drop for Bar<'a, X, Y> { + ... +} +``` + +Example of eyepatch attribute on multiple parameters: +```rust +unsafe impl<#[may_dangle] 'a, X, #[may_dangle] Y> Drop for Baz<'a, X, Y> { + ... +} +``` + +These attributes are only written next to the formal binding +sites for the generic parameters. The *usage* sites, points +which refer back to the parameters, continue to disallow the use +of attributes. + +So while this is legal syntax: + +```rust +unsafe impl<'a, #[may_dangle] X, Y> Drop for Foo<'a, X, Y> { + ... +} +``` + +the following would be illegal syntax (at least for now): + +```rust +unsafe impl<'a, X, Y> Drop for Foo<'a, #[may_dangle] X, Y> { + ... +} +``` - 2.
Remove `unsafe_destructor_blind_to_params`, since all uses of it - should be expressible via `unsafe_destructor_blind_to` (once that - has been completely implemented). ## The "eyepatch" attribute -Add a new attribute, `unsafe_destructor_blind_to(ARG)` (the "eyepatch"). +Add a new attribute, `#[may_dangle]` (the "eyepatch"). The eyepatch is similar to `unsafe_destructor_blind_to_params`: it is -attached to the destructor[1](#footnote1), and it is meant +part of the `Drop` implementation, and it is meant to assert that a destructor is guaranteed not to access certain kinds of data accessible via `self`. -The main difference is that the eyepatch has a single required -parameter, `ARG`. This is the place where you specify exactly *what* +The main difference is that the eyepatch is applied to a single +generic parameter: `#[may_dangle] ARG`. +This specifies exactly *what* the destructor is blind to (i.e., what will dropck treat as inaccessible from the destructor for this type). -There are two things one can put the `ARG` for a given eyepatch: one -of the type parameters for the type, or one of the lifetime parameters -for the type.[2](#footnote2) - -### Examples stolen from the Rustonomicon +There are two things one can supply as the `ARG` for a given eyepatch: +one of the type parameters for the type, +or one of the lifetime parameters +for the type. + +When used on a type, e.g. `#[may_dangle] T`, the programmer is +asserting the only uses of values of that type will be to move or drop +them. Thus, no fields will be accessed nor methods called on values of +such a type (apart from any access performed by the destructor for the +type when the values are dropped). This ensures that no dangling +references (such as when `T` is instantiated with `&'a u32`) are ever +accessed in the scenario where `'a` has the same lifetime as the value +being currently destroyed (and thus the precise order of destruction +between the two is unknown to the compiler). 
+ +When used on a lifetime, e.g. `#[may_dangle] 'a`, the programmer is +asserting that no data behind a reference of lifetime `'a` will be +accessed by the destructor. Thus, no fields will be accessed nor +methods called on values of type `&'a Struct`, ensuring that again no +dangling references are ever accessed by the destructor. + +## Require `unsafe` on Drop implementations using the eyepatch + +The final detail is to add an additional check to the compiler +to ensure that any use of `#[may_dangle]` on a `Drop` implementation +imposes a requirement that that implementation block use +`unsafe impl`.[2](#footnote1) + +This reflects the fact that use of `#[may_dangle]` is a +programmer-provided assertion about the behavior of the `Drop` +implementation that must be validated manually by the programmer. +It is analogous to other uses of `unsafe impl` (apart from the +fact that the `Drop` trait itself is not an `unsafe trait`). + +### Examples adapted from the Rustonomicon [nomicon dropck]: https://doc.rust-lang.org/nightly/nomicon/dropck.html @@ -169,8 +270,7 @@ Example of eyepatch on a lifetime parameter:: ```rust struct InspectorA<'a>(&'a u8, &'static str); -impl<'a> Drop for InspectorA<'a> { - #[unsafe_destructor_blind_to('a)] +unsafe impl<#[may_dangle] 'a> Drop for InspectorA<'a> { fn drop(&mut self) { println!("InspectorA(_, {}) knows when *not* to inspect.", self.1); } @@ -184,8 +284,7 @@ use std::fmt; struct InspectorB<T: fmt::Display>(T, &'static str); -impl<T: fmt::Display> Drop for InspectorB<T> { - #[unsafe_destructor_blind_to(T)] +unsafe impl<#[may_dangle] T: fmt::Display> Drop for InspectorB<T> { fn drop(&mut self) { println!("InspectorB(_, {}) knows when *not* to inspect.", self.1); } @@ -202,8 +301,7 @@ To generalize `RawVec` from the [motivation](#motivation) with an code), we would now write: ```rust -impl<T, A:Allocator> Drop for RawVec<T, A> { - #[unsafe_destructor_blind_to(T)] +unsafe impl<#[may_dangle]T, A:Allocator> Drop for RawVec<T, A> { /// Frees the memory owned by the RawVec *without* trying to Drop its
contents. fn drop(&mut self) { [... free memory using self.alloc ...] @@ -211,7 +309,7 @@ impl Drop for RawVec { } ``` -The use of `unsafe_destructor_blind_to(T)` here asserts that even +The use of `#[may_dangle] T` here asserts that even though the destructor may access borrowed data through `A` (and thus dropck must impose drop-ordering constraints for lifetimes occurring in the type of `A`), the developer is guaranteeing that no access to @@ -232,9 +330,7 @@ If we wanted to generalize this type a bit, we might write: ```rust struct InspectorC<'a,'b,'c>(&'a str, &'b str, &'c str); -impl<'a,'b,'c> Drop for InspectorC<'a,'b,'c> { - #[unsafe_destructor_blind_to('a)] - #[unsafe_destructor_blind_to('c)] +unsafe impl<#[may_dangle] 'a, 'b, #[may_dangle] 'c> Drop for InspectorC<'a,'b,'c> { fn drop(&mut self) { println!("InspectorA(_, {}, _) knows when *not* to inspect.", self.1); } @@ -319,7 +415,7 @@ including when `D` == `InspectorC<'a,'name,'c>`). [prototype]: #prototype pnkfelix has implemented a proof-of-concept -[implementation][pnkfelix prototype] of this feature. +[implementation][pnkfelix prototype] of the `#[may_dangle]` attribute. It uses the substitution machinery we already have in the compiler to express the semantics above. @@ -329,20 +425,15 @@ Here we note a few limitations of the current prototype. These limitations are *not* being proposed as part of the specification of the feature. -1. The eyepatch is not attached to the -destructor in the current [prototype][pnkfelix prototype]; it is -instead attached to the `struct`/`enum` definition itself. - -2. The eyepatch is only able to accept a type -parameter, not a lifetime, in the current -[prototype][pnkfelix prototype]; it is instead attached to the -`struct`/`enum` definition itself. +2. The compiler does not yet enforce (or even +allow) the use of `unsafe impl` for `Drop` implementations that use +the `#[may_dangle]` attribute. 
Fixing the above limitations should just be a matter of engineering, not a fundamental hurdle to overcome in the feature's design in the context of the language. -[pnkfelix prototype]: https://github.com/pnkfelix/rust/commits/fsk-nonparam-blind-to-indiv +[pnkfelix prototype]: https://github.com/pnkfelix/rust/commits/dropck-eyepatch # Drawbacks [drawbacks]: #drawbacks @@ -359,82 +450,11 @@ could check the assertions being made by the programmer, rather than trusting them. (pnkfelix has some thoughts on this, which are mostly reflected in what he wrote in the [RFC 1238 alternatives][].) -## Attributes lack hygiene -[attributes-lack-hygiene]: #attributes-lack-hygiene - -As noted by arielb1, putting type parameter identifiers into attributes -is not likely to play well with macro hygiene. - -Here is a concrete example: - -```rust -struct Yell2(A, B); - -macro_rules! make_yell2a { - ($A:ident, $B:ident) => { - impl<$A:Debug,$B:Debug> Drop for Yell2<$A,$B> { - #[unsafe_destructor_blind_to(???)] // <---- - fn drop(&mut self) { - println!("Yell1(_, {:?})", self.1); - } - } - } -} - -make_yell2a!(X, Y); -``` - -Here is the issue: In the above, what does one put in for the `???` to -say that we are blind to the first type parameter to `Yell2`? -`#[unsafe_destructor_blind_to(A)` would be nonsense, becauase in the instantiation of the macro, `$A` will be mapped to the identifier `X`. so perhaps we should write it is blind to `X` -- but to me one big point of macro hygiene is that a macro definition should not have to build in knowledge of the identifiers chosen at the usage site, and this is the opposite of that. - -(I don't think `#[unsafe_destructor_blind_to($A)` works, because our attribute system operates at the same meta-level that macros operate at , but I would be happy to be proven wrong.) - ----- - -Despite my somewhat dire attitude above, I don't think this is a significant problem in the short term. 
This sort of macro is probably rare, and the combination of this macro with UGEH is doubly so. You cannot define a destructor multiple times for the same type, so it seems weird to me to abstract this code construction at this particular level. - - [RFC 1238 alternatives]: https://github.com/rust-lang/rfcs/blob/master/text/1238-nonparametric-dropck.md#continue-supporting-parametricity # Alternatives [alternatives]: #alternatives -## unsafe_destructor_blind_to(T1, T2, ...) - -The eyepatch could take multiple arguments, rather than requiring a -distinct instance of the attribute for each parameter that we are -blind to. - -However, I think that each usage of the attribute needs to be -considered, since it represents a separate "attack vector" where -unsoundness can be introduced, and therefore it deserves more than -just a comma and a space added to the program text when it is added. - -(I only weakly support the latter position; it is obviously easy -to support this form if that is deemed desirable.) - -## Use a blacklist not a whitelist -[blacklist-not-whitelist]: #use-a-blacklist-not-a-whitelist - -The `unsafe_destructor_blind_to` attribute acts as a whitelist of -parameters that we are telling dropck to ignore in its analysis -of this destructor. - -We could instead add a way to list the lifetimes and/or -type-expressions (e.g. parameters, projections from parameters) that -the destructor may access (and thus treat that list as a blacklist of -parameters that dropck needs to *include* in its analysis). - -arielb1 first suggested this as an attribute form -[here][blacklist attribute], but then provided a different formulation -of the idea by expressing it as a [`where`-clause][blacklist where] on -the `fn drop` method (which is what I will show in the next section). 
- -[blacklist attribute]: https://github.com/rust-lang/rfcs/pull/1327#issuecomment-149302743 - -[blacklist where]: https://github.com/rust-lang/rfcs/pull/1327#issuecomment-149329351 - ## Make dropck "see again" via (focused) where-clauses (This alternative carries over some ideas from @@ -480,10 +500,8 @@ avoids a number of problems that the eyepatch attribute has. Advantages of fn-drop-with-where-clauses: - * It completely sidesteps the [hygiene issue][attributes-lack-hygiene]. - - * If the eyepatch attribute is to be limited to identifiers (type - parameters) and lifetimes, then this approach is more expressive, + * Since the eyepatch attribute is to be limited to type and lifetime + parameters, this approach is more expressive, since it would allow one to put type-projections into the constraints. @@ -542,7 +560,7 @@ this RFC rather than waiting for a sound compiler analysis): to write that `T` is parametric (e.g. `T: ?Special` in the [RFC 1238 alternatives]). Even then, we would still need the compiler changes suggested by this RFC, and at that point hopefully the task would be for the programmer to mechanically - replace occurrences of `#[unsafe_destructor_blind_to(T)` with `T: ?Special` + replace occurrences of `#[may_dangle] T` with `T: ?Special` (and then see if the library builds). In other words, I see the form suggested by this RFC as being a step *towards* @@ -558,11 +576,6 @@ If we do nothing, then we cannot add `Vec` soundly. # Unresolved questions [unresolved]: #unresolved-questions -Is there any issue with writing `'a` in an attribute like -`#[unsafe_destructor_blind_to('a)]`? (The prototype, as mentioned -[above](#footnote2), does not currently accept lifetime parameter -inputs, so I do not know the answer off hand. - Is the definition of the drop-check rule sound with this `patched(D)` variant? (We have not proven any previous variation of the rule sound; I think it would be an interesting student project though.) 
From 09aa1d67e2326bc0974b4c2239c4ac01f89d9d5a Mon Sep 17 00:00:00 2001 From: Leo Testard Date: Fri, 8 Apr 2016 14:00:08 +0200 Subject: [PATCH 0942/1195] Add a `literal` fragment specifier to `macro_rules!`. --- text/0000-macros-literal-matcher.md | 40 +++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) create mode 100644 text/0000-macros-literal-matcher.md diff --git a/text/0000-macros-literal-matcher.md b/text/0000-macros-literal-matcher.md new file mode 100644 index 00000000000..968ae7ebdc9 --- /dev/null +++ b/text/0000-macros-literal-matcher.md @@ -0,0 +1,40 @@ +- Feature Name: macros-literal-match +- Start Date: 2016-04-08 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary + +Add a `literal` fragment specifier for `macro_rules!` patterns that matches literal constants: + +```rust +macro_rules! foo { + ($l:literal) => ( /* ... */ ); +}; +``` + +# Motivation + +There are a lot of macros out there that take literal constants as arguments (often string constants). For now, most use the `expr` fragment specifier, which is fine since literal constants are a subset of expressions. But it has the following issues: +* It restricts the syntax of those macros. A limited set of FOLLOW tokens is allowed after an `expr` specifier. For example `$e:expr : $t:ty` is not allowed whereas `$l:literal : $t:ty` should be. There is no reason to arbitrarily restrict the syntax of those macros where they will only be actually used with literal constants. A workaround for that is to use the `tt` matcher. +* It does not allow for proper error reporting where the macro actually *needs* the parameter to be a literal constant. With this RFC, bad usage of such macros will give a proper syntax error message whereas with `expr` it would probably give a syntax or typing error inside the generated code, which is hard to understand. +* It's not consistent. There is no reason to allow expressions, types, etc. but not literals.
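To make the first point of the motivation above concrete, here is a minimal sketch of a macro that relies on the `$l:literal : $t:ty` shape. It assumes the specifier ends up being spelled `literal` (the spelling is still listed as unresolved in this RFC) and that the toolchain in use supports it. The macro name `typed_const` and the constants `ANSWER` and `GREETING` are hypothetical and exist only for illustration.

```rust
// Sketch only: `typed_const!` is a hypothetical macro, not part of the RFC.
// It assumes the proposed fragment specifier is available under the name `literal`.
macro_rules! typed_const {
    // A `:` directly after the fragment is accepted here because a literal is a
    // single token. The same pattern written with `$l:expr` is rejected, since
    // `:` is not in the FOLLOW set of `expr`.
    ($name:ident = $l:literal : $t:ty) => {
        const $name: $t = $l;
    };
}

// Each invocation expands to a plain `const` item.
typed_const!(ANSWER = 42 : u8);
typed_const!(GREETING = "hello" : &str);

fn main() {
    println!("{} {}", ANSWER, GREETING);
}
```

Passing something that is not a literal, such as `typed_const!(ANSWER = 40 + 2 : u8)`, should be rejected when the macro pattern is matched rather than somewhere inside the expanded code, which is the error-reporting benefit described in the second point.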
+ +# Design + +Add a `literal` (or `lit`, or `constant`) matcher in macro patterns that matches all single-token literal constants (those that are currently represented by `token::Literal`). +Matching input against this matcher would call the `parse_lit` method from `libsyntax::parse::Parser`. The FOLLOW set of this matcher should be the same as `ident` since it matches a single token. + +# Drawbacks + +This includes only single-token literal constants and not compound literals, for example struct literals `Foo { x: some_literal, y: some_literal }` or arrays `[some_literal ; N]`, where `some_literal` can itself be a compound literal. See Alternatives for why this is disallowed. + +# Alternatives + +* Allow compound literals too. In theory there is no reason to exclude them since they do not require any computation. In practice though, allowing them requires using the expression parser but limiting it to allow only other compound literals and not arbitrary expressions to occur inside a compound literal (for example inside struct fields). This would probably require much more work to implement and also undermines the first motivation, since it would probably restrict the FOLLOW set of such fragments considerably. +* Adding fragment specifiers for each constant type: `$s:str` which expects a literal string, `$i:integer` which expects a literal integer, etc. With this design, we could allow something like `$s:struct` for compound literals which still requires a lot of work to implement but has the advantage of not ‶polluting″ the FOLLOW sets of other specifiers such as `str`. It also provides better ‶static″ (pre-expansion) checking of the arguments of a macro and thus better error reporting. Types are also good for documentation.
The main drawback here is of course that we could not allow any possible type since we cannot interleave parsing and type checking, so we would have to define a list of accepted types, for example `str`, `integer`, `bool`, `struct` and `array` (without specifying the complete type of the structs and arrays). This would be a bit inconsistent since those types indeed refer more to syntactic categories in this context than to true Rust types. It would be frustrating and confusing since it can give the impression that macros do type-checking of their arguments, when of course they don't. +* Don't do this. Continue to use `expr` or `tt` to refer to literal constants. + +# Unresolved + +The keyword of the matcher can be `literal`, `lit`, `constant`, or something else. From cf44e64ea33a22489ab83b9767634d7f2ae59c45 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Fri, 3 Jun 2016 22:34:27 -0400 Subject: [PATCH 0943/1195] Document all features. --- text/0000-document_all_features.md | 145 +++++++++++++++++++++++++++++ 1 file changed, 145 insertions(+) create mode 100644 text/0000-document_all_features.md diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md new file mode 100644 index 00000000000..05a243c516b --- /dev/null +++ b/text/0000-document_all_features.md @@ -0,0 +1,145 @@ +- Feature Name: document_all_features +- Start Date: 2016-06-03 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +One of the major goals of Rust's development process is *stability without stagnation*. That means we add features regularly. However, it can be difficult to *use* those features if they are not publicly documented anywhere. Therefore, this RFC proposes requiring that all new language features and public standard library items must be documented before landing on the stable release branch (item documentation for the standard library; in the language reference for language features).
+ +# Motivation +[motivation]: #motivation + +At present, new language features are often documented *only* in the RFCs which propose them and the associated announcement blog posts. Moreover, as features change, the existing official language documentation (the Rust Book, Rust by Example, and the language reference) can increasingly grow outdated. + +Although the Rust Book and Rust by Example are kept relatively up to date, [the reference is not][home-to-reference]: + +> While Rust does not have a specification, the reference tries to describe its working in detail. *It tends to be out of date.* (emphasis mine) + +Importantly, though, this warning only appears on the [main site][home-to-reference], not in the reference itself. If someone searches for e.g. that `deprecated` attribute and *does* find the discussion of the deprecated attribute, they will have no reason to believe that the reference is wrong. + +[home-to-reference]: https://www.rust-lang.org/documentation.html + +For example, the change in Rust 1.9 to allow users to use the `#[deprecated]` attribute for their own libraries is, at the time of writing this RFC, *nowhere* reflected in official documentation. (Many other examples could be supplied; this one is chosen for its relative simplicity and recency.) The Book's [discussion of attributes][book-attributes] links to the [reference list of attributes][ref-attributes], but as of the time of writing the reference [still specifies][ref-compiler-attributes] that `deprecated` is a compiler-only feature. The two places where users might become aware of the change are [the Rust 1.9 release blog post][1.9-blog] and the [RFC itself][RFC-1270]. Neither (yet) ranks highly in search; users are likely to be misled. 
+ +[book-attributes]: https://doc.rust-lang.org/book/attributes.html +[ref-attributes]: https://doc.rust-lang.org/reference.html#attributes +[ref-compiler-attributes]: https://doc.rust-lang.org/reference.html#compiler-features +[1.9-blog]: http://blog.rust-lang.org/2016/05/26/Rust-1.9.html#deprecation-warnings +[RFC-1270]: https://github.com/rust-lang/rfcs/blob/master/text/1270-deprecation.md + +Changing this to require all language features to be documented before stabilization would mean Rust users can use the language documentation with high confidence that it will provide exhaustive coverage of all stable Rust features. + +Although the standard library is in excellent shape regarding documentation, including it in this policy will help guarantee that it remains so going forward. + +## The Current Situation +[current-situation]: #the-current-situation + +Today, the canonical source of information about new language features is the RFCs which define them. + +There are several serious problems with the _status quo_: + +1. Many users of Rust may simply not know that these RFCs exist. The number of users who do not know (or especially care) about the RFC process or its history will only increase as Rust becomes more popular. + +2. In many cases, especially in more complicated language features, some important elements of the decision, details of implementation, and expected behavior are fleshed out either in the associated RFC (pull-request) discussion or in the implementation issues which follow them. + +3. The RFCs themselves, and even more so the associated pull request discussions, are often dense with programming language theory. This is as it should be in context, but it means that the relevant information may be inaccessible to Rust users without prior PLT background, or without the patience to wade through it. + +4.
Similarly, information about the final decisions on language features is often buried deep at the end of long and winding threads (especially for a complicated feature like `impl` specialization). + +5. Information on how the features will be used is often closely coupled to information on how the features will be implemented, both in the RFCs and in the discussion threads. Again, this is as it should be, but it makes it difficult (at best!) for ordinary Rust users to read. + +In short, RFCs are a poor source of information about language features for the ordinary Rust user. Rust users should not need to be troubled with details of how the language is implemented simply to learn how pieces of it work. Nor should they need to dig through tens (much less hundreds) of comments to determine what the final form of the feature is. + +## Precedent +[precedent]: #precedent + +This exact idea has been adopted by the Ember community after their somewhat bumpy transitions at the end of their 1.x cycle and leading into their 2.x transition. As one commenter there [put it][@davidgoli]: + +> The fact that 1.13 was released without updated guides is really discouraging to me as an Ember adopter. It may be much faster, the features may be much cooler, but to me, they don't exist unless I can learn how to use them from documentation. Documentation IS feature work. ([@davidgoli]) + +[@davidgoli]: https://github.com/emberjs/rfcs/pull/56#issuecomment-114635962 + +The Ember core team agreed, and embraced the principle outlined in [this comment][guarav0]: + +> No version shall be released until guides and versioned API documentation is ready. This will allow newcomers the ability to understand the latest release.
([@guarav0]) + +[guarav0]: https://github.com/emberjs/rfcs/pull/56#issuecomment-114339423 + +One of the main reasons not to adopt this approach, that it might block features from landing as soon as they otherwise might, was [addressed][@eccegordo] in that discussion as well: + +> Now if this documentation effort holds up the releases people are going to grumble. But so be it. The challenge will be to effectively parcel out the effort and relieve the core team to do what they do best. No single person should be a gate. But lack of good documentation should gate releases. That way a lot of eyes are forced to focus on the problem. We can't get the great new toys unless everybody can enjoy the toys. ([@eccegordo]) + +[@eccegordo]: https://github.com/emberjs/rfcs/pull/56#issuecomment-114389963 + +The basic decision has led to a substantial improvement in the currency of the documentation (which is now updated the same day as a new version is released). Moreover, it has spurred ongoing development of better tooling around documentation to manage these releases. Finally, at least in the RFC author's estimation, it has also led to a substantial increase in the overall quality of that documentation, possibly as a consequence of increasing the community involvement in the documentation process (including the formation of a documentation subteam). + +# Detailed design +[design]: #detailed-design + +The basic process of developing new language features will remain unchanged from today, with the addition of a straightforward requirement that they be properly documented before being merged to stable. + +## Language features +[language-features]: #language-features + +In the case of language features, this will be a manual process, involving updates to the `reference.md` file. (It may at some point be sensible to break up the Reference file for easier maintenance; that is left aside as orthogonal to this discussion.) 
+ +Note that the feature documentation does not need to be written by the feature author. In fact, this is one of the areas where the community may be most able to support core developers even if not themselves programming language theorists or compiler hackers. This may free up the compiler developers' time. It will also help communicate the features in a way that is accessible to ordinary Rust users. + +New features do not need to be documented to be merged into `master`/nightly, and in many cases *should* not, since the features may change substantially before landing on stable, at which point the reference material would need to be rewritten. + +Instead, the documentation process should immediately precede the move to stabilize. Once the *feature* has been deemed ready for stabilization, either the author or a community volunteer should write the *reference material* for the feature. + +This need not be especially long, but it should be long enough for ordinary users to learn how to use the language feature *without reading the RFCs*. + +When the core team discusses whether to stabilize a feature in a given release, the reference material will now be a part of that decision. Once the feature *and* reference material are complete, it will be merged normally, and the pull request will simply include the reference material as well as the new feature. + +## Standard library +[std]: #standard-library + +In the case of the standard library, this could conceivably be managed by setting the `#[forbid(missing_docs)]` attribute on the library roots. In lieu of that, manual code review and general discipline should continue to serve. However, if automated tools *can* be employed here, they should. + +# Drawbacks +[drawbacks]: #drawbacks + +The largest drawback at present is that the language reference is *already* quite out of date. It may take substantial work to get it up to date so that new changes can be landed appropriately. 
(Arguably, however, this should be done regardless, since the language reference is an important part of the language ecosystem.) + +Another potential issue is that some sections of the reference are particularly thorny and must be handled with considerable care (e.g. lifetimes). Although in general it would not be necessary for the author of the new language feature to write all the documentation, considerable extra care and oversight would need to be in place for these sections. + +Finally, this may delay landing features on stable. However, all the points raised in [**Precedent**][precedent] on this apply, especially: + +> We can't get the great new toys unless everybody can enjoy the toys. ([@eccegordo]) + +For Rust to attain its goal of *stability without stagnation*, its documentation must also be stable and not stagnant. + +# Alternatives +[alternatives]: #alternatives + +- **No change; leave RFCs as canonical documentation.** + + This approach can take (at least) two forms: + + 1. We can leave things as they are, where the RFC and surrounding discussion form the primary point of documentation for newer-than-1.0 language features. As part of that, we could just link more prominently to the RFC repository and describe the process from the documentation pages. + 2. We could automatically render the text of the RFCs into part of the documentation used on the site (via submodules and the existing tooling around Markdown documents used for Rust documentation). + + However, for all the reasons highlighted above in [**Motivation: The Current Situation**][current-situation], RFCs and their associated threads are *not* a good canonical source of information on language features. + +- **Add a rule for the standard library but not for language features.** + + This would basically just turn the _status quo_ into an official policy. 
It has all the same drawbacks as no change at all, but with the possible benefit of enabling automated checks on standard library documentation. + +- **Add a rule for language features but not for the standard library.** + + The standard library is in much better shape, in no small part because of the ease of writing inline documentation for new modules. Adding a formal rule may not be necessary if good habits are already in place. + + On the other hand, having a formal policy would not seem to *hurt* anything here; it would simply formalize what is already happening (and perhaps, via linting attributes, make it easy to spot when it has failed). + + +# Unresolved questions +[unresolved]: #unresolved-questions + +- How will the requirement for documentation in the reference be enforced? +- Given that the reference is out of date, does it need to be brought up to date before beginning enforcement of this policy? +- For the standard library, once it migrates to a crates structure, should it simply include the `#[forbid(missing_docs)]` attribute on all crates to set this as a build error? +- Is a documentation subteam, _a la_ the one used by Ember, worth creating? \ No newline at end of file From 1ad0a00ce80039072de936b3331384bd21cc2cc6 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Fri, 3 Jun 2016 23:06:39 -0400 Subject: [PATCH 0944/1195] Add another alternative. --- text/0000-document_all_features.md | 28 ++++++++++++++++++---------- 1 file changed, 18 insertions(+), 10 deletions(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index 05a243c516b..e9c939d317b 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -118,22 +118,30 @@ For Rust to attain its goal of *stability without stagnation*, its documentation - **No change; leave RFCs as canonical documentation.** - This approach can take (at least) two forms: + This approach can take (at least) two forms: - 1. 
We can leave things as they are, where the RFC and surrounding discussion form the primary point of documentation for newer-than-1.0 language features. As part of that, we could just link more prominently to the RFC repository and describe the process from the documentation pages. - 2. We could automatically render the text of the RFCs into part of the documentation used on the site (via submodules and the existing tooling around Markdown documents used for Rust documentation). + 1. We can leave things as they are, where the RFC and surrounding discussion form the primary point of documentation for newer-than-1.0 language features. As part of that, we could just link more prominently to the RFC repository and describe the process from the documentation pages. + 2. We could automatically render the text of the RFCs into part of the documentation used on the site (via submodules and the existing tooling around Markdown documents used for Rust documentation). - However, for all the reasons highlighted above in [**Motivation: The Current Situation**][current-situation], RFCs and their associated threads are *not* a good canonical source of information on language features. + However, for all the reasons highlighted above in [**Motivation: The Current Situation**][current-situation], RFCs and their associated threads are *not* a good canonical source of information on language features. - **Add a rule for the standard library but not for language features.** - - This would basically just turn the _status quo_ into an official policy. It has all the same drawbacks as no change at all, but with the possible benefit of enabling automated checks on standard library documentation. + + This would basically just turn the _status quo_ into an official policy. It has all the same drawbacks as no change at all, but with the possible benefit of enabling automated checks on standard library documentation. 
- **Add a rule for language features but not for the standard library.** - - The standard library is in much better shape, in no small part because of the ease of writing inline documentation for new modules. Adding a formal rule may not be necessary if good habits are already in place. - On the other hand, having a formal policy would not seem to *hurt* anything here; it would simply formalize what is already happening (and perhaps, via linting attributes, make it easy to spot when it has failed). + The standard library is in much better shape, in no small part because of the ease of writing inline documentation for new modules. Adding a formal rule may not be necessary if good habits are already in place. + + On the other hand, having a formal policy would not seem to *hurt* anything here; it would simply formalize what is already happening (and perhaps, via linting attributes, make it easy to spot when it has failed). + +- **Eliminate the reference entirely.** + + Since the reference is already substantially out of date, it might make sense to stop presenting it publicly at all, at least until such a time as it has been completely reworked and updated. + + The main upside to this is the reality that an outdated and inaccurate reference may be worse than no reference at all, as it may substantially mislead Rust users. + + The main downside, of course, is that this would leave very large swaths of the language basically without *any* documentation, and even more of it only documented in RFCs than is the case today. # Unresolved questions @@ -142,4 +150,4 @@ For Rust to attain its goal of *stability without stagnation*, its documentation - How will the requirement for documentation in the reference be enforced? - Given that the reference is out of date, does it need to be brought up to date before beginning enforcement of this policy? 
- For the standard library, once it migrates to a crates structure, should it simply include the `#[forbid(missing_docs)]` attribute on all crates to set this as a build error? -- Is a documentation subteam, _a la_ the one used by Ember, worth creating? \ No newline at end of file +- Is a documentation subteam, _a la_ the one used by Ember, worth creating? From 00ae686a3da5534ba98a7f2e9e025cf25b5b8a3b Mon Sep 17 00:00:00 2001 From: Andrew Cann Date: Sat, 4 Jun 2016 16:02:39 +0800 Subject: [PATCH 0945/1195] Add reference to RFC 1637 --- text/0000-bang-type.md | 14 +++++--------- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git a/text/0000-bang-type.md b/text/0000-bang-type.md index d70e3556c65..61330efa40c 100644 --- a/text/0000-bang-type.md +++ b/text/0000-bang-type.md @@ -407,13 +407,9 @@ Someone would have to implement this. `!` has a unique impl of any trait whose only items are non-static methods. It would be nice if there was a way to automate the creation of these impls. -Should `!` automatically satisfy any such trait? Alternatively we could do this -through a new trait attribute: - -```rust -#[derive_bang] -trait FromStr { - ... -} -``` +Should `!` automatically satisfy any such trait? This RFC is not blocked on +resolving this question if we are willing to accept backward-incompatibilities +in questionably-valid code which tries to call trait methods on diverging +expressions and relies on the trait being implemented for `()`. As such, the +issue has been given [its own RFC](https://github.com/rust-lang/rfcs/pull/1637). From 72aa1117c4d5d18116312854b4dc78c947795ba9 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Sat, 4 Jun 2016 10:59:36 -0400 Subject: [PATCH 0946/1195] Add call for 'How do we document this'; add that section to *this* RFC.
--- text/0000-document_all_features.md | 82 ++++++++++++++++++++++++++++-- 1 file changed, 78 insertions(+), 4 deletions(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index e9c939d317b..3ff8518979c 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -3,6 +3,7 @@ - RFC PR: (leave this empty) - Rust Issue: (leave this empty) + # Summary [summary]: #summary @@ -13,7 +14,7 @@ One of the major goals of Rust's development process is *stability without stagn At present, new language features are often documented *only* in the RFCs which propose them and the associated announcement blog posts. Moreover, as features change, the existing official language documentation (the Rust Book, Rust by Example, and the language reference) can increasingly grow outdated. -Although the Rust Book and Rust by Example are kept relatively up to date, [the reference is not][home-to-reference]: +Although the Rust Book and Rust by Example are kept relatively up to date, [the reference is not][home-to-reference] (on which see also the [addendum] to this RFC): > While Rust does not have a specification, the reference tries to describe its working in detail. *It tends to be out of date.* (emphasis mine) @@ -75,12 +76,40 @@ One of the main reasons not to adopt this approach, that it might block features The basic decision has led to a substantial improvement in the currency of the documentation (which is now updated the same day as a new version is released). Moreover, it has spurred ongoing development of better tooling around documentation to manage these releases. Finally, at least in the RFC author's estimation, it has also led to a substantial increase in the overall quality of that documentation, possibly as a consequence of increasing the community involvement in the documentation process (including the formation of a documentation subteam). 
+ # Detailed design [design]: #detailed-design -The basic process of developing new language features will remain unchanged from today, with the addition of a straightforward requirement that they be properly documented before being merged to stable. +The basic process of developing new language features will remain largely the same as today. The changes are two additions: + +- a new section in the RFC, "How do we teach this?" modeled on Ember's updated RFC process +- a new requirement that the changes themselves be properly documented before being merged to stable + +## New RFC section: "How do we teach this?" +[new-rfc-section]: #new-rfc-section-how-do-we-teach-this + +Following the example of Ember.js, we will add a new section to the RFC, just after **Detailed design**, titled **How do we teach this?** The section should explain what changes need to be made to documentation, and if the feature substantially changes what would be considered the "best" way to solve a problem or is a fairly mainstream issue, discuss how it might be incorporated into _The Rust Programming Language_ and/or _Rust by Example_. + +Here is the Ember RFC section, with suggested substitutions: + +> # How We Teach This +> What names and terminology work best for these concepts and why? How is this idea best presented? As a continuation of existing ~~Ember~~ **Rust** patterns, or as a wholly new one? +> +> Would the acceptance of this proposal mean ~~Ember guides~~ **_The Rust Programming Language_, _Rust by Example_, or the Rust Reference** must be re-organized or altered? Does it change how ~~Ember~~ **Rust** is taught to new users at any level? +> +> How should this feature be introduced and taught to existing ~~Ember~~ **Rust** users? + +We may also find it valuable to add other, more Rust-specific (or programming language- rather than framework-specific) verbiage there.
+ +For a great example of this in practice, see the (currently open) [Ember RFC: Module Unification], which includes several sections discussing conventions, tooling, concepts, and impacts on testing. -## Language features +[Ember RFC: Module Unification]: https://github.com/dgeb/rfcs/blob/module-unification/text/0000-module-unification.md#how-we-teach-this + +## Review before stabilization + +Changes will now be reviewed for changes to the documentation prior to being merged. + +### Language features [language-features]: #language-features In the case of language features, this will be a manual process, involving updates to the `reference.md` file. (It may at some point be sensible to break up the Reference file for easier maintenance; that is left aside as orthogonal to this discussion.) @@ -95,11 +124,41 @@ This need not be especially long, but it should be long enough for ordinary user When the core team discusses whether to stabilize a feature in a given release, the reference material will now be a part of that decision. Once the feature *and* reference material are complete, it will be merged normally, and the pull request will simply include the reference material as well as the new feature. -## Standard library +### Standard library [std]: #standard-library In the case of the standard library, this could conceivably be managed by setting the `#[forbid(missing_docs)]` attribute on the library roots. In lieu of that, manual code review and general discipline should continue to serve. However, if automated tools *can* be employed here, they should. + +# How do we teach this? + +Since this RFC promotes including this section, it includes it itself. (RFCs, unlike Rust `struct` or `enum` types, may be freely self-referential. No boxing required.) + +To be most effective, this will involve some changes both at a process and core-team level, and at a community level. + +From the process and core team side of things: + +1. 
The RFC template should be updated to include the new section for teaching. +2. The RFC process description in the [RFCs README], specifically by including "fail to include a plan for documenting the feature" in the list of possible problems in "Submit a pull request step" in [What the process is]. +3. A blog post discussing the new process should be written discussing why we are making this change to the process, and especially explaining both the current problems and the benefits of making the change. +4. The core team should make documentation and teachability of new features *equally* high priority with the features themselves, and communicate this clearly in discussion of the features. (Core team members are already very good about including this in considerations of language design; this simply makes this an explicit goal of discussions around RFCs.) + +[RFCs README]: https://github.com/rust-lang/rfcs/blob/master/README.md +[What the process is]: https://github.com/rust-lang/rfcs/blob/master/README.md#what-the-process-is + +This is also an opportunity to allow/enable non-core-team members with less experience to contribute more actively to _The Rust Programming Language_, _Rust by Example_, and the Rust Reference. + +1. We should write issues for feature documentation, and flag them as approachable entry points for new users. +2. We can use the more complicated language reference issues as points for mentoring developers interested in contributing to the compiler. Helping document a complex language feature may be a useful on-ramp for working on the compiler itself. +3. We may find it useful to form a documentation subteam (under the leadership of the relevant core team representative), similar to what Ember has done, which is responsible for shepherding these changes along. 
+ + Whether such a team is formalized or not, the goal would be for the community to take up a greater degree of responsibility for the state of the documentation, rather than it falling entirely on the shoulders of a single core team member. (Having a dedicated core team member focused solely on docs is *wonderful*, but it means we can sometimes leave it all to just one person, and Rust has far too much going on for any individual to manage on their own.) + + (See the [addendum] below, as well.) + +At a "messaging" level, we should continue to emphasize that *documentation is just as valuable as code*. For example (and there are many other similar opportunities): in addition to highlighting new language features in the release notes for each version, we might highlight any part of the documentation which saw substantial improvement in the release. + + # Drawbacks [drawbacks]: #drawbacks @@ -113,9 +172,16 @@ Finally, this may delay landing features on stable. However, all the points rais For Rust to attain its goal of *stability without stagnation*, its documentation must also be stable and not stagnant. + # Alternatives [alternatives]: #alternatives +- **Embrace the documentation, but do not include "How do we teach this?" section in new RFCs.** + + This still gives us most of the benefits (and was in fact the original form of the proposal), and does not place a new burden on RFC authors to make sure that knowing how to *teach* something is part of any new language or standard library feature. + + On the other hand, thinking about the impact on teaching should further improve consideration of the general ergonomics of a proposed feature. If something cannot be *taught* well, it's likely the design needs further refinement. 
+ - **No change; leave RFCs as canonical documentation.** This approach can take (at least) two forms: @@ -151,3 +217,11 @@ For Rust to attain its goal of *stability without stagnation*, its documentation - Given that the reference is out of date, does it need to be brought up to date before beginning enforcement of this policy? - For the standard library, once it migrates to a crates structure, should it simply include the `#[forbid(missing_docs)]` attribute on all crates to set this as a build error? - Is a documentation subteam, _a la_ the one used by Ember, worth creating? + + +# Addendum: The state of the reference +[addendum]: #addendum-the-state-of-the-reference + +Related to some of the above discussion about the current state of the reference: it may be worth creating a "strike team" to invest a couple months working on the reference: updating it, organizing it, and improving its presentation. (A single web page with *all* of this content is difficult to navigate at best.) This can proceed in parallel with the documentation of new features. It is probably a necessity for this proposal to be particularly effective in the long term. + +Once the reference is up to date, the nucleus responsible for that work may either disband or possibly (depending on the core team's evaluation of the necessity of it and the interest of the "strike team" members) become the basis of a new documentation subteam. From 92d4befb1086d72569e80ee5af50e3df00fa8443 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Sat, 4 Jun 2016 11:08:30 -0400 Subject: [PATCH 0947/1195] Add process for updating the reference. 
--- text/0000-document_all_features.md | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index 3ff8518979c..161409e1672 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -14,7 +14,7 @@ One of the major goals of Rust's development process is *stability without stagn At present, new language features are often documented *only* in the RFCs which propose them and the associated announcement blog posts. Moreover, as features change, the existing official language documentation (the Rust Book, Rust by Example, and the language reference) can increasingly grow outdated. -Although the Rust Book and Rust by Example are kept relatively up to date, [the reference is not][home-to-reference] (on which see also the [addendum] to this RFC): +Although the Rust Book and Rust by Example are kept relatively up to date, [the reference is not][home-to-reference]: > While Rust does not have a specification, the reference tries to describe its working in detail. *It tends to be out of date.* (emphasis mine) @@ -124,6 +124,20 @@ This need not be especially long, but it should be long enough for ordinary user When the core team discusses whether to stabilize a feature in a given release, the reference material will now be a part of that decision. Once the feature *and* reference material are complete, it will be merged normally, and the pull request will simply include the reference material as well as the new feature. +Given the current state of the reference, this may need to proceed in two steps: + +#### The current state of the reference. + +Since the reference is currently fairly out of date in a number of areas, it may be worth creating a "strike team" to invest a couple months working on the reference: updating it, organizing it, and improving its presentation. 
(A single web page with *all* of this content is difficult to navigate at best.) This can proceed in parallel with the documentation of new features. It is probably a necessity for this proposal to be particularly effective in the long term. + +Once the reference is up to date, the nucleus responsible for that work may either disband or possibly (depending on the core team's evaluation of the necessity of it and the interest of the "strike team" members) become the basis of a new documentation subteam. + +Updating the reference could proceed stepwise: + +1. Begin by adding an appendix in the reference with links to all accepted RFCs which have been implemented but are not yet referenced in the documentation. +2. As the reference material is written for each of those RFC features, it can be removed from that appendix. + + ### Standard library [std]: #standard-library @@ -154,8 +168,6 @@ This is also an opportunity to allow/enable non-core-team members with less expe Whether such a team is formalized or not, the goal would be for the community to take up a greater degree of responsibility for the state of the documentation, rather than it falling entirely on the shoulders of a single core team member. (Having a dedicated core team member focused solely on docs is *wonderful*, but it means we can sometimes leave it all to just one person, and Rust has far too much going on for any individual to manage on their own.) - (See the [addendum] below, as well.) - At a "messaging" level, we should continue to emphasize that *documentation is just as valuable as code*. For example (and there are many other similar opportunities): in addition to highlighting new language features in the release notes for each version, we might highlight any part of the documentation which saw substantial improvement in the release. 
@@ -217,11 +229,3 @@ For Rust to attain its goal of *stability without stagnation*, its documentation - Given that the reference is out of date, does it need to be brought up to date before beginning enforcement of this policy? - For the standard library, once it migrates to a crates structure, should it simply include the `#[forbid(missing_docs)]` attribute on all crates to set this as a build error? - Is a documentation subteam, _a la_ the one used by Ember, worth creating? - - -# Addendum: The state of the reference -[addendum]: #addendum-the-state-of-the-reference - -Related to some of the above discussion about the current state of the reference: it may be worth creating a "strike team" to invest a couple months working on the reference: updating it, organizing it, and improving its presentation. (A single web page with *all* of this content is difficult to navigate at best.) This can proceed in parallel with the documentation of new features. It is probably a necessity for this proposal to be particularly effective in the long term. - -Once the reference is up to date, the nucleus responsible for that work may either disband or possibly (depending on the core team's evaluation of the necessity of it and the interest of the "strike team" members) become the basis of a new documentation subteam. From b2d4228621dae5a0d5eb4457d63ffcc3e3b3c704 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Sat, 4 Jun 2016 14:01:25 -0400 Subject: [PATCH 0948/1195] Fix some grammar issues. --- text/0000-document_all_features.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index 161409e1672..cd87c0675b3 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -152,10 +152,10 @@ To be most effective, this will involve some changes both at a process and core- From the process and core team side of things: -1. 
The RFC template should be updated to include the new section for teaching. -2. The RFC process description in the [RFCs README], specifically by including "fail to include a plan for documenting the feature" in the list of possible problems in "Submit a pull request step" in [What the process is]. -3. A blog post discussing the new process should be written discussing why we are making this change to the process, and especially explaining both the current problems and the benefits of making the change. -4. The core team should make documentation and teachability of new features *equally* high priority with the features themselves, and communicate this clearly in discussion of the features. (Core team members are already very good about including this in considerations of language design; this simply makes this an explicit goal of discussions around RFCs.) +1. Update the RFC template to include the new section for teaching. +2. Update the RFC process description in the [RFCs README], specifically by including "fail to include a plan for documenting the feature" in the list of possible problems in the "Submit a pull request" step in [What the process is]. +3. Write a blog post discussing the new process, explaining why we are making this change and especially both the current problems and the benefits of making the change. +4. Make documentation and teachability of new features *equally* high priority with the features themselves, and communicate this clearly in discussion of the features. (Core team members are already very good about including this in considerations of language design; this simply makes this an explicit goal of discussions around RFCs.)
[RFCs README]: https://github.com/rust-lang/rfcs/blob/master/README.md [What the process is]: https://github.com/rust-lang/rfcs/blob/master/README.md#what-the-process-is From 6193dd8c4b1357d106073868d9a0aebc2d6c13dc Mon Sep 17 00:00:00 2001 From: benaryorg Date: Sat, 4 Jun 2016 22:58:21 +0200 Subject: [PATCH 0949/1195] add: duration_checked_sub RFC Signed-off-by: benaryorg --- text/0000-duration-checked-sub.md | 93 +++++++++++++++++++++++++++++++ 1 file changed, 93 insertions(+) create mode 100644 text/0000-duration-checked-sub.md diff --git a/text/0000-duration-checked-sub.md b/text/0000-duration-checked-sub.md new file mode 100644 index 00000000000..21e55430f81 --- /dev/null +++ b/text/0000-duration-checked-sub.md @@ -0,0 +1,93 @@ +- Feature Name: duration_checked_sub +- Start Date: 2016-06-04 +- RFC PR: +- Rust Issue: + +# Summary +[summary]: #summary + +This RFC adds `checked_sub()` already known from various primitive types to the +`Duration` *struct*. + +# Motivation +[motivation]: #motivation + +Generally this helps when subtracting `Duration`s which can be the case quite +often. + +One abstract example would be executing a specific piece of code repeatedly +after a constant amount of time. + +Specific examples would be a network service or a rendering process emitting a +constant amount of frames per second. 
+ +Example code would be as follows: + +```rust +use std::time::{Duration, Instant}; +// This function is called repeatedly +fn render() { + // 10ms delay results in 100 frames per second + let wait_time = Duration::from_millis(10); + + // `Instant` for elapsed time + let start = Instant::now(); + + // execute code here + render_and_output_frame(); + + // there are no negative `Duration`s so this does nothing if the elapsed + // time is longer than the defined `wait_time` + if let Some(rest) = wait_time.checked_sub(start.elapsed()) { std::thread::sleep(rest); } +} +``` + +# Detailed design +[design]: #detailed-design + +The detailed design would be exactly like the current `sub()` method, just +returning an `Option<Duration>` and passing possible `None` values from the +underlying primitive types: + +```rust +impl Duration { + fn checked_sub(self, rhs: Duration) -> Option<Duration> { + if let Some(mut secs) = self.secs.checked_sub(rhs.secs) { + let nanos = if self.nanos >= rhs.nanos { + self.nanos - rhs.nanos + } else { + if let Some(sub_secs) = secs.checked_sub(1) { + secs = sub_secs; // borrow one second for the nanosecond part + self.nanos + NANOS_PER_SEC - rhs.nanos + } else { + return None; + } + }; + debug_assert!(nanos < NANOS_PER_SEC); + Some(Duration { secs: secs, nanos: nanos }) + } + else { + None + } + } +} +``` + +# Drawbacks +[drawbacks]: #drawbacks + +This proposal adds another `checked_*` method to *libstd*. +One could ask why there is no `CheckedSub` trait if there is a `Sub` trait. + +# Alternatives +[alternatives]: #alternatives + +The alternatives are simply not doing this and forcing the programmer to code +the check themselves. + +# Unresolved questions +[unresolved]: #unresolved-questions + +Should all functions of the form +`(checked|saturating|overflowing|wrapping)_(add|sub|mul|div)` be added? + From fc2718cc35be7f7e90cd619850872f2160d25c19 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Mon, 6 Jun 2016 14:22:52 -0400 Subject: [PATCH 0950/1195] Fix bad reference-style link.
--- text/0000-document_all_features.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index cd87c0675b3..4609193b18c 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -62,11 +62,11 @@ This exact idea has been adopted by the Ember community after their somewhat bum [@davidgoli]: https://github.com/emberjs/rfcs/pull/56#issuecomment-114635962 -The Ember core team agreed, and embraced the principle outlined in [this comment][guarav0]: +The Ember core team agreed, and embraced the principle outlined in [this comment][@guarav0]: > No version shall be released until guides and versioned API documentation is ready. This will allow newcomers the ability to understand the latest release. ([@guarav0]) -[guarav0]: https://github.com/emberjs/rfcs/pull/56#issuecomment-114339423 +[@guarav0]: https://github.com/emberjs/rfcs/pull/56#issuecomment-114339423 One of the main reasons not to adopt this approach, that it might block features from landing as soon as they otherwise might, was [addressed][@eccegordo] in that discussion as well: From 81b4c7e2fbc9ff0dac6ddc2e588debddf9dfbae3 Mon Sep 17 00:00:00 2001 From: Jonathan Turner Date: Tue, 7 Jun 2016 16:52:20 -0400 Subject: [PATCH 0951/1195] Create 0000-rust-new-error-format.md --- text/0000-rust-new-error-format.md | 191 +++++++++++++++++++++++++++++ 1 file changed, 191 insertions(+) create mode 100644 text/0000-rust-new-error-format.md diff --git a/text/0000-rust-new-error-format.md b/text/0000-rust-new-error-format.md new file mode 100644 index 00000000000..2908fa14318 --- /dev/null +++ b/text/0000-rust-new-error-format.md @@ -0,0 +1,191 @@ +- Feature Name: rust_new_error_format +- Start Date: 2016-06-07 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +This RFC proposes an update to error reporting in rustc. 
Its focus is to change the format of Rust error messages and --explain text to focus on the user's code. The end goal is for errors and explain text to be more readable and friendlier to new users, while still helping Rust coders fix bugs as quickly as possible. We expect to follow this RFC with a supplemental RFC that provides a writing style guide for error messages and explain text with a focus on readability and education. + +This RFC details work in close collaboration with Niko Matsakis and Yehuda Katz, with input from Aaron Turon and Alex Crichton. Special thanks to those who gave us feedback on previous iterations of the proposal. + +# Motivation + +## Default error format + +Rust offers a unique value proposition in the landscape of languages in part by codifying concepts like ownership and borrowing. Because these concepts are unique to Rust, it's critical that the learning curve be as smooth as possible. And one of the most important tools for lowering the learning curve is providing excellent errors that serve to make the concepts less intimidating, and to help 'tell the story' about what those concepts mean in the context of the programmer's code. + +![Image of current error format](http://www.jonathanturner.org/images/old_errors_new2.png) + +*Example of a borrow check error in the current compiler* + +Though a lot of time has been spent on the current error messages, they have a couple of flaws which make them difficult to use. Specifically, the current error format: + +* Repeats the file position on the left-hand side. This offers no additional information, but instead makes the error harder to read. +* Prints messages about lines often out of order. This makes it difficult for the developer to glance at the error and recognize why the error is occurring. +* Lacks a clear visual break between errors. As more errors occur it becomes more difficult to tell them apart.
+* Uses technical terminology that is difficult for new users who may be unfamiliar with compiler terminology or terminology specific to Rust. + +This RFC details a redesign of errors to focus more on the source the programmer wrote. This format addresses the above concerns by eliminating clutter, following a more natural order for help messages, and pointing the user to both "what" the error is and "why" the error is occurring by using color-coded labels. Below you can see the same error again, this time using the proposed format: + +![Image of new error flow](http://www.jonathanturner.org/images/new_errors_new2.png) + +*Example of the same borrow check error in the proposed format* + +## Expanded error format (revised --explain) + +Languages like Elm have shown how effective an educational tool error messages can be if the explanations like our --explain text are mixed with the user's code. As mentioned earlier, it's crucial for Rust to be easy-to-use, especially since it introduces a fair number of concepts that may be unfamiliar to the user. Even experienced users may need to use --explain text from time to time when they encounter unfamiliar messages. + +While we have --explain text today, it uses generic examples that require the user to mentally translate the given example into what works for their specific situation. + +``` +You tried to move out of a value which was borrowed. Erroneous code example: + +use std::cell::RefCell; + +struct TheDarkKnight; + +impl TheDarkKnight { + fn nothing_is_true(self) {} +} +... +``` + +*Example of the current --explain (showing E0507)* + +To help users, this RFC proposes that --explain no longer uses an error code. Instead, --explain becomes a flag in a cargo or rustc invocation that enables an expanded error-reporting mode which incorporates the user's code. This more textual mode gives additional explanation to help understand compiler messages better. The end result is a richer, on-demand error reporting style. 
+ +![Image of Rust error in elm-style](http://www.jonathanturner.org/images/elm_like_rust.png) + +# Detailed design + +The RFC is separated into two parts: the format of error messages and the format of expanded error messages (using --explain). + +## Format of error messages + +The proposal is a lighter error format focused on the code the user wrote. Messages that help understand why an error occurred appear as labels on the source. You can see an example below: + +![Image of new error flow](http://www.jonathanturner.org/images/new_errors_new2.png) + +The goals of this new format are to: + +* Create something that's visually easy to parse +* Remove noise/unnecessary information +* Present information in a way that works well for new developers, post-onboarding, and experienced developers without special configuration +* Draw inspiration from Elm as well as Dybuk and other systems that have already improved on the kind of errors that Rust has. + +In order to accomplish this, the proposed design needs to satisfy a number of constraints to make the result maximally flexible across various terminals: + +* Multiple errors beside each other should be clearly separate and not muddled together. +* Each error message should draw the eye to where the error occurs with sufficient context to understand why the error occurs. +* Each error should have a "header" section that is visually distinct from the code section. +* Code should visually stand out from text and other error messages. This allows the developer to immediately recognize their code. +* Error messages should be just as readable when not using colors (eg for users of black-and-white terminals, color-impaired readers, weird color schemes that we can't predict, or just people that turn colors off) +* Be careful using “ascii art” and avoid unicode. Instead look for ways to show the information concisely that will work across the broadest number of terminals. 
We expect IDEs to possibly allow for a more graphical error in the future. +* Where possible, use labels on the source itself rather than sentence "notes" at the end. +* Keep filename:line easy to spot for people who use editors that let them click on errors + +### Header + +![Image of new error format heading](http://www.jonathanturner.org/images/rust_error_1_new.png) + +The header now spans two lines. It gives you access to knowing a) if it's a warning or error, b) the text of the warning/error, and c) the location of this warning/error. You can see we also use the [--explain E0499] as a way to let the developer know they can get more information about this kind of issue, though this may be replaced in favor of the more general --explain also described in this RFC. While we use some bright colors here, we expect the use of colors and bold text in the 'Source area' (shown below) to draw the eye first. + +### Line number column + +![Image of new error format line number column](http://www.jonathanturner.org/images/rust_error_2_new.png) + +The line number column lets you know where the error is occurring in the file. Because we only show lines that are of interest for the given error/warning, we elide lines if they are not annotated as part of the message (we currently use the heuristic to elide after one un-annotated line). + +Inspired by Dybuk and Elm, the line numbers are separated with a 'wall', a separator formed from |>, to clearly distinguish what is a line number from what is source at a glance. + +As the wall also forms a way to visually separate distinct errors, we propose extending this concept to also support span-less notes and hints. 
For example: + +``` +92 |> config.target_dir(&pkg) + |> ^^^^ expected `core::workspace::Workspace`, found `core::package::Package` + => note: expected type `&core::workspace::Workspace<'_>` + => note: found type `&core::package::Package` +``` +### Source area + +![Image of new error format source area](http://www.jonathanturner.org/images/rust_error_3_new.png) + +The source area shows the related source code for the error/warning. The source is laid out in the order it appears in the source file, giving the user a way to map the message against the source they wrote. + +Key parts of the code are labeled with messages to help the user understand the message. + +The primary label is the label associated with the main warning/error. It explains the **what** of the compiler message. By reading it, the user can begin to understand what the root cause of the error or warning is. This label is colored to match the level of the message (yellow for warning, red for error) and uses the ^^^ underline. + +Secondary labels help to understand the error and use blue text and --- underline. These labels explain the **why** of the compiler message. You can see one such example in the above message where the secondary labels explain that there is already another borrow going on. In another example, we see another way that primary and secondary work together to tell the whole story for why the error occurred: + +![Image of new error format source area](http://www.jonathanturner.org/images/primary_secondary.png) + +Taken together, primary and secondary labels create a 'flow' to the message. Flow in the message lets the user glance at the colored labels and quickly form an educated guess as to how to correctly update their code. + +Note: We'll talk more about additional style guidance for wording to help create flow in the subsequent style RFC. + +## Expanded error messages + +Currently, --explain text focuses on the error code. 
You invoke the compiler with --explain and receive a verbose description of what causes errors of that number. The resulting message can be helpful, but it uses generic sample code which makes it feel less connected to the user's code. + +We propose changing --explain to no longer take an error code. Instead, passing --explain to the compiler (or to cargo) will turn the compiler output into an expanded form which incorporates the same source and label information the user saw in the default message with more explanation text. + +![Image of Rust error in elm-style](http://www.jonathanturner.org/images/elm_like_rust.png) + +*Example of an expanded error message* + +The expanded error message effectively becomes a template. The text of the template is the educational text that explains the message in more detail. The template is then populated using the source lines, labels, and spans from the same compiler message that's printed in the default mode. This lets the message writer call out each label or span as appropriate in the expanded text. + +It's possible to also add additional labels that aren't necessarily shown in the default error mode but would be available in the expanded error format. For example, the above error might look like this as a default error: + +![Image of same error without all of the same labels](http://www.jonathanturner.org/images/default_borrowed_content.png) + +This gives the explain text writer maximal flexibility without impacting the readability of the default message. I'm currently prototyping an implementation of how this templating could work in practice. + +## Tying it together + +Lastly, we propose that the final error message: + +``` +error: aborting due to 2 previous errors +``` + +be changed to notify users of this ability: + +``` +note: You can compile again with --explain for more information about these errors +``` + +This helps inform the user of the --explain capability.
+ +# Drawbacks + +Changes in the error format can impact integration with other tools. For example, IDEs that use a simple regex to detect the error would need to be updated to support the new format. This takes time and community coordination. + +While the new error format has a lot of benefits, it's possible that some errors will feel "shoehorned" into it and, even after careful selection of secondary labels, may still not read as well as the original format. + +There is a fair amount of work involved in updating the errors and explain text to the proposed format. + +# Alternatives + +Rather than using the proposed error format, we could only provide the verbose --explain style that is proposed in this RFC. Famous programmers like [John Carmack](https://twitter.com/ID_AA_Carmack/status/735197548034412546) have praised the Elm error format. + +![Image of Elm error](http://www.jonathanturner.org/images/elm_error.jpg) + +*Example of an Elm error* + +In developing this RFC, we experimented with both styles. The Elm error format is great as an educational tool, and we wanted to leverage its style in Rust. For day-to-day work, though, we favor an error format that puts heavy emphasis on quickly guiding the user to what the error is and why it occurred, with an easy way to get the richer explanations (using --explain) when the user wants them. + +# Stabilization + +Currently, this new Rust error format is available on nightly using the ```export RUST_NEW_ERROR_FORMAT=true``` environment variable. Ultimately, this should become the default. In order to get there, we need to ensure that the new error format is indeed an improvement over the existing format in practice. + +How do we measure the readability of error messages? This RFC details an educated guess as to what would improve the current state but shows no ways to measure success. + +Likewise, while some of us have been dogfooding these errors, we don't know what long-term use feels like.
For example, after a time does the use of color feel excessive? We can always update the errors as we go, but it'd be helpful to catch it early if possible. + +# Unresolved questions + +There are a few unresolved questions: +* Editors that rely on pattern-matching the compiler output will need to be updated. It's an open question how best to transition to using the new errors. There is on-going discussion of standardizing the JSON output, which could also be used. +* Can additional error notes be shown without the "rainbow problem" where too many colors and too much boldness cause errors to become less readable? From af8379223142f96bf03bdb65cec398e2b5e0b37e Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Tue, 7 Jun 2016 16:40:05 -0400 Subject: [PATCH 0952/1195] initial commit --- text/0000-memory-model-strike-team.md | 306 ++++++++++++++++++++++++++ 1 file changed, 306 insertions(+) create mode 100644 text/0000-memory-model-strike-team.md diff --git a/text/0000-memory-model-strike-team.md b/text/0000-memory-model-strike-team.md new file mode 100644 index 00000000000..b3da16f348f --- /dev/null +++ b/text/0000-memory-model-strike-team.md @@ -0,0 +1,306 @@ +- Feature Name: N/A +- Start Date: (fill me in with today's date, YYYY-MM-DD) +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Incorporate a strike team dedicated to preparing rules and guidelines +for writing unsafe code in Rust (commonly referred to as Rust's +"memory model"), in cooperation with the lang team. The discussion +will generally proceed in phases, starting with establishing +high-level principles and gradually getting down to the nitty-gritty +details (though some back and forth is expected). The strike team will +produce various intermediate documents that will be submitted as +normal RFCs.
+ +# Motivation +[motivation]: #motivation + +Rust's safe type system offers very strong aliasing information that +promises to be a rich source of compiler optimization. For example, +in safe code, the compiler can infer that if a function takes two +`&mut T` parameters, those two parameters must reference disjoint +areas of memory (this allows optimizations similar to C99's `restrict` +keyword, except that it is both automatic and fully enforced). The +compiler also knows that given a shared reference type `&T`, the +referent is immutable, except for data contained in an `UnsafeCell`. + +Unfortunately, there is a fly in the ointment. Unsafe code can easily +be made to violate these sorts of rules. For example, using unsafe +code, it is trivial to create two `&mut` references that both refer to +the same memory (and which are simultaneously usable). In that case, +if the unsafe code were to (say) return those two references to safe code, +that would undermine Rust's safety guarantees -- hence it's clear that +this code would be "incorrect". + +But things become more subtle when we just consider what happens +*within* the abstraction. For example, is unsafe code allowed to use +two overlapping `&mut` references internally, without returning them to +the wild? Is it all right to overlap with `*mut`? And so forth. + +It is the contention of this RFC that complete guidelines for unsafe +code are far too big a topic to be fruitfully addressed in a single +RFC. Therefore, this RFC proposes the formation of a dedicated +**strike team** (that is, a temporary, single-purpose team) that will +work on hammering out the details over time. Precise membership of +this team is not part of this RFC, but will be determined by the lang +team as well as the strike team itself. + +The unsafe guidelines work will proceed in rough stages, described +below.
An initial goal is to produce a **high-level summary detailing +the general approach of the guidelines.** Ideally, this summary should +be sufficient to help guide unsafe authors in best practices that are +most likely to be forwards compatible. Further work will then expand +on the model to produce a more **detailed set of rules**, which may in +turn require revisiting the high-level summary if contradictions are +uncovered. + +This new "unsafe code" strike team is intended to work in +collaboration with the existing lang team. Ultimately, whatever rules +are crafted must be adopted with the **general consensus of both the +strike team and the lang team**. It is expected that lang team members +will be more involved in the early discussions that govern the overall +direction and less involved in the fine details. + +#### History and recent discussions + +The history of optimizing C can be instructive. All code in C is +effectively unsafe, and so in order to perform optimizations, +compilers have come to lean heavily on the notion of "undefined +behavior" as well as various ad-hoc rules about what programs ought +not to do (see e.g. [these][cl1] [three][cl2] [posts][cl3] entitled +"What Every C Programmer Should Know About Undefined Behavior", by +Chris Lattner). This can cause some very surprising behavior (see e.g. +["What Every Compiler Author Should Know About Programmers"][cap] or +[this blog post by John Regehr][jr], which is quite humorous). Note that +Rust has a big advantage over C here, in that only the authors of +unsafe code should need to worry about these rules. 
+ +[cl1]: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html +[cl2]: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html +[cl3]: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_21.html +[cap]: http://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_2015_submission_29.pdf +[jr]: http://blog.regehr.org/archives/761 + +In terms of Rust itself, there has been a large amount of discussion +over the years. Here is a (non-comprehensive) set of relevant links, +with a strong bias towards recent discussion: + +- [RFC Issue #1447](https://github.com/rust-lang/rfcs/issues/1447) provides + a general set of links as well as some discussion. +- [RFC #1578](https://github.com/rust-lang/rfcs/pull/1578) is an initial + proposal for a Rust memory model by ubsan. +- The + [Tootsie Pop](http://smallcultfollowing.com/babysteps/blog/2016/05/27/the-tootsie-pop-model-for-unsafe-code/) + blog post by nmatsakis proposed an alternative approach, building on + [background about unsafe abstractions](http://smallcultfollowing.com/babysteps/blog/2016/05/23/unsafe-abstractions/) + described in an earlier post. There is also a lot of valuable + discussion in + [the corresponding internals thread](http://smallcultfollowing.com/babysteps/blog/2016/05/23/unsafe-abstractions/). + +#### Other factors + +Another factor that must be considered is the interaction with weak +memory models. Most of the links above focus purely on sequential +code: Rust has more-or-less adopted the C++ memory model for governing +interactions across threads. But there may well be subtle cases that +arise as we delve deeper. For more on the C++ memory model, see +[Hans Boehm's excellent webpage](http://www.hboehm.info/c++mm/). + +# Detailed design +[design]: #detailed-design + +## Scope + +Here are some of the issues that should be resolved as part of these +unsafe code guidelines.
The following list is not intended to be +comprehensive (suggestions for additions welcome): + +- Legal aliasing rules and patterns of memory accesses + - e.g., which of the patterns listed in [rust-lang/rust#19733](https://github.com/rust-lang/rust/issues/19733) + are legal? + - can unsafe code create (but not use) overlapping `&mut`? + - under what conditions is it legal to dereference a `*mut T`? + - when can an `&mut T` legally alias an `*mut T`? +- Struct layout guarantees +- Interactions around zero-sized types + - e.g., what pointer values can legally be considered a `Box`? +- Allocator dependencies + +One specific area that we can hopefully "outsource" is detailed rules +regarding the interaction of different threads. Rust exposes atomics +that roughly correspond to C++11 atomics, and the intention is that we +can layer our rules for sequential execution atop those rules for +parallel execution. + +## Time frame + +Working out a set of rules for unsafe code is a detailed process and +is expected to take months (or longer, depending on the level of +detail we ultimately aim for). However, the intention is to publish +preliminary documents as RFCs as we go, so hopefully we can be +providing ever more specific guidance for unsafe code authors. + +Note that even once an initial set of guidelines is adopted, problems +or inconsistencies may be found. If that happens, the guidelines will +be adjusted as needed to correct the problem, naturally with an eye +towards backwards compatibility. In other words, the unsafe +guidelines, like the rules for the Rust language itself, should be +considered a "living document". + +As a note of caution, experience from other languages such as Java or +C++ suggests that the work on memory models can take years. Moreover, +even once a memory model is adopted, it can be unclear whether +[common compiler optimizations are actually permitted](http://www.di.ens.fr/~zappa/readings/c11comp.pdf) +under the model.
The hope is that by focusing on sequential and +Rust-specific issues we can sidestep some of these quandaries. + +## Intermediate documents + +Because hammering out the finer points of the memory model is expected +to take some time, it is important to produce intermediate +agreements. This section describes some of the documents that may be +useful. These also serve as a rough guideline to the overall "phases" +of discussion that are expected, though in practice discussion will +likely go back and forth: + +- **Key examples and optimizations**: highlighting code examples that + ought to work, or optimizations we should be able to do, as well as + some that will not work, or those whose outcome is in doubt. +- **High-level design**: describe the rules at a high level. This + would likely be the document that unsafe code authors would read to + know if their code is correct in the majority of scenarios. Think of + this as the "user's guide". +- **Detailed rules**: More comprehensive rules. Think of this as the + "reference manual". + +Note that both the "high-level design" and "detailed rules", once +considered complete, will be submitted as RFCs and undergo the usual +final comment period. + +### Key examples and optimizations + +Probably a good first step is to agree on some key examples and +overall principles. Examples would fall into several categories: + +- Unsafe code that we feel **must** be considered **legal** by any model +- Unsafe code that we feel **must** be considered **illegal** by any model +- Unsafe code that we feel **may or may not** be considered legal +- Optimizations that we **must** be able to perform +- Optimizations that we **should not** expect to be able to perform +- Optimizations that it would be nice to have, but which may be sacrificed + if needed + +Having such guiding examples naturally helps to steer the effort, but +it also helps to provide guidance for unsafe code authors in the +meantime.
These examples illustrate patterns that one can adopt with +reasonable confidence. + +Deciding about these examples should also help in enumerating the +guiding principles we would like to adhere to. The design of a memory +model ultimately requires balancing several competing factors and it +may be useful to state our expectations up front on how these will be +weighed: + +- **Optimization.** The stricter the rules, the more we can optimize. + - on the other hand, rules that are overly strict may prevent people + from writing unsafe code that they would like to write, ultimately + leading to slower execution. +- **Comprehensibility.** It is important to strive for rules that end + users can readily understand. If learning the rules requires diving + into academic papers or using Coq, it's a non-starter. +- **Effect on existing code.** No matter what model we adopt, existing + unsafe code may or may not comply. If we then proceed to optimize, + this could cause running code to stop working. While + [RFC 1122](https://github.com/rust-lang/rfcs/blob/master/text/1122-language-semver.md) + explicitly specified that the rules for unsafe code may change, we + will have to decide where to draw the line in terms of how much to + weigh backwards compatibility. + +It is expected that the lang team will be **highly involved** in this discussion. + +It is also expected that we will gather examples in the following ways: + +- survey existing unsafe code; +- solicit suggestions of patterns from the Rust-using public: + - scenarios where they would like an official judgement; + - interesting questions involving the standard library. + +### High-level design + +The next step is to settle on a high-level +design. There have already been several approaches floated. This phase +should build on the examples from before, in that proposals can be +weighed against their effect on the examples and optimizations.
+ +There will likely also be some feedback between this phase and the +previous one: as new proposals are considered, that may generate new +examples that were not relevant previously. + +Note that even once a high-level design is adopted, it will be +considered "tentative" and "unstable" until the detailed rules have +been worked out to a reasonable level of confidence. + +Once a high-level design is adopted, it may also be used by the +compiler team to inform which optimizations are legal or illegal. +However, if changes are later made, the compiler will naturally have +to be adjusted to match. + +It is expected that the lang team will be **highly involved** in this discussion. + +### Detailed rules + +Once we've settled on a high-level path -- and, no doubt, while in the +process of doing so as well -- we can begin to enumerate more detailed +rules. It is also expected that working out the rules may uncover +contradictions or other problems that require revisiting the +high-level design. + +### Lints and other checkers + +Ideally, the team will also consider whether automated checking for +conformance is possible. It is not a responsibility of this strike +team to produce such automated checking, but automated checking is +naturally a big plus! + +## Repository + +In general, the memory model discussion will be centered on a specific +repository (perhaps +, but perhaps moved +to the rust-lang organization). This allows for multi-faceted +discussion: for example, we can open issues on particular questions, +as well as storing the various proposals and litmus tests in their own +directories. We'll work out and document the procedures and +conventions here as we go. + +# Drawbacks +[drawbacks]: #drawbacks + +The main drawback is that this discussion will require time and energy +which could be spent elsewhere. The justification for spending time on +developing the memory model instead is that it is crucial to enable +the compiler to perform aggressive optimizations.
Until now, we've +limited ourselves by and large to conservative optimizations (though +we do supply some LLVM aliasing hints that can be affected by unsafe +code). As the transition to MIR comes to fruition, it is clear that we +will be in a place to perform more aggressive optimization, and hence +the need for rules and guidelines is becoming more acute. We can +continue to adopt a conservative course, but this risks growing an +ever larger body of code dependent on the compiler not performing +aggressive optimization, which may close those doors forever. + +# Alternatives +[alternatives]: #alternatives + +- Adopt a memory model in one fell swoop: + - considered too complicated +- Defer adopting a memory model for longer: + - considered too risky + +# Unresolved questions +[unresolved]: #unresolved-questions + +None. From 93f326aeb867bc0a35ccf5aa43d8cf880573f0f0 Mon Sep 17 00:00:00 2001 From: Jonathan Turner Date: Tue, 7 Jun 2016 17:10:01 -0400 Subject: [PATCH 0953/1195] Update and rename 0000-rust-new-error-format.md to 0000-default-and-expanded-rustc-errors.md --- ...rror-format.md => 0000-default-and-expanded-rustc-errors.md} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename text/{0000-rust-new-error-format.md => 0000-default-and-expanded-rustc-errors.md} (99%) diff --git a/text/0000-rust-new-error-format.md b/text/0000-default-and-expanded-rustc-errors.md similarity index 99% rename from text/0000-rust-new-error-format.md rename to text/0000-default-and-expanded-rustc-errors.md index 2908fa14318..d9bd506f545 100644 --- a/text/0000-rust-new-error-format.md +++ b/text/0000-default-and-expanded-rustc-errors.md @@ -1,4 +1,4 @@ -- Feature Name: rust_new_error_format +- Feature Name: default_and_expanded_errors_for_rustc - Start Date: 2016-06-07 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) From 8e1b6da3fded59015a38f3cb8c4ec52c4c98b4ad Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Tue, 7 Jun 2016 19:04:16 -0400 Subject: [PATCH 
0954/1195] Add a number of further considerations. --- text/0000-document_all_features.md | 74 ++++++++++++++++++++++++++---- 1 file changed, 66 insertions(+), 8 deletions(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index 4609193b18c..8e0358d2864 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -83,6 +83,7 @@ The basic decision has led to a substantial improvement in the currency of the d The basic process of developing new language features will remain largely the same as today. The changes are two additions: - a new section in the RFC, "How do we teach this?" modeled on Ember's updated RFC process + - a new requirement that the changes themselves be properly documented before being merged to stable ## New RFC section: "How do we teach this?" @@ -135,8 +136,8 @@ Once the reference is up to date, the nucleus responsible for that work may eith Updating the reference could proceed stepwise: 1. Begin by adding an appendix in the reference with links to all accepted RFCs which have been implemented but are not yet referenced in the documentation. -2. As the reference material is written for each of those RFC features, it can be removed from that appendix. +2. As the reference material is written for each of those RFC features, it can be removed from that appendix. ### Standard library [std]: #standard-library @@ -144,6 +145,28 @@ Updating the reference could proceed stepwise: In the case of the standard library, this could conceivably be managed by setting the `#[forbid(missing_docs)]` attribute on the library roots. In lieu of that, manual code review and general discipline should continue to serve. However, if automated tools *can* be employed here, they should. 
+## Add an "Edit" link +[edit-link]: #add-an-edit-link + +To support its own change, the Ember team added an "edit this" icon to the top of every page in the guides (and plans to do so for the API documentation, pending infrastructure changes to support that). Each of _The Rust Programming Language_, _Rust by Example_, and the Rust Reference should do the same. + +Making a similar change has some downsides (see below under [**Drawbacks**][drawbacks]), but it has two major upsides: + +1. It gives users an obvious action to fix typos. Speaking from personal experience, it can be difficult to find where a given documentation or book page exists in the Rust repository. Even with the drawbacks noted below, this would substantially smooth the process of making e.g. a small typo fix for first-time readers of _The Rust Programming Language_. Making the first contribution easy makes further contributions much more likely. + +2. It sends a quiet but real signal that the docs are up for editing. This makes it likelier that people will edit them! + +### Optional: Support with infrastructure change +[edit-link-infrastructure]: #optional-support-with-infrastructure-change + +The links to edit the documentation could track against the release branch instead of against `master`. (Fixes to documentation would be analogous to bugfix releases in this sense.) Targeting the pull-request automatically would be straightforward. However, see below under [**Drawbacks**][drawbacks]. + +## Optional: Visually Distinguish Nightly +[distinguish-nightly]: #optional-visually-distinguish-nightly + +It might be useful to visually distinguish the documentation for nightly Rust as being unstable and subject to change, even simply by setting a different default theme on _The Rust Programming Language_ book for nightly Rust. + + # How do we teach this? Since this RFC promotes including this section, it includes it itself. (RFCs, unlike Rust `struct` or `enum` types, may be freely self-referential. 
No boxing required.) @@ -153,8 +176,11 @@ To be most effective, this will involve some changes both at a process and core- From the process and core team side of things: 1. Update the RFC template to include the new section for teaching. + 2. Update the RFC process description in the [RFCs README], specifically by including "fail to include a plan for documenting the feature" in the list of possible problems in "Submit a pull request step" in [What the process is]. + 3. Write a blog post discussing the new process, explaining why we are making this change and especially both the current problems and the benefits of making the change. + 4. Make documentation and teachability of new features *equally* high priority with the features themselves, and communicate this clearly in discussion of the features. (Core team members are already very good about including this in considerations of language design; this simply makes this an explicit goal of discussions around RFCs.) [RFCs README]: https://github.com/rust-lang/rfcs/blob/master/README.md @@ -163,10 +189,12 @@ From the process and core team side of things: This is also an opportunity to allow/enable non-core-team members with less experience to contribute more actively to _The Rust Programming Language_, _Rust by Example_, and the Rust Reference. 1. We should write issues for feature documentation, and flag them as approachable entry points for new users. + 2. We can use the more complicated language reference issues as points for mentoring developers interested in contributing to the compiler. Helping document a complex language feature may be a useful on-ramp for working on the compiler itself. -3. We may find it useful to form a documentation subteam (under the leadership of the relevant core team representative), similar to what Ember has done, which is responsible for shepherding these changes along.
- Whether such a team is formalized or not, the goal would be for the community to take up a greater degree of responsibility for the state of the documentation, rather than it falling entirely on the shoulders of a single core team member. (Having a dedicated core team member focused solely on docs is *wonderful*, but it means we can sometimes leave it all to just one person, and Rust has far too much going on for any individual to manage on their own.) +3. ~~We may find it useful to form~~ We are already forming a documentation subteam (under the leadership of the relevant core team representative), similar to what Ember has done, which will be responsible for shepherding these changes along. + + ~~Whether such a team is formalized or not,~~ Even with such a team in place, a major goal remains encouraging the community to take up a greater degree of responsibility for the state of the documentation, rather than it falling entirely on the shoulders of a single core team member or even the docs team. (Having a dedicated core team member focused solely on docs is *wonderful*, but it means we can sometimes leave it all to just one person, and Rust has far too much going on for any individual to manage on their own.) At a "messaging" level, we should continue to emphasize that *documentation is just as valuable as code*. For example (and there are many other similar opportunities): in addition to highlighting new language features in the release notes for each version, we might highlight any part of the documentation which saw substantial improvement in the release. @@ -174,20 +202,47 @@ At a "messaging" level, we should continue to emphasize that *documentation is j # Drawbacks [drawbacks]: #drawbacks -The largest drawback at present is that the language reference is *already* quite out of date. It may take substantial work to get it up to date so that new changes can be landed appropriately. 
(Arguably, however, this should be done regardless, since the language reference is an important part of the language ecosystem.) +1. The largest drawback at present is that the language reference is *already* quite out of date. It may take substantial work to get it up to date so that new changes can be landed appropriately. (Arguably, however, this should be done regardless, since the language reference is an important part of the language ecosystem.) + +2. Another potential issue is that some sections of the reference are particularly thorny and must be handled with considerable care (e.g. lifetimes). Although in general it would not be necessary for the author of the new language feature to write all the documentation, considerable extra care and oversight would need to be in place for these sections. + +3. This may delay landing features on stable. However, all the points raised in [**Precedent**][precedent] on this apply, especially: + + > We can't get the great new toys unless everybody can enjoy the toys. ([@eccegordo]) + + For Rust to attain its goal of *stability without stagnation*, its documentation must also be stable and not stagnant. + +4. If the forthcoming docs team is unable to provide significant support for the core team member responsible for documentation, and perhaps equally if the rest of the community does not also increase involvement, this will simply not work. No individual can manage all of these docs alone. + +5. Specific to the suggestion to [**Add an "edit" link**][edit-link]: -Another potential issue is that some sections of the reference are particularly thorny and must be handled with considerable care (e.g. lifetimes). Although in general it would not be necessary for the author of the new language feature to write all the documentation, considerable extra care and oversight would need to be in place for these sections. + - If the specific page is in flux (e.g. 
being rewritten, broken into pieces, etc.), then a link to edit `master` will be confusing. + - In addition, when users *have* made edits, it may take some time before they appear, and thus users may be confused when attempting to make edits and finding that the relevant edits have already been made. + - Some pages users attempt to edit are *likely* to have different documentation in them than the existing pages, to account for inbound changes for feature additions to the language! -Finally, this may delay landing features on stable. However, all the points raised in [**Precedent**][precedent] on this apply, especially: + Two notes, however: -> We can't get the great new toys unless everybody can enjoy the toys. ([@eccegordo]) + 1. Even facing the same issues, the Ember team has found it useful to have the link, as it enables basically any user of a sufficient comfort level with GitHub to fix basic typos or logic errors. + 2. This concern primarily impacts _The Rust Programming Language_. Both in its current state and in the event of an eventual revamp (at least: after such a revamp is finished), the Rust Reference is far less likely to see pages removed or moved. -For Rust to attain its goal of *stability without stagnation*, its documentation must also be stable and not stagnant. + Finally, while infrastructure changes could be made in support of a more "targeted" editing experience, doing so would substantially increase the triage work required for the docs. It would also entail extra work "porting" the changes back to `master`. Additionally, because the language itself does not currently have "bugfix" releases, this would substantially alter the workflow for dealing with releases in general. + +6. Specific to the suggestion to [**Visually Distinguish Nightly**][distinguish-nightly]: + + This requires at least some infrastructure investment. Making the change apply to the Reference as well as to the two books would entail the maintenance of further CSS.
This might be acceptable if documentation teams are sufficiently motivated and engaged, but it means that if not very carefully designed up front, any changes to the documentation theme will basically require double CSS changes; they will also require double the *design* efforts. # Alternatives [alternatives]: #alternatives +- **Just add the "How do we teach this?" section.** + + Of all the alternatives, this is the easiest (and probably the best). It does not substantially change the state with regard to the documentation, and even having the section in the RFC does not mean that it will end up added to the docs, as evidenced by the [`#[deprecated]` RFC][RFC 1270], which included as part of its text: + + > The language reference will be extended to describe this feature as outlined in this RFC. Authors shall be advised to leave their users enough time to react before removing a deprecated item. + + This is not a small downside by any stretch, but adding the section to the RFC will still have all the secondary benefits noted above, and it probably at least somewhat increases the likelihood that new features do get documented. + - **Embrace the documentation, but do not include "How do we teach this?" section in new RFCs.** This still gives us most of the benefits (and was in fact the original form of the proposal), and does not place a new burden on RFC authors to make sure that knowing how to *teach* something is part of any new language or standard library feature. @@ -222,9 +277,12 @@ For Rust to attain its goal of *stability without stagnation*, its documentation The main downside, of course, is that this would leave very large swaths of the language basically without *any* documentation, and even more of it only documented in RFCs than is the case today.
+[RFC 1270]: https://github.com/rust-lang/rfcs/pull/1270 + # Unresolved questions [unresolved]: #unresolved-questions +- How do we clearly distinguish between features on nightly, beta, and stable Rust—in the reference especially, but also in the book? - How will the requirement for documentation in the reference be enforced? - Given that the reference is out of date, does it need to be brought up to date before beginning enforcement of this policy? - For the standard library, once it migrates to a crates structure, should it simply include the `#[forbid(missing_docs)]` attribute on all crates to set this as a build error? From 57a04d25aa821bbe8fb203a40e5feb0eceaf424d Mon Sep 17 00:00:00 2001 From: Sean Griffin Date: Mon, 13 Jun 2016 12:25:34 -0400 Subject: [PATCH 0955/1195] Allow `Self` to appear in the where clause of trait impls --- text/0000-allow-self-in-where-clauses.md | 75 ++++++++++++++++++++++++ 1 file changed, 75 insertions(+) create mode 100644 text/0000-allow-self-in-where-clauses.md diff --git a/text/0000-allow-self-in-where-clauses.md b/text/0000-allow-self-in-where-clauses.md new file mode 100644 index 00000000000..64e0c2aee59 --- /dev/null +++ b/text/0000-allow-self-in-where-clauses.md @@ -0,0 +1,75 @@ +- Feature Name: `allow_self_in_where_clauses` +- Start Date: 2016-06-13 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +This RFC proposes allowing the `Self` type to be used in where clauses for trait +implementations, as well as referencing associated types for the trait being +implemented. + +# Motivation +[motivation]: #motivation + +`Self` is a useful tool to have to reduce churn when the type changes for +various reasons. One would expect to be able to write + +```rust +impl SomeTrait for MySuperLongType where + Self: SomeOtherTrait, +``` + +but this will fail to compile today, forcing you to repeat the type, and adding +one more place that has to change if the type ever changes. 
+ +By this same logic, we would also like to be able to reference associated types +from the traits being implemented. When dealing with generic code, patterns like +this often emerge: + +```rust +trait MyTrait { + type MyType: SomeBound; +} + +impl MyTrait for SomeStruct where + SomeOtherStruct: SomeBound, +{ + type MyType = SomeOtherStruct; +} +``` + +the only reason the associated type is repeated at all is to restate the bound +on the associated type. It would be nice to reduce some of that duplication. + +# Detailed design +[design]: #detailed-design + +The first half of this RFC is simple. Inside of a where clause for trait +implementations, `Self` will refer to the type the trait is being implemented +for. It will have the same value as `Self` being used in the body of the trait +implementation. + +Accessing associated types will have the same result as copying the body of the +associated type into the place where it's being used. That is to say that it +will assume that all constraints hold, and evaluate to what the type would have +been in that case. Ideally one should never have to write `::SomeType`, but in practice it will likely be required to remove +issues with recursive evaluation. 
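To make the first half of the design concrete, here is a minimal sketch of an impl whose where clause mentions `Self`. The names `Named`, `Describe`, and `Wrapper` are hypothetical and exist only for illustration, and the sketch assumes a compiler that implements this proposal; under the behavior described in the Motivation section it would be rejected.

```rust
use std::fmt::Display;

/// Hypothetical helper trait that the where clause below relies on.
trait Named {
    fn name(&self) -> String;
}

/// Hypothetical trait being implemented.
trait Describe {
    fn describe(&self) -> String;
}

/// A generic wrapper standing in for a type with a long name.
struct Wrapper<T>(T);

impl<T: Display> Named for Wrapper<T> {
    fn name(&self) -> String {
        self.0.to_string()
    }
}

// `Self` refers to `Wrapper<T>` here, so the type is not spelled out a
// second time in the bound.
impl<T> Describe for Wrapper<T>
where
    Self: Named,
{
    fn describe(&self) -> String {
        format!("wrapper around {}", self.name())
    }
}
```

If `Wrapper` later gains or loses a type parameter, only the impl header has to change; the bound written in terms of `Self` stays as it is.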
+ +# Drawbacks +[drawbacks]: #drawbacks + +`Self` is always less explicit than the alternative + +# Alternatives +[alternatives]: #alternatives + +Not implementing this, or only allowing bare `Self` but not associated types in +where clauses + +# Unresolved questions +[unresolved]: #unresolved-questions + +None From c40bd5fcc786459fb94ea2b416efc7ebaa10fc50 Mon Sep 17 00:00:00 2001 From: Guillaume Gomez Date: Fri, 1 Apr 2016 18:52:00 +0200 Subject: [PATCH 0956/1195] Normalization for long error codes explanations RFC --- ...g-error-codes-explanation-normalization.md | 116 ++++++++++++++++++ 1 file changed, 116 insertions(+) create mode 100644 text/0000-long-error-codes-explanation-normalization.md diff --git a/text/0000-long-error-codes-explanation-normalization.md b/text/0000-long-error-codes-explanation-normalization.md new file mode 100644 index 00000000000..f57ee61fce5 --- /dev/null +++ b/text/0000-long-error-codes-explanation-normalization.md @@ -0,0 +1,116 @@ + +Start Date: 2016-01-04 + +RFC PR: + +Rust Issue: N/A + +# Summary + +Long error codes explanations haven't been normalized yet. This RFC intends to do it in order to uniformize them. + +# Motivation + +Long error codes explanations are a very important part of Rust. Having an explanation of what failed helps to understand the error and is appreciated by Rust developers of all skill levels. Providing an unified template is needed in order to help people who would want to write ones as well as people who read them. + +# Detailed design + +Here is the template I propose: + +## First point + +Giving a little more detailed error message. For example, the `E0109` says "type parameters are not allowed on this type" and the error explanation says: "You tried to give a type parameter to a type which doesn't need it.". + +## Second point + +Giving an erroneous code example which directly follows `First point`. It'll be helpful for the `Forth point`. 
Making it as simple as possible is really important in order to help readers to understand what the error is about. A comment should be added with the error on the same line that the errors happen. Example: + + ```Rust + type X = u32; // error: type parameters are not allowed on this type + ``` + + If the error comments is too long to fit 80 columns, split it up like this, so the next line start at the same column of the previous line: + + ```Rust + type X = u32<'static>; // error: lifetime parameters are not allowed on + // this type + ``` + + And if the code line is just too long to make a correct comment, put your comment before it: + +```Rust +// error: lifetime parameters are not allowed on this type +fn super_long_function_name_and_thats_problematic() {} +``` + +Of course, it the comment is too long, the split rules still applies. + +## Third point + +Providing a full explanation about "__why__ you get the error" and some leads on __how__ to fix it. If needed, add little code examples to improve your explanations. + +## Fourth point + +This part will show how to fix the error that we saw previously in the `Second point`, with comments explaining how it was fixed. + +## Fifth point + +Some details which might be useful for the users, let's take back `E0109` example. At the end, the supplementary explanation is the following: "Note that type parameters for enum-variant constructors go after the variant, not after the enum (`Option::None::`, not `Option::::None`).". It provides more information, not directly linked to the error, but it might help user to avoid doing another error. + +## Template + +So in final, it should like this: + +```Rust +E000: r##" +[First point] Example of erroneous code: + +\```compile_fail +[Second point] +\``` + +[Third point] + +\``` +[Fourth point] +\``` + +[Optional Fifth point] +``` + +Now let's take a full example: + +> E0264: r##" +> An unknown external lang item was used. 
Example of erroneous code: +> +> ```compile_fail +> #![feature(lang_items)] +> extern "C" { +> #[lang = "cake"] // error: unknown external lang item: `cake` +> fn cake(); +> } +> ``` +> +> A list of available external lang items is available in +> `src/librustc/middle/weak_lang_items.rs`. Example: +> +> ``` +> #![feature(lang_items)] +> extern "C" { +> #[lang = "panic_fmt"] // ok! +> fn cake(); +> } +> ``` +> "##, + +# Drawbacks + +None. + +# Alternatives + +Not having error codes explanations uniformized. + +# Unresolved questions + +None. From 03e5b13763e1145d7a3a230f4d4c04506b960cc7 Mon Sep 17 00:00:00 2001 From: Guillaume Gomez Date: Thu, 5 May 2016 01:26:46 +0200 Subject: [PATCH 0957/1195] Update titles --- ...g-error-codes-explanation-normalization.md | 121 +++++++++++------- 1 file changed, 74 insertions(+), 47 deletions(-) diff --git a/text/0000-long-error-codes-explanation-normalization.md b/text/0000-long-error-codes-explanation-normalization.md index f57ee61fce5..c1e5cdaa911 100644 --- a/text/0000-long-error-codes-explanation-normalization.md +++ b/text/0000-long-error-codes-explanation-normalization.md @@ -7,7 +7,7 @@ Rust Issue: N/A # Summary -Long error codes explanations haven't been normalized yet. This RFC intends to do it in order to uniformize them. +Rust has extend error messages that explain each error in more detail. We've been writing lots of them, which is good, but they're written in different styles, which is bad. This RFC intends to fix this inconsistency by providing a template for these long-form explanations to follow. # Motivation @@ -15,101 +15,128 @@ Long error codes explanations are a very important part of Rust. Having an expla # Detailed design -Here is the template I propose: +Here is what I propose: -## First point +## Error description -Giving a little more detailed error message. 
For example, the `E0109` says "type parameters are not allowed on this type" and the error explanation says: "You tried to give a type parameter to a type which doesn't need it.". +Provide a more detailed error message. For example: -## Second point +```rust +extern crate a; +extern crate b as a; +``` + +We get the `E0259` error code which says "an extern crate named `a` has already been imported in this module" and the error explanation says: "The name chosen for an external crate conflicts with another external crate that has been imported into the current module.". + +## Minimal example + +Provide an erroneous code example which directly follows `Error description`. The erroneous example will be helpful for the `How to fix the problem`. Making it as simple as possible is really important in order to help readers to understand what the error is about. A comment should be added with the error on the same line where the errors occur. Example: + +```rust +type X = u32; // error: type parameters are not allowed on this type +``` + +If the error comments is too long to fit 80 columns, split it up like this, so the next line start at the same column of the previous line: + +```rust +type X = u32<'static>; // error: lifetime parameters are not allowed on + // this type +``` -Giving an erroneous code example which directly follows `First point`. It'll be helpful for the `Forth point`. Making it as simple as possible is really important in order to help readers to understand what the error is about. A comment should be added with the error on the same line that the errors happen. 
Example: +And if the sample code is too long to write an effective comment, place your comment on the line before the sample code: - ```Rust - type X = u32; // error: type parameters are not allowed on this type - ``` - - If the error comments is too long to fit 80 columns, split it up like this, so the next line start at the same column of the previous line: - - ```Rust - type X = u32<'static>; // error: lifetime parameters are not allowed on - // this type - ``` - - And if the code line is just too long to make a correct comment, put your comment before it: - -```Rust +```rust // error: lifetime parameters are not allowed on this type fn super_long_function_name_and_thats_problematic() {} ``` - + Of course, it the comment is too long, the split rules still applies. -## Third point +## Error explanation -Providing a full explanation about "__why__ you get the error" and some leads on __how__ to fix it. If needed, add little code examples to improve your explanations. +Provide a full explanation about "__why__ you get the error" and some leads on __how__ to fix it. If needed, use additional code snippets to improve your explanations. -## Fourth point +## How to fix the problem -This part will show how to fix the error that we saw previously in the `Second point`, with comments explaining how it was fixed. +This part will show how to fix the error that we saw previously in the `Minimal example`, with comments explaining how it was fixed. -## Fifth point +## Additional information Some details which might be useful for the users, let's take back `E0109` example. At the end, the supplementary explanation is the following: "Note that type parameters for enum-variant constructors go after the variant, not after the enum (`Option::None::`, not `Option::::None`).". It provides more information, not directly linked to the error, but it might help user to avoid doing another error. 
## Template -So in final, it should like this: +In summary, the template looks like this: -```Rust +```rust E000: r##" -[First point] Example of erroneous code: +[Error description] + +Example of erroneous code: \```compile_fail -[Second point] +[Minimal example] \``` -[Third point] +[Error explanation] \``` -[Fourth point] +[How to fix the problem] \``` -[Optional Fifth point] +[Optional Additional information] ``` Now let's take a full example: -> E0264: r##" -> An unknown external lang item was used. Example of erroneous code: +> E0409: r##" +> An "or" pattern was used where the variable bindings are not consistently bound +> across patterns. +> +> Example of erroneous code: > > ```compile_fail -> #![feature(lang_items)] -> extern "C" { -> #[lang = "cake"] // error: unknown external lang item: `cake` -> fn cake(); +> let x = (0, 2); +> match x { +> (0, ref y) | (y, 0) => { /* use y */} // error: variable `y` is bound with +> // different mode in pattern #2 +> // than in pattern #1 +> _ => () > } > ``` > -> A list of available external lang items is available in -> `src/librustc/middle/weak_lang_items.rs`. Example: +> Here, `y` is bound by-value in one case and by-reference in the other. > +> To fix this error, just use the same mode in both cases. +> Generally using `ref` or `ref mut` where not already used will fix this: +> +> ```ignore +> let x = (0, 2); +> match x { +> (0, ref y) | (ref y, 0) => { /* use y */} +> _ => () +> } > ``` -> #![feature(lang_items)] -> extern "C" { -> #[lang = "panic_fmt"] // ok! -> fn cake(); +> +> Alternatively, split the pattern: +> +> ``` +> let x = (0, 2); +> match x { +> (y, 0) => { /* use y */ } +> (0, ref y) => { /* use y */} +> _ => () > } > ``` > "##, # Drawbacks -None. +This will make contributing slighty more complex, as there are rules to follow, whereas right now there are none. # Alternatives -Not having error codes explanations uniformized. +Not having error codes explanations following a common template. 
# Unresolved questions From 750edfb71a48bb00b6546c584a499afc00ffaf52 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Wed, 15 Jun 2016 04:43:35 -0400 Subject: [PATCH 0958/1195] "Long error code explanation" is RFC 1567 --- ...md => 1567-long-error-codes-explanation-normalization.md} | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) rename text/{0000-long-error-codes-explanation-normalization.md => 1567-long-error-codes-explanation-normalization.md} (97%) diff --git a/text/0000-long-error-codes-explanation-normalization.md b/text/1567-long-error-codes-explanation-normalization.md similarity index 97% rename from text/0000-long-error-codes-explanation-normalization.md rename to text/1567-long-error-codes-explanation-normalization.md index c1e5cdaa911..9e02eed52b4 100644 --- a/text/0000-long-error-codes-explanation-normalization.md +++ b/text/1567-long-error-codes-explanation-normalization.md @@ -1,9 +1,8 @@ Start Date: 2016-01-04 -RFC PR: - -Rust Issue: N/A +- RFC PR: [rust-lang/rfcs#1567](https://github.com/rust-lang/rfcs/pull/1567) +- Rust Issue: N/A # Summary From 6df4777d948bcae8554891823027d7ed85c0feb8 Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Wed, 15 Jun 2016 12:43:20 +0100 Subject: [PATCH 0959/1195] Add extra access methods for atomic types --- text/0000-atomic-access.md | 61 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 61 insertions(+) create mode 100644 text/0000-atomic-access.md diff --git a/text/0000-atomic-access.md b/text/0000-atomic-access.md new file mode 100644 index 00000000000..128eff40a7a --- /dev/null +++ b/text/0000-atomic-access.md @@ -0,0 +1,61 @@ +- Feature Name: atomic_access +- Start Date: 2016-06-15 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Add the following methods to atomic types: + +```rust +impl AtomicT { + fn get_mut(&mut self) -> &mut T; + fn into_inner(self) -> T; + fn as_raw(&self) -> *mut T; + unsafe fn from_raw(ptr: *mut T) -> &AtomicT; +} 
+``` + +# Motivation +[motivation]: #motivation + +## `get_mut` and `into_inner` + +These methods are useful for accessing the value inside an atomic object directly when there are no other threads accessing it. This is guaranteed by the mutable reference and the move, since it means there can be no other live references to the atomic. + +A normal load/store is different from a `load(Relaxed)` or `store(Relaxed)` because it has much weaker synchronization guarantees, which means that the compiler can produce more efficient code. In particular, LLVM currently treats all atomic operations (even relaxed ones) as volatile operations, which means that it does not perform any optimizations on them. For example, it will not eliminate a `load(Relaxed)` even if the results of the load is not used anywhere. + +`get_mut` in particular is expected to be useful in `Drop` implementations where you have a `&mut self` and need to read the value of an atomic. `into_inner` somewhat overlaps in functionality with `get_mut`, but it is included to allow extracting the value without requiring the atomic object to be mutable. These methods mirror `Mutex::get_mut` and `Mutex::into_inner`. + +## `as_raw` and `from_raw` + +These methods are mainly intended to be used for FFI, where a variable of a non-atomic type needs to be modified atomically. The most common example of this is the Linux `futex` system call which takes an `int*` parameter pointing to an integer that is atomically modified by both userspace and the kernel. + +Rust code invoking the `futex` system call so far has simply passed the address of the atomic object directly to the system call. However this makes the assumption that the atomic type has the same layout as the underlying integer type. Using `as_raw` instead makes it clear that the resulting pointer will point to the integer value inside the atomic object. 
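The pointer-passing pattern described above can be sketched as follows. This is only an illustration under stated assumptions: it uses `AtomicI32::as_ptr` from the current standard library, which has the same shape as the `as_raw` method proposed here (the sketch assumes a toolchain that provides it), and `foreign_store` is a hypothetical stand-in for a real foreign routine such as a system call wrapper.

```rust
use std::sync::atomic::{AtomicI32, Ordering};

/// Hypothetical stand-in for a foreign routine that takes an `int*`.
///
/// # Safety
/// `word` must be valid for reads and writes, and nothing else may access
/// it while this function runs.
unsafe fn foreign_store(word: *mut i32, value: i32) {
    unsafe { *word = value }
}

/// Passes the integer inside the atomic to the foreign routine and then
/// reads the result back through the atomic itself.
fn store_through_pointer(counter: &AtomicI32, value: i32) -> i32 {
    // The pointer refers to the integer stored inside the atomic, so no
    // assumption about the layout of `AtomicI32` is written down here.
    let raw: *mut i32 = counter.as_ptr();
    // Sound in this sketch because no other thread touches `counter`.
    unsafe { foreign_store(raw, value) };
    counter.load(Ordering::SeqCst)
}
```

The only `unsafe` step is the call into the foreign routine; obtaining the pointer is an ordinary safe method call.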
+ +`from_raw` provides the reverse operation: it allows Rust code to atomically modify a value that was not declared as an atomic type. This is useful when dealing with FFI structs that are shared with a thread managed by a C library. Another example would be to atomically modify a value in a memory-mapped file that is shared with another process. + +# Detailed design +[design]: #detailed-design + +The actual implementations of these functions are mostly trivial since they are based on `UnsafeCell::get`. The only exception is `from_raw` which will cast the given pointer to a different type, but that should also be fine. + +# Drawbacks +[drawbacks]: #drawbacks + +The functionality of `into_inner` somewhat overlaps with `get_mut`. + +`from_raw` returns an unbounded lifetime. + +# Alternatives +[alternatives]: #alternatives + +The functionality of `get_mut` and `into_inner` can be implemented using `load(Relaxed)`; however, the latter can result in worse code because it is poorly handled by the optimizer. + +The functionality of `as_raw` and `from_raw` could be achieved using transmutes instead; however, this requires making assumptions about the internal layout of the atomic types. + +# Unresolved questions +[unresolved]: #unresolved-questions + +None From b966a634b1fccdaab2501be2db2e7accbff50ba7 Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Wed, 15 Jun 2016 13:02:51 +0100 Subject: [PATCH 0960/1195] Extend Cell to non-Copy types --- text/0000-movecell.md | 54 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 54 insertions(+) create mode 100644 text/0000-movecell.md diff --git a/text/0000-movecell.md b/text/0000-movecell.md new file mode 100644 index 00000000000..e5f4ba7ee25 --- /dev/null +++ b/text/0000-movecell.md @@ -0,0 +1,54 @@ +- Feature Name: move_cell +- Start Date: 2016-06-15 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Extend `Cell` to work with non-`Copy` types.
+ +# Motivation +[motivation]: #motivation + +It allows safe inner-mutability of non-`Copy` types without the overhead of `RefCell`'s reference counting. + +# Detailed design +[design]: #detailed-design + +```rust +impl Cell { + fn set(&mut self, val: T); + fn replace(&mut self, val: T) -> T; + fn into_inner(self) -> T; +} + +impl Cell { + fn get(&self); +} + +impl Cell { + fn take(&mut self) -> T; +} +``` + +The `get` method is kept but is only available for `T: Copy`. The `set` method is available for all `T`. + +The `into_inner` and `replace` methods are added, which allow the value in a cell to be read even if `T` is not `Copy`. The `get` method can't be used since the cell must always contain a valid value. + +Finally, a `take` method is added which is equivalent to `self.replace(Default::default())`. + +# Drawbacks +[drawbacks]: #drawbacks + +It makes the `Cell` type more complicated. + +# Alternatives +[alternatives]: #alternatives + +The alternative is to use the `MoveCell` type from crates.io which provides the same functionality. 
+ +# Unresolved questions +[unresolved]: #unresolved-questions + +None From 71898f58668aeb6c7a6775b77dc5407ece58124c Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Thu, 16 Jun 2016 02:20:42 +0100 Subject: [PATCH 0961/1195] Fix reference types --- text/0000-movecell.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/0000-movecell.md b/text/0000-movecell.md index e5f4ba7ee25..a90c6e36265 100644 --- a/text/0000-movecell.md +++ b/text/0000-movecell.md @@ -18,8 +18,8 @@ It allows safe inner-mutability of non-`Copy` types without the overhead of `Ref ```rust impl Cell { - fn set(&mut self, val: T); - fn replace(&mut self, val: T) -> T; + fn set(&self, val: T); + fn replace(&self, val: T) -> T; fn into_inner(self) -> T; } @@ -28,7 +28,7 @@ impl Cell { } impl Cell { - fn take(&mut self) -> T; + fn take(&self) -> T; } ``` From 4c92f15f927dc80ffa5486819c2a21dc6d4414e6 Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Thu, 16 Jun 2016 10:12:58 +0100 Subject: [PATCH 0962/1195] Fix typo and add a drawback --- text/0000-movecell.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-movecell.md b/text/0000-movecell.md index a90c6e36265..95e9f8422f2 100644 --- a/text/0000-movecell.md +++ b/text/0000-movecell.md @@ -24,7 +24,7 @@ impl Cell { } impl Cell { - fn get(&self); + fn get(&self) -> T; } impl Cell { @@ -43,6 +43,8 @@ Finally, a `take` method is added which is equivalent to `self.replace(Default:: It makes the `Cell` type more complicated. +`Cell` will only be able to derive traits like `Eq` and `Ord` for types that are `Copy`, since there is no way to non-destructively read the contents of a non-`Copy` `Cell`. 
+ # Alternatives [alternatives]: #alternatives From 131ea36ddb554df37a666e2d49b2617202d999e3 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Thu, 16 Jun 2016 13:52:30 +0100 Subject: [PATCH 0963/1195] Merge RFC 1590: macro lifetimes --- text/{0000-macro-lifetimes.md => 1590-macro-lifetimes.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-macro-lifetimes.md => 1590-macro-lifetimes.md} (93%) diff --git a/text/0000-macro-lifetimes.md b/text/1590-macro-lifetimes.md similarity index 93% rename from text/0000-macro-lifetimes.md rename to text/1590-macro-lifetimes.md index 0580cd0ab31..38b92d51477 100644 --- a/text/0000-macro-lifetimes.md +++ b/text/1590-macro-lifetimes.md @@ -1,7 +1,7 @@ - Feature Name: Allow `lifetime` specifiers to be passed to macros - Start Date: 2016-04-22 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1590 +- Rust Issue: https://github.com/rust-lang/rust/issues/34303 # Summary [summary]: #summary From 7205164667cfe34878c4ed719bc331a2c68be141 Mon Sep 17 00:00:00 2001 From: Ashley Williams Date: Fri, 17 Jun 2016 11:53:43 +0100 Subject: [PATCH 0964/1195] propose assert_ne --- text/0000-assert_ne.md | 57 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 57 insertions(+) create mode 100644 text/0000-assert_ne.md diff --git a/text/0000-assert_ne.md b/text/0000-assert_ne.md new file mode 100644 index 00000000000..23e683e2782 --- /dev/null +++ b/text/0000-assert_ne.md @@ -0,0 +1,57 @@ +- Feature Name: Assert Not Equals Macro (`assert_ne`) +- Start Date: (2016-06-17) +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +`assert_ne` is a macro that takes 2 arguments and panics if they are equal. It +works and is implemented identically to `assert_eq` and serves as its compliment. 
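As a brief illustration of how the macro reads in a test, here is a minimal sketch. It assumes a toolchain in which `assert_ne!` is available from the standard library; `next_id` and `ids_differ` are hypothetical names used only for this example.

```rust
/// Hypothetical function under test: produces the identifier that follows
/// `current`.
fn next_id(current: u32) -> u32 {
    current.wrapping_add(1)
}

/// Checks that a fresh identifier differs from the one it was derived from.
fn ids_differ(before: u32) -> bool {
    let after = next_id(before);
    // Panics, printing both values, if the two sides compare equal.
    assert_ne!(before, after);
    true
}
```

Without the macro the same check would be `assert!(before != after)`, which does not report the two values on failure.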
+ +# Motivation +[motivation]: #motivation + +This feature, among other reasons, makes testing more readable and consistent as +it compliments `asset_eq`. It gives the same style panic message as `assert_eq`, +which eliminates the need to write it yourself. + +# Detailed design +[design]: #detailed-design + +This feature has exactly the same design and implementation as `assert_eq`. + +Here is the definition: + +```rust +macro_rules! assert_ne { + ($left:expr , $right:expr) => ({ + match (&$left, &$right) { + (left_val, right_val) => { + if *left_val == *right_val { + panic!("assertion failed: `(left !== right)` \ + (left: `{:?}`, right: `{:?}`)", left_val, right_val) + } + } + } + }) +} +``` + +# Drawbacks +[drawbacks]: #drawbacks + +Any addition to the standard library will need to be maintained forever, so it is +worth weighing the maintenance cost of this over the value add. Given that it is so +similar to `assert_eq`, I believe the weight of this drawback is low. + +# Alternatives +[alternatives]: #alternatives + +Alternatively, users implement this feature themselves, or use the crate `assert_ne` +that I published. + +# Unresolved questions +[unresolved]: #unresolved-questions + +None at this moment. From 7d30463eb3a5048ab6cf0d130c0585297584b51b Mon Sep 17 00:00:00 2001 From: Ashley Williams Date: Fri, 17 Jun 2016 14:19:00 +0100 Subject: [PATCH 0965/1195] s/compliment/complement --- text/0000-assert_ne.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-assert_ne.md b/text/0000-assert_ne.md index 23e683e2782..530fad074dc 100644 --- a/text/0000-assert_ne.md +++ b/text/0000-assert_ne.md @@ -7,13 +7,13 @@ [summary]: #summary `assert_ne` is a macro that takes 2 arguments and panics if they are equal. It -works and is implemented identically to `assert_eq` and serves as its compliment. +works and is implemented identically to `assert_eq` and serves as its complement. 
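As a rough usage sketch (not part of the RFC text), the macro would be called the same way as `assert_eq`. The helper `next_id` is made up for illustration, and the example assumes a toolchain where `assert_ne!` is available:

```rust
use std::panic;

/// Hypothetical helper used only for this illustration.
fn next_id(current: u32) -> u32 {
    current + 1
}

fn main() {
    // Passes: the two values differ.
    assert_ne!(next_id(1), 1);

    // Panics when both sides are equal; catch the panic to observe it.
    let result = panic::catch_unwind(|| {
        assert_ne!(next_id(1), 2);
    });
    assert!(result.is_err());
}
```

The failing case produces a panic message in the same style as `assert_eq`, which is the consistency benefit the summary describes.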
# Motivation [motivation]: #motivation This feature, among other reasons, makes testing more readable and consistent as -it compliments `asset_eq`. It gives the same style panic message as `assert_eq`, +it complements `asset_eq`. It gives the same style panic message as `assert_eq`, which eliminates the need to write it yourself. # Detailed design From 76c59969265223815b14fb83bccdfd5adc285f6c Mon Sep 17 00:00:00 2001 From: Ashley Williams Date: Fri, 17 Jun 2016 14:23:54 +0100 Subject: [PATCH 0966/1195] replace !== with != --- text/0000-assert_ne.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-assert_ne.md b/text/0000-assert_ne.md index 530fad074dc..0be68e70a84 100644 --- a/text/0000-assert_ne.md +++ b/text/0000-assert_ne.md @@ -29,7 +29,7 @@ macro_rules! assert_ne { match (&$left, &$right) { (left_val, right_val) => { if *left_val == *right_val { - panic!("assertion failed: `(left !== right)` \ + panic!("assertion failed: `(left != right)` \ (left: `{:?}`, right: `{:?}`)", left_val, right_val) } } From 8e76a2f86ac5627c82325da4a85b269017767163 Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Sat, 18 Jun 2016 09:03:01 +0100 Subject: [PATCH 0967/1195] Clarify implementation of set() --- text/0000-movecell.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-movecell.md b/text/0000-movecell.md index 95e9f8422f2..41682f0fba6 100644 --- a/text/0000-movecell.md +++ b/text/0000-movecell.md @@ -32,7 +32,9 @@ impl Cell { } ``` -The `get` method is kept but is only available for `T: Copy`. The `set` method is available for all `T`. +The `get` method is kept but is only available for `T: Copy`. + +The `set` method is available for all `T`. It will need to be implemented by calling `replace` and dropping the returned value. Dropping the old value in-place is unsound since the `Drop` impl will hold a mutable reference to the cell contents. 
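The relationship between `set` and `replace` described above can be sketched as follows. This is an illustration only, assuming a `std::cell::Cell` that already provides `replace`, `take` and `into_inner` as proposed; `set_via_replace` is a hypothetical free function, not the actual standard library implementation:

```rust
use std::cell::Cell;

/// Hypothetical helper: stores `val` by swapping it in with `replace`
/// and dropping the old value only after it has left the cell.
fn set_via_replace<T>(cell: &Cell<T>, val: T) {
    let old = cell.replace(val);
    drop(old);
}

fn main() {
    // `String` is not `Copy`, so `get` is unavailable for this cell.
    let cell = Cell::new(String::from("first"));

    set_via_replace(&cell, String::from("second"));

    // `take` leaves `Default::default()` (an empty string) behind.
    let taken = cell.take();
    assert_eq!(taken, "second");

    // `into_inner` consumes the cell and returns what is left.
    assert_eq!(cell.into_inner(), "");
}
```

Because the old value is dropped outside the cell, its `Drop` impl never runs while it could still observe the cell contents.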
The `into_inner` and `replace` methods are added, which allow the value in a cell to be read even if `T` is not `Copy`. The `get` method can't be used since the cell must always contain a valid value. From d7a7d5b600042ca8cbeb0419fc1485116e58644a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?G=C3=A1bor=20Lehel?= Date: Mon, 20 Jun 2016 18:18:45 +0200 Subject: [PATCH 0968/1195] Update link {discuss -> internals}.rust-lang.org in README --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index cc04fd9bb25..2a9c2ad4da8 100644 --- a/README.md +++ b/README.md @@ -147,7 +147,7 @@ project developers, and particularly members of the relevant [sub-team] is a good indication that the RFC is worth pursuing. [issues]: https://github.com/rust-lang/rfcs/issues -[discuss]: http://discuss.rust-lang.org/ +[discuss]: http://internals.rust-lang.org/ ## What the process is From 1ebc429c77b797fa7c7e8fa65dd7e31ec69f60f1 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Mon, 20 Jun 2016 13:23:52 -0400 Subject: [PATCH 0969/1195] more nits --- text/0000-more-api-documentation-conventions.md | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/text/0000-more-api-documentation-conventions.md b/text/0000-more-api-documentation-conventions.md index d4f525337ca..4d6cf094b8d 100644 --- a/text/0000-more-api-documentation-conventions.md +++ b/text/0000-more-api-documentation-conventions.md @@ -167,7 +167,7 @@ documentation for more details." Instead, module-level documentation should show a high-level summary of everything in the module, and each type should document itself fully. It is okay if there is some small amount of duplication here. Module-level -documentation should be broad, and not go into a lot of detail, which is left +documentation should be broad and not go into a lot of detail. That is left to the type's documentation. 
## Example @@ -410,7 +410,7 @@ mod test { Within doc comments, use Markdown to format your documentation. -Use top level headings # to indicate sections within your comment. Common headings: +Use top level headings (`#`) to indicate sections within your comment. Common headings: * Examples * Panics @@ -419,12 +419,18 @@ Use top level headings # to indicate sections within your comment. Common headin * Aborts * Undefined Behavior +An example: + +```rust +/// # Examples +``` + Even if you only include one example, use the plural form: ‘Examples’ rather than ‘Example’. Future tooling is easier this way. Use backticks (`) to denote a code fragment within a sentence. -Use backticks (```) to write longer examples, like this: +Use triple backticks (```) to write longer examples, like this: This code does something cool. From 89002e7c8f90b9630fb89b13dd4a67e7e3495e33 Mon Sep 17 00:00:00 2001 From: Wang Xuerui Date: Thu, 23 Jun 2016 01:56:53 +0800 Subject: [PATCH 0970/1195] ergonomic-format-args: address review comments * feature name seems not applicable as this is not something to be gated * updated to reflect latest implementation * removed most details for higher-level overview * reworded slightly --- text/0000-ergonomic-format-args.md | 108 +++++++++-------------------- 1 file changed, 34 insertions(+), 74 deletions(-) diff --git a/text/0000-ergonomic-format-args.md b/text/0000-ergonomic-format-args.md index 03438277ab5..fa4cb12e1d2 100644 --- a/text/0000-ergonomic-format-args.md +++ b/text/0000-ergonomic-format-args.md @@ -1,4 +1,4 @@ -- Feature Name: `ergonomic_format_args` +- Feature Name: (not applicable) - Start Date: 2016-05-17 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) @@ -7,6 +7,9 @@ [summary]: #summary Removes the one-type-only restriction on `format_args!` arguments. +Expressions like `format_args!("{0:x} {0:o}", foo)` now work as intended, +where each argument is still evaluated only once, in order of appearance +(i.e. left-to-right). 
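A minimal sketch of the behavior the summary describes, assuming a compiler that includes this change. The function `hex_and_octal` and the call counter are illustrative names, not part of the RFC:

```rust
use std::cell::Cell;

/// Formats one argument with two different format traits.
fn hex_and_octal(n: u32) -> String {
    format!("{0:x} {0:o}", n)
}

fn main() {
    assert_eq!(hex_and_octal(255), "ff 377");

    // Count how often the argument expression is evaluated.
    let calls = Cell::new(0u32);
    let next = || {
        calls.set(calls.get() + 1);
        255u32
    };

    let text = format!("{0:x} {0:o}", next());
    assert_eq!(text, "ff 377");
    // The argument is used by two placeholders but evaluated once.
    assert_eq!(calls.get(), 1);
}
```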
# Motivation [motivation]: #motivation @@ -29,8 +32,6 @@ so the mechanism as a whole certainly needs more love. # Detailed design [design]: #detailed-design -## Overview - Formatting is done during both compile-time (expansion-time to be pedantic) and runtime in Rust. As we are concerned with format string parsing, not outputting, this RFC only touches the compile-time side of the existing @@ -39,17 +40,22 @@ formatting mechanism which is `libsyntax_ext` and `libfmt_macros`. Before continuing with the details, it is worth noting that the core flow of current Rust formatting is *mapping arguments to placeholders to format specs*. For clarity, we distinguish among *placeholders*, *macro arguments* and -*generated `ArgumentV1` objects*. They are all *italicized* to provide some +*argument objects*. They are all *italicized* to provide some visual hint for distinction. -To implement the proposed design, first we resolve all implicit references to -the next argument (*next-references* for short) during parse; then we modify -the macro expansion to make use of the now explicit argument references, -preserving the mapping. +To implement the proposed design, the following changes in behavior are made: + +* implicit references are resolved during parse of format string; +* named *macro arguments* are resolved into positional ones; +* placeholder types are remembered and de-duplicated for each *macro argument*, +* the *argument objects* are emitted with information gathered in steps above. -## Parse-time next-reference resolution +As most of the details is best described in the code itself, we only +illustrate some of the high-level changes below. -Currently two forms of next-references exist: `ArgumentNext` and +## Implicit reference resolution + +Currently two forms of implicit references exist: `ArgumentNext` and `CountIsNextParam`. 
Both take a positional *macro argument* and advance the same internal pointer, but format is parsed before position, as shown in format strings like `"{foo:.*} {} {:.*}"` which is in every way equivalent to @@ -58,68 +64,23 @@ format strings like `"{foo:.*} {} {:.*}"` which is in every way equivalent to As the rule is already known even at compile-time, and does not require the whole format string to be known beforehand, the resolution can happen just inside the parser after a *placeholder* is successfully parsed. As a natural -consequence, both forms of next-reference can be removed from the rest of the -compiler, simplifying work later. - -## Expansion-time argument mapping - -There are two kinds of *macro arguments*, positional and named. Because of the -apparent type difference, two maps are needed to track *placeholder* types -(known as `ArgumentType`s in the code). In the current implementation, -`Vec>` is for positional *macro arguments* and -`HashMap` is for named *macro arguments*, apparently -neither of which supports multiple types for one *macro argument*. Also, for -constructing the `__STATIC_FMTARGS` we need to first figure out the order for -every *placeholder* in the list of *generated `ArgumentV1` objects*. So we -first classify *placeholders* according to their associated *macro arguments*, -which are all explicit now, then assign each of them a correct index. - -### Placeholder type collection - -In the proposed design, lists of `ArgumentType`s are used to store -*placeholder* types for each *macro argument* in order. During verification -the *placeholder* type seen for a *macro argument* is simply pushed into the -respective list. This does not remove the ability to sense unused -*macro arguments*, as the list would simply be empty when checked later, just -as it would be `None` in the old `Option` version. 
- -### Mapping construction - -For consistency with the current implementation, named *macro arguments* are -still put at the end of *generated `ArgumentV1` objects*. Which means we have -to consume all of format string in order to know how many *placeholders* there -are referencing to positional *macro arguments*. As such, the verification -and translation of pieces are now separated with mapping construction in -between. - -Obviously, the orders used during mapping and actual expansion must agree, but -fortunately the rules are very simple now only explicit references remain. -We iterate over the list of known positional *macro arguments*, recording the -index at which every bunch of *generated `ArgumentV1` objects* would begin for -each positional *macro argument*. After that, we also record the total number -for mapping the named *macro arguments*, as the relative offsets of named -*placeholders* are already recorded during verification. - -### Expansion - -With mapping between *placeholders* and *generated `ArgumentV1` objects* -ready at hand, it is easy to emit correct `Argument`s. Scratch space is -provided to `trans_piece` for remembering how many *placeholders* for a given -*macro argument* have been processed. This information is then used to rewrite -all references from using *macro argument* indices to -*generated `ArgumentV1` object* indices, namely: - -* `ArgumentIs(i)` -* `ArgumentNamed(n)` -* `CountIsParam(i)` -* `CountIsName(n)` - -For the count references, some may suggest that they are now potentially -ambiguous. However considering the implementation of `verify_count`, the -parameter used by each `Count` is individually injected into the list of -*generated `ArgumentV1` objects* as if it were explicitly specified. Also it -is *macro arguments* to be referenced, not the potentially multiple -*placeholders*, so there are in fact no ambiguities. +consequence, both forms can be removed from the rest of the compiler, +simplifying work later. 
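To make the implicit-to-explicit rewriting more concrete, here is a small sketch seen from the user's side. It only shows that an implicit format string and its hand-written explicit counterpart produce the same output; the function names are hypothetical and nothing here reflects the parser's internal data structures:

```rust
/// Uses implicit references: `.*` consumes the next two positional
/// arguments (precision, then value), and `{}` takes the one after.
fn implicit(value: f64) -> String {
    format!("{:.*} {}", 2, value, "x")
}

/// The same format string with every reference written explicitly.
fn explicit(value: f64) -> String {
    format!("{1:.0$} {2}", 2, value, "x")
}

fn main() {
    let a = implicit(1.23456);
    let b = explicit(1.23456);
    assert_eq!(a, b);
    assert_eq!(a, "1.23 x");
}
```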
+ +## Named argument resolution + +Not seen elsewhere in Rust, named arguments in format macros are best seen as +syntactic sugar, and we'd better actually treat them as such. Just after +successfully parsing the *macro arguments*, we immediately rewrite every name +to its respective position in the argument list, which again simplifies the +process. + +## Processing and expansion + +We only have absolute positional references to *macro arguments* at this point, +and it's straightforward to remember all unique *placeholders* encountered for +each. The unique *placeholders* are emitted into *argument objects* in order, +to preserve evaluation order, but no difference in behavior otherwise. # Drawbacks [drawbacks]: #drawbacks @@ -182,5 +143,4 @@ ergonomics is simply bad and the code becomes unnecessarily convoluted. # Unresolved questions [unresolved]: #unresolved-questions -* Does the *generated `ArgumentV1` objects* need deduplication? -* Will it break the ABI if handling of next-references in `libcore/fmt` is removed as well? +None. From fd51b8578ede81bef785e2290bd13187ea27185d Mon Sep 17 00:00:00 2001 From: Felix S Klock II Date: Thu, 23 Jun 2016 23:41:33 +0200 Subject: [PATCH 0971/1195] added feature gate for attributes on generics --- text/0000-dropck-param-eyepatch.md | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/text/0000-dropck-param-eyepatch.md b/text/0000-dropck-param-eyepatch.md index c5e5c6eedb0..2a1c7593555 100644 --- a/text/0000-dropck-param-eyepatch.md +++ b/text/0000-dropck-param-eyepatch.md @@ -1,4 +1,4 @@ -- Feature Name: dropck_eyepatch +- Feature Name: dropck_eyepatch, generic_param_attrs - Start Date: 2015-10-19 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) @@ -19,6 +19,9 @@ and type paramters). Atop that capability, this RFC proposes adding a holds data that must not be accessed during the dynamic extent of that `drop` invocation. 
+As a side-effect, enable adding attributes to the formal declarations +of generic type and lifetime parameters. + [RFC 1238]: https://github.com/rust-lang/rfcs/blob/master/text/1238-nonparametric-dropck.md [RFC 769]: https://github.com/rust-lang/rfcs/blob/master/text/0769-sound-generic-drop.md @@ -140,7 +143,7 @@ storage for [cyclic graph structures][dropck_legal_cycles.rs]). 1. Add the ability to attach attributes to syntax that binds formal lifetime or type parmeters. For the purposes of this RFC, the only place in the syntax that requires such attributes are `impl` - blocks, as in `impl Drop for Type { ... }` + blocks, as in `impl Drop for Type { ... }` 2. Add a new fine-grained attribute, `may_dangle`, which is attached to the binding sites for lifetime or type parameters on an `Drop` @@ -163,6 +166,8 @@ storage for [cyclic graph structures][dropck_legal_cycles.rs]). This is a simple extension to the syntax. +It is guarded by the feature gate `generic_param_attrs`. + Constructions like the following will now become legal. Example of eyepatch attribute on a single type parameter: @@ -212,6 +217,8 @@ unsafe impl<'a, X, Y> Drop for Foo<'a, #[may_dangle] X, Y> { Add a new attribute, `#[may_dangle]` (the "eyepatch"). +It is guarded by the feature gate `dropck_eyepatch`. 
+ The eyepatch is similar to `unsafe_destructor_blind_to_params`: it is part of the `Drop` implementation, and it is meant to assert that a destructor is guaranteed not to access certain kinds From ef11dc096a7d10ae4157f193eddaec0c2314c295 Mon Sep 17 00:00:00 2001 From: Felix S Klock II Date: Thu, 23 Jun 2016 23:46:52 +0200 Subject: [PATCH 0972/1195] remove now-meaningless paragraph --- text/0000-dropck-param-eyepatch.md | 4 ---- 1 file changed, 4 deletions(-) diff --git a/text/0000-dropck-param-eyepatch.md b/text/0000-dropck-param-eyepatch.md index 2a1c7593555..b72c658d784 100644 --- a/text/0000-dropck-param-eyepatch.md +++ b/text/0000-dropck-param-eyepatch.md @@ -464,10 +464,6 @@ reflected in what he wrote in the [RFC 1238 alternatives][].) ## Make dropck "see again" via (focused) where-clauses -(This alternative carries over some ideas from -[the previous section][blacklist-not-whitelist], but it stands well on -its own as something to consider, so I am giving it its own section.) - The idea is that we keep the UGEH attribute, blunt hammer that it is. You first opt out of the dropck ordering constraints via that, and then you add back in ordering constraints via `where` clauses. From dc053829d766053b6bbc52aed16e73f22d09d43c Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 24 Jun 2016 09:01:17 -0700 Subject: [PATCH 0973/1195] RFC 1618 is ergonomic format_args! 
--- ...ergonomic-format-args.md => 1618-ergonomic-format-args.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-ergonomic-format-args.md => 1618-ergonomic-format-args.md} (97%) diff --git a/text/0000-ergonomic-format-args.md b/text/1618-ergonomic-format-args.md similarity index 97% rename from text/0000-ergonomic-format-args.md rename to text/1618-ergonomic-format-args.md index fa4cb12e1d2..7178da14b96 100644 --- a/text/0000-ergonomic-format-args.md +++ b/text/1618-ergonomic-format-args.md @@ -1,7 +1,7 @@ - Feature Name: (not applicable) - Start Date: 2016-05-17 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1618](https://github.com/rust-lang/rfcs/pull/1618) +- Rust Issue: [rust-lang/rust#33642](https://github.com/rust-lang/rust/pull/33642) # Summary [summary]: #summary From b957f6306a5235dea72f3307fe404e091d2a0957 Mon Sep 17 00:00:00 2001 From: Jonathan Turner Date: Fri, 24 Jun 2016 13:25:10 -0400 Subject: [PATCH 0974/1195] Intermediate step before adding images back --- .../0000-default-and-expanded-rustc-errors.md | 160 ++++++++++++++---- 1 file changed, 125 insertions(+), 35 deletions(-) diff --git a/text/0000-default-and-expanded-rustc-errors.md b/text/0000-default-and-expanded-rustc-errors.md index d9bd506f545..8aff178b551 100644 --- a/text/0000-default-and-expanded-rustc-errors.md +++ b/text/0000-default-and-expanded-rustc-errors.md @@ -6,15 +6,32 @@ # Summary This RFC proposes an update to error reporting in rustc. Its focus is to change the format of Rust error messages and --explain text to focus on the user's code. The end goal is for errors and explain text to be more readable, more friendly to new users, while still helping Rust coders fix bugs as quickly as possible. We expect to follow this RFC with a supplemental RFC that provides a writing style guide for error messages and explain text with a focus on readability and education. 
-This RFC details work in close collaboration with Niko Matsakis and Yehuda Katz, with input from Aaron Turon and Alex Crichton. Special thanks to those who gave us feedback on previous iterations of the proposal. - # Motivation ## Default error format Rust offers a unique value proposition in the landscape of languages in part by codifying concepts like ownership and borrowing. Because these concepts are unique to Rust, it's critical that the learning curve be as smooth as possible. And one of the most important tools for lowering the learning curve is providing excellent errors that serve to make the concepts less intimidating, and to help 'tell the story' about what those concepts mean in the context of the programmer's code. -![Image of current error format](http://www.jonathanturner.org/images/old_errors_new2.png) +[as text] +``` +src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29:22: 29:30 error: cannot borrow `foo.bar1` as mutable more than once at a time [E0499] +src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29 let _bar2 = &mut foo.bar1; + ^~~~~~~~ +src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29:22: 29:30 help: run `rustc --explain E0499` to see a detailed explanation +src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:28:21: 28:29 note: previous borrow of `foo.bar1` occurs here; the mutable borrow prevents subsequent moves, borrows, or modification of `foo.bar1` until the borrow ends +src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:28 let bar1 = &mut foo.bar1; + ^~~~~~~~ +src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:31:2: 31:2 note: previous borrow ends here +src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:26 fn borrow_same_field_twice_mut_mut() { +src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:27 let mut foo = make_foo(); +src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:28 let bar1 = &mut foo.bar1; 
+src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29 let _bar2 = &mut foo.bar1; +src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:30 *bar1; +src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:31 } + ^ +``` + +[TODO TODO TODO TODO as an image] *Example of a borrow check error in the current compiler* @@ -27,7 +44,20 @@ Though a lot of time has been spent on the current error messages, they have a c This RFC details a redesign of errors to focus more on the source the programmer wrote. This format addresses the above concerns by eliminating clutter, following a more natural order for help messages, and pointing the user to both "what" the error is and "why" the error is occurring by using color-coded labels. Below you can see the same error again, this time using the proposed format: -![Image of new error flow](http://www.jonathanturner.org/images/new_errors_new2.png) +``` +error[E0499]: cannot borrow `foo.bar1` as mutable more than once at a time + --> src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29:22 + | +28 | let bar1 = &mut foo.bar1; + | -------- first mutable borrow occurs here +29 | let _bar2 = &mut foo.bar1; + | ^^^^^^^^ second mutable borrow occurs here +30 | *bar1; +31 | } + | - first borrow ends here +``` + +[as image TODO TODO TODO TODO TODO TODO] *Example of the same borrow check error in the proposed format* @@ -52,9 +82,26 @@ impl TheDarkKnight { *Example of the current --explain (showing E0507)* -To help users, this RFC proposes that --explain no longer uses an error code. Instead, --explain becomes a flag in a cargo or rustc invocation that enables an expanded error-reporting mode which incorporates the user's code. This more textual mode gives additional explanation to help understand compiler messages better. The end result is a richer, on-demand error reporting style. +To help users, this RFC proposes a new `--explain errors`. 
This new mode is more textual error reporting mode that gives additional explanation to help better understand compiler messages. The end result is a richer, on-demand error reporting style. + +``` +error: cannot move out of borrowed content + --> /Users/jturner/Source/errors/borrowck-move-out-of-vec-tail.rs:30:17 + +I’m trying to track the ownership of the contents of `tail`, which is borrowed, through this match statement: + +29 |> match tail { -![Image of Rust error in elm-style](http://www.jonathanturner.org/images/elm_like_rust.png) +In this match, you use an expression of the form [...]. When you do this, it’s like you are opening up the `tail` value and taking out its contents. Because `tail` is borrowed, you can’t safely move the contents. + +30 |> [Foo { string: aa }, + |> ^^ cannot move out of borrowed content + +You can avoid moving the contents out by working with each part using a reference rather than a move. A naive fix might look this: + +30 |> [Foo { string: ref aa }, + +``` # Detailed design @@ -62,11 +109,7 @@ The RFC is separated into two parts: the format of error messages and the format ## Format of error messages -The proposal is a lighter error format focused on the code the user wrote. Messages that help understand why an error occurred appear as labels on the source. You can see an example below: - -![Image of new error flow](http://www.jonathanturner.org/images/new_errors_new2.png) - -The goals of this new format are to: +The proposal is a lighter error format focused on the code the user wrote. Messages that help understand why an error occurred appear as labels on the source. 
The goals of this new format are to: * Create something that's visually easy to parse * Remove noise/unnecessary information @@ -86,29 +129,49 @@ In order to accomplish this, the proposed design needs to satisfy a number of co ### Header -![Image of new error format heading](http://www.jonathanturner.org/images/rust_error_1_new.png) +``` +error[E0499]: cannot borrow `foo.bar1` as mutable more than once at a time + --> src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29:22 +``` -The header now spans two lines. It gives you access to knowing a) if it's a warning or error, b) the text of the warning/error, and c) the location of this warning/error. You can see we also use the [--explain E0499] as a way to let the developer know they can get more information about this kind of issue, though this may be replaced in favor of the more general --explain also described in this RFC. While we use some bright colors here, we expect the use of colors and bold text in the 'Source area' (shown below) to draw the eye first. +The header still serves the original purpose of knowing: a) if it's a warning or error, b) the text of the warning/error, and c) the location of this warning/error. We keep the error code, now a part of the error indicator, as a way to help improve search results. ### Line number column -![Image of new error format line number column](http://www.jonathanturner.org/images/rust_error_2_new.png) +``` + | +28 | + | +29 | + | +30 | +31 | + | +``` The line number column lets you know where the error is occurring in the file. Because we only show lines that are of interest for the given error/warning, we elide lines if they are not annotated as part of the message (we currently use the heuristic to elide after one un-annotated line). -Inspired by Dybuk and Elm, the line numbers are separated with a 'wall', a separator formed from |>, to clearly distinguish what is a line number from what is source at a glance. 
+Inspired by Dybuk and Elm, the line numbers are separated with a 'wall', a separator formed from pipe('|') characters, to clearly distinguish what is a line number from what is source at a glance. As the wall also forms a way to visually separate distinct errors, we propose extending this concept to also support span-less notes and hints. For example: ``` -92 |> config.target_dir(&pkg) - |> ^^^^ expected `core::workspace::Workspace`, found `core::package::Package` - => note: expected type `&core::workspace::Workspace<'_>` - => note: found type `&core::package::Package` +92 | config.target_dir(&pkg) + | ^^^^ expected `core::workspace::Workspace`, found `core::package::Package` + = note: expected type `&core::workspace::Workspace<'_>` + = note: found type `&core::package::Package` ``` ### Source area -![Image of new error format source area](http://www.jonathanturner.org/images/rust_error_3_new.png) +``` + let bar1 = &mut foo.bar1; + -------- first mutable borrow occurs here + let _bar2 = &mut foo.bar1; + ^^^^^^^^ second mutable borrow occurs here + *bar1; + } + - first borrow ends here +``` The source area shows the related source code for the error/warning. The source is laid out in the order it appears in the source file, giving the user a way to map the message against the source they wrote. @@ -116,9 +179,7 @@ Key parts of the code are labeled with messages to help the user understand the The primary label is the label associated with the main warning/error. It explains the **what** of the compiler message. By reading it, the user can begin to understand what the root cause of the error or warning is. This label is colored to match the level of the message (yellow for warning, red for error) and uses the ^^^ underline. -Secondary labels help to understand the error and use blue text and --- underline. These labels explain the **why** of the compiler message. 
You can see one such example in the above message where the secondary labels explain that there is already another borrow going on. In another example, we see another way that primary and secondary work together to tell the whole story for why the error occurred: - -![Image of new error format source area](http://www.jonathanturner.org/images/primary_secondary.png) +Secondary labels help to understand the error and use blue text and --- underline. These labels explain the **why** of the compiler message. You can see one such example in the above message where the secondary labels explain that there is already another borrow going on. In another example, we see another way that primary and secondary work together to tell the whole story for why the error occurred. Taken together, primary and secondary labels create a 'flow' to the message. Flow in the message lets the user glance at the colored labels and quickly form an educated guess as to how to correctly update their code. @@ -130,17 +191,29 @@ Currently, --explain text focuses on the error code. You invoke the compiler wi We propose changing --explain to no longer take an error code. Instead, passing --explain to the compiler (or to cargo) will turn the compiler output into an expanded form which incorporates the same source and label information the user saw in the default message with more explanation text. -![Image of Rust error in elm-style](http://www.jonathanturner.org/images/elm_like_rust.png) +``` +error: cannot move out of borrowed content + --> /Users/jturner/Source/errors/borrowck-move-out-of-vec-tail.rs:30:17 + +I’m trying to track the ownership of the contents of `tail`, which is borrowed, through this match statement: -*Example of an expanded error message* +29 |> match tail { -The expanded error message effectively becomes a template. The text of the template is the educational text that is explaining the message more more detail. 
The template is then populated using the source lines, labels, and spans from the same compiler message that's printed in the default mode. This lets the message writer call out each label or span as appropriate in the expanded text. +In this match, you use an expression of the form [...]. When you do this, it’s like you are opening up the `tail` value and taking out its contents. Because `tail` is borrowed, you can’t safely move the contents. -It's possible to also add additional labels that aren't necessarily shown in the default error mode but would be available in the expanded error format. For example, the above error might look like this is as a default error: +30 |> [Foo { string: aa }, + |> ^^ cannot move out of borrowed content + +You can avoid moving the contents out by working with each part using a reference rather than a move. A naive fix might look this: + +30 |> [Foo { string: ref aa }, +``` + +*Example of an expanded error message* -![Image of same error without all of the same labels](http://www.jonathanturner.org/images/default_borrowed_content.png) +The expanded error message effectively becomes a template. The text of the template is the educational text that is explaining the message more more detail. The template is then populated using the source lines, labels, and spans from the same compiler message that's printed in the default mode. This lets the message writer call out each label or span as appropriate in the expanded text. -This gives the explain text writer maximal flexibility without impacting the readability of the default message. I'm currently prototyping an implementation of how this templating could work in practice. +It's possible to also add additional labels that aren't necessarily shown in the default error mode but would be available in the expanded error format. This gives the explain text writer maximal flexibility without impacting the readability of the default message. 
I'm currently prototyping an implementation of how this templating could work in practice. ## Tying it together @@ -153,11 +226,9 @@ error: aborting due to 2 previous errors Be changed to notify users of this ability: ``` -note: You can compile again with --explain for more information about these errors +note: compile failed due to 2 errors. You can compile again with `--explain errors` for more information ``` -As this helps inform the user of the --explain capability. - # Drawbacks Changes in the error format can impact integration with other tools. For example, IDEs that use a simple regex to detect the error would need to be updated to support the new format. This takes time and community coordination. @@ -168,9 +239,28 @@ There is a fair amount of work involved to update the errors and explain text to # Alternatives -Rather than using the proposed error format format, we could only provide the verbose --explain style that is proposed in this RFC. Famous programmers like [John Carmack](https://twitter.com/ID_AA_Carmack/status/735197548034412546) have praised the Elm error format. +Rather than using the proposed error format format, we could only provide the verbose --explain style that is proposed in this RFC. Respected programmers like [John Carmack](https://twitter.com/ID_AA_Carmack/status/735197548034412546) have praised the Elm error format. + +``` +Detected errors in 1 module. + +-- TYPE MISMATCH --------------------------------------------------------------- +The right argument of (+) is causing a type mismatch. + +25| model + "1" + ^^^ +(+) is expecting the right argument to be a: -![Image of Elm error](http://www.jonathanturner.org/images/elm_error.jpg) + number + +But the right argument is: + + String + +Hint: To append strings in Elm, you need to use the (++) operator, not (+). + +Hint: I always figure out the type of the left argument first and if it is acceptable on its own, I assume it is "correct" in subsequent checks. 
So the problem may actually be in how the left and right arguments interact. +``` *Example of an Elm error* @@ -188,4 +278,4 @@ Likewise, While some of us have been dogfooding these errors, we don't know what There are a few unresolved questions: * Editors that rely on pattern-matching the compiler output will need to be updated. It's an open question how best to transition to using the new errors. There is on-going discussion of standardizing the JSON output, which could also be used. -* Can additional error notes be shown without the "rainbow problem" where too many colors and too much boldness cause errors to beocome less readable? +* Can additional error notes be shown without the "rainbow problem" where too many colors and too much boldness cause errors to become less readable? From 860673e0347b24a62ba9f102f019050c6f5485ba Mon Sep 17 00:00:00 2001 From: Jonathan Turner Date: Fri, 24 Jun 2016 13:27:52 -0400 Subject: [PATCH 0975/1195] Update 0000-default-and-expanded-rustc-errors.md --- text/0000-default-and-expanded-rustc-errors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-default-and-expanded-rustc-errors.md b/text/0000-default-and-expanded-rustc-errors.md index 8aff178b551..1cc24f39a6c 100644 --- a/text/0000-default-and-expanded-rustc-errors.md +++ b/text/0000-default-and-expanded-rustc-errors.md @@ -189,7 +189,7 @@ Note: We'll talk more about additional style guidance for wording to help create Currently, --explain text focuses on the error code. You invoke the compiler with --explain and receive a verbose description of what causes errors of that number. The resulting message can be helpful, but it uses generic sample code which makes it feel less connected to the user's code. -We propose changing --explain to no longer take an error code. 
Instead, passing --explain to the compiler (or to cargo) will turn the compiler output into an expanded form which incorporates the same source and label information the user saw in the default message with more explanation text. +We propose adding a new `--explain errors`. By passing this to the compiler (or to cargo), the compiler will switch to an expanded error form which incorporates the same source and label information the user saw in the default message with more explanation text. ``` error: cannot move out of borrowed content From 89c5d93a961398bb1bdbe1072bbd01287d3425e2 Mon Sep 17 00:00:00 2001 From: Jonathan Turner Date: Fri, 24 Jun 2016 13:28:43 -0400 Subject: [PATCH 0976/1195] Update 0000-default-and-expanded-rustc-errors.md --- text/0000-default-and-expanded-rustc-errors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-default-and-expanded-rustc-errors.md b/text/0000-default-and-expanded-rustc-errors.md index 1cc24f39a6c..fc4ac174ee9 100644 --- a/text/0000-default-and-expanded-rustc-errors.md +++ b/text/0000-default-and-expanded-rustc-errors.md @@ -105,7 +105,7 @@ You can avoid moving the contents out by working with each part using a referenc # Detailed design -The RFC is separated into two parts: the format of error messages and the format of expanded error messages (using --explain). +The RFC is separated into two parts: the format of error messages and the format of expanded error messages (using --explain errors). 
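The expanded message quoted above ends by suggesting a `ref` binding in place of a move. As a rough, runnable sketch of that fix (the `Foo` struct mirrors the field name in the quoted message, while `first_two` and the sample data are hypothetical names introduced only for illustration):

```rust
// Hypothetical reconstruction: `first_two` and the sample values are
// illustrative and are not taken from the compiler's test suite.
struct Foo {
    string: String,
}

// `tail` is borrowed, so the pattern may not move the `String`s out of it.
// Binding with `ref` borrows each field in place instead of moving it.
fn first_two(tail: &[Foo]) -> Option<(&str, &str)> {
    match *tail {
        [Foo { string: ref aa }, Foo { string: ref bb }] => {
            Some((aa.as_str(), bb.as_str()))
        }
        _ => None,
    }
}

fn main() {
    let foos = vec![
        Foo { string: "bar".to_string() },
        Foo { string: "baz".to_string() },
    ];
    println!("{:?}", first_two(&foos));
}
```

Dropping the two `ref` keywords should make the compiler reject the match, because the pattern would then try to move each `String` out of the borrowed slice.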
## Format of error messages From 35db119d8c53fab24990ae81a0c3132efca42970 Mon Sep 17 00:00:00 2001 From: Jonathan Turner Date: Fri, 24 Jun 2016 13:35:09 -0400 Subject: [PATCH 0977/1195] Update 0000-default-and-expanded-rustc-errors.md --- text/0000-default-and-expanded-rustc-errors.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/text/0000-default-and-expanded-rustc-errors.md b/text/0000-default-and-expanded-rustc-errors.md index fc4ac174ee9..27723ea8989 100644 --- a/text/0000-default-and-expanded-rustc-errors.md +++ b/text/0000-default-and-expanded-rustc-errors.md @@ -31,7 +31,8 @@ src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:31 } ^ ``` -[TODO TODO TODO TODO as an image] +[as image] +![Image of new error flow](http://www.jonathanturner.org/images/old_errors_3.png) *Example of a borrow check error in the current compiler* @@ -57,7 +58,8 @@ error[E0499]: cannot borrow `foo.bar1` as mutable more than once at a time | - first borrow ends here ``` -[as image TODO TODO TODO TODO TODO TODO] +[as image] +![Image of new error flow](http://www.jonathanturner.org/images/new_errors_3.png) *Example of the same borrow check error in the proposed format* @@ -105,7 +107,7 @@ You can avoid moving the contents out by working with each part using a referenc # Detailed design -The RFC is separated into two parts: the format of error messages and the format of expanded error messages (using --explain errors). +The RFC is separated into two parts: the format of error messages and the format of expanded error messages (using `--explain errors`). 
## Format of error messages From c15b656c224b3ed6b539f815633b49b6861e7cf4 Mon Sep 17 00:00:00 2001 From: Jonathan Turner Date: Fri, 24 Jun 2016 13:35:40 -0400 Subject: [PATCH 0978/1195] Update 0000-default-and-expanded-rustc-errors.md --- text/0000-default-and-expanded-rustc-errors.md | 1 + 1 file changed, 1 insertion(+) diff --git a/text/0000-default-and-expanded-rustc-errors.md b/text/0000-default-and-expanded-rustc-errors.md index 27723ea8989..6b5c891e556 100644 --- a/text/0000-default-and-expanded-rustc-errors.md +++ b/text/0000-default-and-expanded-rustc-errors.md @@ -45,6 +45,7 @@ Though a lot of time has been spent on the current error messages, they have a c This RFC details a redesign of errors to focus more on the source the programmer wrote. This format addresses the above concerns by eliminating clutter, following a more natural order for help messages, and pointing the user to both "what" the error is and "why" the error is occurring by using color-coded labels. Below you can see the same error again, this time using the proposed format: +[as text] ``` error[E0499]: cannot borrow `foo.bar1` as mutable more than once at a time --> src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29:22 From d97df368824ef6a48cf8787081d45edd40515f96 Mon Sep 17 00:00:00 2001 From: Jonathan Turner Date: Fri, 24 Jun 2016 13:37:18 -0400 Subject: [PATCH 0979/1195] Update 0000-default-and-expanded-rustc-errors.md --- text/0000-default-and-expanded-rustc-errors.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-default-and-expanded-rustc-errors.md b/text/0000-default-and-expanded-rustc-errors.md index 6b5c891e556..b34e98fd92e 100644 --- a/text/0000-default-and-expanded-rustc-errors.md +++ b/text/0000-default-and-expanded-rustc-errors.md @@ -4,7 +4,7 @@ - Rust Issue: (leave this empty) # Summary -This RFC proposes an update to error reporting in rustc. 
Its focus is to change the format of Rust error messages and --explain text to focus on the user's code. The end goal is for errors and explain text to be more readable, more friendly to new users, while still helping Rust coders fix bugs as quickly as possible. We expect to follow this RFC with a supplemental RFC that provides a writing style guide for error messages and explain text with a focus on readability and education. +This RFC proposes an update to error reporting in rustc. Its focus is to change the format of Rust error messages and improve --explain capabilities to focus on the user's code. The end goal is for errors and explain text to be more readable, more friendly to new users, while still helping Rust coders fix bugs as quickly as possible. We expect to follow this RFC with a supplemental RFC that provides a writing style guide for error messages and explain text with a focus on readability and education. # Motivation @@ -60,7 +60,7 @@ error[E0499]: cannot borrow `foo.bar1` as mutable more than once at a time ``` [as image] -![Image of new error flow](http://www.jonathanturner.org/images/new_errors_3.png) +![Image of new error flow](http://www.jonathanturner.org/images/new_errors_3.png =250x) *Example of the same borrow check error in the proposed format* From 03142e7a255419d20d20c1e9da5887dcba7a3f50 Mon Sep 17 00:00:00 2001 From: Jonathan Turner Date: Fri, 24 Jun 2016 13:39:49 -0400 Subject: [PATCH 0980/1195] Update 0000-default-and-expanded-rustc-errors.md --- text/0000-default-and-expanded-rustc-errors.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-default-and-expanded-rustc-errors.md b/text/0000-default-and-expanded-rustc-errors.md index b34e98fd92e..305f5477c6b 100644 --- a/text/0000-default-and-expanded-rustc-errors.md +++ b/text/0000-default-and-expanded-rustc-errors.md @@ -60,7 +60,9 @@ error[E0499]: cannot borrow `foo.bar1` as mutable more than once at a time ``` [as image] -![Image of new error 
flow](http://www.jonathanturner.org/images/new_errors_3.png =250x) +![Image of new error flow](http://www.jonathanturner.org/images/new_errors_3.png) + + *Example of the same borrow check error in the proposed format* From dc6f3084d6aef9f735a3f75c2214dc1a08c7a107 Mon Sep 17 00:00:00 2001 From: Jonathan Turner Date: Fri, 24 Jun 2016 13:40:28 -0400 Subject: [PATCH 0981/1195] Update 0000-default-and-expanded-rustc-errors.md --- text/0000-default-and-expanded-rustc-errors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-default-and-expanded-rustc-errors.md b/text/0000-default-and-expanded-rustc-errors.md index 305f5477c6b..61c2d7e51e8 100644 --- a/text/0000-default-and-expanded-rustc-errors.md +++ b/text/0000-default-and-expanded-rustc-errors.md @@ -62,7 +62,7 @@ error[E0499]: cannot borrow `foo.bar1` as mutable more than once at a time [as image] ![Image of new error flow](http://www.jonathanturner.org/images/new_errors_3.png) - + *Example of the same borrow check error in the proposed format* From 2eb330bcc5b9cbe8e3735a2e0c29e2178a09353b Mon Sep 17 00:00:00 2001 From: Jonathan Turner Date: Fri, 24 Jun 2016 13:41:08 -0400 Subject: [PATCH 0982/1195] Update 0000-default-and-expanded-rustc-errors.md --- text/0000-default-and-expanded-rustc-errors.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/text/0000-default-and-expanded-rustc-errors.md b/text/0000-default-and-expanded-rustc-errors.md index 61c2d7e51e8..1501beee53e 100644 --- a/text/0000-default-and-expanded-rustc-errors.md +++ b/text/0000-default-and-expanded-rustc-errors.md @@ -60,9 +60,8 @@ error[E0499]: cannot borrow `foo.bar1` as mutable more than once at a time ``` [as image] -![Image of new error flow](http://www.jonathanturner.org/images/new_errors_3.png) - + *Example of the same borrow check error in the proposed format* From 73dd7e274c4cb4d06d95d2ae5eec878ca1af5dec Mon Sep 17 00:00:00 2001 From: Jonathan Turner Date: Fri, 24 Jun 2016 14:57:50 -0400 
Subject: [PATCH 0983/1195] Update 0000-default-and-expanded-rustc-errors.md --- text/0000-default-and-expanded-rustc-errors.md | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/text/0000-default-and-expanded-rustc-errors.md b/text/0000-default-and-expanded-rustc-errors.md index 1501beee53e..5fd3e4eb8ea 100644 --- a/text/0000-default-and-expanded-rustc-errors.md +++ b/text/0000-default-and-expanded-rustc-errors.md @@ -96,12 +96,14 @@ I’m trying to track the ownership of the contents of `tail`, which is borrowed 29 |> match tail { -In this match, you use an expression of the form [...]. When you do this, it’s like you are opening up the `tail` value and taking out its contents. Because `tail` is borrowed, you can’t safely move the contents. +In this match, you use an expression of the form [...]. When you do this, it’s like you are opening up the `tail` +value and taking out its contents. Because `tail` is borrowed, you can’t safely move the contents. 30 |> [Foo { string: aa }, |> ^^ cannot move out of borrowed content -You can avoid moving the contents out by working with each part using a reference rather than a move. A naive fix might look this: +You can avoid moving the contents out by working with each part using a reference rather than a move. A naive fix +might look this: 30 |> [Foo { string: ref aa }, @@ -203,12 +205,14 @@ I’m trying to track the ownership of the contents of `tail`, which is borrowed 29 |> match tail { -In this match, you use an expression of the form [...]. When you do this, it’s like you are opening up the `tail` value and taking out its contents. Because `tail` is borrowed, you can’t safely move the contents. +In this match, you use an expression of the form [...]. When you do this, it’s like you are opening up the `tail` +value and taking out its contents. Because `tail` is borrowed, you can’t safely move the contents. 
30 |> [Foo { string: aa }, |> ^^ cannot move out of borrowed content -You can avoid moving the contents out by working with each part using a reference rather than a move. A naive fix might look this: +You can avoid moving the contents out by working with each part using a reference rather than a move. A naive fix +might look this: 30 |> [Foo { string: ref aa }, ``` @@ -263,7 +267,8 @@ But the right argument is: Hint: To append strings in Elm, you need to use the (++) operator, not (+). -Hint: I always figure out the type of the left argument first and if it is acceptable on its own, I assume it is "correct" in subsequent checks. So the problem may actually be in how the left and right arguments interact. +Hint: I always figure out the type of the left argument first and if it is acceptable on its own, I assume it +is "correct" in subsequent checks. So the problem may actually be in how the left and right arguments interact. ``` *Example of an Elm error* From b48ef3700842391b28fb96298d274edeb12edf6c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Fri, 24 Jun 2016 23:41:39 +0200 Subject: [PATCH 0984/1195] @Trait -> impl Trait --- text/0000-conservative-impl-trait.md | 88 +++++++++++++--------------- 1 file changed, 40 insertions(+), 48 deletions(-) diff --git a/text/0000-conservative-impl-trait.md b/text/0000-conservative-impl-trait.md index 2f3f6693c2a..37a5b6caa86 100644 --- a/text/0000-conservative-impl-trait.md +++ b/text/0000-conservative-impl-trait.md @@ -141,29 +141,21 @@ with other extensions. ## Syntax -Let's start with the bikeshed: The proposed syntax is `@Trait` in return type -position, composing like trait objects to forms like `@(Foo+Send+'a)`. - -The reason for choosing a sigil is ergonomics: Whatever the exact final feature -will be capable of, you'd want it to be as easy to read/write as trait objects, -or else the more performant and idiomatic option would be the more verbose one, -and thus probably less used. 
- -The argument can be made this decreases the google-ability of Rust syntax (and -this doesn't even talk about the _old_ `@T` pointer semantic the internet is -still littered with), but this would be somewhat mitigated by the feature being -supposedly used commonly once it lands, and can be explained in the docs as -being short for `abstract` or `anonym`. And in any case, it's a problem we -already suffer with `&T` and `&mut T`. - -If there are good reasons against `@`, there is also the choice of `~`. -All points from above still apply, except `~` is a bit rarer in language -syntaxes in general, and depending on keyboard layout somewhat harder to reach. - -Finally, if there is a huge incentive _against_ new (old?) sigils in the language, -there is also the option of using keyword-based syntax like `impl Trait` or -`abstract Trait`, but this would add a verbosity overhead for a feature -that will be used somewhat commonly. +Let's start with the bikeshed: The proposed syntax is `impl Trait` in return type +position, composing like trait objects to forms like `impl Foo+Send+'a`. + +It can be explained as "a type that implements `Trait`", +and has been used in that form in most earlier discussions and proposals. + +Initial versions of this RFC proposed `@Trait` for brevity reasons, +since the feature is supposed to be used commonly once implemented, +but due to strong negative reactions by the community this has been +changed back to the current form. + +There are other possibilities, like `abstract Trait` or `~Trait`, with +good reasons for or against them, but since the concrete choice of syntax +is not a blocker for the implementation of this RFC, it is intended for +a possible follow-up RFC to address syntax changes if needed. ## Semantics @@ -177,7 +169,7 @@ and the *initial limitations* (which are likely to be lifted later). 
**Core semantics**: -- If a function returns `@Trait`, its body can return values of any type that +- If a function returns `impl Trait`, its body can return values of any type that implements `Trait`, but all return values need to be of the same type. - As far as the typesystem and the compiler is concerned, the return type @@ -201,11 +193,11 @@ and the *initial limitations* (which are likely to be lifted later). in the module system. This means type equality behaves like this: ```rust - fn foo<T: Trait>(t: T) -> @Trait { + fn foo<T: Trait>(t: T) -> impl Trait { t } - fn bar() -> @Trait { + fn bar() -> impl Trait { 123 } @@ -213,30 +205,30 @@ and the *initial limitations* (which are likely to be lifted later). equal_type(bar(), bar()); // OK equal_type(foo::<i32>(0), foo::<i32>(0)); // OK - equal_type(bar(), foo::<i32>(0)); // ERROR, `@Trait {bar}` is not the same type as `@Trait {foo<i32>}` - equal_type(foo::<bool>(false), foo::<i32>(0)); // ERROR, `@Trait {foo<bool>}` is not the same type as `@Trait {foo<i32>}` + equal_type(bar(), foo::<i32>(0)); // ERROR, `impl Trait {bar}` is not the same type as `impl Trait {foo<i32>}` + equal_type(foo::<bool>(false), foo::<i32>(0)); // ERROR, `impl Trait {foo<bool>}` is not the same type as `impl Trait {foo<i32>}` ``` - The code generation passes of the compiler would not draw a distinction between the abstract return type and the underlying type, just like they don't for generic paramters. This means: - - The same trait code would be instantiated, for example, `-> @Any` + - The same trait code would be instantiated, for example, `-> impl Any` would return the type id of the underlying type. - Specialization would specialize based on the underlying type. **Initial limitations**: -- `@Trait` may only be written within the return type of a freestanding or +- `impl Trait` may only be written within the return type of a freestanding or inherent-impl function, not in trait definitions or any non-return type position.
They may also not appear in the return type of closure traits or function pointers, unless these are themself part of a legal return type. - Eventually, we will want to allow the feature to be used within traits, and like in argument position as well (as an ergonomic improvement over today's generics). - - Using `@Trait` multiple times in the same return type would be valid, - like for example in `-> (@Foo, @Bar)`. + - Using `impl Trait` multiple times in the same return type would be valid, + like for example in `-> (impl Foo, impl Bar)`. -- The type produced when a function returns `@Trait` would be effectively +- The type produced when a function returns `impl Trait` would be effectively unnameable, just like closures and function items. - We will almost certainly want to lift this limitation in the long run, so @@ -248,7 +240,7 @@ and the *initial limitations* (which are likely to be lifted later). would be forbidden just like on the outside: ```rust - fn sum_to(n: u32) -> @Display { + fn sum_to(n: u32) -> impl Display { if n == 0 { 0 } else { @@ -272,7 +264,7 @@ Trait`.) The design as choosen in this RFC lies somewhat in between those two, since it allows OIBITs to leak through, and allows specialization to "see" the full type -being returned. That is, `@Trait` does not attempt to be a "tightly sealed" +being returned. That is, `impl Trait` does not attempt to be a "tightly sealed" abstraction boundary. The rationale for this design is a mixture of pragmatics and principles. @@ -293,18 +285,18 @@ be prepared to work with any type that meets at least that bound. Again, with specialization, the caller may dispatch on additional type information beyond those bounds. -In other words, to the extent that returning `@Trait` is intended to be +In other words, to the extent that returning `impl Trait` is intended to be symmetric with taking a generic `T: Trait`, transparency with respect to specialization maintains that symmetry. 
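The partial transparency described above can be illustrated with a small sketch. It assumes the feature behaves as this RFC describes; `make_label` and `require_send` are hypothetical helpers, not part of any proposal text. Specialization is not shown, only the leaking of an auto trait (OIBIT):

```rust
use std::fmt::Display;

// The signature promises only `Display`; the hidden type here is `u32`.
fn make_label() -> impl Display {
    42u32
}

// Helper that only type-checks when its argument is `Send`.
fn require_send<T: Send>(value: T) -> T {
    value
}

fn main() {
    // Accepted: `Send` is an auto trait, so it leaks through from `u32`
    // even though the signature never mentions it.
    let label = require_send(make_label());
    println!("{}", label);

    // Not accepted: `u32` supports `+`, but the caller sees only
    // `impl Display`, so ordinary traits of the hidden type stay hidden.
    // let sum = make_label() + 1;
}
```

The design choice this shows is that the boundary is sealed for ordinary traits and methods but open for auto traits, which is why the RFC calls the abstraction not "tightly sealed".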
**Pragmatics for specialization transparency**: -The practical reason we want `@Trait` to be transparent to specialization is the +The practical reason we want `impl Trait` to be transparent to specialization is the same as the reason we want specialization in the first place: to be able to break through abstractions with more efficient special-case code. This is particularly important for one of the primary intended usecases: -returning `@Iterator`. We are very likely to employ specialization for various +returning `impl Iterator`. We are very likely to employ specialization for various iterator types, and making the underlying return type invisible to specialization would lose out on those efficiency wins. @@ -363,11 +355,11 @@ abstract types. ### Limitation to only return type position There have been various proposed additional places where abstract types -might be usable. For example, `fn x(y: @Trait)` as shorthand for +might be usable. For example, `fn x(y: impl Trait)` as shorthand for `fn x(y: T)`. Since the exact semantics and user experience for these locations are yet -unclear (`@Trait` would effectively behave completely different before and after +unclear (`impl Trait` would effectively behave completely different before and after the `->`), this has also been excluded from this proposal. ### Type transparency in recursive functions @@ -376,7 +368,7 @@ Functions with abstract return types can not see through their own return type, making code like this not compile: ```rust -fn sum_to(n: u32) -> @Display { +fn sum_to(n: u32) -> impl Display { if n == 0 { 0 } else { @@ -399,7 +391,7 @@ specialization makes it uncertain whether this would be sound. 
In any case, it can be initially worked around by defining a local helper function like this: ```rust -fn sum_to(n: u32) -> @Display { +fn sum_to(n: u32) -> impl Display { fn sum_to_(n: u32) -> u32 { if n == 0 { 0 @@ -413,13 +405,13 @@ fn sum_to(n: u32) -> @Display { ### Not legal in function pointers/closure traits -Because `@Trait` defines a type tied to the concrete function body, +Because `impl Trait` defines a type tied to the concrete function body, it does not make much sense to talk about it separately in a function signature, so the syntax is forbidden there. ### Compability with conditional trait bounds -On valid critique for the existing `@Trait` proposal is that it does not +On valid critique for the existing `impl Trait` proposal is that it does not cover more complex scenarios, where the return type would implement one or more traits depending on whether a type parameter does so with another. @@ -433,7 +425,7 @@ impl Iterator for SkipOne { ... } impl DoubleEndedIterator for SkipOne { ... } ``` -Using just `-> @Iterator`, this would not be possible to reproduce. +Using just `-> impl Iterator`, this would not be possible to reproduce. Since there has been no proposals so far that would address this in a way that would conflict with the fixed-trait-set case, this RFC punts on that issue as well. @@ -519,12 +511,12 @@ is that it creates a somewhat inconsistent mental model: it forces you to understand the feature in a highly special-cased way, rather than as a general way to talk about unknown-but-bounded types in function signatures. This could be particularly bewildering to newcomers, who must choose between `T: Trait`, -`Box`, and `@Trait`, with the latter only usable in one place. +`Box`, and `impl Trait`, with the latter only usable in one place. 
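To make the three-way choice mentioned above concrete, here is a hedged side-by-side sketch. It is written in current Rust syntax, so the trait object is spelled `Box<dyn Display>`, which postdates this RFC text; all function names are hypothetical.

```rust
use std::fmt::Display;

// Generic parameter: the caller picks the type; statically dispatched.
fn show_generic<T: Display>(value: T) -> String {
    format!("{}", value)
}

// Trait object: the concrete type is erased behind a heap pointer and vtable.
fn make_boxed() -> Box<dyn Display> {
    Box::new(7u32)
}

// Abstract return type: the function picks one hidden type; statically
// dispatched, no allocation, and only usable in return position here.
fn make_abstract() -> impl Display {
    7u32
}

fn main() {
    println!("{}", show_generic(7u32));
    println!("{}", make_boxed());
    println!("{}", make_abstract());
}
```

All three print the same value; the difference lies in who chooses the concrete type and whether dispatch is static or dynamic.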
## Drawbacks due to partial transparency -The fact that specialization and OIBITs can "see through" `@Trait` may be -surprising, to the extent that one wants to see `@Trait` as an abstraction +The fact that specialization and OIBITs can "see through" `impl Trait` may be +surprising, to the extent that one wants to see `impl Trait` as an abstraction mechanism. However, as the RFC argued in the rationale section, this design is probably the most consistent with our existing post-specialization abstraction mechanisms, and lead to the relatively simple story that *privacy* is the way to From 1e7f235c69a56a722bdf6133f352034693ca78b3 Mon Sep 17 00:00:00 2001 From: Jonathan Turner Date: Mon, 27 Jun 2016 08:31:06 -0400 Subject: [PATCH 0985/1195] Update 0000-default-and-expanded-rustc-errors.md --- text/0000-default-and-expanded-rustc-errors.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/text/0000-default-and-expanded-rustc-errors.md b/text/0000-default-and-expanded-rustc-errors.md index 5fd3e4eb8ea..bd9a9d3f1dc 100644 --- a/text/0000-default-and-expanded-rustc-errors.md +++ b/text/0000-default-and-expanded-rustc-errors.md @@ -94,18 +94,18 @@ error: cannot move out of borrowed content I’m trying to track the ownership of the contents of `tail`, which is borrowed, through this match statement: -29 |> match tail { +29 | match tail { In this match, you use an expression of the form [...]. When you do this, it’s like you are opening up the `tail` value and taking out its contents. Because `tail` is borrowed, you can’t safely move the contents. -30 |> [Foo { string: aa }, - |> ^^ cannot move out of borrowed content +30 | [Foo { string: aa }, + | ^^ cannot move out of borrowed content You can avoid moving the contents out by working with each part using a reference rather than a move. 
A naive fix might look this: -30 |> [Foo { string: ref aa }, +30 | [Foo { string: ref aa }, ``` @@ -203,18 +203,18 @@ error: cannot move out of borrowed content I’m trying to track the ownership of the contents of `tail`, which is borrowed, through this match statement: -29 |> match tail { +29 | match tail { In this match, you use an expression of the form [...]. When you do this, it’s like you are opening up the `tail` value and taking out its contents. Because `tail` is borrowed, you can’t safely move the contents. -30 |> [Foo { string: aa }, - |> ^^ cannot move out of borrowed content +30 | [Foo { string: aa }, + | ^^ cannot move out of borrowed content You can avoid moving the contents out by working with each part using a reference rather than a move. A naive fix might look this: -30 |> [Foo { string: ref aa }, +30 | [Foo { string: ref aa }, ``` *Example of an expanded error message* From c9f56b8bb6f2525158de8948d1c8295394bfab25 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Mon, 27 Jun 2016 13:51:57 -0700 Subject: [PATCH 0986/1195] RFC 1522 is Mimimal impl Trait --- text/0000-conservative-impl-trait.md | 545 --------------------------- 1 file changed, 545 deletions(-) delete mode 100644 text/0000-conservative-impl-trait.md diff --git a/text/0000-conservative-impl-trait.md b/text/0000-conservative-impl-trait.md deleted file mode 100644 index 37a5b6caa86..00000000000 --- a/text/0000-conservative-impl-trait.md +++ /dev/null @@ -1,545 +0,0 @@ -- Feature Name: conservative_impl_trait -- Start Date: 2016-01-31 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) - -# Summary -[summary]: #summary - -Add a conservative form of abstract return types, aka `impl Trait`, -that will be compatible with most possible future extensions by -initially being restricted to: - -- Only free-standing or inherent functions. -- Only return type position of a function. 
- -Abstract return types allow a function to hide a concrete return -type behind a trait interface similar to trait objects, while -still generating the same statically dispatched code as with concrete types. - -With the placeholder syntax used in discussions so far, -abstract return types would be used roughly like this: - -```rust -fn foo(n: u32) -> impl Iterator { - (0..n).map(|x| x * 100) -} -// ^ behaves as if it had return type Map, Closure> -// where Closure = type of the |x| x * 100 closure. - -for x in foo(10) { - // x = 0, 100, 200, ... -} - -``` - -# Background - -There has been much discussion around the `impl Trait` feature already, with -different proposals extending the core idea into different directions: - -- The [original proposal](https://github.com/rust-lang/rfcs/pull/105). -- A [blog post](http://aturon.github.io/blog/2015/09/28/impl-trait/) reviving - the proposal and further exploring the design space. -- A [more recent proposal](https://github.com/rust-lang/rfcs/pull/1305) with a - substantially more ambitious scope. - -This RFC is an attempt to make progress on the feature by proposing a minimal -subset that should be forwards-compatible with a whole range of extensions that -have been discussed (and will be reviewed in this RFC). However, even this small -step requires resolving some of the core questions raised in -[the blog post](http://aturon.github.io/blog/2015/09/28/impl-trait/). - -This RFC is closest in spirit to the -[original RFC](https://github.com/rust-lang/rfcs/pull/105), and we'll repeat -its motivation and some other parts of its text below. - -# Motivation -[motivation]: #motivation - -> Why are we doing this? What use cases does it support? What is the expected outcome? - -In today's Rust, you can write a function signature like - -````rust -fn consume_iter_static>(iter: I) -fn consume_iter_dynamic(iter: Box>) -```` - -In both cases, the function does not depend on the exact type of the argument. 
-The type is held "abstract", and is assumed only to satisfy a trait bound. - -* In the `_static` version using generics, each use of the function is - specialized to a concrete, statically-known type, giving static dispatch, inline - layout, and other performance wins. - -* In the `_dynamic` version using trait objects, the concrete argument type is - only known at runtime using a vtable. - -On the other hand, while you can write - -````rust -fn produce_iter_dynamic() -> Box> -```` - -you _cannot_ write something like - -````rust -fn produce_iter_static() -> Iterator -```` - -That is, in today's Rust, abstract return types can only be written using trait -objects, which can be a significant performance penalty. This RFC proposes -"unboxed abstract types" as a way of achieving signatures like -`produce_iter_static`. Like generics, unboxed abstract types guarantee static -dispatch and inline data layout. - -Here are some problems that unboxed abstract types solve or mitigate: - -* _Returning unboxed closures_. Closure syntax generates an anonymous type - implementing a closure trait. Without unboxed abstract types, there is no way - to use this syntax while returning the resulting closure unboxed, because there - is no way to write the name of the generated type. - -* _Leaky APIs_. Functions can easily leak implementation details in their return - type, when the API should really only promise a trait bound. For example, a - function returning `Rev>` is revealing exactly how the iterator - is constructed, when the function should only promise that it returns _some_ - type implementing `Iterator`. Using newtypes/structs with private fields - helps, but is extra work. Unboxed abstract types make it as easy to promise only - a trait bound as it is to return a concrete type. - -* _Complex types_. 
Use of iterators in particular can lead to huge types: - - ````rust - Chain>>>, SkipWhile<'a, u16, Map<'a, &u16, u16, slice::Items>>> - ```` - - Even when using newtypes to hide the details, the type still has to be written - out, which can be very painful. Unboxed abstract types only require writing the - trait bound. - -* _Documentation_. In today's Rust, reading the documentation for the `Iterator` - trait is needlessly difficult. Many of the methods return new iterators, but - currently each one returns a different type (`Chain`, `Zip`, `Map`, `Filter`, - etc), and it requires drilling down into each of these types to determine what - kind of iterator they produce. - -In short, unboxed abstract types make it easy for a function signature to -promise nothing more than a trait bound, and do not generally require the -function's author to write down the concrete type implementing the bound. - -# Detailed design -[design]: #detailed-design - -As explained at the start of the RFC, the focus here is a relatively narrow -introduction of abstract types limited to the return type of inherent methods -and free functions. While we still need to resolve some of the core questions -about what an "abstract type" means even in these cases, we avoid some of the -complexities that come along with allowing the feature in other locations or -with other extensions. - -## Syntax - -Let's start with the bikeshed: The proposed syntax is `impl Trait` in return type -position, composing like trait objects to forms like `impl Foo+Send+'a`. - -It can be explained as "a type that implements `Trait`", -and has been used in that form in most earlier discussions and proposals. - -Initial versions of this RFC proposed `@Trait` for brevity reasons, -since the feature is supposed to be used commonly once implemented, -but due to strong negative reactions by the community this has been -changed back to the current form. 
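As a hedged illustration of the form described above, the sketch below returns unboxed closures and composes the bound with `Send` and a lifetime, in the same way trait-object bounds compose. It is written against the `impl Trait` syntax as it was later stabilized, which may differ in detail from this draft, and the names `make_adder` and `prefixer` are hypothetical.

```rust
// Hypothetical example: return a closure without boxing it or naming its type.
fn make_adder(amount: i32) -> impl Fn(i32) -> i32 + Send {
    move |x| x + amount
}

// The returned closure borrows `prefix`, so the bound also carries `'a`.
fn prefixer<'a>(prefix: &'a str) -> impl Fn(&str) -> String + Send + 'a {
    move |s: &str| format!("{}{}", prefix, s)
}

fn main() {
    let add_five = make_adder(5);
    assert_eq!(add_five(2), 7);

    let greet = prefixer("hello, ");
    assert_eq!(greet("world"), "hello, world");
}
```

Callers only learn that the result implements the listed traits; the closure type itself stays unnamed.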
- -There are other possibilities, like `abstract Trait` or `~Trait`, with -good reasons for or against them, but since the concrete choice of syntax -is not a blocker for the implementation of this RFC, it is intended for -a possible follow-up RFC to address syntax changes if needed. - -## Semantics - -The core semantics of the feature is described below. - -Note that the sections after this one go into more detail on some of the design -decisions, and that **it is likely for many of the mentioned limitations to be -lifted at some point in the future**. For clarity, we'll separately categories the *core -semantics* of the feature (aspects that would stay unchanged with future extensions) -and the *initial limitations* (which are likely to be lifted later). - -**Core semantics**: - -- If a function returns `impl Trait`, its body can return values of any type that - implements `Trait`, but all return values need to be of the same type. - -- As far as the typesystem and the compiler is concerned, the return type - outside of the function would not be a entirely "new" type, nor would it be a - simple type alias. Rather, its semantics would be very similar to that of - _generic type paramters_ inside a function, with small differences caused by - being an _output_ rather than an _input_ of the function. - - - The type would be known to implement the specified traits. - - The type would not be known to implement any other trait, with - the exception of OIBITS (aka "auto traits") and default traits like `Sized`. - - The type would not be considered equal to the actual underlying type. - - The type would not be allowed to appear as the Self type for an `impl` block. - -- Because OIBITS like `Send` and `Sync` will leak through an abstract return - type, there will be some additional complexity in the compiler due to some - non-local type checking becoming necessary. 
- -- The return type has an identity based on all generic parameters the - function body is parametrized by, and by the location of the function - in the module system. This means type equality behaves like this: - - ```rust - fn foo<T: Trait>(t: T) -> impl Trait { - t - } - - fn bar() -> impl Trait { - 123 - } - - fn equal_type<T>(a: T, b: T) {} - - equal_type(bar(), bar()); // OK - equal_type(foo::<i32>(0), foo::<i32>(0)); // OK - equal_type(bar(), foo::<i32>(0)); // ERROR, `impl Trait {bar}` is not the same type as `impl Trait {foo<i32>}` - equal_type(foo::<bool>(false), foo::<i32>(0)); // ERROR, `impl Trait {foo<bool>}` is not the same type as `impl Trait {foo<i32>}` - ``` - -- The code generation passes of the compiler would not draw a distinction - between the abstract return type and the underlying type, just like they don't - for generic paramters. This means: - - The same trait code would be instantiated, for example, `-> impl Any` - would return the type id of the underlying type. - - Specialization would specialize based on the underlying type. - -**Initial limitations**: - -- `impl Trait` may only be written within the return type of a freestanding or - inherent-impl function, not in trait definitions or any non-return type position. They may also not appear - in the return type of closure traits or function pointers, - unless these are themself part of a legal return type. - - - Eventually, we will want to allow the feature to be used within traits, and - like in argument position as well (as an ergonomic improvement over today's generics). - - Using `impl Trait` multiple times in the same return type would be valid, - like for example in `-> (impl Foo, impl Bar)`. - -- The type produced when a function returns `impl Trait` would be effectively - unnameable, just like closures and function items. - - - We will almost certainly want to lift this limitation in the long run, so - that abstract return types can be placed into structs and so on.
There are a - few ways we could do so, all related to getting at the "output type" of a - function given all of its generic arguments. - -- The function body cannot see through its own return type, so code like this - would be forbidden just like on the outside: - - ```rust - fn sum_to(n: u32) -> impl Display { - if n == 0 { - 0 - } else { - n + sum_to(n - 1) - } - } - ``` - - - It's unclear whether we'll want to lift this limitation, but it should be possible to do so. - -## Rationale - -### Why this semantics for the return type? - -There has been a lot of discussion about what the semantics of the return type -should be, with the theoretical extremes being "full return type inference" and -"fully abstract type that behaves like a autogenerated newtype wrapper". (This -was in fact the main focus of the -[blog post](http://aturon.github.io/blog/2015/09/28/impl-trait/) on `impl -Trait`.) - -The design as choosen in this RFC lies somewhat in between those two, since it -allows OIBITs to leak through, and allows specialization to "see" the full type -being returned. That is, `impl Trait` does not attempt to be a "tightly sealed" -abstraction boundary. The rationale for this design is a mixture of pragmatics -and principles. - -#### Specialization transparency - -**Principles for specialization transparency**: - -The [specialization RFC](https://github.com/rust-lang/rfcs/pull/1210) has given -us a basic principle for how to understand bounds in function generics: they -represent a *minimum* contract between the caller and the callee, in that the -caller must meet at least those bounds, and the callee must be prepared to work -with any type that meets at least those bounds. However, with specialization, -the callee may choose different behavior when additional bounds hold. 
- -This RFC abides by a similar interpretation for return types: the signature -represents the minimum bound that the callee must satisfy, and the caller must -be prepared to work with any type that meets at least that bound. Again, with -specialization, the caller may dispatch on additional type information beyond -those bounds. - -In other words, to the extent that returning `impl Trait` is intended to be -symmetric with taking a generic `T: Trait`, transparency with respect to -specialization maintains that symmetry. - -**Pragmatics for specialization transparency**: - -The practical reason we want `impl Trait` to be transparent to specialization is the -same as the reason we want specialization in the first place: to be able to -break through abstractions with more efficient special-case code. - -This is particularly important for one of the primary intended usecases: -returning `impl Iterator`. We are very likely to employ specialization for various -iterator types, and making the underlying return type invisible to -specialization would lose out on those efficiency wins. - -#### OIBIT transparency - -OIBITs leak through an abstract return type. This might be considered controversial, since -it effectively opens a channel where the result of function-local type inference affects -item-level API, but has been deemed worth it for the following reasons: - -- Ergonomics: Trait objects already have the issue of explicitly needing to - declare `Send`/`Sync`-ability, and not extending this problem to abstract - return types is desireable. In practice, most uses of this feature would have - to add explicit bounds for OIBITS if they wanted to be maximally usable. - -- Low real change, since the situation already somewhat exists on structs with private fields: - - In both cases, a change to the private implementation might change whether a OIBIT is - implemented or not. 
- - In both cases, the existence of OIBIT impls is not visible without doc tools - - In both cases, you can only assert the existence of OIBIT impls - by adding explicit trait bounds either to the API or to the crate's testsuite. - -In fact, a large part of the point of OIBITs in the first place was to cut -across abstraction barriers and provide information about a type without the -type's author having to explicitly opt in. - -This means, however, that it has to be considered a silent breaking change to -change a function with a abstract return type in a way that removes OIBIT impls, -which might be a problem. (As noted above, this is already the case for `struct` -definitions.) - -But since the number of used OIBITs is relatvly small, deducing the return type -in a function body and reasoning about whether such a breakage will occur has -been deemed as a manageable amount of work. - -#### Wherefore type abstraction? - -In the [most recent RFC](https://github.com/rust-lang/rfcs/pull/1305) related to -this feature, a more "tightly sealed" abstraction mechanism was -proposed. However, part of the discussion on specialization centered on -precisely the issue of what type abstraction provides and how to achieve it. A -particular salient point there is that, in Rust, *privacy* is already our -primary mechanism for hiding -(["privacy is the new parametricity"](https://github.com/rust-lang/rfcs/pull/1210#issuecomment-181992044)). In -practice, that means that if you want opacity against specialization, you should -use something like a newtype. - -### Anonymity - -A abstract return type cannot be named in this proposal, which means that it -cannot be placed into `structs` and so on. This is not a fundamental limitation -in any sense; the limitation is there both to keep this RFC simple, and because -the precise way we might want to allow naming of such types is still a bit -unclear. Some possibilities include a `typeof` operator, or explicit named -abstract types. 
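To make the anonymity limitation concrete: because the returned type has no name, it cannot be written directly as a field type. One workaround that needs no new language feature is to make the container generic and let inference supply the type. The following is a minimal sketch under that assumption, using the `impl Trait` syntax as later stabilized; `evens`, `Holder`, and `hold_evens` are hypothetical names.

```rust
// Hypothetical function with an abstract return type.
fn evens(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0)
}

// The field cannot name the return type of `evens`, so the struct is
// generic over any iterator of `u32` instead.
struct Holder<I: Iterator<Item = u32>> {
    iter: I,
}

// The abstract type can still flow into the generic parameter.
fn hold_evens(limit: u32) -> Holder<impl Iterator<Item = u32>> {
    Holder { iter: evens(limit) }
}

fn main() {
    let held = hold_evens(7);
    let collected: Vec<u32> = held.iter.collect();
    assert_eq!(collected, vec![0, 2, 4, 6]);
}
```

The cost of this workaround is that the generic parameter spreads to every type that stores a `Holder`, which is part of why a naming mechanism is listed as future work.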
- -### Limitation to only return type position - -There have been various proposed additional places where abstract types -might be usable. For example, `fn x(y: impl Trait)` as shorthand for -`fn x(y: T)`. - -Since the exact semantics and user experience for these locations are yet -unclear (`impl Trait` would effectively behave completely different before and after -the `->`), this has also been excluded from this proposal. - -### Type transparency in recursive functions - -Functions with abstract return types can not see through their own return type, -making code like this not compile: - -```rust -fn sum_to(n: u32) -> impl Display { - if n == 0 { - 0 - } else { - n + sum_to(n - 1) - } -} -``` - -This limitation exists because it is not clear how much a function body -can and should know about different instantiations of itself. - -It would be safe to allow recursive calls if the set of generic parameters -is identical, and it might even be safe if the generic parameters are different, -since you would still be inside the private body of the function, just -differently instantiated. - -But variance caused by lifetime parameters and the interaction with -specialization makes it uncertain whether this would be sound. - -In any case, it can be initially worked around by defining a local helper function like this: - -```rust -fn sum_to(n: u32) -> impl Display { - fn sum_to_(n: u32) -> u32 { - if n == 0 { - 0 - } else { - n + sum_to_(n - 1) - } - } - sum_to_(n) -} -``` - -### Not legal in function pointers/closure traits - -Because `impl Trait` defines a type tied to the concrete function body, -it does not make much sense to talk about it separately in a function signature, -so the syntax is forbidden there. 
- -### Compability with conditional trait bounds - -On valid critique for the existing `impl Trait` proposal is that it does not -cover more complex scenarios, where the return type would implement -one or more traits depending on whether a type parameter does so with another. - -For example, a iterator adapter might want to implement `Iterator` and -`DoubleEndedIterator`, depending on whether the adapted one does: - -```rust -fn skip_one(i: I) -> SkipOne { ... } -struct SkipOne { ... } -impl Iterator for SkipOne { ... } -impl DoubleEndedIterator for SkipOne { ... } -``` - -Using just `-> impl Iterator`, this would not be possible to reproduce. - -Since there has been no proposals so far that would address this in a way -that would conflict with the fixed-trait-set case, this RFC punts on that issue as well. - -### Limitation to free/inherent functions - -One important usecase of abstract return types is to use them in trait methods. - -However, there is an issue with this, namely that in combinations with generic -trait methods, they are effectively equivalent to higher kinded types. -Which is an issue because Rust HKT story is not yet figured out, so -any "accidential implementation" might cause unintended fallout. - -HKT allows you to be generic over a type constructor, aka a -"thing with type parameters", and then instantiate them at some later point to -get the actual type. -For example, given a HK type `T` that takes one type as parameter, you could -write code that uses `T` or `T` without caring about -whether `T = Vec`, `T = Box`, etc. - -Now if we look at abstract return types, we have a similar situation: - -```rust -trait Foo { - fn bar() -> impl Baz -} -``` - -Given a `T: Foo`, we could instantiate `T::bar::` or `T::bar::`, -and could get arbitrary different return types of `bar` instantiated -with a `u32` or `bool`, -just like `T` and `T` might give us `Vec` or `Box` -in the example above. 
- -The problem does not exists with trait method return types today because -they are concrete: - -```rust -trait Foo { - fn bar() -> X -} -``` - -Given the above code, there is no way for `bar` to choose a return type `X` -that could fundamentally differ between instantiations of `Self` -while still being instantiable with an arbitrary `U`. - -At most you could return a associated type, but then you'd loose the generics -from `bar` - -```rust -trait Foo { - type X; - fn bar() -> Self::X // No way to apply U -} -``` - -So, in conclusion, since Rusts HKT story is not yet fleshed out, -and the compatibility of the current compiler with it is unknown, -it is not yet possible to reach a concrete solution here. - -In addition to that, there are also different proposals as to whether -a abstract return type is its own thing or sugar for a associated type, -how it interacts with other associated items and so on, -so forbidding them in traits seems like the best initial course of action. - -# Drawbacks -[drawbacks]: #drawbacks - -> Why should we *not* do this? - -## Drawbacks due to the proposal's minimalism - -As has been elaborated on above, there are various way this feature could be -extended and combined with the language, so implementing it might cause issues -down the road if limitations or incompatibilities become apparent. However, -variations of this RFC's proposal have been under discussion for quite a long -time at this point, and this proposal is carefully designed to be -future-compatible with them, while resolving the core issue around transparency. - -A drawback of limiting the feature to return type position (and not arguments) -is that it creates a somewhat inconsistent mental model: it forces you to -understand the feature in a highly special-cased way, rather than as a general -way to talk about unknown-but-bounded types in function signatures. 
This could -be particularly bewildering to newcomers, who must choose between `T: Trait`, -`Box`, and `impl Trait`, with the latter only usable in one place. - -## Drawbacks due to partial transparency - -The fact that specialization and OIBITs can "see through" `impl Trait` may be -surprising, to the extent that one wants to see `impl Trait` as an abstraction -mechanism. However, as the RFC argued in the rationale section, this design is -probably the most consistent with our existing post-specialization abstraction -mechanisms, and lead to the relatively simple story that *privacy* is the way to -achieve hiding in Rust. - -# Alternatives -[alternatives]: #alternatives - -> What other designs have been considered? What is the impact of not doing this? - -See the links in the motivation section for detailed analysis that we won't -repeat here. - -But basically, without this feature certain things remain hard or impossible to do -in Rust, like returning a efficiently usable type parametricised by -types private to a function body, for example an iterator adapter containing a closure. - -# Unresolved questions -[unresolved]: #unresolved-questions - -> What parts of the design are still TBD? - -The precise implementation details for OIBIT transparency are a bit unclear: in -general, it means that type checking may need to proceed in a particular order, -since you cannot get the full type information from the signature alone (you -have to typecheck the function body to determine which OIBITs apply). 
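The OIBIT leakage discussed above can be observed directly. In the sketch below neither signature mentions `Send`, yet whether a call to `assert_send` is accepted depends on the function body, which is why the body has to be type-checked first. The code uses the `impl Trait` syntax as later stabilized, and `assert_send`, `counting`, and `counting_rc` are hypothetical names.

```rust
use std::rc::Rc;

// Accepts only values whose type is `Send`.
fn assert_send<T: Send>(_value: &T) {}

// The hidden type is a plain range, which is `Send`.
fn counting(limit: u32) -> impl Iterator<Item = u32> {
    0..limit
}

// Same signature, but the hidden type captures an `Rc`, which is not `Send`.
fn counting_rc(limit: u32) -> impl Iterator<Item = u32> {
    let shared = Rc::new(limit);
    (0..limit).map(move |n| n + *shared)
}

fn main() {
    // Accepted: `Send` leaks through the abstract return type.
    assert_send(&counting(3));

    // Uncommenting the next line is expected to be rejected by the compiler,
    // even though `counting_rc` has the same signature as `counting`.
    // assert_send(&counting_rc(3));

    assert_eq!(counting_rc(2).collect::<Vec<_>>(), vec![2, 3]);
}
```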
From c445f53fd130ec6107a21d6f3ec7ccfc2854c7bd Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Wed, 29 Jun 2016 15:07:09 +0200 Subject: [PATCH 0987/1195] Remove as_raw and from_raw in favor of a documentation update --- text/0000-atomic-access.md | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/text/0000-atomic-access.md b/text/0000-atomic-access.md index 128eff40a7a..2085a6d20d4 100644 --- a/text/0000-atomic-access.md +++ b/text/0000-atomic-access.md @@ -1,4 +1,4 @@ -- Feature Name: atomic_access +q- Feature Name: atomic_access - Start Date: 2016-06-15 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) @@ -6,17 +6,17 @@ # Summary [summary]: #summary -Add the following methods to atomic types: +This RFC adds the following methods to atomic types: ```rust impl AtomicT { fn get_mut(&mut self) -> &mut T; fn into_inner(self) -> T; - fn as_raw(&self) -> *mut T; - unsafe fn from_raw(ptr: *mut T) -> &AtomicT; } ``` +It also specifies that the layout of an `AtomicT` type is always the same as the underlying `T` type. So, for example, `AtomicI32` is guaranteed to be transmutable to and from `i32`. + # Motivation [motivation]: #motivation @@ -28,33 +28,33 @@ A normal load/store is different from a `load(Relaxed)` or `store(Relaxed)` beca `get_mut` in particular is expected to be useful in `Drop` implementations where you have a `&mut self` and need to read the value of an atomic. `into_inner` somewhat overlaps in functionality with `get_mut`, but it is included to allow extracting the value without requiring the atomic object to be mutable. These methods mirror `Mutex::get_mut` and `Mutex::into_inner`. -## `as_raw` and `from_raw` +## Atomic type layout -These methods are mainly intended to be used for FFI, where a variable of a non-atomic type needs to be modified atomically. 
The most common example of this is the Linux `futex` system call which takes an `int*` parameter pointing to an integer that is atomically modified by both userspace and the kernel. +The layout guarantee is mainly intended to be used for FFI, where a variable of a non-atomic type needs to be modified atomically. The most common example of this is the Linux `futex` system call which takes an `int*` parameter pointing to an integer that is atomically modified by both userspace and the kernel. -Rust code invoking the `futex` system call so far has simply passed the address of the atomic object directly to the system call. However this makes the assumption that the atomic type has the same layout as the underlying integer type. Using `as_raw` instead makes it clear that the resulting pointer will point to the integer value inside the atomic object. +Rust code invoking the `futex` system call so far has simply passed the address of the atomic object directly to the system call. However this makes the assumption that the atomic type has the same layout as the underlying integer type, which is not currently guaranteed by the documentation. -`from_raw` provides the reverse operation: it allows Rust code to atomically modify a value that was not declared as a atomic type. This is useful when dealing with FFI structs that are shared with a thread managed by a C library. Another example would be to atomically modify a value in a memory mapped file that is shared with another process. +This also allows the reverse operation by casting a pointer: it allows Rust code to atomically modify a value that was not declared as a atomic type. This is useful when dealing with FFI structs that are shared with a thread managed by a C library. Another example would be to atomically modify a value in a memory mapped file that is shared with another process. 
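A short sketch of both parts of the proposal follows. The first function uses `get_mut` and `into_inner` where exclusive access makes atomic instructions unnecessary; the second relies on the layout guarantee to treat a plain `i32` as an `AtomicI32` through a pointer cast. The function names are hypothetical, and the cast is only sound under the stated assumptions in the safety comment.

```rust
use std::sync::atomic::{AtomicI32, Ordering};

// With ownership (or `&mut`), no other thread can observe the value,
// so ordinary non-atomic accesses are enough.
fn bump_and_take(mut counter: AtomicI32) -> i32 {
    *counter.get_mut() += 1;
    counter.into_inner()
}

// Atomically increments a value that was not declared as an atomic type
// and returns the new value.
fn bump_through_cast(value: &mut i32) -> i32 {
    let ptr: *mut i32 = value;
    // SAFETY (assumptions): `AtomicI32` has the same layout as `i32`, the
    // pointer is valid and aligned, and no non-atomic access to the value
    // happens while the atomic reference is in use.
    let atomic: &AtomicI32 = unsafe { &*(ptr as *const AtomicI32) };
    atomic.fetch_add(1, Ordering::SeqCst) + 1
}

fn main() {
    assert_eq!(bump_and_take(AtomicI32::new(41)), 42);

    let mut plain = 5;
    assert_eq!(bump_through_cast(&mut plain), 6);
    assert_eq!(plain, 6);
}
```

In an FFI setting the `*mut i32` would typically come from a C struct or a shared memory mapping rather than from a local variable.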
# Detailed design [design]: #detailed-design -The actual implementations of these functions are mostly trivial since they are based on `UnsafeCell::get`. The only exception is `from_raw` which will cast the given pointer to a different type, but that should also be fine. +The actual implementations of these functions are mostly trivial since they are based on `UnsafeCell::get`. + +The existing implementations of atomic types already have the same layout as the underlying types (even `AtomicBool` and `bool`), so no change is needed here apart from the documentation. # Drawbacks [drawbacks]: #drawbacks The functionality of `into_inner` somewhat overlaps with `get_mut`. -`from_raw` returns an unbounded lifetime. +We lose the ability to change the layout of atomic types, but this shouldn't be necessary since these types map directly to hardware primitives. # Alternatives [alternatives]: #alternatives The functionality of `get_mut` and `into_inner` can be implemented using `load(Relaxed)`, however the latter can result in worse code because it is poorly handled by the optimizer. -The functionality of `as_raw` and `from_raw` could be achieved using transmutes instead, however this requires making assumptions about the internal layout of the atomic types. - # Unresolved questions [unresolved]: #unresolved-questions From 28777a25bea010f406936602daf9ac3579aeb04b Mon Sep 17 00:00:00 2001 From: Josh Triplett Date: Fri, 1 Jul 2016 19:21:39 -0700 Subject: [PATCH 0988/1195] Document what happens when writing to a union field that implements Drop --- text/1444-union.md | 22 ++++++++++++++++++---- 1 file changed, 18 insertions(+), 4 deletions(-) diff --git a/text/1444-union.md b/text/1444-union.md index 82a809cefec..7a3ee23abdc 100644 --- a/text/1444-union.md +++ b/text/1444-union.md @@ -114,6 +114,18 @@ If a union contains multiple fields of different sizes, assigning to a field smaller than the entire union must not change the memory of the union outside that field. 
+Union fields will normally not implement `Drop`, and by default, declaring a +union with a field type that implements `Drop` will produce a lint warning. +Assigning to a field with a type that implements `Drop` will call `drop()` on +the previous value of that field. This matches the behavior of `struct` fields +that implement `Drop`. To avoid this, such as if interpreting the union's +value via that field and dropping it would produce incorrect behavior, Rust +code can assign to the entire union instead of the field. A union does not +implicitly implement `Drop` even if its field types do. + +The lint warning produced when declaring a union field of a type that +implements `Drop` should document this caveat in its explanatory text. + ## Pattern matching Unsafe code may pattern match on union fields, using the same syntax as a @@ -244,10 +256,12 @@ A union may have trait implementations, using the same `impl` syntax as a struct. The compiler should provide a lint if a union field has a type that implements -the `Drop` trait. The compiler may optionally provide a pragma to disable that -lint, for code that intentionally stores a type with Drop in a union. The -compiler must never implicitly generate a Drop implementation for the union -itself, though Rust code may explicitly implement Drop for a union type. +the `Drop` trait. The explanation for that lint should include an explanation +of the caveat documented in the section "Writing fields". The compiler may +optionally provide a pragma to disable that lint, for code that intentionally +stores a type with Drop in a union. The compiler must never implicitly +generate a Drop implementation for the union itself, though Rust code may +explicitly implement Drop for a union type. 
## Generic unions From a80a40efd108aae2a6dde8a26e0b240c0d90556a Mon Sep 17 00:00:00 2001 From: Diggory Blake Date: Sun, 3 Jul 2016 15:45:20 +0100 Subject: [PATCH 0989/1195] First draft --- text/0000-windows-subsystem.md | 96 ++++++++++++++++++++++++++++++++++ 1 file changed, 96 insertions(+) create mode 100644 text/0000-windows-subsystem.md diff --git a/text/0000-windows-subsystem.md b/text/0000-windows-subsystem.md new file mode 100644 index 00000000000..7e9c9bb7317 --- /dev/null +++ b/text/0000-windows-subsystem.md @@ -0,0 +1,96 @@ +- Feature Name: Windows Subsystem +- Start Date: 2016-07-03 +- RFC PR: ____ +- Rust Issue: ____ + +# Summary +[summary]: #summary + +Rust programs compiled for windows will always flash up a console window on +startup. This behavior is controlled via the `SUBSYSTEM` parameter passed to the +linker, and so *can* be overridden with specific compiler flags. However, doing +so will bypass the rust-specific initialization code in `libstd`. + +This RFC proposes supporting this case explicitly, allowing `libstd` to +continue to be initialized correctly. + +# Motivation +[motivation]: #motivation + +The `WINDOWS` subsystem is commonly used on windows: desktop applications +typically do not want to flash up a console window on startup. + +Currently, using the `WINDOWS` subsystem from rust is undocumented, and the +process is non-trivial: + +A new symbol `pub extern "system" WinMain(...)` with specific argument +and return types must be declared, which will become the new entry point for +the program. + +This is unsafe, and will skip the initialization code in `libstd`. + +# Detailed design +[design]: #detailed-design + +When an executable is linked while compiling for a windows target, it will be +linked for a specific *Subsystem*. The subsystem determines how the operating +system will run the executable, and will affect the execution environment of +the program. 
+ +In practice, only two subsystems are very commonly used: `CONSOLE` and +`WINDOWS`, and from a user's perspective, they determine whether a console will +be automatically created when the program is started. + +The solution this RFC proposes is to always export both `main` and `WinMain` +symbols from rust executables compiled for windows. The `WinMain` function +will simply delegate to the `main` function. + +The end result is that rust programs will "just work" when the subsystem is +overridden via custom linker arguments, and does not require `rustc` to +parse those linker arguments. + +A possible future extension would be to add additional command-line options to +`rustc` (and in turn, Cargo.toml) to specify the subsystem directly. `rustc` +would automatically translate this into the correct linker arguments for +whichever linker is actually being used. + +# Drawbacks +[drawbacks]: #drawbacks + +- Additional platform-specific API surface. +- The difficulty of manually calling the rust initialization code is potentially + a more general problem, and this only solves a specific (if common) case. + +# Alternatives +[alternatives]: #alternatives + +- Choosing to emit only `WinMain` or `main` using `cfg` attributes or similar. + + This is problematic because it requires the compiler to know which subsystem + is being compiled for, and all dependencies would have to be compiled for that + specific subsystem. + + Pushing the decision further down the line, making it purely a + link-time/run-time decision reduces the potential for binary incompatibility. + +- Add a `subsystem` function to determine which subsystem was used. + + The `WinMain` function would first set an internal flag, and only then + delegate to the `main` function. + + A function would be added to `std::os::windows`: + + `fn subsystem() -> &'static str` + + This would check the value of the internal flag, and return either `WINDOWS` or + `CONSOLE` depending on which entry point was actually used. 
+ + The `subsystem` function could be used to eg. redirect logging to a file if + the program is being run on the `WINDOWS` subsystem. However, it would return + an incorrect value if the initialization was skipped, such as if used as a + library from an executable written in another language. + +# Unresolved questions +[unresolved]: #unresolved-questions + +None From d5da6a278782ab77babdabd402059a34df346ce9 Mon Sep 17 00:00:00 2001 From: Diggory Blake Date: Sun, 3 Jul 2016 15:49:24 +0100 Subject: [PATCH 0990/1195] Fix typo --- text/0000-windows-subsystem.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-windows-subsystem.md b/text/0000-windows-subsystem.md index 7e9c9bb7317..d0505961ed0 100644 --- a/text/0000-windows-subsystem.md +++ b/text/0000-windows-subsystem.md @@ -57,7 +57,7 @@ whichever linker is actually being used. # Drawbacks [drawbacks]: #drawbacks -- Additional platform-specific API surface. +- Additional platform-specific code. - The difficulty of manually calling the rust initialization code is potentially a more general problem, and this only solves a specific (if common) case. From af5a78d7a39d0bc4e9ad2e49885b92d03c123f61 Mon Sep 17 00:00:00 2001 From: Josh Triplett Date: Sun, 3 Jul 2016 23:32:45 -0700 Subject: [PATCH 0991/1195] Specify compiler lint more precisely Make it a "should" rather than a "may optionally", and name it. --- text/1444-union.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/text/1444-union.md b/text/1444-union.md index 7a3ee23abdc..eea40228bd9 100644 --- a/text/1444-union.md +++ b/text/1444-union.md @@ -257,11 +257,11 @@ struct. The compiler should provide a lint if a union field has a type that implements the `Drop` trait. The explanation for that lint should include an explanation -of the caveat documented in the section "Writing fields". 
The compiler may -optionally provide a pragma to disable that lint, for code that intentionally -stores a type with Drop in a union. The compiler must never implicitly -generate a Drop implementation for the union itself, though Rust code may -explicitly implement Drop for a union type. +of the caveat documented in the section "Writing fields". The compiler should +allow disabling that lint with `#[allow(union_field_drop)]`, for code that +intentionally stores a type with Drop in a union. The compiler must never +implicitly generate a Drop implementation for the union itself, though Rust +code may explicitly implement Drop for a union type. ## Generic unions From c8248671a3513fd8f2663f3e9161be1c8fea6dbf Mon Sep 17 00:00:00 2001 From: Josh Triplett Date: Sun, 3 Jul 2016 23:44:53 -0700 Subject: [PATCH 0992/1195] Discuss the alternative for writing to a union field that implements Drop Summarize (and link to) the discussion in the tracking issue. --- text/1444-union.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/text/1444-union.md b/text/1444-union.md index eea40228bd9..e4502f0c279 100644 --- a/text/1444-union.md +++ b/text/1444-union.md @@ -388,6 +388,17 @@ languages. Union field accesses already require unsafe blocks, which calls attention to them. Calls to unsafe functions use the same syntax as calls to safe functions. +Much discussion in the [tracking issue for +unions](https://github.com/rust-lang/rust/issues/32836) debated whether +assigning to a union field that implements Drop should drop the previous value +of the field. This produces potentially surprising behavior if that field +doesn't currently contain a valid value of that type. However, that behavior +maintains consistency with assignments to struct fields and mutable variables, +which writers of unsafe code must already take into account; the alternative +would add an additional special case for writers of unsafe code. 
This does +provide further motivation for the lint for union fields implementing Drop; +code that explicitly overrides that lint will need to take this into account. + # Unresolved questions [unresolved]: #unresolved-questions From ad996de9a5a85ae9dc8da51d469b106113d97678 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Tue, 5 Jul 2016 21:01:38 +0300 Subject: [PATCH 0993/1195] adjust wording to allow for "small changes" --- text/0000-rustc-bug-fix-procedure.md | 42 ++++++++++++++++++---------- 1 file changed, 27 insertions(+), 15 deletions(-) diff --git a/text/0000-rustc-bug-fix-procedure.md b/text/0000-rustc-bug-fix-procedure.md index 5a9f532c1ff..c1d158a8123 100644 --- a/text/0000-rustc-bug-fix-procedure.md +++ b/text/0000-rustc-bug-fix-procedure.md @@ -184,14 +184,29 @@ considerate to at least notify the authors of affected crates the breaking change. If we can submit PRs to fix the problem, so much the better. -#### What if issuing a warning is too hard? - -It does happen from time to time that it is nigh impossible to isolate -the breaking change so that you can issue warnings. In such cases, the best -strategy is to mitigate: - -1. Issue warnings for subparts of the problem, and reserve the new errors for - the smallest set of cases you can. +#### Is it ever acceptable to go directly to issuing errors? + +Changes that are believed to have negligible impact can go directly to +issuing an error. One rule of thumb would be to check against +`crates.io`: if fewer than 10 **total** affected projects are found +(**not** root errors), we can move straight to an error. In such +cases, we should still make the "breaking change" page as before, and +we should ensure that the error directs users to this page. In other +words, everything should be the same except that users are getting an +error, and not a warning. Moreover, we should submit PRs to the +affected projects (ideally before the PR implementing the change lands +in rustc). 
+ +If the impact is not believed to be negligible (e.g., more than 10 +crates are affected), then warnings are required (unless the compiler +team agrees to grant a special exemption in some particular case). If +implementing warnings is not feasible, then we should adopt an +aggressive strategy of migrating crates before we land the change so +as to lower the number of affected crates. Here are some techniques +for approaching this scenario: + +1. Issue warnings for subparts of the problem, and reserve the new + errors for the smallest set of cases you can. 2. Try to give a very precise error message that suggests how to fix the problem and directs users to the tracking issue. 3. It may also make sense to layer the fix: @@ -201,13 +216,7 @@ strategy is to mitigate: versions are available *before* the fix lands, so that downstream users can use them. -If you will be issuing a new hard warning, then it is mandatory to at -least notify authors of affected crates which we know -about. Submitting PRs to fix the problem is strongly recommended. If -the impact is too large to make that practical, then we should try -harder to issue warnings or find a way to avoid making the change at -all. - + ### Stabilization After a change is made, we will **stabilize** the change using the same @@ -221,6 +230,9 @@ process that we use for unstable features: - Convert to error: the change should be made into a hard error. - Revert: we should remove the warning and continue to allow the older code to compile. - Defer: can't decide yet, wait longer, or try other strategies. + +Ideally, breaking changes should have landed on the **stable branch** +of the compiler before they are finalized. 
### Batching breaking changes to libsyntax From e4a2e373ceb78005563ac8ed6b744d8411bf90b7 Mon Sep 17 00:00:00 2001 From: Diggory Blake Date: Tue, 5 Jul 2016 21:36:18 +0100 Subject: [PATCH 0994/1195] Expand on alternatives and WinMain signature --- text/0000-windows-subsystem.md | 70 ++++++++++++++++++++++++++++++---- 1 file changed, 62 insertions(+), 8 deletions(-) diff --git a/text/0000-windows-subsystem.md b/text/0000-windows-subsystem.md index d0505961ed0..723982a89a5 100644 --- a/text/0000-windows-subsystem.md +++ b/text/0000-windows-subsystem.md @@ -45,12 +45,34 @@ The solution this RFC proposes is to always export both `main` and `WinMain` symbols from rust executables compiled for windows. The `WinMain` function will simply delegate to the `main` function. +The exact signature is: +``` +pub extern "system" fn WinMain( + hInstance: HINSTANCE, + hPrevInstance: HINSTANCE, + lpCmdLine: LPSTR, + nCmdShow: i32 +) -> i32; +``` + +Where `HINSTANCE` is a pointer-sized opaque handle, and `LPSTR` is a C-style +null-terminated string. + +All four parameters are either irrelevant or can be obtained easily through +other means: +- `hInstance` - Can be obtained via `GetModuleHandle`. +- `hPrevInstance` - Is always NULL. +- `lpCmdLine` - `libstd` already provides a function to get command line + arguments. +- `nCmdShow` - Can be obtained via `GetStartupInfo`, although it's not actually + needed any more (the OS will automatically hide/show the first window created). + The end result is that rust programs will "just work" when the subsystem is overridden via custom linker arguments, and does not require `rustc` to parse those linker arguments. A possible future extension would be to add additional command-line options to -`rustc` (and in turn, Cargo.toml) to specify the subsystem directly. 
`rustc` would automatically translate this into the correct linker arguments for whichever linker is actually being used. @@ -60,20 +82,45 @@ whichever linker is actually being used. - Additional platform-specific code. - The difficulty of manually calling the rust initialization code is potentially a more general problem, and this only solves a specific (if common) case. +- This is a breaking change for any crates which already export a `WinMain` + symbol. It is likely that only executable crates would export this symbol, + so the knock-on effect on crate dependencies should be non-existent. + + A possible work-around for this is described below. # Alternatives [alternatives]: #alternatives -- Choosing to emit only `WinMain` or `main` using `cfg` attributes or similar. +- Emit either `WinMain` or `main` from `libstd` based on `cfg` options. + + This has the advantage of not requiring changes to `rustc`, but is something + of a non-starter since it requires a version of `libstd` for each subsystem. + +- Emit either `WinMain` or `main` from `rustc` based on `cfg` options. + + This would not require different versions of `libstd`, but it would require + recompiling all other crates depending on the value of the `cfg` option. - This is problematic because it requires the compiler to know which subsystem - is being compiled for, and all dependencies would have to be compiled for that - specific subsystem. +- Emit either `WinMain` or `main` from `rustc` based on a new command line + option. - Pushing the decision further down the line, making it purely a - link-time/run-time decision reduces the potential for binary incompatibility. + Assuming the command line option need only be specified when compiling the + executable itself, the dependencies would not need to be recompiled were the + subsystem to change. -- Add a `subsystem` function to determine which subsystem was used. 
+ Choosing to emit one or the other means that the compiler and linker must + agree on the subsystem, or else you'll get linker errors. If `rustc` only + specified a `subsystem` to the linker if the option is passed, this would be + a fully backwards compatible change. + + A compiler option is probably desirable in addition to this RFC, but it will + require bike-shedding on the new command line interface, and changes to rustc + to be able to pass on the correct linker flags. + + A similar option would need to be added to `Cargo.toml` to make usage as simple + as possible. + +- Add a `subsystem` function to determine which subsystem was used at runtime. The `WinMain` function would first set an internal flag, and only then delegate to the `main` function. @@ -90,6 +137,13 @@ whichever linker is actually being used. an incorrect value if the initialization was skipped, such as if used as a library from an executable written in another language. +- Use the undocumented MSVC equivalent to weak symbols to avoid breaking + existing code. + + The parameter `/alternatename:_WinMain@16=_RustWinMain@16` can be used to + export `WinMain` only if it is not also exported elsewhere. This is completely + undocumented, but is mentioned here: (http://stackoverflow.com/a/11529277). 
+ # Unresolved questions [unresolved]: #unresolved-questions From fc6499c1184f7e1c573caf9e46499de7c177e910 Mon Sep 17 00:00:00 2001 From: Jonathan Turner Date: Wed, 6 Jul 2016 15:58:57 -0400 Subject: [PATCH 0995/1195] Update RFC with comments and note about upcoming RFC --- .../0000-default-and-expanded-rustc-errors.md | 232 +++++++++++++----- 1 file changed, 166 insertions(+), 66 deletions(-) diff --git a/text/0000-default-and-expanded-rustc-errors.md b/text/0000-default-and-expanded-rustc-errors.md index bd9a9d3f1dc..ae279b5d9d5 100644 --- a/text/0000-default-and-expanded-rustc-errors.md +++ b/text/0000-default-and-expanded-rustc-errors.md @@ -4,18 +4,27 @@ - Rust Issue: (leave this empty) # Summary -This RFC proposes an update to error reporting in rustc. Its focus is to change the format of Rust error messages and improve --explain capabilities to focus on the user's code. The end goal is for errors and explain text to be more readable, more friendly to new users, while still helping Rust coders fix bugs as quickly as possible. We expect to follow this RFC with a supplemental RFC that provides a writing style guide for error messages and explain text with a focus on readability and education. +This RFC proposes an update to error reporting in rustc. Its focus is to change the format of Rust +error messages and improve --explain capabilities to focus on the user's code. The end goal is for +errors and explain text to be more readable, more friendly to new users, while still helping Rust +coders fix bugs as quickly as possible. We expect to follow this RFC with a supplemental RFC that +provides a writing style guide for error messages and explain text with a focus on readability and +education. # Motivation ## Default error format -Rust offers a unique value proposition in the landscape of languages in part by codifying concepts like ownership and borrowing. Because these concepts are unique to Rust, it's critical that the learning curve be as smooth as possible. 
And one of the most important tools for lowering the learning curve is providing excellent errors that serve to make the concepts less intimidating, and to help 'tell the story' about what those concepts mean in the context of the programmer's code. +Rust offers a unique value proposition in the landscape of languages in part by codifying concepts +like ownership and borrowing. Because these concepts are unique to Rust, it's critical that the +learning curve be as smooth as possible. And one of the most important tools for lowering the +learning curve is providing excellent errors that serve to make the concepts less intimidating, +and to help 'tell the story' about what those concepts mean in the context of the programmer's code. [as text] ``` src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29:22: 29:30 error: cannot borrow `foo.bar1` as mutable more than once at a time [E0499] -src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29 let _bar2 = &mut foo.bar1; +src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29 let _bar2 = &mut foo.bar1; ^~~~~~~~ src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29:22: 29:30 help: run `rustc --explain E0499` to see a detailed explanation src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:28:21: 28:29 note: previous borrow of `foo.bar1` occurs here; the mutable borrow prevents subsequent moves, borrows, or modification of `foo.bar1` until the borrow ends @@ -25,7 +34,7 @@ src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:31:2: 31:2 note src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:26 fn borrow_same_field_twice_mut_mut() { src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:27 let mut foo = make_foo(); src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:28 let bar1 = &mut foo.bar1; -src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29 let _bar2 = &mut foo.bar1; 
+src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29 let _bar2 = &mut foo.bar1; src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:30 *bar1; src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:31 } ^ @@ -36,14 +45,23 @@ src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:31 } *Example of a borrow check error in the current compiler* -Though a lot of time has been spent on the current error messages, they have a couple flaws which make them difficult to use. Specifically, the current error format: +Though a lot of time has been spent on the current error messages, they have a couple of flaws which +make them difficult to use. Specifically, the current error format: -* Repeats the file position on the left-hand side. This offers no additional information, but instead makes the error harder to read. -* Prints messages about lines often out of order. This makes it difficult for the developer to glance at the error and recognize why the error is occuring -* Lacks a clear visual break between errors. As more errors occur it becomes more difficult to tell them apart. -* Uses technical terminology that is difficult for new users who may be unfamiliar with compiler terminology or terminology specific to Rust. +* Repeats the file position on the left-hand side. This offers no additional information, but +instead makes the error harder to read. +* Prints messages about lines often out of order. This makes it difficult for the developer to +glance at the error and recognize why the error is occurring. +* Lacks a clear visual break between errors. As more errors occur it becomes more difficult to tell +them apart. +* Uses technical terminology that is difficult for new users who may be unfamiliar with compiler +terminology or terminology specific to Rust. -This RFC details a redesign of errors to focus more on the source the programmer wrote. 
This format addresses the above concerns by eliminating clutter, following a more natural order for help messages, and pointing the user to both "what" the error is and "why" the error is occurring by using color-coded labels. Below you can see the same error again, this time using the proposed format: +This RFC details a redesign of errors to focus more on the source the programmer wrote. This format +addresses the above concerns by eliminating clutter, following a more natural order for help +messages, and pointing the user to both "what" the error is and "why" the error is occurring by +using color-coded labels. Below you can see the same error again, this time using the proposed +format: [as text] ``` @@ -67,9 +85,14 @@ error[E0499]: cannot borrow `foo.bar1` as mutable more than once at a time ## Expanded error format (revised --explain) -Languages like Elm have shown how effective an educational tool error messages can be if the explanations like our --explain text are mixed with the user's code. As mentioned earlier, it's crucial for Rust to be easy-to-use, especially since it introduces a fair number of concepts that may be unfamiliar to the user. Even experienced users may need to use --explain text from time to time when they encounter unfamiliar messages. +Languages like Elm have shown how effective an educational tool error messages can be if the +explanations like our --explain text are mixed with the user's code. As mentioned earlier, it's +crucial for Rust to be easy-to-use, especially since it introduces a fair number of concepts that +may be unfamiliar to the user. Even experienced users may need to use --explain text from time to +time when they encounter unfamiliar messages. -While we have --explain text today, it uses generic examples that require the user to mentally translate the given example into what works for their specific situation. 
+While we have --explain text today, it uses generic examples that require the user to mentally +translate the given example into what works for their specific situation. ``` You tried to move out of a value which was borrowed. Erroneous code example: @@ -86,51 +109,67 @@ impl TheDarkKnight { *Example of the current --explain (showing E0507)* -To help users, this RFC proposes a new `--explain errors`. This new mode is more textual error reporting mode that gives additional explanation to help better understand compiler messages. The end result is a richer, on-demand error reporting style. +To help users, this RFC proposes a new `--explain errors`. This new mode is a more textual error +reporting mode that gives additional explanation to help better understand compiler messages. The +end result is a richer, on-demand error reporting style. ``` error: cannot move out of borrowed content --> /Users/jturner/Source/errors/borrowck-move-out-of-vec-tail.rs:30:17 -I’m trying to track the ownership of the contents of `tail`, which is borrowed, through this match statement: +I’m trying to track the ownership of the contents of `tail`, which is borrowed, through this match +statement: 29 | match tail { -In this match, you use an expression of the form [...]. When you do this, it’s like you are opening up the `tail` -value and taking out its contents. Because `tail` is borrowed, you can’t safely move the contents. +In this match, you use an expression of the form [...]. When you do this, it’s like you are opening +up the `tail` value and taking out its contents. Because `tail` is borrowed, you can’t safely move +the contents. 30 | [Foo { string: aa }, | ^^ cannot move out of borrowed content -You can avoid moving the contents out by working with each part using a reference rather than a move. A naive fix -might look this: +You can avoid moving the contents out by working with each part using a reference rather than a +move. 
A naive fix might look like this: 30 | [Foo { string: ref aa }, - + ``` # Detailed design -The RFC is separated into two parts: the format of error messages and the format of expanded error messages (using `--explain errors`). +The RFC is separated into two parts: the format of error messages and the format of expanded error +messages (using `--explain errors`). ## Format of error messages -The proposal is a lighter error format focused on the code the user wrote. Messages that help understand why an error occurred appear as labels on the source. The goals of this new format are to: +The proposal is a lighter error format focused on the code the user wrote. Messages that help +understand why an error occurred appear as labels on the source. The goals of this new format are +to: * Create something that's visually easy to parse * Remove noise/unnecessary information -* Present information in a way that works well for new developers, post-onboarding, and experienced developers without special configuration -* Draw inspiration from Elm as well as Dybuk and other systems that have already improved on the kind of errors that Rust has. +* Present information in a way that works well for new developers, post-onboarding, and experienced +developers without special configuration +* Draw inspiration from Elm as well as Dybuk and other systems that have already improved on the +kind of errors that Rust has. -In order to accomplish this, the proposed design needs to satisfy a number of constraints to make the result maximally flexible across various terminals: +In order to accomplish this, the proposed design needs to satisfy a number of constraints to make +the result maximally flexible across various terminals: * Multiple errors beside each other should be clearly separate and not muddled together. -* Each error message should draw the eye to where the error occurs with sufficient context to understand why the error occurs. 
+* Each error message should draw the eye to where the error occurs with sufficient context to +understand why the error occurs. * Each error should have a "header" section that is visually distinct from the code section. -* Code should visually stand out from text and other error messages. This allows the developer to immediately recognize their code. -* Error messages should be just as readable when not using colors (eg for users of black-and-white terminals, color-impaired readers, weird color schemes that we can't predict, or just people that turn colors off) -* Be careful using “ascii art” and avoid unicode. Instead look for ways to show the information concisely that will work across the broadest number of terminals. We expect IDEs to possibly allow for a more graphical error in the future. -* Where possible, use labels on the source itself rather than sentence "notes" at the end. +* Code should visually stand out from text and other error messages. This allows the developer to +immediately recognize their code. +* Error messages should be just as readable when not using colors (eg for users of black-and-white +terminals, color-impaired readers, weird color schemes that we can't predict, or just people that +turn colors off) +* Be careful using “ascii art” and avoid unicode. Instead look for ways to show the information +concisely that will work across the broadest number of terminals. We expect IDEs to possibly allow +for a more graphical error in the future. +* Where possible, use labels on the source itself rather than sentence "notes" at the end. 
* Keep filename:line easy to spot for people who use editors that let them click on errors ### Header @@ -140,7 +179,9 @@ error[E0499]: cannot borrow `foo.bar1` as mutable more than once at a time --> src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29:22 ``` -The header still serves the original purpose of knowing: a) if it's a warning or error, b) the text of the warning/error, and c) the location of this warning/error. We keep the error code, now a part of the error indicator, as a way to help improve search results. +The header still serves the original purpose of knowing: a) if it's a warning or error, b) the text +of the warning/error, and c) the location of this warning/error. We keep the error code, now a part +of the error indicator, as a way to help improve search results. ### Line number column @@ -155,11 +196,15 @@ The header still serves the original purpose of knowing: a) if it's a warning or | ``` -The line number column lets you know where the error is occurring in the file. Because we only show lines that are of interest for the given error/warning, we elide lines if they are not annotated as part of the message (we currently use the heuristic to elide after one un-annotated line). +The line number column lets you know where the error is occurring in the file. Because we only show +lines that are of interest for the given error/warning, we elide lines if they are not annotated as +part of the message (we currently use the heuristic to elide after one un-annotated line). -Inspired by Dybuk and Elm, the line numbers are separated with a 'wall', a separator formed from pipe('|') characters, to clearly distinguish what is a line number from what is source at a glance. +Inspired by Dybuk and Elm, the line numbers are separated with a 'wall', a separator formed from +pipe('|') characters, to clearly distinguish what is a line number from what is source at a glance. 
-As the wall also forms a way to visually separate distinct errors, we propose extending this concept to also support span-less notes and hints. For example: +As the wall also forms a way to visually separate distinct errors, we propose extending this concept +to also support span-less notes and hints. For example: ``` 92 | config.target_dir(&pkg) @@ -179,49 +224,75 @@ As the wall also forms a way to visually separate distinct errors, we propose ex - first borrow ends here ``` -The source area shows the related source code for the error/warning. The source is laid out in the order it appears in the source file, giving the user a way to map the message against the source they wrote. +The source area shows the related source code for the error/warning. The source is laid out in the +order it appears in the source file, giving the user a way to map the message against the source +they wrote. -Key parts of the code are labeled with messages to help the user understand the message. +Key parts of the code are labeled with messages to help the user understand the message. -The primary label is the label associated with the main warning/error. It explains the **what** of the compiler message. By reading it, the user can begin to understand what the root cause of the error or warning is. This label is colored to match the level of the message (yellow for warning, red for error) and uses the ^^^ underline. +The primary label is the label associated with the main warning/error. It explains the **what** of +the compiler message. By reading it, the user can begin to understand what the root cause of the +error or warning is. This label is colored to match the level of the message (yellow for warning, +red for error) and uses the ^^^ underline. -Secondary labels help to understand the error and use blue text and --- underline. These labels explain the **why** of the compiler message. 
You can see one such example in the above message where the secondary labels explain that there is already another borrow going on. In another example, we see another way that primary and secondary work together to tell the whole story for why the error occurred. +Secondary labels help to understand the error and use blue text and --- underline. These labels +explain the **why** of the compiler message. You can see one such example in the above message +where the secondary labels explain that there is already another borrow going on. In another +example, we see another way that primary and secondary work together to tell the whole story for +why the error occurred. -Taken together, primary and secondary labels create a 'flow' to the message. Flow in the message lets the user glance at the colored labels and quickly form an educated guess as to how to correctly update their code. +Taken together, primary and secondary labels create a 'flow' to the message. Flow in the message +lets the user glance at the colored labels and quickly form an educated guess as to how to correctly +update their code. -Note: We'll talk more about additional style guidance for wording to help create flow in the subsequent style RFC. +Note: We'll talk more about additional style guidance for wording to help create flow in the +subsequent style RFC. ## Expanded error messages -Currently, --explain text focuses on the error code. You invoke the compiler with --explain and receive a verbose description of what causes errors of that number. The resulting message can be helpful, but it uses generic sample code which makes it feel less connected to the user's code. +Currently, --explain text focuses on the error code. You invoke the compiler with --explain + and receive a verbose description of what causes errors of that number. The resulting +message can be helpful, but it uses generic sample code which makes it feel less connected to the +user's code. 
-We propose adding a new `--explain errors`. By passing this to the compiler (or to cargo), the compiler will switch to an expanded error form which incorporates the same source and label information the user saw in the default message with more explanation text. +We propose adding a new `--explain errors`. By passing this to the compiler (or to cargo), the +compiler will switch to an expanded error form which incorporates the same source and label +information the user saw in the default message with more explanation text. ``` error: cannot move out of borrowed content --> /Users/jturner/Source/errors/borrowck-move-out-of-vec-tail.rs:30:17 -I’m trying to track the ownership of the contents of `tail`, which is borrowed, through this match statement: +I’m trying to track the ownership of the contents of `tail`, which is borrowed, through this match +statement: 29 | match tail { -In this match, you use an expression of the form [...]. When you do this, it’s like you are opening up the `tail` -value and taking out its contents. Because `tail` is borrowed, you can’t safely move the contents. +In this match, you use an expression of the form [...]. When you do this, it’s like you are opening +up the `tail` value and taking out its contents. Because `tail` is borrowed, you can’t safely move +the contents. 30 | [Foo { string: aa }, | ^^ cannot move out of borrowed content -You can avoid moving the contents out by working with each part using a reference rather than a move. A naive fix -might look this: +You can avoid moving the contents out by working with each part using a reference rather than a +move. A naive fix might look like this: 30 | [Foo { string: ref aa }, ``` *Example of an expanded error message* -The expanded error message effectively becomes a template. The text of the template is the educational text that is explaining the message more more detail. 
The template is then populated using the source lines, labels, and spans from the same compiler message that's printed in the default mode. This lets the message writer call out each label or span as appropriate in the expanded text. +The expanded error message effectively becomes a template. The text of the template is the +educational text that is explaining the message in more detail. The template is then populated +using the source lines, labels, and spans from the same compiler message that's printed in the +default mode. This lets the message writer call out each label or span as appropriate in the +expanded text. -It's possible to also add additional labels that aren't necessarily shown in the default error mode but would be available in the expanded error format. This gives the explain text writer maximal flexibility without impacting the readability of the default message. I'm currently prototyping an implementation of how this templating could work in practice. +It's possible to also add additional labels that aren't necessarily shown in the default error mode +but would be available in the expanded error format. This gives the explain text writer maximal +flexibility without impacting the readability of the default message. I'm currently prototyping an +implementation of how this templating could work in practice. ## Tying it together @@ -239,52 +310,81 @@ note: compile failed due to 2 errors. You can compile again with `--explain erro # Drawbacks -Changes in the error format can impact integration with other tools. For example, IDEs that use a simple regex to detect the error would need to be updated to support the new format. This takes time and community coordination. +Changes in the error format can impact integration with other tools. For example, IDEs that use a +simple regex to detect the error would need to be updated to support the new format. This takes +time and community coordination. 
-While the new error format has a lot of benefits, it's possible that some errors will feel "shoehorned" into it and, even after careful selection of secondary labels, may still not read as well as the original format. +While the new error format has a lot of benefits, it's possible that some errors will feel +"shoehorned" into it and, even after careful selection of secondary labels, may still not read as +well as the original format. -There is a fair amount of work involved to update the errors and explain text to the proposed format. +There is a fair amount of work involved to update the errors and explain text to the proposed +format. # Alternatives -Rather than using the proposed error format format, we could only provide the verbose --explain style that is proposed in this RFC. Respected programmers like [John Carmack](https://twitter.com/ID_AA_Carmack/status/735197548034412546) have praised the Elm error format. +Rather than using the proposed error format, we could only provide the verbose --explain +style that is proposed in this RFC. Respected programmers like +[John Carmack](https://twitter.com/ID_AA_Carmack/status/735197548034412546) have praised the Elm +error format. ``` Detected errors in 1 module. --- TYPE MISMATCH --------------------------------------------------------------- -The right argument of (+) is causing a type mismatch. +-- TYPE MISMATCH --------------------------------------------------------------- +The right argument of (+) is causing a type mismatch. 25| model + "1" - ^^^ + ^^^ (+) is expecting the right argument to be a: - number + number But the right argument is: - String + String -Hint: To append strings in Elm, you need to use the (++) operator, not (+). +Hint: To append strings in Elm, you need to use the (++) operator, not (+). + -Hint: I always figure out the type of the left argument first and if it is acceptable on its own, I assume it -is "correct" in subsequent checks. 
So the problem may actually be in how the left and right arguments interact. +Hint: I always figure out the type of the left argument first and if it is acceptable on its own, I +assume it is "correct" in subsequent checks. So the problem may actually be in how the left and +right arguments interact. ``` *Example of an Elm error* -In developing this RFC, we experimented with both styles. The Elm error format is great as an educational tool, and we wanted to leverage its style in Rust. For day-to-day work, though, we favor an error format that puts heavy emphasis on quickly guiding the user to what the error is and why it occurred, with an easy way to get the richer explanations (using --explain) when user wants them. +In developing this RFC, we experimented with both styles. The Elm error format is great as an +educational tool, and we wanted to leverage its style in Rust. For day-to-day work, though, we +favor an error format that puts heavy emphasis on quickly guiding the user to what the error is and +why it occurred, with an easy way to get the richer explanations (using --explain) when the user +wants them. # Stabilization -Currently, these new rust error format is available on nightly using the ```export RUST_NEW_ERROR_FORMAT=true``` environment variable. Ultimately, this should become the default. In order to get there, we need to ensure that the new error format is indeed an improvement over the existing format in practice. +Currently, this new rust error format is available on nightly using the +```export RUST_NEW_ERROR_FORMAT=true``` environment variable. Ultimately, this should become the +default. In order to get there, we need to ensure that the new error format is indeed an +improvement over the existing format in practice. + +We also have not yet implemented the extended error format. This format will also be gated by its +own flag while we explore and stabilize it. 
Because of the relative difference in maturity here, +the default error message will be behind a flag for a cycle before it becomes default. The extended +error format will be implemented and a follow-up RFC will be posted describing its design. This will +start its stabilization period, after which time it too will be enabled. -How do we measure the readability of error messages? This RFC details an educated guess as to what would improve the current state but shows no ways to measure success. +How do we measure the readability of error messages? This RFC details an educated guess as to what +would improve the current state but shows no ways to measure success. -Likewise, While some of us have been dogfooding these errors, we don't know what long-term use feels like. For example, after a time does the use of color feel excessive? We can always update the errors as we go, but it'd be helpful to catch it early if possible. +Likewise, while some of us have been dogfooding these errors, we don't know what long-term use feels +like. For example, after a time does the use of color feel excessive? We can always update the +errors as we go, but it'd be helpful to catch it early if possible. # Unresolved questions There are a few unresolved questions: -* Editors that rely on pattern-matching the compiler output will need to be updated. It's an open question how best to transition to using the new errors. There is on-going discussion of standardizing the JSON output, which could also be used. -* Can additional error notes be shown without the "rainbow problem" where too many colors and too much boldness cause errors to become less readable? +* Editors that rely on pattern-matching the compiler output will need to be updated. It's an open +question how best to transition to using the new errors. There is on-going discussion of +standardizing the JSON output, which could also be used. 
+* Can additional error notes be shown without the "rainbow problem" where too many colors and too +much boldness cause errors to become less readable? From 66ae85c11db03faa323393eeed44bac724b7db53 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Wed, 6 Jul 2016 19:00:40 -0400 Subject: [PATCH 0996/1195] Remove text relating to https://github.com/rust-lang/rfcs/pull/1574/files#r59055319 --- ...0000-more-api-documentation-conventions.md | 22 ------------------- 1 file changed, 22 deletions(-) diff --git a/text/0000-more-api-documentation-conventions.md b/text/0000-more-api-documentation-conventions.md index 4d6cf094b8d..fe4891e9063 100644 --- a/text/0000-more-api-documentation-conventions.md +++ b/text/0000-more-api-documentation-conventions.md @@ -95,8 +95,6 @@ Everything should have examples. Here is an example of how to do examples: ``` /// # Examples /// -/// Basic usage: -/// /// ``` /// use op; /// @@ -114,9 +112,6 @@ Everything should have examples. Here is an example of how to do examples: /// ``` ``` -For particularly simple APIs, still say “Examples” and “Basic usage:” for -consistency’s sake. - ### Referring to types [referring-to-types]: #referring-to-types @@ -209,8 +204,6 @@ pub mod option; /// /// # Examples /// -/// Basic usage: -/// /// ``` /// extern crate ref_slice; /// use ref_slice::ref_slice; @@ -255,8 +248,6 @@ mod mut { /// /// # Examples /// - /// Basic usage: - /// /// ``` /// extern crate ref_slice; /// use ref_slice::mut; @@ -291,8 +282,6 @@ in `option.rs`: /// /// # Examples /// -/// Basic usage: -/// /// ``` /// extern crate ref_slice; /// use ref_slice::option; @@ -502,8 +491,6 @@ Everything should have examples. Here is an example of how to do examples: ``` /// # Examples /// -/// Basic usage: -/// /// ``` /// use op; /// @@ -521,9 +508,6 @@ Everything should have examples. Here is an example of how to do examples: /// ``` ``` -For particularly simple APIs, still say “Examples” and “Basic usage:” for -consistency’s sake. 
- ### Referring to types [referring-to-types]: #referring-to-types @@ -616,8 +600,6 @@ pub mod option; /// /// # Examples /// -/// Basic usage: -/// /// ``` /// extern crate ref_slice; /// use ref_slice::ref_slice; @@ -662,8 +644,6 @@ mod mut { /// /// # Examples /// - /// Basic usage: - /// /// ``` /// extern crate ref_slice; /// use ref_slice::mut; @@ -698,8 +678,6 @@ in `option.rs`: /// /// # Examples /// -/// Basic usage: -/// /// ``` /// extern crate ref_slice; /// use ref_slice::option; From 106fba82e8594aac2a9f8cecd8c55cfb83d38ea8 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Wed, 6 Jul 2016 19:01:20 -0400 Subject: [PATCH 0997/1195] remove text relating to https://github.com/rust-lang/rfcs/pull/1574/files#r67898732 --- ...0000-more-api-documentation-conventions.md | 34 ------------------- 1 file changed, 34 deletions(-) diff --git a/text/0000-more-api-documentation-conventions.md b/text/0000-more-api-documentation-conventions.md index fe4891e9063..4f52591b99c 100644 --- a/text/0000-more-api-documentation-conventions.md +++ b/text/0000-more-api-documentation-conventions.md @@ -122,23 +122,6 @@ rather than `Cow<'a, B> where B: 'a + ToOwned + ?Sized`. Another possibility is to write in lower case using a more generic term. In other words, ‘string’ can refer to a `String` or an `&str`, and ‘an option’ can be ‘an `Option`’. -### Use parentheses for functions -[use-parentheses-for-functions]: #use-parentheses-for-functions - -When referring to function names, include the `()`s after the name. For example, do this: - -```rust -/// This behavior is similar to the way that `mem::replace()` works. -``` - -Not this: - -```rust -/// This behavior is similar to the way that `mem::replace` works. -``` - -This helps visually differentiate it in the text. - ### Link all the things [link-all-the-things]: #link-all-the-things @@ -518,23 +501,6 @@ rather than `Cow<'a, B> where B: 'a + ToOwned + ?Sized`. 
Another possibility is to write in lower case using a more generic term. In other words, ‘string’ can refer to a `String` or an `&str`, and ‘an option’ can be ‘an `Option`’. -### Use parentheses for functions -[use-parentheses-for-functions]: #use-parentheses-for-functions - -When referring to function names, include the `()`s after the name. For example, do this: - -```rust -/// This behavior is similar to the way that `mem::replace()` works. -``` - -Not this: - -```rust -/// This behavior is similar to the way that `mem::replace` works. -``` - -This helps visually differentiate it in the text. - ### Link all the things [link-all-the-things]: #link-all-the-things From f9e4f1452ad6053b19db21cf0e7accb262c5f475 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Wed, 6 Jul 2016 19:02:06 -0400 Subject: [PATCH 0998/1195] Remove text relating to https://github.com/rust-lang/rfcs/pull/1574/files#r67826514 Personal note: I don't care at all what ESR thinks, but this was already redundant due to saying American English is the standard. --- text/0000-more-api-documentation-conventions.md | 12 ------------ 1 file changed, 12 deletions(-) diff --git a/text/0000-more-api-documentation-conventions.md b/text/0000-more-api-documentation-conventions.md index 4f52591b99c..178ac66bdc2 100644 --- a/text/0000-more-api-documentation-conventions.md +++ b/text/0000-more-api-documentation-conventions.md @@ -29,12 +29,6 @@ but it tries to motivate and clarify them. This section applies to `rustc` and the standard library. -An additional suggestion over RFC 505: One specific rule that comes up often: -when quoting something for emphasis, use a single quote, and put punctuation -outside the quotes, ‘this’. When quoting something at length, “use double -quotes and put the punctuation inside of the quote.” Most documentation will -end up using single quotes, so if you’re not sure, just stick with them. 
- ### Using Markdown [using-markdown]: #using-markdown @@ -332,12 +326,6 @@ with regards to spelling, grammar, and punctuation conventions. Language changes over time, so this doesn’t mean that there is always a correct answer to every grammar question, but there is often some kind of formal consensus. -One specific rule that comes up often: when quoting something for emphasis, -use a single quote, and put punctuation outside the quotes, ‘this’. When -quoting something at length, “use double quotes and put the punctuation -inside of the quote.” Most documentation will end up using single quotes, -so if you’re not sure, just stick with them. - ### Use line comments [use-line-comments]: #use-line-comments From 56c73391e12a653344953e6696f6f7268f8f4c06 Mon Sep 17 00:00:00 2001 From: Andrew Cann Date: Thu, 7 Jul 2016 19:29:56 +0800 Subject: [PATCH 0999/1195] Small grammar/typo fixes --- text/0000-bang-type.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-bang-type.md b/text/0000-bang-type.md index 61330efa40c..13db77cf612 100644 --- a/text/0000-bang-type.md +++ b/text/0000-bang-type.md @@ -121,7 +121,7 @@ fn main() { ``` This RFC proposes that we allow `!` to be used directly, as a type, rather than -using `Never` (or equivalent) in it's place. Under this RFC, the above code +using `Never` (or equivalent) in its place. Under this RFC, the above code could more simply be written. ```rust @@ -309,7 +309,7 @@ history: in C `void` is in essence a type like any other. However it can't be used in all the normal positions where a type can be used. This breaks generic code (eg. `T foo(); T val = foo()` where `T == void`) and forces one to use workarounds such as defining `struct Void {}` and wrapping `void`-returning -functions: +functions. In the early days of programming having a type that contained no data probably seemed pointless. 
After all, there's no point in having a `void` typed function From aa9fbb6db9ba39f62399056cbd502d92c7dd8622 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Mon, 11 Jul 2016 12:33:25 +0200 Subject: [PATCH 1000/1195] add links to RFC PR and tracking issue in Rust repo. --- text/0000-dropck-param-eyepatch.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-dropck-param-eyepatch.md b/text/0000-dropck-param-eyepatch.md index c5e5c6eedb0..d6fe2fbc3be 100644 --- a/text/0000-dropck-param-eyepatch.md +++ b/text/0000-dropck-param-eyepatch.md @@ -1,7 +1,7 @@ - Feature Name: dropck_eyepatch - Start Date: 2015-10-19 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1327](https://github.com/rust-lang/rfcs/pull/1327) +- Rust Issue: [rust-lang/rust#34761](https://github.com/rust-lang/rust/issues/34761) # Summary [summary]: #summary From 82afb34bd73ca4c3a555d533d5d25fb8659f60c1 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Mon, 11 Jul 2016 12:34:12 +0200 Subject: [PATCH 1001/1195] rename file with RFC PR. --- ...000-dropck-param-eyepatch.md => 1327-dropck-param-eyepatch.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename text/{0000-dropck-param-eyepatch.md => 1327-dropck-param-eyepatch.md} (100%) diff --git a/text/0000-dropck-param-eyepatch.md b/text/1327-dropck-param-eyepatch.md similarity index 100% rename from text/0000-dropck-param-eyepatch.md rename to text/1327-dropck-param-eyepatch.md From 50cec6c97f5242e1bcb7187f6449aaeb85fbb991 Mon Sep 17 00:00:00 2001 From: "Felix S. Klock II" Date: Mon, 11 Jul 2016 12:39:36 +0200 Subject: [PATCH 1002/1195] Explicitly note that RFC solution is meant to be a short term temporary fix. 
--- text/1327-dropck-param-eyepatch.md | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/text/1327-dropck-param-eyepatch.md b/text/1327-dropck-param-eyepatch.md index 06f5c2b8d3c..15de93c8420 100644 --- a/text/1327-dropck-param-eyepatch.md +++ b/text/1327-dropck-param-eyepatch.md @@ -22,6 +22,11 @@ holds data that must not be accessed during the dynamic extent of that As a side-effect, enable adding attributes to the formal declarations of generic type and lifetime parameters. +The proposal in this RFC is intended as a *temporary* solution (along +the lines of `#[fundamental]`) and *will not* be stabilized +as-is. Instead, we anticipate a more comprehensive approach to be +proposed in a follow-up RFC. + [RFC 1238]: https://github.com/rust-lang/rfcs/blob/master/text/1238-nonparametric-dropck.md [RFC 769]: https://github.com/rust-lang/rfcs/blob/master/text/0769-sound-generic-drop.md @@ -140,6 +145,13 @@ storage for [cyclic graph structures][dropck_legal_cycles.rs]). # Detailed design [detailed design]: #detailed-design +First off: The proposal in this RFC is intended as a *temporary* +solution (along the lines of `#[fundamental]`) and *will not* be +stabilized as-is. Instead, we anticipate a more comprehensive approach +to be proposed in a follow-up RFC. + +Having said that, here is the proposed short-term solution: + 1. Add the ability to attach attributes to syntax that binds formal lifetime or type parmeters. For the purposes of this RFC, the only place in the syntax that requires such attributes are `impl` @@ -462,6 +474,10 @@ reflected in what he wrote in the [RFC 1238 alternatives][].) # Alternatives [alternatives]: #alternatives +Note: The alternatives section for this RFC is particularly +noteworthy because the ideas here may serve as the basis for a more +comprehensive long-term approach. + ## Make dropck "see again" via (focused) where-clauses The idea is that we keep the UGEH attribute, blunt hammer that it is.
From fd21c143ff2ac936ea6406c8da6307e195d9de95 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Thu, 14 Jul 2016 12:15:43 -0400 Subject: [PATCH 1003/1195] RFC 1574 is "More API documentation conventions" --- ...-conventions.md => 1574-more-api-documentation-conventions.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename text/{0000-more-api-documentation-conventions.md => 1574-more-api-documentation-conventions.md} (100%) diff --git a/text/0000-more-api-documentation-conventions.md b/text/1574-more-api-documentation-conventions.md similarity index 100% rename from text/0000-more-api-documentation-conventions.md rename to text/1574-more-api-documentation-conventions.md From a02c50ef28caf14e041ecb166698a571805b7d79 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 14 Jul 2016 11:55:33 -0700 Subject: [PATCH 1004/1195] RFC 1644 is new rustc errors --- ...errors.md => 1644-default-and-expanded-rustc-errors.md} | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) rename text/{0000-default-and-expanded-rustc-errors.md => 1644-default-and-expanded-rustc-errors.md} (98%) diff --git a/text/0000-default-and-expanded-rustc-errors.md b/text/1644-default-and-expanded-rustc-errors.md similarity index 98% rename from text/0000-default-and-expanded-rustc-errors.md rename to text/1644-default-and-expanded-rustc-errors.md index ae279b5d9d5..5bf1816baac 100644 --- a/text/0000-default-and-expanded-rustc-errors.md +++ b/text/1644-default-and-expanded-rustc-errors.md @@ -1,7 +1,8 @@ -- Feature Name: default_and_expanded_errors_for_rustc +- Feature Name: `default_and_expanded_errors_for_rustc` - Start Date: 2016-06-07 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1644](https://github.com/rust-lang/rfcs/pull/1644) +- Rust Issue: [rust-lang/rust#34826](https://github.com/rust-lang/rust/issues/34826) + [rust-lang/rust#34827](https://github.com/rust-lang/rust/issues/34827) # Summary This RFC proposes an update to error 
reporting in rustc. Its focus is to change the format of Rust From 12071f3f83e2195bdbc3fdd7e7b10564055ab4ee Mon Sep 17 00:00:00 2001 From: Steven Fackler Date: Fri, 15 Jul 2016 20:41:54 +0000 Subject: [PATCH 1005/1195] Rewrite to overload existing methods. --- text/0000-panic-safe-slicing.md | 105 ++++++++++++++++++-------------- 1 file changed, 60 insertions(+), 45 deletions(-) diff --git a/text/0000-panic-safe-slicing.md b/text/0000-panic-safe-slicing.md index 4794d491dee..c4e4b3e1bad 100644 --- a/text/0000-panic-safe-slicing.md +++ b/text/0000-panic-safe-slicing.md @@ -15,75 +15,90 @@ or `a[..end]`. This RFC proposes such methods to fill the gap. # Detailed design -Add `get_range`, `get_range_mut`, `get_range_unchecked`, `get_range_unchecked_mut` to `SliceExt`. - -`get_range` and `get_range_mut` may be implemented roughly as follows: - +Introduce a `SliceIndex` trait which is implemented by types which can index into a slice: ```rust -use std::ops::{RangeFrom, RangeTo, Range}; -use std::slice::from_raw_parts; -use core::slice::SliceExt; - -trait Rangeable { - fn start(&self, slice: &T) -> usize; - fn end(&self, slice: &T) -> usize; -} +pub trait SliceIndex { + type Output: ?Sized; -impl Rangeable for RangeFrom { - fn start(&self, _: &T) -> usize { self.start } - fn end(&self, slice: &T) -> usize { slice.len() } + fn get(self, slice: &[T]) -> Option<&Self::Output>; + fn get_mut(self, slice: &mut [T]) -> Option<&mut Self::Output>; + unsafe fn get_unchecked(self, slice: &[T]) -> &Self::Output; + unsafe fn get_mut_unchecked(self, slice: &[T]) -> &mut Self::Output; } -impl Rangeable for RangeTo { - fn start(&self, _: &T) -> usize { 0 } - fn end(&self, _: &T) -> usize { self.end } +impl SliceIndex for usize { + type Output = T; + // ... } -impl Rangeable for Range { - fn start(&self, _: &T) -> usize { self.start } - fn end(&self, _: &T) -> usize { self.end } +impl SliceIndex for R + where R: RangeArgument +{ + type Output = [T]; + // ... 
} +``` -trait GetRangeExt: SliceExt { - fn get_range>(&self, range: R) -> Option<&[Self::Item]>; -} +Alter the `Index`, `IndexMut`, `get`, `get_mut`, `get_unchecked`, and `get_mut_unchecked` +implementations to be generic over `SliceIndex`: +```rust +impl [T] { + pub fn get(&self, idx: I) -> Option + where I: SliceIndex + { + idx.get(self) + } -impl GetRangeExt for [T] { - fn get_range>(&self, range: R) -> Option<&[T]> { - let start = range.start(self); - let end = range.end(self); + pub fn get_mut(&mut self, idx: I) -> Option + where I: SliceIndex + { + idx.get_mut(self) + } - if start > end { return None; } - if end > self.len() { return None; } + pub unsafe fn get_unchecked(&self, idx: I) -> I::Output + where I: SliceIndex + { + idx.get_unchecked(self) + } - unsafe { Some(from_raw_parts(self.as_ptr().offset(start as isize), end - start)) } + pub unsafe fn get_mut_unchecked(&mut self, idx: I) -> I::Output + where I: SliceIndex + { + idx.get_mut_unchecked(self) } } -fn main() { - let a = [1, 2, 3, 4, 5]; +impl Index for [T] + where I: SliceIndex +{ + type Output = I::Output; - assert_eq!(a.get_range(1..), Some(&a[1..])); - assert_eq!(a.get_range(..3), Some(&a[..3])); - assert_eq!(a.get_range(2..5), Some(&a[2..5])); - assert_eq!(a.get_range(..6), None); - assert_eq!(a.get_range(4..2), None); + fn index(&self, idx: I) -> &I::Output { + self.get(idx).expect("out of bounds slice access") + } } -``` -`get_range_unchecked` and `get_range_unchecked_mut` should be the unchecked versions of the methods -above. +impl IndexMut for [T] + where I: SliceIndex +{ + fn index_mut(&self, idx: I) -> &mut I::Output { + self.get_mut(idx).expect("out of bounds slice access") + } +} +``` # Drawbacks -- Are these methods worth adding to `std`? Are such use cases common to justify such extention? +- The `SliceIndex` trait is unfortunate - it's tuned for exactly the set of methods it's used by. 
+ It only exists because inherent methods cannot be overloaded the same way that trait + implementations can be. It would most likely remain unstable indefinitely. # Alternatives - Stay as is. -- Could there be any other (and better!) total functions that serve the similar purpose? +- A previous version of this RFC introduced new `get_slice` etc methods rather than overloading + `get` etc. This avoids the utility trait but is somewhat less ergonomic. # Unresolved questions -- Naming, naming, naming: Is `get_range` the most suitable name? How about `get_slice`, or just - `slice`? Or any others? +None From 17e621b24516f0b14a5baaddcb36db618e4cedaa Mon Sep 17 00:00:00 2001 From: Steven Fackler Date: Fri, 15 Jul 2016 20:45:52 +0000 Subject: [PATCH 1006/1195] slicing -> indexing --- text/0000-panic-safe-slicing.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-panic-safe-slicing.md b/text/0000-panic-safe-slicing.md index c4e4b3e1bad..9ef20555aef 100644 --- a/text/0000-panic-safe-slicing.md +++ b/text/0000-panic-safe-slicing.md @@ -5,12 +5,12 @@ # Summary -Add "panic-safe" or "total" alternatives to the existing panicking slicing syntax. +Add "panic-safe" or "total" alternatives to the existing panicking indexing syntax. # Motivation `SliceExt::get` and `SliceExt::get_mut` can be thought as non-panicking versions of the simple -slicing syntax, `a[idx]`. However, there is no such equivalent for `a[start..end]`, `a[start..]`, +indexing syntax, `a[idx]`. However, there is no such equivalent for `a[start..end]`, `a[start..]`, or `a[..end]`. This RFC proposes such methods to fill the gap. 
# Detailed design From bcb006e3a191c525c8f4b4dc6259d1e0466ade21 Mon Sep 17 00:00:00 2001 From: Steven Fackler Date: Fri, 15 Jul 2016 20:51:24 +0000 Subject: [PATCH 1007/1195] Add an alternative around separate traits --- text/0000-panic-safe-slicing.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/text/0000-panic-safe-slicing.md b/text/0000-panic-safe-slicing.md index 9ef20555aef..79b109406a5 100644 --- a/text/0000-panic-safe-slicing.md +++ b/text/0000-panic-safe-slicing.md @@ -98,6 +98,10 @@ impl IndexMut for [T] - Stay as is. - A previous version of this RFC introduced new `get_slice` etc methods rather than overloading `get` etc. This avoids the utility trait but is somewhat less ergonomic. +- Instead of one trait amalgamating all of the required methods, we could have one trait per + method. This would open a more reasonable door to stabilizing those traits, but adds quite a lot + more surface area. Replacing an unstable `SliceIndex` trait with a collection would be + backwards compatible. 
# Unresolved questions From 7de9b49878e92b37b30b99dfc84f85450d4c09f7 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 15 Jul 2016 14:03:14 -0700 Subject: [PATCH 1008/1195] Actually add RFC 1522 --- text/1522-conservative-impl-trait.md | 545 +++++++++++++++++++++++++++ 1 file changed, 545 insertions(+) create mode 100644 text/1522-conservative-impl-trait.md diff --git a/text/1522-conservative-impl-trait.md b/text/1522-conservative-impl-trait.md new file mode 100644 index 00000000000..0682c85f358 --- /dev/null +++ b/text/1522-conservative-impl-trait.md @@ -0,0 +1,545 @@ +- Feature Name: conservative_impl_trait +- Start Date: 2016-01-31 +- RFC PR: https://github.com/rust-lang/rfcs/pull/1522 +- Rust Issue: https://github.com/rust-lang/rust/issues/34511 + +# Summary +[summary]: #summary + +Add a conservative form of abstract return types, aka `impl Trait`, +that will be compatible with most possible future extensions by +initially being restricted to: + +- Only free-standing or inherent functions. +- Only return type position of a function. + +Abstract return types allow a function to hide a concrete return +type behind a trait interface similar to trait objects, while +still generating the same statically dispatched code as with concrete types. + +With the placeholder syntax used in discussions so far, +abstract return types would be used roughly like this: + +```rust +fn foo(n: u32) -> impl Iterator { + (0..n).map(|x| x * 100) +} +// ^ behaves as if it had return type Map, Closure> +// where Closure = type of the |x| x * 100 closure. + +for x in foo(10) { + // x = 0, 100, 200, ... +} + +``` + +# Background + +There has been much discussion around the `impl Trait` feature already, with +different proposals extending the core idea into different directions: + +- The [original proposal](https://github.com/rust-lang/rfcs/pull/105). +- A [blog post](http://aturon.github.io/blog/2015/09/28/impl-trait/) reviving + the proposal and further exploring the design space. 
+- A [more recent proposal](https://github.com/rust-lang/rfcs/pull/1305) with a + substantially more ambitious scope. + +This RFC is an attempt to make progress on the feature by proposing a minimal +subset that should be forwards-compatible with a whole range of extensions that +have been discussed (and will be reviewed in this RFC). However, even this small +step requires resolving some of the core questions raised in +[the blog post](http://aturon.github.io/blog/2015/09/28/impl-trait/). + +This RFC is closest in spirit to the +[original RFC](https://github.com/rust-lang/rfcs/pull/105), and we'll repeat +its motivation and some other parts of its text below. + +# Motivation +[motivation]: #motivation + +> Why are we doing this? What use cases does it support? What is the expected outcome? + +In today's Rust, you can write a function signature like + +````rust +fn consume_iter_static>(iter: I) +fn consume_iter_dynamic(iter: Box>) +```` + +In both cases, the function does not depend on the exact type of the argument. +The type is held "abstract", and is assumed only to satisfy a trait bound. + +* In the `_static` version using generics, each use of the function is + specialized to a concrete, statically-known type, giving static dispatch, inline + layout, and other performance wins. + +* In the `_dynamic` version using trait objects, the concrete argument type is + only known at runtime using a vtable. + +On the other hand, while you can write + +````rust +fn produce_iter_dynamic() -> Box> +```` + +you _cannot_ write something like + +````rust +fn produce_iter_static() -> Iterator +```` + +That is, in today's Rust, abstract return types can only be written using trait +objects, which can be a significant performance penalty. This RFC proposes +"unboxed abstract types" as a way of achieving signatures like +`produce_iter_static`. Like generics, unboxed abstract types guarantee static +dispatch and inline data layout. 
+ +Here are some problems that unboxed abstract types solve or mitigate: + +* _Returning unboxed closures_. Closure syntax generates an anonymous type + implementing a closure trait. Without unboxed abstract types, there is no way + to use this syntax while returning the resulting closure unboxed, because there + is no way to write the name of the generated type. + +* _Leaky APIs_. Functions can easily leak implementation details in their return + type, when the API should really only promise a trait bound. For example, a + function returning `Rev>` is revealing exactly how the iterator + is constructed, when the function should only promise that it returns _some_ + type implementing `Iterator`. Using newtypes/structs with private fields + helps, but is extra work. Unboxed abstract types make it as easy to promise only + a trait bound as it is to return a concrete type. + +* _Complex types_. Use of iterators in particular can lead to huge types: + + ````rust + Chain>>>, SkipWhile<'a, u16, Map<'a, &u16, u16, slice::Items>>> + ```` + + Even when using newtypes to hide the details, the type still has to be written + out, which can be very painful. Unboxed abstract types only require writing the + trait bound. + +* _Documentation_. In today's Rust, reading the documentation for the `Iterator` + trait is needlessly difficult. Many of the methods return new iterators, but + currently each one returns a different type (`Chain`, `Zip`, `Map`, `Filter`, + etc), and it requires drilling down into each of these types to determine what + kind of iterator they produce. + +In short, unboxed abstract types make it easy for a function signature to +promise nothing more than a trait bound, and do not generally require the +function's author to write down the concrete type implementing the bound. 
+ +# Detailed design +[design]: #detailed-design + +As explained at the start of the RFC, the focus here is a relatively narrow +introduction of abstract types limited to the return type of inherent methods +and free functions. While we still need to resolve some of the core questions +about what an "abstract type" means even in these cases, we avoid some of the +complexities that come along with allowing the feature in other locations or +with other extensions. + +## Syntax + +Let's start with the bikeshed: The proposed syntax is `impl Trait` in return type +position, composing like trait objects to forms like `impl Foo+Send+'a`. + +It can be explained as "a type that implements `Trait`", +and has been used in that form in most earlier discussions and proposals. + +Initial versions of this RFC proposed `@Trait` for brevity reasons, +since the feature is supposed to be used commonly once implemented, +but due to strong negative reactions by the community this has been +changed back to the current form. + +There are other possibilities, like `abstract Trait` or `~Trait`, with +good reasons for or against them, but since the concrete choice of syntax +is not a blocker for the implementation of this RFC, it is intended for +a possible follow-up RFC to address syntax changes if needed. + +## Semantics + +The core semantics of the feature is described below. + +Note that the sections after this one go into more detail on some of the design +decisions, and that **it is likely for many of the mentioned limitations to be +lifted at some point in the future**. For clarity, we'll separately categories the *core +semantics* of the feature (aspects that would stay unchanged with future extensions) +and the *initial limitations* (which are likely to be lifted later). + +**Core semantics**: + +- If a function returns `impl Trait`, its body can return values of any type that + implements `Trait`, but all return values need to be of the same type. 
+ +- As far as the typesystem and the compiler is concerned, the return type + outside of the function would not be a entirely "new" type, nor would it be a + simple type alias. Rather, its semantics would be very similar to that of + _generic type paramters_ inside a function, with small differences caused by + being an _output_ rather than an _input_ of the function. + + - The type would be known to implement the specified traits. + - The type would not be known to implement any other trait, with + the exception of OIBITS (aka "auto traits") and default traits like `Sized`. + - The type would not be considered equal to the actual underlying type. + - The type would not be allowed to appear as the Self type for an `impl` block. + +- Because OIBITS like `Send` and `Sync` will leak through an abstract return + type, there will be some additional complexity in the compiler due to some + non-local type checking becoming necessary. + +- The return type has an identity based on all generic parameters the + function body is parametrized by, and by the location of the function + in the module system. This means type equality behaves like this: + + ```rust + fn foo(t: T) -> impl Trait { + t + } + + fn bar() -> impl Trait { + 123 + } + + fn equal_type(a: T, b: T) {} + + equal_type(bar(), bar()); // OK + equal_type(foo::(0), foo::(0)); // OK + equal_type(bar(), foo::(0)); // ERROR, `impl Trait {bar}` is not the same type as `impl Trait {foo}` + equal_type(foo::(false), foo::(0)); // ERROR, `impl Trait {foo}` is not the same type as `impl Trait {foo}` + ``` + +- The code generation passes of the compiler would not draw a distinction + between the abstract return type and the underlying type, just like they don't + for generic paramters. This means: + - The same trait code would be instantiated, for example, `-> impl Any` + would return the type id of the underlying type. + - Specialization would specialize based on the underlying type. 
+ +**Initial limitations**: + +- `impl Trait` may only be written within the return type of a freestanding or + inherent-impl function, not in trait definitions or any non-return type position. They may also not appear + in the return type of closure traits or function pointers, + unless these are themself part of a legal return type. + + - Eventually, we will want to allow the feature to be used within traits, and + like in argument position as well (as an ergonomic improvement over today's generics). + - Using `impl Trait` multiple times in the same return type would be valid, + like for example in `-> (impl Foo, impl Bar)`. + +- The type produced when a function returns `impl Trait` would be effectively + unnameable, just like closures and function items. + + - We will almost certainly want to lift this limitation in the long run, so + that abstract return types can be placed into structs and so on. There are a + few ways we could do so, all related to getting at the "output type" of a + function given all of its generic arguments. + +- The function body cannot see through its own return type, so code like this + would be forbidden just like on the outside: + + ```rust + fn sum_to(n: u32) -> impl Display { + if n == 0 { + 0 + } else { + n + sum_to(n - 1) + } + } + ``` + + - It's unclear whether we'll want to lift this limitation, but it should be possible to do so. + +## Rationale + +### Why this semantics for the return type? + +There has been a lot of discussion about what the semantics of the return type +should be, with the theoretical extremes being "full return type inference" and +"fully abstract type that behaves like a autogenerated newtype wrapper". (This +was in fact the main focus of the +[blog post](http://aturon.github.io/blog/2015/09/28/impl-trait/) on `impl +Trait`.) + +The design as choosen in this RFC lies somewhat in between those two, since it +allows OIBITs to leak through, and allows specialization to "see" the full type +being returned. 
That is, `impl Trait` does not attempt to be a "tightly sealed" +abstraction boundary. The rationale for this design is a mixture of pragmatics +and principles. + +#### Specialization transparency + +**Principles for specialization transparency**: + +The [specialization RFC](https://github.com/rust-lang/rfcs/pull/1210) has given +us a basic principle for how to understand bounds in function generics: they +represent a *minimum* contract between the caller and the callee, in that the +caller must meet at least those bounds, and the callee must be prepared to work +with any type that meets at least those bounds. However, with specialization, +the callee may choose different behavior when additional bounds hold. + +This RFC abides by a similar interpretation for return types: the signature +represents the minimum bound that the callee must satisfy, and the caller must +be prepared to work with any type that meets at least that bound. Again, with +specialization, the caller may dispatch on additional type information beyond +those bounds. + +In other words, to the extent that returning `impl Trait` is intended to be +symmetric with taking a generic `T: Trait`, transparency with respect to +specialization maintains that symmetry. + +**Pragmatics for specialization transparency**: + +The practical reason we want `impl Trait` to be transparent to specialization is the +same as the reason we want specialization in the first place: to be able to +break through abstractions with more efficient special-case code. + +This is particularly important for one of the primary intended usecases: +returning `impl Iterator`. We are very likely to employ specialization for various +iterator types, and making the underlying return type invisible to +specialization would lose out on those efficiency wins. + +#### OIBIT transparency + +OIBITs leak through an abstract return type. 
This might be considered controversial, since +it effectively opens a channel where the result of function-local type inference affects +item-level API, but has been deemed worth it for the following reasons: + +- Ergonomics: Trait objects already have the issue of explicitly needing to + declare `Send`/`Sync`-ability, and not extending this problem to abstract + return types is desireable. In practice, most uses of this feature would have + to add explicit bounds for OIBITS if they wanted to be maximally usable. + +- Low real change, since the situation already somewhat exists on structs with private fields: + - In both cases, a change to the private implementation might change whether a OIBIT is + implemented or not. + - In both cases, the existence of OIBIT impls is not visible without doc tools + - In both cases, you can only assert the existence of OIBIT impls + by adding explicit trait bounds either to the API or to the crate's testsuite. + +In fact, a large part of the point of OIBITs in the first place was to cut +across abstraction barriers and provide information about a type without the +type's author having to explicitly opt in. + +This means, however, that it has to be considered a silent breaking change to +change a function with a abstract return type in a way that removes OIBIT impls, +which might be a problem. (As noted above, this is already the case for `struct` +definitions.) + +But since the number of used OIBITs is relatvly small, deducing the return type +in a function body and reasoning about whether such a breakage will occur has +been deemed as a manageable amount of work. + +#### Wherefore type abstraction? + +In the [most recent RFC](https://github.com/rust-lang/rfcs/pull/1305) related to +this feature, a more "tightly sealed" abstraction mechanism was +proposed. However, part of the discussion on specialization centered on +precisely the issue of what type abstraction provides and how to achieve it. 
A +particular salient point there is that, in Rust, *privacy* is already our +primary mechanism for hiding +(["privacy is the new parametricity"](https://github.com/rust-lang/rfcs/pull/1210#issuecomment-181992044)). In +practice, that means that if you want opacity against specialization, you should +use something like a newtype. + +### Anonymity + +A abstract return type cannot be named in this proposal, which means that it +cannot be placed into `structs` and so on. This is not a fundamental limitation +in any sense; the limitation is there both to keep this RFC simple, and because +the precise way we might want to allow naming of such types is still a bit +unclear. Some possibilities include a `typeof` operator, or explicit named +abstract types. + +### Limitation to only return type position + +There have been various proposed additional places where abstract types +might be usable. For example, `fn x(y: impl Trait)` as shorthand for +`fn x<T: Trait>(y: T)`. + +Since the exact semantics and user experience for these locations are yet +unclear (`impl Trait` would effectively behave completely different before and after +the `->`), this has also been excluded from this proposal. + +### Type transparency in recursive functions + +Functions with abstract return types can not see through their own return type, +making code like this not compile: + +```rust +fn sum_to(n: u32) -> impl Display { + if n == 0 { + 0 + } else { + n + sum_to(n - 1) + } +} +``` + +This limitation exists because it is not clear how much a function body +can and should know about different instantiations of itself. + +It would be safe to allow recursive calls if the set of generic parameters +is identical, and it might even be safe if the generic parameters are different, +since you would still be inside the private body of the function, just +differently instantiated. + +But variance caused by lifetime parameters and the interaction with +specialization makes it uncertain whether this would be sound.
+ +In any case, it can be initially worked around by defining a local helper function like this: + +```rust +fn sum_to(n: u32) -> impl Display { + fn sum_to_(n: u32) -> u32 { + if n == 0 { + 0 + } else { + n + sum_to_(n - 1) + } + } + sum_to_(n) +} +``` + +### Not legal in function pointers/closure traits + +Because `impl Trait` defines a type tied to the concrete function body, +it does not make much sense to talk about it separately in a function signature, +so the syntax is forbidden there. + +### Compability with conditional trait bounds + +On valid critique for the existing `impl Trait` proposal is that it does not +cover more complex scenarios, where the return type would implement +one or more traits depending on whether a type parameter does so with another. + +For example, a iterator adapter might want to implement `Iterator` and +`DoubleEndedIterator`, depending on whether the adapted one does: + +```rust +fn skip_one<I>(i: I) -> SkipOne<I> { ... } +struct SkipOne<I> { ... } +impl<I: Iterator> Iterator for SkipOne<I> { ... } +impl<I: DoubleEndedIterator> DoubleEndedIterator for SkipOne<I> { ... } +``` + +Using just `-> impl Iterator`, this would not be possible to reproduce. + +Since there has been no proposals so far that would address this in a way +that would conflict with the fixed-trait-set case, this RFC punts on that issue as well. + +### Limitation to free/inherent functions + +One important usecase of abstract return types is to use them in trait methods. + +However, there is an issue with this, namely that in combinations with generic +trait methods, they are effectively equivalent to higher kinded types. +Which is an issue because Rust HKT story is not yet figured out, so +any "accidential implementation" might cause unintended fallout. + +HKT allows you to be generic over a type constructor, aka a +"thing with type parameters", and then instantiate them at some later point to +get the actual type.
+For example, given a HK type `T` that takes one type as parameter, you could +write code that uses `T<u32>` or `T<bool>` without caring about +whether `T = Vec`, `T = Box`, etc. + +Now if we look at abstract return types, we have a similar situation: + +```rust +trait Foo { + fn bar<U>() -> impl Baz +} +``` + +Given a `T: Foo`, we could instantiate `T::bar::<u32>` or `T::bar::<bool>`, +and could get arbitrary different return types of `bar` instantiated +with a `u32` or `bool`, +just like `T<u32>` and `T<bool>` might give us `Vec<u32>` or `Box<bool>` +in the example above. + +The problem does not exists with trait method return types today because +they are concrete: + +```rust +trait Foo { + fn bar<U>() -> X<U> +} +``` + +Given the above code, there is no way for `bar` to choose a return type `X` +that could fundamentally differ between instantiations of `Self` +while still being instantiable with an arbitrary `U`. + +At most you could return a associated type, but then you'd loose the generics +from `bar` + +```rust +trait Foo { + type X; + fn bar<U>() -> Self::X // No way to apply U +} +``` + +So, in conclusion, since Rusts HKT story is not yet fleshed out, +and the compatibility of the current compiler with it is unknown, +it is not yet possible to reach a concrete solution here. + +In addition to that, there are also different proposals as to whether +a abstract return type is its own thing or sugar for a associated type, +how it interacts with other associated items and so on, +so forbidding them in traits seems like the best initial course of action. + +# Drawbacks +[drawbacks]: #drawbacks + +> Why should we *not* do this? + +## Drawbacks due to the proposal's minimalism + +As has been elaborated on above, there are various way this feature could be +extended and combined with the language, so implementing it might cause issues +down the road if limitations or incompatibilities become apparent.
However, +variations of this RFC's proposal have been under discussion for quite a long +time at this point, and this proposal is carefully designed to be +future-compatible with them, while resolving the core issue around transparency. + +A drawback of limiting the feature to return type position (and not arguments) +is that it creates a somewhat inconsistent mental model: it forces you to +understand the feature in a highly special-cased way, rather than as a general +way to talk about unknown-but-bounded types in function signatures. This could +be particularly bewildering to newcomers, who must choose between `T: Trait`, +`Box<Trait>`, and `impl Trait`, with the latter only usable in one place. + +## Drawbacks due to partial transparency + +The fact that specialization and OIBITs can "see through" `impl Trait` may be +surprising, to the extent that one wants to see `impl Trait` as an abstraction +mechanism. However, as the RFC argued in the rationale section, this design is +probably the most consistent with our existing post-specialization abstraction +mechanisms, and lead to the relatively simple story that *privacy* is the way to +achieve hiding in Rust. + +# Alternatives +[alternatives]: #alternatives + +> What other designs have been considered? What is the impact of not doing this? + +See the links in the motivation section for detailed analysis that we won't +repeat here. + +But basically, without this feature certain things remain hard or impossible to do +in Rust, like returning a efficiently usable type parametricised by +types private to a function body, for example an iterator adapter containing a closure. + +# Unresolved questions +[unresolved]: #unresolved-questions + +> What parts of the design are still TBD?
+ +The precise implementation details for OIBIT transparency are a bit unclear: in +general, it means that type checking may need to proceed in a particular order, +since you cannot get the full type information from the signature alone (you +have to typecheck the function body to determine which OIBITs apply). From ce16e773888879c74ad9e7fb7d33e39bc2a6820e Mon Sep 17 00:00:00 2001 From: Diggory Blake Date: Sun, 17 Jul 2016 18:33:09 +0100 Subject: [PATCH 1009/1195] Remove "explanatory" alternatives and flesh out a possible command line interface --- text/0000-windows-subsystem.md | 61 ++++++++++++++++++++-------------- 1 file changed, 36 insertions(+), 25 deletions(-) diff --git a/text/0000-windows-subsystem.md b/text/0000-windows-subsystem.md index 723982a89a5..06b3ecc6c47 100644 --- a/text/0000-windows-subsystem.md +++ b/text/0000-windows-subsystem.md @@ -9,7 +9,8 @@ Rust programs compiled for windows will always flash up a console window on startup. This behavior is controlled via the `SUBSYSTEM` parameter passed to the linker, and so *can* be overridden with specific compiler flags. However, doing -so will bypass the rust-specific initialization code in `libstd`. +so will bypass the rust-specific initialization code in `libstd`, as the entry +point must be named `WinMain`. This RFC proposes supporting this case explicitly, allowing `libstd` to continue to be initialized correctly. @@ -33,7 +34,7 @@ This is unsafe, and will skip the initialization code in `libstd`. [design]: #detailed-design When an executable is linked while compiling for a windows target, it will be -linked for a specific *Subsystem*. The subsystem determines how the operating +linked for a specific *subsystem*. The subsystem determines how the operating system will run the executable, and will affect the execution environment of the program. @@ -91,36 +92,46 @@ whichever linker is actually being used. 
# Alternatives [alternatives]: #alternatives -- Emit either `WinMain` or `main` from `libstd` based on `cfg` options. +- Only emit one of either `WinMain` or `main` from `rustc` based on a new + command line option. - This has the advantage of not requiring changes to `rustc`, but is something - of a non-starter since it requires a version of `libstd` for each subsystem. + This command line option would only be applicable when compiling an + executable, and only for windows platforms. No other supported platforms + require a different entry point or additional linker arguments for programs + designed to run with a graphical user interface. -- Emit either `WinMain` or `main` from `rustc` based on `cfg` options. + `rustc` will react to this command line option by changing the exported + name of the entry point to `WinMain`, and passing additional arguments to + the linker to configure the correct subsystem. A mismatch here would result + in linker errors. - This would not require different versions of `libstd`, but it would require - recompiling all other crates depending on the value of the `cfg` option. + A similar option would need to be added to `Cargo.toml` to make usage as + simple as possible. -- Emit either `WinMain` or `main` from `rustc` based on a new command line - option. + There's some bike-shedding which can be done one the exact command line + interface, but one possible option is shown below. - Assuming the command line option need only be specified when compiling the - executable itself, the dependencies would not need to be recompiled were the - subsystem to change. + Rustc usage: + `rustc foo.rs --crate-subsystem windows` - Choosing to emit one or the other means that the compiler and linker must - agree on the subsystem, or else you'll get linker errors. If `rustc` only - specified a `subsystem` to the linker if the option is passed, this would be - a fully backwards compatible change. + Cargo.toml + ```toml + [package] + # ... 
- A compiler option is probably desirable in addition to this RFC, but it will - require bike-shedding on the new command line interface, and changes to rustc - to be able to pass on the correct linker flags. + [[bin]] + name = "foo" + path = "src/foo.rs" + subsystem = "windows" + ``` - A similar option would need to be added to `Cargo.toml` to make usage as simple - as possible. + The `crate-subsystem` command line option would exist on all platforms, + but would be ignored when compiling for a non-windows target, so as to + support cross-compiling. If not compiling a binary crate, specifying the + option is an error regardless of the target. -- Add a `subsystem` function to determine which subsystem was used at runtime. +- Export both entry points as described in this RFC, but also add a `subsystem` + function to `libstd` determine which subsystem was used at runtime. The `WinMain` function would first set an internal flag, and only then delegate to the `main` function. @@ -137,8 +148,8 @@ whichever linker is actually being used. an incorrect value if the initialization was skipped, such as if used as a library from an executable written in another language. -- Use the undocumented MSVC equivalent to weak symbols to avoid breaking - existing code. +- Export both entry points as described in this RFC, but use the undocumented + MSVC equivalent to weak symbols to avoid breaking existing code. The parameter `/alternatename:_WinMain@16=_RustWinMain@16` can be used to export `WinMain` only if it is not also exported elsewhere. 
This is completely From a2295286991d19ac63410e88cc95cd9500faad98 Mon Sep 17 00:00:00 2001 From: Jake Goulding Date: Sat, 16 Jul 2016 23:09:04 -0400 Subject: [PATCH 1010/1195] Correct typos in trait impl --- text/1522-conservative-impl-trait.md | 32 ++++++++++++++-------------- 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/text/1522-conservative-impl-trait.md b/text/1522-conservative-impl-trait.md index 0682c85f358..20c23428fef 100644 --- a/text/1522-conservative-impl-trait.md +++ b/text/1522-conservative-impl-trait.md @@ -6,9 +6,9 @@ # Summary [summary]: #summary -Add a conservative form of abstract return types, aka `impl Trait`, -that will be compatible with most possible future extensions by -initially being restricted to: +Add a conservative form of abstract return types, also known as `impl +Trait`, that will be compatible with most possible future extensions +by initially being restricted to: - Only free-standing or inherent functions. - Only return type position of a function. @@ -175,7 +175,7 @@ and the *initial limitations* (which are likely to be lifted later). - As far as the typesystem and the compiler is concerned, the return type outside of the function would not be a entirely "new" type, nor would it be a simple type alias. Rather, its semantics would be very similar to that of - _generic type paramters_ inside a function, with small differences caused by + _generic type parameters_ inside a function, with small differences caused by being an _output_ rather than an _input_ of the function. - The type would be known to implement the specified traits. @@ -189,7 +189,7 @@ and the *initial limitations* (which are likely to be lifted later). non-local type checking becoming necessary. - The return type has an identity based on all generic parameters the - function body is parametrized by, and by the location of the function + function body is parameterized by, and by the location of the function in the module system. 
This means type equality behaves like this: ```rust @@ -211,7 +211,7 @@ and the *initial limitations* (which are likely to be lifted later). - The code generation passes of the compiler would not draw a distinction between the abstract return type and the underlying type, just like they don't - for generic paramters. This means: + for generic parameters. This means: - The same trait code would be instantiated, for example, `-> impl Any` would return the type id of the underlying type. - Specialization would specialize based on the underlying type. @@ -221,7 +221,7 @@ and the *initial limitations* (which are likely to be lifted later). - `impl Trait` may only be written within the return type of a freestanding or inherent-impl function, not in trait definitions or any non-return type position. They may also not appear in the return type of closure traits or function pointers, - unless these are themself part of a legal return type. + unless these are themselves part of a legal return type. - Eventually, we will want to allow the feature to be used within traits, and like in argument position as well (as an ergonomic improvement over today's generics). @@ -262,7 +262,7 @@ was in fact the main focus of the [blog post](http://aturon.github.io/blog/2015/09/28/impl-trait/) on `impl Trait`.) -The design as choosen in this RFC lies somewhat in between those two, since it +The design as chosen in this RFC lies somewhat in between those two, since it allows OIBITs to leak through, and allows specialization to "see" the full type being returned. That is, `impl Trait` does not attempt to be a "tightly sealed" abstraction boundary. The rationale for this design is a mixture of pragmatics @@ -308,15 +308,15 @@ item-level API, but has been deemed worth it for the following reasons: - Ergonomics: Trait objects already have the issue of explicitly needing to declare `Send`/`Sync`-ability, and not extending this problem to abstract - return types is desireable. 
In practice, most uses of this feature would have + return types is desirable. In practice, most uses of this feature would have to add explicit bounds for OIBITS if they wanted to be maximally usable. - Low real change, since the situation already somewhat exists on structs with private fields: - In both cases, a change to the private implementation might change whether a OIBIT is implemented or not. - - In both cases, the existence of OIBIT impls is not visible without doc tools + - In both cases, the existence of OIBIT impls is not visible without documentation tools - In both cases, you can only assert the existence of OIBIT impls - by adding explicit trait bounds either to the API or to the crate's testsuite. + by adding explicit trait bounds either to the API or to the crate's test suite. In fact, a large part of the point of OIBITs in the first place was to cut across abstraction barriers and provide information about a type without the @@ -327,7 +327,7 @@ change a function with a abstract return type in a way that removes OIBIT impls, which might be a problem. (As noted above, this is already the case for `struct` definitions.) -But since the number of used OIBITs is relatvly small, deducing the return type +But since the number of used OIBITs is relatively small, deducing the return type in a function body and reasoning about whether such a breakage will occur has been deemed as a manageable amount of work. @@ -409,7 +409,7 @@ Because `impl Trait` defines a type tied to the concrete function body, it does not make much sense to talk about it separately in a function signature, so the syntax is forbidden there. -### Compability with conditional trait bounds +### Compatibility with conditional trait bounds On valid critique for the existing `impl Trait` proposal is that it does not cover more complex scenarios, where the return type would implement @@ -437,7 +437,7 @@ One important usecase of abstract return types is to use them in trait methods. 
However, there is an issue with this, namely that in combinations with generic trait methods, they are effectively equivalent to higher kinded types. Which is an issue because Rust HKT story is not yet figured out, so -any "accidential implementation" might cause unintended fallout. +any "accidental implementation" might cause unintended fallout. HKT allows you to be generic over a type constructor, aka a "thing with type parameters", and then instantiate them at some later point to @@ -531,13 +531,13 @@ See the links in the motivation section for detailed analysis that we won't repeat here. But basically, without this feature certain things remain hard or impossible to do -in Rust, like returning a efficiently usable type parametricised by +in Rust, like returning a efficiently usable type parameterized by types private to a function body, for example an iterator adapter containing a closure. # Unresolved questions [unresolved]: #unresolved-questions -> What parts of the design are still TBD? +> What parts of the design are still to be determined? The precise implementation details for OIBIT transparency are a bit unclear: in general, it means that type checking may need to proceed in a particular order, From b89530e607f11fa5031dd075f5c8f37502c20137 Mon Sep 17 00:00:00 2001 From: Diggory Blake Date: Sun, 17 Jul 2016 23:53:11 +0100 Subject: [PATCH 1011/1195] Expand on GNU toolchain specifics --- text/0000-windows-subsystem.md | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/text/0000-windows-subsystem.md b/text/0000-windows-subsystem.md index 06b3ecc6c47..778b51fb3b8 100644 --- a/text/0000-windows-subsystem.md +++ b/text/0000-windows-subsystem.md @@ -9,8 +9,8 @@ Rust programs compiled for windows will always flash up a console window on startup. This behavior is controlled via the `SUBSYSTEM` parameter passed to the linker, and so *can* be overridden with specific compiler flags. 
However, doing -so will bypass the rust-specific initialization code in `libstd`, as the entry -point must be named `WinMain`. +so will bypass the rust-specific initialization code in `libstd`, as when using +the MSVC toolchain, the entry point must be named `WinMain`. This RFC proposes supporting this case explicitly, allowing `libstd` to continue to be initialized correctly. @@ -22,7 +22,7 @@ The `WINDOWS` subsystem is commonly used on windows: desktop applications typically do not want to flash up a console window on startup. Currently, using the `WINDOWS` subsystem from rust is undocumented, and the -process is non-trivial: +process is non-trivial when targeting the MSVC toolchain: A new symbol `pub extern "system" WinMain(...)` with specific argument and return types must be declared, which will become the new entry point for @@ -30,6 +30,8 @@ the program. This is unsafe, and will skip the initialization code in `libstd`. +The GNU toolchain will accept either entry point. + # Detailed design [design]: #detailed-design @@ -130,6 +132,17 @@ whichever linker is actually being used. support cross-compiling. If not compiling a binary crate, specifying the option is an error regardless of the target. +- Have `rustc` override the entry point when calling `link.exe`, and tell it to + use `mainCRTStartup` instead of `winMainCRTStartup`. These are the "true" + entry points of windows programs, which first initialize the C runtime + library, and then call `main` or `WinMain` respectively. + + This is the simplest solution, and it will not have any serious backwards + compatibility problems, since rust programs are already required to have a + `main` function, even if `WinMain` has been separately defined. However, it + relies on the two CRT functions to be interchangeable, although this does + *appear* to be the case currently. 
+ - Export both entry points as described in this RFC, but also add a `subsystem` function to `libstd` determine which subsystem was used at runtime. From e6abfff15d485e39e32a99c5c4a0106d8a5ba05c Mon Sep 17 00:00:00 2001 From: Steven Fackler Date: Mon, 18 Jul 2016 21:09:45 +0200 Subject: [PATCH 1012/1195] Add a drawback about doc readability --- text/0000-panic-safe-slicing.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/text/0000-panic-safe-slicing.md b/text/0000-panic-safe-slicing.md index 79b109406a5..24e358e3add 100644 --- a/text/0000-panic-safe-slicing.md +++ b/text/0000-panic-safe-slicing.md @@ -92,6 +92,11 @@ impl IndexMut for [T] - The `SliceIndex` trait is unfortunate - it's tuned for exactly the set of methods it's used by. It only exists because inherent methods cannot be overloaded the same way that trait implementations can be. It would most likely remain unstable indefinitely. +- Documentation may suffer. Rustdoc output currently explicitly shows each of the ways you can + index a slice, while there will simply be a single generic implementation with this change. This + may not be that bad, though. The doc block currently seems to provided the most valuable + information to newcomers rather than the trait bound, and that will still be present with this + change. # Alternatives From e2981fca797bf27713f566a6d24320be53c65aa0 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 14 Jul 2016 17:52:29 -0700 Subject: [PATCH 1013/1195] RFC: Macros 1.1 Extract a very small sliver of today's procedural macro system in the compiler, just enough to get basic features like custom derive working, to have an eventually stable API. Ensure that these features will not pose a maintenance burden on the compiler but also don't try to provide enough features for the "perfect macro system" at the same time. Overall, this should be considered an incremental step towards an official "plugins 2.0". 
--- text/0000-macros-1.1.md | 552 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 552 insertions(+) create mode 100644 text/0000-macros-1.1.md diff --git a/text/0000-macros-1.1.md b/text/0000-macros-1.1.md new file mode 100644 index 00000000000..04b8e8cb2b3 --- /dev/null +++ b/text/0000-macros-1.1.md @@ -0,0 +1,552 @@ +- Feature Name: `rustc_macros` +- Start Date: 2016-07-14 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Extract a very small sliver of today's procedural macro system in the compiler, +just enough to get basic features like custom derive working, to have an +eventually stable API. Ensure that these features will not pose a maintenance +burden on the compiler but also don't try to provide enough features for the +"perfect macro system" at the same time. Overall, this should be considered an +incremental step towards an official "macros 2.0". + +# Motivation +[motivation]: #motivation + +Some large projects in the ecosystem today, such as [serde] and [diesel], +effectively require the nightly channel of the Rust compiler. Although most +projects have an alternative to work on stable Rust, this tends to be far less +ergonomic and comes with its own set of downsides, and empirically it has not +been enough to push the nightly users to stable as well. + +[serde]: https://github.com/serde-rs/serde +[diesel]: http://diesel.rs/ + +These large projects, however, are often the face of Rust to external users. +Common knowledge is that fast serialization is done using serde, but to others +this just sounds likes "fast Rust needs nightly". Over time this persistent +thought process creates a culture of "well to be serious you require nightly" +and a general feeling that Rust is not "production ready". + +The good news, however, is that this class of projects which require nightly +Rust almost all require nightly for the reason of procedural macros. 
Even +better, the full functionality of procedural macros is rarely needed, only +custom derive! Even better, custom derive typically doesn't *require* the features +one would expect from a full-on macro system, such as hygiene and modularity, +that normal procedural macros typically do. The purpose of this RFC, as a +result, is to provide these crates a method of working on stable Rust with the +desired ergonomics one would have on nightly otherwise. + +Unfortunately today's procedural macros are not without their architectural +shortcomings as well. For example they're defined and imported with arcane +syntax and don't participate in hygiene very well. To address these issues, +there are a number of RFCs to develop a "macros 2.0" story: + +* [Changes to name resolution](https://github.com/rust-lang/rfcs/pull/1560) +* [Macro naming and modularisation](https://github.com/rust-lang/rfcs/pull/1561) +* [Procedural macros](https://github.com/rust-lang/rfcs/pull/1566) +* [Macros by example 2.0](https://github.com/rust-lang/rfcs/pull/1584) + +Many of these designs, however, will require a significant amount of work to not +only implement but also a significant amount of work to stabilize. The current +understanding is that these improvements are on the time scale of years, whereas +the problem of nightly Rust is today! + +As a result, it is an explicit non-goal of this RFC to architecturally improve +on the current procedural macro system. The drawbacks of today's procedural +macros will be the same as those proposed in this RFC. The major goal here is +to simply minimize the exposed surface area between procedural macros and the +compiler to ensure that the interface is well defined and can be stably +implemented in future versions of the compiler as well. + +Put another way, we currently have macros 1.0 unstable today, we're shooting +for macros 2.0 stable in the far future, but this RFC is striking a middle +ground at macros 1.1 today! 
+ +# Detailed design +[design]: #detailed-design + +First, before looking at how we're going to expose procedural macros, let's +take a detailed look at how they work today. + +### Today's procedural macros + +A procedural macro today is loaded into a crate with the `#![plugin(foo)]` +annotation at the crate root. This in turn looks for a crate named `foo` [via +the same crate loading mechanisms][loader] as `extern crate`, except [with the +restriction][host-restriction] that the target triple of the crate must be the +same as the target the compiler was compiled for. In other words, if you're on +x86 compiling to ARM, macros must also be compiled for x86. + +[loader]: https://github.com/rust-lang/rust/blob/78d49bfac2bbcd48de522199212a1209f498e834/src/librustc_metadata/creader.rs#L480 +[host-restriction]: https://github.com/rust-lang/rust/blob/78d49bfac2bbcd48de522199212a1209f498e834/src/librustc_metadata/creader.rs#L494 + +Once a crate is found, it's required to be a dynamic library as well, and once +that's all verified the compiler [opens it up with `dlopen`][dlopen] (or the +equivalent therein). After loading, the compiler will [look for a special +symbol][symbol] in the dynamic library, and then call it with a macro context. + +[dlopen]: https://github.com/rust-lang/rust/blob/78d49bfac2bbcd48de522199212a1209f498e834/src/librustc_plugin/load.rs#L124 +[symbol]: https://github.com/rust-lang/rust/blob/78d49bfac2bbcd48de522199212a1209f498e834/src/librustc_plugin/load.rs#L136-L139 + +So as we've seen macros are compiled as normal crates into dynamic libraries. +One function in the crate is tagged with `#[plugin_registrar]` which gets wired +up to this "special symbol" the compiler wants. When the function is called with +a macro context, it uses the passed in [plugin registry][registry] to register +custom macros, attributes, etc.
+ +[registry]: https://github.com/rust-lang/rust/blob/78d49bfac2bbcd48de522199212a1209f498e834/src/librustc_plugin/registry.rs#L30-L69 + +After a macro is registered, the compiler will then continue the normal process +of expanding a crate. Whenever the compiler encounters this macro it will call +this registration with essentially an AST and morally get back a different +AST to splice in or replace. + +### Today's drawbacks + +This expansion process suffers from many of the downsides mentioned in the +motivation section, such as a lack of hygiene, a lack of modularity, and the +inability to import macros as you would normally other functionality in the +module system. + +Additionally, though, it's essentially impossible to ever *stabilize* because +the interface to the compiler is... the compiler! We clearly want to make +changes to the compiler over time, so this isn't acceptable. To have a stable +interface we'll need to cut down this surface area *dramatically* to a curated +set of known-stable APIs. + +Somewhat more subtly, the technical ABI of procedural macros is also exposed +quite thinly today as well. The implementation detail of dynamic libraries, and +especially that both the compiler and the macro dynamically link to libraries +like libsyntax, cannot be changed. This precludes, for example, a completely +statically linked compiler (e.g. compiled for `x86_64-unknown-linux-musl`). +Another goal of this RFC will also be to hide as many of these technical +details as possible, allowing the compiler to flexibly change how it interfaces +to macros.
+ +## Macros 1.1 + +Ok, with the background knowledge of what procedural macros are today, let's +take a look at how we can solve the major problems blocking its stabilization: + +* Sharing an API of the entire compiler +* Frozen interface between the compiler and macros + +### `librustc_macro` + +Proposed in [RFC 1566](https://github.com/rust-lang/rfcs/pull/1566) and +described in [this blog post](http://ncameron.org/blog/libmacro/) the +distribution will now ship with a new `librustc_macro` crate available for macro +authors. The intention here is that the gory details of how macros *actually* +talk to the compiler is entirely contained within this one crate. The stable +interface to the compiler is then entirely defined in this crate, and we can +make it as small or large as we want. Additionally, like the standard library, +it can contain unstable APIs to test out new pieces of functionality over time. + +The initial implementation of `librustc_macro` is proposed to be *incredibly* +bare bones: + +```rust +#![crate_name = "macro"] + +pub struct TokenStream { + // ... +} + +#[derive(Debug)] +pub struct LexError { + // ... +} + +pub struct Context { + // ... +} + +impl TokenStream { + pub fn from_source(cx: &mut Context, + source: &str) -> Result<TokenStream, LexError> { + // ... + } + + pub fn to_source(&self, cx: &mut Context) -> String { + // ... + } +} +``` + +That is, there will only be a handful of exposed types and `TokenStream` can +only be converted to and from a `String`. Eventually the `TokenStream` type will +more closely resemble token streams [in the compiler +itself][compiler-tokenstream], and more fine-grained manipulations will be +available as well.
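
To make the "only converted to and from a `String`" contract concrete, here is a minimal, self-contained sketch. It is not the real `librustc_macro`: the types below are hypothetical stand-ins that reuse the names from the skeleton above, and the "lexer" is an assumed toy that splits on whitespace and rejects unbalanced braces. It only illustrates the shape of the round trip a macro author would rely on.

```rust
// Hypothetical stand-ins for the skeleton above; NOT the real
// `librustc_macro` implementation.
#[derive(Debug)]
pub struct LexError;

// Devoid of functionality, as the RFC describes for the initial version.
pub struct Context;

pub struct TokenStream {
    // Assumption for illustration: tokens are whitespace-separated strings.
    tokens: Vec<String>,
}

impl TokenStream {
    // Toy lexer: split on whitespace and reject unbalanced `{` / `}`.
    pub fn from_source(_cx: &mut Context, source: &str) -> Result<TokenStream, LexError> {
        let mut depth: i32 = 0;
        let mut tokens = Vec::new();
        for tok in source.split_whitespace() {
            match tok {
                "{" => depth += 1,
                "}" => {
                    depth -= 1;
                    if depth < 0 {
                        return Err(LexError);
                    }
                }
                _ => {}
            }
            tokens.push(tok.to_string());
        }
        if depth != 0 {
            return Err(LexError);
        }
        Ok(TokenStream { tokens: tokens })
    }

    // Stringify the stream again, normalizing whitespace between tokens.
    pub fn to_source(&self, _cx: &mut Context) -> String {
        self.tokens.join(" ")
    }
}

fn main() {
    let mut cx = Context;
    let stream = TokenStream::from_source(&mut cx, "struct   Foo { }").unwrap();
    println!("{}", stream.to_source(&mut cx));
    // Unbalanced input surfaces as a `LexError` rather than a panic.
    assert!(TokenStream::from_source(&mut cx, "struct Foo {").is_err());
}
```

The point of the sketch is that a macro author sees nothing but strings going in and out, which is what keeps the stabilized surface area small.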
+ +Additionally, the `Context` structure will initially be completely devoid +of functionality, but in the future it will be the entry point for [many other +features][macro20] one would expect in macros 2.0. + +[compiler-tokenstream]: https://github.com/rust-lang/rust/blob/master/src/libsyntax/tokenstream.rs#L323-L338 +[macro20]: http://ncameron.org/blog/libmacro/ + +### Defining a macro + +A new crate type will be added to the compiler, `rustc-macro` (described below), +indicating a crate that's compiled as a procedural macro. There will not be a +"registrar" function in this crate type (like there is today), but rather a +number of functions which act as token stream transformers to implement macro +functionality. + +A macro crate might look like: + +```rust +#![crate_type = "rustc-macro"] +#![crate_name = "double"] + +extern crate rustc_macro; + +use rustc_macro::{Context, TokenStream}; + +#[rustc_macro_derive(Double)] +pub fn double(cx: &mut Context, input: TokenStream) -> TokenStream { + let source = input.to_source(cx); + + // Parse `source` for struct/enum declaration, and then build up some new + // source code representing a number of items in the + // implementation of the `Double` trait for the struct/enum in question. + let source = derive_double(cx, source); + + // Parse this back to a token stream and return it + TokenStream::from_source(cx, &source).unwrap() +} +``` + +This new `rustc_macro_derive` attribute will be allowed inside of a +`rustc-macro` crate but disallowed in other crate types. It defines a new +`#[derive]` mode which can be used in a crate. The input here is the entire +struct that `#[derive]` was attached to, attributes and all. The output is +**expected to include the `struct`/`enum` itself** as well as any number of +items to be contextually "placed next to" the initial declaration. + +Again, though, there is no hygiene, it's as if the source was simply +copy/pasted.
All span information for the `TokenStream` structures returned by +`from_source` will point to the original `#[derive]` annotation. This means +that error messages related to struct definitions will get *worse* if they have +a custom derive attribute placed on them, because the entire struct's span will +get folded into the `#[derive]` annotation. Eventually, though, more span +information will be stable on the `TokenStream` type, so this is just a +temporary limitation. + +The `rustc_macro_derive` attribute requires the signature (similar to [macros +2.0][mac20sig]): + +[mac20sig]: http://ncameron.org/blog/libmacro/#tokenisingandquasiquoting + +```rust +fn(&mut Context, TokenStream) -> TokenStream +``` + +If a macro cannot process the input token stream, it is expected to panic for +now, although eventually it will call methods on `Context` to provide more +structured errors. The compiler will wrap up the panic message and display it +to the user appropriately. Eventually, however, `librustc_macro` will provide +more interesting methods of signaling errors to users. + +Customization of user-defined `#[derive]` modes can still be done through custom +attributes, although it will be required for `rustc_macro_derive` +implementations to remove these attributes when handing them back to the +compiler. The compiler will still gate unknown attributes by default. + +### `rustc-macro` crates + +Like the executable, staticlib, and cdylib crate types, the `rustc-macro` crate +type is intended to be a final product. What it *actually* produces is not +specified, but if a `-L` path is provided to it then the compiler will recognize +the output artifacts as a macro and it can be loaded for a program. + +Initially if a crate is compiled with the `rustc-macro` crate type (and possibly +others) it will forbid exporting any items in the crate other than those +functions tagged `#[rustc_macro_derive]` and those functions must also be placed +at the crate root. 
Finally, the compiler will automatically set the +`cfg(rustc_macro)` annotation whenever any crate type of a compilation is the +`rustc-macro` crate type. + +While these properties may seem a bit odd, they're intended to allow a number of +forwards-compatible extensions to be implemented in macros 2.0: + +* Macros eventually want to be imported from crates (e.g. `use foo::bar!`) and + limiting where `#[derive]` can be defined reduces the surface area for + possible conflict. +* Macro crates eventually want to be compiled to be available both at runtime + and at compile time. That is, an `extern crate foo` annotation may load + *both* a `rustc-macro` crate and a crate to link against, if they are + available. Limiting the public exports for now to only custom-derive + annotations should allow for maximal flexibility here. + +### Using a procedural macro + +Using a procedural macro will be very similar to today's `extern crate` system, +such as: + +```rust +extern crate double; + +#[derive(Double)] +pub struct Foo; + +fn main() { + // ... +} +``` + +That is, the `extern crate` directive will now also be enhanced to look for +crates compiled as `rustc-macro` in addition to those compiled as `dylib` and +`rlib`. Today this will be temporarily limited to finding *either* a +`rustc-macro` crate or an rlib/dylib pair compiled for the target, but this +restriction may be lifted in the future. + +The custom derive annotations loaded from `rustc-macro` crates today will all be +placed into the same global namespace. Any conflicts (shadowing) will cause the +compiler to generate an error, and it must be resolved by loading only one or +the other of the `rustc-macro` crates (eventually this will be solved with a +more principled `use` system in macros 2.0). + +### Initial implementation details + +This section lays out what the initial implementation details of macros 1.1 +will look like, but none of this will be specified as a stable interface to the +compiler. 
These exact details are subject to change over time as the +requirements of the compiler change, and even amongst platforms these details +may be subtly different. + +The compiler will essentially consider `rustc-macro` crates as `--crate-type +dylib -C prefer-dynamic`. That is, compiled the same way they are today. This +namely means that these macros will dynamically link to the same standard +library as the compiler itself, therefore sharing resources like a global +allocator, etc. + +The `librustc_macro` crate will be compiled as an rlib and a static copy of it +will be included in each macro. This crate will provide a symbol known by the +compiler that can be dynamically loaded. The compiler will `dlopen` a macro +crate in the same way it does today, find this symbol in `librustc_macro`, and +call it. + +The `rustc_macro_define` and `rustc_macro_derive` attributes will be encoded +into the crate's metadata, and the compiler will discover all these functions, +load their function pointers, and pass them to the `librustc_macro` entry point +as well. This provides the opportunity to register all the various expansion +mechanisms with the compiler. + +The actual underlying representation of `TokenStream` will be basically the same +as it is in the compiler today. (the details on this are a little light +intentionally, shouldn't be much need to go into *too* much detail). + +### Initial Cargo integration + +Like plugins today, Cargo needs to understand which crates are `rustc-macro` +crates and which aren't. Cargo additionally needs to understand this to sequence +compilations correctly and ensure that `rustc-macro` crates are compiled for the +host platform. To this end, Cargo will understand a new attribute in the `[lib]` +section: + +```toml +[lib] +rustc-macro = true +``` + +This annotation indicates that the crate being compiled should be compiled as a +`rustc-macro` crate type for the host platform in the current compilation.
+ +Eventually Cargo may also grow support to understand that a `rustc-macro` crate +should be compiled twice, once for the host and once for the target, but this is +intended to be a backwards-compatible extension to Cargo. + +## Pieces to stabilize + +Eventually this RFC is intended to be considered for stabilization (after it's +implemented and proven out on nightly, of course). The summary of pieces that +would become stable are: + +* The `rustc_macro` crate, and a small set of APIs within (skeleton above) +* The `rustc-macro` crate type, in addition to its current limitations +* The `#[rustc_macro_derive]` attribute +* The signature of the `#![rustc_macro_derive]` functions +* The `#![rustc_macro_crate]` attribute +* Semantically being able to load macro crates compiled as `rustc-macro` into + the compiler, requiring that the crate was compiled by the exact compiler. +* The semantic behavior of loading custom derive annotations, in that they're + just all added to the same global namespace with errors on conflicts. + Additionally, definitions end up having no hygiene for now. +* The `rustc-macro = true` attribute in Cargo + +### Macros 1.1 in practice + +Alright, that's a lot to take in! Let's take a look at what this is all going to +look like in practice, focusing on a case study of `#[derive(Serialize)]` for +serde. + +First off, serde will provide a crate, let's call it `serde_macros`. The +`Cargo.toml` will look like: + +```toml +[package] +name = "serde-macros" +# ... 
+ +[lib] +rustc-macro = true + +[dependencies] +syntex_syntax = "0.38.0" +``` + +The contents will look similar to + +```rust +extern crate rustc_macro; +extern crate syntex_syntax; + +use rustc_macro::{Context, TokenStream}; + +#[rustc_macro_derive(Serialize)] +pub fn derive_serialize(_cx: &mut Context, + input: TokenStream) -> TokenStream { + let input = input.to_source(); + + // use syntex_syntax from crates.io to parse `input` into an AST + + // use this AST to generate an impl of the `Serialize` trait for the type in + // question + + // convert that impl to a string + + // parse back into a token stream + return TokenStream::from_source(&impl_source).unwrap() +} +``` + +Next, crates will depend on this such as: + +```toml +[dependencies] +serde = "0.9" +serde-macros = "0.9" +``` + +And finally use it as such: + +```rust +extern crate serde; +extern crate serde_macros; + +#[derive(Serialize)] +pub struct Foo { + a: usize, + #[serde(rename = "foo")] + b: String, +} +``` + +# Drawbacks +[drawbacks]: #drawbacks + +* This is not an interface that would be considered for stabilization in a void, + there are a number of known drawbacks to the current macro system in terms of + how it architecturally fits into the compiler. Additionally, there's work + underway to solve all these problems with macros 2.0. + + As mentioned before, however, the stable version of macros 2.0 is currently + quite far off, and the desire for features like custom derive are very real + today. The rationale behind this RFC is that the downsides are an acceptable + tradeoff from moving a significant portion of the nightly ecosystem onto stable + Rust. + +* This implementation is likely to be less performant than procedural macros + are today. Round tripping through strings isn't always a speedy operation, + especially for larger expansions. Strings, however, are a very small + implementation detail that's easy to see stabilized until the end of time. 
+ Additionally, it's planned to extend the `TokenStream` API in the future to + allow more fine-grained transformations without having to round trip through + strings. + +* Users will still have an inferior experience to today's nightly macros + specifically with respect to compile times. The `syntex_syntax` crate takes + quite a few seconds to compile, and this would be required by any crate which + uses serde. To offset this, though, the `syntex_syntax` could be *massively* + stripped down as all it needs to do is parse struct declarations mostly. There + are likely many other various optimizations to compile time that can be + applied to ensure that it compiles quickly. + +* Plugin authors will need to be quite careful about the code which they + generate as working with strings loses much of the expressiveness of macros in + Rust today. For example: + + ```rust + macro_rules! foo { + ($x:expr) => { + #[derive(Serialize)] + enum Foo { Bar = $x, Baz = $x * 2 } + } + } + foo!(1 + 1); + ``` + + Plugin authors would have to ensure that this is not naively interpreted as + `Baz = 1 + 1 * 2` as this will cause incorrect results. + +# Alternatives +[alternatives]: #alternatives + +* Wait for macros 2.0, but this likely comes with the high cost of postponing a + stable custom-derive experience on the time scale of years. + +* Don't add `rustc_macro` as a new crate, but rather specify that + `#[rustc_macro_derive]` has a stable-ABI friendly signature. This does not + account, however, for the eventual planned introduction of the `rustc_macro` + crate and is significantly harder to write. The marginal benefit of being + slightly more flexible about how it's run likely isn't worth it. + +* The syntax for defining a macro may be different in the macros 2.0 world (e.g. + `pub macro foo` vs an attribute), that is it probably won't involve a function + attribute like `#[rustc_macro_derive]`. 
This interim system could possibly use + this syntax as well, but it's unclear whether we have a concrete enough idea + in mind to implement today. + +* Instead of passing around `&mut Context` we could allow for storage of + compiler data structures in thread-local-storage. This would avoid threading + around an extra parameter and perhaps wouldn't lose too much flexibility. + +* In addition to allowing definition of custom-derive forms, definition of + custom procedural macros could also be allowed. They are similarly + transformers from token streams to token streams, so the interface in this RFC + would perhaps be appropriate. This addition, however, adds more surface area + to this RFC and the macro 1.1 system which may not be necessary in the long + run. It's currently understood that *only* custom derive is needed to move + crates like serde and diesel onto stable Rust. + +* Instead of having a global namespace of `#[derive]` modes which `rustc-macro` + crates append to, we could at least require something along the lines of + `#[derive(serde_macros::Deserialize)]`. This is unfortunately, however, still + disconnected from what name resolution will actually be eventually and also + deviates from what you actually may want, `#[derive(serde::Deserialize)]`, for + example. + +# Unresolved questions +[unresolved]: #unresolved-questions + +* Is the interface between macros and the compiler actually general enough to + be implemented differently one day? + +* The intention of macros 1.1 is to be *as close as possible* to macros 2.0 in + spirit and implementation, just without stabilizing vast quantities of + features. In that sense, it is the intention that given a stable macros 1.1, + we can layer on features backwards-compatibly to get to macros 2.0. Right now, + though, the delta between what this RFC proposes and where we'd like to be is + very small, and can we get it down to actually zero?
+ +* Eventually macro crates will want to be loaded both at compile time and + runtime, and this means that Cargo will need to understand to compile these + crates twice, once as `rustc-macro` and once as an rlib. Does Cargo have + enough information to do this? Are the extensions needed here + backwards-compatible? From ea504d4ef90e1edbdb9a019c1016d9c39d84580a Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 18 Jul 2016 13:18:32 -0700 Subject: [PATCH 1014/1195] rustc-macro crates are now intermediate, not final --- text/0000-macros-1.1.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/text/0000-macros-1.1.md b/text/0000-macros-1.1.md index 04b8e8cb2b3..249280d2556 100644 --- a/text/0000-macros-1.1.md +++ b/text/0000-macros-1.1.md @@ -260,10 +260,10 @@ compiler. The compiler will still gate unknown attributes by default. ### `rustc-macro` crates -Like the executable, staticlib, and cdylib crate types, the `rustc-macro` crate -type is intended to be a final product. What it *actually* produces is not -specified, but if a `-L` path is provided to it then the compiler will recognize -the output artifacts as a macro and it can be loaded for a program. +Like the rlib and dylib crate types, the `rustc-macro` crate +type is intended to be an intermediate product. What it *actually* produces is +not specified, but if a `-L` path is provided to it then the compiler will +recognize the output artifacts as a macro and it can be loaded for a program. 
Initially if a crate is compiled with the `rustc-macro` crate type (and possibly others) it will forbid exporting any items in the crate other than those From 9659073d2466e03d2771f2aa6787a47d4c468415 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 18 Jul 2016 15:26:29 -0700 Subject: [PATCH 1015/1195] Require #[macro_use] on macro crates for now --- text/0000-macros-1.1.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/text/0000-macros-1.1.md b/text/0000-macros-1.1.md index 249280d2556..5930a50b8d8 100644 --- a/text/0000-macros-1.1.md +++ b/text/0000-macros-1.1.md @@ -290,6 +290,7 @@ Using a procedural macro will be very similar to today's `extern crate` system, such as: ```rust +#[macro_use] extern crate double; #[derive(Double)] @@ -438,6 +439,7 @@ And finally use it as such: ```rust extern crate serde; +#[macro_use] extern crate serde_macros; #[derive(Serialize)] From 4b052f88e5cc2dacdd9dbc7ad528c132fa21f6d8 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 18 Jul 2016 15:55:06 -0700 Subject: [PATCH 1016/1195] Add version drawback and clarify another --- text/0000-macros-1.1.md | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/text/0000-macros-1.1.md b/text/0000-macros-1.1.md index 5930a50b8d8..8705e027b73 100644 --- a/text/0000-macros-1.1.md +++ b/text/0000-macros-1.1.md @@ -495,7 +495,17 @@ pub struct Foo { ``` Plugin authors would have to ensure that this is not naively interpreted as - `Baz = 1 + 1 * 2` as this will cause incorrect results. + `Baz = 1 + 1 * 2` as this will cause incorrect results. The compiler will also + need to be careful to parenthesize token streams like this when it generates + a stringified source. + +* By having separate library and macro crate support today (e.g. `serde` and + `serde_macros`) it's possible for there to be version skew between the two, + making it tough to ensure that the two versions you're using are compatible + with one another.
This would be solved if `serde` itself could define or + reexport the macros, but unfortunately that would require a likely much larger + step towards "macros 2.0" to solve and would greatly increase the size of this + RFC. # Alternatives [alternatives]: #alternatives From 00e6ec0f5deb1148b72c58e26ce0c9f73c84dc25 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 18 Jul 2016 16:36:18 -0700 Subject: [PATCH 1017/1195] Don't pass around `&mut Context` --- text/0000-macros-1.1.md | 45 +++++++++++++++++++---------------------- 1 file changed, 21 insertions(+), 24 deletions(-) diff --git a/text/0000-macros-1.1.md b/text/0000-macros-1.1.md index 8705e027b73..4615709f59e 100644 --- a/text/0000-macros-1.1.md +++ b/text/0000-macros-1.1.md @@ -161,17 +161,16 @@ pub struct LexError { // ... } -pub struct Context { - // ... -} +impl FromStr for TokenStream { + type Err = LexError; -impl TokenStream { - pub fn from_source(cx: &mut Context, - source: &str) -> Result<TokenStream, LexError> { + fn from_str(s: &str) -> Result<TokenStream, LexError> { // ... } +} - pub fn to_source(&self, cx: &mut Context) -> String { +impl fmt::Display for TokenStream { + fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { // ... } } @@ -183,12 +182,7 @@ more closely resemble token streams [in the compiler itself][compiler-tokenstream], and more fine-grained manipulations will be available as well.
-Additionally, the `Context` structure will initially be completely devoid -of functionality, but in the future it will be the entry point for [many other -features][macro20] one would expect in macros 2.0 - [compiler-tokenstream]: https://github.com/rust-lang/rust/blob/master/src/libsyntax/tokenstream.rs#L323-L338 -[macro20]: http://ncameron.org/blog/libmacro/ ### Defining a macro @@ -206,10 +200,10 @@ A macro crate might look like: extern crate rustc_macro; -use rustc_macro::{Context, TokenStream}; +use rustc_macro::TokenStream; #[rustc_macro_derive(Double)] -pub fn double(cx: &mut Context, input: TokenStream) -> TokenStream { +pub fn double(input: TokenStream) -> TokenStream { let source = input.to_source(cx); // Parse `source` for struct/enum declaration, and then build up some new @@ -244,11 +238,11 @@ The `rustc_macro_derive` attribute requires the signature (similar to [macros [mac20sig]: http://ncameron.org/blog/libmacro/#tokenisingandquasiquoting ```rust -fn(&mut Context, TokenStream) -> TokenStream +fn(TokenStream) -> TokenStream ``` If a macro cannot process the input token stream, it is expected to panic for -now, although eventually it will call methods on `Context` to provide more +now, although eventually it will call methods in `rustc_macro` to provide more structured errors. The compiler will wrap up the panic message and display it to the user appropriately. Eventually, however, `librustc_macro` will provide more interesting methods of signaling errors to users. 
@@ -408,12 +402,11 @@ The contents will look similar to extern crate rustc_macro; extern crate syntex_syntax; -use rustc_macro::{Context, TokenStream}; +use rustc_macro::TokenStream; #[rustc_macro_derive(Serialize)] -pub fn derive_serialize(_cx: &mut Context, - input: TokenStream) -> TokenStream { - let input = input.to_source(); +pub fn derive_serialize(input: TokenStream) -> TokenStream { + let input = input.to_string(); // use syntex_syntax from crates.io to parse `input` into an AST @@ -423,7 +416,7 @@ pub fn derive_serialize(_cx: &mut Context, // convert that impl to a string // parse back into a token stream - return TokenStream::from_source(&impl_source).unwrap() + return impl_source.parse().unwrap() } ``` @@ -525,9 +518,13 @@ pub struct Foo { this syntax as well, but it's unclear whether we have a concrete enough idea in mind to implement today. -* Instead of passing around `&mut Context` we could allow for storage of - compiler data structures in thread-local-storage. This would avoid threading - around an extra parameter and perhaps wouldn't lose too much flexibility. +* The `TokenStream` state likely has some sort of backing store behind it like a + string interner, and in the APIs above it's likely that this state is passed + around in thread-local-storage to avoid threading through a parameter like + `&mut Context` everywhere. An alternative would be to explicitly pass this + parameter, but it might hinder trait implementations like `fmt::Display` and + `FromStr`. Additionally, threading an extra parameter could perhaps become + unwieldy over time. * In addition to allowing definition of custom-derive forms, definition of custom procedural macros could also be allowed. 
They are similarly From a88bfbb7e599cea6db1cabe3135921241ee1bc94 Mon Sep 17 00:00:00 2001 From: Joe Ranweiler Date: Tue, 19 Jul 2016 08:33:12 -0700 Subject: [PATCH 1018/1195] Add named-field-puns RFC draft --- text/0000-named-field-puns.md | 85 +++++++++++++++++++++++++++++++++++ 1 file changed, 85 insertions(+) create mode 100644 text/0000-named-field-puns.md diff --git a/text/0000-named-field-puns.md b/text/0000-named-field-puns.md new file mode 100644 index 00000000000..4a8ee3411dd --- /dev/null +++ b/text/0000-named-field-puns.md @@ -0,0 +1,85 @@ +- Feature Name: named-field-puns +- Start Date: 2016-07-18 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +When initializing a data structure (struct, enum, union) with named fields, allow writing `fieldname` as a shorthand for `fieldname: fieldname`. This allows a compact syntax for initialization, with less duplication: + + struct SomeStruct { field1: ComplexType, field2: AnotherType } + + impl SomeStruct { + fn new() -> Self { + let field1 = { + // Various initialization code + }; + let field2 = { + // More initialization code + }; + SomeStruct { field1, field2 } + } + } + +# Motivation +[motivation]: #motivation + +When writing initialization code for a data structure, the names of the structure fields often become the most straightforward names to use for their initial values as well. At the end of such an initialization function, then, the initializer will contain many patterns of repeated field names as field values: `field: field, field2: field2, field3: field3`. + +Such repetition of the field names makes it less ergonomic to separately declare and initialize individual fields, and makes it tempting to instead embed complex code directly in the initializer to avoid repetition.
+ +Rust already allows [similar syntax for destructuring in pattern matches](https://doc.rust-lang.org/book/patterns.html#destructuring): a pattern match can use `SomeStruct { field1, field2 } => ...` to match `field1` and `field2` into values with the same names. This RFC introduces symmetric syntax for initializers. + +A family of related structures will often use the same name for the same type of value. Combining this new syntax with the existing pattern-matching syntax allows simple movement of data between fields with a pattern match: `Struct1 { field1, .. } => Struct2 { field1 }`. + +The proposed syntax also improves structure initializers in closures, such as might appear in a chain of iterator adapters: `|field1, field2| SomeStruct { field1, field2 }`. + +This RFC takes inspiration from the Haskell [NamedFieldPuns extension](https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#record-puns), and from ES6 [shorthand property names](http://www.ecma-international.org/ecma-262/6.0/#sec-object-initializer). + +# Detailed design +[design]: #detailed-design + +In the initializer for a `struct` with named fields, a `union` with named fields, or an enum variant with named fields, accept an identifier `field` as a shorthand for `field: field`. + +The shorthand initializer `field` always behaves in every possible way like the longhand initializer `field: field`. This RFC introduces no new behavior or semantics, only a purely syntactic shorthand. The rest of this section only provides further examples to explicitly clarify that this new syntax remains entirely orthogonal to other initializer behavior and semantics. + +If the struct `SomeStruct` has fields `field1` and `field2`, the initializer `SomeStruct { field1, field2 }` behaves in every way like the initializer `SomeStruct { field1: field1, field2: field2 }`. 
+ +An initializer may contain any combination of shorthand and full field initializers: + + let a = SomeStruct { field1, field2: expression, field3 }; + let b = SomeStruct { field1: field1, field2: expression, field3: field3 }; + assert_eq!(a, b); + +An initializer may use shorthand field initializers together with [update syntax](https://doc.rust-lang.org/book/structs.html#update-syntax): + + let a = SomeStruct { field1, .. someStructInstance }; + let b = SomeStruct { field1: field1, .. someStructInstance }; + assert_eq!(a, b); + +This shorthand initializer syntax does not introduce any new compiler errors that cannot also occur with the longhand initializer syntax `field: field`. Existing compiler errors that can occur with the longhand initializer syntax `field: field` also apply to the shorthand initializer syntax `field`: + +- As with the longhand initializer `field: field`, if the structure has no field with the specified name `field`, the shorthand initializer `field` results in a compiler error for attempting to initialize a non-existent field. + +- As with the longhand initializer `field: field`, repeating a field name within the same initializer results in a compiler error ([E0062](https://doc.rust-lang.org/error-index.html#E0062)); this occurs with any combination of shorthand initializers or full `field: expression` initializers. + +- As with the longhand initializer `field: field`, if the name `field` does not resolve, the shorthand initializer `field` results in a compiler error for an unresolved name ([E0425](https://doc.rust-lang.org/error-index.html#E0425)). + +- As with the longhand initializer `field: field`, if the name `field` resolves to a value with type incompatible with the field `field` in the structure, the shorthand initializer `field` results in a compiler error for mismatched types ([E0308](https://doc.rust-lang.org/error-index.html#E0308)). 
+ +# Drawbacks +[drawbacks]: #drawbacks + +This new syntax could significantly improve readability given clear and local field-punning variables, but could also be abused to decrease readability if used with more distant variables. + +As with many syntactic changes, a macro could implement this instead. See the Alternatives section for discussion of this. + +The shorthand initializer syntax looks similar to positional initialization of a structure without field names; reinforcing this, the initializer will commonly list the fields in the same order that the struct declares them. However, the shorthand initializer syntax differs from the positional initializer syntax (such as for a tuple struct) in that the positional syntax uses parentheses instead of braces: `SomeStruct(x, y)` is unambiguously a positional initializer, while `SomeStruct { x, y }` is unambiguously a shorthand initializer for the named fields `x` and `y`. + +# Alternatives +[alternatives]: #alternatives + +In addition to this syntax, initializers could support omitting the field names entirely, with syntax like `SomeStruct { .. }`, which would implicitly initialize omitted fields from identically named variables. However, that would introduce far too much magic into initializers, and the context-dependence seems likely to result in less readable, less obvious code. + +A macro wrapped around the initializer could implement this syntax, without changing the language; for instance, `pun! { SomeStruct { field1, field2 } }` could expand to `SomeStruct { field1: field1, field2: field2 }`. However, this change exists to make structure construction shorter and more expressive; having to use a macro would negate some of the benefit of doing so, particularly in places where brevity improves readability, such as in a closure in the middle of a larger expression. 
Pattern matching already allows using field names as the destination for the field values; this change adds a symmetrical mechanism for structure construction. \ No newline at end of file From 46a33d338f9c2975ca311472de876718b2cae1a9 Mon Sep 17 00:00:00 2001 From: Joe Ranweiler Date: Tue, 19 Jul 2016 09:13:36 -0700 Subject: [PATCH 1019/1195] Wrap words --- text/0000-named-field-puns.md | 113 +++++++++++++++++++++++++++------- 1 file changed, 90 insertions(+), 23 deletions(-) diff --git a/text/0000-named-field-puns.md b/text/0000-named-field-puns.md index 4a8ee3411dd..3d185a370e1 100644 --- a/text/0000-named-field-puns.md +++ b/text/0000-named-field-puns.md @@ -6,7 +6,9 @@ # Summary [summary]: #summary -When initializing a data structure (struct, enum, union) with named fields, allow writing `fieldname` as a shorthand for `fieldname: fieldname`. This allows a compact syntax for initialization, with less duplication: +When initializing a data structure (struct, enum, union) with named fields, +allow writing `fieldname` as a shorthand for `fieldname: fieldname`. This +allows a compact syntax for initialization, with less duplication: struct SomeStruct { field1: ComplexType, field2: AnotherType } @@ -25,61 +27,126 @@ When initializing a data structure (struct, enum, union) with named fields, allo # Motivation [motivation]: #motivation -When writing initialization code for a data structure, the names of the structure fields often become the most straightforward names to use for their initial values as well. At the end of such an initialization function, then, the initializer will contain many patterns of repeated field names as field values: `field: field, field2: field2, field3: field3`. +When writing initialization code for a data structure, the names of the +structure fields often become the most straightforward names to use for their +initial values as well. 
At the end of such an initialization function, then, +the initializer will contain many patterns of repeated field names as field +values: `field: field, field2: field2, field3: field3`. -Such repetition of the field names makes it less ergonomic to separately declare and initialize individual fields, and makes it tempting to instead embed complex code directly in the initializer to avoid repetition. +Such repetition of the field names makes it less ergonomic to separately +declare and initialize individual fields, and makes it tempting to instead +embed complex code directly in the initializer to avoid repetition. -Rust already allows [similar syntax for destructuring in pattern matches](https://doc.rust-lang.org/book/patterns.html#destructuring): a pattern match can use `SomeStruct { field1, field2 } => ...` to match `field1` and `field2` into values with the same names. This RFC introduces symmetric syntax for initializers. +Rust already allows +[similar syntax for destructuring in pattern matches](https://doc.rust-lang.org/book/patterns.html#destructuring): +a pattern match can use `SomeStruct { field1, field2 } => ...` to match +`field1` and `field2` into values with the same names. This RFC introduces +symmetric syntax for initializers. -A family of related structures will often use the same name for the same type of value. Combining this new syntax with the existing pattern-matching syntax allows simple movement of data between fields with a pattern match: `Struct1 { field1, .. } => Struct2 { field1 }`. +A family of related structures will often use the same name for the same type +of value. Combining this new syntax with the existing pattern-matching syntax +allows simple movement of data between fields with a pattern match: `Struct1 { +field1, .. } => Struct2 { field1 }`. -The proposed syntax also improves structure initializers in closures, such as might appear in a chain of iterator adapters: `|field1, field2| SomeStruct { field1, field2 }`. 
+The proposed syntax also improves structure initializers in closures, such as +might appear in a chain of iterator adapters: `|field1, field2| SomeStruct { +field1, field2 }`. -This RFC takes inspiration from the Haskell [NamedFieldPuns extension](https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#record-puns), and from ES6 [shorthand property names](http://www.ecma-international.org/ecma-262/6.0/#sec-object-initializer). +This RFC takes inspiration from the Haskell +[NamedFieldPuns extension](https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#record-puns), +and from ES6 +[shorthand property names](http://www.ecma-international.org/ecma-262/6.0/#sec-object-initializer). # Detailed design [design]: #detailed-design -In the initializer for a `struct` with named fields, a `union` with named fields, or an enum variant with named fields, accept an identifier `field` as a shorthand for `field: field`. +In the initializer for a `struct` with named fields, a `union` with named +fields, or an enum variant with named fields, accept an identifier `field` as a +shorthand for `field: field`. -The shorthand initializer `field` always behaves in every possible way like the longhand initializer `field: field`. This RFC introduces no new behavior or semantics, only a purely syntactic shorthand. The rest of this section only provides further examples to explicitly clarify that this new syntax remains entirely orthogonal to other initializer behavior and semantics. +The shorthand initializer `field` always behaves in every possible way like the +longhand initializer `field: field`. This RFC introduces no new behavior or +semantics, only a purely syntactic shorthand. The rest of this section only +provides further examples to explicitly clarify that this new syntax remains +entirely orthogonal to other initializer behavior and semantics. 
-If the struct `SomeStruct` has fields `field1` and `field2`, the initializer `SomeStruct { field1, field2 }` behaves in every way like the initializer `SomeStruct { field1: field1, field2: field2 }`. +If the struct `SomeStruct` has fields `field1` and `field2`, the initializer +`SomeStruct { field1, field2 }` behaves in every way like the initializer +`SomeStruct { field1: field1, field2: field2 }`. -An initializer may contain any combination of shorthand and full field initializers: +An initializer may contain any combination of shorthand and full field +initializers: let a = SomeStruct { field1, field2: expression, field3 }; let b = SomeStruct { field1: field1, field2: expression, field3: field3 }; assert_eq!(a, b); -An initializer may use shorthand field initializers together with [update syntax](https://doc.rust-lang.org/book/structs.html#update-syntax): +An initializer may use shorthand field initializers together with +[update syntax](https://doc.rust-lang.org/book/structs.html#update-syntax): let a = SomeStruct { field1, .. someStructInstance }; let b = SomeStruct { field1: field1, .. someStructInstance }; assert_eq!(a, b); -This shorthand initializer syntax does not introduce any new compiler errors that cannot also occur with the longhand initializer syntax `field: field`. Existing compiler errors that can occur with the longhand initializer syntax `field: field` also apply to the shorthand initializer syntax `field`: +This shorthand initializer syntax does not introduce any new compiler errors +that cannot also occur with the longhand initializer syntax `field: field`. +Existing compiler errors that can occur with the longhand initializer syntax +`field: field` also apply to the shorthand initializer syntax `field`: -- As with the longhand initializer `field: field`, if the structure has no field with the specified name `field`, the shorthand initializer `field` results in a compiler error for attempting to initialize a non-existent field. 
+- As with the longhand initializer `field: field`, if the structure has no + field with the specified name `field`, the shorthand initializer `field` + results in a compiler error for attempting to initialize a non-existent + field. -- As with the longhand initializer `field: field`, repeating a field name within the same initializer results in a compiler error ([E0062](https://doc.rust-lang.org/error-index.html#E0062)); this occurs with any combination of shorthand initializers or full `field: expression` initializers. +- As with the longhand initializer `field: field`, repeating a field name + within the same initializer results in a compiler error + ([E0062](https://doc.rust-lang.org/error-index.html#E0062)); this occurs with + any combination of shorthand initializers or full `field: expression` + initializers. -- As with the longhand initializer `field: field`, if the name `field` does not resolve, the shorthand initializer `field` results in a compiler error for an unresolved name ([E0425](https://doc.rust-lang.org/error-index.html#E0425)). +- As with the longhand initializer `field: field`, if the name `field` does not + resolve, the shorthand initializer `field` results in a compiler error for an + unresolved name ([E0425](https://doc.rust-lang.org/error-index.html#E0425)). -- As with the longhand initializer `field: field`, if the name `field` resolves to a value with type incompatible with the field `field` in the structure, the shorthand initializer `field` results in a compiler error for mismatched types ([E0308](https://doc.rust-lang.org/error-index.html#E0308)). +- As with the longhand initializer `field: field`, if the name `field` resolves + to a value with type incompatible with the field `field` in the structure, + the shorthand initializer `field` results in a compiler error for mismatched + types ([E0308](https://doc.rust-lang.org/error-index.html#E0308)). 
# Drawbacks [drawbacks]: #drawbacks -This new syntax could significantly improve readability given clear and local field-punning variables, but could also be abused to decrease readability if used with more distant variables. +This new syntax could significantly improve readability given clear and local +field-punning variables, but could also be abused to decrease readability if +used with more distant variables. -As with many syntactic changes, a macro could implement this instead. See the Alternatives section for discussion of this. +As with many syntactic changes, a macro could implement this instead. See the +Alternatives section for discussion of this. -The shorthand initializer syntax looks similar to positional initialization of a structure without field names; reinforcing this, the initializer will commonly list the fields in the same order that the struct declares them. However, the shorthand initializer syntax differs from the positional initializer syntax (such as for a tuple struct) in that the positional syntax uses parentheses instead of braces: `SomeStruct(x, y)` is unambiguously a positional initializer, while `SomeStruct { x, y }` is unambiguously a shorthand initializer for the named fields `x` and `y`. +The shorthand initializer syntax looks similar to positional initialization of +a structure without field names; reinforcing this, the initializer will +commonly list the fields in the same order that the struct declares them. +However, the shorthand initializer syntax differs from the positional +initializer syntax (such as for a tuple struct) in that the positional syntax +uses parentheses instead of braces: `SomeStruct(x, y)` is unambiguously a +positional initializer, while `SomeStruct { x, y }` is unambiguously a +shorthand initializer for the named fields `x` and `y`. # Alternatives [alternatives]: #alternatives -In addition to this syntax, initializers could support omitting the field names entirely, with syntax like `SomeStruct { .. 
}`, which would implicitly initialize omitted fields from identically named variables. However, that would introduce far too much magic into initializers, and the context-dependence seems likely to result in less readable, less obvious code. - -A macro wrapped around the initializer could implement this syntax, without changing the language; for instance, `pun! { SomeStruct { field1, field2 } }` could expand to `SomeStruct { field1: field1, field2: field2 }`. However, this change exists to make structure construction shorter and more expressive; having to use a macro would negate some of the benefit of doing so, particularly in places where brevity improves readability, such as in a closure in the middle of a larger expression. Pattern matching already allows using field names as the destination for the field values; this change adds a symmetrical mechanism for structure construction. \ No newline at end of file +In addition to this syntax, initializers could support omitting the field names +entirely, with syntax like `SomeStruct { .. }`, which would implicitly +initialize omitted fields from identically named variables. However, that would +introduce far too much magic into initializers, and the context-dependence +seems likely to result in less readable, less obvious code. + +A macro wrapped around the initializer could implement this syntax, without +changing the language; for instance, `pun! { SomeStruct { field1, field2 } }` +could expand to `SomeStruct { field1: field1, field2: field2 }`. However, this +change exists to make structure construction shorter and more expressive; +having to use a macro would negate some of the benefit of doing so, +particularly in places where brevity improves readability, such as in a closure +in the middle of a larger expression. Pattern matching already allows using +field names as the destination for the field values; this change adds a +symmetrical mechanism for structure construction. 
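The equivalence that the RFC text above relies on (shorthand `field` behaves exactly like longhand `field: field`, including alongside full initializers and update syntax) can be sketched as follows. This is only an illustration, assuming a compiler that accepts the proposed shorthand. The field types (`u32`, `String`, `bool`) and the helper names `build_shorthand`, `build_longhand`, and `with_update` are hypothetical and do not come from the RFC.

```rust
// Hypothetical struct mirroring the RFC's `SomeStruct`; the field types are
// assumptions chosen only so that the example is self-contained.
#[derive(Debug, PartialEq, Clone)]
struct SomeStruct {
    field1: u32,
    field2: String,
    field3: bool,
}

// Shorthand initializers mixed with one full `field: expression` initializer.
fn build_shorthand(field1: u32, field3: bool) -> SomeStruct {
    SomeStruct { field1, field2: String::from("expression"), field3 }
}

// The longhand form that the shorthand is defined to be equivalent to.
fn build_longhand(field1: u32, field3: bool) -> SomeStruct {
    SomeStruct { field1: field1, field2: String::from("expression"), field3: field3 }
}

// A shorthand initializer combined with update syntax: `field1` comes from the
// local variable, every other field is taken from `base`.
fn with_update(field1: u32, base: SomeStruct) -> SomeStruct {
    SomeStruct { field1, ..base }
}
```

Calling `build_shorthand(1, true)` and `build_longhand(1, true)` should yield equal values, and `with_update(7, base)` should replace only `field1` while keeping `field2` and `field3` from `base`.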
From f8e50c541f7484481e43055a0c6fc59047c1a35c Mon Sep 17 00:00:00 2001 From: Joe Ranweiler Date: Tue, 19 Jul 2016 09:20:15 -0700 Subject: [PATCH 1020/1195] Caption example usage in summary --- text/0000-named-field-puns.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-named-field-puns.md b/text/0000-named-field-puns.md index 3d185a370e1..813c18d97ed 100644 --- a/text/0000-named-field-puns.md +++ b/text/0000-named-field-puns.md @@ -8,7 +8,9 @@ When initializing a data structure (struct, enum, union) with named fields, allow writing `fieldname` as a shorthand for `fieldname: fieldname`. This -allows a compact syntax for initialization, with less duplication: +allows a compact syntax for initialization, with less duplication. + +Example usage: struct SomeStruct { field1: ComplexType, field2: AnotherType } From bd201d64ba40c04547ed4dc1a6095d4d7e32ce29 Mon Sep 17 00:00:00 2001 From: Joe Ranweiler Date: Tue, 19 Jul 2016 09:20:39 -0700 Subject: [PATCH 1021/1195] Remove non-technical use of the word "type" --- text/0000-named-field-puns.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/text/0000-named-field-puns.md b/text/0000-named-field-puns.md index 813c18d97ed..ad743e6a037 100644 --- a/text/0000-named-field-puns.md +++ b/text/0000-named-field-puns.md @@ -45,10 +45,10 @@ a pattern match can use `SomeStruct { field1, field2 } => ...` to match `field1` and `field2` into values with the same names. This RFC introduces symmetric syntax for initializers. -A family of related structures will often use the same name for the same type -of value. Combining this new syntax with the existing pattern-matching syntax -allows simple movement of data between fields with a pattern match: `Struct1 { -field1, .. } => Struct2 { field1 }`. +A family of related structures will often use the same field name for a +semantically-shared value. 
Combining this new syntax with the existing +pattern-matching syntax allows simple movement of data between fields with a +pattern match: `Struct1 { field1, .. } => Struct2 { field1 }`. The proposed syntax also improves structure initializers in closures, such as might appear in a chain of iterator adapters: `|field1, field2| SomeStruct { From 96bcd13fec834f7f39c82ff76e35ca307f90781d Mon Sep 17 00:00:00 2001 From: Joe Ranweiler Date: Tue, 19 Jul 2016 09:27:46 -0700 Subject: [PATCH 1022/1195] Tweak adjective --- text/0000-named-field-puns.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-named-field-puns.md b/text/0000-named-field-puns.md index ad743e6a037..5dd0be82cb4 100644 --- a/text/0000-named-field-puns.md +++ b/text/0000-named-field-puns.md @@ -43,7 +43,7 @@ Rust already allows [similar syntax for destructuring in pattern matches](https://doc.rust-lang.org/book/patterns.html#destructuring): a pattern match can use `SomeStruct { field1, field2 } => ...` to match `field1` and `field2` into values with the same names. This RFC introduces -symmetric syntax for initializers. +symmetrical syntax for initializers. A family of related structures will often use the same field name for a semantically-shared value. Combining this new syntax with the existing From 379e50acb34351043b221c95966a49e8313a8401 Mon Sep 17 00:00:00 2001 From: Joe Ranweiler Date: Tue, 19 Jul 2016 09:30:16 -0700 Subject: [PATCH 1023/1195] Emphasize precedent and symmetry when assessing macros --- text/0000-named-field-puns.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/text/0000-named-field-puns.md b/text/0000-named-field-puns.md index 5dd0be82cb4..5798f917cd2 100644 --- a/text/0000-named-field-puns.md +++ b/text/0000-named-field-puns.md @@ -149,6 +149,7 @@ could expand to `SomeStruct { field1: field1, field2: field2 }`. 
However, this change exists to make structure construction shorter and more expressive; having to use a macro would negate some of the benefit of doing so, particularly in places where brevity improves readability, such as in a closure -in the middle of a larger expression. Pattern matching already allows using -field names as the destination for the field values; this change adds a -symmetrical mechanism for structure construction. +in the middle of a larger expression. There is also precedent for +language-level support. Pattern matching already allows using field names as +the _destination_ for the field values via destructuring. This change adds a +symmetrical mechanism for construction which uses existing names as _sources_. From f9827d1847c37ffc782cf583a7ab8228f2475b0a Mon Sep 17 00:00:00 2001 From: Joe Ranweiler Date: Tue, 19 Jul 2016 10:13:26 -0700 Subject: [PATCH 1024/1195] Fix field names in example --- text/0000-named-field-puns.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-named-field-puns.md b/text/0000-named-field-puns.md index 5798f917cd2..1fd3ff7f6f7 100644 --- a/text/0000-named-field-puns.md +++ b/text/0000-named-field-puns.md @@ -22,7 +22,7 @@ Example usage: let field2 = { // More initialization code }; - SomeStruct { complexField, anotherField } + SomeStruct { field1, field2 } } } From 5b441ffe20aa438a9d599bed65a3a607000e8be5 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 19 Jul 2016 10:37:53 -0700 Subject: [PATCH 1025/1195] Remove mentions of old attributes --- text/0000-macros-1.1.md | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/text/0000-macros-1.1.md b/text/0000-macros-1.1.md index 4615709f59e..950722a7fab 100644 --- a/text/0000-macros-1.1.md +++ b/text/0000-macros-1.1.md @@ -327,11 +327,11 @@ compiler that can be dynamically loaded. The compiler will `dlopen` a macro crate in the same way it does today, find this symbol in `librustc_macro`, and call it. 
-The `rustc_macro_define` and `rustc_macro_derive` attributes will be encoded -into the crate's metadata, and the compiler will discover all these functions, -load their function pointers, and pass them to the `librustc_macro` entry point -as well. This provides the opportunity to register all the various expansion -mechanisms with the compiler. +The `rustc_macro_derive` attribute will be encoded into the crate's metadata, +and the compiler will discover all these functions, load their function +pointers, and pass them to the `librustc_macro` entry point as well. This +provides the opportunity to register all the various expansion mechanisms with +the compiler. The actual underlying representation of `TokenStream` will be basically the same as it is in the compiler today. (the details on this are a little light @@ -367,7 +367,6 @@ would become stable are: * The `rustc-macro` crate type, in addition to its current limitations * The `#[rustc_macro_derive]` attribute * The signature of the `#![rustc_macro_derive]` functions -* The `#![rustc_macro_crate]` attribute * Semantically being able to load macro crates compiled as `rustc-macro` into the compiler, requiring that the crate was compiled by the exact compiler. * The semantic behavior of loading custom derive annotations, in that they're From 8c307c1e10b28d65afe21c77d858f853a0bae1f3 Mon Sep 17 00:00:00 2001 From: Joe Ranweiler Date: Tue, 19 Jul 2016 11:56:44 -0700 Subject: [PATCH 1026/1195] Use "similar" to avoid conflation with senses of "shared" --- text/0000-named-field-puns.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-named-field-puns.md b/text/0000-named-field-puns.md index 1fd3ff7f6f7..d528630f50b 100644 --- a/text/0000-named-field-puns.md +++ b/text/0000-named-field-puns.md @@ -46,7 +46,7 @@ a pattern match can use `SomeStruct { field1, field2 } => ...` to match symmetrical syntax for initializers. 
A family of related structures will often use the same field name for a -semantically-shared value. Combining this new syntax with the existing +semantically-similar value. Combining this new syntax with the existing pattern-matching syntax allows simple movement of data between fields with a pattern match: `Struct1 { field1, .. } => Struct2 { field1 }`. From 293adaf87cda6f538e7606f6f1242965f6ee7470 Mon Sep 17 00:00:00 2001 From: Steven Fackler Date: Tue, 19 Jul 2016 23:56:33 +0200 Subject: [PATCH 1027/1195] Expand a bit and add index methods to trait --- text/0000-panic-safe-slicing.md | 21 +++++++++++++++------ 1 file changed, 15 insertions(+), 6 deletions(-) diff --git a/text/0000-panic-safe-slicing.md b/text/0000-panic-safe-slicing.md index 24e358e3add..df26ca4847a 100644 --- a/text/0000-panic-safe-slicing.md +++ b/text/0000-panic-safe-slicing.md @@ -10,12 +10,19 @@ Add "panic-safe" or "total" alternatives to the existing panicking indexing synt # Motivation `SliceExt::get` and `SliceExt::get_mut` can be thought as non-panicking versions of the simple -indexing syntax, `a[idx]`. However, there is no such equivalent for `a[start..end]`, `a[start..]`, -or `a[..end]`. This RFC proposes such methods to fill the gap. +indexing syntax, `a[idx]`, and `SliceExt::get_unchecked` and `SliceExt::get_unchecked_mut` can +be thought of as unsafe versions with bounds checks elided. However, there is no such equivalent for +`a[start..end]`, `a[start..]`, or `a[..end]`. This RFC proposes such methods to fill the gap. # Detailed design -Introduce a `SliceIndex` trait which is implemented by types which can index into a slice: +The `get`, `get_mut`, `get_unchecked`, and `get_unchecked_mut` will be made generic over `usize` +as well as ranges of `usize` like slice's `Index` implementation currently is. This will allow e.g. +`a.get(start..end)` which will behave analogously to `a[start..end]`. 
+ +Because methods cannot be overloaded in an ad-hoc manner in the same way that traits may be +implemented, we introduce a `SliceIndex` trait which is implemented by types which can index into a +slice: ```rust pub trait SliceIndex { type Output: ?Sized; @@ -24,6 +31,8 @@ pub trait SliceIndex { fn get_mut(self, slice: &mut [T]) -> Option<&mut Self::Output>; unsafe fn get_unchecked(self, slice: &[T]) -> &Self::Output; unsafe fn get_mut_unchecked(self, slice: &[T]) -> &mut Self::Output; + fn index(self, slice: &[T]) -> &Self::Output; + fn index_mut(self, slice: &mut [T]) -> &mut Self::Output; } impl SliceIndex for usize { @@ -39,7 +48,7 @@ impl SliceIndex for R } ``` -Alter the `Index`, `IndexMut`, `get`, `get_mut`, `get_unchecked`, and `get_mut_unchecked` +And then alter the `Index`, `IndexMut`, `get`, `get_mut`, `get_unchecked`, and `get_mut_unchecked` implementations to be generic over `SliceIndex`: ```rust impl [T] { @@ -74,7 +83,7 @@ impl Index for [T] type Output = I::Output; fn index(&self, idx: I) -> &I::Output { - self.get(idx).expect("out of bounds slice access") + idx.index(self) } } @@ -82,7 +91,7 @@ impl IndexMut for [T] where I: SliceIndex { fn index_mut(&self, idx: I) -> &mut I::Output { - self.get_mut(idx).expect("out of bounds slice access") + idx.index_mut(self) } } ``` From 66b2a82f64c4dcdc715d13bfbb89a7f99c93e0f4 Mon Sep 17 00:00:00 2001 From: Joe Ranweiler Date: Tue, 19 Jul 2016 17:12:28 -0700 Subject: [PATCH 1028/1195] Add subsections to "Alternatives" section --- text/0000-named-field-puns.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/text/0000-named-field-puns.md b/text/0000-named-field-puns.md index d528630f50b..2b73c38cc47 100644 --- a/text/0000-named-field-puns.md +++ b/text/0000-named-field-puns.md @@ -137,12 +137,16 @@ shorthand initializer for the named fields `x` and `y`. 
# Alternatives [alternatives]: #alternatives +## Wildcards + In addition to this syntax, initializers could support omitting the field names entirely, with syntax like `SomeStruct { .. }`, which would implicitly initialize omitted fields from identically named variables. However, that would introduce far too much magic into initializers, and the context-dependence seems likely to result in less readable, less obvious code. +## Macros + A macro wrapped around the initializer could implement this syntax, without changing the language; for instance, `pun! { SomeStruct { field1, field2 } }` could expand to `SomeStruct { field1: field1, field2: field2 }`. However, this From 7cfd7e9053532c12cbe9d0cbf61f528ce43f77f3 Mon Sep 17 00:00:00 2001 From: Joe Ranweiler Date: Tue, 19 Jul 2016 17:12:42 -0700 Subject: [PATCH 1029/1195] Add subsection for sigil alternative --- text/0000-named-field-puns.md | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/text/0000-named-field-puns.md b/text/0000-named-field-puns.md index 2b73c38cc47..e630b9f1263 100644 --- a/text/0000-named-field-puns.md +++ b/text/0000-named-field-puns.md @@ -157,3 +157,26 @@ in the middle of a larger expression. There is also precedent for language-level support. Pattern matching already allows using field names as the _destination_ for the field values via destructuring. This change adds a symmetrical mechanism for construction which uses existing names as _sources_. + + +## Sigils + +To minimize confusing shorthand expressions with the construction of +tuple-like structs, we might elect to prefix expanded field names with +sigils. + +For example, if the sigil were `:`, the existing syntax `S { x: x }` +would be expressed as `S { :x }`. This is used in +[MoonScript](http://moonscript.org/reference/#the-language/table-literals). + +This particular choice of sigil may be confusing, due to the +already-overloaded use of `:` for fields and type ascription. 
Additionally, +in languages such as Ruby and Elixir, `:x` denotes a symbol or atom, which +may be confusing for newcomers. + +Other sigils could be used instead, but even then we are then increasing +the amount of new syntax being introduced. This both increases language +complexity and reduces the gained compactness, worsening the +cost/benefit ratio of adding a shorthand. Any use of a sigil also breaks +the symmetry between binding pattern matching and the proposed +shorthand. From 42f4035dd4af2ded018a221e8b0a9b4dc6e7b7ec Mon Sep 17 00:00:00 2001 From: Ryan Scheel Date: Tue, 19 Jul 2016 17:04:35 -0700 Subject: [PATCH 1030/1195] Explicit type punning alternative. --- text/0000-named-field-puns.md | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/text/0000-named-field-puns.md b/text/0000-named-field-puns.md index e630b9f1263..dcfbb9c8af0 100644 --- a/text/0000-named-field-puns.md +++ b/text/0000-named-field-puns.md @@ -158,7 +158,6 @@ language-level support. Pattern matching already allows using field names as the _destination_ for the field values via destructuring. This change adds a symmetrical mechanism for construction which uses existing names as _sources_. - ## Sigils To minimize confusing shorthand expressions with the construction of @@ -180,3 +179,19 @@ complexity and reduces the gained compactness, worsening the cost/benefit ratio of adding a shorthand. Any use of a sigil also breaks the symmetry between binding pattern matching and the proposed shorthand. + +## Keyword-prefixed + +Similarly to sigils, we could use a keyword like Nix uses +[inherit](http://nixos.org/nix/manual/#idm46912467627696). Some forms we could +decide upon (using `use` as the keyword of choice here, but it could be +something else), it could look like the following. 
+ +* `S { use x, y, z: 10}` +* `S { use (x, y), z: 10 }` +* `S { use {x, y}, z: 10 }` +* `S { use x, use y, z: 10}` + +This has the same drawbacks as sigils except that it won't be confused for +symbols in other languages or adding more sigils. It also has the benefit +of being something that can be searched for in documentation. \ No newline at end of file From 2fa187d01f00f258306bd5871f77a2d096652a89 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 20 Jul 2016 13:32:37 -0700 Subject: [PATCH 1031/1195] Add a note about what "no hygiene" means --- text/0000-macros-1.1.md | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/text/0000-macros-1.1.md b/text/0000-macros-1.1.md index 950722a7fab..d51a979670c 100644 --- a/text/0000-macros-1.1.md +++ b/text/0000-macros-1.1.md @@ -204,15 +204,15 @@ use rustc_macro::TokenStream; #[rustc_macro_derive(Double)] pub fn double(input: TokenStream) -> TokenStream { - let source = input.to_source(cx); + let source = input.to_string(); // Parse `source` for struct/enum declaration, and then build up some new // source code representing representing a number of items in the // implementation of the `Double` trait for the struct/enum in question. - let source = derive_double(cx, source); + let source = derive_double(&source); // Parse this back to a token stream and return it - TokenStream::from_source(cx, &source).unwrap() + source.parse().unwrap() } ``` @@ -223,14 +223,15 @@ struct that `#[derive]` was attached to, attributes and all. The output is **expected to include the `struct`/`enum` itself** as well as any number of items to be contextually "placed next to" the initial declaration. -Again, though, there is no hygiene, it's as if the source was simply -copy/pasted. All span information for the `TokenStream` structures returned by -`from_source` will point to the original `#[derive]` annotation. 
This means -that error messages related to struct definitions will get *worse* if they have -a custom derive attribute placed on them, because the entire struct's span will -get folded into the `#[derive]` annotation. Eventually, though, more span -information will be stable on the `TokenStream` type, so this is just a -temporary limitation. +Again, though, there is no hygiene. More specifically, the +`TokenStream::from_str` method will use the same expansion context as the derive +attribute itself, not the point of definition of the derive function. All span +information for the `TokenStream` structures returned by `from_source` will +point to the original `#[derive]` annotation. This means that error messages +related to struct definitions will get *worse* if they have a custom derive +attribute placed on them, because the entire struct's span will get folded into +the `#[derive]` annotation. Eventually, though, more span information will be +stable on the `TokenStream` type, so this is just a temporary limitation. The `rustc_macro_derive` attribute requires the signature (similar to [macros 2.0][mac20sig]): From 7d891423604c0333e13e017c5aa26b1ebd04f5d9 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Thu, 21 Jul 2016 18:06:59 +1200 Subject: [PATCH 1032/1195] Update the RFC with discussion from the comment thread --- text/0000-name-resolution.md | 213 +++++++++++++++++++++++++++++++++-- 1 file changed, 204 insertions(+), 9 deletions(-) diff --git a/text/0000-name-resolution.md b/text/0000-name-resolution.md index 84bb8149bf9..fe1b877923f 100644 --- a/text/0000-name-resolution.md +++ b/text/0000-name-resolution.md @@ -36,9 +36,112 @@ At the same time, we should be able to accept more Rust programs by tweaking the current rules around imports and name shadowing. This should make programming using imports easier. 
+ +## Some issues in Rust's name resolution + +Whilst name resolution is sometimes considered a simple part of the compiler, +there are some details in Rust which make it tricky to properly specify and +implement. Some of these may seem obvious, but the distinctions will be +important later. + +* Imported vs declared names - a name can be imported (e.g., `use foo;`) or + declared (e.g., `fn foo ...`). +* Single vs glob imports - a name can be explicitly (e.g., `use a::foo;`) or + implicitly imported (e.g., `use a::*;` where `foo` is declared in `a`). +* Public vs private names - the visibility of names is somewhat tied up with + name resolution, for example in current Rust `use a::*;` only imports the + public names from `a`. +* Lexical scoping - a name can be inherited from a surrounding scope, rather + than being declared in the current one, e.g., `let foo = ...; { foo(); }`. +* There are different kinds of scopes - at the item level, names are not + inherited from outer modules into inner modules. Items may also be declared + inside functions and blocks within functions, with different rules from modules. + At the expression level, blocks (`{...}`) give explicit scope, however, from + the point of view of macro hygiene and region inference, each `let` statement + starts a new implicit scope. +* Explicitly declared vs macro generated names - a name can be declared + explicitly in the source text, or could be declared as the result of expanding + a macro. +* Rust has multiple namespaces - types, values, and macros exist in separate + namespaces (some items produce names in multiple namespaces). Imports + refer (implicitly) to one or more names in different namespaces. + + Note that all top-level (i.e., not parameters, etc.) path segments in a path + other than the last must be in the type namespace, e.g., in `a::b::c`, `a` and + `b` are assumed to be in the type namespace, and `c` may be in any namespace. 
+* Rust has an implicit prelude - the prelude defines a set of names which are + always (unless explicitly opted-out) nameable. The prelude includes macros. + Names in the prelude can be shadowed by any other names. + + # Detailed design [design]: #detailed-design +## Guiding principles + +We would like the following principles to hold. There may be edge cases where +they do not, but we would like these to be as small as possible (and prefer they +don't exist at all). + +#### Avoid 'time-travel' ambiguities, or different results of resolution if names +are resolved in different orders. + +Due to macro expansion, it is possible for a name to be resolved and then to +become ambiguous, or (with rules formulated in a certain way) for a name to be +resolved, then to be ambiguous, then to be resolvable again (possibly to +different bindings). + +Furthermore, there is some flexibility in the order in which macros can be +expanded. How a name resolves should be consistent under any ordering. + +The strongest form of this principle, I believe, is that at any stage of +macro expansion, and under any ordering of expansions, if a name resolves to a +binding then it should always (i.e., at any other stage of any other expansion +series) resolve to that binding, and if resolving a name produces an error +(n.b., distinct from not being able to resolve), it should always produce an +error. + + +#### Avoid errors due to the resolver being stuck. + +Errors with concrete causes and explanations are easier for the user to +understand and to correct. If an error is caused by name resolution getting +stuck, rather than by a concrete problem, this is hard to explain or correct. + +For example, if we support a rule that means that a certain glob can't be +expanded before a macro is, but the macro can only be named via that glob +import, then there is an obvious resolution that can't be reached due to our +ordering constraints. 
+ + +#### The order of declarations of items should be irrelevant. + +I.e., names should be able to be used before they are declared. Note that this +clearly does not hold for declarations of variables in statements inside +function bodies. + + +#### Macros should be manually expandable. + +Compiling a program should have the same result before and after expanding a +macro 'by hand', so long as hygiene is accounted for. + + +#### Glob imports should be manually expandable. + +A programmer should be able to replace a glob import with a list import that +imports any names imported by the glob and used in the current scope, without +changing name resolution behaviour. + + +#### Visibility should not affect name resolution. + +Clearly, visibility affects whether a name can be used or not. However, it +should not affect the mechanics of name resolution. I.e., changing a name from +public to private (or vice versa), should not cause more or fewer name +resolution errors (it may of course cause more or fewer accessibility errors). + + ## Changes to name resolution rules ### Multiple unused imports @@ -142,9 +245,12 @@ Note that in combination with the above rule, this means non-public imports are imported by globs where they are private but accessible. -### Globs and explicit names +### Explicit names may shadow implicit names -An explicit name may shadow a glob imported name without causing a name +Here, an implicit name means a name imported via a glob or inherited from an +outer scope (as opposed to being declared or imported directly in an inner scope). + +An explicit name may shadow an implicit name without causing a name resolution error. E.g., ``` @@ -168,6 +274,19 @@ mod boz { } ``` +or + +``` +fn main() { + struct Foo; // 1. + { + struct Foo; // 2. + + let x = Foo; // Ok and refers to declaration 2. + } +} +``` + Note that shadowing is namespace specific. I believe this is consistent with our general approach to name spaces. 
E.g., @@ -218,7 +337,61 @@ the name will continue to be valid, or there will be an error. Without this caveat, a name could be valid, and then after further expansion, become shadowed by a higher priority name. -This change is discussed in [issue 31337](https://github.com/rust-lang/rust/issues/31337). +An error is reported if there is an ambiguity between names due to the lack of +shadowing, e.g., (this example assumes modularised macros), + +``` +macro_rules! foo { + () => { + macro! bar { ... } + } +} + +mod a { + macro! bar { ... } +} + +mod b { + use a::*; + + foo!(); // Expands to `macro! bar { ... }`. + + bar!(); // ERROR: bar is ambiguous. +} +``` + +Note on the caveat: there will only be an error emitted if an ambiguous name is +used directly or indirectly in a macro use. I.e., is the name of a macro that is +used, or is the name of a module that is used to name a macro either in a macro +use or in an import. + +Alternatives: we could emit an error even if the ambiguous name is not used, or +as a compromise between these two, we could emit an error if the name is in the +type or macro namespace (a name in the value namespace can never cause problems). + +This change is discussed in [issue 31337](https://github.com/rust-lang/rust/issues/31337) +and on this RFC PR's comment thread. + + +### Re-exports, namespaces, and visibility. + +(This is something of a clarification point, rather than explicitly new behaviour. +See also discussion on [issue 31783](https://github.com/rust-lang/rust/issues/31783)). + +An import (`use`) or re-export (`pub use`) imports a name in all available +namespaces. E.g., `use a::foo;` will import `foo` in the type and value +namespaces if it is declared in those namespaces in `a`. + +For a name to be re-exported, it must be public, e.g, `pub use a::foo;` requires +that `foo` is declared publicly in `a`. This is complicated by namespaces. 
The +following behaviour should be followed for a re-export of `foo`: + +* `foo` is private in all namespaces in which it is declared - emit an error. +* `foo` is public in all namespaces in which it is declared - `foo` is + re-exported in all namespaces. +* `foo` is mixed public/private - `foo` is re-exported in the namespaces in which + it is declared publicly and imported but not re-exported in namespaces in which + it is declared privately. ## Changes to the implementation @@ -243,11 +416,11 @@ the supplying module, we can add it for the importing module. We then loop over the work list and try to lookup names. If a name has exactly one best binding then we use it (and record the binding on a list of resolved -names). If there are zero, or more than one possible binding, then we put it -back on the work list. When we reach a fixed point, i.e., the work list no -longer changes, then we are done. If the work list is empty, then -expansion/import resolution succeeded, otherwise there are names not found, or -ambiguous names, and we failed. +names). If there are zero then we put it back on the work list. If there is more +than one binding, then we record an ambiguity error. When we reach a fixed +point, i.e., the work list no longer changes, then we are done. If the work list +is empty, then expansion/import resolution succeeded, otherwise there are names +not found, or ambiguous names, and we failed. As we are looking up names, we record the resolutions in the binding table. If the name we are looking up is for a glob import, we add bindings for every @@ -270,8 +443,8 @@ In pseudo-code: // pass. fn parse_expand_and_resolve() { loop until fixed point { + process_names() loop until fixed point { - process_names() process_work_list() } expand_macros() @@ -337,6 +510,24 @@ naming, also the macro namespace), and that we must record whether a name is due to macro expansion or not to abide by the caveat to the 'explicit names shadow glob names' rule. 
+If Rust had a single namespace (or had some other properties), we would not have +to distinguish between failed and unresolved imports. However, it does and we +must. This is not clear from the pseudo-code because it elides namespaces, but +consider the following small example: + +``` +use a::foo; // foo exists in the value namespace of a. +use b::*; // foo exists in the type namespace of b. +``` + +Can we resolve a use fo `foo` in type position to the import from `b`? That +depends on whether `foo` exists in the type namespace in `a`. If we can prove +that it does not (i.e., resolution fails) then we can use the glob import. If we +cannot (i.e., the name is unresolved but we can't prove it will not resolve +later), then it is not safe to use the glob import because it may be shadowed by +the explicit import. (Note, since `foo` exists in at least the value namespace +in `a`, there will be no error due to a bad import). + In order to keep macro expansion comprehensible to programmers, we must enforce that all macro uses resolve to the same binding at the end of resolution as they do when they were resolved. @@ -358,6 +549,10 @@ If import resolution succeeds, then we check our record of name resolutions. We re-resolve and check we get the same result. We can also check for un-used macros at this point. +Note that the rules in the previous section have been carefully formulated to +ensure that this check is sufficient to prevent temporal ambiguities. There are +many slight variations for which this check would not be enough. 
+ ### Privacy In order to resolve imports (and in the future for macro privacy), we must be From f671c2017a5d2a738927acec41225cd51a468565 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Thu, 21 Jul 2016 15:37:21 -0400 Subject: [PATCH 1033/1195] propose the docs team --- text/0000-docs-team.md | 100 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 100 insertions(+) create mode 100644 text/0000-docs-team.md diff --git a/text/0000-docs-team.md b/text/0000-docs-team.md new file mode 100644 index 00000000000..f1866d57623 --- /dev/null +++ b/text/0000-docs-team.md @@ -0,0 +1,100 @@ +- Feature Name: N/A +- Start Date: 2016-07-21 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Create a team responsible for documentation for the Rust project. + +# Motivation +[motivation]: #motivation + +[RFC 1068] introduced a federated governance model for the Rust project. Several initial subteams were set up. There was a note +after the [original subteam list] saying this: + +[RFC 1068]: https://github.com/rust-lang/rfcs/blob/master/text/1068-rust-governance.md +[original subteam list]: https://github.com/rust-lang/rfcs/blob/master/text/1068-rust-governance.md#the-teams + +> In the long run, we will likely also want teams for documentation and for community events, but these can be spun up once there is a more clear need (and available resources). + +Now is the time for a documentation subteam. + +## Why documentation was left out + +Documentation was left out of the original list because it wasn't clear that there would be anyone but me on it. Furthermore, +one of the original reasons for the subteams was to decide who gets counted amongst consensus for RFCs, but it was unclear +how many documentation-related RFCs there would even be. + +## Chicken, meet egg + +However, RFCs are not only what subteams do. To quote the RFC: + +> * Shepherding RFCs for the subteam area. 
As always, that means (1) ensuring +> that stakeholders are aware of the RFC, (2) working to tease out various +> design tradeoffs and alternatives, and (3) helping build consensus. +> * Accepting or rejecting RFCs in the subteam area. +> * Setting policy on what changes in the subteam area require RFCs, and reviewing direct PRs for changes that do not require an RFC. +> * Delegating reviewer rights for the subteam area. The ability to r+ is not limited to team members, and in fact earning r+ rights is a good stepping stone toward team membership. Each team should set reviewing policy, manage reviewing rights, and ensure that reviews take place in a timely manner. (Thanks to Nick Cameron for this suggestion.) + +The first two are about RFCs themselves, but the second two are more pertinent to documentation. In particular, +deciding who gets `r+` rights is important. A lack of clarity in this area has been unfortunate, and has led to a +chicken and egg situation: without a documentation team, it's unclear how to be more involved in working on Rust's +documentation, but without people to be on the team, there's no reason to form a team. For this reason, I think +a small initial team will break this logjam, and provide room for new contributors to grow. + +# Detailed design +[design]: #detailed-design + +The Rust documentation team will be responsible for all of the things listed above. Specifically, they will pertain +to these areas of the Rust project: + +* The standard library documentation +* The book and other long-form docs +* Cargo's documentation +* The Error Index + +Furthermore, the documentation team will be available to help with ecosystem documentation, in a few ways. Firstly, +in an advisory capacity: helping people who want better documentation for their crates to understand how to accomplish +that goal. 
Furthermore, monitoring the overall ecosystem documentation, and identifying places where we could contribute +and make a large impact for all Rustaceans. If the Rust project itself has wonderful docs, but the ecosystem has terrible +docs, then people will still be frustrated with Rust's documentation situation, especially given our anti-batteries-included +attitude. To be clear, this does not mean _owning_ the ecosystem docs, but rather working to contribute in more ways +than just the Rust project itself. + +We will coordinate in the `#rust-docs` IRC room, and have regular meetings, as the team sees fit. Regular meetings will be +important to coordinate broader goals; and participation will be important for team members. We hold meetings weekly. + +## Membership + +* @steveklabnik, team lead +* @GuillaumeGomez +* @jonathandturner +* @peschkaj + +It's important to have a path towards attaining team membership; there are some other people who have already been doing +docs work that aren't on this list. These guidelines are not hard and fast, however, anyone wanting to eventually be a +member of the team should pursue these goals: + +* Contributing documentation patches to Rust itself +* Attending doc team meetings, which are open to all +* generally being available on IRC to collaborate with others + +I am not quantifying this exactly because it's not about reaching some specific number; adding someone to the team should +make sense if someone is doing all of these things. + +# Drawbacks +[drawbacks]: #drawbacks + +This is Yet Another Team. Do we have too many teams? I don't think so, but someone might. + +# Alternatives +[alternatives]: #alternatives + +The main alternative is not having a team. This is the status quo, so the situation is well-understood. + +# Unresolved questions +[unresolved]: #unresolved-questions + +Should we quantify the number of commits before you can get on the team? 
 From 15e41ada1d7cbc791f99a56958f8146d6cde2696 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Thu, 21 Jul 2016 16:12:22 -0400 Subject: [PATCH 1034/1195] some feedback from other team members --- text/0000-docs-team.md | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/text/0000-docs-team.md b/text/0000-docs-team.md index f1866d57623..3372d57b866 100644 --- a/text/0000-docs-team.md +++ b/text/0000-docs-team.md @@ -94,7 +94,14 @@ This is Yet Another Team. Do we have too many teams? I don't think so, but someo The main alternative is not having a team. This is the status quo, so the situation is well-understood. +It's possible that docs come under the purview of "tools", and so maybe the docs team would be an expansion +of the tools team, rather than its own new team. Or some other subteam. + # Unresolved questions [unresolved]: #unresolved-questions -Should we quantify the number of commits before you can get on the team? +Should we quantify the number of commits before you can get on the team? Or the amount of time? + +Rustdoc is a tool that's owned by the tools team, but the doc team makes +significant contributions to it. Is it part of the tools team or the docs team? +Maybe a backend-frontend split? Or maybe it just stays with tools? 
From 6a46363a22fdf6c2a4f49d36862a7e15851ccef4 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 22 Jul 2016 17:04:06 -0400 Subject: [PATCH 1035/1195] RFC #1559 is "Allow all literals in attributes" --- ...utes-with-literals.md => 1559-attributes-with-literals.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-attributes-with-literals.md => 1559-attributes-with-literals.md} (98%) diff --git a/text/0000-attributes-with-literals.md b/text/1559-attributes-with-literals.md similarity index 98% rename from text/0000-attributes-with-literals.md rename to text/1559-attributes-with-literals.md index 47958767092..e9044f87c78 100644 --- a/text/0000-attributes-with-literals.md +++ b/text/1559-attributes-with-literals.md @@ -1,7 +1,7 @@ - Feature Name: attributes_with_literals - Start Date: 2016-03-28 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1559 +- Rust Issue: https://github.com/rust-lang/rust/issues/34981 # Summary [summary]: #summary From ec7f3b1782d3295767fbd5dfc21a499e65e8a797 Mon Sep 17 00:00:00 2001 From: Joe Ranweiler Date: Sat, 23 Jul 2016 12:48:10 -0700 Subject: [PATCH 1036/1195] Add subsections to "Detailed design" section --- text/0000-named-field-puns.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/text/0000-named-field-puns.md b/text/0000-named-field-puns.md index dcfbb9c8af0..4d63907612f 100644 --- a/text/0000-named-field-puns.md +++ b/text/0000-named-field-puns.md @@ -62,16 +62,22 @@ and from ES6 # Detailed design [design]: #detailed-design +## Grammar + In the initializer for a `struct` with named fields, a `union` with named fields, or an enum variant with named fields, accept an identifier `field` as a shorthand for `field: field`. +## Interpretation + The shorthand initializer `field` always behaves in every possible way like the longhand initializer `field: field`. 
This RFC introduces no new behavior or semantics, only a purely syntactic shorthand. The rest of this section only provides further examples to explicitly clarify that this new syntax remains entirely orthogonal to other initializer behavior and semantics. +## Examples + If the struct `SomeStruct` has fields `field1` and `field2`, the initializer `SomeStruct { field1, field2 }` behaves in every way like the initializer `SomeStruct { field1: field1, field2: field2 }`. @@ -90,6 +96,8 @@ An initializer may use shorthand field initializers together with let b = SomeStruct { field1: field1, .. someStructInstance }; assert_eq!(a, b); +## Compilation errors + This shorthand initializer syntax does not introduce any new compiler errors that cannot also occur with the longhand initializer syntax `field: field`. Existing compiler errors that can occur with the longhand initializer syntax From 23b91944e517ddde4637aedabd2b93b54ea78d2c Mon Sep 17 00:00:00 2001 From: Joe Ranweiler Date: Sat, 23 Jul 2016 12:48:25 -0700 Subject: [PATCH 1037/1195] Fix EOF newline --- text/0000-named-field-puns.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-named-field-puns.md b/text/0000-named-field-puns.md index 4d63907612f..ccec3462c7a 100644 --- a/text/0000-named-field-puns.md +++ b/text/0000-named-field-puns.md @@ -202,4 +202,4 @@ something else), it could look like the following. This has the same drawbacks as sigils except that it won't be confused for symbols in other languages or adding more sigils. It also has the benefit -of being something that can be searched for in documentation. \ No newline at end of file +of being something that can be searched for in documentation. 
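As a hedged illustration of the named-field shorthand that the preceding patches describe (this sketch is not part of the patch series), the following assumes a hypothetical `SomeStruct` with two `i32` fields; the helper names `build_shorthand`, `build_longhand`, and `with_field1` are invented for the example. It is meant to show that `SomeStruct { field1, field2 }` produces the same value as `SomeStruct { field1: field1, field2: field2 }`, and that a shorthand field can be combined with functional record update syntax.

```rust
// Hypothetical struct used only for illustration; the field types are an assumption.
#[derive(Debug, Clone, PartialEq)]
struct SomeStruct {
    field1: i32,
    field2: i32,
}

// Builds the struct with the shorthand initializer: `field` stands for `field: field`.
fn build_shorthand(field1: i32, field2: i32) -> SomeStruct {
    SomeStruct { field1, field2 }
}

// Builds the same struct with the longhand `field: field` form.
fn build_longhand(field1: i32, field2: i32) -> SomeStruct {
    SomeStruct { field1: field1, field2: field2 }
}

// Mixes one shorthand field with functional record update from `base`.
fn with_field1(field1: i32, base: SomeStruct) -> SomeStruct {
    SomeStruct { field1, ..base }
}

fn main() {
    let a = build_shorthand(1, 2);
    let b = build_longhand(1, 2);
    // Both initializer forms should yield equal values.
    assert_eq!(a, b);

    // Only `field1` is replaced; `field2` is taken from `a`.
    let c = with_field1(10, a.clone());
    assert_eq!(c, SomeStruct { field1: 10, field2: 2 });
}
```

Because the shorthand is purely syntactic, the two builder functions should be interchangeable; the longhand version is kept here only to make the comparison explicit.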
From 7016ba90bdd59df611a703f342a3f4fe186197a7 Mon Sep 17 00:00:00 2001 From: Joe Ranweiler Date: Sat, 23 Jul 2016 12:48:51 -0700 Subject: [PATCH 1038/1195] Add description of formal grammar change --- text/0000-named-field-puns.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/text/0000-named-field-puns.md b/text/0000-named-field-puns.md index ccec3462c7a..67eb08b7e92 100644 --- a/text/0000-named-field-puns.md +++ b/text/0000-named-field-puns.md @@ -68,6 +68,16 @@ In the initializer for a `struct` with named fields, a `union` with named fields, or an enum variant with named fields, accept an identifier `field` as a shorthand for `field: field`. +With reference to the grammar in `parser-lalr.y`, this proposal would +expand the `field_init` +[rule](https://github.com/rust-lang/rust/blob/master/src/grammar/parser-lalr.y#L1663-L1665) +to the following: + + field_init + : ident + | ident ':' expr + ; + ## Interpretation The shorthand initializer `field` always behaves in every possible way like the From 6098278299d00d184d5b735ca6751c3e1765b406 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Tue, 26 Jul 2016 13:32:46 +1200 Subject: [PATCH 1039/1195] More changes HT @jseyfried --- text/0000-name-resolution.md | 27 ++++++++++++++++++--------- 1 file changed, 18 insertions(+), 9 deletions(-) diff --git a/text/0000-name-resolution.md b/text/0000-name-resolution.md index fe1b877923f..cb261d0afcf 100644 --- a/text/0000-name-resolution.md +++ b/text/0000-name-resolution.md @@ -1,4 +1,4 @@ -- Feature Name: N/A +- Feature Name: item_like_imports - Start Date: 2016-02-09 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) @@ -310,7 +310,7 @@ mod boz { ``` Caveat: an explicit name which is defined by the expansion of a macro does **not** -shadow glob imports. Example: +shadow implicit names. Example: ``` macro_rules! 
foo { @@ -332,10 +332,10 @@ mod b { ``` The rationale for this caveat is so that during import resolution, if we have a -glob import we can be sure that any imported names will not be shadowed, either -the name will continue to be valid, or there will be an error. Without this -caveat, a name could be valid, and then after further expansion, become shadowed -by a higher priority name. +glob import (or other implicit name) we can be sure that any imported names will +not be shadowed, either the name will continue to be valid, or there will be an +error. Without this caveat, a name could be valid, and then after further +expansion, become shadowed by a higher priority name. An error is reported if there is an ambiguity between names due to the lack of shadowing, e.g., (this example assumes modularised macros), @@ -393,6 +393,9 @@ following behaviour should be followed for a re-export of `foo`: it is declared publicly and imported but not re-exported in namespaces in which it is declared privately. +For a glob re-export, there is an error if there are no public items in any +namespace. Otherwise private names are imported and public names are re-exported +on a per-namespace basis (i.e., following the above rules). ## Changes to the implementation @@ -433,8 +436,7 @@ in the same way as we parsed the original program. We add new names to the binding table, and expand any new macro uses. If we add names for a module which has back links, we must follow them and add -these names to the importing module (if they are accessible). When following -these back links, we check for cycles, signaling an error if one is found. +these names to the importing module (if they are accessible). In pseudo-code: @@ -520,7 +522,7 @@ use a::foo; // foo exists in the value namespace of a. use b::*; // foo exists in the type namespace of b. ``` -Can we resolve a use fo `foo` in type position to the import from `b`? That +Can we resolve a use of `foo` in type position to the import from `b`? 
That depends on whether `foo` exists in the type namespace in `a`. If we can prove that it does not (i.e., resolution fails) then we can use the glob import. If we cannot (i.e., the name is unresolved but we can't prove it will not resolve @@ -641,6 +643,13 @@ name lookup could be done lazily (probably with some caching) so no tables binding names to definitions are kept. I prefer the first option, but this is not really in scope for this RFC. +## `pub(restricted)` + +Where this RFC touches on the privacy system there are some edge cases involving +the `pub(path)` form of restricted visibility. I expect the precise solutions +will be settled during implementation and this RFC should be amended to reflect +those choices. + # References From d795e41248db60a24d6a86878c1027aaebde06b6 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Tue, 26 Jul 2016 09:46:15 -0400 Subject: [PATCH 1040/1195] remove empty bullet --- text/0000-rustc-bug-fix-procedure.md | 1 - 1 file changed, 1 deletion(-) diff --git a/text/0000-rustc-bug-fix-procedure.md b/text/0000-rustc-bug-fix-procedure.md index c1d158a8123..d709f58d740 100644 --- a/text/0000-rustc-bug-fix-procedure.md +++ b/text/0000-rustc-bug-fix-procedure.md @@ -263,7 +263,6 @@ There are obviously many points that we could tweak in this policy: - Eliminate the tracking issue. - Change the stabilization schedule. 
-- Two other obvious (and rather extreme) alternatives are not having a policy and not making any sort of breaking change at all: From b8ae71c876a0e53bea64d2dee8a42328d6c0b1b9 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Tue, 26 Jul 2016 10:39:24 -0400 Subject: [PATCH 1041/1195] add a note about how the team is intended as a temporary team with a specific purpose --- text/0000-memory-model-strike-team.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/text/0000-memory-model-strike-team.md b/text/0000-memory-model-strike-team.md index b3da16f348f..236323ab4be 100644 --- a/text/0000-memory-model-strike-team.md +++ b/text/0000-memory-model-strike-team.md @@ -135,6 +135,13 @@ that roughly correspond to C++11 atomics, and the intention is that we can layer our rules for sequential execution atop those rules for parallel execution. +## Termination conditions + +The unsafe code guidelines team is intended as a temporary strike team +with the goal of producing the documents described below. Once the RFC +for those documents have been approved, responsibility for maintaining +the documents falls to the lang team. + ## Time frame Working out a a set of rules for unsafe code is a detailed process and From 055c5005ddfad70c51ed5ca52ed4c4533c00c252 Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Wed, 27 Jul 2016 10:38:38 -0400 Subject: [PATCH 1042/1195] Fix type names in motivation/microbenchmark. --- text/0000-fused-iterator.md | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/text/0000-fused-iterator.md b/text/0000-fused-iterator.md index 88093c97f3c..a6a5d3553a9 100644 --- a/text/0000-fused-iterator.md +++ b/text/0000-fused-iterator.md @@ -17,11 +17,11 @@ no-op if `I` implements `FusedIterator`. Iterators are allowed to return whatever they want after returning `None` once. However, assuming that an iterator continues to return `None` can make -implementing some algorithms/adapters easier. 
Therefore, `Fused` and -`Iterator::fuse` exist. Unfortunately, the `Fused` iterator adapter introduces a +implementing some algorithms/adapters easier. Therefore, `Fuse` and +`Iterator::fuse` exist. Unfortunately, the `Fuse` iterator adapter introduces a noticeable overhead. Furthermore, many iterators (most if not all iterators in std) already act as if they were fused (this is considered to be the "polite" -behavior). Therefore, it would be nice to be able to pay the `Fused` overhead +behavior). Therefore, it would be nice to be able to pay the `Fuse` overhead only when necessary. Microbenchmarks: @@ -42,28 +42,28 @@ use std::ops::Range; #[derive(Clone, Debug)] #[must_use = "iterator adaptors are lazy and do nothing unless consumed"] -pub struct MyFuse { +pub struct Fuse { iter: I, done: bool } -pub trait Fused: Iterator {} +pub trait FusedIterator: Iterator {} trait IterExt: Iterator + Sized { - fn myfuse(self) -> MyFuse { - MyFuse { + fn myfuse(self) -> Fuse { + Fuse { iter: self, done: false, } } } -impl Fused for MyFuse where MyFuse: Iterator {} -impl Fused for Range where Range: Iterator {} +impl FusedIterator for Fuse where Fuse: Iterator {} +impl FusedIterator for Range where Range: Iterator {} impl IterExt for T {} -impl Iterator for MyFuse where I: Iterator { +impl Iterator for Fuse where I: Iterator { type Item = ::Item; #[inline] @@ -78,14 +78,14 @@ impl Iterator for MyFuse where I: Iterator { } } -impl Iterator for MyFuse where I: Iterator + Fused { +impl Iterator for Fuse where I: FusedIterator { #[inline] fn next(&mut self) -> Option<::Item> { self.iter.next() } } -impl ExactSizeIterator for MyFuse where I: ExactSizeIterator {} +impl ExactSizeIterator for Fuse where I: ExactSizeIterator {} #[bench] fn myfuse(b: &mut test::Bencher) { From c697b6a11392acd8a4ad01d00447867d8968c535 Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Wed, 27 Jul 2016 10:42:19 -0400 Subject: [PATCH 1043/1195] Add unresolved question about removal of the Fuse::done 
unnecessary field. --- text/0000-fused-iterator.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/text/0000-fused-iterator.md b/text/0000-fused-iterator.md index a6a5d3553a9..f1d6093245a 100644 --- a/text/0000-fused-iterator.md +++ b/text/0000-fused-iterator.md @@ -269,3 +269,8 @@ change. Should this trait be unsafe? I can't think of any way generic unsafe code could end up relying on the guarantees of `Fused`. + +Also, it's possible to implement the specialized `Fuse` struct without a useless +`done` bool. Unfortunately, it's *very* messy. IMO, this is not worth it for now +and can always be fixed in the future as it doesn't change the `FusedIterator` +trait. From 480919045039f950783f2a26fb934cf6750e5c6b Mon Sep 17 00:00:00 2001 From: Anthony Ramine Date: Mon, 27 Jun 2016 12:28:08 +0200 Subject: [PATCH 1044/1195] Introduce non-panicking borrow methods on `RefCell` --- text/0000-try-borrow.md | 70 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 70 insertions(+) create mode 100644 text/0000-try-borrow.md diff --git a/text/0000-try-borrow.md b/text/0000-try-borrow.md new file mode 100644 index 00000000000..8487204f0ff --- /dev/null +++ b/text/0000-try-borrow.md @@ -0,0 +1,70 @@ +- Feature Name: try_borrow +- Start Date: 2016-06-27 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Introduce non-panicking borrow methods on `RefCell`. + +# Motivation +[motivation]: #motivation + +Whenever something is built from user input, for example a graph in which nodes +are `RefCell` values, it is essential to avoid panicking on bad input. The +only way to avoid panics on cyclic input in this case is a way to +conditionally-borrow the cell contents. + +# Detailed design +[design]: #detailed-design + +```rust +/// Returned when `RefCell::try_borrow` fails. +pub struct BorrowError { _inner: () } + +/// Returned when `RefCell::try_borrow_mut` fails. 
+pub struct BorrowMutError { _inner: () }
+
+impl<T> RefCell<T> {
+    /// Tries to immutably borrow the value. This returns `Err(_)` if the cell
+    /// was already borrowed mutably.
+    pub fn try_borrow(&self) -> Result<Ref<T>, BorrowError> { ... }
+
+    /// Tries to mutably borrow the value. This returns `Err(_)` if the cell
+    /// was already borrowed.
+    pub fn try_borrow_mut(&self) -> Result<RefMut<T>, BorrowMutError> { ... }
+}
+```
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+This departs from the fallible/infallible convention where we avoid providing
+both panicking and non-panicking methods for the same operation.
+
+# Alternatives
+[alternatives]: #alternatives
+
+The alternative is to provide a `borrow_state` method returning the state
+of the borrow flag of the cell, i.e.:
+
+```rust
+pub enum BorrowState {
+    Reading,
+    Writing,
+    Unused,
+}
+
+impl<T> RefCell<T> {
+    pub fn borrow_state(&self) -> BorrowState { ... }
+}
+```
+
+See [the Rust tracking issue](https://github.com/rust-lang/rust/issues/27733)
+for this feature.
+
+# Unresolved questions
+[unresolved]: #unresolved-questions
+
+There are no unresolved questions.
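Editorial note, not part of the patch: the non-panicking borrow methods proposed in the RFC above can be sketched as follows. This is a minimal illustration that assumes the methods behave as the RFC describes (returning `Err(_)` on a conflicting borrow instead of panicking); the helper names `try_add` and `try_read` are hypothetical and do not come from the RFC.

```rust
use std::cell::RefCell;

/// Hypothetical helper: adds `delta` to the cell's value, or reports
/// failure (instead of panicking) if the cell is already borrowed.
fn try_add(cell: &RefCell<i32>, delta: i32) -> bool {
    match cell.try_borrow_mut() {
        Ok(mut value) => {
            *value += delta;
            true
        }
        // The cell is borrowed elsewhere; `borrow_mut` would have panicked here.
        Err(_) => false,
    }
}

/// Hypothetical helper: reads the value, or returns `None` if the cell
/// is currently mutably borrowed.
fn try_read(cell: &RefCell<i32>) -> Option<i32> {
    cell.try_borrow().ok().map(|value| *value)
}
```

While a `borrow_mut` guard on the same cell is alive, both helpers report failure rather than panicking; once the guard is dropped they succeed again. This is the property the RFC's motivation relies on for cyclic input.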
From 8f3ef24d03c3525fe99c929d5a5e222d5752435a Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 27 Jul 2016 09:45:28 -0700 Subject: [PATCH 1045/1195] RFC 1660 is RefCell::{try_borrow, try_borrow_mut} --- text/{0000-try-borrow.md => 1660-try-borrow.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-try-borrow.md => 1660-try-borrow.md} (90%) diff --git a/text/0000-try-borrow.md b/text/1660-try-borrow.md similarity index 90% rename from text/0000-try-borrow.md rename to text/1660-try-borrow.md index 8487204f0ff..57a5566efa7 100644 --- a/text/0000-try-borrow.md +++ b/text/1660-try-borrow.md @@ -1,7 +1,7 @@ -- Feature Name: try_borrow +- Feature Name: `try_borrow` - Start Date: 2016-06-27 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1660](https://github.com/rust-lang/rfcs/pull/1660) +- Rust Issue: [rust-lang/rust#35070](https://github.com/rust-lang/rust/issues/35070) # Summary [summary]: #summary From 680ee797c2e8804b1d4c12f14b664ee64b55a108 Mon Sep 17 00:00:00 2001 From: Ashley Williams Date: Wed, 27 Jul 2016 10:06:00 -0700 Subject: [PATCH 1046/1195] add debug_assert_ne to proposal --- text/0000-assert_ne.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/text/0000-assert_ne.md b/text/0000-assert_ne.md index 0be68e70a84..6a7a0910259 100644 --- a/text/0000-assert_ne.md +++ b/text/0000-assert_ne.md @@ -8,6 +8,7 @@ `assert_ne` is a macro that takes 2 arguments and panics if they are equal. It works and is implemented identically to `assert_eq` and serves as its complement. +This proposal also includes a `debug_assert_ne`, matching `debug_assert_eq`. # Motivation [motivation]: #motivation @@ -38,6 +39,14 @@ macro_rules! assert_ne { } ``` +This is complemented by a `debug_assert_ne` (similar to `debug_assert_eq`): + +```rust +macro_rules! 
debug_assert_ne { + ($($arg:tt)*) => (if cfg!(debug_assertions) { assert_ne!($($arg)*); }) +} +``` + # Drawbacks [drawbacks]: #drawbacks From 9d39535fe880e1800c1ed4d4871b9fb4788798d5 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 27 Jul 2016 10:21:16 -0700 Subject: [PATCH 1047/1195] RFC 1653 is an `assert_ne` macro --- text/{0000-assert_ne.md => 1653-assert_ne.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-assert_ne.md => 1653-assert_ne.md} (91%) diff --git a/text/0000-assert_ne.md b/text/1653-assert_ne.md similarity index 91% rename from text/0000-assert_ne.md rename to text/1653-assert_ne.md index 6a7a0910259..78fd4d29aad 100644 --- a/text/0000-assert_ne.md +++ b/text/1653-assert_ne.md @@ -1,7 +1,7 @@ - Feature Name: Assert Not Equals Macro (`assert_ne`) - Start Date: (2016-06-17) -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1653](https://github.com/rust-lang/rfcs/pull/1653) +- Rust Issue: [rust-lang/rust#35073](https://github.com/rust-lang/rust/issues/35073) # Summary [summary]: #summary From d2183557bfe0ba9978db3913a2625463116fe1fe Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 29 Jul 2016 15:58:42 -0400 Subject: [PATCH 1048/1195] RFC 1504 is 128-bit integer support --- text/{0000-int128.md => 1504-int128.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-int128.md => 1504-int128.md} (98%) diff --git a/text/0000-int128.md b/text/1504-int128.md similarity index 98% rename from text/0000-int128.md rename to text/1504-int128.md index f5e5b57942e..90f0f95ca5a 100644 --- a/text/0000-int128.md +++ b/text/1504-int128.md @@ -1,7 +1,7 @@ - Feature Name: int128 - Start Date: 21-02-2016 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1504 +- Rust Issue: https://github.com/rust-lang/rust/issues/35118 # Summary [summary]: #summary From 83116f41258e01904c4b100236de8abbe192eb5d Mon Sep 17 
00:00:00 2001 From: Niko Matsakis Date: Fri, 29 Jul 2016 16:13:08 -0400 Subject: [PATCH 1049/1195] RFC 1548 is global-asm --- text/{0000-global-asm.md => 1548-global-asm.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-global-asm.md => 1548-global-asm.md} (94%) diff --git a/text/0000-global-asm.md b/text/1548-global-asm.md similarity index 94% rename from text/0000-global-asm.md rename to text/1548-global-asm.md index f30731af9ca..81c651c0f74 100644 --- a/text/0000-global-asm.md +++ b/text/1548-global-asm.md @@ -1,7 +1,7 @@ - Feature Name: global_asm - Start Date: 2016-03-18 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1548 +- Rust Issue: https://github.com/rust-lang/rust/issues/35119 # Summary [summary]: #summary From 6e831827161216f7ac9a5cef5c64ab48f7a7630a Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 29 Jul 2016 16:18:38 -0400 Subject: [PATCH 1050/1195] RFC 1560 is name resolution --- text/{0000-name-resolution.md => 1560-name-resolution.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-name-resolution.md => 1560-name-resolution.md} (99%) diff --git a/text/0000-name-resolution.md b/text/1560-name-resolution.md similarity index 99% rename from text/0000-name-resolution.md rename to text/1560-name-resolution.md index cb261d0afcf..1afe57b90e3 100644 --- a/text/0000-name-resolution.md +++ b/text/1560-name-resolution.md @@ -1,7 +1,7 @@ - Feature Name: item_like_imports - Start Date: 2016-02-09 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1560 +- Rust Issue: https://github.com/rust-lang/rust/issues/35120 # Summary [summary]: #summary From 7dd371bd39d7f02e2f832694af76a7af675cc779 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 29 Jul 2016 16:24:23 -0400 Subject: [PATCH 1051/1195] amendment --- text/1444-union.md | 6 ++++++ 1 file changed, 6 insertions(+) 
diff --git a/text/1444-union.md b/text/1444-union.md index e4502f0c279..75dfe8e8066 100644 --- a/text/1444-union.md +++ b/text/1444-union.md @@ -418,3 +418,9 @@ structs. For instance, a union may contain anonymous structs to define non-overlapping fields, and a struct may contain an anonymous union to define overlapping fields. This RFC does not define anonymous unions or structs, but a subsequent RFC may wish to do so. + +# Edit History + +- This RFC was amended in https://github.com/rust-lang/rfcs/pull/1663/ + to clarify the behavior when an individual field whose type + implements `Drop`. From 773cbfb1a8f17835173f47a76f6aee72095775a8 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 29 Jul 2016 16:29:23 -0400 Subject: [PATCH 1052/1195] RFC 1216 is promote ! to a type --- text/{0000-bang-type.md => 1216-bang-type.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-bang-type.md => 1216-bang-type.md} (99%) diff --git a/text/0000-bang-type.md b/text/1216-bang-type.md similarity index 99% rename from text/0000-bang-type.md rename to text/1216-bang-type.md index 13db77cf612..35acefdfa51 100644 --- a/text/0000-bang-type.md +++ b/text/1216-bang-type.md @@ -1,7 +1,7 @@ - Feature Name: bang_type - Start Date: 2015-07-19 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1216 +- Rust Issue: https://github.com/rust-lang/rust/issues/35121 # Summary From b332a924531e72070b91ca219580eb0c198c28b7 Mon Sep 17 00:00:00 2001 From: Alex Burka Date: Mon, 1 Aug 2016 14:32:33 -0400 Subject: [PATCH 1053/1195] RFC for mem::discriminant() --- text/0000-discriminant.md | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) create mode 100644 text/0000-discriminant.md diff --git a/text/0000-discriminant.md b/text/0000-discriminant.md new file mode 100644 index 00000000000..a45c6110e58 --- /dev/null +++ b/text/0000-discriminant.md @@ -0,0 +1,36 @@ +- Feature Name: (fill me in 
with a unique ident, my_awesome_feature) +- Start Date: (fill me in with today's date, YYYY-MM-DD) +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +One para explanation of the feature. + +# Motivation +[motivation]: #motivation + +Why are we doing this? What use cases does it support? What is the expected outcome? + +# Detailed design +[design]: #detailed-design + +This is the bulk of the RFC. Explain the design in enough detail for somebody familiar +with the language to understand, and for somebody familiar with the compiler to implement. +This should get into specifics and corner-cases, and include examples of how the feature is used. + +# Drawbacks +[drawbacks]: #drawbacks + +Why should we *not* do this? + +# Alternatives +[alternatives]: #alternatives + +What other designs have been considered? What is the impact of not doing this? + +# Unresolved questions +[unresolved]: #unresolved-questions + +What parts of the design are still TBD? From 597d1e59e4fa649476d735c568c408e6324d0077 Mon Sep 17 00:00:00 2001 From: Alex Burka Date: Mon, 1 Aug 2016 14:36:19 -0400 Subject: [PATCH 1054/1195] write RFC --- text/0000-discriminant.md | 104 ++++++++++++++++++++++++++++++++++---- 1 file changed, 93 insertions(+), 11 deletions(-) diff --git a/text/0000-discriminant.md b/text/0000-discriminant.md index a45c6110e58..ba3983a248a 100644 --- a/text/0000-discriminant.md +++ b/text/0000-discriminant.md @@ -1,36 +1,118 @@ -- Feature Name: (fill me in with a unique ident, my_awesome_feature) -- Start Date: (fill me in with today's date, YYYY-MM-DD) +- Feature Name: discriminant +- Start Date: 2016-08-01 - RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- Rust Issue: [#24263](https://github.com/rust-lang/rust/pull/24263), [#34785](https://github.com/rust-lang/rust/pull/34785) # Summary [summary]: #summary -One para explanation of the feature. 
+Add a function that extracts the discriminant from an enum variant as a comparable, hashable, printable, but (for now) opaque and unorderable type. # Motivation [motivation]: #motivation -Why are we doing this? What use cases does it support? What is the expected outcome? +When using an ADT enum that contains data in some of the variants, it is sometimes desirable to know the variant but ignore the data, in order to compare two values by variant or store variants in a hash map when the data is either unhashable or unimportant. + +The motivation for this is mostly identical to [RFC 639](https://github.com/rust-lang/rfcs/blob/master/text/0639-discriminant-intrinsic.md#motivation). # Detailed design [design]: #detailed-design -This is the bulk of the RFC. Explain the design in enough detail for somebody familiar -with the language to understand, and for somebody familiar with the compiler to implement. -This should get into specifics and corner-cases, and include examples of how the feature is used. +The proposed design has been implemented at [#34785](https://github.com/rust-lang/rust/pull/34785) (after some back-and-forth). That implementation is copied at the end of this section for reference. + +A struct `Discriminant` and a free function `fn discriminant(v: &T) -> Discriminant` are added to `std::mem` (for lack of a better home, and noting that `std::mem` already contains similar parametricity escape hatches such as `size_of`). For now, the `Discriminant` struct is simply a newtype over `u64`, because that's what the `discriminant_value` intrinsic returns, and a `PhantomData` to allow it to be generic over `T`. + +Making `Discriminant` generic provides several benefits: + +- `discriminant(&EnumA::Variant) == discriminant(&EnumB::Variant)` is statically prevented. +- In the future, we can implement different behavior for different kinds of enums. 
For example, if we add a way to distinguish C-like enums at the type level, then we can add a method like `Discriminant::into_inner` for only those enums. Or enums with certain kinds of discriminants could become orderable. + +The function requires a `Reflect` bound on its argument because discriminant extraction is a partial violation of parametricity, in that a generic function with no bounds on its type parameters can nonetheless find out some information about the input types, or perform a "partial equality" comparison. This restriction is debatable (open question #2), especially in light of specialization. The situation is comparable to `TypeId::of` (which requires the bound) and `mem::size_of_val` (which does not). Note that including a bound is the conservative decision, because it can be backwards-compatibly removed. + +```rust +/// Returns a value uniquely identifying the enum variant in `v`. +/// +/// If `T` is not an enum, calling this function will not result in undefined behavior, but the +/// return value is unspecified. +/// +/// # Stability +/// +/// Discriminants can change if enum variants are reordered, if a new variant is added +/// in the middle, or (in the case of a C-like enum) if explicitly set discriminants are changed. +/// Therefore, relying on the discriminants of enums outside of your crate may be a poor decision. +/// However, discriminants of an identical enum should not change between minor versions of the +/// same compiler. 
+/// +/// # Examples +/// +/// This can be used to compare enums that carry data, while disregarding +/// the actual data: +/// +/// ``` +/// #![feature(discriminant_value)] +/// use std::mem; +/// +/// enum Foo { A(&'static str), B(i32), C(i32) } +/// +/// assert!(mem::discriminant(&Foo::A("bar")) == mem::discriminant(&Foo::A("baz"))); +/// assert!(mem::discriminant(&Foo::B(1)) == mem::discriminant(&Foo::B(2))); +/// assert!(mem::discriminant(&Foo::B(3)) != mem::discriminant(&Foo::C(3))); +/// ``` +pub fn discriminant(v: &T) -> Discriminant { + unsafe { + Discriminant(intrinsics::discriminant_value(v), PhantomData) + } +} + +/// Opaque type representing the discriminant of an enum. +/// +/// See the `discriminant` function in this module for more information. +pub struct Discriminant(u64, PhantomData<*const T>); + +impl Copy for Discriminant {} + +impl clone::Clone for Discriminant { + fn clone(&self) -> Self { + *self + } +} + +impl cmp::PartialEq for Discriminant { + fn eq(&self, rhs: &Self) -> bool { + self.0 == rhs.0 + } +} + +impl cmp::Eq for Discriminant {} + +impl hash::Hash for Discriminant { + fn hash(&self, state: &mut H) { + self.0.hash(state); + } +} + +impl fmt::Debug for Discriminant { + fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result { + self.0.fmt(fmt) + } +} +``` # Drawbacks [drawbacks]: #drawbacks -Why should we *not* do this? +1. Anytime we reveal more details about the memory representation of a `repr(rust)` type, we add back-compat guarantees. The author is of the opinion that the proposed `Discriminant` newtype still hides enough to mitigate this drawback. (But see open question #1.) +2. Adding another function and type to core implies an additional maintenance burden, especially when more enum layout optimizations come around (however, there is hardly any burden on top of that associated with the extant `discriminant_value` intrinsic). # Alternatives [alternatives]: #alternatives -What other designs have been considered? 
What is the impact of not doing this? +1. Do nothing: there is no stable way to extract the discriminant from an enum variant. Users who need such a feature will need to write (or generate) big match statements and hope they optimize well (this has been servo's approach). +2. Directly stabilize the `discriminant_value` intrinsic, or a wrapper that doesn't use an opaque newtype. This more drastically precludes future enum representation optimizations, and won't be able to take advantage of future type system improvements that would let `discriminant` return a type dependent on the enum. # Unresolved questions [unresolved]: #unresolved-questions -What parts of the design are still TBD? +1. Can the return value of `discriminant(&x)` be considered stable between subsequent compilations of the same code? How about if the enum in question is changed by modifying a variant's name? by adding a variant? +2. Is the `T: Reflect` bound necessary? +3. Can `Discriminant` implement `PartialOrd`? From e5c9852cda73c0c6e163b73a90d0c13ca2275a1e Mon Sep 17 00:00:00 2001 From: Alex Burka Date: Mon, 1 Aug 2016 15:06:16 -0400 Subject: [PATCH 1055/1195] remove Reflect bound --- text/0000-discriminant.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-discriminant.md b/text/0000-discriminant.md index ba3983a248a..4d1ccdaed0b 100644 --- a/text/0000-discriminant.md +++ b/text/0000-discriminant.md @@ -27,7 +27,7 @@ Making `Discriminant` generic provides several benefits: - `discriminant(&EnumA::Variant) == discriminant(&EnumB::Variant)` is statically prevented. - In the future, we can implement different behavior for different kinds of enums. For example, if we add a way to distinguish C-like enums at the type level, then we can add a method like `Discriminant::into_inner` for only those enums. Or enums with certain kinds of discriminants could become orderable. 
-The function requires a `Reflect` bound on its argument because discriminant extraction is a partial violation of parametricity, in that a generic function with no bounds on its type parameters can nonetheless find out some information about the input types, or perform a "partial equality" comparison. This restriction is debatable (open question #2), especially in light of specialization. The situation is comparable to `TypeId::of` (which requires the bound) and `mem::size_of_val` (which does not). Note that including a bound is the conservative decision, because it can be backwards-compatibly removed. +The function no longer requires a `Reflect` bound on its argument even though discriminant extraction is a partial violation of parametricity, in that a generic function with no bounds on its type parameters can nonetheless find out some information about the input types, or perform a "partial equality" comparison. This is debatable (see [this comment](https://github.com/rust-lang/rfcs/pull/639#issuecomment-86441840), [this comment](https://github.com/rust-lang/rfcs/pull/1696#issuecomment-236669066) and open question #2), especially in light of specialization. The situation is comparable to `TypeId::of` (which requires the bound) and `mem::size_of_val` (which does not). Note that including a bound is the conservative decision, because it can be backwards-compatibly removed. ```rust /// Returns a value uniquely identifying the enum variant in `v`. 
@@ -58,7 +58,7 @@ The function requires a `Reflect` bound on its argument because discriminant ext /// assert!(mem::discriminant(&Foo::B(1)) == mem::discriminant(&Foo::B(2))); /// assert!(mem::discriminant(&Foo::B(3)) != mem::discriminant(&Foo::C(3))); /// ``` -pub fn discriminant(v: &T) -> Discriminant { +pub fn discriminant(v: &T) -> Discriminant { unsafe { Discriminant(intrinsics::discriminant_value(v), PhantomData) } From 960657a40f6bf5b58332bdc985ceda3b5505b0af Mon Sep 17 00:00:00 2001 From: Liigo Zhuang Date: Tue, 2 Aug 2016 11:04:15 +0800 Subject: [PATCH 1056/1195] fix minor typo: s/NonZero/Zeroable --- text/1504-int128.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/1504-int128.md b/text/1504-int128.md index 90f0f95ca5a..4b43883bb39 100644 --- a/text/1504-int128.md +++ b/text/1504-int128.md @@ -71,7 +71,7 @@ Several changes need to be done to libcore: - `src/libcore/num/wrapping.rs`: Implement methods for `Wrapping` and `Wrapping`. - `src/libcore/fmt/num.rs`: Implement `Binary`, `Octal`, `LowerHex`, `UpperHex`, `Debug` and `Display` for `u128` and `i128`. - `src/libcore/cmp.rs`: Implement `Eq`, `PartialEq`, `Ord` and `PartialOrd` for `u128` and `i128`. -- `src/libcore/nonzero.rs`: Implement `NonZero` for `u128` and `i128`. +- `src/libcore/nonzero.rs`: Implement `Zeroable` for `u128` and `i128`. - `src/libcore/iter.rs`: Implement `Step` for `u128` and `i128`. - `src/libcore/clone.rs`: Implement `Clone` for `u128` and `i128`. - `src/libcore/default.rs`: Implement `Default` for `u128` and `i128`. 
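Editorial note, not part of the patch: the `mem::discriminant` function from the discriminant RFC patches earlier in this series (the version without the `Reflect` bound) can be illustrated with a short sketch. It assumes the semantics the RFC describes, namely an opaque `Discriminant<T>` that is comparable and hashable. The `Event` enum and both helper functions are hypothetical names introduced only for this example.

```rust
use std::collections::HashMap;
use std::mem::{discriminant, Discriminant};

/// Hypothetical enum whose variants carry data of different shapes.
#[derive(Debug)]
enum Event {
    Key(char),
    Click { x: i32, y: i32 },
    Quit,
}

/// Hypothetical helper: true when both values are the same variant,
/// regardless of the data they carry.
fn same_variant(a: &Event, b: &Event) -> bool {
    discriminant(a) == discriminant(b)
}

/// Hypothetical helper: counts events per variant, using the opaque
/// discriminant as a hash-map key.
fn count_by_variant(events: &[Event]) -> HashMap<Discriminant<Event>, usize> {
    let mut counts = HashMap::new();
    for event in events {
        *counts.entry(discriminant(event)).or_insert(0) += 1;
    }
    counts
}
```

Because `Discriminant<T>` is generic over the enum type, comparing discriminants taken from two different enums is rejected at compile time, which is the first benefit the RFC lists for making the type generic.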
From 95579ffe36d9bfcecd814b8cd58811c4be84bf56 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Wed, 3 Aug 2016 16:34:12 -0400 Subject: [PATCH 1057/1195] updates from discussion --- text/0000-docs-team.md | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/text/0000-docs-team.md b/text/0000-docs-team.md index 3372d57b866..0909cbc47c9 100644 --- a/text/0000-docs-team.md +++ b/text/0000-docs-team.md @@ -79,11 +79,13 @@ member of the team should pursue these goals: * Contributing documentation patches to Rust itself * Attending doc team meetings, which are open to all -* generally being available on IRC to collaborate with others +* generally being available on [IRC][^IRC] to collaborate with others I am not quantifying this exactly because it's not about reaching some specific number; adding someone to the team should make sense if someone is doing all of these things. +[^IRC]: The #rust-docs channel on irc.mozilla.org + # Drawbacks [drawbacks]: #drawbacks @@ -100,8 +102,4 @@ of the tools team, rather than its own new team. Or some other subteam. # Unresolved questions [unresolved]: #unresolved-questions -Should we quantify the number of commits before you can get on the team? Or the amount of time? - -Rustdoc is a tool that's owned by the tools team, but the doc team makes -significant contributions to it. Is it part of the tools team or the docs team? -Maybe a backend-frontend split? Or maybe it just stays with tools? +None. From d9626d9e04b0d528e4220a7cddb92407f17296c7 Mon Sep 17 00:00:00 2001 From: Alexander Altman Date: Thu, 4 Aug 2016 15:29:52 -0700 Subject: [PATCH 1058/1195] Remove RFCs whose tracking issues are closed --- README.md | 29 ----------------------------- 1 file changed, 29 deletions(-) diff --git a/README.md b/README.md index 2a9c2ad4da8..8d064f5d6d1 100644 --- a/README.md +++ b/README.md @@ -21,49 +21,20 @@ the direction the language is evolving in. 
* [0016-more-attributes.md](text/0016-more-attributes.md) * [0019-opt-in-builtin-traits.md](text/0019-opt-in-builtin-traits.md) * [0066-better-temporary-lifetimes.md](text/0066-better-temporary-lifetimes.md) -* [0090-lexical-syntax-simplification.md](text/0090-lexical-syntax-simplification.md) * [0107-pattern-guards-with-bind-by-move.md](text/0107-pattern-guards-with-bind-by-move.md) -* [0132-ufcs.md](text/0132-ufcs.md) * [0135-where.md](text/0135-where.md) -* [0141-lifetime-elision.md](text/0141-lifetime-elision.md) -* [0195-associated-items.md](text/0195-associated-items.md) * [0213-defaulted-type-params.md](text/0213-defaulted-type-params.md) -* [0218-empty-struct-with-braces.md](text/0218-empty-struct-with-braces.md) -* [0320-nonzeroing-dynamic-drop.md](text/0320-nonzeroing-dynamic-drop.md) -* [0339-statically-sized-literals.md](text/0339-statically-sized-literals.md) -* [0385-module-system-cleanup.md](text/0385-module-system-cleanup.md) * [0401-coercions.md](text/0401-coercions.md) -* [0447-no-unused-impl-parameters.md](text/0447-no-unused-impl-parameters.md) * [0495-array-pattern-changes.md](text/0495-array-pattern-changes.md) * [0501-consistent_no_prelude_attributes.md](text/0501-consistent_no_prelude_attributes.md) -* [0509-collections-reform-part-2.md](text/0509-collections-reform-part-2.md) -* [0517-io-os-reform.md](text/0517-io-os-reform.md) -* [0560-integer-overflow.md](text/0560-integer-overflow.md) * [0639-discriminant-intrinsic.md](text/0639-discriminant-intrinsic.md) -* [0769-sound-generic-drop.md](text/0769-sound-generic-drop.md) -* [0771-std-iter-once.md](text/0771-std-iter-once.md) * [0803-type-ascription.md](text/0803-type-ascription.md) * [0809-box-and-in-for-stdlib.md](text/0809-box-and-in-for-stdlib.md) * [0873-type-macros.md](text/0873-type-macros.md) -* [0888-compiler-fence-intrinsics.md](text/0888-compiler-fence-intrinsics.md) -* [0909-move-thread-local-to-std-thread.md](text/0909-move-thread-local-to-std-thread.md) * 
[0911-const-fn.md](text/0911-const-fn.md) -* [0968-closure-return-type-syntax.md](text/0968-closure-return-type-syntax.md) -* [0980-read-exact.md](text/0980-read-exact.md) * [0982-dst-coercion.md](text/0982-dst-coercion.md) -* [0979-align-splitn-with-other-languages.md](text/0979-align-splitn-with-other-languages.md) -* [1011-process.exit.md](text/1011-process.exit.md) -* [1023-rebalancing-coherence.md](text/1023-rebalancing-coherence.md) -* [1040-duration-reform.md](text/1040-duration-reform.md) -* [1044-io-fs-2.1.md](text/1044-io-fs-2.1.md) -* [1066-safe-mem-forget.md](text/1066-safe-mem-forget.md) -* [1096-remove-static-assert.md](text/1096-remove-static-assert.md) -* [1122-language-semver.md](text/1122-language-semver.md) * [1131-likely-intrinsic.md](text/1131-likely-intrinsic.md) -* [1156-adjust-default-object-bounds.md](text/1156-adjust-default-object-bounds.md) -* [1184-stabilize-no_std.md](text/1184-stabilize-no_std.md) * [1214-projections-lifetimes-and-wf.md](text/1214-projections-lifetimes-and-wf.md) -* [1219-use-group-as.md](text/1219-use-group-as.md) * [1228-placement-left-arrow.md](text/1228-placement-left-arrow.md) * [1260-main-reexport.md](text/1260-main-reexport.md) From 20ffd8c2b8833052a4ee7794ef3b14d3295fa7c4 Mon Sep 17 00:00:00 2001 From: Andrew Gallant Date: Thu, 4 Aug 2016 23:38:58 -0400 Subject: [PATCH 1059/1195] Add Match type and drop some iterators. --- text/0000-regex-1.0.md | 53 ++++++++++++++---------------------------- 1 file changed, 18 insertions(+), 35 deletions(-) diff --git a/text/0000-regex-1.0.md b/text/0000-regex-1.0.md index e2e102d5390..d5cc255f509 100644 --- a/text/0000-regex-1.0.md +++ b/text/0000-regex-1.0.md @@ -173,7 +173,7 @@ impl Regex { /// /// The leftmost-first match is defined as the first match that is found /// by a backtracking search. 
- pub fn find(&self, text: &str) -> Option<(usize, usize)>; + pub fn find<'t>(&self, text: &'t str) -> Option<Match<'t>>; /// Returns an iterator of successive non-overlapping matches of this regex /// in the text given. @@ -366,37 +366,14 @@ lifetime of `Captures` is not tied to the lifetime of a `Regex`. ```rust impl<'t> Captures<'t> { - /// Returns the start and end location of the ith capturing group. - /// - /// If group i did not participate in the match, then None is returned. - pub fn pos(&self, i: usize) -> Option<(usize, usize)>; + /// Returns the match associated with the capture group at index `i`. If + /// `i` does not correspond to a capture group, or if the capture group + /// did not participate in the match, then `None` is returned. + pub fn get(&self, i: usize) -> Option<Match<'t>>; - /// Returns the text matched by the ith capturing group. - /// - /// If group i did not participate in the match, then None is returned. - pub fn at(&self, i: usize) -> Option<&'t str>; - - /// Returns the text matched by the named capturing group. - /// - /// If the named group did not participate in the match, then None is - /// returned. - pub fn name(&self, name: &str) -> Option<&'t str>; - - /// Returns an iterator for all text matched by each of the capturing groups - /// in order of appearance in the pattern. If a capturing group did not - /// participate in a match, then None is yielded in its place. - pub fn iter<'c>(&'c self) -> SubCapturesIter<'c, 't>; - - /// Returns an iterator for all match locations by each of the capturing - /// groups in order of appearance in the pattern. If a capturing group did - /// not participate in a match, then None is yielded in its place. - pub fn iter_pos(&self) -> SubCapturesPosIter; - - /// Returns an iterator of tuples, where each tuple is the name of the - /// capturing group (if it exists) and the matched text. If a capturing group - /// did not participate in a match, then the second element of the tuple is - /// None.
- pub fn iter_named<'c>(&'c self) -> SubCapturesNamedIter<'c, 't>; + /// Returns the match for the capture group named `name`. If `name` isn't a + /// valid capture group or didn't match anything, then `None` is returned. + pub fn name(&self, name: &str) -> Option<Match<'t>>; /// Returns the number of captured groups. This is always at least 1, since /// the first unnamed capturing group corresponding to the entire match @@ -464,12 +441,12 @@ Along with this trait, there is also a helper type, `NoExpand` that implements pub struct NoExpand<'t>(pub &'t str); impl<'t> Replacer for NoExpand<'t> { - fn reg_replace(&mut self, _: &Captures) -> Cow<str> { - self.0.into() + fn replace_append(&mut self, _: &Captures, dst: &mut String) { + dst.push_str(self.0); } - fn no_expand(&mut self) -> Option<Cow<str>> { - Some(self.0.into()) + fn no_expansion<'r>(&'r mut self) -> Option<Cow<'r, str>> { + Some(Cow::Borrowed(self.0)) } } ``` @@ -952,6 +929,12 @@ good deal of work. This section of the RFC lists all breaking changes between `regex 0.1` and the API proposed in this RFC. +* `find` and `find_iter` now return values of type `Match` instead of + `(usize, usize)`. The `Match` type has `start` and `end` methods which can + be used to recover the original offsets, as well as an `as_str` method to + get the matched text. +* The `Captures` type no longer has any iterators defined. Instead, callers + should use the `Regex::capture_names` method. * `bytes::Regex` enables the Unicode flag by default. Previously, it disabled it by default. The flag can be disabled in the pattern with `(?-u)`. * The definition of the `Replacer` trait was completely re-worked. Namely, its From cf5513e6c3dd59b7f3f3237f5065a3015e52e4c6 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Tue, 9 Aug 2016 11:18:41 -0400 Subject: [PATCH 1060/1195] Add outline, clarify document structure.
--- text/0000-document_all_features.md | 54 ++++++++++++++++++++++++++---- 1 file changed, 48 insertions(+), 6 deletions(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index 8e0358d2864..d8ad6bdcc52 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -9,6 +9,33 @@ One of the major goals of Rust's development process is *stability without stagnation*. That means we add features regularly. However, it can be difficult to *use* those features if they are not publicly documented anywhere. Therefore, this RFC proposes requiring that all new language features and public standard library items must be documented before landing on the stable release branch (item documentation for the standard library; in the language reference for language features). + +## Outline +[outline]: #outline + +- [Summary](#summary) + - [Outline](#outline) +- [Motivation](#motivation) + - [The Current Situation](#the-current-situation) + - [Precedent](#precedent) +- [Detailed design](#detailed-design) + - [New RFC section: “How do we teach + this?”](#new-rfc-section-how-do-we-teach-this) + - [New requirement to document changes before + stabilizing](#new-requirement-to-document-changes-before-stabilizing) + - [Language features](#language-features) + - [Standard library](#standard-library) + - [Add an “Edit” link (optional)](#add-an-edit-link-optional) + - [Support with infrastructure + change](#support-with-infrastructure-change) + - [Visually Distinguish + Nightly (optional)](#visually-distinguish-nightly-optional) +- [How do we teach this?](#how-do-we-teach-this) +- [Drawbacks](#drawbacks) +- [Alternatives](#alternatives) +- [Unresolved questions](#unresolved-questions) + + # Motivation [motivation]: #motivation @@ -86,6 +113,12 @@ The basic process of developing new language features will remain largely the sa - a new requirement that the changes themselves be properly documented before being merged to stable 
+Additionally, we might make some content-level/infrastructural changes: + +- add an "edit" link to the documentation pages +- visually distinguish nightly vs. stable build docs + + ## New RFC section: "How do we teach this?" [new-rfc-section]: #new-rfc-section-how-do-we-teach-this @@ -106,14 +139,22 @@ For a great example of this in practice, see the (currently open) [Ember RFC: Mo [Ember RFC: Module Unification]: https://github.com/dgeb/rfcs/blob/module-unification/text/0000-module-unification.md#how-we-teach-this -## Review before stabilization +## New requirement to document changes before stabilizing -Changes will now be reviewed for changes to the documentation prior to being merged. +Changes will now be reviewed for changes to the documentation prior to being merged. This will proceed in the following places: + +- language features: + - in the reference + - in _The Rust Programming Language_ + - in _Rust by Example_ +- the standard library: in the `std` API docs ### Language features [language-features]: #language-features -In the case of language features, this will be a manual process, involving updates to the `reference.md` file. (It may at some point be sensible to break up the Reference file for easier maintenance; that is left aside as orthogonal to this discussion.) +We will document *all* language features in the Rust Reference, as well as making some updates to _The Rust Programming Language_ and _Rust by Example_ as necessary. + +This will necessarily be a manual process, involving updates to the `reference.md` file. (It may at some point be sensible to break up the Reference file for easier maintenance; that is left aside as orthogonal to this discussion.) Note that the feature documentation does not need to be written by the feature author. In fact, this is one of the areas where the community may be most able to support core developers even if not themselves programming language theorists or compiler hackers. 
This may free up the compiler developers' time. It will also help communicate the features in a way that is accessible to ordinary Rust users. @@ -128,6 +169,7 @@ When the core team discusses whether to stabilize a feature in a given release, Given the current state of the reference, this may need to proceed in two steps: #### The current state of the reference. +[refstate]: #the-current-state-of-the-reference Since the reference is currently fairly out of date in a number of areas, it may be worth creating a "strike team" to invest a couple months working on the reference: updating it, organizing it, and improving its presentation. (A single web page with *all* of this content is difficult to navigate at best.) This can proceed in parallel with the documentation of new features. It is probably a necessity for this proposal to be particularly effective in the long term. @@ -145,7 +187,7 @@ Updating the reference could proceed stepwise: In the case of the standard library, this could conceivably be managed by setting the `#[forbid(missing_docs)]` attribute on the library roots. In lieu of that, manual code review and general discipline should continue to serve. However, if automated tools *can* be employed here, they should. -## Add an "Edit" link +## Add an "Edit" link (optional) [edit-link]: #add-an-edit-link To support its own change, the Ember team added an "edit this" icon to the top of every page in the guides (and plans to do so for the API documentation, pending infrastructure changes to support that). Each of _The Rust Programming Language_, _Rust by Example_, and the Rust Reference should do the same. @@ -156,12 +198,12 @@ Making a similar change has some downsides (see below under [**Drawbacks**][draw 2. It sends a quiet but real signal that the docs are up for editing. This makes it likelier that people will edit them! 
-### Optional: Support with infrastructure change +### Support with infrastructure change [edit-link-infrastructure]: #optional-support-with-infrastructure-change The links to edit the documentation could track against the release branch instead of against `master`. (Fixes to documentation would be analogous to bugfix releases in this sense.) Targeting the pull-request automatically would be straightforward. However, see below under [**Drawbacks**][drawbacks]. -## Optional: Visually Distinguish Nightly +## Visually Distinguish Nightly (optional) [distinguish-nightly]: #optional-visually-distinguish-nightly It might be useful to visually distinguish the documentation for nightly Rust as being unstable and subject to change, even simply by setting a different default theme on _The Rust Programming Language_ book for nightly Rust. From 86ac17c0c842b2c6c6c084d934a04c2aab6f28f4 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Wed, 10 Aug 2016 18:14:22 -0400 Subject: [PATCH 1061/1195] RFC 1683 is "Propose the docs team" --- text/{0000-docs-team.md => 1683-docs-team.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-docs-team.md => 1683-docs-team.md} (98%) diff --git a/text/0000-docs-team.md b/text/1683-docs-team.md similarity index 98% rename from text/0000-docs-team.md rename to text/1683-docs-team.md index 0909cbc47c9..72c9a0b7256 100644 --- a/text/0000-docs-team.md +++ b/text/1683-docs-team.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2016-07-21 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1683 +- Rust Issue: N/A # Summary [summary]: #summary From 805c453a4562986d7c44d0e4fe87af331aa5754a Mon Sep 17 00:00:00 2001 From: Alexander Ronald Altman Date: Sun, 7 Aug 2016 21:08:10 -0700 Subject: [PATCH 1062/1195] Add current RFCs to README list --- README.md | 44 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) diff --git a/README.md 
b/README.md index 8d064f5d6d1..56d1925528b 100644 --- a/README.md +++ b/README.md @@ -15,6 +15,7 @@ consistent and controlled path for new features to enter the language and standard libraries, so that all stakeholders can be confident about the direction the language is evolving in. + ## Active RFC List [Active RFC List]: #active-rfc-list @@ -24,6 +25,7 @@ the direction the language is evolving in. * [0107-pattern-guards-with-bind-by-move.md](text/0107-pattern-guards-with-bind-by-move.md) * [0135-where.md](text/0135-where.md) * [0213-defaulted-type-params.md](text/0213-defaulted-type-params.md) +* [0243-trait-based-exception-handling.md](text/0243-trait-based-exception-handling.md) * [0401-coercions.md](text/0401-coercions.md) * [0495-array-pattern-changes.md](text/0495-array-pattern-changes.md) * [0501-consistent_no_prelude_attributes.md](text/0501-consistent_no_prelude_attributes.md) @@ -34,9 +36,51 @@ the direction the language is evolving in. * [0911-const-fn.md](text/0911-const-fn.md) * [0982-dst-coercion.md](text/0982-dst-coercion.md) * [1131-likely-intrinsic.md](text/1131-likely-intrinsic.md) +* [1183-swap-out-jemalloc.md](text/1183-swap-out-jemalloc.md) +* [1192-inclusive-ranges.md](text/1192-inclusive-ranges.md) +* [1199-simd-infrastructure.md](text/1199-simd-infrastructure.md) +* [1201-naked-fns.md](text/1201-naked-fns.md) +* [1210-impl-specialization.md](text/1210-impl-specialization.md) +* [1211-mir.md](text/1211-mir.md) * [1214-projections-lifetimes-and-wf.md](text/1214-projections-lifetimes-and-wf.md) +* [1216-bang-type.md](text/1216-bang-type.md) * [1228-placement-left-arrow.md](text/1228-placement-left-arrow.md) +* [1229-compile-time-asserts.md](text/1229-compile-time-asserts.md) +* [1238-nonparametric-dropck.md](text/1238-nonparametric-dropck.md) +* [1240-repr-packed-unsafe-ref.md](text/1240-repr-packed-unsafe-ref.md) * [1260-main-reexport.md](text/1260-main-reexport.md) +* 
[1268-allow-overlapping-impls-on-marker-traits.md](text/1268-allow-overlapping-impls-on-marker-traits.md) +* [1298-incremental-compilation.md](text/1298-incremental-compilation.md) +* [1317-ide.md](text/1317-ide.md) +* [1327-dropck-param-eyepatch.md](text/1327-dropck-param-eyepatch.md) +* [1331-grammar-is-canonical.md](text/1331-grammar-is-canonical.md) +* [1358-repr-align.md](text/1358-repr-align.md) +* [1359-process-ext-unix.md](text/1359-process-ext-unix.md) +* [1398-kinds-of-allocators.md](text/1398-kinds-of-allocators.md) +* [1399-repr-pack.md](text/1399-repr-pack.md) +* [1422-pub-restricted.md](text/1422-pub-restricted.md) +* [1432-replace-slice.md](text/1432-replace-slice.md) +* [1434-contains-method-for-ranges.md](text/1434-contains-method-for-ranges.md) +* [1440-drop-types-in-const.md](text/1440-drop-types-in-const.md) +* [1444-union.md](text/1444-union.md) +* [1445-restrict-constants-in-patterns.md](text/1445-restrict-constants-in-patterns.md) +* [1492-dotdot-in-patterns.md](text/1492-dotdot-in-patterns.md) +* [1498-ipv6addr-octets.md](text/1498-ipv6addr-octets.md) +* [1504-int128.md](text/1504-int128.md) +* [1513-less-unwinding.md](text/1513-less-unwinding.md) +* [1522-conservative-impl-trait.md](text/1522-conservative-impl-trait.md) +* [1535-stable-overflow-checks.md](text/1535-stable-overflow-checks.md) +* [1542-try-from.md](text/1542-try-from.md) +* [1543-integer_atomics.md](text/1543-integer_atomics.md) +* [1548-global-asm.md](text/1548-global-asm.md) +* [1552-contains-method-for-various-collections.md](text/1552-contains-method-for-various-collections.md) +* [1559-attributes-with-literals.md](text/1559-attributes-with-literals.md) +* [1560-name-resolution.md](text/1560-name-resolution.md) +* [1590-macro-lifetimes.md](text/1590-macro-lifetimes.md) +* [1644-default-and-expanded-rustc-errors.md](text/1644-default-and-expanded-rustc-errors.md) +* [1653-assert_ne.md](text/1653-assert_ne.md) +* [1660-try-borrow.md](text/1660-try-borrow.md) + ## Table of 
Contents [Table of Contents]: #table-of-contents From 30b58ee19da94ef09a0ddd3eec1aa759af734a04 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 11 Aug 2016 13:50:29 -0700 Subject: [PATCH 1063/1195] RFC 1581 is the FusedIterator marker trait --- text/{0000-fused-iterator.md => 1581-fused-iterator.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-fused-iterator.md => 1581-fused-iterator.md} (97%) diff --git a/text/0000-fused-iterator.md b/text/1581-fused-iterator.md similarity index 97% rename from text/0000-fused-iterator.md rename to text/1581-fused-iterator.md index f1d6093245a..b5cff076a8a 100644 --- a/text/0000-fused-iterator.md +++ b/text/1581-fused-iterator.md @@ -1,7 +1,7 @@ - Feature Name: fused - Start Date: 2016-04-15 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1581](https://github.com/rust-lang/rfcs/pull/1581) +- Rust Issue: [rust-lang/rust#35602](https://github.com/rust-lang/rust/issues/35602) # Summary [summary]: #summary From f088b53b2850b3c73ba76f618c292271a3c26aba Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 11 Aug 2016 13:58:06 -0700 Subject: [PATCH 1064/1195] RFC 1649 is get_mut/into_inner for atomics --- text/{0000-atomic-access.md => 1649-atomic-access.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-atomic-access.md => 1649-atomic-access.md} (95%) diff --git a/text/0000-atomic-access.md b/text/1649-atomic-access.md similarity index 95% rename from text/0000-atomic-access.md rename to text/1649-atomic-access.md index 2085a6d20d4..e946ca0c9ec 100644 --- a/text/0000-atomic-access.md +++ b/text/1649-atomic-access.md @@ -1,7 +1,7 @@ q- Feature Name: atomic_access - Start Date: 2016-06-15 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1649](https://github.com/rust-lang/rfcs/pull/1649) +- Rust Issue: [rust-lang/rust#35603](https://github.com/rust-lang/rust/issues/35603) # Summary 
[summary]: #summary From 39aff7ca15ccece79e60c7008a4eb38cda598c93 Mon Sep 17 00:00:00 2001 From: benaryorg Date: Thu, 11 Aug 2016 23:31:38 +0200 Subject: [PATCH 1065/1195] add: list of methods to be implemented as of current state of discussion Signed-off-by: benaryorg --- text/0000-duration-checked-sub.md | 21 +++++++++++++++------ 1 file changed, 15 insertions(+), 6 deletions(-) diff --git a/text/0000-duration-checked-sub.md b/text/0000-duration-checked-sub.md index 21e55430f81..5edca47239e 100644 --- a/text/0000-duration-checked-sub.md +++ b/text/0000-duration-checked-sub.md @@ -6,8 +6,8 @@ # Summary [summary]: #summary -This RFC adds `checked_sub()` already known from various primitive types to the -`Duration` *struct*. +This RFC adds the `checked_*` methods already known from primitives like +`usize` to `Duration`. # Motivation [motivation]: #motivation @@ -42,6 +42,9 @@ fn render() { } ``` +Of course it is also suitable to not introduce `panic!()`s when adding +`Duration`s. + # Detailed design [design]: #detailed-design @@ -73,21 +76,27 @@ impl Duration { } ``` +The same accounts for all other added methods, namely: + +- `checked_add()` +- `checked_sub()` +- `checked_mul()` +- `checked_div()` + # Drawbacks [drawbacks]: #drawbacks -This proposal adds another `checked_*` method to *libstd*. -One could ask why no `CheckedSub` trait if there is a `Sub` trait. +`None`. # Alternatives [alternatives]: #alternatives The alternatives are simply not doing this and forcing the programmer to code the check on their behalf. +This is not what you want. # Unresolved questions [unresolved]: #unresolved-questions -Should all functions of the form -`(checked|saturating|overflowing|wrapping)_(add|sub|mul|div)` be added? +`None`. 
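As a rough illustration of the `checked_*` methods listed in the patch above, the sketch below uses `Duration::checked_sub` and its siblings as they exist in today's `std::time::Duration`. The helper name `remaining_frame_time` and the 16 ms budget are made up for this example and do not come from the RFC.

```rust
use std::time::Duration;

/// Hypothetical helper: time left in a frame budget, or `None` if the
/// frame already took longer than the budget.
fn remaining_frame_time(budget: Duration, elapsed: Duration) -> Option<Duration> {
    // `checked_sub` returns `None` instead of panicking on underflow.
    budget.checked_sub(elapsed)
}

fn main() {
    let budget = Duration::from_millis(16);

    // A fast frame leaves some time to sleep.
    match remaining_frame_time(budget, Duration::from_millis(10)) {
        Some(rest) => println!("sleep for {:?}", rest),
        None => println!("frame overran, do not sleep"),
    }

    // A slow frame would underflow; the result is `None` rather than a panic.
    assert_eq!(remaining_frame_time(budget, Duration::from_millis(20)), None);

    // The other checked operations follow the same pattern.
    assert_eq!(Duration::new(u64::MAX, 0).checked_add(Duration::from_secs(1)), None);
    assert_eq!(Duration::from_secs(2).checked_mul(3), Some(Duration::from_secs(6)));
    assert_eq!(Duration::from_secs(6).checked_div(0), None);
}
```

Returning `Option<Duration>` leaves the decision about an overrun to the caller, which is the point of the checked variants: the plain `-` operator would panic in the slow-frame case.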
From 467ae3cff8787ec7302047810e00136147d3739c Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 12 Aug 2016 16:35:15 -0400 Subject: [PATCH 1066/1195] RFC 1576 is the "literal" matcher for macros --- ...cros-literal-matcher.md => 1576-macros-literal-matcher.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-macros-literal-matcher.md => 1576-macros-literal-matcher.md} (97%) diff --git a/text/0000-macros-literal-matcher.md b/text/1576-macros-literal-matcher.md similarity index 97% rename from text/0000-macros-literal-matcher.md rename to text/1576-macros-literal-matcher.md index 968ae7ebdc9..7626e2150ba 100644 --- a/text/0000-macros-literal-matcher.md +++ b/text/1576-macros-literal-matcher.md @@ -1,7 +1,7 @@ - Feature Name: macros-literal-match - Start Date: 2016-04-08 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1576 +- Rust Issue: https://github.com/rust-lang/rust/issues/35625 # Summary From 017d661a27e6fa17f68324b7543bf4f1a25ca5a7 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 12 Aug 2016 16:37:26 -0400 Subject: [PATCH 1067/1195] RFC 1506 is "clarified ADT kinds" --- text/{0000-adt-kinds.md => 1506-adt-kinds.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-adt-kinds.md => 1506-adt-kinds.md} (98%) diff --git a/text/0000-adt-kinds.md b/text/1506-adt-kinds.md similarity index 98% rename from text/0000-adt-kinds.md rename to text/1506-adt-kinds.md index 30a4f10678e..8c033ade760 100644 --- a/text/0000-adt-kinds.md +++ b/text/1506-adt-kinds.md @@ -1,7 +1,7 @@ - Feature Name: clarified_adt_kinds - Start Date: 2016-02-07 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1506 +- Rust Issue: https://github.com/rust-lang/rust/issues/35626 # Summary [summary]: #summary From 8a970a36bc72139f585bb181b85c7ef3089843c7 Mon Sep 17 00:00:00 2001 From: Diggory Blake Date: Sat, 13 
Aug 2016 15:58:08 +0100 Subject: [PATCH 1068/1195] Rewrite RFC in favour of `mainCRTStartup` alternative --- text/0000-windows-subsystem.md | 127 +++++++++++++++------------------ 1 file changed, 57 insertions(+), 70 deletions(-) diff --git a/text/0000-windows-subsystem.md b/text/0000-windows-subsystem.md index 778b51fb3b8..78c581db629 100644 --- a/text/0000-windows-subsystem.md +++ b/text/0000-windows-subsystem.md @@ -22,7 +22,10 @@ The `WINDOWS` subsystem is commonly used on windows: desktop applications typically do not want to flash up a console window on startup. Currently, using the `WINDOWS` subsystem from rust is undocumented, and the -process is non-trivial when targeting the MSVC toolchain: +process is non-trivial when targeting the MSVC toolchain. There are a couple of +approaches, each with their own downsides: + +## Define a WinMain symbol A new symbol `pub extern "system" WinMain(...)` with specific argument and return types must be declared, which will become the new entry point for @@ -32,6 +35,13 @@ This is unsafe, and will skip the initialization code in `libstd`. The GNU toolchain will accept either entry point. +## Override the entry point via linker options + +This uses the same method as will be described in this RFC. However, it will +result in build scripts also being compiled for the windows subsystem, which +can cause additional console windows to pop up during compilation, making the +system unusable while a build is in progress. + # Detailed design [design]: #detailed-design @@ -44,52 +54,65 @@ In practice, only two subsystems are very commonly used: `CONSOLE` and `WINDOWS`, and from a user's perspective, they determine whether a console will be automatically created when the program is started. -The solution this RFC proposes is to always export both `main` and `WinMain` -symbols from rust executables compiled for windows. The `WinMain` function -will simply delegate to the `main` function. 
+## New crate attribute + +This RFC proposes two changes to solve this problem. The first is adding a +top-level crate attribute to allow specifying which subsystem to use: + +`#![windows_subsystem = "windows"]` + +Initially, the set of possible values will be `{windows, console}`, but may be +extended in future if desired. -The exact signature is: +The use of this attribute in a non-executable crate will result in a compiler +warning. If compiling for a non-windows target, the attribute will be silently +ignored. + +## Additional linker argument + +For the GNU toolchain, this will be sufficient. However, for the MSVC toolchain, +the linker will be expecting a `WinMain` symbol, which will not exist. + +There is some complexity to the way in which a different entry point is expected +when using the windows subsystem. Firstly, the C-runtime library exports two +symbols designed to be used as an entry point: ``` -pub extern "system" WinMain( - hInstance: HINSTANCE, - hPrevInstance: HINSTANCE, - lpCmdLine: LPSTR, - nCmdShow: i32 -) -> i32; +mainCRTStartup +WinMainCRTStartup ``` -Where `HINSTANCE` is a pointer-sized opaque handle, and `LPSTR` is a C-style -null terminated string. +`LINK.exe` will use the subsystem to determine which of these symbols to use +as the default entry point if not overridden. -All four parameters are either irrelevant or can be obtained easily through -other means: -- `hInstance` - Can be obtained via `GetModuleHandle`. -- `hPrevInstance` - Is always NULL. -- `lpCmdLine` - `libstd` already provides a function to get command line - arguments. -- `nCmdShow` - Can be obtained via `GetStartupInfo`, although it's not actually - needed any more (the OS will automatically hide/show the first window created). +Each one performs some unspecified initialization of the CRT, before calling out +to a symbol defined within the program (`main` or `WinMain` respectively). 
-The end result is that rust programs will "just work" when the subsystem is -overridden via custom linker arguments, and does not require `rustc` to -parse those linker arguments. +The second part of the solution is to pass an additional linker option when +targeting the MSVC toolchain: +`/ENTRY:mainCRTStartup` -A possible future extension would be to add additional command-line options to -`rustc` (and in turn, `Cargo.toml`) to specify the subsystem directly. `rustc` -would automatically translate this into the correct linker arguments for -whichever linker is actually being used. +This will override the entry point to always be `mainCRTStartup`. For +console-subsystem programs this will have no effect, since it was already the +default, but for windows-subsystem programs, it will eliminate the need for +a `WinMain` symbol to be defined. + +This command line option will always be passed to the linker, regardless of the +presence or absence of the `windows_subsystem` crate attribute, except when +the user specifies their own entry point in the linker arguments. This will +require `rustc` to perform some basic parsing of the linker options. # Drawbacks [drawbacks]: #drawbacks -- Additional platform-specific code. +- A new platform-specific crate attribute. - The difficulty of manually calling the rust initialization code is potentially a more general problem, and this only solves a specific (if common) case. -- This is a breaking change for any crates which already export a `WinMain` - symbol. It is likely that only executable crates would export this symbol, - so the knock-on effect on crate dependencies should be non-existent. - - A possible work-around for this is described below. +- The subsystem must be specified earlier than is strictly required: when + compiling C/C++ code only the linker, not the compiler, needs to actually be + aware of the subsystem. +- It is assumed that the initialization performed by the two CRT entry points + is identical. 
This seems to currently be the case, and is unlikely to change + as this technique appears to be used fairly widely. # Alternatives [alternatives]: #alternatives @@ -132,42 +155,6 @@ whichever linker is actually being used. support cross-compiling. If not compiling a binary crate, specifying the option is an error regardless of the target. -- Have `rustc` override the entry point when calling `link.exe`, and tell it to - use `mainCRTStartup` instead of `winMainCRTStartup`. These are the "true" - entry points of windows programs, which first initialize the C runtime - library, and then call `main` or `WinMain` respectively. - - This is the simplest solution, and it will not have any serious backwards - compatibility problems, since rust programs are already required to have a - `main` function, even if `WinMain` has been separately defined. However, it - relies on the two CRT functions to be interchangeable, although this does - *appear* to be the case currently. - -- Export both entry points as described in this RFC, but also add a `subsystem` - function to `libstd` determine which subsystem was used at runtime. - - The `WinMain` function would first set an internal flag, and only then - delegate to the `main` function. - - A function would be added to `std::os::windows`: - - `fn subsystem() -> &'static str` - - This would check the value of the internal flag, and return either `WINDOWS` or - `CONSOLE` depending on which entry point was actually used. - - The `subsystem` function could be used to eg. redirect logging to a file if - the program is being run on the `WINDOWS` subsystem. However, it would return - an incorrect value if the initialization was skipped, such as if used as a - library from an executable written in another language. - -- Export both entry points as described in this RFC, but use the undocumented - MSVC equivalent to weak symbols to avoid breaking existing code. 
- - The parameter `/alternatename:_WinMain@16=_RustWinMain@16` can be used to - export `WinMain` only if it is not also exported elsewhere. This is completely - undocumented, but is mentioned here: (http://stackoverflow.com/a/11529277). - # Unresolved questions [unresolved]: #unresolved-questions From 7de8ab730c2962c78cbaa645e2a7cd3241c4b839 Mon Sep 17 00:00:00 2001 From: Diggory Blake Date: Sat, 13 Aug 2016 18:36:36 +0100 Subject: [PATCH 1069/1195] Keep the bunnies happy --- text/0000-windows-subsystem.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-windows-subsystem.md b/text/0000-windows-subsystem.md index 78c581db629..3de857c72de 100644 --- a/text/0000-windows-subsystem.md +++ b/text/0000-windows-subsystem.md @@ -6,7 +6,7 @@ # Summary [summary]: #summary -Rust programs compiled for windows will always flash up a console window on +Rust programs compiled for windows will always allocate a console window on startup. This behavior is controlled via the `SUBSYSTEM` parameter passed to the linker, and so *can* be overridden with specific compiler flags. However, doing so will bypass the rust-specific initialization code in `libstd`, as when using From d08e636c36367d3be2fecefce5c5a81c9ec3875f Mon Sep 17 00:00:00 2001 From: VC Date: Sun, 14 Aug 2016 00:20:52 -0700 Subject: [PATCH 1070/1195] draft 1 --- text/0000-dllimport.md | 152 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 152 insertions(+) create mode 100644 text/0000-dllimport.md diff --git a/text/0000-dllimport.md b/text/0000-dllimport.md new file mode 100644 index 00000000000..b4128d8ad77 --- /dev/null +++ b/text/0000-dllimport.md @@ -0,0 +1,152 @@ +- Feature Name: dllimport +- Start Date: 2016-08-13 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Make compiler aware of the association between library names adorning `extern` blocks +and symbols defined within the block. 
Add attributes and command line switches that leverage +this association. + +# Motivation +[motivation]: #motivation + +Most of the time a linkage directive is only needed to inform the linker about +what native libraries need to be linked into a program. On some platforms, +however, the compiler needs more detailed knowledge about what's being linked +from where in order to ensure that symbols are wired up correctly. + +On Windows, when a symbol is imported from a dynamic library, the code that accesses +this symbol must be generated differently than for symbols imported from a static library. + +Currently the compiler is not aware of associations between the libraries and symbols +imported from them, so it cannot alter code generation based on library kind. + +# Detailed design +[design]: #detailed-design + +### Library <-> symbol association + +The compiler shall assume that symbols defined within an extern block +are imported from the library mentioned in the `#[link]` attribute adorning the block. + +### Changes to code generation + +On platforms other than Windows the above association will have no effect. +On Windows, however, `#[link(..., kind="dylib")]` shall be presumed to mean linking to a dll, +whereas `#[link(..., kind="static")]` shall mean static linking. In the former case, all symbols +associated with that library will be marked with the LLVM [dllimport][1] storage class. + +[1]: http://llvm.org/docs/LangRef.html#dll-storage-classes + +### Library name and kind variance + +Many native libraries are linked from the command line via `-l`, which is passed +in through Cargo build scripts instead of being written in the source code +itself. As a recap, a native library may change names across platforms or +distributions or it may be linked dynamically in some situations and +statically in others which is why build scripts are leveraged to make these +dynamic decisions.
In order to support this kind of dynamism, the following +modifications are proposed: + +- A new library kind, "abstract". An "abstract" library by itself does not + cause any libraries to be linked. Its purpose is to establish an identifier, + that may be later referred to from the command line flags. +- Extend syntax of the `-l` flag to `-l [KIND=]lib[:NEWNAME]`. The `NEWNAME` + part may be used to override name of a library specified in the source. +- Add new meaning to the `KIND` part: if "lib" is already specified in the source, + this will override its kind with KIND. Note that this override is possible only + for libraries defined in the current crate. + +Example: + +```rust +// mylib.rs +#[link(name = "foo", kind="dylib")] +extern { + // dllimport applied +} + +#[link(name = "bar", kind="static")] +extern { + // dllimport not applied +} + +#[link(name = "baz", kind="abstract")] +extern { + // dllimport not applied, "baz" not linked +} +``` + +``` +rustc mylib.rs -l static=foo # change foo's kind to "static", dllimport will not be applied +rustc mylib.rs -l foo:newfoo # link newfoo instead of foo +rustc mylib.rs -l dylib=baz:quoox # specify baz's kind as "dylib", change link name to quoox. +``` + +### Unbundled static libs (optional) + +It had been pointed out that sometimes one may wish to link to a static system library +(i.e. one that is always available to the linker) without bundling it into .lib's and .rlib's. +For this use case we'll introduce another library "kind", "static-nobundle". +Such libraries would be treated in the same way as "static", minus the bundling. + +# Drawbacks +[drawbacks]: #drawbacks + +For libraries to work robustly on MSVC, the correct `#[link]` annotation will +be required. 
Most cases will "just work" on MSVC due to the compiler strongly +favoring static linkage, but any symbols imported from a dynamic library or +exported as a Rust dynamic library will need to be tagged appropriately to +ensure that they work in all situations. Worse still, the `#[link]` annotations +on an `extern` block are not required on any other platform to work correctly, +meaning that it will be common that these attributes are left off by accident. + + +# Alternatives +[alternatives]: #alternatives + +- Instead of enhancing `#[link]`, a `#[linked_from = "foo"]` annotation could be added. + This has the drawback of not being able to handle native libraries whose + name is unpredictable across platforms in an easy fashion, however. + Additionally, it adds an extra attribute to the compiler that wasn't known + previously. + +- Support a `#[dllimport]` on extern blocks (or individual symbols, or both). + This has the following drawbacks, however: + - This attribute would duplicate the information already provided by + `#[link(kind="...")]`. + - It is not always known whether `#[dllimport]` is needed. Whether a native + library is linked dynamically or statically is not always known in advance + (e.g. that's what a build script decides), so `dllimport` + will need to be guarded by `cfg_attr`. + +- When linking native libraries, the compiler could attempt to locate each + library on the filesystem and probe the contents for what symbol names are + exported from the native library. This list could then be cross-referenced + with all symbols declared in the program locally to understand which symbols + are coming from a dylib and which are being linked statically. Some downsides + of this approach may include: + + - It's unclear whether this will be a performant operation and not cause + undue runtime overhead during compiles. 
+ + - On Windows linking to a DLL involves linking to its "import library", so + it may be difficult to know whether a symbol truly comes from a DLL or + not. + + - Locating libraries on the system may be difficult as the system linker + often has search paths baked in that the compiler does not know about. + +- As was already mentioned, "kind" override can affect codegen of the current crate only. + Thus, overloading the `-l` flag for this purpose may be confusing to developers. + A new codegen flag might be a better fit for this, for example `-C libkind=KIND=LIB`. + +# Unresolved questions +[unresolved]: #unresolved-questions + +- Should un-overridden "abstract" kind cause an error, a warning, or be silently ignored? +- Do we even need "abstract"? Since kind can be overridden, there's no harm in providing a default in the source. +- Should we allow dropping a library specified in the source from linking via `-l lib:` (i.e. "rename to empty")? From 967a8e96041906593af1410c330f0cec7d6da3b3 Mon Sep 17 00:00:00 2001 From: Liigo Zhuang Date: Mon, 15 Aug 2016 09:45:22 +0800 Subject: [PATCH 1071/1195] Rename 1510-rdylib.md to 1510-cdylib.md the file was named by mistake for historic reason. --- text/{1510-rdylib.md => 1510-cdylib.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename text/{1510-rdylib.md => 1510-cdylib.md} (100%) diff --git a/text/1510-rdylib.md b/text/1510-cdylib.md similarity index 100% rename from text/1510-rdylib.md rename to text/1510-cdylib.md From 41852d9f851af2e12be5c04723ef29f6ae60214e Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Mon, 15 Aug 2016 16:48:26 +1200 Subject: [PATCH 1072/1195] Some final clarifying points --- text/0000-macro-naming.md | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/text/0000-macro-naming.md b/text/0000-macro-naming.md index 680b4b5459b..149b9c0d366 100644 --- a/text/0000-macro-naming.md +++ b/text/0000-macro-naming.md @@ -118,6 +118,14 @@ as other items by an `extern crate` item. 
No `#[macro_use]` or `#[macro_export]` annotations are required. +## Shadowing + +Macro names follow the same shadowing rules as other names. For example, an +explicitly declared macro would shadow a glob-imported macro with the same name. +Note that since macros are in a different namespace from types and values, a +macro cannot shadow a type or value or vice versa. + + # Drawbacks [drawbacks]: #drawbacks @@ -167,3 +175,13 @@ for a future RFC. Some day, I hope that procedural macros may be defined in the same crate in which they are used. I leave the details of this for later, however, I don't think this affects the design of naming - it should all Just Work. + +## Applying to existing macros + +This RFC is framed in terms of a new macro system. There are various ways that +some parts of it could be applied to existing macros (`macro_rules!`) to +backwards compatibly make existing macros usable under the new naming system. + +I want to leave this question unanswered for now. Until we get some experience +implementing this feature it is unclear how much this is possible. Once we know +that we can try to decide how much of that is also desirable. 
From a2a6710c12a2465667f253d3c9b1272d39bb621f Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 15 Aug 2016 16:50:43 -0400 Subject: [PATCH 1073/1195] RFC 1607 is "Strike team for rustfmt" --- text/{0000-style-rfcs.md => 1607-style-rfcs.md} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename text/{0000-style-rfcs.md => 1607-style-rfcs.md} (99%) diff --git a/text/0000-style-rfcs.md b/text/1607-style-rfcs.md similarity index 99% rename from text/0000-style-rfcs.md rename to text/1607-style-rfcs.md index d82896ba861..4d83a40a0e0 100644 --- a/text/0000-style-rfcs.md +++ b/text/1607-style-rfcs.md @@ -1,6 +1,6 @@ - Feature Name: N/A - Start Date: 2016-04-21 -- RFC PR: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1607 - Rust Issue: N/A From 8f5c1cd3608d07594762ad9443805d160b2dab8c Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 15 Aug 2016 16:59:08 -0400 Subject: [PATCH 1074/1195] merge RFC 1643, unsafe code guidelines --- ...odel-strike-team.md => 1643-memory-model-strike-team.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-memory-model-strike-team.md => 1643-memory-model-strike-team.md} (99%) diff --git a/text/0000-memory-model-strike-team.md b/text/1643-memory-model-strike-team.md similarity index 99% rename from text/0000-memory-model-strike-team.md rename to text/1643-memory-model-strike-team.md index 236323ab4be..a8c5b1e9103 100644 --- a/text/0000-memory-model-strike-team.md +++ b/text/1643-memory-model-strike-team.md @@ -1,7 +1,7 @@ - Feature Name: N/A -- Start Date: (fill me in with today's date, YYYY-MM-DD) -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- Start Date: 2016-06-07 +- RFC PR: https://github.com/rust-lang/rfcs/pull/1643 +- Rust Issue: N/A # Summary [summary]: #summary From acc572e464b457499a45880053aad352482accdc Mon Sep 17 00:00:00 2001 From: Aidan Hobson Sayers Date: Mon, 15 Aug 2016 23:45:29 +0100 Subject: [PATCH 1075/1195] Fix typo in fn variance --- 
text/0738-variance.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0738-variance.md b/text/0738-variance.md index d6b4f8be216..15e046353e7 100644 --- a/text/0738-variance.md +++ b/text/0738-variance.md @@ -264,7 +264,7 @@ as desired: PhantomData<T> // covariance PhantomData<*mut T> // invariance, but see "unresolved question" PhantomData<Cell<T>> // invariance -PhantomData<fn() -> T> // contravariant +PhantomData<fn(T)> // contravariant ``` Even better, the user doesn't really have to understand the terms From 30221dc3e025eb9f8f84ccacbc9622e3a75dff5e Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 16 Aug 2016 14:17:59 -0700 Subject: [PATCH 1076/1195] RFC 1679 is panic-safe slicing methods --- ...000-panic-safe-slicing.md => 1679-panic-safe-slicing.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-panic-safe-slicing.md => 1679-panic-safe-slicing.md} (95%) diff --git a/text/0000-panic-safe-slicing.md b/text/1679-panic-safe-slicing.md similarity index 95% rename from text/0000-panic-safe-slicing.md rename to text/1679-panic-safe-slicing.md index df26ca4847a..11cbd2c4f1b 100644 --- a/text/0000-panic-safe-slicing.md +++ b/text/1679-panic-safe-slicing.md @@ -1,7 +1,7 @@ -- Feature Name: panic_safe_slicing +- Feature Name: `panic_safe_slicing` - Start Date: 2015-10-16 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1679](https://github.com/rust-lang/rfcs/pull/1679) +- Rust Issue: [rust-lang/rfcs#35729](https://github.com/rust-lang/rust/issues/35729) # Summary From bde7d918cd9ce816a2b25792b88cdb2c71f8e65f Mon Sep 17 00:00:00 2001 From: benaryorg Date: Wed, 17 Aug 2016 00:41:54 +0200 Subject: [PATCH 1077/1195] fix code for method Signed-off-by: benaryorg --- text/0000-duration-checked-sub.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-duration-checked-sub.md b/text/0000-duration-checked-sub.md index 5edca47239e..2ce627b1b9b 100644 --- 
a/text/0000-duration-checked-sub.md +++ b/text/0000-duration-checked-sub.md @@ -54,7 +54,7 @@ underlying primitive types: ```rust impl Duration { - fn checked_sub(self, rhs: Duration) -> Duration { + fn checked_sub(self, rhs: Duration) -> Option<Duration> { if let Some(mut secs) = self.secs.checked_sub(rhs.secs) { let nanos = if self.nanos >= rhs.nanos { self.nanos - rhs.nanos @@ -67,7 +67,7 @@ impl Duration { } }; debug_assert!(nanos < NANOS_PER_SEC); - Duration { secs: secs, nanos: nanos } + Some(Duration { secs: secs, nanos: nanos }) } else { None From 27fcfc030bbbc9bc315e355d74437e1c99a22e25 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Wed, 17 Aug 2016 19:27:52 -0700 Subject: [PATCH 1078/1195] RFC 1640 is checked methods on Duration --- ...duration-checked-sub.md => 1640-duration-checked-sub.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-duration-checked-sub.md => 1640-duration-checked-sub.md} (92%) diff --git a/text/0000-duration-checked-sub.md b/text/1640-duration-checked-sub.md similarity index 92% rename from text/0000-duration-checked-sub.md rename to text/1640-duration-checked-sub.md index 2ce627b1b9b..362ff2b1382 100644 --- a/text/0000-duration-checked-sub.md +++ b/text/1640-duration-checked-sub.md @@ -1,7 +1,7 @@ -- Feature Name: duration_checked_sub +- Feature Name: `duration_checked` - Start Date: 2016-06-04 -- RFC PR: -- Rust Issue: +- RFC PR: [rust-lang/rfcs#1640](https://github.com/rust-lang/rfcs/pull/1640) +- Rust Issue: [rust-lang/rust#35774](https://github.com/rust-lang/rust/issues/35774) # Summary [summary]: #summary From 19da8c442a9bce30f151a594c5a543a8df1f571e Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Thu, 18 Aug 2016 16:41:34 -0400 Subject: [PATCH 1079/1195] merge RFC 1589: rustc bug fix procedure --- ...c-bug-fix-procedure.md => 1589-rustc-bug-fix-procedure.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-rustc-bug-fix-procedure.md => 1589-rustc-bug-fix-procedure.md} (99%) 
diff --git a/text/0000-rustc-bug-fix-procedure.md b/text/1589-rustc-bug-fix-procedure.md similarity index 99% rename from text/0000-rustc-bug-fix-procedure.md rename to text/1589-rustc-bug-fix-procedure.md index d709f58d740..9ba3ce8898f 100644 --- a/text/0000-rustc-bug-fix-procedure.md +++ b/text/1589-rustc-bug-fix-procedure.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2016-04-22 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1589 +- Rust Issue: N/A # Summary [summary]: #summary From fa8d2f6b29cb5e128199921907ebe162b2994aeb Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 18 Aug 2016 15:29:14 -0700 Subject: [PATCH 1080/1195] RFC: Enable selecting how the C runtime is linked Enable the compiler to select whether a target dynamically or statically links to a platform's standard C runtime through the introduction of three orthogonal and otherwise general purpose features, one of which would likely never become stable. --- text/0000-crt-link.md | 268 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 268 insertions(+) create mode 100644 text/0000-crt-link.md diff --git a/text/0000-crt-link.md b/text/0000-crt-link.md new file mode 100644 index 00000000000..5ca135d2ad3 --- /dev/null +++ b/text/0000-crt-link.md @@ -0,0 +1,268 @@ +- Feature Name: `crt_link` +- Start Date: 2016-08-18 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Enable the compiler to select whether a target dynamically or statically links +to a platform's standard C runtime through the introduction of three orthogonal +and otherwise general purpose features, one of which would likely never become +stable. + +# Motivation +[motivation]: #motivation + +Today all targets of rustc hard-code how they link to the native C runtime. 
For +example the `x86_64-unknown-linux-gnu` target links to glibc dynamically, +`x86_64-unknown-linux-musl` links statically to musl, and +`x86_64-pc-windows-msvc` links dynamically to MSVCRT. There are many use cases, +however, where these decisions are not suitable. For example binaries on Alpine +Linux want to link dynamically to musl and redistributable binaries on Windows +are best done by linking statically to MSVCRT. + +The actual underlying code essentially never needs to change depending on how +the C runtime is being linked, just the mechanics of how it's actually all +linked together. As a result it's a common desire to take the target libraries +"off the shelf" and change how the C runtime is linked in as well. + +The purpose of this RFC is to provide a cross-platform solution spanning both +Cargo and the compiler which allows configuration of how the C runtime is +linked. The idea is that the standard MSVC and musl targets can be used as they +are today with an extra compiler flag to change how the C runtime is linked by +default. + +This RFC does *not* propose unifying how the C runtime is linked across +platforms (e.g. always dynamically or always statically) but instead leaves that +decision to each target. + +# Detailed design +[design]: #detailed-design + +This RFC proposed introducing three separate features to the compiler and Cargo. +When combined together they will enable the compiler to change whether the C +standard library is linked dynamically or statically, but in isolation each +should be useful in its own right. + +### A `crt_link` cfg directive + +The compiler will first define a new `crt_link` `#[cfg]` directive. This +directive will behave similarly to directives like `target_os` where they're +defined by the compiler for all targets. The compiler will set this value to +either `"dynamic"` or `"static"` depending on how the C runtime is requested to +being linked. 
+ +For example, crates can then indicate: + +```rust +#[cfg_attr(crt_link = "static", link(name = "c", kind = "static"))] +#[cfg_attr(crt_link = "dynamic", link(name = "c"))] +extern { + // ... +} +``` + +This will notably be used in the `libc` crate where the linkage to the C +runtime is defined. + +Finally, the compiler will *also* allow defining this attribute from the +command line. For example: + +``` +rustc --cfg 'crt_link = "static"' foo.rs +``` + +This will override the compiler's default definition of `crt_link` and use this +one instead. Again, though, the only valid values for this directive are +`"static"` and `"dynamic"`. + +In isolation, however, this directive is not too useful, It would still require +rebuilding the `libc` crate (which the standard library links to) if the +linkage to the C runtime needs to change. This is where the two other features +this RFC proposes come into play though! + +### Forwarding `#[cfg]` to build scripts + +The first feature proposed is enabling Cargo to forward `#[cfg]` directives from +the compiler into build scripts. Currently the compiler supports `--print cfg` +as a flag to print out internal cfg directives, which Cargo currently uses to +implement platform-specific dependencies. + +When Cargo runs a build script it already sets a [number of environment +variables][cargo-build-env], and it will now set a family of `CARGO_CFG_*` +environment variables as well. For each key printed out from `rustc --print +cfg`, Cargo will set an environment variable for the build script to learn +about. 
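The mapping described in the paragraph above can be sketched roughly as follows. This is only an illustration of the naming rule implied by this RFC text (uppercase the cfg key, prefix it with `CARGO_CFG_`, strip the quotes from the value); the helper name `cfg_line_to_env` is hypothetical and this is not Cargo's actual implementation.

```rust
/// Hypothetical helper (not part of Cargo): turn one line of
/// `rustc --print cfg` output into the name and optional value of the
/// environment variable a build script would see under this proposal.
fn cfg_line_to_env(line: &str) -> (String, Option<String>) {
    // A line is either `key="value"` or a bare `key` such as `unix`.
    let mut parts = line.trim().splitn(2, '=');
    let key = parts.next().unwrap_or("");
    // Assumption: the variable name is the uppercased key with a fixed prefix.
    let name = format!("CARGO_CFG_{}", key.to_uppercase());
    // Strip the surrounding double quotes from the value, if there is one.
    let value = parts.next().map(|v| v.trim_matches('"').to_string());
    (name, value)
}
```

A bare key such as `unix` yields a variable with no value, which matches the way the RFC lists `CARGO_CFG_UNIX` without an assignment.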
+ +[cargo-build-env]: http://doc.crates.io/environment-variables.html#environment-variables-cargo-sets-for-build-scripts + +For example, locally `rustc --print cfg` prints: + +``` +target_os="linux" +target_family="unix" +target_arch="x86_64" +target_endian="little" +target_pointer_width="64" +target_env="gnu" +unix +debug_assertions +``` + +And with this Cargo would set the following environment variables for build +script invocations for this target. + +``` +export CARGO_CFG_TARGET_OS=linux +export CARGO_CFG_TARGET_FAMILY=unix +export CARGO_CFG_TARGET_ARCH=x86_64 +export CARGO_CFG_TARGET_ENDIAN=little +export CARGO_CFG_TARGET_POINTER_WIDTH=64 +export CARGO_CFG_TARGET_ENV=gnu +export CARGO_CFG_UNIX +export CARGO_CFG_DEBUG_ASSERTIONS +``` + +As mentioned in the previous section, the linkage of the C standard library +will be a `#[cfg]` directive defined by the compiler, and through this method +build scripts will be able to learn how the C standard library is being linked. +This is crucially important for the MSVC target where code needs to be compiled +differently depending on how the C library is linked. + +This feature ends up having the added benefit of informing build scripts about +selected CPU features as well. For example once the `target_feature` `#[cfg]` +is stabilized build scripts will know whether SSE/AVX/etc are enabled features +for the C code they might be compiling. + +### "Lazy Linking" + +The final feature that will be added to the compiler is the ability to "lazily" +link a native library depending on values of `#[cfg]` at compile time of +downstream crates, not of the crate with the `#[link]` directives. This feature +is never intended to be stabilized, and is instead targeted at being an unstable +implementation detail of the `libc` crate. 
+ +Specifically, the `#[link]` attribute will be extended with a new directive +that it accepts, `cfg(..)`, such as: + +```rust +#[link(name = "foo", cfg(bar))] +``` + +This `cfg` indicates to the compiler that the `#[link]` annotation only applies +if the `bar` directive is matched. The compiler will then use this knowledge +in two ways: + +* When `dllimport` or `dllexport` needs to be applied, it will evaluate the + current compilation's `#[cfg]` directives and see if upstream `#[link]` + directives apply or not. + +* When deciding what native libraries should be linked, the compiler will + evaluate whether they should be linked or not depending on the current + compilation's `#[cfg]` directives and the upstream `#[link]` directives. + +### Customizing linkage to the C runtime + +With the above features, the following changes will be made to enable selecting +the linkage of the C runtime at compile time for downstream crates. + +First, the `libc` crate will be modified to contain blocks along the lines of: + +```rust +cfg_if! { + if #[cfg(target_env = "musl")] { + #[link(name = "c", cfg(crt_link = "static"), kind = "static")] + #[link(name = "c", cfg(crt_link = "dynamic"))] + extern {} + } else if #[cfg(target_env = "msvc")] { + #[link(name = "msvcrt", cfg(crt_link = "dynamic"))] + #[link(name = "libcmt", cfg(crt_link = "static"))] + extern {} + } else { + // ... + } +} +``` + +This informs the compiler that for the musl target if the CRT is statically +linked then the library named `c` is included statically in libc.rlib. If the +CRT is linked dynamically, however, then the library named `c` will be linked +dynamically. Similarly for MSVC, a static CRT implies linking to `libcmt` and a +dynamic CRT implies linking to `msvcrt` (as we do today). + +After this change, the gcc-rs crate will be modified to check for the +`CARGO_CFG_CRT_LINK` directive. If it is not present or value is `dynamic`, then +it will compile C code with `/MD`. 
Otherwise if the value is `static` it will +compile code with `/MT`. + +Finally, an example of compiling for MSVC linking statically to the C runtime +would look like: + +``` +RUSTFLAGS='--cfg crt_link="static"' cargo build --target x86_64-pc-windows-msvc +``` + +and similarly, compiling for musl but linking dynamically to the C runtime would +look like: + +``` +RUSTFLAGS='--cfg crt_link="dynamic"' cargo build --target x86_64-unknown-linux-musl +``` + +### Future work + +The features proposed here are intended to be the absolute bare bones of support +needed to configure how the C runtime is linked. A primary drawback, however, is +that it's somewhat cumbersome to select the non-default linkage of the CRT. +Similarly, however, it's cumbersome to select target CPU features which are not +the default, and these two situations are very similar. Eventually it's intended +that there's an ergonomic method for informing the compiler and Cargo of all +"compilation codegen options" over the usage of `RUSTFLAGS` today. It's assumed +that configuration of `crt_link` will be included in this ergonomic +configuration as well. + +Furthermore, it would have arguably been a "more correct" choice for Rust to by +default statically link to the CRT on MSVC rather than dynamically. While this +would be a breaking change today due to how C components are compiled, if this +RFC is implemented it should not be a breaking change to switch the defaults in +the future. + +# Drawbacks +[drawbacks]: #drawbacks + +* Working with `RUSTFLAGS` can be cumbersome, but as explained above it's + planned that eventually there's a much more ergonomic configuration method for + other codegen options like `target-cpu` which would also encompass the linkage + of the CRT. + +* Adding a feature which is intended to never be stable (`#[link(.., cfg(..))]`) + is somewhat unfortunate but allows sidestepping some of the more thorny + questions with how this works. 
The stable *semantics* will be that for some + targets the `--cfg crt_link=...` directive affects the linkage of the CRT, + which seems like a worthy goal regardless. + +# Alternatives +[alternatives]: #alternatives + +* One alternative is to add entirely new targets, for example + `x86_64-pc-windows-msvc-static`. Unfortunately though we don't have a great + naming convention for this, and it also isn't extensible to other codegen + options like `target-cpu`. Additionally, adding a new target is a pretty + heavyweight solution as we'd have to start distributing new artifacts and + such. + +* Another possibility would be to start storing metadata in the "target name" + along the lines of `x86_64-pc-windows-msvc+static`. This is a pretty big + design space, though, which may not play well with Cargo and build scripts, so + for now it's preferred to avoid this rabbit hole of design if possible. + +* Finally, the compiler could simply have an environment variable which + indicates the CRT linkage. This would then be read by the compiler and by + build scripts, and the compiler would have its own back channel for changing + the linkage of the C library along the lines of `#[link(.., cfg(..))]` above. + +# Unresolved questions +[unresolved]: #unresolved-questions + +None, yet. 
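As a rough sketch of the gcc-rs behaviour described under "Customizing linkage to the C runtime" above: a build script could pick the MSVC runtime flag from the `CARGO_CFG_CRT_LINK` variable proposed in this draft. The function names are hypothetical, the variable only exists if this proposal is implemented as written, and treating unrecognised values like `dynamic` is an assumption the RFC does not spell out.

```rust
use std::env;

/// Hypothetical helper: choose the MSVC C runtime flag from the value of the
/// proposed `CARGO_CFG_CRT_LINK` variable. A missing value or `dynamic`
/// selects `/MD`; `static` selects `/MT`.
fn msvc_crt_flag(crt_link: Option<&str>) -> &'static str {
    match crt_link {
        Some("static") => "/MT",
        // Assumption: absent, `dynamic`, and unrecognised values all keep
        // the current default of linking the CRT dynamically.
        _ => "/MD",
    }
}

/// Hypothetical wrapper a build script might call; it reads the variable
/// that Cargo would set for build scripts under this proposal.
fn msvc_crt_flag_from_env() -> &'static str {
    let value = env::var("CARGO_CFG_CRT_LINK").ok();
    msvc_crt_flag(value.as_deref())
}
```

Keeping the decision in a pure function of the variable's value makes it easy to check without a real Cargo invocation.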
From e9285f941393cb9f2ce3a6ccc59ffb9ee152e294 Mon Sep 17 00:00:00 2001 From: Brian Anderson Date: Thu, 18 Aug 2016 17:49:16 -0700 Subject: [PATCH 1081/1195] Revising crt-link --- text/0000-crt-link.md | 187 ++++++++++++++++++++++++++---------------- 1 file changed, 115 insertions(+), 72 deletions(-) diff --git a/text/0000-crt-link.md b/text/0000-crt-link.md index 5ca135d2ad3..f673987dddf 100644 --- a/text/0000-crt-link.md +++ b/text/0000-crt-link.md @@ -8,8 +8,19 @@ Enable the compiler to select whether a target dynamically or statically links to a platform's standard C runtime through the introduction of three orthogonal -and otherwise general purpose features, one of which would likely never become -stable. +and otherwise general purpose features, one of which will likely never become +stable and can be considered an implementation detail of std. These features +require rustc to have no intrinsic knowledge of the existence of C runtimes. + +The end result is that rustc will be able to reuse its existing standard library +binaries for the MSVC and musl targets for building code that links either +statically or dynamically to libc. + +The design herein additionally paves the way for improved support for +dllimport/dllexport and cpu-specific features, particularly when combined with a +[std-aware cargo]. + +[std-aware cargo]: https://github.com/rust-lang/rfcs/pull/1133 # Motivation [motivation]: #motivation @@ -22,67 +33,74 @@ however, where these decisions are not suitable. For example binaries on Alpine Linux want to link dynamically to musl and redistributable binaries on Windows are best done by linking statically to MSVCRT. -The actual underlying code essentially never needs to change depending on how -the C runtime is being linked, just the mechanics of how it's actually all -linked together. As a result it's a common desire to take the target libraries -"off the shelf" and change how the C runtime is linked in as well. 
- -The purpose of this RFC is to provide a cross-platform solution spanning both -Cargo and the compiler which allows configuration of how the C runtime is -linked. The idea is that the standard MSVC and musl targets can be used as they -are today with an extra compiler flag to change how the C runtime is linked by -default. - -This RFC does *not* propose unifying how the C runtime is linked across -platforms (e.g. always dynamically or always statically) but instead leaves that -decision to each target. +Today rustc has no mechanism for accomplishing this besides defining an entirely +new target specification and distributing a build of the standard library for +it. Because target specifications must be described by a target triple, and +target triples have preexisting conventions into which such a scheme does not +fit, we have resisted doing so. # Detailed design [design]: #detailed-design -This RFC proposed introducing three separate features to the compiler and Cargo. -When combined together they will enable the compiler to change whether the C -standard library is linked dynamically or statically, but in isolation each -should be useful in its own right. - -### A `crt_link` cfg directive - -The compiler will first define a new `crt_link` `#[cfg]` directive. This -directive will behave similarly to directives like `target_os` where they're -defined by the compiler for all targets. The compiler will set this value to -either `"dynamic"` or `"static"` depending on how the C runtime is requested to -being linked. +This RFC introduces three separate features to the compiler and Cargo. When +combined together they will enable the compiler to change whether the C standard +library is linked dynamically or statically. In isolation each feature is a +natural extension of existing features and should be useful on its own. 
+ +A key insight is that, for practical purposes, the object code _for the standard +library_ does not need to change based on how the C runtime is being linked; +though it is true that on Windows, it is _generally_ important to properly +manage the use of dllimport/dllexport attributes based on the linkage type, and +C code does need to be compiled with specific options based on the linkage type. +So it is technically possible to produce Rust executables and dynamic libraries +that link to libc either statically or dynamically from a single std binary by +correctly manipulating the arguments to the linker. + +A second insight is that there are multiple existing, unserved use cases for +configuring features of the hardware architecture, underlying platform, or +runtime [1], which require the entire 'world', possibly including std, to be +compiled a certain way. C runtime linkage is another example of this +requirement. + +[1]: https://internals.rust-lang.org/t/pre-rfc-a-vision-for-platform-architecture-configuration-specific-apis/3502 + +From these observations we can design a cross-platform solution spanning both +Cargo and the compiler by which Rust programs may link to either a dynamic or +static C library, using only a single std binary. As future work, this RFC discusses +how the proposed scheme can be extended to rebuild std specifically for a +particular C-linkage scenario, which may have minor advantages on Windows due to +issues around dllimport and dllexport; and how this scheme naturally extends +to recompiling std in the presence of modified CPU features. -For example, crates can then indicate: - -```rust -#[cfg_attr(crt_link = "static", link(name = "c", kind = "static"))] -#[cfg_attr(crt_link = "dynamic", link(name = "c"))] -extern { - // ... -} -``` +This RFC does *not* propose unifying how the C runtime is linked across +platforms (e.g. always dynamically or always statically) but instead leaves that +decision to each target, and to future work. 
-This will notably be used in the `libc` crate where the linkage to the C -runtime is defined. +In summary the new mechanics are: -Finally, the compiler will *also* allow defining this attribute from the -command line. For example: +- Specifying C runtime linkage via `-C target-feature=+crt-static` or `-C + target-feature=-crt-static`. This extends `-C target-feature` to mean not just + "CPU feature" ala LLVM, but "feature of the Rust target". Several existing + properties of this flag, the ability to add, with `+`, _or remove_, with `-`, + the feature, as well as the automatic lowering to `cfg` values, are crucial to + later aspects of the design. This target feature will be added to targets via + a small extension to the compiler's target specification. +- Lowering `cfg` values to Cargo build script environment variables. TODO describe + key points. +- Lazy link attributes. TODO. This feature is only required by std's own copy of the + libc crate, since std is distributed in binary form, and it may yet be a long + time before Cargo itself can rebuild std. -``` -rustc --cfg 'crt_link = "static"' foo.rs -``` +### Specifying dynamic/static C runtime linkage -This will override the compiler's default definition of `crt_link` and use this -one instead. Again, though, the only valid values for this directive are -`"static"` and `"dynamic"`. +`-C target-feature=crt-static` -In isolation, however, this directive is not too useful, It would still require -rebuilding the `libc` crate (which the standard library links to) if the -linkage to the C runtime needs to change. This is where the two other features -this RFC proposes come into play though! +TODO An extension to target specifications that allows custom target-features to be +defined, as well as to indicate whether that feature is on by default. Most +existing targets will define `crt-static`; the existing "musl" targets will +enable `crt-static` by default. 
-### Forwarding `#[cfg]` to build scripts +### Lowering `cfg` values to Cargo build script environment variables The first feature proposed is enabling Cargo to forward `#[cfg]` directives from the compiler into build scripts. Currently the compiler supports `--print cfg` @@ -125,17 +143,40 @@ export CARGO_CFG_DEBUG_ASSERTIONS ``` As mentioned in the previous section, the linkage of the C standard library -will be a `#[cfg]` directive defined by the compiler, and through this method -build scripts will be able to learn how the C standard library is being linked. -This is crucially important for the MSVC target where code needs to be compiled -differently depending on how the C library is linked. +will be specified as a target feature, which is lowered to a `cfg` value. +One important complication here is that `cfg` values in Rust may be defined +multiple times, and this is the case with target features. When a +`cfg` value is defined multiple times, Cargo will create a single environment +variable with a comma-seperated list of values. + +So for a target with the following features enabled + +``` +target_feature="sse" +target_feature="crt-static" +``` + +Cargo would convert it to the following environment variable: + +``` +export CARGO_CFG_TARGET_FEATURE=sse,crt-static +``` + +Through this method build scripts will be able to learn how the C standard +library is being linked. This is crucially important for the MSVC target where +code needs to be compiled differently depending on how the C library is linked. This feature ends up having the added benefit of informing build scripts about selected CPU features as well. For example once the `target_feature` `#[cfg]` is stabilized build scripts will know whether SSE/AVX/etc are enabled features for the C code they might be compiling. -### "Lazy Linking" +After this change, the gcc-rs crate will be modified to check for the +`CARGO_CFG_TARGET_FEATURE` directive, and parse it into a list of enabled +features. 
If the `crt-static` feature is not enabled it will compile C code with +`/MD`. Otherwise if the value is `static` it will compile code with `/MT`. + +### Lazy link attributes The final feature that will be added to the compiler is the ability to "lazily" link a native library depending on values of `#[cfg]` at compile time of @@ -172,12 +213,12 @@ First, the `libc` crate will be modified to contain blocks along the lines of: ```rust cfg_if! { if #[cfg(target_env = "musl")] { - #[link(name = "c", cfg(crt_link = "static"), kind = "static")] - #[link(name = "c", cfg(crt_link = "dynamic"))] + #[link(name = "c", cfg(target_feature = "crt-static"), kind = "static")] + #[link(name = "c", cfg(not(target_feature = "crt-static")))] extern {} } else if #[cfg(target_env = "msvc")] { - #[link(name = "msvcrt", cfg(crt_link = "dynamic"))] - #[link(name = "libcmt", cfg(crt_link = "static"))] + #[link(name = "msvcrt", cfg(not(target_feature = "crt-static")))] + #[link(name = "libcmt", cfg(target_feature = "crt-static"))] extern {} } else { // ... @@ -191,23 +232,18 @@ CRT is linked dynamically, however, then the library named `c` will be linked dynamically. Similarly for MSVC, a static CRT implies linking to `libcmt` and a dynamic CRT implies linking to `msvcrt` (as we do today). -After this change, the gcc-rs crate will be modified to check for the -`CARGO_CFG_CRT_LINK` directive. If it is not present or value is `dynamic`, then -it will compile C code with `/MD`. Otherwise if the value is `static` it will -compile code with `/MT`. 
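The check described for gcc-rs can be sketched roughly as follows. This is only an illustration, assuming the comma-separated `CARGO_CFG_TARGET_FEATURE` format proposed in this RFC; the helper names `parse_target_features` and `msvc_crt_flag` are hypothetical and are not part of gcc-rs.

```rust
use std::env;

/// Split a comma-separated feature list (the proposed
/// `CARGO_CFG_TARGET_FEATURE` format) into individual feature names.
/// Hypothetical helper, not part of gcc-rs.
fn parse_target_features(raw: &str) -> Vec<&str> {
    raw.split(',')
        .map(|feature| feature.trim())
        .filter(|feature| !feature.is_empty())
        .collect()
}

/// Pick the MSVC runtime flag: `/MT` for a static CRT, `/MD` otherwise.
fn msvc_crt_flag(features: &[&str]) -> &'static str {
    if features.contains(&"crt-static") {
        "/MT"
    } else {
        "/MD"
    }
}

fn main() {
    // A missing variable is treated as "no features enabled", which
    // keeps the dynamic-linkage default described in the text.
    let raw = env::var("CARGO_CFG_TARGET_FEATURE").unwrap_or_default();
    let features = parse_target_features(&raw);
    println!("MSVC CRT flag: {}", msvc_crt_flag(&features));
}
```

Treating an absent variable the same as an empty list means a build script written this way would still behave sensibly under a Cargo that does not set the variable.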
- Finally, an example of compiling for MSVC linking statically to the C runtime would look like: ``` -RUSTFLAGS='--cfg crt_link="static"' cargo build --target x86_64-pc-windows-msvc +RUSTFLAGS='-C target-feature=+crt-static' cargo build --target x86_64-pc-windows-msvc ``` and similarly, compiling for musl but linking dynamically to the C runtime would look like: ``` -RUSTFLAGS='--cfg crt_link="dynamic"' cargo build --target x86_64-unknown-linux-musl +RUSTFLAGS='-C target-feature=-crt-static' cargo build --target x86_64-unknown-linux-musl ``` ### Future work @@ -218,9 +254,7 @@ that it's somewhat cumbersome to select the non-default linkage of the CRT. Similarly, however, it's cumbersome to select target CPU features which are not the default, and these two situations are very similar. Eventually it's intended that there's an ergonomic method for informing the compiler and Cargo of all -"compilation codegen options" over the usage of `RUSTFLAGS` today. It's assume -that configuration of `crt_link` will be included in this ergonomic -configuration as well. +"compilation codegen options" over the usage of `RUSTFLAGS` today. Furthermore, it would have arguably been a "more correct" choice for Rust to by default statically link to the CRT on MSVC rather than dynamically. While this @@ -228,6 +262,9 @@ would be a breaking change today due to how C components are compiled, if this RFC is implemented it should not be a breaking change to switch the defaults in the future. +TODO: discuss how this could with std-aware cargo to apply dllimport/export correctly +to the standard library's code-generation. + # Drawbacks [drawbacks]: #drawbacks @@ -265,4 +302,10 @@ the future. # Unresolved questions [unresolved]: #unresolved-questions -None, yet. +* What happens during the `cfg` to environment variable conversion for values + that contain commas? It's an unusual corner case, and build scripts should not + depend on such values, but it needs to be handled sanely. 
+ +* Is it really true that lazy linking is only needed by std's libc? What about + in a world where we distribute more precompiled binaries than just std? + From efd28c6281d2540f5e1de3fb224065ef3fe0f7e1 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 19 Aug 2016 09:32:53 -0700 Subject: [PATCH 1082/1195] Resolve some TODO --- text/0000-crt-link.md | 63 ++++++++++++++++++++++++++++--------------- 1 file changed, 42 insertions(+), 21 deletions(-) diff --git a/text/0000-crt-link.md b/text/0000-crt-link.md index f673987dddf..cc8538af6d1 100644 --- a/text/0000-crt-link.md +++ b/text/0000-crt-link.md @@ -9,8 +9,8 @@ Enable the compiler to select whether a target dynamically or statically links to a platform's standard C runtime through the introduction of three orthogonal and otherwise general purpose features, one of which will likely never become -stable and can be considered an implementation detail of std. These features -require rustc to have no intrinsic knowledge of the existence of C runtimes. +stable and can be considered an implementation detail of std. These features do +not require rustc to have intrinsic knowledge of the existence of C runtimes. The end result is that rustc will be able to reuse its existing standard library binaries for the MSVC and musl targets for building code that links either @@ -85,27 +85,39 @@ In summary the new mechanics are: the feature, as well as the automatic lowering to `cfg` values, are crucial to later aspects of the design. This target feature will be added to targets via a small extension to the compiler's target specification. -- Lowering `cfg` values to Cargo build script environment variables. TODO describe - key points. -- Lazy link attributes. TODO. This feature is only required by std's own copy of the +- Lowering `cfg` values to Cargo build script environment variables. 
This will + enable build scripts to understand all enabled features of a target (like + `crt-static` above) to, for example, compile C code correctly on MSVC. +- Lazy link attributes. This feature is only required by std's own copy of the libc crate, since std is distributed in binary form, and it may yet be a long time before Cargo itself can rebuild std. ### Specifying dynamic/static C runtime linkage -`-C target-feature=crt-static` +A new `target-feature` flag will now be supported by the compiler for relevant +targets: `crt-static`. This can be enabled and disabled in the compiler via: -TODO An extension to target specifications that allows custom target-features to be -defined, as well as to indicate whether that feature is on by default. Most -existing targets will define `crt-static`; the existing "musl" targets will -enable `crt-static` by default. +``` +rustc -C target-feature=+crt-static ... +rustc -C target-feature=-crt-static ... +``` + +Currently all `target-feature` flags are passed through straight to LLVM, but +this proposes extending the meaning of `target-feature` to Rust-target-specific +features as well. Target specifications will be able to indicate what custom +target-features can be defined, and most existing target will define a new +`crt-static` feature which is turned off by default (except for musl). + +The default of `crt-static` will be different depending on the target. For +example `x86_64-unknown-linux-musl` will have it on by default, whereas +`arm-unknown-linux-musleabi` will have it turned off by default. ### Lowering `cfg` values to Cargo build script environment variables -The first feature proposed is enabling Cargo to forward `#[cfg]` directives from -the compiler into build scripts. Currently the compiler supports `--print cfg` -as a flag to print out internal cfg directives, which Cargo currently uses to -implement platform-specific dependencies. 
+Cargo will begin to forward `#[cfg]` directives from the compiler into build +scripts. Currently the compiler supports `--print cfg` as a flag to print out +internal cfg directives, which Cargo currently uses to implement +platform-specific dependencies. When Cargo runs a build script it already sets a [number of environment variables][cargo-build-env], and it will now set a family of `CARGO_CFG_*` @@ -147,7 +159,7 @@ will be specified as a target feature, which is lowered to a `cfg` value. One important complication here is that `cfg` values in Rust may be defined multiple times, and this is the case with target features. When a `cfg` value is defined multiple times, Cargo will create a single environment -variable with a comma-seperated list of values. +variable with a comma-separated list of values. So for a target with the following features enabled @@ -173,8 +185,9 @@ for the C code they might be compiling. After this change, the gcc-rs crate will be modified to check for the `CARGO_CFG_TARGET_FEATURE` directive, and parse it into a list of enabled -features. If the `crt-static` feature is not enabled it will compile C code with -`/MD`. Otherwise if the value is `static` it will compile code with `/MT`. +features. If the `crt-static` feature is not enabled it will compile C code on +the MSVC target with `/MD`. Otherwise if the value is `static` it will compile +code with `/MT`. ### Lazy link attributes @@ -201,7 +214,7 @@ in two ways: * When deciding what native libraries should be linked, the compiler will evaluate whether they should be linked or not depending on the current - compilation's `#[cfg]` directives nad the upstream `#[link]` directives. + compilation's `#[cfg]` directives and the upstream `#[link]` directives. ### Customizing linkage to the C runtime @@ -262,8 +275,16 @@ would be a breaking change today due to how C components are compiled, if this RFC is implemented it should not be a breaking change to switch the defaults in the future. 
-TODO: discuss how this could with std-aware cargo to apply dllimport/export correctly -to the standard library's code-generation. +The support in this RFC implies that the exact artifacts that we're shipping +will be usable for both dynamically and statically linking the CRT. +Unfortunately, however, on MSVC code is compiled differently if it's linking to +a dynamic library or not. The standard library uses very little of the MSVCRT, +so this won't be a problem in practice for now, but runs the risk of binding our +hands in the future. It's intended, though, that Cargo [will eventually support +custom-compiling the standard library][std-aware cargo]. The `crt-static` +feature would simply be another input to this logic, so Cargo would +custom-compile the standard library if it differed from the upstream artifacts, +solving this problem. # Drawbacks [drawbacks]: #drawbacks @@ -289,7 +310,7 @@ to the standard library's code-generation. heavyweight solution as we'd have to start distributing new artifacts and such. -* Another possibility would be to start storing metdata in the "target name" +* Another possibility would be to start storing metadata in the "target name" along the lines of `x86_64-pc-windows-msvc+static`. This is a pretty big design space, though, which may not play well with Cargo and build scripts, so for now it's preferred to avoid this rabbit hole of design if possible. 
From 104b166bb2134b3a65367c94d36424789bc19ae9 Mon Sep 17 00:00:00 2001 From: Brian Anderson Date: Fri, 19 Aug 2016 10:32:24 -0700 Subject: [PATCH 1083/1195] More crt-link edits --- text/0000-crt-link.md | 123 +++++++++++++++++++++++++++--------------- 1 file changed, 81 insertions(+), 42 deletions(-) diff --git a/text/0000-crt-link.md b/text/0000-crt-link.md index cc8538af6d1..387e845b8e5 100644 --- a/text/0000-crt-link.md +++ b/text/0000-crt-link.md @@ -10,15 +10,16 @@ Enable the compiler to select whether a target dynamically or statically links to a platform's standard C runtime through the introduction of three orthogonal and otherwise general purpose features, one of which will likely never become stable and can be considered an implementation detail of std. These features do -not require rustc to have intrinsic knowledge of the existence of C runtimes. +not require the compiler or language to have intrinsic knowledge of the +existence of C runtimes. The end result is that rustc will be able to reuse its existing standard library -binaries for the MSVC and musl targets for building code that links either +binaries for the MSVC and musl targets to build code that links either statically or dynamically to libc. The design herein additionally paves the way for improved support for -dllimport/dllexport and cpu-specific features, particularly when combined with a -[std-aware cargo]. +dllimport/dllexport, and cpu-specific features, particularly when +combined with a [std-aware cargo]. [std-aware cargo]: https://github.com/rust-lang/rfcs/pull/1133 @@ -30,8 +31,8 @@ example the `x86_64-unknown-linux-gnu` target links to glibc dynamically, `x86_64-unknown-linux-musl` links statically to musl, and `x86_64-pc-windows-msvc` links dynamically to MSVCRT. There are many use cases, however, where these decisions are not suitable. 
For example binaries on Alpine -Linux want to link dynamically to musl and redistributable binaries on Windows -are best done by linking statically to MSVCRT. +Linux want to link dynamically to musl and creating portable binaries on Windows +is most easily done by linking statically to MSVCRT. Today rustc has no mechanism for accomplishing this besides defining an entirely new target specification and distributing a build of the standard library for @@ -42,10 +43,10 @@ fit, we have resisted doing so. # Detailed design [design]: #detailed-design -This RFC introduces three separate features to the compiler and Cargo. When -combined together they will enable the compiler to change whether the C standard -library is linked dynamically or statically. In isolation each feature is a -natural extension of existing features, should be useful on their own. +This RFC introduces three separate features to the compiler and Cargo. When +combined they will enable the compiler to change whether the C standard library +is linked dynamically or statically. In isolation each feature is a natural +extension of existing features, and each should be useful on its own. A key insight is that, for practical purposes, the object code _for the standard library_ does not need to change based on how the C runtime is being linked; @@ -66,11 +67,12 @@ requirement. From these observations we can design a cross-platform solution spanning both Cargo and the compiler by which Rust programs may link to either a dynamic or -static C library, using only a single std binary. As future work it discusses -how the proposed scheme scheme can be extended to rebuild std specifically for a -particular C-linkage scenario, which may have minor advantages on Windows due to -issues around dllimport and dllexport; and how this scheme naturally extends -to recompiling std in the presence of modified CPU features. +static C library, using only a single std binary. 
As future work this RFC +discusses how the proposed scheme can be extended to rebuild std +specifically for a particular C-linkage scenario, which may have minor +advantages on Windows due to issues around dllimport and dllexport; and how this +scheme naturally extends to recompiling std in the presence of modified CPU +features. This RFC does *not* propose unifying how the C runtime is linked across platforms (e.g. always dynamically or always statically) but instead leaves that @@ -89,8 +91,8 @@ In summary the new mechanics are: enable build scripts to understand all enabled features of a target (like `crt-static` above) to, for example, compile C code correctly on MSVC. - Lazy link attributes. This feature is only required by std's own copy of the - libc crate, since std is distributed in binary form, and it may yet be a long - time before Cargo itself can rebuild std. + libc crate, and only because std is distributed in binary form and it may yet + be a long time before Cargo itself can rebuild std. ### Specifying dynamic/static C runtime linkage @@ -105,7 +107,7 @@ rustc -C target-feature=-crt-static ... Currently all `target-feature` flags are passed through straight to LLVM, but this proposes extending the meaning of `target-feature` to Rust-target-specific features as well. Target specifications will be able to indicate what custom -target-features can be defined, and most existing target will define a new +target-features can be defined, and most existing targets will define a new `crt-static` feature which is turned off by default (except for musl). The default of `crt-static` will be different depending on the target. For @@ -114,10 +116,10 @@ example `x86_64-unknown-linux-musl` will have it on by default, whereas ### Lowering `cfg` values to Cargo build script environment variables -Cargo will begin to forward `#[cfg]` directives from the compiler into build +Cargo will begin to forward `cfg` values from the compiler into build scripts.
Currently the compiler supports `--print cfg` as a flag to print out -internal cfg directives, which Cargo currently uses to implement -platform-specific dependencies. +internal cfg directives, which Cargo uses to implement platform-specific +dependencies. When Cargo runs a build script it already sets a [number of environment variables][cargo-build-env], and it will now set a family of `CARGO_CFG_*` @@ -154,10 +156,11 @@ export CARGO_CFG_UNIX export CARGO_CFG_DEBUG_ASSERTIONS ``` -As mentioned in the previous section, the linkage of the C standard library -will be specified as a target feature, which is lowered to a `cfg` value. -One important complication here is that `cfg` values in Rust may be defined -multiple times, and this is the case with target features. When a +As mentioned in the previous section, the linkage of the C standard library will +be specified as a target feature, which is lowered to a `cfg` value, thus giving +build scripts the ability to modify compilation options based on C standard +library linkage. One important complication here is that `cfg` values in Rust +may be defined multiple times, and this is the case with target features. When a `cfg` value is defined multiple times, Cargo will create a single environment variable with a comma-separated list of values. @@ -186,18 +189,23 @@ for the C code they might be compiling. After this change, the gcc-rs crate will be modified to check for the `CARGO_CFG_TARGET_FEATURE` directive, and parse it into a list of enabled features. If the `crt-static` feature is not enabled it will compile C code on -the MSVC target with `/MD`. Otherwise if the value is `static` it will compile -code with `/MT`. +the MSVC target with `/MD`, indicating dynamic linkage. Otherwise if the value +is `static` it will compile code with `/MT`, indicating static linkage. 
Because +today the MSVC targets use dynamic linkage and gcc-rs compiles C code with `/MD`, +gcc-rs will remain forward and backwards compatible with existing and future +Rust MSVC toolchains until such time as the decision is made to change the +MSVC toolchain to `+crt-static` by default. ### Lazy link attributes The final feature that will be added to the compiler is the ability to "lazily" -link a native library depending on values of `#[cfg]` at compile time of -downstream crates, not of the crate with the `#[link]` directives. This feature -is never intended to be stabilized, and is instead targeted at being an unstable -implementation detail of the `libc` crate. +interpret the linkage requirements of a native library depending on values of +`cfg` at compile time of downstream crates, not of the crate with the `#[link]` +directives. This feature is never intended to be stabilized, and is instead +targeted at being an unstable implementation detail of the `libc` crate linked +to `std` (but _not_ the stable `libc` crate deployed to crates.io). -Specifically, the `#[link]` attribute will be extended with a new directive +Specifically, the `#[link]` attribute will be extended with a new argument that it accepts, `cfg(..)`, such as: ```rust @@ -205,21 +213,23 @@ that it accepts, `cfg(..)`, such as: ``` This `cfg` indicates to the compiler that the `#[link]` annotation only applies -if the `bar` directive is matched. The compiler will then use this knowledge -in two ways: +if the `bar` directive is matched. This interpretation is done not during +compilation of the crate in which the `#[link]` directive appears, but during +compilation of the crate in which linking is finally performed.
The compiler +will then use this knowledge in two ways: * When `dllimport` or `dllexport` needs to be applied, it will evaluate the - current compilation's `#[cfg]` directives and see if upstream `#[link]` + final compilation unit's `#[cfg]` directives and see if upstream `#[link]` directives apply or not. * When deciding what native libraries should be linked, the compiler will - evaluate whether they should be linked or not depending on the current + evaluate whether they should be linked or not depending on the final compilation's `#[cfg]` directives and the upstream `#[link]` directives. ### Customizing linkage to the C runtime -With the above features, the following changes will be made to enable selecting -the linkage of the C runtime at compile time for downstream crates. +With the above features, the following changes will be made to select the +linkage of the C runtime at compile time for downstream crates. First, the `libc` crate will be modified to contain blocks along the lines of: @@ -239,14 +249,14 @@ cfg_if! { } ``` -This informs the compiler that for the musl target if the CRT is statically +This informs the compiler that, for the musl target, if the CRT is statically linked then the library named `c` is included statically in libc.rlib. If the CRT is linked dynamically, however, then the library named `c` will be linked dynamically. Similarly for MSVC, a static CRT implies linking to `libcmt` and a dynamic CRT implies linking to `msvcrt` (as we do today). -Finally, an example of compiling for MSVC linking statically to the C runtime -would look like: +Finally, an example of compiling for MSVC and linking statically to the C +runtime would look like: ``` RUSTFLAGS='-C target-feature=+crt-static' cargo build --target x86_64-pc-windows-msvc @@ -273,7 +283,7 @@ Furthermore, it would have arguably been a "more correct" choice for Rust to by default statically link to the CRT on MSVC rather than dynamically. 
While this would be a breaking change today due to how C components are compiled, if this RFC is implemented it should not be a breaking change to switch the defaults in -the future. +the future, after a reasonable transition period. The support in this RFC implies that the exact artifacts that we're shipping will be usable for both dynamically and statically linking the CRT. @@ -286,6 +296,21 @@ feature would simply be another input to this logic, so Cargo would custom-compile the standard library if it differed from the upstream artifacts, solving this problem. +### References + +- [Issue about MSVCRT static linking] + (https://github.com/rust-lang/libc/issues/290) +- [Issue about musl dynamic linking] + (https://github.com/rust-lang/rust/issues/34987) +- [Discussion on issues around global codegen configuration] + (https://internals.rust-lang.org/t/pre-rfc-a-vision-for-platform-architecture-configuration-specific-apis/3502) +- [std-aware Cargo RFC] + (https://github.com/rust-lang/libc/issues/290). + A proposal to teach Cargo to build the standard library. Rebuilding of std will + likely in the future be influenced by `-C target-feature`. +- [Cargo's documentation on build-script environment variables] + (https://github.com/rust-lang/libc/issues/290) + # Drawbacks [drawbacks]: #drawbacks @@ -300,6 +325,11 @@ solving this problem. targets the `--cfg crt_link=...` directive affects the linkage of the CRT, which seems like a worthy goal regardless. +* The lazy semantics of `#[link(cfg(..))]` are not so obvious from the name (no + other `cfg` attribute is treated this way). But this seems a minor issue since + the feature serves one implementation-specific purpose and isn't intended for + stabilization. + # Alternatives [alternatives]: #alternatives @@ -320,6 +350,15 @@ solving this problem. build scripts, and the compiler would have its own back channel for changing the linkage of the C library along the lines of `#[link(.., cfg(..))]` above.
+* Another approach has [been proposed recently][rfc-1684] that has + rustc define an environment variable to specify the C runtime kind. + +[rfc-1684]: https://github.com/rust-lang/rfcs/pull/1684 + +* Instead of extending the semantics of `-C target-feature` beyond "CPU + features", we could instead add a new flag for the purpose, e.g. `-C + custom-feature`. + # Unresolved questions [unresolved]: #unresolved-questions From 30f9891c8d77716505979ffe210df34113c0ee19 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Fri, 19 Aug 2016 10:42:25 -0700 Subject: [PATCH 1084/1195] Rename RFC file --- text/{0000-crt-link.md => 0000-crt-static.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename text/{0000-crt-link.md => 0000-crt-static.md} (100%) diff --git a/text/0000-crt-link.md b/text/0000-crt-static.md similarity index 100% rename from text/0000-crt-link.md rename to text/0000-crt-static.md From 5ecde0cabdff3f1992c60063b8d5e8dbf805e05c Mon Sep 17 00:00:00 2001 From: Andrew Gallant Date: Sun, 21 Aug 2016 17:20:29 -0400 Subject: [PATCH 1085/1195] Update builder and iterator types. --- text/0000-regex-1.0.md | 42 +++++++++++++++++++++++------------------- 1 file changed, 23 insertions(+), 19 deletions(-) diff --git a/text/0000-regex-1.0.md b/text/0000-regex-1.0.md index d5cc255f509..4aa33e18b07 100644 --- a/text/0000-regex-1.0.md +++ b/text/0000-regex-1.0.md @@ -177,7 +177,7 @@ impl Regex { /// Returns an iterator of successive non-overlapping matches of this regex /// in the text given. - pub fn find_iter<'r, 't>(&'r self, text: &'t str) -> FindIter<'r, 't>; + pub fn find_iter<'r, 't>(&'r self, text: &'t str) -> Matches<'r, 't>; /// Returns the leftmost-first match of this regex in the text given with /// locations for all capturing groups that participated in the match. @@ -185,7 +185,7 @@ impl Regex { /// Returns an iterator of successive non-overlapping matches with capturing /// group information in the text given. 
- pub fn captures_iter<'r, 't>(&'r self, text: &'t str) -> CapturesIter<'r, 't>; + pub fn captures_iter<'r, 't>(&'r self, text: &'t str) -> CaptureMatches<'r, 't>; } ``` @@ -216,14 +216,14 @@ impl Regex { /// Returns an iterator of substrings of `text` delimited by a match of /// this regular expression. Each element yielded by the iterator corresponds /// to text that *isn't* matched by this regex. - pub fn split<'r, 't>(&'r self, text: &'t str) -> SplitsIter<'r, 't>; + pub fn split<'r, 't>(&'r self, text: &'t str) -> Split<'r, 't>; /// Returns an iterator of at most `limit` substrings of `text` delimited by /// a match of this regular expression. Each element yielded by the iterator /// corresponds to text that *isn't* matched by this regex. The remainder of /// `text` that is not split will be the last element yielded by the /// iterator. - pub fn splitn<'r, 't>(&'r self, text: &'t str, limit: usize) -> SplitsNIter<'r, 't>; + pub fn splitn<'r, 't>(&'r self, text: &'t str, limit: usize) -> SplitN<'r, 't>; } ``` @@ -254,7 +254,7 @@ impl Regex { /// Returns an iterator over all capturing group in the pattern in the order /// they were defined (by position of the leftmost parenthesis). The name of /// the group is yielded if it has a name, otherwise None is yielded. - pub fn capture_names(&self) -> CaptureNamesIter; + pub fn capture_names(&self) -> CaptureNames; /// Returns the total number of capturing groups in the pattern. This /// includes the implicit capturing group corresponding to the entire @@ -313,39 +313,39 @@ impl RegexBuilder { /// /// N.B. `RegexBuilder::new("...").compile()` is equivalent to /// `Regex::new("...")`. - pub fn compile(self) -> Result; + pub fn build(&self) -> Result; /// Set the case insensitive flag (i). - pub fn case_insensitive(self, yes: bool) -> RegexBuilder; + pub fn case_insensitive(&mut self, yes: bool); /// Set the multi line flag (m). 
- pub fn multi_line(self, yes: bool) -> RegexBuilder; + pub fn multi_line(&mut self, yes: bool); /// Set the dot-matches-any-character flag (s). - pub fn dot_matches_new_line(self, yes: bool) -> RegexBuilder; + pub fn dot_matches_new_line(&mut self, yes: bool); /// Set the swap-greedy flag (U). - pub fn swap_greed(self, yes: bool) -> RegexBuilder; + pub fn swap_greed(&mut self, yes: bool); /// Set the ignore whitespace flag (x). - pub fn ignore_whitespace(self, yes: bool) -> RegexBuilder; + pub fn ignore_whitespace(&mut self, yes: bool); /// Set the Unicode flag (u). - pub fn unicode(self, yes: bool) -> RegexBuilder; + pub fn unicode(&mut self, yes: bool); /// Set the approximate size limit (in bytes) of the compiled regular /// expression. /// /// If compiling a pattern would approximately exceed this size, then /// compilation will fail. - pub fn size_limit(self, limit: usize) -> RegexBuilder; + pub fn size_limit(&mut self, limit: usize); /// Set the approximate size limit (in bytes) of the cache used by the DFA. /// /// This is a per thread limit. Once the DFA fills the cache, it will be /// wiped and refilled again. If the cache is wiped too frequently, the /// DFA will quit and fall back to another matching engine. - pub fn dfa_size_limit(self, limit: usize) -> RegexBuilder; + pub fn dfa_size_limit(&mut self, limit: usize); } ``` @@ -402,7 +402,7 @@ and `Index` (for named capture groups). A downside of the `Index` impls is that the return value is bounded to the lifetime of `Captures` instead of the lifetime of the actual text searched because of how the `Index` trait is defined. Callers can work around that limitation if necessary by using an -explicit method such as `at` or `name`. +explicit method such as `get` or `name`. 
## Replacer [replacer]: #replacer @@ -759,8 +759,8 @@ trait Regex { type Text: ?Sized; fn is_match(&self, text: &Self::Text) -> bool; - fn find(&self, text: &Self::Text) -> Option<(usize, usize)>; - fn find_iter<'r, 't>(&'r self, text: &'t Self::Text) -> FindIter<'r, 't, Self::Text>; + fn find(&self, text: &Self::Text) -> Option; + fn find_iter<'r, 't>(&'r self, text: &'t Self::Text) -> Matches<'r, 't, Self::Text>; // and so on } ``` @@ -771,7 +771,7 @@ generic code that searches either a `&str` or a `&[u8]` possible, but the semantics of searching `&str` (always valid UTF-8) or `&[u8]` are quite a bit different with respect to the original `Regex`. Secondly, the trait isn't obviously implementable by others. For example, some of the methods return -iterator types such as `FindIter` that are typically implemented with a +iterator types such as `Matches` that are typically implemented with a lower level API that isn't exposed. This suggests that a straight-forward traitification of the current API probably isn't appropriate, and perhaps, a better trait needs to be more fundamental to regex searching. @@ -817,7 +817,7 @@ trait Replacer { ``` But parameterizing the `Captures` type is a little bit tricky. Namely, methods -like `at` want to slice the text at match offsets, but this can't be done +like `get` want to slice the text at match offsets, but this can't be done safely in generic code without introducing another public trait. The final death knell in this idea is that these two implementations cannot @@ -965,3 +965,7 @@ API proposed in this RFC. error information, use the `regex-syntax` crate directly. * To allow future growth, some character classes may no longer compile to make room for possibly adding class set notation in the future. +* Various iterator types have been renamed. +* The `RegexBuilder` type now takes an `&mut self` on most methods instead of + `self`. Additionally, the final build step now uses `build()` instead of + `compile()`. 
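The `&mut self` builder style noted in the last bullet can be sketched as below. This is a minimal illustration and not the regex crate's `RegexBuilder`: the types `Options` and `OptionsBuilder` are hypothetical, and the sketch assumes each setter returns `&mut` to the builder so that calls can be chained.

```rust
/// Hypothetical settings produced by the builder.
#[derive(Debug, Clone, PartialEq)]
struct Options {
    pattern: String,
    case_insensitive: bool,
    size_limit: usize,
}

/// Hypothetical non-consuming builder; not the regex crate's type.
struct OptionsBuilder {
    opts: Options,
}

impl OptionsBuilder {
    fn new(pattern: &str) -> OptionsBuilder {
        OptionsBuilder {
            opts: Options {
                pattern: pattern.to_string(),
                case_insensitive: false,
                size_limit: 10 * (1 << 20),
            },
        }
    }

    /// Setters borrow the builder mutably and hand the borrow back,
    /// so several calls can be chained in one expression.
    fn case_insensitive(&mut self, yes: bool) -> &mut OptionsBuilder {
        self.opts.case_insensitive = yes;
        self
    }

    fn size_limit(&mut self, limit: usize) -> &mut OptionsBuilder {
        self.opts.size_limit = limit;
        self
    }

    /// The final step only needs a shared borrow, so the builder
    /// stays usable after building.
    fn build(&self) -> Options {
        self.opts.clone()
    }
}

fn main() {
    let mut builder = OptionsBuilder::new("a+");
    builder.case_insensitive(true).size_limit(1024);

    // The same builder can produce more than one value.
    let first = builder.build();
    let second = builder.build();
    println!("{:?}", first);
    println!("same settings: {}", first == second);
}
```

Compared with setters that take `self` by value, this shape lets a caller configure a builder conditionally across several statements without rebinding it.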
From 4d676e957686d8e5c473522fbb89e1426b319a8a Mon Sep 17 00:00:00 2001 From: Andrew Gallant Date: Sun, 21 Aug 2016 17:32:40 -0400 Subject: [PATCH 1086/1195] mention Unicode upgrades --- text/0000-regex-1.0.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-regex-1.0.md b/text/0000-regex-1.0.md index 4aa33e18b07..93ceddcc812 100644 --- a/text/0000-regex-1.0.md +++ b/text/0000-regex-1.0.md @@ -93,7 +93,7 @@ necessary. Thus, this RFC proposes: semantics *is* a breaking change. (For example, changing `\b` from "word boundary assertion" to "backspace character.") -Bug fixes are exceptions to both (2) and (3). +Bug fixes and Unicode upgrades are exceptions to both (2) and (3). Another interesting exception to (2) is that compiling a regex can fail if the entire compiled object would exceed some pre-defined user configurable size. From 8434b822ca28646ee76464e768e672ca019deaab Mon Sep 17 00:00:00 2001 From: Andrew Gallant Date: Sun, 21 Aug 2016 18:08:57 -0400 Subject: [PATCH 1087/1195] fix builder definition. derp. --- text/0000-regex-1.0.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/text/0000-regex-1.0.md b/text/0000-regex-1.0.md index 93ceddcc812..8feba05155a 100644 --- a/text/0000-regex-1.0.md +++ b/text/0000-regex-1.0.md @@ -316,36 +316,36 @@ impl RegexBuilder { pub fn build(&self) -> Result; /// Set the case insensitive flag (i). - pub fn case_insensitive(&mut self, yes: bool); + pub fn case_insensitive(&mut self, yes: bool) -> &mut RegexBuilder; /// Set the multi line flag (m). - pub fn multi_line(&mut self, yes: bool); + pub fn multi_line(&mut self, yes: bool) -> &mut RegexBuilder; /// Set the dot-matches-any-character flag (s). - pub fn dot_matches_new_line(&mut self, yes: bool); + pub fn dot_matches_new_line(&mut self, yes: bool) -> &mut RegexBuilder; /// Set the swap-greedy flag (U). 
- pub fn swap_greed(&mut self, yes: bool); + pub fn swap_greed(&mut self, yes: bool) -> &mut RegexBuilder; /// Set the ignore whitespace flag (x). - pub fn ignore_whitespace(&mut self, yes: bool); + pub fn ignore_whitespace(&mut self, yes: bool) -> &mut RegexBuilder; /// Set the Unicode flag (u). - pub fn unicode(&mut self, yes: bool); + pub fn unicode(&mut self, yes: bool) -> &mut RegexBuilder; /// Set the approximate size limit (in bytes) of the compiled regular /// expression. /// /// If compiling a pattern would approximately exceed this size, then /// compilation will fail. - pub fn size_limit(&mut self, limit: usize); + pub fn size_limit(&mut self, limit: usize) -> &mut RegexBuilder; /// Set the approximate size limit (in bytes) of the cache used by the DFA. /// /// This is a per thread limit. Once the DFA fills the cache, it will be /// wiped and refilled again. If the cache is wiped too frequently, the /// DFA will quit and fall back to another matching engine. - pub fn dfa_size_limit(&mut self, limit: usize); + pub fn dfa_size_limit(&mut self, limit: usize) -> &mut RegexBuilder; } ``` From 44c24bdfe53131ee29bc203133610ea1471c8f3c Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 22 Aug 2016 09:52:52 -0400 Subject: [PATCH 1088/1195] Merge RFC 1561: macro naming and modularisation --- text/{0000-macro-naming.md => 1561-macro-naming.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-macro-naming.md => 1561-macro-naming.md} (98%) diff --git a/text/0000-macro-naming.md b/text/1561-macro-naming.md similarity index 98% rename from text/0000-macro-naming.md rename to text/1561-macro-naming.md index 149b9c0d366..2d7a4caf6ee 100644 --- a/text/0000-macro-naming.md +++ b/text/1561-macro-naming.md @@ -1,7 +1,7 @@ - Feature Name: N/A (part of other unstable features) - Start Date: 2016-02-11 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1561 +- Rust Issue: 
https://github.com/rust-lang/rust/issues/35896 # Summary [summary]: #summary From ac2257321450f4eaa982e1f1e25840852404c3b7 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 22 Aug 2016 08:54:24 -0700 Subject: [PATCH 1089/1195] Add a number of unresolved questions --- text/0000-macros-1.1.md | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/text/0000-macros-1.1.md b/text/0000-macros-1.1.md index d51a979670c..012014feed6 100644 --- a/text/0000-macros-1.1.md +++ b/text/0000-macros-1.1.md @@ -559,3 +559,16 @@ pub struct Foo { crates twice, once as `rustc-macro` and once as an rlib. Does Cargo have enough information to do this? Are the extensions needed here backwards-compatible? + +* What sort of guarantees will be provided about the runtime environment for + plugins? Are they sandboxed? Are they run in the same process? + +* Should the name of this library be `rustc_macros`? The `rustc_` prefix + normally means "private". Other alternatives are `macro` (make it a contextual + keyword), `macros`, `proc_macro`. + +* Should a `Context` or similar style argument be threaded through the APIs? + Right now they sort of implicitly require one to be threaded through + thread-local-storage. + +* Should the APIs here be namespaced, perhaps with a `_1_1` suffix? From 67a14dc22e4c83f4940ce54a86c0548b06045915 Mon Sep 17 00:00:00 2001 From: Steven Allen Date: Mon, 22 Aug 2016 12:03:42 -0400 Subject: [PATCH 1090/1195] Update unresolved questions and drawbacks for the FusedIterator RFC 1. We can't remove the `done` bool without making `Fuse` invariant. 2. One drawback I failed to mention is that implementing `FusedIterator` locks implementors into obeying the `FusedIterator` spec. 
--- text/1581-fused-iterator.md | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/text/1581-fused-iterator.md b/text/1581-fused-iterator.md index b5cff076a8a..e95a314418d 100644 --- a/text/1581-fused-iterator.md +++ b/text/1581-fused-iterator.md @@ -190,6 +190,12 @@ impl Iterator for Fuse where I: FusedIterator { 3. Fuse isn't used very often anyways. However, I would argue that it should be used more often and people are just playing fast and loose. I'm hoping that making `Fuse` free when unneeded will encourage people to use it when they should. +4. This trait locks implementors into following the `FusedIterator` spec; + removing the `FusedIterator` implementation would be a breaking change. This + precludes future optimizations that take advantage of the fact that the + behavior of an `Iterator` is undefined after it returns `None` the first + time. + # Alternatives @@ -268,9 +274,11 @@ change. [unresolved]: #unresolved-questions Should this trait be unsafe? I can't think of any way generic unsafe code could -end up relying on the guarantees of `Fused`. +end up relying on the guarantees of `FusedIterator`. -Also, it's possible to implement the specialized `Fuse` struct without a useless -`don` bool. Unfortunately, it's *very* messy. IMO, this is not worth it for now +~~Also, it's possible to implement the specialized `Fuse` struct without a useless +`done` bool. Unfortunately, it's *very* messy. IMO, this is not worth it for now and can always be fixed in the future as it doesn't change the `FusedIterator` -trait. +trait.~~ Resolved: It's not possible to remove the `done` bool without making +`Fuse` invariant. 
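For readers unfamiliar with the `done` flag discussed above, here is a minimal sketch of a fusing adapter. `MyFuse` and `Flaky` are hypothetical types written for this illustration. This is not the standard library's `Fuse` and it omits the specialization the RFC proposes.

```rust
/// Hypothetical fusing adapter: once the inner iterator returns `None`,
/// keep returning `None` forever.
struct MyFuse<I> {
    iter: I,
    done: bool,
}

impl<I: Iterator> Iterator for MyFuse<I> {
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        if self.done {
            return None;
        }
        let item = self.iter.next();
        if item.is_none() {
            // Remember exhaustion so the inner iterator is never polled again.
            self.done = true;
        }
        item
    }
}

/// An iterator that is not fused: it yields `Some` again after a `None`.
struct Flaky(u32);

impl Iterator for Flaky {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        self.0 += 1;
        if self.0 % 2 == 0 { None } else { Some(self.0) }
    }
}
```

For an inner iterator that already guarantees fused behavior, the `done` check is redundant work. That redundancy is the cost the RFC's specialized `Fuse` implementation aims to avoid.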
+ From a80f9bed204726c8c80860c26595f35828c53a64 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 22 Aug 2016 12:34:15 -0400 Subject: [PATCH 1091/1195] Merge RFC 1623: static lifetime in statics --- text/{0000-static.md => 1623-static.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-static.md => 1623-static.md} (97%) diff --git a/text/0000-static.md b/text/1623-static.md similarity index 97% rename from text/0000-static.md rename to text/1623-static.md index 1ac65277aac..b07b8aed773 100644 --- a/text/0000-static.md +++ b/text/1623-static.md @@ -1,7 +1,7 @@ - Feature Name: static_lifetime_in_statics - Start Date: 2016-05-20 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1623 +- Rust Issue: https://github.com/rust-lang/rust/issues/35897 # Summary [summary]: #summary From 35ad6c5368fdefbc5b51ab5612f5eebf24d4ab6b Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Mon, 22 Aug 2016 17:30:57 +0100 Subject: [PATCH 1092/1195] Add ptr::{read,write}_unaligned --- text/0000-unaligned-access.md | 63 +++++++++++++++++++++++++++++++++++ 1 file changed, 63 insertions(+) create mode 100644 text/0000-unaligned-access.md diff --git a/text/0000-unaligned-access.md b/text/0000-unaligned-access.md new file mode 100644 index 00000000000..bf942c65c95 --- /dev/null +++ b/text/0000-unaligned-access.md @@ -0,0 +1,63 @@ +- Feature Name: unaligned_access +- Start Date: 2016-08-22 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Add two functions, `ptr::read_unaligned` and `ptr::write_unaligned`, which allows reading/writing to an unaligned pointer. All other functions that access memory (`ptr::{read,write}`, `ptr::copy{_nonoverlapping}`, etc) require that a pointer be suitably aligned for its type. 
+ +# Motivation +[motivation]: #motivation + +One major use case is to make working with packed structs easier: + +```rust +#[repr(packed)] +struct Packed(u8, u16, u8); + +let mut a = Packed(0, 1, 0); +unsafe { + let b = ptr::read_unaligned(&a.1); + ptr::write_unaligned(&mut a.1, b + 1); +} +``` + +Other use cases generally involve parsing some file formats or network protocols that use unaligned values. + +# Detailed design +[design]: #detailed-design + +These functions are implemented as simple wrappers around `ptr::copy_nonoverlapping`. The pointers are cast to `u8` to ensure that LLVM does not make any assumptions about the alignment. + +```rust +pub unsafe fn read_unaligned<T>(p: *const T) -> T { + let mut r = mem::uninitialized(); + ptr::copy_nonoverlapping(p as *const u8, + &mut r as *mut _ as *mut u8, + mem::size_of::<T>()); + r +} + +pub unsafe fn write_unaligned<T>(p: *mut T, v: T) { + ptr::copy_nonoverlapping(&v as *const _ as *const u8, + p as *mut u8, + mem::size_of::<T>()); +} +``` + +# Drawbacks +[drawbacks]: #drawbacks + +These functions aren't *strictly* necessary since they are just convenience wrappers around `ptr::copy_nonoverlapping`. + +# Alternatives +[alternatives]: #alternatives + +We could simply not add these; however, figuring out how to do unaligned access properly is extremely unintuitive: you need to cast the pointer to `*mut u8` and then call `ptr::copy_nonoverlapping`.
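As a rough illustration of the two use cases named above (packed structs and parsing), the sketch below uses `ptr::read_unaligned` and `ptr::write_unaligned` as they exist in today's standard library. The `Header` type and the helper functions `bump_len` and `read_u32_at` are hypothetical names invented for this example. The use of `ptr::addr_of_mut!` is an assumption about current Rust practice, not something the RFC text specifies.

```rust
use std::ptr;

/// Hypothetical packed header: `len` sits at byte offset 1, so it is
/// misaligned for `u16`.
#[repr(C, packed)]
struct Header {
    tag: u8,
    len: u16,
    flags: u8,
}

/// Increment the misaligned `len` field and return the new value.
fn bump_len(h: &mut Header) -> u16 {
    // `addr_of_mut!` produces a raw pointer without first creating a
    // reference to the (possibly misaligned) field.
    let p = ptr::addr_of_mut!(h.len);
    // SAFETY: `p` points into a live `Header`; the unaligned accessors
    // place no alignment requirement on the pointer.
    unsafe {
        let old = ptr::read_unaligned(p);
        ptr::write_unaligned(p, old.wrapping_add(1));
        ptr::read_unaligned(p)
    }
}

/// Read a native-endian `u32` starting at an arbitrary byte offset.
fn read_u32_at(buf: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    let bytes = buf.get(offset..end)?;
    // SAFETY: `bytes` is exactly four readable bytes, and `read_unaligned`
    // does not require the pointer to be aligned for `u32`.
    Some(unsafe { ptr::read_unaligned(bytes.as_ptr() as *const u32) })
}
```

The bounds check in `read_u32_at` matters because the unaligned accessors only relax the alignment requirement. The pointer must still refer to valid, readable memory of the right size.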
+ +# Unresolved questions +[unresolved]: #unresolved-questions + +None From 820701005335655e8158396e71de7af438762d74 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 22 Aug 2016 12:37:26 -0400 Subject: [PATCH 1093/1195] Merge RFC #1681: Macros 1.1 --- text/{0000-macros-1.1.md => 1681-macros-1.1.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-macros-1.1.md => 1681-macros-1.1.md} (99%) diff --git a/text/0000-macros-1.1.md b/text/1681-macros-1.1.md similarity index 99% rename from text/0000-macros-1.1.md rename to text/1681-macros-1.1.md index 012014feed6..011e62bd1c4 100644 --- a/text/0000-macros-1.1.md +++ b/text/1681-macros-1.1.md @@ -1,7 +1,7 @@ - Feature Name: `rustc_macros` - Start Date: 2016-07-14 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1681 +- Rust Issue: https://github.com/rust-lang/rust/issues/35900 # Summary [summary]: #summary From 12cba3a77fc3f0ed75feb6c74a0aea03ca3484c9 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Mon, 22 Aug 2016 16:05:39 -0400 Subject: [PATCH 1094/1195] add some notes about spans --- text/1681-macros-1.1.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/text/1681-macros-1.1.md b/text/1681-macros-1.1.md index 011e62bd1c4..9a50c591897 100644 --- a/text/1681-macros-1.1.md +++ b/text/1681-macros-1.1.md @@ -499,6 +499,12 @@ pub struct Foo { reexport the macros, but unfortunately that would require a likely much larger step towards "macros 2.0" to solve and would greatly increase the size of this RFC. + +* Converting to a string and back loses span information, which can + lead to degraded error messages. For example, currently we can make + an effort to use the span of a given field when deriving code that + is caused by that field, but that kind of precision will not be + possible until a richer interface is available. 
# Alternatives [alternatives]: #alternatives @@ -572,3 +578,8 @@ pub struct Foo { thread-local-storage. * Should the APIs here be namespaced, perhaps with a `_1_1` suffix? + +* To what extent can we preserve span information through heuristics? + Should we adopt a slightly different API, for example one based on + concatenation, to allow preserving spans? + From 0670624136c89ff58e03577273de0f97900d5533 Mon Sep 17 00:00:00 2001 From: Brian Anderson Date: Sun, 7 Aug 2016 17:52:17 -0700 Subject: [PATCH 1095/1195] Roadmap RFC --- text/0000-north-star.md | 404 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 404 insertions(+) create mode 100644 text/0000-north-star.md diff --git a/text/0000-north-star.md b/text/0000-north-star.md new file mode 100644 index 00000000000..dc879b7dd72 --- /dev/null +++ b/text/0000-north-star.md @@ -0,0 +1,404 @@ +- Feature Name: north_star +- Start Date: 2016-08-07 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +A refinement of the Rust planning and reporting process, to establish a shared +vision of the language we are building toward among contributors, to make clear +the roadmap toward that vision, and to celebrate our achievements. + +Rust's roadmap will be established in year-long cycles, where we identify up +front - together, as a project - the most critical problems facing the language, +along with the story we want to be able to tell the world about Rust. Work +toward solving those problems, our short-term goals, will be decided in +quarter-long cycles by individual teams. Goals that result in stable features +will be assigned to release milestones for the purposes of reporting the project +roadmap. + +At the end of the year we will deliver a public facing retrospective, describing +the goals we achieved and how to use the new features in detail. It will +celebrate the year's progress in The Rust Project toward our goals, as well as +achievements in the wider community. 
It will celebrate our performance and +anticipate its impact on the coming year. + +The primary outcome for these changes to the process are that we will have a +consistent way to: + +- Decide our project-wide goals through consensus. +- Advertise our goals as a published roadmap. +- Celebrate our achievements with an informative publicity-bomb. + +# Motivation +[motivation]: #motivation + +Rust is a massive system, developed by a massive team of mostly-independent +contributors. What we've achieved together already is mind-blowing: we've +created a uniquely powerful platform that solves problems that the computing +world had nearly given up on, and jumpstarted a new era in systems +programming. Now that Rust is out in the world, proving itself to be a stable +foundation for building the next generation of computing systems, the +possibilities open to us are nearly endless. + +And that's a big problem. + +For many months approaching the release of Rust 1.0 we had a clear, singular +goal: get Rust done and deliver it to the world. We knew precisely the discreet +steps necessary to get there, and although it was a tense period where the +entire future of the project was on the line, we were united in a single +mission. As The Rust Project Developers we were pumped up, and our user base - +along with the wider programming world - were excited to see what we would +deliver. + +The same has not been true since. We've had a number of major goals - refactor +the compiler, enable strong IDE support, make cross-compilation easier, increase +community diversity - but it's not clear that we've been as focused on them as +needed. Even where there are clear strategic priorities in the project, they are +often under-emphasized in the way we talk about Rust, under-prioritized when we +do our own work or in our efforts to rally contributions, under-staffed by both +Mozilla and community contributors, and backburnered in favor of more present +issues. 
We are overwhelmed by an avalanche of promising ideas, with major RFCs +demanding attention (and languishing in the queue for months), TODO another +clause to make this sentence shine. + +Compounding this problem is that we have no clear end state for our efforts, no +major deliverable to show for all our work, for the community to rally behind, +and for the user base to anticipate. To a great degree this is a result of our +own successes - we have a short, time-based release cycle where new features +drip out as they become available, and a feature integration process that places +incredible emphasis on maintaining stability. It works shockingly well! But Rust +releases are boring 😢 (admittedly some of the reason for this is that language +features have been delayed waiting on internal compiler refactoring). And - +perhaps surprisingly - our rapid release process seems to cause work to proceed +slowly: the lack of deadlines for features reduces the pressure to get them +done, and today there are many approved RFCs languishing in a half-finished +state, with no-one urgently championing their completion. The slow trickle of +features reduces opportunities to make a big public 'splash' upon release, +lessening the impact of our work. + +The result is that there is a lack of direction in Rust, both real and +perceived. + +This RFC proposes changes to the way The Rust Project plans its work, +communicates and monitors its progress, directs contributors to focus on the +strategic priorities of the project, and finally, delivers the results of its +effort to the world. + +The changes proposed here are intended to work with the particular strengths of +our project - community development, collaboration, distributed teams, loose +management structure, constant change and uncertainty. They should introduce +minimal additional burden on Rust team members, who are already heavily +overtasked.
The proposal does not attempt to solve all problems of project +management in Rust, nor to fit the Rust process into any particular project +management structure. Let's make a few incremental improvements that will have +the greatest impact, and that we can accomplish without disruptive changes to +the way we work today. + +# Detailed design +[design]: #detailed-design + +Rust's roadmap will be established in year-long cycles, where we identify up +front, as a project, the most critical problems facing the language, formulated +as _problem statements_. Work toward solving those problems, _goals_, will be +planned in quarter-long cycles by individual teams. _Goals_ that result in +stable features will be assigned to _release milestones_ for the purposes of +reporting the project roadmap. Along the way, teams will be expected to maintain +_tracking issues_ that communicate progress toward the project's goals. + +The end-of-year retrospective is a 'rallying point'. Its primary purposes are to +create anticipation of a major event in the Rust world, to motivate (rally) +contributors behind the goals we've established to get there, and to generate a big +PR-bomb where we can brag to the world about what we've done. It can be thought +of as a 'state of the union'. This is where we tell Rust's story, describe the +new best practices enabled by the new features we've delivered, celebrate those +contributors who helped achieve our goals, honestly evaluate our performance, +and look forward to the year to come. + +## Summary of terminology + +- _problem statement_ - A description of a major issue facing Rust, possibly + spanning multiple teams and disciplines. We decide these together every year + so that everybody understands the direction the project is taking. These are + used as the broad basis for decision making throughout the year. +- _goal_ - These are set by individual teams quarterly, in service of solving + the problems identified by the project.
They have estimated deadlines, and + those that result in stable features have estimated release numbers. Goals may + be subdivided into further discrete tasks on the issue tracker. +- _retrospective_ - At the end of the year we deliver a retrospective report. It + presents the result of work toward each of our goals in a way that serves to + reinforce the year's narrative. These are written for public consumption, + showing off new features, surfacing interesting technical details, and + celebrating those contributors who contribute to achieving the project's goals + and resolving it's problems. +- _quarterly milestone_ - All goals have estimates for completion, placed on + quarterly milestones. Each quarter that a goal remains incomplete it must be + re-triaged and re-estimated by the responsible team. + +## The big planning cycle (problem statements and the narrative arc) + +The big cycle spans one year. At the beginning of the cycle we identify areas of +Rust that need the most improvement, and at the end of the cycle is a 'rallying +point' where we deliver to the world the results of our efforts. We choose +year-long cycles because a year is enough time to accomplish relatively large +goals; and because having the rallying point occur at the same time every year +makes it easy to know when to anticipate big news from the project. + +This planning effort is _problem-oriented_. In our collective experience we have +consistently seen that spending up front effort focusing on motivation - even +when we have strong ideas about the solutions - is a critical step in building +consensus. It avoids surprises and hurt feelings, and establishes a strong causal +record for explaining decisions in the future. + +At the beginning of the cycle we spend no more than one month deciding on a +small set of _problem statements_ for the project, for the year. 
The number +needs to be small enough to present to the community managably, while also +sufficiently motivating the primary work of all the teams for the year. 8-10 is +a reasonable guideline. This planning takes place via the RFC process and is +open to the entire community. The result of the process is the yearly 'north +star RFC'. + +We strictly limit the planning phase to one month in order to keep the +discussion focused and to avoid unrestrained bikeshedding. The activities +specified here are not the focus of the project and we need to get through them +efficiently and get on with the actual work. + +The core team is responsible for initiating the process, either on the internals +forum or directly on the RFC repository, and the core team is responsible for +merging the final RFC, thus it will be their responsibility to ensure that the +discussion drives to a reasonable conclusion in time for the deadline. + +The problem statements established here determine the strategic direction of the +project. They identify critical areas where the project is lacking and represent +a public commitment to fixing them. + +TODO: How do we talk about solutions during this process? We certainly will have +lots of ideas about how these problems are going to get solved, and we can't +pretend like they don't exist. + +Problem statements consist of a single sentence summarizing the problem, and one +or more paragraph describing it in details. 
Examples of good problem statements +might be: + +- The Rust compiler is slow +- Rust lacks world-class IDE support +- The Rust story for asynchronous I/O is incomplete +- Rust compiler errors are difficult to understand +- Plugins need to be on a path to stabilization +- Rust doesn't integrate well with garbage collectors +- Inability to write truly zero-cost abstractions (due to lack of + specialization) (TODO this is awfully goal-oriented, also not a complete + sentence) +- We would like the Rust community to be more diverse +- It's too hard to obtain Rust for the platforms people want to target + +During the actual process each of these would be accompanied by a paragraph or +more of justification. + +Once the year's problem statements are decided, a metabug is created for each on +the rust-lang/rust issue tracker and tagged `R-problem-statement`. In the OP of +each metabug the teams are responsible for maintaining a list of their goals, +linking to tracking issues. + +## The little planning cycle (goals and tracking progress) + +TODO: This is the most important part of the RFC mechanically and needs to be +clear so teams can just read it and follow the instructions. + +The little cycle is where the solutions take shape and are carried out. Each +cycle lasts one quarter - 3 months - and is the responsibility of individual teams. + +Each cycle the teams will have one week to update their set of _goals_. This +includes both creating new goals and reviewing and revising existing goals. A +goal describes a task that contributes to solving the year's problems. It may or +may not involve a concrete deliverable, and it may be in turn subdivided into +further goals. + +The social process of the quarterly planning cycle is less strict, but it +should be conducted in a way that allows open feedback. It is suggested that +teams present their quarterly plan on internals.rust-lang.org at the beginning +of the week, solicit feedback, then finalize it at the end of the week.
+ +All goals have estimated completion dates. There is no limit on the duration of +a single goal, but they are encouraged to be scoped to less than a quarter year +of work. Goals that are expected to take more than a quarter _must_ be +subdivided into smaller goals of less than a quarter, each with their own +estimates. These estimates are used to place goals onto quarterly milestones. + +Not every work item done by a team in a quarter needs to be a goal, +nor should it be. Goals only need to be granular enough to demonstrate +consistent progress toward solving the project's problems. Work that +contributes toward quarterly goals should still be tracked as sub-tasks of +those goals, but only needs to be filed on the issue tracker and not reported +directly as goals on the roadmap. + +For each goal the teams will create an issue on the issue tracker tagged with +`R-goal`. Each goal must be described in a single sentence summary (TODO what +makes a good summary?). Goals with sub-goals and sub-tasks must list them in the +OP in a standard format. + +During each planning period all goals must be triaged and updated for the +following information: + +- The set of sub-goals and sub-tasks and their status +- The estimated date of completion for goals + +## The retrospective (rallying point) + +- Written for broad public consumption +- Detailed +- Progress toward goals +- Demonstration of new features +- Technical details +- Reinforce the project narrative +- Celebrate contributors who accomplished our goals +- Celebrate the evolution of the ecosystem +- Evaluation of performance, missed goals + +TODO How is it constructed? + +## Release estimation + +The teams are responsible for estimating only the _timeframe_ in which they +complete their work, but possibly the single most important piece of information +desired by users is to know _in what release_ any given feature will become +available.
+ +To reduce process burden on team members we will not require them to make +that estimate themselves, instead a single person will have the responsibility +each quarter to examine the roadmap, its goals and time estimates, and turn +those into release estimates for individual features. + +The precise mechanics are to be determined. + +## Presenting the roadmap + +As a result of this process the Rust roadmap for the year is encoded in three +main ways, that evolve over the year: + +- The north-star RFC, which contains the problem statements collected in one + place +- The R-problem-statement issues, which contain the individual problem + statements, each linking to supporting goals +- The R-goal issues, which contain the work items, tagged with metadata + indicating their statuses. + +Alone, this is perhaps sufficient for presenting the roadmap. A user could run a +GitHub query for all `R-problem-statement` issues, and by digging through them +get a reasonably accurate picture of the roadmap. + +We may additionally develop tools to present this information in a more +accessible form (for a prototype see [1]). + +[1]: https://brson.github.io/rust-z + +## Calendar + +The timing of the events specified by this RFC is precisely specified in order +to limit bikeshedding. The activities specified here are not the focus of the +project and we need to get through them efficiently and get on with the actual +work. + +The north star RFC development happens during the month of September, starting +September 1 and ending by October 1. This means that an RFC must be ready for +RFC by the last week of September. We choose september for two reasons: it is +the final month of a calendar quarter, allowing the beginning of the years work +to commence at the beginning of calendar Q4; we choose Q4 because it is the +traditional conference season and allows us opportunities to talk publicly about +both our previous years progress as well as next years ambitions. 
+ +Following from the September planning month, the quarterly planning cycles take +place for exactly one week at the beginning of the calendar quarter; and the +development of the yearly retrospective takes place approximately during the month of August. + +## Summary of mechanics + +There are four primary new mechanisms introduced by this RFC: + +- North star RFC. Each year in September the entire project comes together + to produce this. It is what drives the evolution of the project roadmap + over the next year. +- `R-problem-statement` tag. The north star RFC defines problem statements that + are filed and tagged on the issue tracker. The `R-problem-statement` issues in + turn link to the goals that support them. +- `R-goal`. Reevaluated every quarter by the teams, with feedback from + the wider community, these are filed on the issue tracker, tagged `R-goal` and + linked to the `R-problem-statement` issue they support. +- End-of-year retrospective blog post. In the final month we write a detailed + blog post that hypes up our amazing work. + +For simplicity, all `R-problem-statement` and `R-goal` issues live in +rust-lang/rust, even when they primarily entail work on other code-bases. + +# Drawbacks +[drawbacks]: #drawbacks + +The yearly north star RFC could be an unpleasant bikeshed. Maybe nobody actually +agrees on the project's direction. + +This imposes more work on teams to organize their goals. + +There is no mechanism here for presenting the roadmap. + +The end-of-year retrospective will require significant effort. It's not clear +who will be motivated to do it, and at the level of quality it demands. + +# Alternatives +[alternatives]: #alternatives + +Instead of imposing further process structure on teams we might attempt to +derive a roadmap solely from the data they are currently producing. + +To serve the purposes of a 'rallying point', a high-profile deliverable, we +might release a software product instead of the retrospective.
A larger-scope +product than the existing rustc+cargo pair could accomplish this, i.e. +The Rust Platform. + +Another rallying point could be a long-term support release. + +# Unresolved questions +[unresolved]: #unresolved-questions + +Are 1 year cycles long enough? + +Does the yearly report serve the purpose of building anticipation, motivation, +and creating a compelling PR-bomb? + +Is a consistent time-frame for the big cycle really the right thing? One of the +problems we have right now is that our release cycles are so predictable they +are boring. It could be more exciting to not know exactly when the cycle is +going to end, to experience the tension of struggling to cross the finish line. + +How can we account for work that is not part of the planning process +described here? + +How can we avoid adding new tags? + +How do we address problems that are outside the scope of the standard library +and compiler itself? Would have used 'the rust platform' and related processes. + +How do we motivate the improvement of rust-lang, other libraries? + +'Problem statement' is not inspiring terminology. We don't want to our roadmap +to be front-loaded with 'problems'. + +Likewise, 'goal' and 'retrospective' could be more colorful. + +How can we work in an inspiring 'vision statement'? + +Can we call the yearly RFC the 'north start RFC'? Too many concepts? + +Does the yearly planning really need to be an RFC? + +Likewise, _this RFC_ is currently titled 'north-star'. + +What about tracking work that is not part of R-problem-statement and R-goal. I +originally wanted to track all features in a roadmap, but this does not account +for anything that has not been explicitly identified as supporting the +roadmap. As formulated this does not provide an easy way to find the status of +arbitrary features in the RFC pipeline. + +How do we present the roadmap? 
Communicating what the project is working on and +toward is one of the _primary goals_ of this RFC and the solution it proposes is +minimal - read the R-problem-statement issues. From 8b98bfd8f0ff2e4aa1043d4f2a82d36565568371 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Mon, 22 Aug 2016 11:39:32 -0700 Subject: [PATCH 1096/1195] Aturon's first edit round --- text/0000-north-star.md | 208 ++++++++++++++++++++++------------------ 1 file changed, 115 insertions(+), 93 deletions(-) diff --git a/text/0000-north-star.md b/text/0000-north-star.md index dc879b7dd72..fd1e961cb74 100644 --- a/text/0000-north-star.md +++ b/text/0000-north-star.md @@ -7,16 +7,16 @@ [summary]: #summary A refinement of the Rust planning and reporting process, to establish a shared -vision of the language we are building toward among contributors, to make clear -the roadmap toward that vision, and to celebrate our achievements. +vision of the project among contributors, to make clear the roadmap toward that +vision, and to celebrate our achievements. Rust's roadmap will be established in year-long cycles, where we identify up -front - together, as a project - the most critical problems facing the language, -along with the story we want to be able to tell the world about Rust. Work -toward solving those problems, our short-term goals, will be decided in -quarter-long cycles by individual teams. Goals that result in stable features -will be assigned to release milestones for the purposes of reporting the project -roadmap. +front - together, as a project - the most critical problems facing the language +and its ecosystem, along with the story we want to be able to tell the world +about Rust. Work toward solving those problems, our short-term goals, will be +decided in quarter-long cycles by individual teams. Goals that result in stable +features will be assigned to release milestones for the purposes of reporting +the project roadmap. 
At the end of the year we will deliver a public facing retrospective, describing the goals we achieved and how to use the new features in detail. It will @@ -34,52 +34,65 @@ consistent way to: # Motivation [motivation]: #motivation -Rust is a massive system, developed by a massive team of mostly-independent -contributors. What we've achieved together already is mind-blowing: we've -created a uniquely powerful platform that solves problems that the computing -world had nearly given up on, and jumpstarted a new era in systems -programming. Now that Rust is out in the world, proving itself to be a stable -foundation for building the next generation of computing systems, the +Rust is a massive project and ecosystem, developed by a massive team of +mostly-independent contributors. What we've achieved together already is +mind-blowing: we've created a uniquely powerful platform that solves problems +that the computing world had nearly given up on, and jumpstarted a new era in +systems programming. Now that Rust is out in the world, proving itself to be a +stable foundation for building the next generation of computing systems, the possibilities open to us are nearly endless. And that's a big problem. -For many months approaching the release of Rust 1.0 we had a clear, singular -goal: get Rust done and deliver it to the world. We knew precisely the discreet -steps necessary to get there, and although it was a tense period where the -entire future of the project was on the line, we were united in a single -mission. As The Rust Project Developers we were pumped up, and our user base - -along with the wider programming world - were excited to see what we would -deliver. - -The same has not been true since. We've had a number of major goals - refactor -the compiler, enable strong IDE support, make cross-compilation easier, increase -community diversity - but it's not clear that we've been as focused on them as -needed. 
Even where there are clear strategic priorities in the project, they are -often under-emphasized in the way we talk about Rust, under-prioritized when we -do our own work or in our efforts to rally contributions, under-staffed by both -Mozilla and community contributors, and backburnered in favor of more present -issues. We are overwhelmed by an avalanche of promising ideas, with major RFCs -demanding attention (and languishing in the queue for months), TODO another -clause to make this sentence shine. - -Compounding this problem is that we have no clear end state for our efforts, no -major deliverable to show for all our work, for the community to rally behind, -and for the user base to anticipate. To a great degree this is a result of our -own successes - we have a short, time-based release cycle where new features -drip out as they become available, and a feature integration process that places -incredible emphasis on maintaining stability. It works shockingly well! But Rust -releases are boring 😢 (admitedly some of the reason for this is that language -features have been delayed waiting on internal compiler refactoring). And - -perhaps surprisingly - our rapid release process seems to cause work to proceed -slowly: the lack of deadlines for features reduces the pressure to get them -done, and today there are many approved RFCs languishing in a half-finished -state, with no-one urgently championing their completion. The slow trickle of -features reduces opportunities to make a big public 'splash' upon release, -lessening the impact of our work. - -The result is that there is a lack of direction in Rust, both real and -perceieved. +In the run-up to the release of Rust 1.0 we had a clear, singular goal: get Rust +done and deliver it to the world. We established the discrete steps necessary +to get there, and although it was a tense period where the entire future of the +project was on the line, we were united in a single mission. 
As The Rust Project +Developers we were pumped up, and our user base - along with the wider +programming world - were excited to see what we would deliver. + +But 1.0 is a unique event, and since then our efforts have become more diffuse +even as the scope of our ambitions widen. This shift is inevitable: **our success +post-1.0 depends on making improvements in increasingly broad and complex ways**. +The downside, of course, is that a less singular focus can make it much harder +to rally our efforts, to communicate a clear story - and ultimately, to ship. + +Since 1.0, we've attempted to lay out some major goals, both through the +[discuss forum] and the [blog]. We've done pretty well in actually achieving +these goals, and in some cases - particularly [MIR] - the community has really +come together to produce amazing, focused results. But in general, there are +several problems with the status quo: + +[discuss forum]: https://internals.rust-lang.org/t/priorities-after-1-0/1901 +[blog]: https://blog.rust-lang.org/2015/08/14/Next-year.html +[MIR]: https://blog.rust-lang.org/2016/04/19/MIR.html + +- We have not systematically tracked or communicated our progression through the + completion of these goals, making it difficult for even the most immersed + community members to know where things stand, and making it difficult for + *anyone* to know how or where to get involved. A symptom is that questions + like "When is MIR landing?" or "What are the blockers for `?` stabilizing" + become extremely frequently-asked. **We should provide an at-a-glance view + what Rust's current strategic priorities are and how they are progressing.** + +- We are overwhelmed by an avalanche of promising ideas, with major RFCs + demanding attention (and languishing in the queue for months) while subteams + focus on their strategic goals. This state of affairs produces needless + friction and loss of momentum. 
**We should agree on and disseminate our + priorities, so we can all be pulling in roughly the same direction**. + +- We do not have any single point of release, like 1.0, that gathers together a + large body of community work into a single, polished product. Instead, we have + a rapid release process, which is a huge boon for + [stability without stagnation] but can paradoxically reduce pressure to ship + in a timely fashion. **We should find a balance, retaining rapid release but + establishing some focal point around which to rally the community, polish a + product, and establish a clear public narrative**. + +[stability without stagnation]: http://blog.rust-lang.org/2014/10/30/Stability.html + +All told, there's a lot of room to do better in establishing, communicating, and +driving the vision for Rust. This RFC proposes changes to the way The Rust Project plans its work, communicates and monitors its progress, directs contributors to focus on the @@ -100,12 +113,12 @@ the way we work today. [design]: #detailed-design Rust's roadmap will be established in year-long cycles, where we identify up -front, as a project, the most critical problems facing the language, formulated -as _problem statements_. Work toward solving those problems, _goals_, will be -planned in quarter-long cycles by individual teams. _goals_ that result in -stable features will be assigned to _release milestones_ for the purposes of -reporting the project roadmap. Along the way, teams will be expected to maintain -_tracking issues_ that communicate progress toward the project's goals. +front the most critical problems facing the project, formulated as _problem +statements_. Work toward solving those problems, _goals_, will be planned in +quarter-long cycles by individual teams. _Goals_ that result in stable features +will be assigned to _release milestones_ for the purposes of reporting the +project roadmap. 
Along the way, teams will be expected to maintain _tracking +issues_ that communicate progress toward the project's goals. The end-of-year retrospective is a 'rallying point'. Its primary purposes are to create anticipation of a major event in the Rust world, to motivate (rally) @@ -122,16 +135,19 @@ and look forward to the year to come. spanning multiple teams and disciplines. We decide these together every year so that everybody understands the direction the project is taking. These are used as the broad basis for decision making throughout the year. + - _goal_ - These are set by individual teams quarterly, in service of solving the problems identified by the project. They have estimated deadlines, and those that result in stable features have estimated release numbers. Goals may be subdivided into further discrete tasks on the issue tracker. + - _retrospective_ - At the end of the year we deliver a retrospective report. It presents the result of work toward each of our goals in a way that serves to reinforce the year's narrative. These are written for public consumption, showing off new features, surfacing interesting technical details, and celebrating those contributors who contribute to achieving the project's goals and resolving it's problems. + - _quarterly milestone_ - All goals have estimates for completion, placed on quarterly milestones. Each quarter that a goal remains incomplete it must be re-triaged and re-estimated by the responsible team. @@ -143,13 +159,18 @@ Rust that need the most improvement, and at the end of the cycle is a 'rallying point' where we deliver to the world the results of our efforts. We choose year-long cycles because a year is enough time to accomplish relatively large goals; and because having the rallying point occur at the same time every year -makes it easy to know when to anticipate big news from the project. - -This planning effort is _problem-oriented_. 
In our collective experience we have -consistently seen that spending up front effort focusing on motivation - even -when we have strong ideas about the solutions - is a critical step in building -consensus. It avoids surprises and hurt feelings, and establishes a strong causal -record for explaining decisions in the future. +makes it easy to know when to anticipate big news from the project. (Being +calendar-based avoids the temptation to slip or produce feature-based releases, +instead providing a fixed point of accountability for shipping.) + +This planning effort is _problem-oriented_. Focusing on "why" may seem like an +obvious thing to do, but in practice it's very easy to become enamored of +particular technical ideas and lose sight of the larger context. By codifying a +top-level focus on motivation, we ensure we are focusing on the right problems +and keeping an open mind on how to solve them. Consensus on the problem space +then frames the debate on solutions, helping to avoid surprises and hurt +feelings, and establishing a strong causal record for explaining decisions in +the future. At the beginning of the cycle we spend no more than one month deciding on a small set of _problem statements_ for the project, for the year. The number @@ -159,43 +180,47 @@ a reasonable guideline. This planning takes place via the RFC process and is open to the entire community. The result of the process is the yearly 'north star RFC'. -We strictly limit the planning phase to one month in order to keep the -discussion focused and to avoid unrestrained bikeshedding. The activities -specified here are not the focus of the project and we need to get through them -efficiently and get on with the actual work. 
- -The core team is responsible for initiating the process, either on the internals -forum or directly on the RFC repository, and the core team is responsible for -merging the final RFC, thus it will be their responsibility to ensure that the -discussion drives to a reasonable conclusion in time for the deadline. - The problem statements established here determine the strategic direction of the project. They identify critical areas where the project is lacking and represent -a public commitment to fixing them. +a public commitment to fixing them. They should be informed in part by inputs +like [the survey] and [production user outreach], as well as an open discussion +process. And while the end-product is problem-focused, the discussion is likely +to touch on possible solutions as well. We shouldn't blindly commit to solving a +problem without some sense for the plausibility of a solution in terms of both +design and resources. -TODO: How do we talk about solutions during this process? We certainly will have -lots of ideas about how these problems are going to get solved, and we can't -pretend like they don't exist. +[the survey]: https://blog.rust-lang.org/2016/06/30/State-of-Rust-Survey-2016.html +[production user outreach]: https://internals.rust-lang.org/t/production-user-research-summary/2530 Problem statements consist of a single sentence summarizing the problem, and one -or more paragraph describing it in details. Examples of good problem statements -might be: +or more paragraphs describing it (and its importance!) in detail. 
Examples of +good problem statements might be: -- The Rust compiler is slow +- The Rust compiler is too slow for a tight edit-compile-test cycle - Rust lacks world-class IDE support -- The Rust story for asynchronous I/O is incomplete +- The Rust story for asynchronous I/O is very primitive - Rust compiler errors are dificult to understand -- Plugins need to be on path to stabilization +- Rust plugins have no clear path to stabilization - Rust doesn't integrate well with garbage collectors -- Inability to write truly zero-cost abstractions (due to lack of - specialization) (TODO this is awfully goal-oriented, also not a complete - sentence) -- We would like the Rust community to be more diverse +- Rust's trait system doesn't fully support zero-cost abstractions +- The Rust community is insufficiently diverse +- Rust needs more training materials +- Rust's CI infrastructure is unstable - It's too hard to obtain Rust for the platforms people want to target During the actual process each of these would be accompanied by a paragraph or more of justification. +We strictly limit the planning phase to one month in order to keep the +discussion focused and to avoid unrestrained bikeshedding. The activities +specified here are not the focus of the project and we need to get through them +efficiently and get on with the actual work. + +The core team is responsible for initiating the process, either on the internals +forum or directly on the RFC repository, and the core team is responsible for +merging the final RFC, thus it will be their responsibility to ensure that the +discussion drives to a reasonable conclusion in time for the deadline. + Once the year's problem statements are decided, a metabug is created for each on the rust-lang/rust issue tracker and tagged `R-problem-statement`. In the OP of each metabug the teams are responsible for maintaining a list of their goals, @@ -203,9 +228,6 @@ linking to tracking issues. 
## The little planning cycle (goals and tracking progress) -TODO: This is the most important part of the RFC mechanically and needs to be -clear so teams can just read it and follow the instructions. - The little cycle is where the solutions take shape and are carried out. They last one quarter - 3 months - and are the responsibility of individual teams. @@ -229,14 +251,14 @@ estimates. These estimates are used to place goals onto quarterly milestones. Not all the work items done by teams in a quarter should be considered a goal nor should they be. Goals only need to be granular enough to demonstrate consistent progress toward solving the project's problems. Work that -contributors toward quarterly goals should still be tracked as sub-tasks of +contribute toward quarterly goals should still be tracked as sub-tasks of those goals, but only needs to be filed on the issue tracker and not reported directly as goals on the roadmap. For each goal the teams will create an issue on the issue tracker tagged with -`R-goal`. Each goal must be described in a single sentence summary (TODO what -makes a good summary?). Goals with sub-goals and sub-tasks must list them in the -OP in a standard format. +`R-goal`. Each goal must be described in a single sentence summary with a +_deliverable_ that is as crisply stated as possible. Goals with sub-goals and +sub-tasks must list them in the OP in a standard format. 
During each planning period all goals must be triaged and updated for the following information: From a0701a2d649980c6f36d6bee6b1c82c45df3f5ee Mon Sep 17 00:00:00 2001 From: Brian Anderson Date: Mon, 22 Aug 2016 20:52:27 -0700 Subject: [PATCH 1097/1195] More edits to north star RFC --- text/0000-north-star.md | 178 ++++++++++++++++++++++++---------------- 1 file changed, 106 insertions(+), 72 deletions(-) diff --git a/text/0000-north-star.md b/text/0000-north-star.md index fd1e961cb74..8e61da076aa 100644 --- a/text/0000-north-star.md +++ b/text/0000-north-star.md @@ -14,15 +14,16 @@ Rust's roadmap will be established in year-long cycles, where we identify up front - together, as a project - the most critical problems facing the language and its ecosystem, along with the story we want to be able to tell the world about Rust. Work toward solving those problems, our short-term goals, will be -decided in quarter-long cycles by individual teams. Goals that result in stable -features will be assigned to release milestones for the purposes of reporting -the project roadmap. +decided in quarter-long cycles by individual teams. For the purposes of +reporting the project roadmap, goals will be assigned to quartely milestones, +and where these goals result in stable features the Rust version in which they +become stable will be estimated as well. At the end of the year we will deliver a public facing retrospective, describing the goals we achieved and how to use the new features in detail. It will -celebrate the year's progress in The Rust Project toward our goals, as well as -achievements in the wider community. It will celebrate our performance and -anticipate its impact on the coming year. +celebrate the year's progress toward our goals, as well as the achievements of +the wider community. It will evaluate our performance and anticipate its impact +on the coming year. 
The primary outcome for these changes to the process are that we will have a consistent way to: @@ -58,12 +59,12 @@ The downside, of course, is that a less singular focus can make it much harder to rally our efforts, to communicate a clear story - and ultimately, to ship. Since 1.0, we've attempted to lay out some major goals, both through the -[discuss forum] and the [blog]. We've done pretty well in actually achieving +[internals forum] and the [blog]. We've done pretty well in actually achieving these goals, and in some cases - particularly [MIR] - the community has really come together to produce amazing, focused results. But in general, there are several problems with the status quo: -[discuss forum]: https://internals.rust-lang.org/t/priorities-after-1-0/1901 +[internals forum]: https://internals.rust-lang.org/t/priorities-after-1-0/1901 [blog]: https://blog.rust-lang.org/2015/08/14/Next-year.html [MIR]: https://blog.rust-lang.org/2016/04/19/MIR.html @@ -83,13 +84,13 @@ several problems with the status quo: - We do not have any single point of release, like 1.0, that gathers together a large body of community work into a single, polished product. Instead, we have - a rapid release process, which is a huge boon for - [stability without stagnation] but can paradoxically reduce pressure to ship - in a timely fashion. **We should find a balance, retaining rapid release but + a rapid release process, which results in a [remarkably stable and reliable + product][s] but can paradoxically reduce pressure to ship new features in a + timely fashion. **We should find a balance, retaining rapid release but establishing some focal point around which to rally the community, polish a product, and establish a clear public narrative**. 
-[stability without stagnation]: http://blog.rust-lang.org/2014/10/30/Stability.html +[s]: http://blog.rust-lang.org/2014/10/30/Stability.html All told, there's a lot of room to do better in establishing, communicating, and driving the vision for Rust. @@ -115,53 +116,60 @@ the way we work today. Rust's roadmap will be established in year-long cycles, where we identify up front the most critical problems facing the project, formulated as _problem statements_. Work toward solving those problems, _goals_, will be planned in -quarter-long cycles by individual teams. _Goals_ that result in stable features -will be assigned to _release milestones_ for the purposes of reporting the -project roadmap. Along the way, teams will be expected to maintain _tracking -issues_ that communicate progress toward the project's goals. - -The end-of-year retrospective is a 'rallying point'. Its primary purposes are to -create anticipation of a major event in the Rust world, to motivate (rally) -contributors behind the goals we've established to get there, and generate a big -PR-bomb where we can brag to the world about what we've done. It can be thought -of as a 'state of the union'. This is where we tell Rust's story, describe the -new best practices enabled by the new features we've delivered, celebrate those -contributors who helped achieve our goals, honestly evaluate our performance, -and look forward to the year to come. +quarter-long cycles by individual teams. For the purposes of reporting the +project roadmap, goals will be assigned to _quartely milestones_, and where +these goals result in stable features the Rust version in which they become +stable will be estimated as well. Along the way, teams will be expected to +maintain _tracking issues_ that communicate progress toward the project's goals. + +At the end of the year we will deliver a public facing retrospective, which is +intended as a 'rallying point'. 
Its primary purposes are to create anticipation +of a major event in the Rust world, to motivate (rally) contributors behind the +goals we've established to get there, and generate a big PR-bomb where we can +brag to the world about what we've done. It can be thought of as a 'state of the +union'. This is where we tell Rust's story, describe the new best practices +enabled by the new features we've delivered, celebrate those contributors who +helped achieve our goals, honestly evaluate our performance, and look forward to +the year to come. ## Summary of terminology +Key terminology used in this RFC: + - _problem statement_ - A description of a major issue facing Rust, possibly - spanning multiple teams and disciplines. We decide these together every year + spanning multiple teams and disciplines. We decide these together, every year, so that everybody understands the direction the project is taking. These are - used as the broad basis for decision making throughout the year. + used as the broad basis for decision making throughout the year, and are + captured in the yearly "north star RFC", and tagged `R-problem-statement` + on the issue tracker. - _goal_ - These are set by individual teams quarterly, in service of solving the problems identified by the project. They have estimated deadlines, and those that result in stable features have estimated release numbers. Goals may - be subdivided into further discrete tasks on the issue tracker. + be subdivided into further discrete tasks on the issue tracker. They are + tagged `R-goal`. - _retrospective_ - At the end of the year we deliver a retrospective report. It presents the result of work toward each of our goals in a way that serves to reinforce the year's narrative. These are written for public consumption, showing off new features, surfacing interesting technical details, and - celebrating those contributors who contribute to achieving the project's goals - and resolving it's problems. 
+ celebrating those who contribute to achieving the project's goals and + resolving it's problems. - _quarterly milestone_ - All goals have estimates for completion, placed on quarterly milestones. Each quarter that a goal remains incomplete it must be re-triaged and re-estimated by the responsible team. -## The big planning cycle (problem statements and the narrative arc) +## The big planning cycle (problem statements and the north star RFC) The big cycle spans one year. At the beginning of the cycle we identify areas of Rust that need the most improvement, and at the end of the cycle is a 'rallying point' where we deliver to the world the results of our efforts. We choose year-long cycles because a year is enough time to accomplish relatively large goals; and because having the rallying point occur at the same time every year -makes it easy to know when to anticipate big news from the project. (Being +makes it easy to know when to anticipate big news from the project. Being calendar-based avoids the temptation to slip or produce feature-based releases, -instead providing a fixed point of accountability for shipping.) +instead providing a fixed point of accountability for shipping. This planning effort is _problem-oriented_. Focusing on "why" may seem like an obvious thing to do, but in practice it's very easy to become enamored of @@ -251,34 +259,49 @@ estimates. These estimates are used to place goals onto quarterly milestones. Not all the work items done by teams in a quarter should be considered a goal nor should they be. Goals only need to be granular enough to demonstrate consistent progress toward solving the project's problems. Work that -contribute toward quarterly goals should still be tracked as sub-tasks of +contributes toward quarterly goals should still be tracked as sub-tasks of those goals, but only needs to be filed on the issue tracker and not reported directly as goals on the roadmap. 
For each goal the teams will create an issue on the issue tracker tagged with -`R-goal`. Each goal must be described in a single sentence summary with a -_deliverable_ that is as crisply stated as possible. Goals with sub-goals and -sub-tasks must list them in the OP in a standard format. +`R-goal`. Each goal must be described in a single sentence summary with an +end-result or deliverable that is as crisply stated as possible. Goals with +sub-goals and sub-tasks must list them in the OP in a standard format. During each planning period all goals must be triaged and updated for the following information: - The set of sub-goals and sub-tasks and their status -- The estimated date of completion for goals +- The estimated date of completion ## The retrospective (rallying point) -- Written for broad public consumption -- Detailed -- Progress toward goals -- Demonstration of new features -- Technical details -- Reinforce the project narrative -- Celebrate contributors who accomplished our goals -- Celebrate the evolution of the ecosystem -- Evaluation of performance, missed goals - -TODO How is it constructed? +The retrospective is an opportunity to showcase the best of Rust and its +community to the world. + +It is a report covering all the Rust activity of the past year. It is written +for a broad audience: contributors, users and non-users alike. It reviews each +of the problems we tackled this year and the goals we achieved toward solving +them, and it highlights important work in the broader community and +ecosystem. For both these things the retrospective provides technical detail, as +though it were primary documentation; this is where we show our best side to the +world. It explains new features in depth, with clear prose and plentiful +examples, and it connects them all thematically, as a demonstration of how to +write cutting-edge Rust code. 
+ +While we are always lavish with our praise of contributors, the retrospective is +the best opportunity to celebrate specific individuals and their contributions +toward the strategic interests of the project, as defined way back at the +beginning of the year. + +Finally, the retrospective is an opportunity to evaluate our performance. Did we +make progress toward solving the problems we set out to solve? Did we outright +solve any of them? Where did we fail to meet our goals and how might we do +better next year? + +Since the retrospective must be a high-quality document, and cover a lot of +material, it is expected to require significant planning, editing and revision. +The details of how this will work are to be determined. ## Release estimation @@ -315,6 +338,8 @@ accessible form (for a prototype see [1]). [1]: https://brson.github.io/rust-z +Again, the details are to be determined. + ## Calendar The timing of the events specified by this RFC is precisely specified in order @@ -331,33 +356,42 @@ traditional conference season and allows us opportunities to talk publicly about both our previous years progress as well as next years ambitions. Following from the September planning month, the quarterly planning cycles take -place for exactly one week at the beginning of the calendar quarter; and the -development of the yearly retrospective approximately for the month of August. - -## Summary of mechanics - -There are four primary new mechanism introduced by this RFC - -- North star RFC. Each year in September the entire project comes together - to produce this. It is what drives the evolution of the project roadmap - over the next year. -- `R-problem-statement` tag. The north star RFC defines problem statements that - are filed and tagged on the issue tracker. The `R-problem-statement` issues in - turn link to the goals that support them. -- `R-goal`. 
Reevaluated every quarter by the teams, with feedback from - the wider community, these are filed on the issue tracker, tagged `R-goal` and - linked to the `R-problem-statement` issue they support. -- End-of-year retrospective blog post. In the final month we write a detailed - blog post that hypes up our amazing work. - -For simplicity, all `R-problem-statement` and `R-goal` issues live in -rust-lang/rust, even when they primarily entail work on other code-bases. +place for exactly one week at the beginning of the calendar quarter; likewise, +the planning for each subsequent quarter at the beginning of the calendar +quarter; and the development of the yearly retrospective approximately for the +month of August. + +## References + +- [Refining RFCs part 1: Roadmap] + (https://internals.rust-lang.org/t/refining-rfcs-part-1-roadmap/3656), + the internals.rust-lang.org thread that spawned this RFC. +- [Post-1.0 priorities thread on internals.rust-lang.org] + (https://internals.rust-lang.org/t/priorities-after-1-0/1901). +- [Post-1.0 blog post on project direction] + (https://blog.rust-lang.org/2015/08/14/Next-year.html). +- [Blog post on MIR] + (https://blog.rust-lang.org/2016/04/19/MIR.html), + a large success in strategic community collaboration. +- ["Stability without stagnation"] + (http://blog.rust-lang.org/2014/10/30/Stability.html), + outlining Rust's philosophy on rapid iteration while maintaining strong + stability guarantees. +- [The 2016 state of Rust survey] + (https://blog.rust-lang.org/2016/06/30/State-of-Rust-Survey-2016.html), + which indicates promising directions for future work. +- [Production user outreach thread on internals.rust-lang.org] + (https://internals.rust-lang.org/t/production-user-research-summary/2530), + another strong indicator of Rust's needs. +- [rust-z] + (https://brson.github.io/rust-z), + a prototype tool to organize the roadmap. # Drawbacks [drawbacks]: #drawbacks -The yearly north star RFC could be an unpleast bikeshed. 
Maybe nobody actually -agrees on the project's direction. +The yearly north star RFC could be an unpleastant bikeshed. Maybe nobody +actually agrees on the project's direction. This imposes more work on teams to organize their goals. From c3d4f6d20d509638aa8b9e35b5631f9970324ecf Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Mon, 22 Aug 2016 21:35:01 -0700 Subject: [PATCH 1098/1195] Aturon's edits --- text/0000-north-star.md | 86 ++++++++++++++++++++++------------------- 1 file changed, 46 insertions(+), 40 deletions(-) diff --git a/text/0000-north-star.md b/text/0000-north-star.md index 8e61da076aa..d66717567c0 100644 --- a/text/0000-north-star.md +++ b/text/0000-north-star.md @@ -310,10 +310,11 @@ complete their work, but possibly the single most important piece of information desired by users is to know _in what release_ any given feature will become available. -To reduce process burden on team members we will not require them to make -that estimate themselves, instead a single person will have the responsibility -each quarter to examine the roadmap, its goals and time estimates, and turn -those into release estimates for individual features. +To reduce process burden on team members we will not require them to make that +estimate themselves; the teams will work purely in terms of quarterly +milestones. Instead, we will have a separate process to map the goals and time +estimates into release estimates for individual features - a process that is +likely automatable. The precise mechanics are to be determined. @@ -326,26 +327,28 @@ main ways, that evolve over the year: place - The R-problem-statement issues, which contain the individual problem statements, each linking to supporting goals -- The R-goal issues, which contain the work items, tagged with metadata - indicating their statuses. +- The R-goal issues, which contain a hierarchy of work items, tagged with + metadata indicating their statuses. -Alone, this is perhaps sufficient for presenting the roadmap. 
A user could run a +Alone, these provide the *raw data* for a roadmap. A user could run a GitHub query for all `R-problem-statement` issues, and by digging through them get a reasonably accurate picture of the roadmap. -We may additionally develop tools to present this information in a more -accessible form (for a prototype see [1]). +However, for the process to be a success, we need to present the roadmap in a +way that is prominent, succinct, and layered with progressive detail. There is a +lot of opportunity for design here; an early prototype of one possible view is +available [here]. -[1]: https://brson.github.io/rust-z +[here]: https://brson.github.io/rust-z Again, the details are to be determined. ## Calendar The timing of the events specified by this RFC is precisely specified in order -to limit bikeshedding. The activities specified here are not the focus of the -project and we need to get through them efficiently and get on with the actual -work. +to set clear expectations and accountability, and to avoid process slippage. The +activities specified here are not the focus of the project and we need to get +through them efficiently and get on with the actual work. The north star RFC development happens during the month of September, starting September 1 and ending by October 1. This means that an RFC must be ready for @@ -353,7 +356,8 @@ RFC by the last week of September. We choose september for two reasons: it is the final month of a calendar quarter, allowing the beginning of the years work to commence at the beginning of calendar Q4; we choose Q4 because it is the traditional conference season and allows us opportunities to talk publicly about -both our previous years progress as well as next years ambitions. +both our previous years progress as well as next years ambitions. By contrast, +starting with Q1 of the calendar year is problematic due to the holiday season. 
Following from the September planning month, the quarterly planning cycles take place for exactly one week at the beginning of the calendar quarter; likewise, @@ -361,6 +365,9 @@ the planning for each subsequent quarter at the beginning of the calendar quarter; and the development of the yearly retrospective approximately for the month of August. +The survey and other forms of outreach and data gathering should be timed to fit +well into the overall calendar. + ## References - [Refining RFCs part 1: Roadmap] @@ -390,15 +397,19 @@ month of August. # Drawbacks [drawbacks]: #drawbacks -The yearly north star RFC could be an unpleastant bikeshed. Maybe nobody -actually agrees on the project's direction. - -This imposes more work on teams to organize their goals. +The yearly north star RFC could be an unpleasant bikeshed, because it +simultaneously raises the stakes of discussion while moving away from concrete +proposals. That said, the *problem* orientation should help facilitate +discussion, and in any case it's vital to be explicit about our values and +prioritization. -There is no mechanism here for presenting the roadmap. +While part of the aim of this proposal is to increase the effectiveness of our +team, it also imposes some amount of additional work on everyone. Hopefully the +benefits will outweigh the costs. The end-of-year retrospective will require significant effort. It's not clear -who will be motivated to do it, and at the level of quality it demands. +who will be motivated to do it, and at the level of quality it demands. This is +the piece of the proposal that will probably need the most follow-up work. # Alternatives [alternatives]: #alternatives @@ -409,7 +420,7 @@ derive a roadmap soley from the data they are currently producing. To serve the purposes of a 'rallying point', a high-profile deliverable, we might release a software product instead of the retrospective. 
A larger-scope product than the existing rustc+cargo pair could accomplish this, i.e. -The Rust Platform. +[The Rust Platform](http://aturon.github.io/blog/2016/07/27/rust-platform/) idea. Another rallying point could be a long-term support release. @@ -421,39 +432,34 @@ Are 1 year cycles long enough? Does the yearly report serve the purpose of building anticipation, motivation, and creating a compelling PR-bomb? -Is a consistent time-frame for the big cycle really the right thing? One of the +Is a consistent time-frame for the big cycle really the right thing? One of the problems we have right now is that our release cycles are so predictable they -are boring. It could be more exciting to not know exactly when the cycle is -going to end, to experience the tension of struggling to cross the finish line. +are almost boring. It could be more exciting to not know exactly when the cycle +is going to end, to experience the tension of struggling to cross the finish +line. How can we account for work that is not part of the planning process described here? -How can we avoid adding new tags? - How do we address problems that are outside the scope of the standard library -and compiler itself? Would have used 'the rust platform' and related processes. +and compiler itself? (See +[The Rust Platform](http://aturon.github.io/blog/2016/07/27/rust-platform/) for +an alternative aimed at this goal.) -How do we motivate the improvement of rust-lang, other libraries? +How do we motivate the improvement of rust-lang crates and other libraries? Are +they part of the planning process? The retrospective? 'Problem statement' is not inspiring terminology. We don't want to our roadmap -to be front-loaded with 'problems'. - -Likewise, 'goal' and 'retrospective' could be more colorful. - -How can we work in an inspiring 'vision statement'? +to be front-loaded with 'problems'. Likewise, 'goal' and 'retrospective' could +be more colorful. Can we call the yearly RFC the 'north start RFC'? 
Too many concepts? -Does the yearly planning really need to be an RFC? - -Likewise, _this RFC_ is currently titled 'north-star'. - -What about tracking work that is not part of R-problem-statement and R-goal. I +What about tracking work that is not part of R-problem-statement and R-goal? I originally wanted to track all features in a roadmap, but this does not account for anything that has not been explicitly identified as supporting the -roadmap. As formulated this does not provide an easy way to find the status of -arbitrary features in the RFC pipeline. +roadmap. As formulated this proposal does not provide an easy way to find the +status of arbitrary features in the RFC pipeline. How do we present the roadmap? Communicating what the project is working on and toward is one of the _primary goals_ of this RFC and the solution it proposes is From 96a089ae1dde0bc804a98c6158d7dd7b7d195b33 Mon Sep 17 00:00:00 2001 From: Brian Anderson Date: Tue, 23 Aug 2016 15:21:39 -0700 Subject: [PATCH 1099/1195] Fix typos --- text/0000-north-star.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/text/0000-north-star.md b/text/0000-north-star.md index d66717567c0..4e06b755350 100644 --- a/text/0000-north-star.md +++ b/text/0000-north-star.md @@ -15,7 +15,7 @@ front - together, as a project - the most critical problems facing the language and its ecosystem, along with the story we want to be able to tell the world about Rust. Work toward solving those problems, our short-term goals, will be decided in quarter-long cycles by individual teams. For the purposes of -reporting the project roadmap, goals will be assigned to quartely milestones, +reporting the project roadmap, goals will be assigned to quarterly milestones, and where these goals result in stable features the Rust version in which they become stable will be estimated as well. @@ -102,11 +102,11 @@ effort to the world. 
The changes proposed here are intended to work with the particular strengths of our project - community development, collaboration, distributed teams, loose -management structure, constant change and uncertanty. It should introduce +management structure, constant change and uncertainty. It should introduce minimal additional burden on Rust team members, who are already heavily overtasked. The proposal does not attempt to solve all problems of project management in Rust, nor to fit the Rust process into any particular project -mamnagement structure. Let's make a few incremental improvements that will have +management structure. Let's make a few incremental improvements that will have the greatest impact, and that we can accomplish without disruptive changes to the way we work today. @@ -117,7 +117,7 @@ Rust's roadmap will be established in year-long cycles, where we identify up front the most critical problems facing the project, formulated as _problem statements_. Work toward solving those problems, _goals_, will be planned in quarter-long cycles by individual teams. For the purposes of reporting the -project roadmap, goals will be assigned to _quartely milestones_, and where +project roadmap, goals will be assigned to _quarterly milestones_, and where these goals result in stable features the Rust version in which they become stable will be estimated as well. Along the way, teams will be expected to maintain _tracking issues_ that communicate progress toward the project's goals. 
@@ -207,7 +207,7 @@ good problem statements might be: - The Rust compiler is too slow for a tight edit-compile-test cycle - Rust lacks world-class IDE support - The Rust story for asynchronous I/O is very primitive -- Rust compiler errors are dificult to understand +- Rust compiler errors are difficult to understand - Rust plugins have no clear path to stabilization - Rust doesn't integrate well with garbage collectors - Rust's trait system doesn't fully support zero-cost abstractions @@ -352,7 +352,7 @@ through them efficiently and get on with the actual work. The north star RFC development happens during the month of September, starting September 1 and ending by October 1. This means that an RFC must be ready for -RFC by the last week of September. We choose september for two reasons: it is +FCP by the last week of September. We choose September for two reasons: it is the final month of a calendar quarter, allowing the beginning of the years work to commence at the beginning of calendar Q4; we choose Q4 because it is the traditional conference season and allows us opportunities to talk publicly about @@ -415,7 +415,7 @@ the piece of the proposal that will probably need the most follow-up work. [alternatives]: #alternatives Instead of imposing further process structure on teams we might attempt to -derive a roadmap soley from the data they are currently producing. +derive a roadmap solely from the data they are currently producing. To serve the purposes of a 'rallying point', a high-profile deliverable, we might release a software product instead of the retrospective. A larger-scope @@ -453,7 +453,7 @@ they part of the planning process? The retrospective? to be front-loaded with 'problems'. Likewise, 'goal' and 'retrospective' could be more colorful. -Can we call the yearly RFC the 'north start RFC'? Too many concepts? +Can we call the yearly RFC the 'north star RFC'? Too many concepts? What about tracking work that is not part of R-problem-statement and R-goal? 
I originally wanted to track all features in a roadmap, but this does not account From 0a31e739ea3bf63f231c6f8ae1e763d4fb7e544c Mon Sep 17 00:00:00 2001 From: Brian Anderson Date: Tue, 23 Aug 2016 15:39:11 -0700 Subject: [PATCH 1100/1195] Add another unresolved question --- text/0000-north-star.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/text/0000-north-star.md b/text/0000-north-star.md index 4e06b755350..049cdfecc72 100644 --- a/text/0000-north-star.md +++ b/text/0000-north-star.md @@ -429,6 +429,9 @@ Another rallying point could be a long-term support release. Are 1 year cycles long enough? +Are 1 year cycles too long? What happens if important problems come up +mid-cycle? + Does the yearly report serve the purpose of building anticipation, motivation, and creating a compelling PR-bomb? From 2832657eba3489e751e0e3917daa870b1015c82a Mon Sep 17 00:00:00 2001 From: Brian Anderson Date: Tue, 23 Aug 2016 16:12:43 -0700 Subject: [PATCH 1101/1195] 'why' -> 'how' --- text/0000-north-star.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-north-star.md b/text/0000-north-star.md index 049cdfecc72..bb528d084ec 100644 --- a/text/0000-north-star.md +++ b/text/0000-north-star.md @@ -171,7 +171,7 @@ makes it easy to know when to anticipate big news from the project. Being calendar-based avoids the temptation to slip or produce feature-based releases, instead providing a fixed point of accountability for shipping. -This planning effort is _problem-oriented_. Focusing on "why" may seem like an +This planning effort is _problem-oriented_. Focusing on "how" may seem like an obvious thing to do, but in practice it's very easy to become enamored of particular technical ideas and lose sight of the larger context. 
By codifying a top-level focus on motivation, we ensure we are focusing on the right problems From 1d01c9862be11b5cd3867281fe2404cf7d27f873 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Thu, 25 Aug 2016 17:04:02 -0700 Subject: [PATCH 1102/1195] Remove list of active RFCs from README Very rarely updated, not accurate. --- README.md | 70 ------------------------------------------------------- 1 file changed, 70 deletions(-) diff --git a/README.md b/README.md index 56d1925528b..75d926f318f 100644 --- a/README.md +++ b/README.md @@ -1,8 +1,6 @@ # Rust RFCs [Rust RFCs]: #rust-rfcs -(jump forward to: [Table of Contents], [Active RFC List]) - Many changes, including bug fixes and documentation improvements can be implemented and reviewed via the normal GitHub pull request workflow. @@ -15,77 +13,9 @@ consistent and controlled path for new features to enter the language and standard libraries, so that all stakeholders can be confident about the direction the language is evolving in. - -## Active RFC List -[Active RFC List]: #active-rfc-list - -* [0016-more-attributes.md](text/0016-more-attributes.md) -* [0019-opt-in-builtin-traits.md](text/0019-opt-in-builtin-traits.md) -* [0066-better-temporary-lifetimes.md](text/0066-better-temporary-lifetimes.md) -* [0107-pattern-guards-with-bind-by-move.md](text/0107-pattern-guards-with-bind-by-move.md) -* [0135-where.md](text/0135-where.md) -* [0213-defaulted-type-params.md](text/0213-defaulted-type-params.md) -* [0243-trait-based-exception-handling.md](text/0243-trait-based-exception-handling.md) -* [0401-coercions.md](text/0401-coercions.md) -* [0495-array-pattern-changes.md](text/0495-array-pattern-changes.md) -* [0501-consistent_no_prelude_attributes.md](text/0501-consistent_no_prelude_attributes.md) -* [0639-discriminant-intrinsic.md](text/0639-discriminant-intrinsic.md) -* [0803-type-ascription.md](text/0803-type-ascription.md) -* [0809-box-and-in-for-stdlib.md](text/0809-box-and-in-for-stdlib.md) -* 
[0873-type-macros.md](text/0873-type-macros.md) -* [0911-const-fn.md](text/0911-const-fn.md) -* [0982-dst-coercion.md](text/0982-dst-coercion.md) -* [1131-likely-intrinsic.md](text/1131-likely-intrinsic.md) -* [1183-swap-out-jemalloc.md](text/1183-swap-out-jemalloc.md) -* [1192-inclusive-ranges.md](text/1192-inclusive-ranges.md) -* [1199-simd-infrastructure.md](text/1199-simd-infrastructure.md) -* [1201-naked-fns.md](text/1201-naked-fns.md) -* [1210-impl-specialization.md](text/1210-impl-specialization.md) -* [1211-mir.md](text/1211-mir.md) -* [1214-projections-lifetimes-and-wf.md](text/1214-projections-lifetimes-and-wf.md) -* [1216-bang-type.md](text/1216-bang-type.md) -* [1228-placement-left-arrow.md](text/1228-placement-left-arrow.md) -* [1229-compile-time-asserts.md](text/1229-compile-time-asserts.md) -* [1238-nonparametric-dropck.md](text/1238-nonparametric-dropck.md) -* [1240-repr-packed-unsafe-ref.md](text/1240-repr-packed-unsafe-ref.md) -* [1260-main-reexport.md](text/1260-main-reexport.md) -* [1268-allow-overlapping-impls-on-marker-traits.md](text/1268-allow-overlapping-impls-on-marker-traits.md) -* [1298-incremental-compilation.md](text/1298-incremental-compilation.md) -* [1317-ide.md](text/1317-ide.md) -* [1327-dropck-param-eyepatch.md](text/1327-dropck-param-eyepatch.md) -* [1331-grammar-is-canonical.md](text/1331-grammar-is-canonical.md) -* [1358-repr-align.md](text/1358-repr-align.md) -* [1359-process-ext-unix.md](text/1359-process-ext-unix.md) -* [1398-kinds-of-allocators.md](text/1398-kinds-of-allocators.md) -* [1399-repr-pack.md](text/1399-repr-pack.md) -* [1422-pub-restricted.md](text/1422-pub-restricted.md) -* [1432-replace-slice.md](text/1432-replace-slice.md) -* [1434-contains-method-for-ranges.md](text/1434-contains-method-for-ranges.md) -* [1440-drop-types-in-const.md](text/1440-drop-types-in-const.md) -* [1444-union.md](text/1444-union.md) -* [1445-restrict-constants-in-patterns.md](text/1445-restrict-constants-in-patterns.md) -* 
[1492-dotdot-in-patterns.md](text/1492-dotdot-in-patterns.md) -* [1498-ipv6addr-octets.md](text/1498-ipv6addr-octets.md) -* [1504-int128.md](text/1504-int128.md) -* [1513-less-unwinding.md](text/1513-less-unwinding.md) -* [1522-conservative-impl-trait.md](text/1522-conservative-impl-trait.md) -* [1535-stable-overflow-checks.md](text/1535-stable-overflow-checks.md) -* [1542-try-from.md](text/1542-try-from.md) -* [1543-integer_atomics.md](text/1543-integer_atomics.md) -* [1548-global-asm.md](text/1548-global-asm.md) -* [1552-contains-method-for-various-collections.md](text/1552-contains-method-for-various-collections.md) -* [1559-attributes-with-literals.md](text/1559-attributes-with-literals.md) -* [1560-name-resolution.md](text/1560-name-resolution.md) -* [1590-macro-lifetimes.md](text/1590-macro-lifetimes.md) -* [1644-default-and-expanded-rustc-errors.md](text/1644-default-and-expanded-rustc-errors.md) -* [1653-assert_ne.md](text/1653-assert_ne.md) -* [1660-try-borrow.md](text/1660-try-borrow.md) - - ## Table of Contents [Table of Contents]: #table-of-contents * [Opening](#rust-rfcs) -* [Active RFC List] * [Table of Contents] * [When you need to follow this process] * [Before creating an RFC] From 735549875cc0b87f30340cdc8c60237987592f23 Mon Sep 17 00:00:00 2001 From: Brian Anderson Date: Thu, 1 Sep 2016 15:46:29 -0700 Subject: [PATCH 1103/1195] Update north star RFC - Use 6-week cycles - Don't require time estimates - Do require release cycle milestone estimates - Add language about amending the north star RFC - Require 6-week status reports from teams --- text/0000-north-star.md | 147 +++++++++++++++++++++------------------- 1 file changed, 77 insertions(+), 70 deletions(-) diff --git a/text/0000-north-star.md b/text/0000-north-star.md index bb528d084ec..efde7dbba3e 100644 --- a/text/0000-north-star.md +++ b/text/0000-north-star.md @@ -14,10 +14,9 @@ Rust's roadmap will be established in year-long cycles, where we identify up front - together, as a project - the 
most critical problems facing the language and its ecosystem, along with the story we want to be able to tell the world about Rust. Work toward solving those problems, our short-term goals, will be -decided in quarter-long cycles by individual teams. For the purposes of -reporting the project roadmap, goals will be assigned to quarterly milestones, -and where these goals result in stable features the Rust version in which they -become stable will be estimated as well. +decided by the individual teams, as they see fit, and regularly re-triaged. For +the purposes of reporting the project roadmap, goals will be assigned to release +cycle milestones. At the end of the year we will deliver a public facing retrospective, describing the goals we achieved and how to use the new features in detail. It will @@ -115,12 +114,12 @@ the way we work today. Rust's roadmap will be established in year-long cycles, where we identify up front the most critical problems facing the project, formulated as _problem -statements_. Work toward solving those problems, _goals_, will be planned in -quarter-long cycles by individual teams. For the purposes of reporting the -project roadmap, goals will be assigned to _quarterly milestones_, and where -these goals result in stable features the Rust version in which they become -stable will be estimated as well. Along the way, teams will be expected to -maintain _tracking issues_ that communicate progress toward the project's goals. +statements_. Work toward solving those problems, _goals_, will be planned as +part of the release cycles by individual teams. For the purposes of reporting +the project roadmap, goals will be assigned to _release cycle milestones_, which +represent the primary work performed each release cycle. Along the way, teams +will be expected to maintain _tracking issues_ that communicate progress toward +the project's goals. 
At the end of the year we will deliver a public facing retrospective, which is intended as a 'rallying point'. Its primary purposes are to create anticipation @@ -156,20 +155,25 @@ Key terminology used in this RFC: celebrating those who contribute to achieving the project's goals and resolving it's problems. -- _quarterly milestone_ - All goals have estimates for completion, placed on - quarterly milestones. Each quarter that a goal remains incomplete it must be - re-triaged and re-estimated by the responsible team. - -## The big planning cycle (problem statements and the north star RFC) - -The big cycle spans one year. At the beginning of the cycle we identify areas of -Rust that need the most improvement, and at the end of the cycle is a 'rallying -point' where we deliver to the world the results of our efforts. We choose -year-long cycles because a year is enough time to accomplish relatively large -goals; and because having the rallying point occur at the same time every year -makes it easy to know when to anticipate big news from the project. Being -calendar-based avoids the temptation to slip or produce feature-based releases, -instead providing a fixed point of accountability for shipping. +- _release cycle milestone_ - All goals have estimates for completion, placed on + milestones that correspond to the 6 week release cycle. These milestones are + timed to correspond to a release cycle, but don't represent a specific + release. That is, work toward the current nightly, the current beta, or even + that doesn't directly impact a specific release, all goes into the release + cycle milestone corresponding to the time period in which the work is + completed. + +## Problem statements and the north star RFC + +The full planning cycle spans one year. At the beginning of the cycle we +identify areas of Rust that need the most improvement, and at the end of the +cycle is a 'rallying point' where we deliver to the world the results of our +efforts.
We choose year-long cycles because a year is enough time to accomplish +relatively large goals; and because having the rallying point occur at the same +time every year makes it easy to know when to anticipate big news from the +project. Being calendar-based avoids the temptation to slip or produce +feature-based releases, instead providing a fixed point of accountability for +shipping. This planning effort is _problem-oriented_. Focusing on "how" may seem like an obvious thing to do, but in practice it's very easy to become enamored of @@ -234,45 +238,63 @@ the rust-lang/rust issue tracker and tagged `R-problem-statement`. In the OP of each metabug the teams are responsible for maintaining a list of their goals, linking to tracking issues. -## The little planning cycle (goals and tracking progress) - -The little cycle is where the solutions take shape and are carried out. They -last one quarter - 3 months - and are the responsibility of individual teams. - -Each cycle the teams will have one week to update their set of _goals_. This -includes both creating new goals and reviewing and revising existing goals. A -goal describes a task that contributes to solving the year's problems. It may or -may not involve a concrete deliverable, and it may be in turn subdivided into -further goals. - -The social process of the quarterly planning cycle is less strict, but it -should be conducted in a way that allows open feedback. It is suggested that -teams present their quarterly plan on internals.rust-lang.org at the beginning -of the week, solicit feedback, then finalize them at the end of the week. - -All goals have estimated completion dates. There is no limit on the duration of -a single goal, but they are encouraged to be scoped to less than a quarter year -of work. Goals that are expected to take more than a quarter _must_ be -subdivided into smaller goals of less than a quarter, each with their own -estimates. 
These estimates are used to place goals onto quarterly milestones. - -Not all the work items done by teams in a quarter should be considered a goal -nor should they be. Goals only need to be granular enough to demonstrate -consistent progress toward solving the project's problems. Work that -contributes toward quarterly goals should still be tracked as sub-tasks of -those goals, but only needs to be filed on the issue tracker and not reported -directly as goals on the roadmap. +Like other RFCs, the north star RFC is not immutable, and if new motivations +arise during the year, it may be amended, even to the extent of adding +additional problem statements; though it is not appropriate for the project +to continually rehash the RFC. + +## Goal setting and tracking progress + +During the regular 6-week release cycles is where the solutions take shape and +are carried out. Each cycle teams are expected to set concrete _goals_ that work +toward solving the project's stated problems; and to review and revise their +previous goals. The exact forum and mechanism for doing this evaluation and +goal-setting is left to the individual teams, and to future experimentation, +but the end result is that each release cycle each team will document their +goals and progress in a standard format. + +A goal describes a task that contributes to solving the year's problems. It may +or may not involve a concrete deliverable, and it may be in turn subdivided into +further goals. Not all the work items done by teams in a quarter should be +considered a goal. Goals only need to be granular enough to demonstrate +consistent progress toward solving the project's problems. Work that contributes +toward quarterly goals should still be tracked as sub-tasks of those goals, but +only needs to be filed on the issue tracker and not reported directly as goals +on the roadmap. For each goal the teams will create an issue on the issue tracker tagged with `R-goal`. 
Each goal must be described in a single sentence summary with an end-result or deliverable that is as crisply stated as possible. Goals with sub-goals and sub-tasks must list them in the OP in a standard format. -During each planning period all goals must be triaged and updated for the -following information: +During each cycle all `R-goal` and `R-unstable` issues assigned to each team +must be triaged and updated for the following information: - The set of sub-goals and sub-tasks and their status -- The estimated date of completion +- The release cycle milestone + +Goals that will be likely completed in this cycle or the next should be assigned +to the appropriate milestone. Some goals may be expected to be completed in +the distant future, and these do not need to be assigned a milestone. + +The release cycle milestone corresponds to a six week period of time and +contains the work done during that time. It does not correspond to a specific +release, nor do the goals assigned to it need to result in a stable feature +landing in any specific release. + +Release cycle milestones serve multiple purposes, not just tracking of the goals +defined in this RFC: `R-goal` tracking, tracking of stabilization of +`R-unstable` and `R-RFC-approved` features, tracking of critical bug fixes. + +Though the release cycle milestones are time-oriented and are not strictly tied +to a single upcoming release, from the set of assigned `R-unstable` issues one +can derive the new features landing in upcoming releases. + +During the last week of every release cycle each team will write a brief +report summarizing their goal progress for the cycle. Some project member +will compile all the team reports and post them to internals.rust-lang.org. +In addition to providing visibility into progress, these will be sources +to draw from for the subsequent release announcements.
## The retrospective (rallying point) @@ -303,21 +325,6 @@ Since the retrospective must be a high-quality document, and cover a lot of material, it is expected to require significant planning, editing and revision. The details of how this will work are to be determined. -## Release estimation - -The teams are responsible for estimating only the _timeframe_ in which they -complete their work, but possibly the single most important piece of information -desired by users is to know _in what release_ any given feature will become -available. - -To reduce process burden on team members we will not require them to make that -estimate themselves; the teams will work purely in terms of quarterly -milestones. Instead, we will have a separate process to map the goals and time -estimates into release estimates for individual features - a process that is -likely automatable. - -The precise mechanics are to be determined. - ## Presenting the roadmap As a result of this process the Rust roadmap for the year is encoded in three From bd058432f4eb948f60136c171ac87ff787d3ece3 Mon Sep 17 00:00:00 2001 From: Scott Olson Date: Mon, 5 Sep 2016 19:30:47 -0500 Subject: [PATCH 1104/1195] Delete stray quote char. --- text/1607-style-rfcs.md | 1 - 1 file changed, 1 deletion(-) diff --git a/text/1607-style-rfcs.md b/text/1607-style-rfcs.md index 4d83a40a0e0..deefc3caf4f 100644 --- a/text/1607-style-rfcs.md +++ b/text/1607-style-rfcs.md @@ -208,7 +208,6 @@ let y = Foo { c: 1000 }; ``` -" (Note this is just an example, not a proposed guideline). From 59620914006dccdd77368db453f6831fd225b829 Mon Sep 17 00:00:00 2001 From: Scott Olson Date: Tue, 6 Sep 2016 04:12:05 -0500 Subject: [PATCH 1105/1195] Block quote the entire style guideline example. Fixes my mistake from #1741. This also renders more obviously as a separate quotation. r? 
@eddyb --- text/1607-style-rfcs.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/text/1607-style-rfcs.md b/text/1607-style-rfcs.md index deefc3caf4f..162a580ce09 100644 --- a/text/1607-style-rfcs.md +++ b/text/1607-style-rfcs.md @@ -192,7 +192,9 @@ Guidelines may include more than one acceptable rule, but should offer guidance for when to use each rule (which should be formal enough to be used by a tool). -For example: "a struct literal must be formatted either on a single line (with +For example: + +> A struct literal must be formatted either on a single line (with spaces after the opening brace and before the closing brace, and with fields separated by commas and spaces), or on multiple lines (with one field per line and newlines after the opening brace and before the closing brace). The former @@ -200,7 +202,7 @@ approach should be used for short struct literals, the latter for longer struct literals. For tools, the first approach should be used when the width of the fields (excluding commas and braces) is 16 characters. E.g., -``` +> ```rust let x = Foo { a: 42, b: 34 }; let y = Foo { a: 42, From b7ad6f13e9dce08182984af2ef5b0f0f3c77347a Mon Sep 17 00:00:00 2001 From: Aidan Hobson Sayers Date: Fri, 19 Aug 2016 16:05:33 +0100 Subject: [PATCH 1106/1195] Lifetimes in arguments may be made longer --- text/0738-variance.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/text/0738-variance.md b/text/0738-variance.md index 15e046353e7..c4b70f44140 100644 --- a/text/0738-variance.md +++ b/text/0738-variance.md @@ -155,9 +155,8 @@ know that data with a shorter lifetime has been inserted. (This is traditionally called "invariant".) Finally, there can be cases where it is ok to make a lifetime -*longer*, but not shorter. This comes up when the lifetime is used in -a function return type (and only a fn return type). This is very -unusual in Rust but it can happen. +*longer*, but not shorter. 
This comes up (for example) in a type like +`fn(&'a u8)`, which may be safely treated as a `fn(&'static u8)`. [v]: http://en.wikipedia.org/wiki/Covariance_and_contravariance_%28computer_science%29 From 7bfa935f28588a801ad317696af7ba09038aa193 Mon Sep 17 00:00:00 2001 From: Markus Unterwaditzer Date: Sun, 11 Sep 2016 23:10:29 +0200 Subject: [PATCH 1107/1195] Fix a typo in RFC #1681 --- text/1681-macros-1.1.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/1681-macros-1.1.md b/text/1681-macros-1.1.md index 9a50c591897..fd9b18a54eb 100644 --- a/text/1681-macros-1.1.md +++ b/text/1681-macros-1.1.md @@ -27,7 +27,7 @@ been enough to push the nightly users to stable as well. These large projects, however, are often the face of Rust to external users. Common knowledge is that fast serialization is done using serde, but to others -this just sounds likes "fast Rust needs nightly". Over time this persistent +this just sounds like "fast Rust needs nightly". Over time this persistent thought process creates a culture of "well to be serious you require nightly" and a general feeling that Rust is not "production ready". From 058d72137fa235fec290ba5e776e0af4bc90a2e6 Mon Sep 17 00:00:00 2001 From: Markus Unterwaditzer Date: Sun, 11 Sep 2016 23:14:43 +0200 Subject: [PATCH 1108/1195] Fix a duplicated word --- text/1681-macros-1.1.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/1681-macros-1.1.md b/text/1681-macros-1.1.md index fd9b18a54eb..0a0aa483f9b 100644 --- a/text/1681-macros-1.1.md +++ b/text/1681-macros-1.1.md @@ -207,8 +207,8 @@ pub fn double(input: TokenStream) -> TokenStream { let source = input.to_string(); // Parse `source` for struct/enum declaration, and then build up some new - // source code representing representing a number of items in the - // implementation of the `Double` trait for the struct/enum in question. 
+ // source code representing a number of items in the implementation of + // the `Double` trait for the struct/enum in question. let source = derive_double(&source); // Parse this back to a token stream and return it From 3654298b647cdea7772e3bec491e25d04e3cddf8 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 12 Sep 2016 15:16:27 -0700 Subject: [PATCH 1109/1195] RFC 1620 is Regex 1.0 --- text/{0000-regex-1.0.md => 1620-regex-1.0.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-regex-1.0.md => 1620-regex-1.0.md} (99%) diff --git a/text/0000-regex-1.0.md b/text/1620-regex-1.0.md similarity index 99% rename from text/0000-regex-1.0.md rename to text/1620-regex-1.0.md index 8feba05155a..a793d1da593 100644 --- a/text/0000-regex-1.0.md +++ b/text/1620-regex-1.0.md @@ -1,7 +1,7 @@ - Feature Name: regex-1.0 - Start Date: 2016-05-11 -- RFC PR: -- Rust Issue: +- RFC PR: [rust-lang/rfcs#1620](https://github.com/rust-lang/rfcs/pull/1620) +- Rust Issue: N/A # Table of contents From c7edf9b626206b09a3ff7acf8fa4e2c2614cf6e9 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Thu, 15 Sep 2016 08:53:58 -0700 Subject: [PATCH 1110/1195] RFC 1696 is `mem::discriminant` --- text/{0000-discriminant.md => 1696-discriminant.md} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename text/{0000-discriminant.md => 1696-discriminant.md} (99%) diff --git a/text/0000-discriminant.md b/text/1696-discriminant.md similarity index 99% rename from text/0000-discriminant.md rename to text/1696-discriminant.md index 4d1ccdaed0b..437f1d1c72b 100644 --- a/text/0000-discriminant.md +++ b/text/1696-discriminant.md @@ -1,6 +1,6 @@ - Feature Name: discriminant - Start Date: 2016-08-01 -- RFC PR: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1696 - Rust Issue: [#24263](https://github.com/rust-lang/rust/pull/24263), [#34785](https://github.com/rust-lang/rust/pull/34785) # Summary From 5b785bc3426accdbec863454ca218e29617e9839 Mon Sep 17 00:00:00 
2001 From: Alex Burka Date: Thu, 15 Sep 2016 23:05:47 -0400 Subject: [PATCH 1111/1195] fix typo in RFC 1696 During discussion we decided to remove the Reflect bound, and I removed it from one part of the text but not another. Further discussion ensued, but we never decided to put Reflect back in. --- text/1696-discriminant.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/1696-discriminant.md b/text/1696-discriminant.md index 437f1d1c72b..e0d888fc73b 100644 --- a/text/1696-discriminant.md +++ b/text/1696-discriminant.md @@ -20,7 +20,7 @@ The motivation for this is mostly identical to [RFC 639](https://github.com/rust The proposed design has been implemented at [#34785](https://github.com/rust-lang/rust/pull/34785) (after some back-and-forth). That implementation is copied at the end of this section for reference. -A struct `Discriminant` and a free function `fn discriminant(v: &T) -> Discriminant` are added to `std::mem` (for lack of a better home, and noting that `std::mem` already contains similar parametricity escape hatches such as `size_of`). For now, the `Discriminant` struct is simply a newtype over `u64`, because that's what the `discriminant_value` intrinsic returns, and a `PhantomData` to allow it to be generic over `T`. +A struct `Discriminant` and a free function `fn discriminant(v: &T) -> Discriminant` are added to `std::mem` (for lack of a better home, and noting that `std::mem` already contains similar parametricity escape hatches such as `size_of`). For now, the `Discriminant` struct is simply a newtype over `u64`, because that's what the `discriminant_value` intrinsic returns, and a `PhantomData` to allow it to be generic over `T`. Making `Discriminant` generic provides several benefits: From d04dfa6dc1667fa5d580a9d8bbca9a80c639b617 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Tue, 20 Sep 2016 20:06:02 -0400 Subject: [PATCH 1112/1195] Address concerns about documentation infrastructure. 
--- text/0000-document_all_features.md | 23 ++--------------------- 1 file changed, 2 insertions(+), 21 deletions(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index d8ad6bdcc52..d3123f5c444 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -26,10 +26,6 @@ One of the major goals of Rust's development process is *stability without stagn - [Language features](#language-features) - [Standard library](#standard-library) - [Add an “Edit” link (optional)](#add-an-edit-link-optional) - - [Support with infrastructure - change](#support-with-infrastructure-change) - - [Visually Distinguish - Nightly (optional)](#visually-distinguish-nightly-optional) - [How do we teach this?](#how-do-we-teach-this) - [Drawbacks](#drawbacks) - [Alternatives](#alternatives) @@ -107,16 +103,15 @@ The basic decision has led to a substantial improvement in the currency of the d # Detailed design [design]: #detailed-design -The basic process of developing new language features will remain largely the same as today. The changes are two additions: +The basic process of developing new language features will remain largely the same as today. The required changes are two additions: - a new section in the RFC, "How do we teach this?" modeled on Ember's updated RFC process - a new requirement that the changes themselves be properly documented before being merged to stable -Additionally, we might make some content-level/infrastructural changes: +Additionally, we should make some content-level/infrastructural changes: - add an "edit" link to the documentation pages -- visually distinguish nightly vs. stable build docs ## New RFC section: "How do we teach this?" @@ -198,16 +193,6 @@ Making a similar change has some downsides (see below under [**Drawbacks**][draw 2. It sends a quiet but real signal that the docs are up for editing. This makes it likelier that people will edit them! 
-### Support with infrastructure change -[edit-link-infrastructure]: #optional-support-with-infrastructure-change - -The links to edit the documentation could track against the release branch instead of against `master`. (Fixes to documentation would be analogous to bugfix releases in this sense.) Targeting the pull-request automatically would be straightforward. However, see below under [**Drawbacks**][drawbacks]. - -## Visually Distinguish Nightly (optional) -[distinguish-nightly]: #optional-visually-distinguish-nightly - -It might be useful to visually distinguish the documentation for nightly Rust as being unstable and subject to change, even simply by setting a different default theme on _The Rust Programming Language_ book for nightly Rust. - # How do we teach this? @@ -269,10 +254,6 @@ At a "messaging" level, we should continue to emphasize that *documentation is j Finally, while infrastructure changes could be made in support of a more "targeted" editing experience, doing so would substantially increase the triage work required for the docs. It would also entail extra work "porting" the changes back to `master`. Additionally, because the language itself does not currently "bugfix" releases, this would substantially alter the workflow for dealing with releases in general. -6. Specific to the suggestion to [**Visually Distinguish Nightly**][distinguish-nightly]: - - This requires at least some infrastructure investment. Making the change apply to the Reference as well as to the two books would entail the maintenance of further CSS. This might be acceptable if documentation teams are sufficiently motivated and engaged, but it means that if not very carefully designed up front, any changes to the documentation theme will basically require double CSS changes; they will also require double the *design* efforts. 
- # Alternatives [alternatives]: #alternatives From 297e5b19231a54e83b247403b4bb24c01f1d13db Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Tue, 20 Sep 2016 20:12:23 -0400 Subject: [PATCH 1113/1195] =?UTF-8?q?Address=20concern=20about=20=E2=80=9C?= =?UTF-8?q?edit=20link".?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- text/0000-document_all_features.md | 30 ------------------------------ 1 file changed, 30 deletions(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index d3123f5c444..f830356bda8 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -25,7 +25,6 @@ One of the major goals of Rust's development process is *stability without stagn stabilizing](#new-requirement-to-document-changes-before-stabilizing) - [Language features](#language-features) - [Standard library](#standard-library) - - [Add an “Edit” link (optional)](#add-an-edit-link-optional) - [How do we teach this?](#how-do-we-teach-this) - [Drawbacks](#drawbacks) - [Alternatives](#alternatives) @@ -109,10 +108,6 @@ The basic process of developing new language features will remain largely the sa - a new requirement that the changes themselves be properly documented before being merged to stable -Additionally, we should make some content-level/infrastructural changes: - -- add an "edit" link to the documentation pages - ## New RFC section: "How do we teach this?" [new-rfc-section]: #new-rfc-section-how-do-we-teach-this @@ -182,18 +177,6 @@ Updating the reference could proceed stepwise: In the case of the standard library, this could conceivably be managed by setting the `#[forbid(missing_docs)]` attribute on the library roots. In lieu of that, manual code review and general discipline should continue to serve. However, if automated tools *can* be employed here, they should. 
-## Add an "Edit" link (optional) -[edit-link]: #add-an-edit-link - -To support its own change, the Ember team added an "edit this" icon to the top of every page in the guides (and plans to do so for the API documentation, pending infrastructure changes to support that). Each of _The Rust Programming Language_, _Rust by Example_, and the Rust Reference should do the same. - -Making a similar change has some downsides (see below under [**Drawbacks**][drawbacks]), but it has two major upsides: - -1. It gives users an obvious action to fix typos. Speaking from personal experience, it can be difficult to find where a given documentation or book page exists in the Rust repository. Even with the drawbacks noted below, this would substantially smooth the process of making e.g. a small typo fix for first-time readers of _The Rust Programming Language_. Making the first contribution easy makes further contributions much more likely. - -2. It sends a quiet but real signal that the docs are up for editing. This makes it likelier that people will edit them! - - # How do we teach this? Since this RFC promotes including this section, it includes it itself. (RFCs, unlike Rust `struct` or `enum` types, may be freely self-referential. No boxing required.) @@ -241,19 +224,6 @@ At a "messaging" level, we should continue to emphasize that *documentation is j 4. If the forthcoming docs team is unable to provide significant support for the core team member responsible for documentation, and perhaps equally if the rest of the community does not also increase involvement, this will simply not work. No individual can manage all of these docs alone. -5. Specific to the suggestion to [**Add an "edit" link**][edit-link]: - - - If the specific page is in flux (e.g. being rewritten, broken into pieces, etc.), then a link to edit `master` will be confusing. 
- - In addition, when users *have* made edits, it may take some time before it appears, and thus users may be confused when attempting to make edits and finding that the relevant editss have already been made. - - Some pages users attempt to edit are *likely* to have different documentation in them than the existing pages, to account for inbound changes for feature additions to the language! - - Two notes, however: - - 1. Even facing the same issues, the Ember team has found it useful to have the link, as it enables basically any user of a sufficient comfort level with GitHub to fix basic typos or logic errors. - 2. This concern primarily impacts _The Rust Programming Language_. Both in its current state and in the event of an eventual revamp (at least: after such a revamp finished), the Rust Reference is far less likely to see pages removed or moved. - - Finally, while infrastructure changes could be made in support of a more "targeted" editing experience, doing so would substantially increase the triage work required for the docs. It would also entail extra work "porting" the changes back to `master`. Additionally, because the language itself does not currently "bugfix" releases, this would substantially alter the workflow for dealing with releases in general. - # Alternatives [alternatives]: #alternatives From 2bfd6fd060fb55bf0e68bfb984b0f85eea957f8f Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Tue, 20 Sep 2016 20:12:48 -0400 Subject: [PATCH 1114/1195] Add note on book rewrite status. 
--- text/0000-document_all_features.md | 1 + 1 file changed, 1 insertion(+) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index f830356bda8..3f25d08d83a 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -278,5 +278,6 @@ At a "messaging" level, we should continue to emphasize that *documentation is j - How do we clearly distinguish between features on nightly, beta, and stable Rust—in the reference especially, but also in the book? - How will the requirement for documentation in the reference be enforced? - Given that the reference is out of date, does it need to be brought up to date before beginning enforcement of this policy? +- Given that the book is in the process of a rewrite for print publication, how shall we apply this requirement to RFCs merged during its development? - For the standard library, once it migrates to a crates structure, should it simply include the `#[forbid(missing_docs)]` attribute on all crates to set this as a build error? - Is a documentation subteam, _a la_ the one used by Ember, worth creating? From 19cf3a3f7df23ce80164e0877e3fd268ceefbf12 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Tue, 20 Sep 2016 20:16:52 -0400 Subject: [PATCH 1115/1195] Address concern: unclear language about change reviews. --- text/0000-document_all_features.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index 3f25d08d83a..e39ac399c58 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -131,7 +131,7 @@ For a great example of this in practice, see the (currently open) [Ember RFC: Mo ## New requirement to document changes before stabilizing -Changes will now be reviewed for changes to the documentation prior to being merged. 
This will proceed in the following places: +Prior to approving pull requests that stabilize features, reviewers must verify that the feature is properly documented in the following places: - language features: - in the reference From 8e145aeab82c057458daf578e2aeee2ab3e24151 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Tue, 20 Sep 2016 20:21:17 -0400 Subject: [PATCH 1116/1195] Address concern: remove blog post requirement. --- text/0000-document_all_features.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index e39ac399c58..a73a981f1cd 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -189,9 +189,7 @@ From the process and core team side of things: 2. Update the RFC process description in the [RFCs README], specifically by including "fail to include a plan for documenting the feature" in the list of possible problems in "Submit a pull request step" in [What the process is]. -3. Write a blog post discussing the new process should be written discussing why we are making this change to the process, and especially explaining both the current problems and the benefits of making the change. - -4. Make documentation and teachability of new features *equally* high priority with the features themselves, and communicate this clearly in discussion of the features. (Core team members are already very good about including this in considerations of language design; this simply makes this an explicit goal of discussions around RFCs.) +3. Make documentation and teachability of new features *equally* high priority with the features themselves, and communicate this clearly in discussion of the features. (Core team members are already very good about including this in considerations of language design; this simply makes this an explicit goal of discussions around RFCs.)
[RFCs README]: https://github.com/rust-lang/rfcs/blob/master/README.md [What the process is]: https://github.com/rust-lang/rfcs/blob/master/README.md#what-the-process-is From d04b8c31254b263a55636e2c8ae63690537e4746 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Tue, 20 Sep 2016 20:21:51 -0400 Subject: [PATCH 1117/1195] Address concern: needless inline strikeouts. --- text/0000-document_all_features.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index a73a981f1cd..290e4038114 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -200,9 +200,9 @@ This is also an opportunity to allow/enable non-core-team members with less expe 2. We can use the more complicated language reference issues as points for mentoring developers interested in contributing to the compiler. Helping document a complex language feature may be a useful on-ramp for working on the compiler itself. -3. ~~We may find it useful to form~~ We are already forming a documentation subteam (under the leadership of the relevant core team representative), similar to what Ember has done, which will be responsible for shepherding these changes along. +3. We have formed a documentation subteam (under the leadership of the relevant core team representative), similar to what Ember has done, which will be responsible for shepherding these changes along. - ~~Whether such a team is formalized or not,~~ Even with such a team in place, a major goal remains encouraging the community to take up a greater degree of responsibility for the state of the documentation, rather than it falling entirely on the shoulders of a single core team member or even the docs team. (Having a dedicated core team member focused solely on docs is *wonderful*, but it means we can sometimes leave it all to just one person, and Rust has far too much going on for any individual to manage on their own.) 
+ Even with such a team in place, a major goal remains encouraging the community to take up a greater degree of responsibility for the state of the documentation, rather than it falling entirely on the shoulders of a single core team member or even the docs team. (Having a dedicated core team member focused solely on docs is *wonderful*, but it means we can sometimes leave it all to just one person, and Rust has far too much going on for any individual to manage on their own.) At a "messaging" level, we should continue to emphasize that *documentation is just as valuable as code*. For example (and there are many other similar opportunities): in addition to highlighting new language features in the release notes for each version, we might highlight any part of the documentation which saw substantial improvement in the release. From 3f8ed2022afe5749dfa41ead4de5d37a57f91e66 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Tue, 20 Sep 2016 20:25:18 -0400 Subject: [PATCH 1118/1195] =?UTF-8?q?Address=20concern:=20unnecessary=20re?= =?UTF-8?q?ferences=20to=20=E2=80=9Ccore=20team=E2=80=9D.?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- text/0000-document_all_features.md | 16 +++++----------- 1 file changed, 5 insertions(+), 11 deletions(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index 290e4038114..560e0edf69d 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -154,7 +154,7 @@ Instead, the documentation process should immediately precede the move to stabil This need not be especially long, but it should be long enough for ordinary users to learn how to use the language feature *without reading the RFCs*. -When the core team discusses whether to stabilize a feature in a given release, the reference material will now be a part of that decision. 
Once the feature *and* reference material are complete, it will be merged normally, and the pull request will simply include the reference material as well as the new feature. +When discussing whether to stabilize a feature in a given release, the reference material will now be a part of that decision. Once the feature *and* reference material are complete, it will be merged normally, and the pull request will simply include the reference material as well as the new feature. Given the current state of the reference, this may need to proceed in two steps: @@ -183,13 +183,11 @@ Since this RFC promotes including this section, it includes it itself. (RFCs, un To be most effective, this will involve some changes both at a process and core-team level, and at a community level. -From the process and core team side of things: +1. The RFC template must be updated to include the new section for teaching. -1. Update the RFC template to include the new section for teaching. +2. The RFC process in the [RFCs README] must be updated, specifically by including "fail to include a plan for documenting the feature" in the list of possible problems in "Submit a pull request step" in [What the process is]. -2. Update the RFC process description in the [RFCs README], specifically by including "fail to include a plan for documenting the feature" in the list of possible problems in "Submit a pull request step" in [What the process is]. - -3. Make documentation and teachability of new features *equally* high priority with the features themselves, and communicate this clearly in discussion of the features. (Core team members are already very good about including this in considerations of language design; this simply makes this an explicit goal of discussions around RFCs.) +3. Make documentation and teachability of new features *equally* high priority with the features themselves, and communicate this clearly in discussion of the features.
(Much of the community is already very good about including this in considerations of language design; this simply makes this an explicit goal of discussions around RFCs.) [RFCs README]: https://github.com/rust-lang/rfcs/blob/master/README.md [What the process is]: https://github.com/rust-lang/rfcs/blob/master/README.md#what-the-process-is @@ -200,10 +198,6 @@ This is also an opportunity to allow/enable non-core-team members with less expe 2. We can use the more complicated language reference issues as points for mentoring developers interested in contributing to the compiler. Helping document a complex language feature may be a useful on-ramp for working on the compiler itself. -3. We have formed a documentation subteam (under the leadership of the relevant core team representative), similar to what Ember has done, which will be responsible for shepherding these changes along. - - Even with such a team in place, a major goal remains encouraging the community to take up a greater degree of responsibility for the state of the documentation, rather than it falling entirely on the shoulders of a single core team member or even the docs team. (Having a dedicated core team member focused solely on docs is *wonderful*, but it means we can sometimes leave it all to just one person, and Rust has far too much going on for any individual to manage on their own.) - At a "messaging" level, we should continue to emphasize that *documentation is just as valuable as code*. For example (and there are many other similar opportunities): in addition to highlighting new language features in the release notes for each version, we might highlight any part of the documentation which saw substantial improvement in the release. @@ -220,7 +214,7 @@ At a "messaging" level, we should continue to emphasize that *documentation is j For Rust to attain its goal of *stability without stagnation*, its documentation must also be stable and not stagnant. -4. 
If the forthcoming docs team is unable to provide significant support for the core team member responsible for documentation, and perhaps equally if the rest of the community does not also increase involvement, this will simply not work. No individual can manage all of these docs alone. +4. If the forthcoming docs team is unable to provide significant support, and perhaps equally if the rest of the community does not also increase involvement, this will simply not work. No individual can manage all of these docs alone. # Alternatives From b425e052fb2544ff01c127cf3c0f24089ada492a Mon Sep 17 00:00:00 2001 From: Joe Ranweiler Date: Tue, 4 Oct 2016 13:11:14 -0700 Subject: [PATCH 1119/1195] Rename feature to "field-init-shorthand" --- text/{0000-named-field-puns.md => 0000-field-init-shorthand.md} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename text/{0000-named-field-puns.md => 0000-field-init-shorthand.md} (99%) diff --git a/text/0000-named-field-puns.md b/text/0000-field-init-shorthand.md similarity index 99% rename from text/0000-named-field-puns.md rename to text/0000-field-init-shorthand.md index 67eb08b7e92..e86861cc029 100644 --- a/text/0000-named-field-puns.md +++ b/text/0000-field-init-shorthand.md @@ -1,4 +1,4 @@ -- Feature Name: named-field-puns +- Feature Name: field-init-shorthand - Start Date: 2016-07-18 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) From e9495fb7b7eac30e1e3421a002cfd03bf4ff1363 Mon Sep 17 00:00:00 2001 From: Simonas Kazlauskas Date: Wed, 5 Oct 2016 15:45:12 +0300 Subject: [PATCH 1120/1195] Refine rough edges wrt coercion --- text/0000-loop-break-value.md | 85 ++++++++++++++++++++++------------- 1 file changed, 53 insertions(+), 32 deletions(-) diff --git a/text/0000-loop-break-value.md b/text/0000-loop-break-value.md index 2e28a0c20dd..be853f2e959 100644 --- a/text/0000-loop-break-value.md +++ b/text/0000-loop-break-value.md @@ -84,14 +84,15 @@ Four forms of `break` will be supported: 3. `break EXPR;` 4. 
`break 'label EXPR;` -where `'label` is the name of a loop and `EXPR` is an expression. +where `'label` is the name of a loop and `EXPR` is an expression. `break` and `break 'label` become +equivalent to `break ()` and `break 'label ()` respectively. ### Result type of loop Currently the result type of a 'loop' without 'break' is `!` (never returns), -which may be coerced to any type), and the result type of a 'loop' with 'break' -is `()`. This is important since a loop may appear as -the last expression of a function: +which may be coerced to any type. The result type of a 'loop' with a 'break' +is `()`. This is important since a loop may appear as the last expression of +a function: ```rust fn f() { @@ -109,20 +110,30 @@ fn g() -> () { fn h() -> ! { loop { do_something(); - // this loop is not allowed to break due to inferred `!` type + // this loop must diverge for the function to typecheck } } ``` -This proposal changes the result type of 'loop' to `T`, where: - -* if a loop is "broken" via `break;` or `break 'label;`, the loop's result type must be `()` -* if a loop is "broken" via `break EXPR;` or `break 'label EXPR;`, `EXPR` must evaluate to type `T` -* as a special case, if a loop is "broken" via `break EXPR;` or `break 'label EXPR;` where `EXPR` evaluates to type `!` (does not return), this does not place a constraint on the type of the loop -* if external constaint on the loop's result type exist (e.g. `let x: S = loop { ... };`), then `T` must be coercible to this type - -It is an error if these types do not agree or if the compiler's type deduction -rules do not yield a concrete type. +This proposal allows a 'loop' expression to be of any type `T`, following the same typing and +inference rules that are applicable to other expressions in the language. The type of `EXPR` in every +`break EXPR` and `break 'label EXPR` must be coercible to the type of the loop the `EXPR` appears +in.
+ + + +It is an error if these types do not agree or if the compiler's type deduction rules do not yield a +concrete type. Examples of errors: @@ -148,24 +159,6 @@ fn z() -> ! { } ``` -Examples involving `!`: - -```rust -fn f() -> () { - // ! coerces to () - loop {} -} -fn g() -> u32 { - // ! coerces to u32 - loop {} -} -fn z() -> ! { - loop { - break panic!(); - } -} -``` - Example showing the equivalence of `break;` and `break ();`: ```rust @@ -180,6 +173,34 @@ fn y() -> () { } ``` +Coercion examples: + +```rust +// ! coerces to any type +loop {}: (); +loop {}: u32; +loop { + break (loop {}: !); +}: u32; +loop { + // ... + break 42; + // ... + break panic!(); +}: u32; + +// break EXPRs are not of the same type, but both coerce to `&[u8]`. +let x = [0; 32]; +let y = [0; 48]; +loop { + // ... + break &x; + // ... + break &y; +}: &[u8]; +``` + + ### Result value A loop only yields a value if broken via some form of `break ...;` statement, From 1c2a50ddfd17c7237c76fdc871220cfd2e22509d Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Fri, 7 Oct 2016 16:36:33 +1300 Subject: [PATCH 1121/1195] Updated based on feedback --- text/0000-proc-macros.md | 171 +++++++++++++++++++++++---------------- 1 file changed, 101 insertions(+), 70 deletions(-) diff --git a/text/0000-proc-macros.md b/text/0000-proc-macros.md index 1ab3299f755..00e5cf32207 100644 --- a/text/0000-proc-macros.md +++ b/text/0000-proc-macros.md @@ -15,11 +15,16 @@ This RFC specifies the architecture of the procedural macro system. It relies on [RFC 1561](https://github.com/rust-lang/rfcs/pull/1561) which specifies the naming and modularisation of macros. It leaves many of the details for further RFCs, in particular the details of the APIs available to macro authors -(tentatively called `libmacro`). See this [blog post](http://ncameron.org/blog/libmacro/) -for some ideas of how that might look. +(tentatively called `libproc_macro`, formerly `libmacro`). 
See this +[blog post](http://ncameron.org/blog/libmacro/) for some ideas of how that might +look. + +[RFC 1681](https://github.com/rust-lang/rfcs/pull/1681) specified a mechanism +for custom derive using 'macros 1.1'. That RFC is essentially a subset of this +one. Changes and differences are noted throughout the text. At the highest level, macros are defined by implementing functions marked with -a `#[macro]` attribute. Macros operate on a list of tokens provided by the +a `#[proc_macro]` attribute. Macros operate on a list of tokens provided by the compiler and return a list of tokens that the macro use is replaced by. We provide low-level facilities for operating on these tokens. Higher level facilities (e.g., for parsing tokens to an AST) should exist as library crates. @@ -57,13 +62,18 @@ like macro is used with syntax `foo!(...)`, and an attribute-like macro with `#[foo(...)] ...`. Macros may be used in the same places as `macro_rules` macros and this remains unchanged. +There is also a third kind, custom derive, which are specified in [RFC +1681](https://github.com/rust-lang/rfcs/pull/1681). This RFC extends the +facilities open to custom derive macros beyond the string-based system of RFC +1681. + To define a procedural macro, the programmer must write a function with a specific signature and attribute. Where `foo` is the name of a function-like macro: ``` -#[macro] -pub fn foo(TokenStream, &mut MacroContext) -> TokenStream; +#[proc_macro] +pub fn foo(TokenStream) -> TokenStream; ``` The first argument is the tokens between the delimiters in the macro use. @@ -75,8 +85,8 @@ The value returned replaces the macro use. 
Attribute-like: ``` -#[macro_attribute] -pub fn foo(Option<TokenStream>, TokenStream, &mut MacroContext) -> TokenStream; +#[proc_macro_attribute] +pub fn foo(Option<TokenStream>, TokenStream) -> TokenStream; ``` The first argument is a list of the tokens between the delimiters in the macro @@ -101,54 +111,93 @@ by another macro without parsing, in which case they do not need to parse. The distinction is not statically enforced. It could be, but I don't think the overhead would be justified. -We also introduce a special configuration option: `#[cfg(macro)]`. Items with +Custom derive: + +``` +#[proc_macro_derive] +pub fn foo(TokenStream) -> TokenStream; +``` + +Similar to attribute-like macros, the item a custom derive applies to must +parse. Custom derives may only be applied to the items that a built-in derive may +be applied to (structs and enums). + +Currently, macros implementing custom derive only have the option of converting +the `TokenStream` to a string and converting a result string back to a +`TokenStream`. This option will remain, but macro authors will also be able to +operate directly on the `TokenStream` (which should be preferred, since it +allows for hygiene and span support). + +Procedural macros which take an identifier before the argument list (e.g., `foo! +bar(...)`) will not be supported (at least initially). + +My feeling is that this macro form is not used enough to justify its existence. +From a design perspective, it encourages uses of macros for language extension, +rather than syntactic abstraction. I feel that such macros are at higher risk of +making programs incomprehensible and of fragmenting the ecosystem. + +Behind the scenes, these functions implement traits for each macro kind. We may +in the future allow implementing these traits directly, rather than just +implementing the above functions.
By adding methods to these traits, we can +allow macro implementations to pass data to the compiler, for example, +specifying hygiene information or allowing for fast re-compilation. + +## `proc-macro` crates + +[Macros 1.1](https://github.com/rust-lang/rfcs/pull/1681) added a new crate +type: proc-macro. This both allows procedural macros to be declared within the +crate, and dictates how the crate is compiled. Procedural macros must use +this crate type. + +We introduce a special configuration option: `#[cfg(proc_macro)]`. Items with this configuration are not macros themselves but are compiled only for macro uses. -Initially, it will only be legal to apply `#[cfg(macro)]` to a whole crate and -the `#[macro]` and `#[macro_attribute]` attributes may only appear within a -`#[cfg(macro)]` crate. This has the effect of partitioning crates into macro- -defining and non-macro defining crates. Macros may not be used in the crate in -which they are defined, although they may be called as regular functions. In the -future, I hope we can relax these restrictions so that macro and non-macro code -can live in the same crate. +If a crate is a `proc-macro` crate, then the `proc_macro` cfg variable is true +for the whole crate. Initially it will be false for all other crates. This has +the effect of partitioning crates into macro-defining and non-macro-defining +crates. In the future, I hope we can relax these restrictions so that macro and +non-macro code can live in the same crate. Importing macros for use means using `extern crate` to make the crate available and then using `use` imports or paths to name macros, just like other items. Again, see [RFC 1561](https://github.com/rust-lang/rfcs/pull/1561) for more details. -When a `#[cfg(macro)]` crate is `extern crate`ed, it's items (even public ones) -are not available to the importing crate; only macros declared in that crate.
-There should be a lint to warn about public items which will not be visible due -to `#[cfg(macro)]`. The crate is dynamically linked with the compiler at -compile-time, rather than with the importing crate at runtime. +When a `proc-macro` crate is `extern crate`ed, its items (even public ones) are +not available to the importing crate; only macros declared in that crate are. There +should be a lint to warn about public items which will not be visible due to +`proc_macro`. The crate is used by the compiler at compile-time, rather than +linked with the importing crate at runtime. + +[Macros 1.1](https://github.com/rust-lang/rfcs/pull/1681) required `#[macro_use]` +on `extern crate` which imports procedural macros. This will not be required +and should be deprecated. ## Writing procedural macros Procedural macro authors should not use the compiler crates (libsyntax, etc.). -Using these will remain unstable. We will make available a new crate, libmacro, -which will follow the usual path to stabilisation, will be part of the Rust -distribution, and will be required to be used by procedural macros (because, at -the least, it defines the types used in the required signatures). +Using these will remain unstable. We will make available a new crate, +libproc_macro, which will follow the usual path to stabilisation, will be part +of the Rust distribution, and will be required to be used by procedural macros +(because, at the least, it defines the types used in the required signatures). -The details of libmacro will be specified in a future RFC. In the meantime, this -[blog post](http://ncameron.org/blog/libmacro/) gives an idea of what it might -contain. +The details of libproc_macro will be specified in a future RFC. In the meantime, +this [blog post](http://ncameron.org/blog/libmacro/) gives an idea of what it +might contain.
-The philosophy here is that libmacro will contain low-level tools for +The philosophy here is that libproc_macro will contain low-level tools for constructing macros, dealing with tokens, hygiene, pattern matching, quasi- quoting, interactions with the compiler, etc. For higher level abstractions (such as parsing and an AST), macros should use external libraries (there are no -restrictions on `#[cfg(macro)]` crates using other crates). +restrictions on `#[cfg(proc_macro)]` crates using other crates). -The `MacroContext` is an object passed to all procedural macro definitions. It -is the main entry point to the libmacro API and for interaction with the -compiler. Via the `MacroContext`, a procedural macro can access information -about the context in which it is used and defined, and perform operations which -rely on the state of the compiler. It will be more fully defined in the upcoming -RFC proposing libmacro. +A `MacroContext` is an object placed in thread-local storage when a macro is +expanded. It contains data about how the macro is being used and defined. It is +expected that for most uses, macro authors will not use the `MacroContext` +directly, but it will be used by library functions. It will be more fully +defined in the upcoming RFC proposing libproc_macro. Rust macros are hygienic by default. Hygiene is a large and complex subject, but to summarise: effectively, naming takes place in the context of the macro @@ -157,8 +206,8 @@ definition, not the expanded macro. Procedural macros often want to bend the rules around macro hygiene, for example to make items or variables more widely nameable than they would be by default. Procedural macros will be able to take part in the application of the hygiene -algorithm via libmacro. Again, full details must wait for the libmacro RFC and a -sketch is available in this [blog post](http://ncameron.org/blog/libmacro/). +algorithm via libproc_macro. 
Again, full details must wait for the libproc_macro +RFC and a sketch is available in this [blog post](http://ncameron.org/blog/libmacro/). ## Tokens @@ -219,12 +268,12 @@ pub enum TokenKind { // The content of the comment can be found from the span. Comment(CommentKind), - // Symbol is the string contents, not including delimiters. It would be nice + // `text` is the string contents, not including delimiters. It would be nice // to avoid an allocation in the common case that the string is in the - // source code. We might be able to use `&'Codemap str` or something. - // `Option is for the count of `#`s if the string is a raw string. If + // source code. We might be able to use `&'codemap str` or something. + // `raw_markers` is for the count of `#`s if the string is a raw string. If // the string is not raw, then it will be `None`. - String(Symbol, Option, StringKind), + String { text: Symbol, raw_markers: Option, kind: StringKind }, // char literal, span includes the `'` delimiters. Char(char), @@ -269,6 +318,10 @@ pub enum StringKind { pub struct Symbol { ... } ``` +Note that although tokens exclude whitespace, by examining the spans of tokens, +a procedural macro can get the string representation of a `TokenStream` and thus +has access to whitespace information. + ### Open question: `Punctuation(char)` and multi-char operators. Rust has many compound operators, e.g., `<<`. It's not clear how best to deal @@ -294,14 +347,14 @@ Some solutions: ## Staging 1. Implement [RFC 1561](https://github.com/rust-lang/rfcs/pull/1561). -2. Implement `#[macro]` and `#[cfg(macro)]` and the function approach to +2. Implement `#[proc_macro]` and `#[cfg(proc_macro)]` and the function approach to defining macros. However, pass the existing data structures to the macros, rather than tokens and `MacroContext`. -3. Implement libmacro and make this available to macros. At this stage both old +3. Implement libproc_macro and make this available to macros. 
At this stage both old and new macros are available (functions with different signatures). This will require an RFC and considerable refactoring of the compiler. 4. Implement some high-level macro facilities in external crates on top of - libmacro. It is hoped that much of this work will be community-led. + libproc_macro. It is hoped that much of this work will be community-led. 5. After some time to allow conversion, deprecate the old-style macros. Later, remove old macros completely. @@ -345,7 +398,10 @@ latter can be written today, the former require more work on an interface to the compiler to be practical). We could use the `macro` keyword rather than the `fn` keyword to declare a -macro. We would then not require a `#[macro]` attribute. +macro. We would then not require a `#[proc_macro]` attribute. + +We could use `#[macro]` instead of `#[proc_macro]` (and similarly for the other +attributes). This would require making `macro` a contextual keyword. We could have a dedicated syntax for procedural macros, similar to the `macro_rules` syntax for macros by example. Since a procedural macro is really @@ -365,31 +421,6 @@ would require additional rules on token trees and may not be possible. # Unresolved questions [unresolved]: #unresolved-questions -### macros with an extra identifier - -We currently allow procedural macros to take an extra ident after the macro name -and before the arguments, e.g., `foo! bar(...)` where `foo` is the macro name -and `bar` is the extra identifier. This is used for `macro_rules` and is useful -for macros which define classes of items, rather than instances of items. E.g., -a `struct!` macro might be used similarly to the `struct` keyword. - -My feeling is that this macro form is not used enough to justify its existence. -From a design perspective, it encourages uses of macros for language extension, -rather than syntactic abstraction. 
I feel that such macros are at higher risk of -making programs incomprehensible and of fragmenting the ecosystem). - -Therefore, I would like to remove them from the language. Alternatively, they -could be incorporated into the new design by having another kind of macro -function: - -``` -#[macro_with_ident] -pub fn foo(&Token, TokenStream, &mut MacroContext) -> TokenStream; -``` - -where the first argument is the extra identifier. - - ### Linking model Currently, procedural macros are dynamically linked with the compiler. This From 2ebe33c18c466de8e08cd80c80cb966db6e5ddeb Mon Sep 17 00:00:00 2001 From: Vadim Chugunov Date: Mon, 17 Oct 2016 17:18:23 -0700 Subject: [PATCH 1122/1195] Drop kind="abstract". Reword "minus unbundling". --- text/0000-dllimport.md | 22 +++++++++------------- 1 file changed, 9 insertions(+), 13 deletions(-) diff --git a/text/0000-dllimport.md b/text/0000-dllimport.md index b4128d8ad77..355c89aea30 100644 --- a/text/0000-dllimport.md +++ b/text/0000-dllimport.md @@ -51,9 +51,6 @@ statically in others which is why build scripts are leveraged to make these dynamic decisions. In order to support this kind of dynamism, the following modifications are proposed: -- A new library kind, "abstract". An "abstract" library by itself does not - cause any libraries to be linked. Its purpose is to establish an identifier, - that may be later referred to from the command line flags. - Extend syntax of the `-l` flag to `-l [KIND=]lib[:NEWNAME]`. The `NEWNAME` part may be used to override name of a library specified in the source. 
- Add new meaning to the `KIND` part: if "lib" is already specified in the source, @@ -64,26 +61,26 @@ Example: ```rust // mylib.rs -#[link(name = "foo", kind="dylib")] +#[link(name="foo", kind="dylib")] extern { // dllimport applied } -#[link(name = "bar", kind="static")] +#[link(name="bar", kind="static")] extern { // dllimport not applied } -#[link(name = "baz", kind="abstract")] +#[link(name="baz")] extern { - // dllimport not applied, "baz" not linked + // kind defaults to "dylib", dllimport applied } ``` -``` +```sh rustc mylib.rs -l static=foo # change foo's kind to "static", dllimport will not be applied -rustc mylib.rs -l foo:newfoo # link newfoo instead of foo -rustc mylib.rs -l dylib=baz:quoox # specify baz's kind as "dylib", change link name to quoox. +rustc mylib.rs -l foo:newfoo # link newfoo instead of foo, keeping foo's kind as "dylib" +rustc mylib.rs -l dylib=bar # change bar's kind to "dylib", dllimport will be applied ``` ### Unbundled static libs (optional) @@ -91,7 +88,8 @@ rustc mylib.rs -l dylib=baz:quoox # specify baz's kind as "dylib", change link n It had been pointed out that sometimes one may wish to link to a static system library (i.e. one that is always available to the linker) without bundling it into .lib's and .rlib's. For this use case we'll introduce another library "kind", "static-nobundle". -Such libraries would be treated in the same way as "static", minus the bundling. +Such libraries would be treated in the same way as "static", except they will not be bundled into +the target .lib/.rlib. # Drawbacks [drawbacks]: #drawbacks @@ -147,6 +145,4 @@ meaning that it will be common that these attributes are left off by accident. # Unresolved questions [unresolved]: #unresolved-questions -- Should un-overridden "abstract" kind cause an error, a warning, or be silently ignored? -- Do we even need "abstract"? Since kind can be overridden, there's no harm in providing a default in the source. 
- Should we allow dropping a library specified in the source from linking via `-l lib:` (i.e. "rename to empty")? From d4a0fba44e98f3965a68221b299bc0cb45e87ea8 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Wed, 19 Oct 2016 11:36:18 -0400 Subject: [PATCH 1123/1195] Clarify must/should/may around the books. Require release notes. --- text/0000-document_all_features.md | 69 +++++++++++++++--------------- 1 file changed, 34 insertions(+), 35 deletions(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index 560e0edf69d..e270a421334 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -19,10 +19,8 @@ One of the major goals of Rust's development process is *stability without stagn - [The Current Situation](#the-current-situation) - [Precedent](#precedent) - [Detailed design](#detailed-design) - - [New RFC section: “How do we teach - this?”](#new-rfc-section-how-do-we-teach-this) - - [New requirement to document changes before - stabilizing](#new-requirement-to-document-changes-before-stabilizing) + - [New RFC section: “How do we teach this?”](#new-rfc-section-how-do-we-teach-this) + - [New requirement to document changes before stabilizing](#new-requirement-to-document-changes-before-stabilizing) - [Language features](#language-features) - [Standard library](#standard-library) - [How do we teach this?](#how-do-we-teach-this) @@ -131,13 +129,16 @@ For a great example of this in practice, see the (currently open) [Ember RFC: Mo ## New requirement to document changes before stabilizing -Prior to approving pull requests that stabilize features, reviewers must verify that the document is properly documented in the following places: +Prior to stabilizing a feature, the features will now be documented as follows: -- language features: - - in the reference - - in _The Rust Programming Language_ - - in _Rust by Example_ -- the standard library: in the `std` API docs +- Language features: + - must be 
documented in the reference. + - should be documented in _The Rust Programming Language_. + - may be documented in _Rust by Example_. +- Standard library additions must include documentation in `std` API docs. +- Both language features and standard library changes must include: + - a single line for the changelog + - a longer summary for the long-form release announcement. ### Language features [language-features]: #language-features @@ -146,7 +147,7 @@ We will document *all* language features in the Rust Reference, as well as makin This will necessarily be a manual process, involving updates to the `reference.md` file. (It may at some point be sensible to break up the Reference file for easier maintenance; that is left aside as orthogonal to this discussion.) -Note that the feature documentation does not need to be written by the feature author. In fact, this is one of the areas where the community may be most able to support core developers even if not themselves programming language theorists or compiler hackers. This may free up the compiler developers' time. It will also help communicate the features in a way that is accessible to ordinary Rust users. +Note that the feature documentation does not need to be written by the feature author. In fact, this is one of the areas where the community may be most able to support the language/compiler developers even if not themselves programming language theorists or compiler hackers. This may free up the compiler developers' time. It will also help communicate the features in a way that is accessible to ordinary Rust users. New features do not need to be documented to be merged into `master`/nightly, and in many cases *should* not, since the features may change substantially before landing on stable, at which point the reference material would need to be rewritten. @@ -184,9 +185,7 @@ Since this RFC promotes including this section, it includes it itself. 
(RFCs, un To be most effective, this will involve some changes both at a process and core-team level, and at a community level. 1. The RFC template must be updated to include the new section for teaching. - 2. The RFC process in the [RFCs README] must be updated, specifically by including "fail to include a plan for documenting the feature" in the list of possible problems in "Submit a pull request step" in [What the process is]. - 3. Make documentation and teachability of new features *equally* high priority with the features themselves, and communicate this clearly in discussion of the features. (Much of the community is already very good about including this in considerations of language design; this simply makes this an explicit goal of discussions around RFCs.) [RFCs README]: https://github.com/rust-lang/rfcs/blob/master/README.md @@ -194,9 +193,9 @@ To be most effective, this will involve some changes both at a process and core- This is also an opportunity to allow/enable non-core-team members with less experience to contribute more actively to _The Rust Programming Language_, _Rust by Example_, and the Rust Reference. -1. We should write issues for feature documentation, and flag them as approachable entry points for new users. +1. We should write issues for feature documentation, and may flag them as approachable entry points for new users. -2. We can use the more complicated language reference issues as points for mentoring developers interested in contributing to the compiler. Helping document a complex language feature may be a useful on-ramp for working on the compiler itself. +2. We may use the more complicated language reference issues as points for mentoring developers interested in contributing to the compiler. Helping document a complex language feature may be a useful on-ramp for working on the compiler itself. At a "messaging" level, we should continue to emphasize that *documentation is just as valuable as code*. 
For example (and there are many other similar opportunities): in addition to highlighting new language features in the release notes for each version, we might highlight any part of the documentation which saw substantial improvement in the release. @@ -204,62 +203,62 @@ At a "messaging" level, we should continue to emphasize that *documentation is j # Drawbacks [drawbacks]: #drawbacks -1. The largest drawback at present is that the language reference is *already* quite out of date. It may take substantial work to get it up to date so that new changes can be landed appropriately. (Arguably, however, this should be done regardless, since the language reference is an important part of the language ecosystem.) +1. The largest drawback at present is that the language reference is *already* quite out of date. It may take substantial work to get it up to date so that new changes can be landed appropriately. (Arguably, however, this should be done regardless, since the language reference is an important part of the language ecosystem.) -2. Another potential issue is that some sections of the reference are particularly thorny and must be handled with considerable care (e.g. lifetimes). Although in general it would not be necessary for the author of the new language feature to write all the documentation, considerable extra care and oversight would need to be in place for these sections. +2. Another potential issue is that some sections of the reference are particularly thorny and must be handled with considerable care (e.g. lifetimes). Although in general it would not be necessary for the author of the new language feature to write all the documentation, considerable extra care and oversight would need to be in place for these sections. -3. This may delay landing features on stable. However, all the points raised in [**Precedent**][precedent] on this apply, especially: +3. This may delay landing features on stable. 
However, all the points raised in [**Precedent**][precedent] on this apply, especially: > We can't get the great new toys unless everybody can enjoy the toys. ([@eccegordo]) For Rust to attain its goal of *stability without stagnation*, its documentation must also be stable and not stagnant. -4. If the forthcoming docs team is unable to provide significant support, and perhaps equally if the rest of the community does not also increase involvement, this will simply not work. No individual can manage all of these docs alone. +4. If the forthcoming docs team is unable to provide significant support, and perhaps equally if the rest of the community does not also increase involvement, this will simply not work. No individual can manage all of these docs alone. # Alternatives [alternatives]: #alternatives -- **Just add the "How do we teach this?" section.** +- **Just add the "How do we teach this?" section.** - Of all the alternatives, this is the easiest (and probably the best). It does not substantially change the state with regard to the documentation, and even having the section in the RFC does not mean that it will end up added to the docs, as evidence by the [`#[deprecated]` RFC][RFC 1270], which included as part of its text: + Of all the alternatives, this is the easiest (and probably the best). It does not substantially change the state with regard to the documentation, and even having the section in the RFC does not mean that it will end up added to the docs, as evidenced by the [`#[deprecated]` RFC][RFC 1270], which included as part of its text: > The language reference will be extended to describe this feature as outlined in this RFC. Authors shall be advised to leave their users enough time to react before removing a deprecated item. This is not a small downside by any stretch—but adding the section to the RFC will still have all the secondary benefits noted above, and it probably at least somewhat increases the likelihood that new features do get documented.
-- **Embrace the documentation, but do not include "How do we teach this?" section in new RFCs.** +- **Embrace the documentation, but do not include "How do we teach this?" section in new RFCs.** - This still gives us most of the benefits (and was in fact the original form of the proposal), and does not place a new burden on RFC authors to make sure that knowing how to *teach* something is part of any new language or standard library feature. + This still gives us most of the benefits (and was in fact the original form of the proposal), and does not place a new burden on RFC authors to make sure that knowing how to *teach* something is part of any new language or standard library feature. - On the other hand, thinking about the impact on teaching should further improve consideration of the general ergonomics of a proposed feature. If something cannot be *taught* well, it's likely the design needs further refinement. + On the other hand, thinking about the impact on teaching should further improve consideration of the general ergonomics of a proposed feature. If something cannot be *taught* well, it's likely the design needs further refinement. -- **No change; leave RFCs as canonical documentation.** +- **No change; leave RFCs as canonical documentation.** - This approach can take (at least) two forms: + This approach can take (at least) two forms: 1. We can leave things as they are, where the RFC and surrounding discussion form the primary point of documentation for newer-than-1.0 language features. As part of that, we could just link more prominently to the RFC repository and describe the process from the documentation pages. 2. We could automatically render the text of the RFCs into part of the documentation used on the site (via submodules and the existing tooling around Markdown documents used for Rust documentation). 
However, for all the reasons highlighted above in [**Motivation: The Current Situation**][current-situation], RFCs and their associated threads are *not* a good canonical source of information on language features. -- **Add a rule for the standard library but not for language features.** +- **Add a rule for the standard library but not for language features.** - This would basically just turn the _status quo_ into an official policy. It has all the same drawbacks as no change at all, but with the possible benefit of enabling automated checks on standard library documentation. + This would basically just turn the _status quo_ into an official policy. It has all the same drawbacks as no change at all, but with the possible benefit of enabling automated checks on standard library documentation. -- **Add a rule for language features but not for the standard library.** +- **Add a rule for language features but not for the standard library.** - The standard library is in much better shape, in no small part because of the ease of writing inline documentation for new modules. Adding a formal rule may not be necessary if good habits are already in place. + The standard library is in much better shape, in no small part because of the ease of writing inline documentation for new modules. Adding a formal rule may not be necessary if good habits are already in place. - On the other hand, having a formal policy would not seem to *hurt* anything here; it would simply formalize what is already happening (and perhaps, via linting attributes, make it easy to spot when it has failed). + On the other hand, having a formal policy would not seem to *hurt* anything here; it would simply formalize what is already happening (and perhaps, via linting attributes, make it easy to spot when it has failed). 
-- **Eliminate the reference entirely.** +- **Eliminate the reference entirely.** - Since the reference is already substantially out of date, it might make sense to stop presenting it publicly at all, at least until such a time as it has been completely reworked and updated. + Since the reference is already substantially out of date, it might make sense to stop presenting it publicly at all, at least until such a time as it has been completely reworked and updated. - The main upside to this is the reality that an outdated and inaccurate reference may be worse than no reference at all, as it may substantially mislead Rust users. + The main upside to this is the reality that an outdated and inaccurate reference may be worse than no reference at all, as it may substantially mislead Rust users. - The main downside, of course, is that this would leave very large swaths of the language basically without *any* documentation, and even more of it only documented in RFCs than is the case today. + The main downside, of course, is that this would leave very large swaths of the language basically without *any* documentation, and even more of it only documented in RFCs than is the case today. [RFC 1270]: https://github.com/rust-lang/rfcs/pull/1270 From 7dd3f1171212813d66c9d7625081ad51daf4452c Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Wed, 19 Oct 2016 11:40:46 -0400 Subject: [PATCH 1124/1195] Drop unresolved question about docs subteam: we have one! --- text/0000-document_all_features.md | 1 - 1 file changed, 1 deletion(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index e270a421334..81256da1207 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -271,4 +271,3 @@ At a "messaging" level, we should continue to emphasize that *documentation is j - Given that the reference is out of date, does it need to be brought up to date before beginning enforcement of this policy? 
- Given that the book is in the process of a rewrite for print publication, how shall we apply this requirement to RFCs merged during its development? - For the standard library, once it migrates to a crates structure, should it simply include the `#[forbid(missing_docs)]` attribute on all crates to set this as a build error? -- Is a documentation subteam, _a la_ the one used by Ember, worth creating? From 88a745bf1cb15ec7121b5a62b86b7168ecec9c87 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Wed, 19 Oct 2016 11:58:55 -0400 Subject: [PATCH 1125/1195] Improve wording and rationale. Add some placeholders. --- text/0000-document_all_features.md | 57 +++++++++++++++++++++--------- 1 file changed, 40 insertions(+), 17 deletions(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index 81256da1207..dd884728c99 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -22,7 +22,12 @@ One of the major goals of Rust's development process is *stability without stagn - [New RFC section: “How do we teach this?”](#new-rfc-section-how-do-we-teach-this) - [New requirement to document changes before stabilizing](#new-requirement-to-document-changes-before-stabilizing) - [Language features](#language-features) + - [Reference](#reference) + - [The state of the reference](#the-current-state-of-the-reference) + - [_The Rust Programming Language_][trpl] + - [_Rust By Example_][rbe] - [Standard library](#standard-library) + - [Release notes][release-notes] - [How do we teach this?](#how-do-we-teach-this) - [Drawbacks](#drawbacks) - [Alternatives](#alternatives) @@ -57,13 +62,13 @@ Although the standard library is in excellent shape regarding documentation, inc ## The Current Situation [current-situation]: #the-current-situation -Today, the canonical source of information about new language features is the RFCs which define them. 
+Today, the canonical source of information about new language features is the RFCs which define them. The Rust Reference is substantially out of date, and not all new features have made their way into _The Rust Programming Language_. -There are several serious problems with the _status quo_: +There are several serious problems with the _status quo_ of using RFCs as ad hoc documentation: 1. Many users of Rust may simply not know that these RFCs exist. The number of users who do not know (or especially care) about the RFC process or its history will only increase as Rust becomes more popular. -2. In many cases, especially in more complicated language features, some important elements of the decision, details of implementation, and expected behavior are fleshed out either in the associated RFC (pull-request) discussion or in the implementation issues which follow them. +2. In many cases, especially in more complicated language features, some important elements of the decision, details of implementation, and expected behavior are fleshed out either in the pull-request discussion for the RFC, or in the implementation issues which follow them. 3. The RFCs themselves, and even more so the associated pull request discussions, are often dense with programming langauge theory. This is as it should be in context, but it means that the relevant information may be inaccessible to Rust users without prior PLT background, or without the patience to wade through it. @@ -73,6 +78,8 @@ There are several serious problems with the _status quo_: In short, RFCs are a poor source of information about language features for the ordinary Rust user. Rust users should not need to be troubled with details of how the language is implemented works simply to learn how pieces of it work. Nor should they need to dig through tens (much less hundreds) of comments to determine what the final form of the feature is. +However, there is currently no other documentation at all for many newer features. 
This is a significant barrier to adoption of the language, and equally of adoption of new features which will improve the ergonomics of the language. + ## Precedent [precedent]: #precedent @@ -102,9 +109,9 @@ The basic decision has led to a substantial improvement in the currency of the d The basic process of developing new language features will remain largely the same as today. The required changes are two additions: -- a new section in the RFC, "How do we teach this?" modeled on Ember's updated RFC process +- [a new section in the RFC][new-rfc-section], "How do we teach this?" modeled on Ember's updated RFC process -- a new requirement that the changes themselves be properly documented before being merged to stable +- [a new requirement that the changes themselves be properly documented before being merged to stable][] ## New RFC section: "How do we teach this?" @@ -129,10 +136,12 @@ For a great example of this in practice, see the (currently open) [Ember RFC: Mo ## New requirement to document changes before stabilizing +[require-documentation-before-stabilization]: #new-requirement-to-document-changes-before-stabilizing + Prior to stabilizing a feature, the features will now be documented as follows: - Language features: - - must be documented in the reference. + - must be documented in the Rust Reference. - should be documented in _The Rust Programming Language_. - may be documented in _Rust by Example_. - Standard library additions must include documentation in `std` API docs. @@ -143,40 +152,54 @@ Prior to stabilizing a feature, the features will now be documented as follows: ### Language features [language-features]: #language-features -We will document *all* language features in the Rust Reference, as well as making some updates to _The Rust Programming Language_ and _Rust by Example_ as necessary. +We will document *all* language features in the Rust Reference, as well as making some updates to _The Rust Programming Language_ and _Rust by Example_. 
+ +#### Reference + +[reference]: #reference This will necessarily be a manual process, involving updates to the `reference.md` file. (It may at some point be sensible to break up the Reference file for easier maintenance; that is left aside as orthogonal to this discussion.) -Note that the feature documentation does not need to be written by the feature author. In fact, this is one of the areas where the community may be most able to support the language/compiler developers even if not themselves programming language theorists or compiler hackers. This may free up the compiler developers' time. It will also help communicate the features in a way that is accessible to ordinary Rust users. +Feature documentation does not need to be written by the feature author. In fact, this is one of the areas where the community may be most able to support the language/compiler developers even if not themselves programming language theorists or compiler hackers. This may free up the compiler developers' time. It will also help communicate the features in a way that is accessible to ordinary Rust users. New features do not need to be documented to be merged into `master`/nightly, and in many cases *should* not, since the features may change substantially before landing on stable, at which point the reference material would need to be rewritten. -Instead, the documentation process should immediately precede the move to stabilize. Once the *feature* has been deemed ready for stabilization, either the author or a community volunteer should write the *reference material* for the feature. +Instead, the documentation process should immediately precede the move to stabilize. Once the *feature* has been deemed ready for stabilization, either the author or a community volunteer should write the *reference material* for the feature, to be incorporated into the Rust Reference. 
-This need not be especially long, but it should be long enough for ordinary users to learn how to use the language feature *without reading the RFCs*. +The reference material need not be especially long, but it should be long enough for ordinary users to learn how to use the language feature *without reading the RFCs*. When the discussing whether to stabilize a feature in a given release, the reference material will now be a part of that decision. Once the feature *and* reference material are complete, it will be merged normally, and the pull request will simply include the reference material as well as the new feature. -Given the current state of the reference, this may need to proceed in two steps: +##### The current state of the reference. -#### The current state of the reference. [refstate]: #the-current-state-of-the-reference -Since the reference is currently fairly out of date in a number of areas, it may be worth creating a "strike team" to invest a couple months working on the reference: updating it, organizing it, and improving its presentation. (A single web page with *all* of this content is difficult to navigate at best.) This can proceed in parallel with the documentation of new features. It is probably a necessity for this proposal to be particularly effective in the long term. - -Once the reference is up to date, the nucleus responsible for that work may either disband or possibly (depending on the core team's evaluation of the necessity of it and the interest of the "strike team" members) become the basis of a new documentation subteam. +Since the reference is fairly out of date, we should create a "strike team" to update it, organize it, and improve its presentation. (A single web page with *all* of this content is difficult to navigate at best.) This can proceed in parallel with the documentation of new features. -Updating the reference could proceed stepwise: +Updating the reference may proceed stepwise: 1. 
Begin by adding an appendix in the reference with links to all accepted RFCs which have been implemented but are not yet referenced in the documentation. +2. As the reference material is written for each of those RFC features, remove it from that appendix. + +#### _The Rust Programming Language_ -2. As the reference material is written for each of those RFC features, it can be removed from that appendix. +[trpl]: #the-rust-programming-language + + + +#### _Rust by Example_ + +[rbe]: #rust-by-example ### Standard library [std]: #standard-library In the case of the standard library, this could conceivably be managed by setting the `#[forbid(missing_docs)]` attribute on the library roots. In lieu of that, manual code review and general discipline should continue to serve. However, if automated tools *can* be employed here, they should. +### Release Notes + +[release-notes]: #release-notes + # How do we teach this? From 2738cb7c30c848045b4c14afa7fa7fd60219e612 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Wed, 19 Oct 2016 12:29:39 -0400 Subject: [PATCH 1126/1195] Fix a couple link refs. --- text/0000-document_all_features.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index dd884728c99..d0c08b9d57c 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -30,8 +30,8 @@ One of the major goals of Rust's development process is *stability without stagn - [Release notes][release-notes] - [How do we teach this?](#how-do-we-teach-this) - [Drawbacks](#drawbacks) -- [Alternatives](#alternatives) -- [Unresolved questions](#unresolved-questions) +- [Alternatives][alternatives] +- [Unresolved questions][unresolved-questions] # Motivation @@ -111,7 +111,7 @@ The basic process of developing new language features will remain largely the sa - [a new section in the RFC][new-rfc-section], "How do we teach this?" 
modeled on Ember's updated RFC process -- [a new requirement that the changes themselves be properly documented before being merged to stable][] +- [a new requirement that the changes themselves be properly documented before being merged to stable][require-documentation-before-stabilization] ## New RFC section: "How do we teach this?" From 7ae5c6b809a0dfee482429ef136f87dc242a2d96 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Wed, 19 Oct 2016 13:38:59 -0400 Subject: [PATCH 1127/1195] Address a number of concerns. - Drop unneeded stubs. - Fix RFC should/must/may language in a number of places. - Drop inline Ember references in How Do We Teach This. - Clarify the book process. - Clarify the expected reference update process --- text/0000-document_all_features.md | 71 +++++++++++++----------------- 1 file changed, 31 insertions(+), 40 deletions(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index d0c08b9d57c..446fba1b644 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -25,9 +25,7 @@ One of the major goals of Rust's development process is *stability without stagn - [Reference](#reference) - [The state of the reference](#the-current-state-of-the-reference) - [_The Rust Programming Language_][trpl] - - [_Rust By Example_][rbe] - - [Standard library](#standard-library) - - [Release notes][release-notes] + - [Standard library][standard-library] - [How do we teach this?](#how-do-we-teach-this) - [Drawbacks](#drawbacks) - [Alternatives][alternatives] @@ -111,24 +109,22 @@ The basic process of developing new language features will remain largely the sa - [a new section in the RFC][new-rfc-section], "How do we teach this?" 
modeled on Ember's updated RFC process -- [a new requirement that the changes themselves be properly documented before being merged to stable][require-documentation-before-stabilization] +- [a new requirement][require-documentation-before-stabilization] that the changes themselves be properly documented before being merged to stable ## New RFC section: "How do we teach this?" [new-rfc-section]: #new-rfc-section-how-do-we-teach-this -Following the example of Ember.js, we will add a new section to the RFC, just after **Detailed design**, titled **How do we teach this?** The section should to explain what changes need to be made to documentation, and if the feature substantially changes what would be considered the "best" way to solve a problem or is a fairly mainstream issue, discuss how it might be incorporated into _The Rust Programming Language_ and/or _Rust by Example_. +Following the example of Ember.js, we must add a new section to the RFC, just after **Detailed design**, titled **How do we teach this?** The section should explain what changes need to be made to documentation, and if the feature substantially changes what would be considered the "best" way to solve a problem or is a fairly mainstream issue, discuss how it might be incorporated into _The Rust Programming Language_ and/or _Rust by Example_. -Here is the Ember RFC section, with suggested substitutions: +Here is the Ember RFC section, with appropriate substitutions and modifications: > # How We Teach This -> What names and terminology work best for these concepts and why? How is this idea best presented? As a continuation of existing ~~Ember~~ **Rust** patterns, or as a wholly new one? +> What names and terminology work best for these concepts and why? How is this idea best presented? As a continuation of existing Rust patterns, or as a wholly new one? 
> -> Would the acceptance of this proposal mean ~~Ember guides~~ **_The Rust Programing Language_, _Rust by Example_, or the Rust Reference** must be re-organized or altered? Does it change how ~~Ember~~ **Rust** is taught to new users at any level? +> Would the acceptance of this proposal change how Rust is taught to new users at any level? What additions or changes to the Rust Reference, _The Rust Programing Language_, and/or _Rust by Example_ does it entail? > -> How should this feature be introduced and taught to existing ~~Ember~~ **Rust** users? - -We may also find it valuable to add other, more Rust-specific (or programming language- rather than framework-specific) verbiage there. +> How should this feature be introduced and taught to existing Rust users? For a great example of this in practice, see the (currently open) [Ember RFC: Module Unification], which includes several sections discussing conventions, tooling, concepts, and impacts on testing. @@ -149,10 +145,12 @@ Prior to stabilizing a feature, the features will now be documented as follows: - a single line for the changelog - a longer summary for the long-form release announcement. +Stabilization of a feature must not proceed until the requirements outlined in the **How We Teach This** section of the originating RFC have been fulfilled. + ### Language features [language-features]: #language-features -We will document *all* language features in the Rust Reference, as well as making some updates to _The Rust Programming Language_ and _Rust by Example_. +We will document *all* language features in the Rust Reference, as well as updating _The Rust Programming Language_ and _Rust by Example_ as appropriate. (Not all features or changes will require updates to the books.) #### Reference @@ -162,45 +160,40 @@ This will necessarily be a manual process, involving updates to the `reference.m Feature documentation does not need to be written by the feature author. 
In fact, this is one of the areas where the community may be most able to support the language/compiler developers even if not themselves programming language theorists or compiler hackers. This may free up the compiler developers' time. It will also help communicate the features in a way that is accessible to ordinary Rust users. -New features do not need to be documented to be merged into `master`/nightly, and in many cases *should* not, since the features may change substantially before landing on stable, at which point the reference material would need to be rewritten. +New features do not need to be documented to be merged into `master`/nightly Instead, the documentation process should immediately precede the move to stabilize. Once the *feature* has been deemed ready for stabilization, either the author or a community volunteer should write the *reference material* for the feature, to be incorporated into the Rust Reference. The reference material need not be especially long, but it should be long enough for ordinary users to learn how to use the language feature *without reading the RFCs*. -When the discussing whether to stabilize a feature in a given release, the reference material will now be a part of that decision. Once the feature *and* reference material are complete, it will be merged normally, and the pull request will simply include the reference material as well as the new feature. +Discussion of stabilizing a feature in a given release will now include the status of the reference material. -##### The current state of the reference. +##### The current state of the reference [refstate]: #the-current-state-of-the-reference -Since the reference is fairly out of date, we should create a "strike team" to update it, organize it, and improve its presentation. (A single web page with *all* of this content is difficult to navigate at best.) This can proceed in parallel with the documentation of new features. 
+Since the reference is fairly out of date, we should create a "strike team" to update it. This can proceed in parallel with the documentation of new features. -Updating the reference may proceed stepwise: +Updating the reference should proceed stepwise: 1. Begin by adding an appendix in the reference with links to all accepted RFCs which have been implemented but are not yet referenced in the documentation. 2. As the reference material is written for each of those RFC features, remove it from that appendix. +The current presentation of the reference is also in need of improvement: a single web page with *all* of this content is difficult to navigate, or to update. Therefore, the strike team may also take this opportunity to reorganize the reference and update its presentation. + #### _The Rust Programming Language_ [trpl]: #the-rust-programming-language +Most new language features should be added to _The Rust Programming Language_. However, since the book is planned to go to print, the main text of the book is expected to be fixed between major revisions. As such, new features should be documented in an online appendix to the book, which may be titled e.g. "Newest Features." - -#### _Rust by Example_ - -[rbe]: #rust-by-example +The published version of the book should note that changes and languages features made available after the book went to print will be documented in that online appendix. ### Standard library -[std]: #standard-library +[standard-library]: #standard-library In the case of the standard library, this could conceivably be managed by setting the `#[forbid(missing_docs)]` attribute on the library roots. In lieu of that, manual code review and general discipline should continue to serve. However, if automated tools *can* be employed here, they should. -### Release Notes - -[release-notes]: #release-notes - - # How do we teach this? Since this RFC promotes including this section, it includes it itself. 
(RFCs, unlike Rust `struct` or `enum` types, may be freely self-referential. No boxing required.) @@ -244,7 +237,7 @@ At a "messaging" level, we should continue to emphasize that *documentation is j - **Just add the "How do we teach this?" section.** - Of all the alternatives, this is the easiest (and probably the best). It does not substantially change the state with regard to the documentation, and even having the section in the RFC does not mean that it will end up added to the docs, as evidence by the [`#[deprecated]` RFC][RFC 1270], which included as part of its text: + Of all the alternatives, this is the easiest (and probably the best). It does not substantially change the state with regard to the documentation, and even having the section in the RFC does not mean that it will end up added to the docs, as evidence by the [`#[deprecated]` RFC][RFC 1270], which included as part of its text: > The language reference will be extended to describe this feature as outlined in this RFC. Authors shall be advised to leave their users enough time to react before removing a deprecated item. @@ -252,13 +245,14 @@ At a "messaging" level, we should continue to emphasize that *documentation is j - **Embrace the documentation, but do not include "How do we teach this?" section in new RFCs.** - This still gives us most of the benefits (and was in fact the original form of the proposal), and does not place a new burden on RFC authors to make sure that knowing how to *teach* something is part of any new language or standard library feature. + This still gives us most of the benefits (and was in fact the original form of the proposal), and does not place a new burden on RFC authors to make sure that knowing how to *teach* something is part of any new language or standard library feature. - On the other hand, thinking about the impact on teaching should further improve consideration of the general ergonomics of a proposed feature. 
If something cannot be *taught* well, it's likely the design needs further refinement. + On the other hand, thinking about the impact on teaching should further improve consideration of the general ergonomics of a proposed feature. If something cannot be *taught* well, it's likely the design needs further refinement. - **No change; leave RFCs as canonical documentation.** - This approach can take (at least) two forms: + This approach can take (at least) two forms: + 1. We can leave things as they are, where the RFC and surrounding discussion form the primary point of documentation for newer-than-1.0 language features. As part of that, we could just link more prominently to the RFC repository and describe the process from the documentation pages. 2. We could automatically render the text of the RFCs into part of the documentation used on the site (via submodules and the existing tooling around Markdown documents used for Rust documentation). @@ -267,21 +261,21 @@ At a "messaging" level, we should continue to emphasize that *documentation is j - **Add a rule for the standard library but not for language features.** - This would basically just turn the _status quo_ into an official policy. It has all the same drawbacks as no change at all, but with the possible benefit of enabling automated checks on standard library documentation. + This would basically just turn the _status quo_ into an official policy. It has all the same drawbacks as no change at all, but with the possible benefit of enabling automated checks on standard library documentation. - **Add a rule for language features but not for the standard library.** - The standard library is in much better shape, in no small part because of the ease of writing inline documentation for new modules. Adding a formal rule may not be necessary if good habits are already in place. + The standard library is in much better shape, in no small part because of the ease of writing inline documentation for new modules. 
Adding a formal rule may not be necessary if good habits are already in place. - On the other hand, having a formal policy would not seem to *hurt* anything here; it would simply formalize what is already happening (and perhaps, via linting attributes, make it easy to spot when it has failed). + On the other hand, having a formal policy would not seem to *hurt* anything here; it would simply formalize what is already happening (and perhaps, via linting attributes, make it easy to spot when it has failed). - **Eliminate the reference entirely.** - Since the reference is already substantially out of date, it might make sense to stop presenting it publicly at all, at least until such a time as it has been completely reworked and updated. + Since the reference is already substantially out of date, it might make sense to stop presenting it publicly at all, at least until such a time as it has been completely reworked and updated. - The main upside to this is the reality that an outdated and inaccurate reference may be worse than no reference at all, as it may substantially mislead Rust users. + The main upside to this is the reality that an outdated and inaccurate reference may be worse than no reference at all, as it may mislead espiecally new Rust users. - The main downside, of course, is that this would leave very large swaths of the language basically without *any* documentation, and even more of it only documented in RFCs than is the case today. + The main downside, of course, is that this would leave very large swaths of the language basically without *any* documentation, and even more of it only documented in RFCs than is the case today. [RFC 1270]: https://github.com/rust-lang/rfcs/pull/1270 @@ -290,7 +284,4 @@ At a "messaging" level, we should continue to emphasize that *documentation is j [unresolved]: #unresolved-questions - How do we clearly distinguish between features on nightly, beta, and stable Rust—in the reference especially, but also in the book? 
-- How will the requirement for documentation in the reference be enforced? -- Given that the reference is out of date, does it need to be brought up to date before beginning enforcement of this policy? -- Given that the book is in the process of a rewrite for print publication, how shall we apply this requirement to RFCs merged during its development? - For the standard library, once it migrates to a crates structure, should it simply include the `#[forbid(missing_docs)]` attribute on all crates to set this as a build error? From 2d33ba1bdf03a9aedb87c24a7c0e792ffd87e00e Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Wed, 19 Oct 2016 13:45:12 -0400 Subject: [PATCH 1128/1195] Drop internal links. (They're finicky, alas.) --- text/0000-document_all_features.md | 74 +++++++++++++----------------- 1 file changed, 32 insertions(+), 42 deletions(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index 446fba1b644..abb4ef7f9b7 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -5,35 +5,32 @@ # Summary -[summary]: #summary One of the major goals of Rust's development process is *stability without stagnation*. That means we add features regularly. However, it can be difficult to *use* those features if they are not publicly documented anywhere. Therefore, this RFC proposes requiring that all new language features and public standard library items must be documented before landing on the stable release branch (item documentation for the standard library; in the language reference for language features). 
## Outline -[outline]: #outline - -- [Summary](#summary) - - [Outline](#outline) -- [Motivation](#motivation) - - [The Current Situation](#the-current-situation) - - [Precedent](#precedent) -- [Detailed design](#detailed-design) - - [New RFC section: “How do we teach this?”](#new-rfc-section-how-do-we-teach-this) - - [New requirement to document changes before stabilizing](#new-requirement-to-document-changes-before-stabilizing) - - [Language features](#language-features) - - [Reference](#reference) - - [The state of the reference](#the-current-state-of-the-reference) - - [_The Rust Programming Language_][trpl] - - [Standard library][standard-library] -- [How do we teach this?](#how-do-we-teach-this) -- [Drawbacks](#drawbacks) -- [Alternatives][alternatives] -- [Unresolved questions][unresolved-questions] + +- Summary + - Outline +- Motivation + - The Current Situation + - Precedent +- Detailed design + - New RFC section: “How do we teach this?” + - New requirement to document changes before stabilizing + - Language features + - Reference + - The state of the reference + - _The Rust Programming Language_ + - Standard library +- How do we teach this? +- Drawbacks +- Alternatives +- Unresolved questions # Motivation -[motivation]: #motivation At present, new language features are often documented *only* in the RFCs which propose them and the associated announcement blog posts. Moreover, as features change, the existing official language documentation (the Rust Book, Rust by Example, and the language reference) can increasingly grow outdated. @@ -58,7 +55,6 @@ Changing this to require all language features to be documented before stabiliza Although the standard library is in excellent shape regarding documentation, including it in this policy will help guarantee that it remains so going forward. ## The Current Situation -[current-situation]: #the-current-situation Today, the canonical source of information about new language features is the RFCs which define them. 
The Rust Reference is substantially out of date, and not all new features have made their way into _The Rust Programming Language_. @@ -79,7 +75,6 @@ In short, RFCs are a poor source of information about language features for the However, there is currently no other documentation at all for many newer features. This is a significant barrier to adoption of the language, and equally of adoption of new features which will improve the ergonomics of the language. ## Precedent -[precedent]: #precedent This exact idea has been adopted by the Ember community after their somewhat bumpy transitions at the end of their 1.x cycle and leading into their 2.x transition. As one commenter there [put it][@davidgoli]: @@ -103,17 +98,15 @@ The basic decision has led to a substantial improvement in the currency of the d # Detailed design -[design]: #detailed-design The basic process of developing new language features will remain largely the same as today. The required changes are two additions: -- [a new section in the RFC][new-rfc-section], "How do we teach this?" modeled on Ember's updated RFC process +- a new section in the RFC, "How do we teach this?" modeled on Ember's updated RFC process -- [a new requirement][require-documentation-before-stabilization] that the changes themselves be properly documented before being merged to stable +- a new requirement that the changes themselves be properly documented before being merged to stable ## New RFC section: "How do we teach this?" -[new-rfc-section]: #new-rfc-section-how-do-we-teach-this Following the example of Ember.js, we must add a new section to the RFC, just after **Detailed design**, titled **How do we teach this?** The section should explain what changes need to be made to documentation, and if the feature substantially changes what would be considered the "best" way to solve a problem or is a fairly mainstream issue, discuss how it might be incorporated into _The Rust Programming Language_ and/or _Rust by Example_. 
@@ -148,7 +141,6 @@ Prior to stabilizing a feature, the features will now be documented as follows: Stabilization of a feature must not proceed until the requirements outlined in the **How We Teach This** section of the originating RFC have been fulfilled. ### Language features -[language-features]: #language-features We will document *all* language features in the Rust Reference, as well as updating _The Rust Programming Language_ and _Rust by Example_ as appropriate. (Not all features or changes will require updates to the books.) @@ -190,7 +182,6 @@ Most new language features should be added to _The Rust Programming Language_. H The published version of the book should note that changes and languages features made available after the book went to print will be documented in that online appendix. ### Standard library -[standard-library]: #standard-library In the case of the standard library, this could conceivably be managed by setting the `#[forbid(missing_docs)]` attribute on the library roots. In lieu of that, manual code review and general discipline should continue to serve. However, if automated tools *can* be employed here, they should. @@ -217,13 +208,12 @@ At a "messaging" level, we should continue to emphasize that *documentation is j # Drawbacks -[drawbacks]: #drawbacks 1. The largest drawback at present is that the language reference is *already* quite out of date. It may take substantial work to get it up to date so that new changes can be landed appropriately. (Arguably, however, this should be done regardless, since the language reference is an important part of the language ecosystem.) 2. Another potential issue is that some sections of the reference are particularly thorny and must be handled with considerable care (e.g. lifetimes). Although in general it would not be necessary for the author of the new language feature to write all the documentation, considerable extra care and oversight would need to be in place for these sections. -3. 
This may delay landing features on stable. However, all the points raised in [**Precedent**][precedent] on this apply, especially: +3. This may delay landing features on stable. However, all the points raised in **Precedent** on this apply, especially: > We can't get the great new toys unless everybody can enjoy the toys. ([@eccegordo]) @@ -233,7 +223,6 @@ At a "messaging" level, we should continue to emphasize that *documentation is j # Alternatives -[alternatives]: #alternatives - **Just add the "How do we teach this?" section.** @@ -245,19 +234,19 @@ At a "messaging" level, we should continue to emphasize that *documentation is j - **Embrace the documentation, but do not include "How do we teach this?" section in new RFCs.** - This still gives us most of the benefits (and was in fact the original form of the proposal), and does not place a new burden on RFC authors to make sure that knowing how to *teach* something is part of any new language or standard library feature. + This still gives us most of the benefits (and was in fact the original form of the proposal), and does not place a new burden on RFC authors to make sure that knowing how to *teach* something is part of any new language or standard library feature. - On the other hand, thinking about the impact on teaching should further improve consideration of the general ergonomics of a proposed feature. If something cannot be *taught* well, it's likely the design needs further refinement. + On the other hand, thinking about the impact on teaching should further improve consideration of the general ergonomics of a proposed feature. If something cannot be *taught* well, it's likely the design needs further refinement. - **No change; leave RFCs as canonical documentation.** - This approach can take (at least) two forms: + This approach can take (at least) two forms: 1. 
We can leave things as they are, where the RFC and surrounding discussion form the primary point of documentation for newer-than-1.0 language features. As part of that, we could just link more prominently to the RFC repository and describe the process from the documentation pages. 2. We could automatically render the text of the RFCs into part of the documentation used on the site (via submodules and the existing tooling around Markdown documents used for Rust documentation). - However, for all the reasons highlighted above in [**Motivation: The Current Situation**][current-situation], RFCs and their associated threads are *not* a good canonical source of information on language features. + However, for all the reasons highlighted above in **Motivation: The Current Situation**, RFCs and their associated threads are *not* a good canonical source of information on language features. - **Add a rule for the standard library but not for language features.** @@ -265,23 +254,24 @@ At a "messaging" level, we should continue to emphasize that *documentation is j - **Add a rule for language features but not for the standard library.** - The standard library is in much better shape, in no small part because of the ease of writing inline documentation for new modules. Adding a formal rule may not be necessary if good habits are already in place. + The standard library is in much better shape, in no small part because of the ease of writing inline documentation for new modules. Adding a formal rule may not be necessary if good habits are already in place. - On the other hand, having a formal policy would not seem to *hurt* anything here; it would simply formalize what is already happening (and perhaps, via linting attributes, make it easy to spot when it has failed). 
+ On the other hand, having a formal policy would not seem to *hurt* anything here; it would simply formalize what is already happening (and perhaps, via linting attributes, make it easy to spot when it has failed). - **Eliminate the reference entirely.** - Since the reference is already substantially out of date, it might make sense to stop presenting it publicly at all, at least until such a time as it has been completely reworked and updated. + Since the reference is already substantially out of date, it might make sense to stop presenting it publicly at all, at least until such a time as it has been completely reworked and updated. - The main upside to this is the reality that an outdated and inaccurate reference may be worse than no reference at all, as it may mislead espiecally new Rust users. + The main upside to this is the reality that an outdated and inaccurate reference may be worse than no reference at all, as it may mislead espiecally new Rust users. - The main downside, of course, is that this would leave very large swaths of the language basically without *any* documentation, and even more of it only documented in RFCs than is the case today. + The main downside, of course, is that this would leave very large swaths of the language basically without *any* documentation, and even more of it only documented in RFCs than is the case today. [RFC 1270]: https://github.com/rust-lang/rfcs/pull/1270 # Unresolved questions -[unresolved]: #unresolved-questions - How do we clearly distinguish between features on nightly, beta, and stable Rust—in the reference especially, but also in the book? - For the standard library, once it migrates to a crates structure, should it simply include the `#[forbid(missing_docs)]` attribute on all crates to set this as a build error? 
+ +[detailed-design]: From 3ad263e3cba5d61613ac91d83d49e101ce251271 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Wed, 19 Oct 2016 14:13:29 -0400 Subject: [PATCH 1129/1195] Drop extraneous link item. --- text/0000-document_all_features.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index abb4ef7f9b7..916cc2d6be9 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -273,5 +273,3 @@ At a "messaging" level, we should continue to emphasize that *documentation is j - How do we clearly distinguish between features on nightly, beta, and stable Rust—in the reference especially, but also in the book? - For the standard library, once it migrates to a crates structure, should it simply include the `#[forbid(missing_docs)]` attribute on all crates to set this as a build error? - -[detailed-design]: From 3d882f95d5d1b418ecb44c1dab6223a8fdf7f2c6 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Fri, 21 Oct 2016 12:11:17 -0400 Subject: [PATCH 1130/1195] Fix a typo; clarify some language. --- text/0000-document_all_features.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-document_all_features.md b/text/0000-document_all_features.md index 916cc2d6be9..4f9c7737eae 100644 --- a/text/0000-document_all_features.md +++ b/text/0000-document_all_features.md @@ -42,7 +42,7 @@ Importantly, though, this warning only appears on the [main site][home-to-refere [home-to-reference]: https://www.rust-lang.org/documentation.html -For example, the change in Rust 1.9 to allow users to use the `#[deprecated]` attribute for their own libraries is, at the time of writing this RFC, *nowhere* reflected in official documentation. (Many other examples could be supplied; this one is chosen for its relative simplicity and recency.) 
The Book's [discussion of attributes][book-attributes] links to the [reference list of attributes][ref-attributes], but as of the time of writing the reference [still specifies][ref-compiler-attributes] that `deprecated` is a compiler-only feature. The two places where users might become aware of the change are [the Rust 1.9 release blog post][1.9-blog] and the [RFC itself][RFC-1270]. Neither (yet) ranks highly in search; users are likely to be misled. +For example, the change in Rust 1.9 to allow users to use the `#[deprecated]` attribute for their own libraries was, at the time of writing this RFC, *nowhere* reflected in official documentation. (Many other examples could be supplied; this one was chosen for its relative simplicity and recency.) The Book's [discussion of attributes][book-attributes] linked to the [reference list of attributes][ref-attributes], but as of the time of writing the reference [still specifies][ref-compiler-attributes] that `deprecated` was a compiler-only feature. The two places where users might have become aware of the change are [the Rust 1.9 release blog post][1.9-blog] and the [RFC itself][RFC-1270]. Neither (yet) ranked highly in search; users were likely to be misled. [book-attributes]: https://doc.rust-lang.org/book/attributes.html [ref-attributes]: https://doc.rust-lang.org/reference.html#attributes @@ -64,7 +64,7 @@ There are several serious problems with the _status quo_ of using RFCs as ad hoc 2. In many cases, especially in more complicated language features, some important elements of the decision, details of implementation, and expected behavior are fleshed out either in the pull-request discussion for the RFC, or in the implementation issues which follow them. -3. The RFCs themselves, and even more so the associated pull request discussions, are often dense with programming langauge theory. 
This is as it should be in context, but it means that the relevant information may be inaccessible to Rust users without prior PLT background, or without the patience to wade through it. +3. The RFCs themselves, and even more so the associated pull request discussions, are often dense with programming language theory. This is as it should be in context, but it means that the relevant information may be inaccessible to Rust users without prior PLT background, or without the patience to wade through it. 4. Similarly, information about the final decisions on language features is often buried deep at the end of long and winding threads (especially for a complicated feature like `impl` specialization). From e79b2297b2999055d30d8439529aa6c0717d90d5 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 21 Oct 2016 20:05:25 -0700 Subject: [PATCH 1131/1195] RFC 1624 is break with values for loop --- text/{0000-loop-break-value.md => 1624-loop-break-value.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-loop-break-value.md => 1624-loop-break-value.md} (98%) diff --git a/text/0000-loop-break-value.md b/text/1624-loop-break-value.md similarity index 98% rename from text/0000-loop-break-value.md rename to text/1624-loop-break-value.md index be853f2e959..87d491cb8a7 100644 --- a/text/0000-loop-break-value.md +++ b/text/1624-loop-break-value.md @@ -1,7 +1,7 @@ - Feature Name: loop_break_value - Start Date: 2016-05-20 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1624 +- Rust Issue: https://github.com/rust-lang/rust/issues/37339 # Summary [summary]: #summary From 1811b12f9ffb22b1b0535303cc17e3aeb92ecba9 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 21 Oct 2016 22:22:46 -0700 Subject: [PATCH 1132/1195] RFC 1682 is field init shorthand --- ...0-field-init-shorthand.md => 1682-field-init-shorthand.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename 
text/{0000-field-init-shorthand.md => 1682-field-init-shorthand.md} (98%) diff --git a/text/0000-field-init-shorthand.md b/text/1682-field-init-shorthand.md similarity index 98% rename from text/0000-field-init-shorthand.md rename to text/1682-field-init-shorthand.md index e86861cc029..f0d79f80374 100644 --- a/text/0000-field-init-shorthand.md +++ b/text/1682-field-init-shorthand.md @@ -1,7 +1,7 @@ - Feature Name: field-init-shorthand - Start Date: 2016-07-18 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1682 +- Rust Issue: https://github.com/rust-lang/rust/issues/37340 # Summary [summary]: #summary From 060ea3105dedf5363e48a542f39163c2ac67cf4c Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 25 Oct 2016 09:02:08 -0700 Subject: [PATCH 1133/1195] RFC 1717 is dllimport connected to #[link(kind)] --- text/{0000-dllimport.md => 1717-dllimport.md} | 32 +++++++++---------- 1 file changed, 16 insertions(+), 16 deletions(-) rename text/{0000-dllimport.md => 1717-dllimport.md} (92%) diff --git a/text/0000-dllimport.md b/text/1717-dllimport.md similarity index 92% rename from text/0000-dllimport.md rename to text/1717-dllimport.md index 355c89aea30..3153f493980 100644 --- a/text/0000-dllimport.md +++ b/text/1717-dllimport.md @@ -1,7 +1,7 @@ - Feature Name: dllimport - Start Date: 2016-08-13 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1717](https://github.com/rust-lang/rfcs/pull/1717) +- Rust Issue: [rust-lang/rust#37403](https://github.com/rust-lang/rust/issues/37403) # Summary [summary]: #summary @@ -18,10 +18,10 @@ what native libraries need to be linked into a program. On some platforms, however, the compiler needs more detailed knowledge about what's being linked from where in order to ensure that symbols are wired up correctly. 
-On Windows, when a symbol is imported from a dynamic library, the code that accesses +On Windows, when a symbol is imported from a dynamic library, the code that accesses this symbol must be generated differently than for symbols imported from a static library. -Currently the compiler is not aware of associations between the libraries and symbols +Currently the compiler is not aware of associations between the libraries and symbols imported from them, so it cannot alter code generation based on library kind. # Detailed design @@ -35,7 +35,7 @@ are imported from the library mentioned in the `#[link]` attribute adorning the ### Changes to code generation On platforms other than Windows the above association will have no effect. -On Windows, however, `#[link(..., kind="dylib")` shall be presumed to mean linking to a dll, +On Windows, however, `#[link(..., kind="dylib")` shall be presumed to mean linking to a dll, whereas `#[link(..., kind="static")` shall mean static linking. In the former case, all symbols associated with that library will be marked with LLVM [dllimport][1] storage class. @@ -48,13 +48,13 @@ in through Cargo build scripts instead of being written in the source code itself. As a recap, a native library may change names across platforms or distributions or it may be linked dynamically in some situations and statically in others which is why build scripts are leveraged to make these -dynamic decisions. In order to support this kind of dynamism, the following +dynamic decisions. In order to support this kind of dynamism, the following modifications are proposed: -- Extend syntax of the `-l` flag to `-l [KIND=]lib[:NEWNAME]`. The `NEWNAME` +- Extend syntax of the `-l` flag to `-l [KIND=]lib[:NEWNAME]`. The `NEWNAME` part may be used to override name of a library specified in the source. -- Add new meaning to the `KIND` part: if "lib" is already specified in the source, - this will override its kind with KIND. 
Note that this override is possible only +- Add new meaning to the `KIND` part: if "lib" is already specified in the source, + this will override its kind with KIND. Note that this override is possible only for libraries defined in the current crate. Example: @@ -63,7 +63,7 @@ Example: // mylib.rs #[link(name="foo", kind="dylib")] extern { - // dllimport applied + // dllimport applied } #[link(name="bar", kind="static")] @@ -73,22 +73,22 @@ extern { #[link(name="baz")] extern { - // kind defaults to "dylib", dllimport applied + // kind defaults to "dylib", dllimport applied } ``` ```sh -rustc mylib.rs -l static=foo # change foo's kind to "static", dllimport will not be applied +rustc mylib.rs -l static=foo # change foo's kind to "static", dllimport will not be applied rustc mylib.rs -l foo:newfoo # link newfoo instead of foo, keeping foo's kind as "dylib" rustc mylib.rs -l dylib=bar # change bar's kind to "dylib", dllimport will be applied ``` ### Unbundled static libs (optional) -It had been pointed out that sometimes one may wish to link to a static system library -(i.e. one that is always available to the linker) without bundling it into .lib's and .rlib's. +It had been pointed out that sometimes one may wish to link to a static system library +(i.e. one that is always available to the linker) without bundling it into .lib's and .rlib's. For this use case we'll introduce another library "kind", "static-nobundle". -Such libraries would be treated in the same way as "static", except they will not be bundled into +Such libraries would be treated in the same way as "static", except they will not be bundled into the target .lib/.rlib. # Drawbacks @@ -114,7 +114,7 @@ meaning that it will be common that these attributes are left off by accident. - Support a `#[dllimport]` on extern blocks (or individual symbols, or both). 
This has the following drawbacks, however: - - This attribute would duplicate the information already provided by + - This attribute would duplicate the information already provided by `#[link(kind="...")]`. - It is not always known whether `#[dllimport]` is needed. Native libraires are not always known whether they're linked dynamically or From 49f1fea599ebbdbb7b5ce44a2c5d13997a815067 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 25 Oct 2016 09:29:29 -0700 Subject: [PATCH 1134/1195] RFC 1721 is customization of CRT linkage --- text/{0000-crt-static.md => 1721-crt-static.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-crt-static.md => 1721-crt-static.md} (99%) diff --git a/text/0000-crt-static.md b/text/1721-crt-static.md similarity index 99% rename from text/0000-crt-static.md rename to text/1721-crt-static.md index 387e845b8e5..2ccd66208a4 100644 --- a/text/0000-crt-static.md +++ b/text/1721-crt-static.md @@ -1,7 +1,7 @@ - Feature Name: `crt_link` - Start Date: 2016-08-18 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1721](https://github.com/rust-lang/rfcs/pull/1721) +- Rust Issue: [rust-lang/rust#37406](https://github.com/rust-lang/rust/issues/37406) # Summary [summary]: #summary From dd3ba8ec65f02c7742c75c7a25d5ce2c49fa5012 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Mon, 31 Oct 2016 08:55:54 -0700 Subject: [PATCH 1135/1195] RFC 1665 is windows subsystem support --- text/{0000-windows-subsystem.md => 1665-windows-subsystem.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-windows-subsystem.md => 1665-windows-subsystem.md} (97%) diff --git a/text/0000-windows-subsystem.md b/text/1665-windows-subsystem.md similarity index 97% rename from text/0000-windows-subsystem.md rename to text/1665-windows-subsystem.md index 3de857c72de..09476dc6687 100644 --- a/text/0000-windows-subsystem.md +++ b/text/1665-windows-subsystem.md @@ -1,7 +1,7 @@ - Feature 
Name: Windows Subsystem - Start Date: 2016-07-03 -- RFC PR: ____ -- Rust Issue: ____ +- RFC PR: [rust-lang/rfcs#1665](https://github.com/rust-lang/rfcs/pull/1665) +- Rust Issue: [rust-lang/rust#37499](https://github.com/rust-lang/rust/issues/37499) # Summary [summary]: #summary From 0253019edbe9d4fbc6243bc57d5c01eb10dfbfb3 Mon Sep 17 00:00:00 2001 From: Jake Goulding Date: Thu, 3 Nov 2016 15:48:30 -0400 Subject: [PATCH 1136/1195] Differentiate between the different uses of "windows" Throw in some proper noun and minor typo fixes as well. --- text/1665-windows-subsystem.md | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/text/1665-windows-subsystem.md b/text/1665-windows-subsystem.md index 09476dc6687..9100dfb3df5 100644 --- a/text/1665-windows-subsystem.md +++ b/text/1665-windows-subsystem.md @@ -6,10 +6,10 @@ # Summary [summary]: #summary -Rust programs compiled for windows will always allocate a console window on +Rust programs compiled for Windows will always allocate a console window on startup. This behavior is controlled via the `SUBSYSTEM` parameter passed to the linker, and so *can* be overridden with specific compiler flags. However, doing -so will bypass the rust-specific initialization code in `libstd`, as when using +so will bypass the Rust-specific initialization code in `libstd`, as when using the MSVC toolchain, the entry point must be named `WinMain`. This RFC proposes supporting this case explicitly, allowing `libstd` to @@ -18,10 +18,10 @@ continue to be initialized correctly. # Motivation [motivation]: #motivation -The `WINDOWS` subsystem is commonly used on windows: desktop applications +The `WINDOWS` subsystem is commonly used on Windows: desktop applications typically do not want to flash up a console window on startup. 
-Currently, using the `WINDOWS` subsystem from rust is undocumented, and the +Currently, using the `WINDOWS` subsystem from Rust is undocumented, and the process is non-trivial when targeting the MSVC toolchain. There are a couple of approaches, each with their own downsides: @@ -38,14 +38,14 @@ The GNU toolchain will accept either entry point. ## Override the entry point via linker options This uses the same method as will be described in this RFC. However, it will -result in build scripts also being compiled for the windows subsystem, which +result in build scripts also being compiled for the `WINDOWS` subsystem, which can cause additional console windows to pop up during compilation, making the system unusable while a build is in progress. # Detailed design [design]: #detailed-design -When an executable is linked while compiling for a windows target, it will be +When an executable is linked while compiling for a Windows target, it will be linked for a specific *subsystem*. The subsystem determines how the operating system will run the executable, and will affect the execution environment of the program. @@ -65,7 +65,7 @@ Initially, the set of possible values will be `{windows, console}`, but may be extended in future if desired. The use of this attribute in a non-executable crate will result in a compiler -warning. If compiling for a non-windows target, the attribute will be silently +warning. If compiling for a non-Windows target, the attribute will be silently ignored. ## Additional linker argument @@ -74,7 +74,7 @@ For the GNU toolchain, this will be sufficient. However, for the MSVC toolchain, the linker will be expecting a `WinMain` symbol, which will not exist. There is some complexity to the way in which a different entry point is expected -when using the windows subsystem. Firstly, the C-runtime library exports two +when using the `WINDOWS` subsystem. 
Firstly, the C-runtime library exports two symbols designed to be used as an entry point: ``` mainCRTStartup @@ -93,7 +93,7 @@ targeting the MSVC toolchain: This will override the entry point to always be `mainCRTStartup`. For console-subsystem programs this will have no effect, since it was already the -default, but for windows-subsystem programs, it will eliminate the need for +default, but for `WINDOWS` subsystem programs, it will eliminate the need for a `WinMain` symbol to be defined. This command line option will always be passed to the linker, regardless of the @@ -105,7 +105,7 @@ require `rustc` to perform some basic parsing of the linker options. [drawbacks]: #drawbacks - A new platform-specific crate attribute. -- The difficulty of manually calling the rust initialization code is potentially +- The difficulty of manually calling the Rust initialization code is potentially a more general problem, and this only solves a specific (if common) case. - The subsystem must be specified earlier than is strictly required: when compiling C/C++ code only the linker, not the compiler, needs to actually be @@ -121,7 +121,7 @@ require `rustc` to perform some basic parsing of the linker options. command line option. This command line option would only be applicable when compiling an - executable, and only for windows platforms. No other supported platforms + executable, and only for Windows platforms. No other supported platforms require a different entry point or additional linker arguments for programs designed to run with a graphical user interface. @@ -133,7 +133,7 @@ require `rustc` to perform some basic parsing of the linker options. A similar option would need to be added to `Cargo.toml` to make usage as simple as possible. - There's some bike-shedding which can be done one the exact command line + There's some bike-shedding which can be done on the exact command line interface, but one possible option is shown below. 
Rustc usage: @@ -151,7 +151,7 @@ require `rustc` to perform some basic parsing of the linker options. ``` The `crate-subsystem` command line option would exist on all platforms, - but would be ignored when compiling for a non-windows target, so as to + but would be ignored when compiling for a non-Windows target, so as to support cross-compiling. If not compiling a binary crate, specifying the option is an error regardless of the target. From 1a22f6073b557a213bdd7a8067cf0a47f775d7e0 Mon Sep 17 00:00:00 2001 From: archshift Date: Mon, 7 Nov 2016 11:46:43 -0800 Subject: [PATCH 1137/1195] Clarify re-coercion, pidgeonhole drawback; fix typos --- text/0000-closure-to-fn-coercion.md | 65 +++++++++++++++++++++++++---- 1 file changed, 56 insertions(+), 9 deletions(-) diff --git a/text/0000-closure-to-fn-coercion.md b/text/0000-closure-to-fn-coercion.md index 8f9f95eb003..b2c18a55210 100644 --- a/text/0000-closure-to-fn-coercion.md +++ b/text/0000-closure-to-fn-coercion.md @@ -6,19 +6,19 @@ # Summary [summary]: #summary -A non-capturing (that is, does not `Clone` or `move` any local variables) should be -coercable to a function pointer (`fn`). +A non-capturing (that is, does not `Clone` or `move` any local variables) closure +should be coercable to a function pointer (`fn`). # Motivation [motivation]: #motivation -Currently in rust, it is impossible to bind anything but a pre-defined function +Currently in Rust, it is impossible to bind anything but a pre-defined function as a function pointer. When dealing with closures, one must either rely upon -rust's type-inference capabilities, or use the `Fn` trait to abstract for any +Rust's type-inference capabilities, or use the `Fn` trait to abstract for any closure with a certain type signature. -What is not possible, though, is to define a function while at the same time -binding it to a function pointer. +It is not possible to define a function while at the same time binding it to a +function pointer. 
This is mainly used for convenience purposes, but in certain situations the lack of ability to do so creates a significant amount of boilerplate code. @@ -107,19 +107,66 @@ const foo: [fn(&mut u32); 4] = [ ]; ``` +Note that once explicitly assigned to an `Fn` trait, the closure can no longer be +coerced into `fn`, even if it has no captures. Just as we cannot do: + +```rust +let a: u32 = 0; // Coercion +let b: i32 = a; // Can't re-coerce +let x: *const u32 = &a; // Coercion +let y: &u32 = x; // Can't re-coerce +``` + +We can't similarly re-coerce a `Fn` trait. +```rust +let a: &Fn(u32) -> u32 = |foo: u32| { foo + 1 }; +let b: fn(u32) -> u32 = *a; // Can't re-coerce +``` + # Drawbacks [drawbacks]: #drawbacks -To a rust user, there is no drawback to this new coercion from closures to `fn` types. +This proposal could potentially allow Rust users to accidentally constrain their APIs. +In the case of a crate, a user accidentally returning `fn` instead of `Fn` may find +that their code compiles at first, but breaks when the user later needs to capture variables: + +```rust +// The specific syntax is more convenient to use +fn func_specific(&self) -> (fn() -> u32) { + || return 0 +} + +fn func_general<'a>(&'a self) -> impl Fn() -> u32 { + move || return self.field +} +``` + +In the above example, the API author could start off with the specific version of the function, +and by circumstance later need to capture a variable. The required change from `fn` to `Fn` could +be a breaking change. + +We do expect crate authors to measure their API's flexibility in other areas, however, as when +determining whether to take `&self` or `&mut self`. 
Taking a similar situation to the above: + +```rust +fn func_specific<'a>(&'a self) -> impl Fn() -> u32 { + move || return self.field +} + +fn func_general<'a>(&'a mut self) -> impl FnMut() -> u32 { + move || { self.field += 1; return self.field; } +} +``` -The only drawback is that it would add some amount of complexity to the type system. +This drawback is probably outweighed by convenience, simplicity, and the potential for optimization +that comes with the proposed changes, however. # Alternatives [alternatives]: #alternatives ## Anonymous function syntax -With this alternative, rust users would be able to directly bind a function +With this alternative, Rust users would be able to directly bind a function to a variable, without needing to give the function a name. ```rust From 756dc01e630e812ce9f960e92a5487a5f633ad38 Mon Sep 17 00:00:00 2001 From: Matt Brubeck Date: Thu, 10 Nov 2016 09:28:56 -0800 Subject: [PATCH 1138/1195] Fix a typo in RFC1201 example code --- text/1201-naked-fns.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/1201-naked-fns.md b/text/1201-naked-fns.md index 2249726fec1..870e66dd7a7 100644 --- a/text/1201-naked-fns.md +++ b/text/1201-naked-fns.md @@ -126,7 +126,7 @@ unsafe extern "C" fn correct1(x: &mut u8) { } #[naked] -extern "C" fn correct2() { +extern "C" fn correct2(x: &mut u8) { unsafe { *x += 1; } From 7403c08191e83ec2b7eaf44cad423f898d565f92 Mon Sep 17 00:00:00 2001 From: Niko Matsakis Date: Fri, 11 Nov 2016 20:02:52 -0500 Subject: [PATCH 1139/1195] RFC #1728 is "A process for establishing the Rust roadmap" --- text/{0000-north-star.md => 1728-north-star.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename text/{0000-north-star.md => 1728-north-star.md} (100%) diff --git a/text/0000-north-star.md b/text/1728-north-star.md similarity index 100% rename from text/0000-north-star.md rename to text/1728-north-star.md From 84e33e1173f9dd903050820627ab5cde008de68e Mon Sep 17 00:00:00 2001 From: 
=?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Fri, 18 Dec 2015 20:23:01 +0100 Subject: [PATCH 1140/1195] Add Rvalue-static-promotion RFC --- text/0000-rvalue_static_promotion.md | 171 +++++++++++++++++++++++++++ 1 file changed, 171 insertions(+) create mode 100644 text/0000-rvalue_static_promotion.md diff --git a/text/0000-rvalue_static_promotion.md b/text/0000-rvalue_static_promotion.md new file mode 100644 index 00000000000..420b83324fe --- /dev/null +++ b/text/0000-rvalue_static_promotion.md @@ -0,0 +1,171 @@ +- Feature Name: rvalue_static_promotion +- Start Date: 2015-12-18 +- RFC PR: +- Rust Issue: + +# Summary +[summary]: #summary + +Promote constexpr rvalues to values in static memory instead of +stack slots, and expose those in the language by being able to directly create +`'static` references to them. This would allow code like +`let x: &'static u32 = &42` to work. + +# Motivation +[motivation]: #motivation + +Right now, when dealing with constant values, you have to explicitly define +`const` or `static` items to create references with `'static` lifetime, +which can be unnecessarily verbose if those items never get exposed +in the actual API: + +```rust +fn return_x_or_a_default(x: Option<&u32>) -> &u32 { + if let Some(x) = x { + x + } else { + static DEFAULT_X: u32 = 42; + &DEFAULT_X + } +} +fn return_binop() -> &'static Fn(u32, u32) -> u32 { + const STATIC_TRAIT_OBJECT: &'static Fn(u32, u32) -> u32 + = &|x, y| x + y; + STATIC_TRAIT_OBJECT +} +``` + +Additionally, despite it being memory safe, it is not currently possible to +create a `&'static mut` to a zero-sized type without involving unsafe code: + +```rust +fn return_fn_mut_or_default(&mut self) -> &FnMut(u32, u32) -> u32 { + // error: references in constants may only refer to immutable values + const STATIC_TRAIT_OBJECT: &'static mut FnMut(u32, u32) -> u32 + = &mut |x, y| x * y; + + self.operator.unwrap_or(STATIC_TRAIT_OBJECT) +} +``` + +Lastly, the compiler already special cases a small subset of rvalue 
+const expressions to have static lifetime - namely the empty array expression: + +```rust +let x: &'static [u8] = &[]; +let y: &'static mut [u8] = &mut []; +``` + +And though they don't have to be seen as such, string literals could be regarded +as the same kind of special sugar: + +```rust +let b: &'static [u8; 4] = b"test"; +// could be seen as `= &[116, 101, 115, 116]` + +let s: &'static str = "foo"; +// could be seen as `= &str([102, 111, 111])` +// given `struct str([u8]);` and the ability to construct compound +// DST structs directly +``` + +With the proposed change, those special cases would instead become +part of a general language feature usable for custom code. + +# Detailed design +[design]: #detailed-design + +Inside a function body's block: + +- If a shared reference to a constexpr rvalue is taken. (`&`) +- And the constexpr does not contain a `UnsafeCell { ... }` constructor. +- And the constexpr does not contain a const fn call returning a type containing a `UnsafeCell`. +- Then instead of translating the value into a stack slot, translate + it into a static memory location and give the resulting reference a + `'static` lifetime. + +Likewise, + +- If a mutable reference to a constexpr rvalue is taken. (`&mut `) +- And the constexpr does not contain a `UnsafeCell { ... }` constructor. +- And the constexpr does not contain a const fn call returning a type containing a `UnsafeCell`. +- _And the type of the rvalue is zero-sized._ +- Then instead of translating the value into a stack slot, translate + it into a static memory location and give the resulting reference a + `'static` lifetime. + +The `UnsafeCell` restrictions are there to ensure that the promoted value is +truly immutable behind the reference (Though not technically needed in the zero-sized case, see alternatives below). 
+ +The zero-sized restriction for mutable references is there because +aliasing mutable references are only safe for zero sized types +(since you never dereference the pointer for them). + +Examples: + +```rust +// OK: +let a: &'static u32 = &32; +let b: &'static Option<UnsafeCell<u32>> = &None; +let c: &'static Fn() -> u32 = &|| 42; + +let d: &'static mut () = &mut (); +let e: &'static mut Fn() -> u32 = &mut || 42; + +// BAD: +let f: &'static Option<UnsafeCell<u32>> = &Some(UnsafeCell { data: 32 }); +let g: &'static Cell<u32> = &Cell::new(); // assuming const fn new() +``` + +These rules above should be consistent with the existing rvalue promotions in `const` +initializer lists: + +```rust +// If this compiles: +const X: &'static T = &<constexpr foo>; + +// Then this should compile as well: +let x: &'static T = &<constexpr foo>; +``` + +## Implementation + +The necessary changes in the compiler did already get implemented as +part of codegen optimizations (emitting references-to or memcopies-from values in static memory instead of embedding them in the code). + +All that is left to do is "throw the switch" for the new lifetime semantic +by removing these lines: +https://github.com/rust-lang/rust/blob/29ea4eef9fa6e36f40bc1f31eb1e56bf5941ee72/src/librustc/middle/mem_categorization.rs#L801-L807 + +(And of course fixing any fallout/bitrot that might have happened, adding tests, etc.) + +# Drawbacks +[drawbacks]: #drawbacks + +One more feature with seemingly ad-hoc rules to complicate the language... + +# Alternatives +[alternatives]: #alternatives + +There are two ways this could be taken further with zero-sized types: + +1. Remove the `UnsafeCell` restriction if the type of the rvalue is zero-sized. +2. The above, but also remove the __constexpr__ restriction, applying to any zero-sized rvalue instead. + +Both cases would work because one can't cause memory unsafety with a reference +to a zero sized value, and they would allow more safe code to compile.
+ +However, they might complicate reasoning about the rules more, +especially with the last one also being possibly confusing in regards to +side-effects. + +Not doing this mostly means relying on `static` and `const` items to create +`'static` references, while empty-array expressions would remain special cased. +It would also not be possible to safely create `&'static mut` references to zero-sized +types, though that part could also be achieved by allowing mutable references to +zero-sized types in constants. + +# Unresolved questions +[unresolved]: #unresolved-questions + +None, beyond "Should we do alternative 1 instead?". From c2ebeaf2017f8fef64844d4f773561ea8e3b122e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Fri, 18 Dec 2015 20:59:27 +0100 Subject: [PATCH 1141/1195] Add generic example --- text/0000-rvalue_static_promotion.md | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/text/0000-rvalue_static_promotion.md b/text/0000-rvalue_static_promotion.md index 420b83324fe..bce3a11f09d 100644 --- a/text/0000-rvalue_static_promotion.md +++ b/text/0000-rvalue_static_promotion.md @@ -35,6 +35,16 @@ fn return_binop() -> &'static Fn(u32, u32) -> u32 { } ``` +This workaround also has the limitation of not being able to refer to +type parameters of a containing generic function, e.g. you can't do this: + +```rust +fn generic<T>() -> &'static Option<T> { + const X: &'static Option<T> = &None::<T>; + X +} +``` + Additionally, despite it being memory safe, it is not currently possible to create a `&'static mut` to a zero-sized type without involving unsafe code: @@ -112,6 +122,12 @@ let c: &'static Fn() -> u32 = &|| 42; let d: &'static mut () = &mut (); let e: &'static mut Fn() -> u32 = &mut || 42; +let h: &'static u32 = &(32 + 64); + +fn generic<T>() -> &'static Option<T> { + &None::<T> +} + // BAD: let f: &'static Option<UnsafeCell<u32>> = &Some(UnsafeCell { data: 32 }); let g: &'static Cell<u32> = &Cell::new(); // assuming const fn new() From
1d9f54175c62cf9fcee6e7dea00dcf06121779aa Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Fri, 18 Dec 2015 21:04:45 +0100 Subject: [PATCH 1142/1195] Add Drawback elobaration to the generic case --- text/0000-rvalue_static_promotion.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/text/0000-rvalue_static_promotion.md b/text/0000-rvalue_static_promotion.md index bce3a11f09d..cf250758fcf 100644 --- a/text/0000-rvalue_static_promotion.md +++ b/text/0000-rvalue_static_promotion.md @@ -175,9 +175,11 @@ However, they might complicate reasoning about the rules more, especially with the last one also being possibly confusing in regards to side-effects. -Not doing this mostly means relying on `static` and `const` items to create -`'static` references, while empty-array expressions would remain special cased. -It would also not be possible to safely create `&'static mut` references to zero-sized +Not doing this means: + +- Relying on `static` and `const` items to create `'static` references, which won't work in generics. +- Empty-array expressions would remain special cased. +- It would also not be possible to safely create `&'static mut` references to zero-sized types, though that part could also be achieved by allowing mutable references to zero-sized types in constants.
From db2444065181515e75384d47fc657cb70682b395 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marvin=20L=C3=B6bel?= Date: Sat, 12 Nov 2016 21:11:30 +0100 Subject: [PATCH 1143/1195] Move mutable rvalue promotion to alternative section --- text/0000-rvalue_static_promotion.md | 79 ++++++++++++++++------------ 1 file changed, 44 insertions(+), 35 deletions(-) diff --git a/text/0000-rvalue_static_promotion.md b/text/0000-rvalue_static_promotion.md index cf250758fcf..039e3906979 100644 --- a/text/0000-rvalue_static_promotion.md +++ b/text/0000-rvalue_static_promotion.md @@ -45,25 +45,11 @@ fn generic<T>() -> &'static Option<T> { } ``` -Additionally, despite it being memory safe, it is not currently possible to -create a `&'static mut` to a zero-sized type without involving unsafe code: - -```rust -fn return_fn_mut_or_default(&mut self) -> &FnMut(u32, u32) -> u32 { - // error: references in constants may only refer to immutable values - const STATIC_TRAIT_OBJECT: &'static mut FnMut(u32, u32) -> u32 - = &mut |x, y| x * y; - - self.operator.unwrap_or(STATIC_TRAIT_OBJECT) -} -``` - -Lastly, the compiler already special cases a small subset of rvalue +However, the compiler already special cases a small subset of rvalue const expressions to have static lifetime - namely the empty array expression: ```rust let x: &'static [u8] = &[]; -let y: &'static mut [u8] = &mut []; ``` And though they don't have to be seen as such, string literals could be regarded @@ -94,22 +80,8 @@ Inside a function body's block: it into a static memory location and give the resulting reference a `'static` lifetime. -Likewise, - -- If a mutable reference to a constexpr rvalue is taken. (`&mut `) -- And the constexpr does not contain a `UnsafeCell { ... }` constructor. -- And the constexpr does not contain a const fn call returning a type containing a `UnsafeCell`.
-- _And the type of the rvalue is zero-sized._ -- Then instead of translating the value into a stack slot, translate - it into a static memory location and give the resulting reference a - `'static` lifetime. - The `UnsafeCell` restrictions are there to ensure that the promoted value is -truly immutable behind the reference (Though not technically needed in the zero-sized case, see alternatives below). - -The zero-sized restriction for mutable references is there because -aliasing mutable references are only safe for zero sized types -(since you never dereference the pointer for them). +truly immutable behind the reference. Examples: @@ -119,9 +91,6 @@ let a: &'static u32 = &32; let b: &'static Option<UnsafeCell<u32>> = &None; let c: &'static Fn() -> u32 = &|| 42; -let d: &'static mut () = &mut (); -let e: &'static mut Fn() -> u32 = &mut || 42; - let h: &'static u32 = &(32 + 64); fn generic<T>() -> &'static Option<T> { @@ -134,7 +103,7 @@ let g: &'static Cell<u32> = &Cell::new(); // assuming const fn new() ``` These rules above should be consistent with the existing rvalue promotions in `const` -initializer lists: +initializer expressions: ```rust // If this compiles: @@ -160,9 +129,49 @@ https://github.com/rust-lang/rust/blob/29ea4eef9fa6e36f40bc1f31eb1e56bf5941ee72/ One more feature with seemingly ad-hoc rules to complicate the language... -# Alternatives +# Alternatives, Extensions [alternatives]: #alternatives +It would be possible to extend support to `&'static mut` references, +as long as there is the additional constraint that the +referenced type is zero sized. + +This again has precedent in the array reference constructor: + +```rust +// valid code today +let y: &'static mut [u8] = &mut []; +``` + +The rules would be similar: + +- If a mutable reference to a constexpr rvalue is taken. (`&mut `) +- And the constexpr does not contain a `UnsafeCell { ... }` constructor. +- And the constexpr does not contain a const fn call returning a type containing a `UnsafeCell`.
+- _And the type of the rvalue is zero-sized._ +- Then instead of translating the value into a stack slot, translate + it into a static memory location and give the resulting reference a + `'static` lifetime. + +The zero-sized restriction is there because +aliasing mutable references are only safe for zero sized types +(since you never dereference the pointer for them). + +Example: + +```rust +fn return_fn_mut_or_default(&mut self) -> &FnMut(u32, u32) -> u32 { + self.operator.unwrap_or(&mut |x, y| x * y) + // ^ would be okay, since it would be translated like this: + // const STATIC_TRAIT_OBJECT: &'static mut FnMut(u32, u32) -> u32 + // = &mut |x, y| x * y; + // self.operator.unwrap_or(STATIC_TRAIT_OBJECT) +} + +let d: &'static mut () = &mut (); +let e: &'static mut Fn() -> u32 = &mut || 42; +``` + There are two ways this could be taken further with zero-sized types: 1. Remove the `UnsafeCell` restriction if the type of the rvalue is zero-sized. From 438a3ea0fdfc0ae13ab737cf50c9fe6a27ec0e05 Mon Sep 17 00:00:00 2001 From: Alex Crichton Date: Tue, 22 Nov 2016 21:59:03 -0800 Subject: [PATCH 1144/1195] RFC 1725 is unaligned access via `std::ptr` --- text/{0000-unaligned-access.md => 1725-unaligned-access.md} | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) rename text/{0000-unaligned-access.md => 1725-unaligned-access.md} (90%) diff --git a/text/0000-unaligned-access.md b/text/1725-unaligned-access.md similarity index 90% rename from text/0000-unaligned-access.md rename to text/1725-unaligned-access.md index bf942c65c95..6424f0c61c6 100644 --- a/text/0000-unaligned-access.md +++ b/text/1725-unaligned-access.md @@ -1,7 +1,7 @@ -- Feature Name: unaligned_access +- Feature Name: `unaligned_access` - Start Date: 2016-08-22 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#1725](https://github.com/rust-lang/rfcs/pull/1725) +- Rust Issue: [rust-lang/rust#37955](https://github.com/rust-lang/rust/issues/37955) # Summary 
[summary]: #summary From 97944a667e78853e348e17a5c2e842d005c9878d Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 2 Dec 2016 15:38:36 -0800 Subject: [PATCH 1145/1195] RFC 1636 is Require documentation for all new features --- ...document_all_features.md => 1636-document_all_features.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-document_all_features.md => 1636-document_all_features.md} (99%) diff --git a/text/0000-document_all_features.md b/text/1636-document_all_features.md similarity index 99% rename from text/0000-document_all_features.md rename to text/1636-document_all_features.md index 4f9c7737eae..a5a89f761ba 100644 --- a/text/0000-document_all_features.md +++ b/text/1636-document_all_features.md @@ -1,7 +1,7 @@ - Feature Name: document_all_features - Start Date: 2016-06-03 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1636 +- Rust Issue: N/A # Summary From 3813ce9f1f5b5f0d36bb83fe50b9f1d25e0ba8f9 Mon Sep 17 00:00:00 2001 From: Mike Date: Wed, 7 Dec 2016 08:47:31 +0900 Subject: [PATCH 1146/1195] Trivial - Fix Typo in Specialization RFC --- text/1210-impl-specialization.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/1210-impl-specialization.md b/text/1210-impl-specialization.md index 7e75147900f..44039c2ee24 100644 --- a/text/1210-impl-specialization.md +++ b/text/1210-impl-specialization.md @@ -161,7 +161,7 @@ default impl Add for T { ``` This default impl does *not* mean that `Add` is implemented for all `Clone` -data, but jut that when you do impl `Add` and `Self: Clone`, you can leave off +data, but just that when you do impl `Add` and `Self: Clone`, you can leave off `add_assign`: ```rust From 59e35882ef1b5baed2a98c00f8569c7f150246c6 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Tue, 6 Dec 2016 21:47:14 -0500 Subject: [PATCH 1147/1195] Add 'How We Teach This' to template, per RFC 1636. 
--- 0000-template.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/0000-template.md b/0000-template.md index a45c6110e58..1c778d89394 100644 --- a/0000-template.md +++ b/0000-template.md @@ -20,6 +20,15 @@ This is the bulk of the RFC. Explain the design in enough detail for somebody fa with the language to understand, and for somebody familiar with the compiler to implement. This should get into specifics and corner-cases, and include examples of how the feature is used. +# How We Teach This +[how-we-teach-this]: #how-we-teach-this + +What names and terminology work best for these concepts and why? How is this idea best presented—as a continuation of existing Rust patterns, or as a wholly new one? + +Would the acceptance of this proposal change how Rust is taught to new users at any level? How should this feature be introduced and taught to existing Rust users? + +What additions or changes to the Rust Reference, _The Rust Programming Language_, and/or _Rust by Example_ does it entail? + # Drawbacks [drawbacks]: #drawbacks From 75bbe75358299c43a15fa4c4970e394b5d13d39e Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Wed, 7 Dec 2016 13:00:31 -0500 Subject: [PATCH 1148/1195] Wrap sentences to newlines in "How We Teach This" --- 0000-template.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/0000-template.md b/0000-template.md index 1c778d89394..ef898e3360a 100644 --- a/0000-template.md +++ b/0000-template.md @@ -23,9 +23,11 @@ This should get into specifics and corner-cases, and include examples of how the # How We Teach This [how-we-teach-this]: #how-we-teach-this -What names and terminology work best for these concepts and why? How is this idea best presented—as a continuation of existing Rust patterns, or as a wholly new one? +What names and terminology work best for these concepts and why? +How is this idea best presented—as a continuation of existing Rust patterns, or as a wholly new one? 
-Would the acceptance of this proposal change how Rust is taught to new users at any level? How should this feature be introduced and taught to existing Rust users? +Would the acceptance of this proposal change how Rust is taught to new users at any level? +How should this feature be introduced and taught to existing Rust users? What additions or changes to the Rust Reference, _The Rust Programming Language_, and/or _Rust by Example_ does it entail? From 79a8539d4edcaa2d050b02510f2ff8d8caa1211f Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 21 Oct 2016 16:01:20 -0700 Subject: [PATCH 1149/1195] Roadmap for 2017 --- text/0000-roadmap-2017.md | 582 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 582 insertions(+) create mode 100644 text/0000-roadmap-2017.md diff --git a/text/0000-roadmap-2017.md b/text/0000-roadmap-2017.md new file mode 100644 index 00000000000..476934b1377 --- /dev/null +++ b/text/0000-roadmap-2017.md @@ -0,0 +1,582 @@ +- Feature Name: N/A +- Start Date: 2016-10-04 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +This RFC proposes the *2017 Rust Roadmap*, in accordance with +[RFC 1728](https://github.com/rust-lang/rfcs/pull/1728). The goal of the roadmap +is to lay out a vision for where the Rust project should be in a year's +time. **This year's focus is improving Rust's *productivity*, while retaining its +emphasis on fast, reliable code**. 
At a high level, by the end of 2017: + +* Rust should have a lower learning curve +* Rust should have a pleasant edit-compile-debug cycle +* Rust should provide a solid, but basic IDE experience +* Rust should provide easy access to high quality crates +* Rust should be well-equipped for writing robust, high-scale servers +* Rust should have 1.0-level crates for essential tasks +* Rust should integrate easily into large build systems +* Rust should integrate easily with C++ code +* Rust's community should provide mentoring at all levels + +The proposal is based on the [2016 survey], systematic outreach, direct +conversations with individual Rust users, and an extensive +[internals thread]. Thanks to everyone who helped with this effort! + +[2016 survey]: https://blog.rust-lang.org/2016/06/30/State-of-Rust-Survey-2016.html +[internals thread]: https://internals.rust-lang.org/t/setting-our-vision-for-the-2017-cycle/ + +# Motivation +[motivation]: #motivation + +There's no end of possible improvements to Rust—so what do we use to guide our +thinking? + +The core team has tended to view things not in terms of particular features or +aesthetic goals, but instead in terms of **making Rust successful while staying +true to its core values**. This basic sentiment underlies much of the proposed +roadmap, so let's unpack it a bit. + +## Making Rust successful + +### The measure of success + +What does it mean for Rust to be successful? There are a lot of good answers to +this question, a lot of different things that draw people to use or contribute +to Rust. But regardless of our *personal* values, there's at least one clear +measure for Rust's broad success: **people should be using Rust in +production and reaping clear benefits from doing so**. + +- Production use matters for the obvious reason: it grows the set of + stakeholders with potential to invest in the language and ecosystem. 
To + deliver on that potential, Rust needs to be part of the backbone of some major + products. + +- Production use measures our *design* success; it's the ultimate reality + check. Rust takes a unique stance on a number of tradeoffs, which we believe + to position it well for writing fast and reliable software. The real test of + those beliefs is people using Rust to build large, production systems, on + which they're betting time and money. + +- The *kind* of production use matters. For Rust to truly be a success, there + should be clear-cut reasons people are employing it rather than another + language. Rust needs to provide crisp, standout benefits to the organizations + using it. + +The idea here is *not* about "taking over the world" with Rust; it's not about +market share for the sake of market share. But if Rust is truly delivering a +valuable new way of programming, we should be seeing that benefit in "the real +world", in production uses that are significant enough to help sustain Rust's +development. + +### The obstacles to success + +At this point, we have a fair amount of data about how Rust is reaching its +audience, through the [2016 survey], informal conversations, and explicit +outreach to (pre-)production shops (writeup coming soon). The data from the +survey is generally corroborated by these other venues, so let's focus on that. + +[2016 survey]: https://blog.rust-lang.org/2016/06/30/State-of-Rust-Survey-2016.html + +We asked both current and potential users what most stands in the way of their +using Rust, and got some pretty clear answers: + +- 1 in 4: learning curve +- 1 in 7: lack of libraries +- 1 in 9: general “maturity” concerns +- 1 in 19: lack of IDEs (1 in 4 non-users) +- 1 in 20: compiler performance + +None of these obstacles is directly about the core language or `std`; people are +generally happy with what the language offers today. Instead, the connecting +theme is *productivity*—how quickly can I start writing real code? 
bring up a +team? prototype and iterate? debug my code? And so on. + +In other words, our primary challenge isn't making Rust "better" in the +abstract; it's making people *productive* with Rust. The need is most pronounced +in the early stages of Rust learning, where we risk losing a large pool of +interested people if we can't get them over the hump. Evidence from the survey +and elsewhere suggests that once people do get over the initial learning curve, +they tend to stick around. + +So how do we pull it off? + +### Core values + +Part of what makes Rust so exciting is that it attempts to eliminate some +seemingly fundamental tradeoffs. The central such tradeoff is between safety +and speed. Rust strives for + +- uncompromising reliability +- uncompromising performance + +and delivers on this goal largely thanks to its fundamental concept of +ownership. + +But there's a problem: at first glance, "productivity" and "learnability" may +seem at odds with Rust's core goals. It's common to hear the refrain that +"fighting with the borrow checker" is a rite of passage for Rustaceans. Or that +removing papercuts would mean glossing over safety holes or performance cliffs. + +To be sure, there are tradeoffs here. But as above, if there's one thing the +Rust community knows how to do, it's bending the curve around tradeoffs—memory +safety without garbage collection, concurrency without data races, and all the +rest. We have many examples in the language where we've managed to make a +feature pleasant to use, while also providing maximum performance and +safety—closures are a particularly good example, but there are +[others](https://internals.rust-lang.org/t/roadmap-2017-productivity-learning-curve-and-expressiveness/4097). + +And of course, beyond the core language, "productivity" also depends a lot on +tooling and the ecosystem. 
Cargo is one example where Rust's tooling provides a +huge productivity boost, and we've been working hard on other aspects of +tooling, like the +[compiler's error messages](https://blog.rust-lang.org/2016/08/10/Shape-of-errors-to-come.html), +that likewise have a big impact on productivity. There's so much more we can be +doing in this space. + +In short, **productivity should be a core value of Rust**. By the end of 2017, +let's try to earn the slogan: + +- Rust: fast, reliable, productive—pick three. + +# Detailed design +[design]: #detailed-design + +## Overall strategy + +In the abstract, reaching the kind of adoption we need means bringing +people along a series of distinct steps: + +- Public perception of Rust +- First contact +- Early play, toy projects +- Public projects +- Personal investment +- Professional investment + +We need to (1) provide "drivers", i.e. strong motivation to continue through the +stages and (2) avoid "blockers" that prevent people from progressing. + +At the moment, our most immediate adoption obstacles are mostly about blockers, +rather than a lack of drivers: there are people who see potential value in Rust, +but worry about issues like productivity, tooling, and maturity standing in the +way of use at scale. The roadmap proposes a set of goals largely angled at +reducing these blockers. + +However, for Rust to make sense to use in a significant way in production, it +also needs to have a "complete story" for one or more domains of use. The goals +call out a specific domain where we are already seeing promising production use, +and where we have a relatively clear path toward a more complete story. + +Almost all of the goals focus squarely on "productivity" of one kind or another. + +## Goals + +Now to the meat of the roadmap: the goals. Each is phrased in terms of a +*qualitative vision*, trying to carve out what the *experience* of Rust should +be in one year's time.
The details mention some possible avenues toward a +solution, but this shouldn't be taken as prescriptive. + +> These goals are partly informed from the [internals thread] about the +roadmap. That thread also posed a number of possible additional goals. Of +course, part of the work of the roadmap is to allocate our limited resources, +which fundamentally means not including some possible goals. Some of the most +promising suggestions that didn't make it into the roadmap proposal itself are +included at the end of the Goals section. + +### Rust should have a lower learning curve + +Rust offers a unique value proposition in part because it offers a unique +feature: its ownership model. Because the concept is not (yet!) a widespread one +in other languages, it is something most people have to learn from scratch +before hitting their stride with Rust. And that often comes on top of other +aspects of Rust that may be less familiar. A common refrain is "the first couple +of weeks are tough, but it's oh so worth it." How many people are bouncing off +of Rust before they get through those first couple of weeks? How many team leads +are reluctant to introduce Rust because of the training needed? (1 in 4 survey +respondents mentioned the learning curve.) + +Here are some strategies we might take to lower the learning curve: + +- **Improved docs**. While the existing Rust book has been successful, we've + learned a lot about teaching Rust, and there's a + [rewrite](http://words.steveklabnik.com/whats-new-with-the-rust-programming-language) + in the works. The effort is laser-focused on the key areas that trip people up + today (ownership, modules, strings, errors). + +- **Improved errors**. We've already made some + [big strides](https://blog.rust-lang.org/2016/08/10/Shape-of-errors-to-come.html) + here, particularly for ownership-related errors, but there's surely more room + for improvement. + +- **Improved language features**. 
There are a couple of ways that the language + design itself can be oriented toward learnability. First, we can introduce new + features with an explicit eye toward + [how they will be taught](https://github.com/rust-lang/rfcs/pull/1636). Second, + we can improve existing features to make them easier to understand and use -- + things like non-lexical lifetimes being a major example. There's already been + [some discussion on internals](https://internals.rust-lang.org/t/roadmap-2017-productivity-learning-curve-and-expressiveness/4097/). + +- **IDEs and other tooling**. IDEs provide a good opportunity for deeper + teaching. An IDE can visualize errors, for example *showing* you the lifetime + of a borrow. They can also provide deeper inspection of what's going on with + things like method dispatch, type inference, and so on. + +### Rust should have a pleasant edit-compile-debug cycle + +The edit-compile-debug cycle in Rust takes too long, and it's one of the +complaints we hear most often from production users. We've laid down a good +foundation with [MIR] (now turned on by default) and [incremental compilation] +(which recently hit alpha). But we need to continue pushing hard to actually +deliver the improvements. And to fully address the problem, **the improvement +needs to apply to large Rust projects, not just small or mid-sized benchmarks**. + +To get this done, we're also going to need further improvements to the +performance monitoring infrastructure, including more benchmarks. Note, though, +that the goal is stated *qualitatively*, and we need to be careful with what we +measure to ensure we don't lose sight of that goal. + +While the most obvious routes are direct improvements like incremental +compilation, since the focus here is primarily on development (including +debugging), another promising avenue is more usable debug builds. Production +users often say "debug binaries are too slow to run, but release binaries are +too slow to build".
There may be a lot of room in the middle. + +Depending on how far we want to take IDE support (see below), pushing +incremental compilation up through the earliest stages of the compiler may also +be important. + +[MIR]: https://blog.rust-lang.org/2016/04/19/MIR.html +[incremental compilation]: https://blog.rust-lang.org/2016/09/08/incremental.html + +### Rust should provide a solid, but basic IDE experience + +For many people—even whole organizations—IDEs are an essential part of the +programming workflow. In the survey, 1 in 4 respondents mentioned requiring IDE +support before using Rust seriously. Tools like [Racer] and the [IntelliJ] Rust +plugin have made great progress this year, but [compiler integration] in its +infancy, which limits the kinds of tools that general IDE plugins can provide. + +The problem statement here says "solid, but basic" rather than "world-class" IDE +support to set realistic expectations for what we can get done this year. Of +course, the precise contours will need to be driven by implementation work, but +we can enumerate some basic constraints for such an IDE here: + +- It should be **reliable**: it shouldn't crash, destroy work, or give inaccurate + results in situations that demand precision (like refactorings). +- It should be **responsive**: the interface should never hang waiting on the + compiler or other computation. In places where waiting is required, the + interface should update as smoothly as possible, while providing + responsiveness throughout. +- It should provide **basic functionality**. At a minimum, that's: syntax + highlighting, basic code navigation (e.g. go-to-definition), code completion, + build support (with Cargo integration), error integration, and code + formatting. 
+ +Note that while some of this functionality is available in existing IDE/plugin +efforts, a key part of this initiative is to (1) lay the foundation for plugins +based on compiler integration (2) pull together existing tools into a single +service that can integrate with multiple IDEs. + +[Racer]: https://github.com/phildawes/racer +[IntelliJ]: https://intellij-rust.github.io/ +[compiler integration]: https://internals.rust-lang.org/t/introducing-rust-language-server-source-release/4209/ + +### Rust should provide easy access to high quality crates + +Another major message from the survey and elsewhere is that Rust's ecosystem, +while growing, is still immature (1 in 9 survey respondents mentioned +this). Maturity is not something we can rush. But there are steps we can take +across the ecosystem to help improve the quality and discoverability of crates, +both of which will help increase the overall sense of maturity. + +Some avenues for quality improvement: + +- Provide stable, extensible test/bench frameworks. +- Provide more push-button CI setup, e.g. have `cargo new` set up Travis/Appveyor. +- Restart the [API guidelines](http://aturon.github.io/) project. +- Use badges on crates.io to signal various quality metrics. +- Perform API reviews on important crates. + +Some avenues for discoverability improvement: + +- Adding categories to crates.io, making it possible to browse lists like + "crates for parsing". +- More sophisticated ranking and/or curation. + +A number of ideas along these lines were discussed in the [Rust Platform thread]. + +[Rust Platform thread]: https://internals.rust-lang.org/t/proposal-the-rust-platform/3745 + +### Rust should be well-equipped for writing robust, high-scale servers + +The biggest area we've seen with interest in production Rust so far is the +server, particularly in cases where high-scale performance, control, and/or +reliability are paramount. 
At the moment, our ecosystem in this space is +nascent, and production users are having to build a lot from scratch. + +Of the specific domains we might target for having a more complete story, Rust +on the server is the place with the clearest direction and momentum. In a year's +time, it's within reach to drastically improve Rust's server ecosystem and the +overall experience of writing server code. The relevant pieces here include +foundations for async IO, language improvements for async code ergonomics, +shared infrastructure for writing services (including abstractions for +implementing protocols and middleware), and endless interfaces to existing +services/protocols. + +There are two reasons to focus on the robust, high-scale case. Most importantly, +it's the place where Rust has the clearest value proposition relative to other +languages, and hence the place where we're likeliest to achieve significant, +quality production usage (as discussed earlier in the RFC). More generally, the +overall server space is *huge*, so choosing a particular niche provides +essential focus for our efforts. + +### Rust should have 1.0-level crates for essential tasks + +Rust has taken a decidedly lean approach to its standard library, preferring for +much of the typical "batteries included" functionality to live externally in the +crates.io ecosystem. While there are a lot of benefits to that approach, it's +important that we do in fact provide the batteries somewhere: we need 1.0-level +functionality for essential tasks. To pick just one example, the `rand` crate +has suffered from a lack of vision and has effectively stalled before reaching +1.0 maturity, despite its central importance for a non-trivial part of the +ecosystem. + +There are two basic strategies we might take to close these gaps. 
+ +The first is to identify a broad set of "essential tasks" by, for example, +finding the commonalities between large "batteries included" standard libraries, +and focus community efforts on bolstering crates in these areas. With sustained +and systematic effort, we can probably help push a number of these crates to 1.0 +maturity this year. + +A second strategy is to focus specifically on tasks that play to Rust's +strengths. For example, Rust's potential for [fearless concurrency] across a +range of paradigms is one of the most unique and exciting aspects of the +language. But we aren't fully delivering on this potential, due to the +immaturity of libraries in the space. The response to work in this space, like +the recent [futures library announcement], suggests that there is a lot of +pent-up demand and excitement, and that this kind of work can open a lot of +doors for Rust. So concurrency/asynchrony/parallelism is one segment of the +ecosystem that likely deserves particular focus (and feeds into the high-scale +server goal as well); there are likely others. + +[fearless concurrency]: http://blog.rust-lang.org/2015/04/10/Fearless-Concurrency.html +[futures library announcement]: http://aturon.github.io/blog/2016/08/11/futures/ + +### Rust should integrate easily into large build systems + +When working with larger organizations interested in using Rust, one of the +first hurdles we tend to run into is fitting into an existing build +system. We've been exploring a number of different approaches, each of which +ends up using Cargo (and sometimes `rustc`) in different ways, with different +stories about how to incorporate crates from the broader crates.io ecosystem. +Part of the issue seems to be a perceived overlap between functionality in Cargo +(and its notion of compilation unit) and in ambient build systems, but we have +yet to truly get to the bottom of the issues—and it may be that the problem is +one of communication, rather than of some technical gap. 
+ +By the end of 2017, this kind of integration should be *easy*: as a community, +we should have a strong understanding of best practices, and potentially build +tooling in support of those practices. And of course, we want to approach this +goal with Rust's values in mind, ensuring that first-class access to the +crates.io ecosystem is a cornerstone of our eventual story. + +### Rust should integrate easily with C++ code + +Rust's current support for interfacing with C is fairly strong, but wrapping a C +library still involves tedious work mirroring declarations and writing C shims +or other glue code. Moreover, many projects that are ripe for Rust integration +are currently using C++, and interfacing with those effectively requires +maintaining an alternative C wrapper for the C++ APIs. This is a problem both +for Rust code that wants to employ existing libraries and for those who want to +integrate Rust into existing C/C++ codebases. + +Why C++ rather than the myriad other languages we might focus on? Of any of the +possible language integrations, C++ came up most consistently across both the +general survey and the commercial outreach. And that's not too surprising: Rust +is well-positioned as a C++ replacement, so it's natural for people with +existing C++ codebases to be looking at Rust but needing a good integration +story for moving incrementally. In addition, many of the features needed for +high quality C/C++ binding support are prerequisites for high quality binding +support in other languages, since runtime internals will generally be accessible +through a C or C++ interface. So in general, C++ seems like the wisest place to +focus on the language integration front for the moment. + +**The goal should be that using a C++ library in Rust is not much harder than +using it in C++**. In other words, it should be possible to directly include C++ +headers (e.g., include! 
{myproject.hpp}) and have the extern declarations, glue +code, and so forth get generated automatically. This means (eventually) full +support for interfacing with C++ code that uses features like templates, +overloading, classes and virtual calls, and so forth. + +A number of the needed ingredients already exist. With the addition of unions, +Rust itself supports much of the representation infrastructure needed to +describe C and C++ data structures, though gaps (such as bitflags) +remain. Current work on the bindgen tool is extending its support for handling +the layout of more complex C++ data structures (such as classes). Moreover, a +large number of projects have paved the way in terms of identifying best +practices and gaps. What is needed now is to formulate a plan for enabling +better interop, including intermediate milestones that allow early adopters to +make immediate use of each feature as it is added. + +### Rust's community should provide mentoring at all levels + +The Rust community is awesome, in large part because of how welcoming it is. But +we could do a lot more to help grow people into roles in the project, including +pulling together important work items at all level of expertise to direct people +to, providing mentoring, and having a clearer on-ramp to the various official +Rust teams. Outreach and mentoring is also one of the best avenues for +increasing diversity in the project, which, as the survey demonstrates, has a +lot of room for improvement. + +While there's work here for *all* the teams, the community team in particular +will continue to focus on early-stage outreach, while other teams will focus on +leadership onboarding. + +## Areas of support + +The goals above represent the steps we think are most essential to Rust's +success in 2017, and where we want to commit leadership resources from the +various Rust teams. 
+ +Beyond those goals, however, there are a number of areas with strong potential +for Rust that are in a more exploratory phase, with subcommunities already +exploring the frontiers. Some of these areas are important enough that we should +provide *support*, in two forms: + +- Highlighting the areas and trying to coordinate work within their + communities. For example, in the published form of the roadmap, these "areas + of interest" should be included, with links to ongoing work and points of + coordination. + +- Giving more consideration to feature requests that strongly impact these + areas. For example, numeric computation is a potential domain of interest, and + its one where "const generics" would be a great help—where the lack of that + feature is effectively a blocker. That's not to say we would necessarily + tackle such features; the primary goals have to come first. + +Here are two major areas we should consider supporting this way: + +- **Integration with higher-level languages like Javascript, Ruby, Python, Java + and C#**. Lack of integration in general can be a blocker for Rust adoption in + a given context, and so work on any of these fronts has the potential to + increase adoption. They came up many times in the internals thread. (However, + as explained above, C++ integration is the clear winner and hence primary + focus.) There's existing work on integration for these languages and many + others; we should raise awareness of this work, try to improve coordination + across the integration projects, and consider features that would aid + integration. + +- **Embedded devices**. Rust is a very natural fit for programming + resource-constrained devices, and there are some + [nascent efforts](https://github.com/rust-embedded/) to better organize work + in this area, as well as a + [thread](https://internals.rust-lang.org/t/roadmap-2017-needs-of-no-std-embedded-developers/4096) + on the current significant problems in the domain. 
Embedded devices likewise + came up repeatedly in the internals thread. It's also a potentially huge + market. At the moment, though, it's far from clear what it will take to + achieve significant production use in the embedded space. It would behoove us + to try to get a clearer picture of this space in 2017. + +## Non-goals + +Finally, it's important that the roadmap "have teeth": we should be focusing on +the goals, and avoid getting distracted by other improvements that, whatever +their appeal, could sap bandwidth and our ability to ship what we believe is +most important in 2017. + +To that end, it's worth making some explicit *non*-goals, to set expectations +and short-circuit discussions: + +- No major new language features, except in service of one of the goals. Cases + that have a very strong impact on the "areas of support" may be considered + case-by-case. + +- No major expansions to `std`, except in service of one of the goals. Cases + that have a very strong impact on the "areas of support" may be considered + case-by-case. + +- No Rust 2.0. In particular, no changes to the language or `std` that could be + perceived as "major breaking changes". We need to be doing everything we can + to foster maturity in Rust, both in reality and in perception, and ongoing + stability is an important part of that story. + +## Additional goals from the internals thread + +There were several strong contenders for additional goals from the +internals thread that we might consider: + +- A goal explicitly for + [systematic expansion of commercial use](https://internals.rust-lang.org/t/setting-our-vision-for-the-2017-cycle/3958/68); + this proposal takes that as a kind of overarching idea for all of the goals. 
+ +- A goal for Rust infrastructure, which came + [up](https://internals.rust-lang.org/t/setting-our-vision-for-the-2017-cycle/3958/9) + [several](https://internals.rust-lang.org/t/setting-our-vision-for-the-2017-cycle/3958/68) + [times](https://internals.rust-lang.org/t/setting-our-vision-for-the-2017-cycle/3958/5). + While this goal seems quite worthwhile in terms of paying dividends across the + project, in terms of our current subteam makeup it's hard to see how to + allocate resources toward this goal without dropping other important goals. We + might consider forming a dedicated infrastructure team, or somehow organizing + and growing our bandwidth in this area. + +- A goal for progress in areas like + [scientific computing](https://internals.rust-lang.org/t/setting-our-vision-for-the-2017-cycle/3958/52), + [HPC](https://internals.rust-lang.org/t/setting-our-vision-for-the-2017-cycle/3958/48). + +After an exhaustive look at the thread, the remaining proposals are in one way +or another covered somewhere in the discussion above. + +# Drawbacks and alternatives +[drawbacks]: #drawbacks + +It's a bit difficult to enumerate the full design space here, given how much +there is we could potentially be doing. + +At a high level, though, the biggest alternatives (and potential for drawbacks) +are probably at the strategic level. This roadmap proposal takes the approach of +(1) focusing on reducing clear blockers to Rust adoption, particularly connected +with productivity and (2) choosing one particular "driver" for adoption to +invest in, namely high-scale servers. The balance between blocker/driver focus +could be shifted—it might be the case that by providing more incentive to use +Rust in a particular domain, people are willing to overlook some of its +shortcomings. + +Another possible blind spot is the conservative take on language expansion, +particularly when it comes to productivity. 
For example, we could put much +greater emphasis on "metaprogramming", and try to complete Plugins 2.0 +in 2017. That kind of investment *could* pay dividends, since libraries can do +amazing things with plugins that could draw people to Rust. But, as above, the +overall strategy of reducing blockers assumes that what's most needed isn't more +flashy examples of Rust's power, but rather more bread-and-butter work on +reducing friction, improving tooling, and just making Rust easier to use across +the board. + +The roadmap is pretty informed by the survey, systematic outreach, numerous +direct conversations, and general strategic thinking. But there could certainly +be blind spots and biases. It's worth double-checking our inputs. + +# Unresolved questions +[unresolved]: #unresolved-questions + +The main unresolved question is how to break the given goals into more +deliverable pieces of work, but that's a process that will happen after the +overall roadmap is approved. + +Are there other "areas of support" we should consider? Should any of these areas +be elevated to a top-level goal (which would likely involve cutting back on some +other goal)? + +Should we consider some loose way of organizing "special interest groups" to +focus on some of the priorities not part of the official goal set, but where +greater coordination would be helpful? This was suggested +[multiple](https://internals.rust-lang.org/t/setting-our-vision-for-the-2017-cycle/3958/70) +[times](https://internals.rust-lang.org/t/setting-our-vision-for-the-2017-cycle/3958/135). + +Finally, there were several strong contenders for additional goals from the +internals thread that we might consider, which are listed at the end of the +goals section. 
From e59d253ac87318b13c8f751aeb1ac875ed4c8ef2 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 21 Oct 2016 16:08:15 -0700 Subject: [PATCH 1150/1195] Clarify Additional Goals section --- text/0000-roadmap-2017.md | 17 ++++++----------- 1 file changed, 6 insertions(+), 11 deletions(-) diff --git a/text/0000-roadmap-2017.md b/text/0000-roadmap-2017.md index 476934b1377..88f9f81ef99 100644 --- a/text/0000-roadmap-2017.md +++ b/text/0000-roadmap-2017.md @@ -6,11 +6,7 @@ # Summary [summary]: #summary -This RFC proposes the *2017 Rust Roadmap*, in accordance with -[RFC 1728](https://github.com/rust-lang/rfcs/pull/1728). The goal of the roadmap -is to lay out a vision for where the Rust project should be in a year's -time. **This year's focus is improving Rust's *productivity*, while retaining its -emphasis on fast, reliable code**. At a high level, by the end of 2017: +This RFC proposes the *2017 Rust Roadmap*, in accordance with [RFC 1728](https://github.com/rust-lang/rfcs/pull/1728). The goal of the roadmap is to lay out a vision for where the Rust project should be in a year's time. **This year's focus is improving Rust's *productivity*, while retaining its emphasis on fast, reliable code**. At a high level, by the end of 2017: * Rust should have a lower learning curve * Rust should have a pleasant edit-compile-debug cycle @@ -22,9 +18,7 @@ emphasis on fast, reliable code**. At a high level, by the end of 2017: * Rust should integrate easily with C++ code * Rust's community should provide mentoring at all levels -The proposal is based on the [2016 survey], systematic outreach, direct -conversations with individual Rust users, and an extensive -[internals thread]. Thanks to everyone who helped with this effort! +The proposal is based on the [2016 survey], systematic outreach, direct conversations with individual Rust users, and an extensive [internals thread]. Thanks to everyone who helped with this effort! 
[2016 survey]: https://blog.rust-lang.org/2016/06/30/State-of-Rust-Survey-2016.html [internals thread]: https://internals.rust-lang.org/t/setting-our-vision-for-the-2017-cycle/ @@ -505,10 +499,11 @@ and short-circuit discussions: to foster maturity in Rust, both in reality and in perception, and ongoing stability is an important part of that story. -## Additional goals from the internals thread +## Other ideas from the internals thread -There were several strong contenders for additional goals from the -internals thread that we might consider: +There were several strong contenders for additional goals from the internals +thread that we might consider. To be clear, these are not currently part of the +proposed goals, but we may want to consider elevating them: - A goal explicitly for [systematic expansion of commercial use](https://internals.rust-lang.org/t/setting-our-vision-for-the-2017-cycle/3958/68); From 31c4f92534651d24498d914b6d95173454becb5a Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 21 Oct 2016 16:22:23 -0700 Subject: [PATCH 1151/1195] Missing period --- text/0000-roadmap-2017.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-roadmap-2017.md b/text/0000-roadmap-2017.md index 88f9f81ef99..1179ddb40a6 100644 --- a/text/0000-roadmap-2017.md +++ b/text/0000-roadmap-2017.md @@ -157,7 +157,7 @@ stages and (2) avoid "blockers" that prevent people from progressing. At the moment, our most immediate adoption obstacles are mostly about blockers, rather than a lack of drivers: there are people who see potential value in Rust, but worry about issues like productivity, tooling, and maturity standing in the -way of use at scale The roadmap proposes a set of goals largely angled at +way of use at scale. The roadmap proposes a set of goals largely angled at reducing these blockers. 
However, for Rust to make sense to use in a significant way in production, it From 097fccdea6a148d0ffabcb2916f5ae1095a96d83 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 21 Oct 2016 17:06:43 -0700 Subject: [PATCH 1152/1195] JavaScript --- text/0000-roadmap-2017.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-roadmap-2017.md b/text/0000-roadmap-2017.md index 1179ddb40a6..b4caefcb640 100644 --- a/text/0000-roadmap-2017.md +++ b/text/0000-roadmap-2017.md @@ -455,7 +455,7 @@ provide *support*, in two forms: Here are two major areas we should consider supporting this way: -- **Integration with higher-level languages like Javascript, Ruby, Python, Java +- **Integration with higher-level languages like JavaScript, Ruby, Python, Java and C#**. Lack of integration in general can be a blocker for Rust adoption in a given context, and so work on any of these fronts has the potential to increase adoption. They came up many times in the internals thread. (However, From a2cb43616195ff05cf636bc4ad664c6cd7bf63c9 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 21 Oct 2016 17:07:49 -0700 Subject: [PATCH 1153/1195] clarity supporting areas --- text/0000-roadmap-2017.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-roadmap-2017.md b/text/0000-roadmap-2017.md index b4caefcb640..64acf1f1197 100644 --- a/text/0000-roadmap-2017.md +++ b/text/0000-roadmap-2017.md @@ -453,7 +453,7 @@ provide *support*, in two forms: feature is effectively a blocker. That's not to say we would necessarily tackle such features; the primary goals have to come first. -Here are two major areas we should consider supporting this way: +Here are two major areas we should support this way: - **Integration with higher-level languages like JavaScript, Ruby, Python, Java and C#**. 
Lack of integration in general can be a blocker for Rust adoption in From c403028cc4fbcbdd936d33571f757e676a97fad3 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 21 Oct 2016 19:27:28 -0700 Subject: [PATCH 1154/1195] Clarify production framing --- text/0000-roadmap-2017.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/text/0000-roadmap-2017.md b/text/0000-roadmap-2017.md index 64acf1f1197..e31ba3df328 100644 --- a/text/0000-roadmap-2017.md +++ b/text/0000-roadmap-2017.md @@ -29,7 +29,7 @@ The proposal is based on the [2016 survey], systematic outreach, direct conversa There's no end of possible improvements to Rust—so what do we use to guide our thinking? -The core team has tended to view things not in terms of particular features or +The core team has tended to view our strategy not in terms of particular features or aesthetic goals, but instead in terms of **making Rust successful while staying true to its core values**. This basic sentiment underlies much of the proposed roadmap, so let's unpack it a bit. @@ -66,6 +66,12 @@ valuable new way of programming, we should be seeing that benefit in "the real world", in production uses that are significant enough to help sustain Rust's development. +That's not to say we should expect to see this usage *immediately*; there's a +long pipeline for technology adoption, so the effects of our work can take a +while to appear. The framing here is about our long-term aims. We should be +making investments in Rust today that will position it well for this kind of +success in the future. 
+ ### The obstacles to success At this point, we have a fair amount of data about how Rust is reaching its From b7ac67c87c85ee24040df8271b8142245f8162e1 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 21 Oct 2016 19:42:13 -0700 Subject: [PATCH 1155/1195] Fix awkward phrasing --- text/0000-roadmap-2017.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/0000-roadmap-2017.md b/text/0000-roadmap-2017.md index e31ba3df328..2539ecc1a4f 100644 --- a/text/0000-roadmap-2017.md +++ b/text/0000-roadmap-2017.md @@ -195,9 +195,9 @@ in other languages, it is something most people have to learn from scratch before hitting their stride with Rust. And that often comes on top of other aspects of Rust that may be less familiar. A common refrain is "the first couple of weeks are tough, but it's oh so worth it." How many people are bouncing off -of Rust before they get through those first couple of weeks? How many team leads -are reluctant to introduce Rust because of the training needed? (1 in 4 survey -respondents mentioned the learning curve.) +of Rust in those first couple of weeks? How many team leads are reluctant to +introduce Rust because of the training needed? (1 in 4 survey respondents +mentioned the learning curve.) Here are some strategies we might take to lower the learning curve: From 1d10205dc908c638410ef65bfa3c54d9990a0f7c Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Fri, 21 Oct 2016 19:56:16 -0700 Subject: [PATCH 1156/1195] Clarify extra goals from internals --- text/0000-roadmap-2017.md | 68 +++++++++++++++++++++------------------ 1 file changed, 36 insertions(+), 32 deletions(-) diff --git a/text/0000-roadmap-2017.md b/text/0000-roadmap-2017.md index 2539ecc1a4f..5fbe4e19e07 100644 --- a/text/0000-roadmap-2017.md +++ b/text/0000-roadmap-2017.md @@ -180,12 +180,12 @@ Now to the meat of the roadmap: the goals. Each is phrased in terms of a be in one year's time. 
The details mention some possible avenues toward a solution, but this shouldn't be taken as prescriptive. -> These goals are partly informed from the [internals thread] about the +These goals are partly informed from the [internals thread] about the roadmap. That thread also posed a number of possible additional goals. Of course, part of the work of the roadmap is to allocate our limited resources, which fundamentally means not including some possible goals. Some of the most promising suggestions that didn't make it into the roadmap proposal itself are -included at the end of the Goals section. +included in the Alternatives section. ### Rust should have a lower learning curve @@ -505,9 +505,42 @@ and short-circuit discussions: to foster maturity in Rust, both in reality and in perception, and ongoing stability is an important part of that story. +# Drawbacks and alternatives +[drawbacks]: #drawbacks + +It's a bit difficult to enumerate the full design space here, given how much +there is we could potentially be doing. Instead, we'll take a look at some +alternative high-level strategies, and some additional goals from the internals +thread. + +## Overall strategy + +At a high level, though, the biggest alternatives (and potential for drawbacks) +are probably at the strategic level. This roadmap proposal takes the approach of +(1) focusing on reducing clear blockers to Rust adoption, particularly connected +with productivity and (2) choosing one particular "driver" for adoption to +invest in, namely high-scale servers. The balance between blocker/driver focus +could be shifted—it might be the case that by providing more incentive to use +Rust in a particular domain, people are willing to overlook some of its +shortcomings. + +Another possible blind spot is the conservative take on language expansion, +particularly when it comes to productivity. For example, we could put much +greater emphasis on "metaprogramming", and try to complete Plugins 2.0 +in 2017. 
That kind of investment *could* pay dividends, since libraries can do +amazing things with plugins that could draw people to Rust. But, as above, the +overall strategy of reducing blockers assumes that what's most needed isn't more +flashy examples of Rust's power, but rather more bread-and-butter work on +reducing friction, improving tooling, and just making Rust easier to use across +the board. + +The roadmap is informed by the survey, systematic outreach, numerous direct +conversations, and general strategic thinking. But there could certainly be +blind spots and biases. It's worth double-checking our inputs. + ## Other ideas from the internals thread -There were several strong contenders for additional goals from the internals +Finally, there were several strong contenders for additional goals from the internals thread that we might consider. To be clear, these are not currently part of the proposed goals, but we may want to consider elevating them: @@ -532,35 +565,6 @@ proposed goals, but we may want to consider elevating them: After an exhaustive look at the thread, the remaining proposals are in one way or another covered somewhere in the discussion above. -# Drawbacks and alternatives -[drawbacks]: #drawbacks - -It's a bit difficult to enumerate the full design space here, given how much -there is we could potentially be doing. - -At a high level, though, the biggest alternatives (and potential for drawbacks) -are probably at the strategic level. This roadmap proposal takes the approach of -(1) focusing on reducing clear blockers to Rust adoption, particularly connected -with productivity and (2) choosing one particular "driver" for adoption to -invest in, namely high-scale servers. The balance between blocker/driver focus -could be shifted—it might be the case that by providing more incentive to use -Rust in a particular domain, people are willing to overlook some of its -shortcomings. 
- -Another possible blind spot is the conservative take on language expansion, -particularly when it comes to productivity. For example, we could put much -greater emphasis on "metaprogramming", and try to complete Plugins 2.0 -in 2017. That kind of investment *could* pay dividends, since libraries can do -amazing things with plugins that could draw people to Rust. But, as above, the -overall strategy of reducing blockers assumes that what's most needed isn't more -flashy examples of Rust's power, but rather more bread-and-butter work on -reducing friction, improving tooling, and just making Rust easier to use across -the board. - -The roadmap is pretty informed by the survey, systematic outreach, numerous -direct conversations, and general strategic thinking. But there could certainly -be blind spots and biases. It's worth double-checking our inputs. - # Unresolved questions [unresolved]: #unresolved-questions From 8e737ce58022082a1bcfafce41c53ee05719f61c Mon Sep 17 00:00:00 2001 From: Sean Griffin Date: Sat, 10 Dec 2016 05:25:00 -0500 Subject: [PATCH 1157/1195] Add suggestions from withoutboats --- text/0000-allow-self-in-where-clauses.md | 42 +++++++++++++++--------- 1 file changed, 27 insertions(+), 15 deletions(-) diff --git a/text/0000-allow-self-in-where-clauses.md b/text/0000-allow-self-in-where-clauses.md index 64e0c2aee59..da257608c9e 100644 --- a/text/0000-allow-self-in-where-clauses.md +++ b/text/0000-allow-self-in-where-clauses.md @@ -6,8 +6,8 @@ # Summary [summary]: #summary -This RFC proposes allowing the `Self` type to be used in where clauses for trait -implementations, as well as referencing associated types for the trait being +This RFC proposes allowing the `Self` type to be used in every position in trait +implementations, including where clauses and other parameters to the trait being implemented. # Motivation @@ -46,28 +46,40 @@ on the associated type. It would be nice to reduce some of that duplication. 
# Detailed design [design]: #detailed-design -The first half of this RFC is simple. Inside of a where clause for trait -implementations, `Self` will refer to the type the trait is being implemented -for. It will have the same value as `Self` being used in the body of the trait -implementation. +Instead of blocking `Self` from being used in the "header" of a trait impl, +it will be understood to be a reference to the implementation type. For example, +all of these would be valid: -Accessing associated types will have the same result as copying the body of the -associated type into the place where it's being used. That is to say that it -will assume that all constraints hold, and evaluate to what the type would have -been in that case. Ideally one should never have to write `::SomeType`, but in practice it will likely be required to remove -issues with recursive evaluation. +```rust +impl SomeTrait for SomeType where Self: SomeOtherTrait { } + +impl SomeTrait<Self> for SomeType { } + +impl SomeTrait for SomeType where SomeOtherType<Self>: SomeTrait { } + +impl SomeTrait for SomeType where Self::AssocType: SomeOtherTrait { + AssocType = SomeOtherType; +} +``` + +If the `Self` type is parameterized by `Self`, an error that the type definition +is recursive is thrown, rather than not recognizing self. + +```rust +// The error here is because this would be Vec<Vec<Self>>, Vec<Vec<Vec<Self>>>, ... +impl SomeTrait for Vec<Self> { } +``` # Drawbacks [drawbacks]: #drawbacks -`Self` is always less explicit than the alternative +`Self` is always less explicit than the alternative. # Alternatives [alternatives]: #alternatives -Not implementing this, or only allowing bare `Self` but not associated types in -where clauses +Not implementing this is an alternative, as is accepting Self only in where clauses +and not other positions in the impl header. 
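The impl-header forms described in the detailed design above can be sketched as a small standalone program. This is only an illustration under assumptions: every name in it (`Named`, `Describe`, `CombineWith`, `Widget`, and the two `*_sample` helpers) is hypothetical and does not come from the RFC, and it presumes a compiler that accepts `Self` in the impl header as proposed.

```rust
// Hypothetical traits and types, invented for this sketch only.
trait Named {
    fn name(&self) -> String;
}

trait Describe {
    fn describe(&self) -> String;
}

// A trait with a type parameter, so that `Self` can be passed to it.
trait CombineWith<Rhs> {
    fn combine(&self, other: &Rhs) -> u32;
}

struct Widget {
    weight: u32,
}

impl Named for Widget {
    fn name(&self) -> String {
        String::from("widget")
    }
}

// `Self` in a where clause of the impl header stands for `Widget`.
impl Describe for Widget
where
    Self: Named,
{
    fn describe(&self) -> String {
        format!("a {} weighing {}", self.name(), self.weight)
    }
}

// `Self` as a parameter to the trait being implemented.
impl CombineWith<Self> for Widget {
    fn combine(&self, other: &Self) -> u32 {
        self.weight + other.weight
    }
}

fn describe_sample() -> String {
    Widget { weight: 3 }.describe()
}

fn combine_sample() -> u32 {
    Widget { weight: 3 }.combine(&Widget { weight: 4 })
}

fn main() {
    println!("{}", describe_sample());
    println!("{}", combine_sample());
}
```

In both impls `Self` could be replaced by `Widget` without changing the meaning; the sketch only shows that the shorter spelling refers to the same type.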
# Unresolved questions [unresolved]: #unresolved-questions From 7a446f87b48819ce629cc42350a6bf3b49c787bf Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 13 Dec 2016 15:25:17 -0800 Subject: [PATCH 1158/1195] RFC 1566 is Procedural macros --- text/{0000-proc-macros.md => 1566-proc-macros.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-proc-macros.md => 1566-proc-macros.md} (99%) diff --git a/text/0000-proc-macros.md b/text/1566-proc-macros.md similarity index 99% rename from text/0000-proc-macros.md rename to text/1566-proc-macros.md index 00e5cf32207..ff4ebd14d75 100644 --- a/text/0000-proc-macros.md +++ b/text/1566-proc-macros.md @@ -1,7 +1,7 @@ - Feature Name: procedural_macros - Start Date: 2016-02-15 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1566 +- Rust Issue: https://github.com/rust-lang/rust/issues/38356 # Summary [summary]: #summary From 4f40ba07f2a0730c188cb5db6b0b9c5887ae1801 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 13 Dec 2016 17:20:06 -0800 Subject: [PATCH 1159/1195] Remove C++ goal, and refactor to 'areas of exploration' --- text/0000-roadmap-2017.md | 155 +++++++++++++++++++------------------- 1 file changed, 77 insertions(+), 78 deletions(-) diff --git a/text/0000-roadmap-2017.md b/text/0000-roadmap-2017.md index 5fbe4e19e07..f12aa6759f6 100644 --- a/text/0000-roadmap-2017.md +++ b/text/0000-roadmap-2017.md @@ -15,9 +15,14 @@ This RFC proposes the *2017 Rust Roadmap*, in accordance with [RFC 1728](https:/ * Rust should be well-equipped for writing robust, high-scale servers * Rust should have 1.0-level crates for essential tasks * Rust should integrate easily into large build systems -* Rust should integrate easily with C++ code * Rust's community should provide mentoring at all levels +In addition, we should make significant strides in *exploring* two areas where +we're not quite ready to set out specific goals: + +* Integration 
with other languages, running the gamut from C to JavaScript +* Usage in resource-constrained environments + The proposal is based on the [2016 survey], systematic outreach, direct conversations with individual Rust users, and an extensive [internals thread]. Thanks to everyone who helped with this effort! [2016 survey]: https://blog.rust-lang.org/2016/06/30/State-of-Rust-Survey-2016.html @@ -385,44 +390,6 @@ tooling in support of those practices. And of course, we want to approach this goal with Rust's values in mind, ensuring that first-class access to the crates.io ecosystem is a cornerstone of our eventual story. -### Rust should integrate easily with C++ code - -Rust's current support for interfacing with C is fairly strong, but wrapping a C -library still involves tedious work mirroring declarations and writing C shims -or other glue code. Moreover, many projects that are ripe for Rust integration -are currently using C++, and interfacing with those effectively requires -maintaining an alternative C wrapper for the C++ APIs. This is a problem both -for Rust code that wants to employ existing libraries and for those who want to -integrate Rust into existing C/C++ codebases. - -Why C++ rather than the myriad other languages we might focus on? Of any of the -possible language integrations, C++ came up most consistently across both the -general survey and the commercial outreach. And that's not too surprising: Rust -is well-positioned as a C++ replacement, so it's natural for people with -existing C++ codebases to be looking at Rust but needing a good integration -story for moving incrementally. In addition, many of the features needed for -high quality C/C++ binding support are prerequisites for high quality binding -support in other languages, since runtime internals will generally be accessible -through a C or C++ interface. So in general, C++ seems like the wisest place to -focus on the language integration front for the moment. 
- -**The goal should be that using a C++ library in Rust is not much harder than -using it in C++**. In other words, it should be possible to directly include C++ -headers (e.g., include! {myproject.hpp}) and have the extern declarations, glue -code, and so forth get generated automatically. This means (eventually) full -support for interfacing with C++ code that uses features like templates, -overloading, classes and virtual calls, and so forth. - -A number of the needed ingredients already exist. With the addition of unions, -Rust itself supports much of the representation infrastructure needed to -describe C and C++ data structures, though gaps (such as bitflags) -remain. Current work on the bindgen tool is extending its support for handling -the layout of more complex C++ data structures (such as classes). Moreover, a -large number of projects have paved the way in terms of identifying best -practices and gaps. What is needed now is to formulate a plan for enabling -better interop, including intermediate milestones that allow early adopters to -make immediate use of each feature as it is added. - ### Rust's community should provide mentoring at all levels The Rust community is awesome, in large part because of how welcoming it is. But @@ -437,50 +404,82 @@ While there's work here for *all* the teams, the community team in particular will continue to focus on early-stage outreach, while other teams will focus on leadership onboarding. -## Areas of support +## Areas of exploration The goals above represent the steps we think are most essential to Rust's -success in 2017, and where we want to commit leadership resources from the -various Rust teams. +success in 2017, and where we are in a position to lay out a fairly concrete vision. Beyond those goals, however, there are a number of areas with strong potential for Rust that are in a more exploratory phase, with subcommunities already -exploring the frontiers. 
Some of these areas are important enough that we should -provide *support*, in two forms: - -- Highlighting the areas and trying to coordinate work within their - communities. For example, in the published form of the roadmap, these "areas - of interest" should be included, with links to ongoing work and points of - coordination. - -- Giving more consideration to feature requests that strongly impact these - areas. For example, numeric computation is a potential domain of interest, and - its one where "const generics" would be a great help—where the lack of that - feature is effectively a blocker. That's not to say we would necessarily - tackle such features; the primary goals have to come first. - -Here are two major areas we should support this way: - -- **Integration with higher-level languages like JavaScript, Ruby, Python, Java - and C#**. Lack of integration in general can be a blocker for Rust adoption in - a given context, and so work on any of these fronts has the potential to - increase adoption. They came up many times in the internals thread. (However, - as explained above, C++ integration is the clear winner and hence primary - focus.) There's existing work on integration for these languages and many - others; we should raise awareness of this work, try to improve coordination - across the integration projects, and consider features that would aid - integration. - -- **Embedded devices**. Rust is a very natural fit for programming - resource-constrained devices, and there are some - [nascent efforts](https://github.com/rust-embedded/) to better organize work - in this area, as well as a - [thread](https://internals.rust-lang.org/t/roadmap-2017-needs-of-no-std-embedded-developers/4096) - on the current significant problems in the domain. Embedded devices likewise - came up repeatedly in the internals thread. It's also a potentially huge - market. 
At the moment, though, it's far from clear what it will take to - achieve significant production use in the embedded space. It would behoove us - to try to get a clearer picture of this space in 2017. +exploring the frontiers. Some of these areas are important enough that we want +to call them out explicitly, and will expect ongoing progress over the course of +the year. In particular, the subteams are expected to proactively help organize +and/or carry out explorations in these areas, and by the end of the year we +expect to have greater clarity around Rust's story for these areas, putting us +in a position to give more concrete goals in subsequent roadmaps. + +Here are the two proposed Areas of Exploration. + +### Integration with other languages + +Other languages here includes "low-level" cases like C/C++, and "high-level" +cases like JavaScript, Ruby, Python, Java and C#. Rust adoption often depends on +being able to start using it *incrementally*, and language integration is often +a key to doing so -- an intuition substantiated by data from the survey and +commercial outreach. + +Rust's core support for interfacing with C is fairly strong, but wrapping a C +library still involves tedious work mirroring declarations and writing C shims +or other glue code. Moreover, many projects that are ripe for Rust integration +are currently using C++, and interfacing with those effectively requires +maintaining an alternative C wrapper for the C++ APIs. This is a problem both +for Rust code that wants to employ existing libraries and for those who want to +integrate Rust into existing C/C++ codebases. + +For interfacing with "high-level" languages, there is the additional barrier of +working with a runtime system, which often involves integration with a garbage +collector and object system. There are ongoing projects on these fronts, but +it's early days and there are still a lot of open questions. 
+ +Some potential avenues of exploration include: + +- Continuing work on bindgen, with focus on seamless C and eventually C++ + support. This may involve some FFI-related language extensions (like richer + `repr`). +- Other routes for C/C++ integration. +- Continued expansion of existing projects like + [Helix](https://github.com/rustbridge/helix) and + [Neon](https://github.com/dherman/neon), which may require some language + enhancements. +- Continued work on [GC integration hooks](http://manishearth.github.io/blog/2016/08/18/gc-support-in-rust-api-design/) +- Investigation of object system integrations, including DOM and + [GObject](https://internals.rust-lang.org/t/rust-and-gnome-meeting-notes/4339). + +### Usage in resource-constrained environments + +Rust is a natural fit for programming resource-constrained devices, and +there are some [ongoing efforts](https://github.com/rust-embedded/) to better +organize work in this area, as well as a +[thread](https://internals.rust-lang.org/t/roadmap-2017-needs-of-no-std-embedded-developers/4096) +on the current significant problems in the domain. Embedded devices likewise +came up repeatedly in the internals thread. It's also a potentially huge +market. At the moment, though, it's far from clear what it will take to achieve +significant production use in the embedded space. It would behoove us to try to +get a clearer picture of this space in 2017. + +Some potential avenues of exploration include: + +- Continuing work on [rustup](https://github.com/rust-lang-nursery/rustup.rs/), + [xargo](https://github.com/japaric/xargo) and similar tools for easing + embedded development. +- Land ["std-aware Cargo"](https://github.com/rust-lang/rfcs/pull/1133), making + it easier to experiment with ports of the standard library to new platforms. 
+- Work on + [scenarios](https://internals.rust-lang.org/t/fleshing-out-libstd-scenarios/4206) + or other techniques for cutting down `std` in various ways, depending on + platform capabilities. +- Develop a story for failable allocation in `std` (i.e., without aborting when + out of memory). ## Non-goals From 8b9e3ad202b9d74c49d8cc772dd9e6bef4d3725a Mon Sep 17 00:00:00 2001 From: "changchun.fan" Date: Fri, 16 Dec 2016 14:14:34 +0800 Subject: [PATCH 1160/1195] Fix typo Fix typo --- text/1566-proc-macros.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/1566-proc-macros.md b/text/1566-proc-macros.md index ff4ebd14d75..f1942e5d2be 100644 --- a/text/1566-proc-macros.md +++ b/text/1566-proc-macros.md @@ -85,7 +85,7 @@ The value returned replaces the macro use. Attribute-like: ``` -#[prco_macro_attribute] +#[proc_macro_attribute] pub fn foo(Option, TokenStream) -> TokenStream; ``` From 3e925a53df017ffa338ee2717112f2b5d9bf4024 Mon Sep 17 00:00:00 2001 From: Amanieu d'Antras Date: Thu, 22 Dec 2016 04:14:24 +0000 Subject: [PATCH 1161/1195] Update motivation --- text/0000-movecell.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/text/0000-movecell.md b/text/0000-movecell.md index 41682f0fba6..e1644fdde97 100644 --- a/text/0000-movecell.md +++ b/text/0000-movecell.md @@ -13,6 +13,10 @@ Extend `Cell` to work with non-`Copy` types. It allows safe inner-mutability of non-`Copy` types without the overhead of `RefCell`'s reference counting. +The key idea of `Cell` is to provide a primitive building block to safely support inner mutability. This must be done while maintaining Rust's aliasing requirements for mutable references. Unlike `RefCell` which enforces this at runtime through reference counting, `Cell` does this statically by disallowing any reference (mutable or immutable) to the data contained in the cell. 
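As a rough illustration of the paragraph above, the sketch below moves a non-`Copy` value (a `String`) in and out of a `Cell` without ever taking a reference to its contents. It assumes the `replace`, `take`, and `into_inner` methods found on `std::cell::Cell` in present-day Rust; this excerpt of the RFC does not name them, so treat the method choice as an assumption. The helper `cell_demo` is hypothetical.

```rust
use std::cell::Cell;

// Hypothetical helper: returns the values observed at each step.
fn cell_demo() -> (String, String, String) {
    // `String` is not `Copy`, so `Cell::get` is not available for it.
    let slot: Cell<String> = Cell::new(String::from("first"));

    // `replace` moves a new value in and hands the old one back by value.
    let previous = slot.replace(String::from("second"));

    // `take` moves the value out and leaves `String::default()` behind.
    let current = slot.take();

    // `into_inner` consumes the cell and returns whatever is left.
    let leftover = slot.into_inner();

    (previous, current, leftover)
}

fn main() {
    let (previous, current, leftover) = cell_demo();
    println!("previous = {:?}", previous);
    println!("current  = {:?}", current);
    println!("leftover = {:?}", leftover);
}
```

Every operation transfers ownership of the whole value, so no reference into the cell ever exists, which is the property the text relies on.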
+ +While the current implementation only supports `Copy` types, this restriction isn't actually necessary to maintain Rust's aliasing invariants. The only affected API is the `get` function which, by design, is only usable with `Copy` types. + # Detailed design [design]: #detailed-design From b616749961b61a14b3a65ee2e79618dcd00a13dc Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Sat, 24 Dec 2016 14:27:15 -0500 Subject: [PATCH 1162/1195] The Rust Bookshelf --- text/0000-rust-bookshelf.md | 104 ++++++++++++++++++++++++++++++++++++ 1 file changed, 104 insertions(+) create mode 100644 text/0000-rust-bookshelf.md diff --git a/text/0000-rust-bookshelf.md b/text/0000-rust-bookshelf.md new file mode 100644 index 00000000000..2631d4c7196 --- /dev/null +++ b/text/0000-rust-bookshelf.md @@ -0,0 +1,104 @@ +- Feature Name: N/A +- Start Date: 2016-12-25 +- RFC PR: +- Rust Issue: + +# Summary +[summary]: #summary + +Create a "Rust Bookshelf" of learning resources for Rust. + +* Pull the book out of tree into `rust-lang/book`, which holds the second + edition, currently. +* Pull the nomicon and the reference out of tree and convert them to mdBook. +* Pull the cargo docs out of tree and convert them to mdBook. +* Create a new "Nightly Book" in-tree. +* Provide a path forward for more long-form documentation to be maintained by + the project. + +# Motivation +[motivation]: #motivation + +There are a few independent motivations for this RFC. + +* Separate repos for separate projects. +* Consistency between long-form docs. +* A clear place for unstable documentation, which is now needed for + stabilization. +* Better promoting good resources like the 'nomicon, which may not be as well + known as "the book" is. + +These will be discussed further in the detailed design. 
+ +# Detailed design +[design]: #detailed-design + +Several new repositories will be made, one for each of: + +* The Rustinomicon ("the 'nomicon") +* The Cargo Book +* The Rust Reference Manual + +They will all use mdBook to build. They will have their existing text re-worked +into the format; at first a simple conversion, then more major improvements. +Their currnet text will be removed from the main tree. + +The first edition of the book lives in-tree, but the second edition lives in +`rust-lang/book`. We'll remove the existing text from the tree and move it +into `rust-lang/book`. + +A new book will be created from the "Nightly Rust" section of the book. It +will be called "The Nightly Book," and will contain unstable documentation. +This came up when [trying to document RFC +1623](https://github.com/rust-lang/rust/pull/37928). We don't have a unified +way of handling unstable documentation. This will give it a place to develop, +and part of the stabilization process will be moving documentation from this +book into the other parts of the documentation. + +The nightly book will be organized around `#![feature]`s, so that you can look +up the documentation for each feature, as well as seeing which features +currently exist. + +The landing page on doc.rust-lang.org will show off the full bookshelf, to let +people find the documenation they need. It will also link to their respective +repositories. + +Finally, this creates a path for more books in the future: "the FFI Book" would +be one example of a possibility for this kind of thing. The docs team will +develop critera for accepting a book as part of the official project. + +# How We Teach This +[how-we-teach-this]: #how-we-teach-this + +The landing page on doc.rust-lang.org will show off the full bookshelf, to let +people find the documenation they need. It will also link to their respective +repositories. 
+ +# Drawbacks +[drawbacks]: #drawbacks + +A ton of smaller repos can make it harder to find what goes where. + +Removing work from `rust-lang/rust` means people aren't credited in release +notes any more. I will be opening a separate RFC to address this issue, it's +also an issue without this RFC being accepted. + +Operations are harder, but they have to change to support this use-case for +other reasons, so this does not add any extra burden. + +# Alternatives +[alternatives]: #alternatives + +Do nothing. + +Do only one part of this, instead of the whole thing. + +# Unresolved questions +[unresolved]: #unresolved-questions + +How should the first and second editions of the book live in the same +repository? + +What criteria should we use to accept new books? + +Should we adopt "learning Rust with too many Linked Lists"? From 6afaf52506115ece5dc73ea3ab2ddf458703a463 Mon Sep 17 00:00:00 2001 From: Mike Date: Tue, 27 Dec 2016 09:12:51 +0900 Subject: [PATCH 1163/1195] Trivial - RFC 0195 Associated Items - remove text copied from RFC template Removes some text that appears to be copied from the RFC template that should not have made it in the final RFC text. 
--- text/0195-associated-items.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0195-associated-items.md b/text/0195-associated-items.md index bd60fde79ec..4540b3a3904 100644 --- a/text/0195-associated-items.md +++ b/text/0195-associated-items.md @@ -1,4 +1,4 @@ -- Start Date: (fill me in with today's date, 2014-08-04) +- Start Date: 2014-08-04 - RFC PR #: [rust-lang/rfcs#195](https://github.com/rust-lang/rfcs/pull/195) - Rust Issue #: [rust-lang/rust#17307](https://github.com/rust-lang/rust/issues/17307) From cb422d76458de424debfed4b90caa97ea814e050 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Fri, 30 Dec 2016 11:37:45 -0500 Subject: [PATCH 1164/1195] Update 1728 text with PR # and Issue description --- text/1728-north-star.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/1728-north-star.md b/text/1728-north-star.md index efde7dbba3e..8b815c11e7b 100644 --- a/text/1728-north-star.md +++ b/text/1728-north-star.md @@ -1,7 +1,7 @@ - Feature Name: north_star - Start Date: 2016-08-07 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: #1728 +- Rust Issue: N/A # Summary [summary]: #summary From d897ccbb8dd1a4c4f1bfae776a78a035e877b124 Mon Sep 17 00:00:00 2001 From: Chris Krycho Date: Fri, 30 Dec 2016 17:18:08 -0500 Subject: [PATCH 1165/1195] Remove last core-team reference from 1636 I thought I'd caught them all, but apparently NOPE. 
--- text/1636-document_all_features.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/1636-document_all_features.md b/text/1636-document_all_features.md index a5a89f761ba..75336200056 100644 --- a/text/1636-document_all_features.md +++ b/text/1636-document_all_features.md @@ -198,7 +198,7 @@ To be most effective, this will involve some changes both at a process and core- [RFCs README]: https://github.com/rust-lang/rfcs/blob/master/README.md [What the process is]: https://github.com/rust-lang/rfcs/blob/master/README.md#what-the-process-is -This is also an opportunity to allow/enable non-core-team members with less experience to contribute more actively to _The Rust Programming Language_, _Rust by Example_, and the Rust Reference. +This is also an opportunity to allow/enable community members with less experience to contribute more actively to _The Rust Programming Language_, _Rust by Example_, and the Rust Reference. 1. We should write issues for feature documentation, and may flag them as approachable entry points for new users. From c59aa270f02f37f908e55b664c82321efc7fbfbe Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Thu, 5 Jan 2017 09:37:29 -0800 Subject: [PATCH 1166/1195] Add cookbooks, examples and patterns --- text/0000-roadmap-2017.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/text/0000-roadmap-2017.md b/text/0000-roadmap-2017.md index f12aa6759f6..e39a835404e 100644 --- a/text/0000-roadmap-2017.md +++ b/text/0000-roadmap-2017.md @@ -212,6 +212,14 @@ Here are some strategies we might take to lower the learning curve: in the works. The effort is laser-focused on the key areas that trip people up today (ownership, modules, strings, errors). +- **Gathering cookbooks, examples, and patterns**. One way to quickly get + productive in a language is to work from a large set of examples and + known-good patterns that can guide your early work. 
As a community, we could + push crates to include more substantial example code snippets, and organize + efforts around design patterns and cookbooks. (See + [the commentary on the RFC thread](https://github.com/rust-lang/rfcs/pull/1774#issuecomment-269359228) + for much more detail.) + - **Improved errors**. We've already made some [big strides](https://blog.rust-lang.org/2016/08/10/Shape-of-errors-to-come.html) here, particularly for ownership-related errors, but there's surely more room From 9963385ae13f5bf7781d881047e43230e03869f7 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Thu, 5 Jan 2017 09:43:47 -0800 Subject: [PATCH 1167/1195] RFC 1774 is Roadmap for 2017 --- text/{0000-roadmap-2017.md => 1774-roadmap-2017.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-roadmap-2017.md => 1774-roadmap-2017.md} (99%) diff --git a/text/0000-roadmap-2017.md b/text/1774-roadmap-2017.md similarity index 99% rename from text/0000-roadmap-2017.md rename to text/1774-roadmap-2017.md index e39a835404e..1b476365f73 100644 --- a/text/0000-roadmap-2017.md +++ b/text/1774-roadmap-2017.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2016-10-04 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1774 +- Rust Issue: N/A # Summary [summary]: #summary From 8873a1a16a3c08f82d1df4ed962c377441782755 Mon Sep 17 00:00:00 2001 From: Dale Wijnand Date: Fri, 6 Jan 2017 00:50:16 +0000 Subject: [PATCH 1168/1195] Fix links & reveal hidden information --- text/1774-roadmap-2017.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/1774-roadmap-2017.md b/text/1774-roadmap-2017.md index 1b476365f73..ae260c2090f 100644 --- a/text/1774-roadmap-2017.md +++ b/text/1774-roadmap-2017.md @@ -242,7 +242,7 @@ Here are some strategies we might take to lower the learning curve: The edit-compile-debug cycle in Rust takes too long, and it's one of the complaints we hear most often from production 
users. We've laid down a good -foundation with [MIR] (now turned on by default) and [incremental compilation] +foundation with [MIR][] (now turned on by default) and [incremental compilation][] (which recently hit alpha). But we need to continue pushing hard to actually deliver the improvements. And to fully address the problem, **the improvement needs to apply to large Rust projects, not just small or mid-sized benchmarks**. From 91bd0510814003dfc9116c26061650a1786ef6a9 Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Sat, 7 Jan 2017 15:05:25 -0500 Subject: [PATCH 1169/1195] update with details --- text/0000-rust-bookshelf.md | 24 ++++++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-) diff --git a/text/0000-rust-bookshelf.md b/text/0000-rust-bookshelf.md index 2631d4c7196..884dc499ee3 100644 --- a/text/0000-rust-bookshelf.md +++ b/text/0000-rust-bookshelf.md @@ -16,6 +16,10 @@ Create a "Rust Bookshelf" of learning resources for Rust. * Provide a path forward for more long-form documentation to be maintained by the project. +This is largely about how doc.rust-lang.org is organized; today, it points to +the book, the reference, the nomicon, the error index, and the standard library +docs. This suggests unifying the first three into one thing. + # Motivation [motivation]: #motivation @@ -39,17 +43,20 @@ Several new repositories will be made, one for each of: * The Cargo Book * The Rust Reference Manual +These would live under the `rust-lang` organization. + They will all use mdBook to build. They will have their existing text re-worked into the format; at first a simple conversion, then more major improvements. -Their currnet text will be removed from the main tree. +Their current text will be removed from the main tree. The first edition of the book lives in-tree, but the second edition lives in `rust-lang/book`. We'll remove the existing text from the tree and move it into `rust-lang/book`. 
-A new book will be created from the "Nightly Rust" section of the book. It -will be called "The Nightly Book," and will contain unstable documentation. -This came up when [trying to document RFC +A new book will be created from the "Nightly Rust" section of the book. It will +be called "The Nightly Book," and will contain unstable documentation for both +rustc and Cargo, as well as material that will end up in the reference. This +came up when [trying to document RFC 1623](https://github.com/rust-lang/rust/pull/37928). We don't have a unified way of handling unstable documentation. This will give it a place to develop, and part of the stabilization process will be moving documentation from this @@ -59,6 +66,12 @@ The nightly book will be organized around `#![feature]`s, so that you can look up the documentation for each feature, as well as seeing which features currently exist. +The nightly book is in-tree so that it runs more often, as part of people's +normal test suite. This doesn't mean that the book won't run on every commit; +just that the out-of-tree books will run mostly in CI, whereas the nightly +book will run when developers do `x.py check`. This is similar to how, today, +Traivs runs a subset of the tests, but buildbot runs all of them. + The landing page on doc.rust-lang.org will show off the full bookshelf, to let people find the documenation they need. It will also link to their respective repositories. @@ -93,6 +106,9 @@ Do nothing. Do only one part of this, instead of the whole thing. +Move all of the "bookshelf" into one repository, rather than individual ones. +This would require a lot more label-wrangling, but might be easier. 
+ # Unresolved questions [unresolved]: #unresolved-questions From d51becd78f00e89e7fef6c21cd5a61075e2026ca Mon Sep 17 00:00:00 2001 From: archshift Date: Sat, 7 Jan 2017 17:48:55 -0800 Subject: [PATCH 1170/1195] Include more "fn literal" alternative details, clear up language --- text/0000-closure-to-fn-coercion.md | 82 ++++++++++++++++++----------- 1 file changed, 51 insertions(+), 31 deletions(-) diff --git a/text/0000-closure-to-fn-coercion.md b/text/0000-closure-to-fn-coercion.md index b2c18a55210..bec4c0accdc 100644 --- a/text/0000-closure-to-fn-coercion.md +++ b/text/0000-closure-to-fn-coercion.md @@ -20,8 +20,8 @@ closure with a certain type signature. It is not possible to define a function while at the same time binding it to a function pointer. -This is mainly used for convenience purposes, but in certain situations -the lack of ability to do so creates a significant amount of boilerplate code. +This is, admittedly, a convenience-motivated feature, but in certain situations +the inability to bind code this way creates a significant amount of boilerplate. For example, when attempting to create an array of small, simple, but unique functions, it would be necessary to pre-define each and every function beforehand: @@ -40,10 +40,10 @@ const foo: [fn(&mut u32); 4] = [ ``` This is a trivial example, and one that might not seem too consequential, but the -code doubles with every new item added to the array. With very many elements, +code doubles with every new item added to the array. With a large amount of elements, the duplication begins to seem unwarranted. -Another option, of course, is to use an array of `Fn` instead of `fn`: +A solution, of course, is to use an array of `Fn` instead of `fn`: ```rust const foo: [&'static Fn(&mut u32); 4] = [ @@ -54,19 +54,28 @@ const foo: [&'static Fn(&mut u32); 4] = [ ]; ``` -And this seems to fix the problem. 
Unfortunately, however, looking closely one -can see that because we use the `Fn` trait, an extra layer of indirection -is added when attempting to run `foo[n](&mut bar)`. +And this seems to fix the problem. Unfortunately, however, because we use +a reference to the `Fn` trait, an extra layer of indirection is added when +attempting to run `foo[n](&mut bar)`. -Rust must use dynamic dispatch because a closure is secretly a struct that -contains references to captured variables, and the code within that closure -must be able to access those references stored in the struct. +Rust must use dynamic dispatch in this situation; a closure with captures is nothing +but a struct containing references to captured variables. The code associated with a +closure must be able to access those references stored in the struct. -In the above example, though, no variables are captured by the closures, -so in theory nothing would stop the compiler from treating them as anonymous -functions. By doing so, unnecessary indirection would be avoided. In situations -where this function pointer array is particularly hot code, the optimization -would be appreciated. +In situations where this function pointer array is particularly hot code, +any optimizations would be appreciated. More generally, it is always preferable +to avoid unnecessary indirection. And, of course, it is impossible to use this syntax +when dealing with FFI. + +Aside from code-size nits, anonymous functions are legitimately useful for programmers. +In the case of callback-heavy code, for example, it can be impractical to define functions +out-of-line, with the requirement of producing confusing (and unnecessary) names for each. +In the very first example given, `inc_X` names were used for the out-of-line functions, but +more complicated behavior might not be so easily representable. + +Finally, this sort of automatic coercion is simply intuitive to the programmer. 
+In the `&Fn` example, no variables are captured by the closures, so the theory is +that nothing stops the compiler from treating them as anonymous functions. # Detailed design [design]: #detailed-design @@ -107,17 +116,15 @@ const foo: [fn(&mut u32); 4] = [ ]; ``` -Note that once explicitly assigned to an `Fn` trait, the closure can no longer be -coerced into `fn`, even if it has no captures. Just as we cannot do: +Because there does not exist any item in the language that directly produces +a `fn` type, even `fn` items must go through the process of reification. To +perform the coercion, then, rustc must additionally allow the reification of +unsized closures to `fn` types. The implementation of this is simplified by the +fact that closures' capture information is recorded on the type-level. -```rust -let a: u32 = 0; // Coercion -let b: i32 = a; // Can't re-coerce -let x: *const u32 = &a; // Coercion -let y: &u32 = x; // Can't re-coerce -``` +*Note:* once explicitly assigned to an `Fn` trait, the closure can no longer be +coerced into `fn`, even if it has no captures. -We can't similarly re-coerce a `Fn` trait. ```rust let a: &Fn(u32) -> u32 = |foo: u32| { foo + 1 }; let b: fn(u32) -> u32 = *a; // Can't re-coerce @@ -127,7 +134,7 @@ let b: fn(u32) -> u32 = *a; // Can't re-coerce [drawbacks]: #drawbacks This proposal could potentially allow Rust users to accidentally constrain their APIs. -In the case of a crate, a user accidentally returning `fn` instead of `Fn` may find +In the case of a crate, a user returning `fn` instead of `Fn` may find that their code compiles at first, but breaks when the user later needs to capture variables: ```rust @@ -158,13 +165,13 @@ fn func_general<'a>(&'a mut self) -> impl FnMut() -> u32 { } ``` -This drawback is probably outweighed by convenience, simplicity, and the potential for optimization -that comes with the proposed changes, however. 
+This aspect is probably outweighed by convenience, simplicity, and the potential for optimization +that comes with the proposed changes. # Alternatives [alternatives]: #alternatives -## Anonymous function syntax +## Function literal syntax With this alternative, Rust users would be able to directly bind a function to a variable, without needing to give the function a name. @@ -184,12 +191,24 @@ const foo: [fn(&mut u32); 4] = [ ``` This isn't ideal, however, because it would require giving new semantics -to `fn` syntax. +to `fn` syntax. Additionally, such syntax would either require explicit return types, +or additional reasoning about the literal's return type. + +```rust +fn(x: bool) { !x } +``` + +The above function literal, at first glance, appears to return `()`. This could be +potentially misleading, especially in situations where the literal is bound to a +variable with `let`. + +As with all new syntax, this alternative would carry with it a discovery barrier. +Closure coercion may be preferred due to its intuitiveness. ## Aggressive optimization This is possibly unrealistic, but an alternative would be to continue encouraging -the use of closures with the `Fn` trait, but conduct heavy optimization to determine +the use of closures with the `Fn` trait, but use static analysis to determine when the used closure is "trivial" and does not need indirection. Of course, this would probably significantly complicate the optimization process, and @@ -199,4 +218,5 @@ checking the disassembly of their program. # Unresolved questions [unresolved]: #unresolved-questions -None +Should we generalize this behavior in the future, so that any zero-sized type that +implements `Fn` can be converted into a `fn` pointer? From a0e42551d0bf83de1303902404a69648473a9160 Mon Sep 17 00:00:00 2001 From: Andrew Browne Date: Sat, 14 Jan 2017 01:31:00 +1000 Subject: [PATCH 1171/1195] Fix typos. 
--- text/1211-mir.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/1211-mir.md b/text/1211-mir.md index 38e2e4ade11..e078cac469b 100644 --- a/text/1211-mir.md +++ b/text/1211-mir.md @@ -495,7 +495,7 @@ this, we desugar an array reference like `y = arr[x]` as follows: } B1: { - x = arr[idx] + y = arr[idx] ... } @@ -519,7 +519,7 @@ intrinsics. These operators yield a tuple of (result, overflow), so B0: { tmp = left + right; - if(tmp.1, B1, B2) + if(tmp.1, B2, B1) } B1: { From 849ca2f76982356661c9d99776b6c816bb92de2e Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Mon, 23 Jan 2017 15:22:22 -0800 Subject: [PATCH 1172/1195] RFC 1651 is Extend Cell to non-Copy types --- text/{0000-movecell.md => 1651-movecell.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-movecell.md => 1651-movecell.md} (95%) diff --git a/text/0000-movecell.md b/text/1651-movecell.md similarity index 95% rename from text/0000-movecell.md rename to text/1651-movecell.md index e1644fdde97..ec0bc3360d2 100644 --- a/text/0000-movecell.md +++ b/text/1651-movecell.md @@ -1,7 +1,7 @@ - Feature Name: move_cell - Start Date: 2016-06-15 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1651 +- Rust Issue: https://github.com/rust-lang/rust/issues/39264 # Summary [summary]: #summary From 68854f473a6559834664d7e76b8398c499778896 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Tue, 19 Apr 2016 17:28:27 +1200 Subject: [PATCH 1173/1195] Macros by example 2.0 (macro!) Macros by example 2.0. A replacement for `macro_rules!`. This is mostly a placeholder RFC since many of the issues affecting the new macro system are (or will be) addressed in other RFCs. This RFC may be expanded at a later date. 
--- text/0000-macros.md | 141 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 141 insertions(+) create mode 100644 text/0000-macros.md diff --git a/text/0000-macros.md b/text/0000-macros.md new file mode 100644 index 00000000000..7d70981f60f --- /dev/null +++ b/text/0000-macros.md @@ -0,0 +1,141 @@ +- Feature Name: macro +- Start Date: 2016-04-17 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Macros by example 2.0. A replacement for `macro_rules!`. This is mostly a +placeholder RFC since many of the issues affecting the new macro system are +(or will be) addressed in other RFCs. This RFC may be expanded at a later date. + +Currently in this RFC: + +* That we should have a new macro by example system, +* a new keyword for declaring macros. + +In other RFCs: + +* Naming and modularisation (#1561). + +May be added to this RFC later (or might be separate RFCs): + +* more detailed syntax proposal, +* hygiene improvements. + +Note this RFC does not involve procedural macros (aka syntax extensions). + + +# Motivation +[motivation]: #motivation + +There are several changes to the macro by example system which are desirable but +backwards compatible (See [RFC 1561](https://github.com/rust-lang/rfcs/pull/1561) +for some changes to macro naming and modularisation, I would also like to +propose improvements to hygiene in macros, and some improved syntax). + +In order to maintain Rust's backwards compatibility guarantees, we cannot change +the existing system (`macro_rules!`) to accommodate these changes. I therefore +propose a new macro by example system to live alongside `macro_rules!`. + +Example (possible) improvements: + +```rust +// Naming (RFC 1561) + +fn main() { + a::foo!(...); +} + +mod a { + // Macro privacy (TBA) + pub macro! foo { ... } +} +``` + +```rust +// Relative paths (part of hygiene reform, TBA) + +mod a { + pub macro! foo { ... bar() ... } + fn bar() { ... 
} +} + +fn main() { + a::foo!(...); // Expansion calls a::bar +} +``` + +```rust +// Syntax (TBA) + +macro! foo($a: ident) => { + return $a + 1; +} +``` + +I believe it is extremely important that moving to the new macro system is as +straightforward as possible for both macro users and authors. This must be the +case so that users make the transition to the new system and we are not left +with two systems forever. + +A goal of this design is that for macro users, there is no difference in using +the two systems other than how macros are named. For macro authors, most macros +that work in the old system should work in the new system with minimal changes. +Macros which will need some adjustment are those that exploit holes in the +current hygiene system. + + +# Detailed design +[design]: #detailed-design + +There will be a new system of macros by example using similar syntax and +semantics to the current `macro_rules!` system. + +A macro by example is declared using the `macro` keyword with the `!` +operator. For example, where a macro `foo` is declared today as `macro_rules! +foo { ... }`, it will be declared using `macro! foo { ... }`. I leave the syntax +of the macro body for later specification. + + +# Drawbacks +[drawbacks]: #drawbacks + +There is a risk that `macro_rules!` is good enough for most users and there is +low adoption of the new system. Possibly worse would be that there is high +adoption but little migration from the old system, leading to us having to +support two systems forever. + + +# Alternatives +[alternatives]: #alternatives + +Make backwards incompatible changes to `macro_rules!`. This is probably a +non-starter due to our stability guarantees. We might be able to make something +work if this was considered desirable. + +Limit ourselves to backwards compatible changes to `macro_rules!`. I don't think +this is worthwhile. It's not clear we can make meaningful improvements without +breaking backwards compatibility. 
+ +Don't use a keyword - either make `macro` not a keyword or use a different word +for the macros by example syntax. + +Use `macro` instead of `macro!` (we might want to use bare `macro` for +procedural macros, not clear if the overlap will be a problem). + +Live with the existing system. + + +# Unresolved questions +[unresolved]: #unresolved-questions + +What to do with `macro_rules`? We will need to maintain it at least until `macro!` +is stable. Hopefully, we can then deprecate it (some time will be required to +migrate users to the new system). Eventually, I hope we can remove `macro_rules!`. +That will take a long time, and would require a 2.0 version of Rust to strictly +adhere to our stability guarantees. + +There are many questions still to be answered as this RFC and some sister RFCs +are developed. From f6f5c41639be75997785c67fe71969e09ec4f28e Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Fri, 11 Nov 2016 16:45:22 +1300 Subject: [PATCH 1174/1195] Update the RFC Only major change is moving from `macro!` to `macro` to declare a macro. --- text/0000-macros.md | 31 ++++++++++++++----------------- 1 file changed, 14 insertions(+), 17 deletions(-) diff --git a/text/0000-macros.md b/text/0000-macros.md index 7d70981f60f..0786a3d430f 100644 --- a/text/0000-macros.md +++ b/text/0000-macros.md @@ -1,4 +1,4 @@ -- Feature Name: macro +- Feature Name: macro_2_0 - Start Date: 2016-04-17 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) @@ -19,10 +19,11 @@ In other RFCs: * Naming and modularisation (#1561). -May be added to this RFC later (or might be separate RFCs): +To come in separate RFCs: * more detailed syntax proposal, -* hygiene improvements. +* hygiene improvements, +* more ... Note this RFC does not involve procedural macros (aka syntax extensions). @@ -50,7 +51,7 @@ fn main() { mod a { // Macro privacy (TBA) - pub macro! foo { ... } + pub macro foo { ... 
} } ``` @@ -58,7 +59,7 @@ mod a { // Relative paths (part of hygiene reform, TBA) mod a { - pub macro! foo { ... bar() ... } + pub macro foo { ... bar() ... } fn bar() { ... } } @@ -70,7 +71,7 @@ fn main() { ```rust // Syntax (TBA) -macro! foo($a: ident) => { +macro foo($a: ident) => { return $a + 1; } ``` @@ -93,10 +94,10 @@ current hygiene system. There will be a new system of macros by example using similar syntax and semantics to the current `macro_rules!` system. -A macro by example is declared using the `macro` keyword with the `!` -operator. For example, where a macro `foo` is declared today as `macro_rules! -foo { ... }`, it will be declared using `macro! foo { ... }`. I leave the syntax -of the macro body for later specification. +A macro by example is declared using the `macro` keyword. For example, where a +macro `foo` is declared today as `macro_rules! foo { ... }`, it will be declared +using `macro foo { ... }`. I leave the syntax of the macro body for later +specification. # Drawbacks @@ -119,23 +120,19 @@ Limit ourselves to backwards compatible changes to `macro_rules!`. I don't think this is worthwhile. It's not clear we can make meaningful improvements without breaking backwards compatibility. +Use `macro!` instead of `macro` (proposed in an earlier version of this RFC). + Don't use a keyword - either make `macro` not a keyword or use a different word for the macros by example syntax. -Use `macro` instead of `macro!` (we might want to use bare `macro` for -procedural macros, not clear if the overlap will be a problem). - Live with the existing system. # Unresolved questions [unresolved]: #unresolved-questions -What to do with `macro_rules`? We will need to maintain it at least until `macro!` +What to do with `macro_rules`? We will need to maintain it at least until `macro` is stable. Hopefully, we can then deprecate it (some time will be required to migrate users to the new system). Eventually, I hope we can remove `macro_rules!`. 
That will take a long time, and would require a 2.0 version of Rust to strictly adhere to our stability guarantees. - -There are many questions still to be answered as this RFC and some sister RFCs -are developed. From 7dcb7374aee3281c261510ca5af53399a3df60f5 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Tue, 31 Jan 2017 09:27:52 +1300 Subject: [PATCH 1175/1195] Use 'declarative macro' and add note on nomenclature --- text/0000-macros.md | 28 +++++++++++++++++++--------- 1 file changed, 19 insertions(+), 9 deletions(-) diff --git a/text/0000-macros.md b/text/0000-macros.md index 0786a3d430f..cf2a66deed5 100644 --- a/text/0000-macros.md +++ b/text/0000-macros.md @@ -6,14 +6,14 @@ # Summary [summary]: #summary -Macros by example 2.0. A replacement for `macro_rules!`. This is mostly a +Decalrative macros 2.0. A replacement for `macro_rules!`. This is mostly a placeholder RFC since many of the issues affecting the new macro system are (or will be) addressed in other RFCs. This RFC may be expanded at a later date. Currently in this RFC: -* That we should have a new macro by example system, -* a new keyword for declaring macros. +* That we should have a new declarative macro system, +* a new keyword for declaring macros (`macro`). In other RFCs: @@ -31,14 +31,14 @@ Note this RFC does not involve procedural macros (aka syntax extensions). # Motivation [motivation]: #motivation -There are several changes to the macro by example system which are desirable but -backwards compatible (See [RFC 1561](https://github.com/rust-lang/rfcs/pull/1561) +There are several changes to the declarative macro system which are desirable but +not backwards compatible (See [RFC 1561](https://github.com/rust-lang/rfcs/pull/1561) for some changes to macro naming and modularisation, I would also like to propose improvements to hygiene in macros, and some improved syntax). 
In order to maintain Rust's backwards compatibility guarantees, we cannot change the existing system (`macro_rules!`) to accommodate these changes. I therefore -propose a new macro by example system to live alongside `macro_rules!`. +propose a new declarative macro system to live alongside `macro_rules!`. Example (possible) improvements: @@ -91,14 +91,24 @@ current hygiene system. # Detailed design [design]: #detailed-design -There will be a new system of macros by example using similar syntax and +There will be a new system of declarative macros using similar syntax and semantics to the current `macro_rules!` system. -A macro by example is declared using the `macro` keyword. For example, where a +A declarative macro is declared using the `macro` keyword. For example, where a macro `foo` is declared today as `macro_rules! foo { ... }`, it will be declared using `macro foo { ... }`. I leave the syntax of the macro body for later specification. +## Nomencalture + +Throughout this RFC, I use 'declarative macro' to refer to a macro declared +using declarative (and domain specific) syntax (such as the current +`macro_rules!` syntax). The 'declarative macros' name is in opposition to +'procedural macros', which are declared as Rust programs. The specific +declarative syntax using pattern matching and templating is often referred to as +'macros by example'. + +'Pattern macro' has been suggested as an alterantive for 'declarative macro'. # Drawbacks [drawbacks]: #drawbacks @@ -123,7 +133,7 @@ breaking backwards compatibility. Use `macro!` instead of `macro` (proposed in an earlier version of this RFC). Don't use a keyword - either make `macro` not a keyword or use a different word -for the macros by example syntax. +for declarative macros. Live with the existing system. 
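Illustrative note on the nomenclature introduced in the patch above: a minimal sketch of a declarative ("by example") macro, written with the existing `macro_rules!` system because the RFC leaves the body syntax of the proposed `macro` keyword for later specification. The names `add_one` and `bump` are hypothetical and do not come from the RFC; the macro loosely mirrors the `$a + 1` body used in the RFC's "Syntax (TBA)" example.

```rust
// A declarative ("by example") macro: each arm pairs a pattern with a
// template, and the compiler substitutes the matched fragments at expansion time.
macro_rules! add_one {
    // `$a:expr` matches any expression and binds it to `$a`.
    ($a:expr) => {
        $a + 1
    };
}

// Hypothetical helper, present only so the expansion can be called and checked.
fn bump(x: u32) -> u32 {
    add_one!(x)
}
```

Under the RFC as merged, the same macro would presumably be declared with `macro add_one ...` instead of `macro_rules! add_one`, but that form is left unspecified here. A procedural macro, by contrast, would be an ordinary Rust program operating on the token stream.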
From 4434bb0cc3a5e163788896b0c1ef2458477caa31 Mon Sep 17 00:00:00 2001 From: Nick Cameron Date: Tue, 31 Jan 2017 09:34:06 +1300 Subject: [PATCH 1176/1195] Merge declarative macros 2.0 --- text/{0000-macros.md => 1584-macros.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-macros.md => 1584-macros.md} (97%) diff --git a/text/0000-macros.md b/text/1584-macros.md similarity index 97% rename from text/0000-macros.md rename to text/1584-macros.md index cf2a66deed5..6f03de24289 100644 --- a/text/0000-macros.md +++ b/text/1584-macros.md @@ -1,7 +1,7 @@ - Feature Name: macro_2_0 - Start Date: 2016-04-17 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [1584](https://github.com/rust-lang/rfcs/pull/1584) +- Rust Issue: [39412](https://github.com/rust-lang/rust/issues/39412) # Summary [summary]: #summary From 6f43d81b6b31a827d2e308da2aa912d1c631ec34 Mon Sep 17 00:00:00 2001 From: Jonas Schievink Date: Mon, 30 Jan 2017 21:38:19 +0100 Subject: [PATCH 1177/1195] Fix typo (Decalrative -> Declarative) --- text/1584-macros.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/1584-macros.md b/text/1584-macros.md index 6f03de24289..8e4276ef25b 100644 --- a/text/1584-macros.md +++ b/text/1584-macros.md @@ -6,7 +6,7 @@ # Summary [summary]: #summary -Decalrative macros 2.0. A replacement for `macro_rules!`. This is mostly a +Declarative macros 2.0. A replacement for `macro_rules!`. This is mostly a placeholder RFC since many of the issues affecting the new macro system are (or will be) addressed in other RFCs. This RFC may be expanded at a later date. From 9a72f3dc6a44119651ed9088987ea90a6ac31149 Mon Sep 17 00:00:00 2001 From: Henning Kowalk Date: Tue, 31 Jan 2017 16:05:04 +0100 Subject: [PATCH 1178/1195] Corrected some spelling errors. 
L102: Nomencalture -> Nomenclature L111: alterantive -> alternative --- text/1584-macros.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/1584-macros.md b/text/1584-macros.md index 8e4276ef25b..a7e72a23477 100644 --- a/text/1584-macros.md +++ b/text/1584-macros.md @@ -99,7 +99,7 @@ macro `foo` is declared today as `macro_rules! foo { ... }`, it will be declared using `macro foo { ... }`. I leave the syntax of the macro body for later specification. -## Nomencalture +## Nomenclature Throughout this RFC, I use 'declarative macro' to refer to a macro declared using declarative (and domain specific) syntax (such as the current @@ -108,7 +108,7 @@ using declarative (and domain specific) syntax (such as the current declarative syntax using pattern matching and templating is often referred to as 'macros by example'. -'Pattern macro' has been suggested as an alterantive for 'declarative macro'. +'Pattern macro' has been suggested as an alternative for 'declarative macro'. # Drawbacks [drawbacks]: #drawbacks From 569abdf4876c0805064636e7041be8c41caacbd1 Mon Sep 17 00:00:00 2001 From: archshift Date: Wed, 1 Feb 2017 08:10:55 -0800 Subject: [PATCH 1179/1195] Update coercion definition in summary --- text/0000-closure-to-fn-coercion.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-closure-to-fn-coercion.md b/text/0000-closure-to-fn-coercion.md index bec4c0accdc..5e6a9bb5f58 100644 --- a/text/0000-closure-to-fn-coercion.md +++ b/text/0000-closure-to-fn-coercion.md @@ -6,8 +6,8 @@ # Summary [summary]: #summary -A non-capturing (that is, does not `Clone` or `move` any local variables) closure -should be coercable to a function pointer (`fn`). +A closure that does not move, borrow, or otherwise access (capture) local +variables should be coercable to a function pointer (`fn`). 
# Motivation [motivation]: #motivation From d43f4cabb3c3607b5dad6d9dbc7ba3758a3ebc3a Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Mon, 6 Feb 2017 13:14:46 -0500 Subject: [PATCH 1180/1195] RFC 1828 is "Rust Bookshelf" Closes #1828 --- text/0000-rust-contributors.md | 47 +++++++++++++++++++ ...st-bookshelf.md => 1828-rust-bookshelf.md} | 4 +- 2 files changed, 49 insertions(+), 2 deletions(-) create mode 100644 text/0000-rust-contributors.md rename text/{0000-rust-bookshelf.md => 1828-rust-bookshelf.md} (97%) diff --git a/text/0000-rust-contributors.md b/text/0000-rust-contributors.md new file mode 100644 index 00000000000..ef898e3360a --- /dev/null +++ b/text/0000-rust-contributors.md @@ -0,0 +1,47 @@ +- Feature Name: (fill me in with a unique ident, my_awesome_feature) +- Start Date: (fill me in with today's date, YYYY-MM-DD) +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +One para explanation of the feature. + +# Motivation +[motivation]: #motivation + +Why are we doing this? What use cases does it support? What is the expected outcome? + +# Detailed design +[design]: #detailed-design + +This is the bulk of the RFC. Explain the design in enough detail for somebody familiar +with the language to understand, and for somebody familiar with the compiler to implement. +This should get into specifics and corner-cases, and include examples of how the feature is used. + +# How We Teach This +[how-we-teach-this]: #how-we-teach-this + +What names and terminology work best for these concepts and why? +How is this idea best presented—as a continuation of existing Rust patterns, or as a wholly new one? + +Would the acceptance of this proposal change how Rust is taught to new users at any level? +How should this feature be introduced and taught to existing Rust users? + +What additions or changes to the Rust Reference, _The Rust Programming Language_, and/or _Rust by Example_ does it entail? 
+ +# Drawbacks +[drawbacks]: #drawbacks + +Why should we *not* do this? + +# Alternatives +[alternatives]: #alternatives + +What other designs have been considered? What is the impact of not doing this? + +# Unresolved questions +[unresolved]: #unresolved-questions + +What parts of the design are still TBD? diff --git a/text/0000-rust-bookshelf.md b/text/1828-rust-bookshelf.md similarity index 97% rename from text/0000-rust-bookshelf.md rename to text/1828-rust-bookshelf.md index 884dc499ee3..1222116a7c8 100644 --- a/text/0000-rust-bookshelf.md +++ b/text/1828-rust-bookshelf.md @@ -1,7 +1,7 @@ - Feature Name: N/A - Start Date: 2016-12-25 -- RFC PR: -- Rust Issue: +- RFC PR: https://github.com/rust-lang/rfcs/pull/1828 +- Rust Issue: https://github.com/rust-lang/rust/issues/39588 # Summary [summary]: #summary From ee6347ea059f8622a0d6141aefeb3fcba33895fd Mon Sep 17 00:00:00 2001 From: Steve Klabnik Date: Wed, 8 Feb 2017 12:16:50 -0500 Subject: [PATCH 1181/1195] Remove file accidentally included in text/ --- text/0000-rust-contributors.md | 47 ---------------------------------- 1 file changed, 47 deletions(-) delete mode 100644 text/0000-rust-contributors.md diff --git a/text/0000-rust-contributors.md b/text/0000-rust-contributors.md deleted file mode 100644 index ef898e3360a..00000000000 --- a/text/0000-rust-contributors.md +++ /dev/null @@ -1,47 +0,0 @@ -- Feature Name: (fill me in with a unique ident, my_awesome_feature) -- Start Date: (fill me in with today's date, YYYY-MM-DD) -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) - -# Summary -[summary]: #summary - -One para explanation of the feature. - -# Motivation -[motivation]: #motivation - -Why are we doing this? What use cases does it support? What is the expected outcome? - -# Detailed design -[design]: #detailed-design - -This is the bulk of the RFC. 
Explain the design in enough detail for somebody familiar -with the language to understand, and for somebody familiar with the compiler to implement. -This should get into specifics and corner-cases, and include examples of how the feature is used. - -# How We Teach This -[how-we-teach-this]: #how-we-teach-this - -What names and terminology work best for these concepts and why? -How is this idea best presented—as a continuation of existing Rust patterns, or as a wholly new one? - -Would the acceptance of this proposal change how Rust is taught to new users at any level? -How should this feature be introduced and taught to existing Rust users? - -What additions or changes to the Rust Reference, _The Rust Programming Language_, and/or _Rust by Example_ does it entail? - -# Drawbacks -[drawbacks]: #drawbacks - -Why should we *not* do this? - -# Alternatives -[alternatives]: #alternatives - -What other designs have been considered? What is the impact of not doing this? - -# Unresolved questions -[unresolved]: #unresolved-questions - -What parts of the design are still TBD? 
From 3f235e17d2fe59989f054ff3d1e0fcabb4765fa1 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Tue, 14 Feb 2017 06:53:10 -0800 Subject: [PATCH 1182/1195] RFC 1558 is Allow coercing non-capturing closures to function pointers --- ...0-closure-to-fn-coercion.md => 1558-closure-to-fn-coercion.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename text/{0000-closure-to-fn-coercion.md => 1558-closure-to-fn-coercion.md} (100%) diff --git a/text/0000-closure-to-fn-coercion.md b/text/1558-closure-to-fn-coercion.md similarity index 100% rename from text/0000-closure-to-fn-coercion.md rename to text/1558-closure-to-fn-coercion.md From deaa49d81c6fb555304ada685022d1f43a65c236 Mon Sep 17 00:00:00 2001 From: Joshua T Kalis Date: Wed, 15 Feb 2017 14:28:46 -0500 Subject: [PATCH 1183/1195] Remove duplicate word from documentation Under the heading "What the process is" a word was duplicated assumingly by accident: > "The sub-team will will either close" ... the above should probably be: > "The sub-team will either close" --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 75d926f318f..db1aa9547a1 100644 --- a/README.md +++ b/README.md @@ -113,7 +113,7 @@ are disingenuous about the drawbacks or alternatives tend to be poorly-received. from the larger community, and the author should be prepared to revise it in response. * Each pull request will be labeled with the most relevant [sub-team]. -* Each sub-team triages its RFC PRs. The sub-team will will either close the PR +* Each sub-team triages its RFC PRs. The sub-team will either close the PR (for RFCs that clearly will not be accepted) or assign it a *shepherd*. 
The shepherd is a trusted developer who is familiar with the RFC process, who will help to move the RFC forward, and ensure that the right people see and review From 0ba091cbe24b2961d65c71d1310ec43910c38f71 Mon Sep 17 00:00:00 2001 From: Joshua T Kalis Date: Thu, 16 Feb 2017 10:06:32 -0500 Subject: [PATCH 1184/1195] Change PR to pull request There was an inconsistent use of "PR" and "pull request" throughout the document. I believe that I read an article from GitHub that they suggested the use of "pull request" instead of "PR" for the purposes of clarity. --- README.md | 46 ++++++++++++++++++++++++---------------------- 1 file changed, 24 insertions(+), 22 deletions(-) diff --git a/README.md b/README.md index db1aa9547a1..5680d0b0e2b 100644 --- a/README.md +++ b/README.md @@ -113,29 +113,31 @@ are disingenuous about the drawbacks or alternatives tend to be poorly-received. from the larger community, and the author should be prepared to revise it in response. * Each pull request will be labeled with the most relevant [sub-team]. -* Each sub-team triages its RFC PRs. The sub-team will either close the PR -(for RFCs that clearly will not be accepted) or assign it a *shepherd*. The -shepherd is a trusted developer who is familiar with the RFC process, who will -help to move the RFC forward, and ensure that the right people see and review -it. +* Each sub-team triages its RFC pull requests. The sub-team will either close +the pull request (for RFCs that clearly will not be accepted) or assign it a +*shepherd*. The shepherd is a trusted developer who is familiar with the RFC +process, who will help to move the RFC forward, and ensure that the right people +see and review it. * Build consensus and integrate feedback. RFCs that have broad support are much more likely to make progress than those that don't receive any comments. The shepherd assigned to your RFC should help you get feedback from Rust developers as well. 
* The shepherd may schedule meetings with the author and/or relevant stakeholders to discuss the issues in greater detail. -* The sub-team will discuss the RFC PR, as much as possible in the comment -thread of the PR itself. Offline discussion will be summarized on the PR comment -thread. +* The sub-team will discuss the RFC pull request, as much as possible in the +comment thread of the pull request itself. Offline discussion will be summarized +on the pull request comment thread. * RFCs rarely go through this process unchanged, especially as alternatives and drawbacks are shown. You can make edits, big and small, to the RFC to -clarify or change the design, but make changes as new commits to the PR, and -leave a comment on the PR explaining your changes. Specifically, do not squash -or rebase commits after they are visible on the PR. +clarify or change the design, but make changes as new commits to the pull +request, and leave a comment on the pull request explaining your changes. +Specifically, do not squash or rebase commits after they are visible on the pull +request. * Once both proponents and opponents have clarified and defended positions and the conversation has settled, the RFC will enter its *final comment period* -(FCP). This is a final opportunity for the community to comment on the PR and is -a reminder for all members of the sub-team to be aware of the RFC. +(FCP). This is a final opportunity for the community to comment on the pull +request and is a reminder for all members of the sub-team to be aware of the +RFC. * The FCP lasts one week. It may be extended if consensus between sub-team members cannot be reached. At the end of the FCP, the [sub-team] will either accept the RFC by merging the pull request, assigning the RFC a number @@ -181,7 +183,7 @@ through to completion: authors should not expect that other project developers will take on responsibility for implementing their accepted feature. 
-Modifications to active RFC's can be done in follow-up PR's. We strive +Modifications to active RFC's can be done in follow-up pull requests. We strive to write each RFC in a manner that it will reflect the final design of the feature; but the nature of the process means that we cannot expect every merged RFC to actually reflect what the end result will be at @@ -198,7 +200,7 @@ specific guidelines in the sub-team RFC guidelines for the [language](lang_chang ## Reviewing RFC's [Reviewing RFC's]: #reviewing-rfcs -While the RFC PR is up, the shepherd may schedule meetings with the +While the RFC pull request is up, the shepherd may schedule meetings with the author and/or relevant stakeholders to discuss the issues in greater detail, and in some cases the topic may be discussed at a sub-team meeting. In either case a summary from the meeting will be @@ -206,10 +208,10 @@ posted back to the RFC pull request. A sub-team makes final decisions about RFCs after the benefits and drawbacks are well understood. These decisions can be made at any time, but the sub-team will -regularly issue decisions. When a decision is made, the RFC PR will either be -merged or closed. In either case, if the reasoning is not clear from the -discussion in thread, the sub-team will add a comment describing the rationale -for the decision. +regularly issue decisions. When a decision is made, the RFC pull request will +either be merged or closed. In either case, if the reasoning is not clear from +the discussion in thread, the sub-team will add a comment describing the +rationale for the decision. ## Implementing an RFC @@ -240,9 +242,9 @@ closed (as part of the rejection process). An RFC closed with “postponed” is marked as such because we want neither to think about evaluating the proposal nor about implementing the described feature until some time in the future, and we believe that we can afford to wait until then to do so. 
Historically, -"postponed" was used to postpone features until after 1.0. Postponed PRs may be -re-opened when the time is right. We don't have any formal process for that, you -should ask members of the relevant sub-team. +"postponed" was used to postpone features until after 1.0. Postponed pull +requests may be re-opened when the time is right. We don't have any formal +process for that, you should ask members of the relevant sub-team. Usually an RFC pull request marked as “postponed” has already passed an informal first round of evaluation, namely the round of “do we From 4cd75a6f822fe0a75fd502308d8f4674c31cebf3 Mon Sep 17 00:00:00 2001 From: king6cong Date: Tue, 21 Feb 2017 18:14:17 +0800 Subject: [PATCH 1185/1195] fix member path --- text/1525-cargo-workspace.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/1525-cargo-workspace.md b/text/1525-cargo-workspace.md index 0cd37ca8517..2022fc825b7 100644 --- a/text/1525-cargo-workspace.md +++ b/text/1525-cargo-workspace.md @@ -256,13 +256,13 @@ configuration necessary, are: ```toml # crates/crate1/Cargo.toml [package] - workspace = "../root" + workspace = "../../root" ``` ```toml # crates/crate2/Cargo.toml [package] - workspace = "../root" + workspace = "../../root" ``` Projects like the compiler will likely need exhaustively explicit configuration. From fd0e5b69d2c9baf2d59660a9b31abd0a72e116da Mon Sep 17 00:00:00 2001 From: Igor Polyakov Date: Thu, 2 Mar 2017 04:57:19 -0800 Subject: [PATCH 1186/1195] Updated the RFC to include lessons learned from #1812 --- text/0000-must-use-functions.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/text/0000-must-use-functions.md b/text/0000-must-use-functions.md index bef70b716a6..14417ff68bc 100644 --- a/text/0000-must-use-functions.md +++ b/text/0000-must-use-functions.md @@ -78,6 +78,13 @@ explicitly opt-in to also having important results, e.g. `#[must_use] fn ok(self) -> Option`. 
This is a natural generalisation of `#[must_use]` to allow fine-grained control of context sensitive info. +One of the most important use-cases for this would be annotating `PartialEq::{eq, ne}` with `#[must_use]`. + +There's a bug in Android where instead of `modem_reset_flag = 0;` the file affected has `modem_reset_flag == 0;`. +Rust does not do better in this case. If you wrote `modem_reset_flag == false;` the compiler would be perfectly happy and wouldn't warn you. + +See further discussion in [#1812.](https://github.com/rust-lang/rfcs/pull/1812) + # Detailed design If a semicolon discards the result of a function or method tagged with From f0a894f514cfd9f13e52d5e68422bf671484d8b5 Mon Sep 17 00:00:00 2001 From: iopq Date: Sun, 18 Jun 2017 03:37:58 -0700 Subject: [PATCH 1187/1195] Update 0000-must-use-functions.md --- text/0000-must-use-functions.md | 175 -------------------------------- 1 file changed, 175 deletions(-) diff --git a/text/0000-must-use-functions.md b/text/0000-must-use-functions.md index 14417ff68bc..2dcd039eccd 100644 --- a/text/0000-must-use-functions.md +++ b/text/0000-must-use-functions.md @@ -41,43 +41,6 @@ test.rs:6 returns_result(); ^~~~~~~~~~~~~~~~~ ``` -However, not every "important" (or, "usually want to use") result can -be a type that can be marked `#[must_use]`, for example, sometimes -functions return unopinionated type like `Option<...>` or `u8` that -may lead to confusion if they are ignored. For example, the `Result` type provides - -```rust -pub fn ok(self) -> Option { - match self { - Ok(x) => Some(x), - Err(_) => None, - } -} -``` - -to view any data in the `Ok` variant as an `Option`. Notably, this -does no meaningful computation, in particular, it does not *enforce* -that the `Result` is `ok()`. 
Someone reading a line of code -`returns_result().ok();` where the returned value is unused -cannot easily tell if that behaviour was correct, or if something else -was intended, possibilities include: - -- `let _ = returns_result();` to ignore the result (as - `returns_result().ok();` does), -- `returns_result().unwrap();` to panic if there was an error, -- `returns_result().ok().something_else();` to do more computation. - -This is somewhat problematic in the context of `Result` in particular, -because `.ok()` does not really (in the authors opinion) represent a -meaningful use of the `Result`, but it still silences the -`#[must_use]` error. - -These cases can be addressed by allowing specific functions to -explicitly opt-in to also having important results, e.g. `#[must_use] -fn ok(self) -> Option`. This is a natural generalisation of -`#[must_use]` to allow fine-grained control of context sensitive info. - One of the most important use-cases for this would be annotating `PartialEq::{eq, ne}` with `#[must_use]`. There's a bug in Android where instead of `modem_reset_flag = 0;` the file affected has `modem_reset_flag == 0;`. @@ -125,16 +88,6 @@ have this problem, since that sort of "passing-through" causes the outer piece of syntax to be of the `#[must_use]` type, and so is considered for the lint itself. -`Result::ok` is occasionally used for silencing the `#[must_use]` -error of `Result`, i.e. the ignoring of `foo().ok();` is -intentional. However, the most common way do ignore such things is -with `let _ =`, and `ok()` is rarely used in comparison, in most -code-bases: 2 instances in the rust-lang/rust codebase (vs. nearly 400 -text matches for `let _ =`) and 4 in the servo/servo (vs. 55 `let _ -=`). See the appendix for a more formal treatment of this -question. Yet another way to write this is `drop(foo())`, although -neither this nor `let _ =` have the method chaining style. - Marking functions `#[must_use]` is a breaking change in certain cases, e.g. 
if someone is ignoring their result and has the relevant lint (or warnings in general) set to be an error. This is a general problem of @@ -146,134 +99,6 @@ improving/expanding lints. and blocks, so that `(foo());`, `{ foo() };` and even `if cond { foo() } else { 0 };` are linted. -- Provide an additional method on `Result`, e.g. `fn ignore(self) {}`, so - that users who wish to ignore `Result`s can do so in the method - chaining style: `foo().ignore();`. - # Unresolved questions -- Are there many other functions in the standard library/compiler - would benefit from `#[must_use]`? - Should this be feature gated? - -# Appendix: is this going to affect "most code-bases"? - -(tl;dr: unlikely.) - -@mahkoh stated: - -> -1. I, and most code-bases, use ok() to ignore Result. - -Let's investigate. - -I sampled 50 random projects on [Rust CI](http://rust-ci.org), and -grepped for `\.ok` and `let _ =`. - -## Methodology - -Initially just I scrolled around and clicked things, may 10-15, the -rest were running this JS `var list = $("a"); -window.open(list[(Math.random() * list.length) | 0].href, '_blank')` -to open literally random links in a new window. Links that were not -projects (including 404s from deleted projects) and duplicates were -ignored. The grepping was performed by running `runit url`, where -`runit` is the shell function: - -```bash -function runit () { cd ~/tmp; git clone $1; cd $(basename $1); git grep '\.ok' | wc -l; git grep 'let _ =' | wc -l; } -``` - -If there were any `ok`s, I manually read the grep to see if they were -used on not. 
- -## Data - -| repo | used `\.ok` | unused `\.ok` | `let _ =` | -|------|-------------|---------------|-----------| -| https://github.com/csherratt/obj | 9 | 0 | 1 | -| https://github.com/csherratt/snowmew | 16 | 0 | 0 | -| https://github.com/bluss/petulant-avenger-graphlibrary | 0 | 0 | 12 | -| https://github.com/uutils/coreutils | 15 | 0 | 1 | -| https://github.com/apoelstra/rust-bitcoin/ | 5 | 0 | 3 | -| https://github.com/emk/abort_on_panic-rs | 0 | 0 | 1 | -| https://github.com/japaric/parallel.rs | 2 | 0 | 0 | -| https://github.com/phildawes/racer | 15 | 0 | 0 | -| https://github.com/zargony/rust-fuse | 7 | 7 | 0 | -| https://github.com/jakub-/rust-instrumentation | 0 | 0 | 2 | -| https://github.com/andelf/rust-iconv | 14 | 0 | 0 | -| https://github.com/pshc/brainrust | 25 | 0 | 0 | -| https://github.com/andelf/rust-2048 | 3 | 0 | 0 | -| https://github.com/PistonDevelopers/vecmath | 0 | 0 | 2 | -| https://github.com/japaric/serial.rs | 1 | 0 | 0 | -| https://github.com/servo/html5ever | 14 | 0 | 1 | -| https://github.com/sfackler/r2d2 | 8 | 0 | 0 | -| https://github.com/jamesrhurst/rust-metaflac | 2 | 0 | 0 | -| https://github.com/arjantop/rust-bencode | 3 | 0 | 1 | -| https://github.com/Azdle/dolos | 0 | 2 | 0 | -| https://github.com/ogham/exa | 2 | 0 | 0 | -| https://github.com/aatxe/irc-services | 0 | 0 | 5 | -| https://github.com/nwin/chatIRC | 0 | 0 | 8 | -| https://github.com/reima/rustboy | 1 | 0 | 2 | - -These had no matches at all for `.ok` or `let _ =`: - -- https://github.com/hjr3/hal-rs, -- https://github.com/KokaKiwi/lua-rs, -- https://github.com/dwrensha/capnpc-rust, -- https://github.com/samdoshi/portmidi-rs, -- https://github.com/PistonDevelopers/graphics, -- https://github.com/vberger/ircc-rs, -- https://github.com/stainless-steel/temperature, -- https://github.com/chris-morgan/phantom-enum, -- https://github.com/jeremyletang/rust-portaudio, -- https://github.com/tikue/rust-ml, -- https://github.com/FranklinChen/rust-tau, -- 
https://github.com/GuillaumeGomez/rust-GSL, -- https://github.com/andelf/rust-httpc, -- https://github.com/huonw/stable_vec, -- https://github.com/TyOverby/rust-termbox, -- https://github.com/japaric/stats.rs, -- https://github.com/omasanori/mesquite, -- https://github.com/andelf/rust-iconv, -- https://github.com/aatxe/dnd, -- https://github.com/pshc/brainrust, -- https://github.com/vsv/rustulator, -- https://github.com/erickt/rust-mongrel2, -- https://github.com/Geal/rust-csv, -- https://github.com/vhbit/base32-rs, -- https://github.com/PistonDevelopers/event, -- https://github.com/untitaker/rust-atomicwrites. - -Disclosure, `snowmew` and `coreutils` were explicitly selected after -recognising their names (i.e. non-randomly), but this before the -`runit` script was used, and before any grepping was performed in any -of these projects. - -The data in R form if you wish to play with it yourself: -```r -structure(list(used.ok = c(9, 16, 0, 15, 5, 0, 2, 15, 7, 0, 14, -25, 3, 0, 1, 14, 8, 2, 3, 0, 2, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, -0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), unused.ok = c(0, -0, 0, 0, 0, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, -0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -0, 0, 0, 0, 0, 0, 0), let = c(1, 0, 12, 1, 3, 1, 0, 0, 0, 2, -0, 0, 0, 2, 0, 1, 0, 0, 1, 0, 0, 5, 8, 2, 0, 0, 0, 0, 0, 0, 0, -0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), .Names = c("used.ok", -"unused.ok", "let"), row.names = c(NA, -50L), class = "data.frame") -``` - -## Analysis - -I will assume that a crate author uses *either* `let _ =` or `\.ok()` -for ignoring `Result`s, but not both. The crates with neither `let _ -=`s nor unused `.ok()`s are not interesting, as they haven't indicated -a preference either way. Removing those leaves 14 crates, 2 of which -use `\.ok()` and 12 of which use `let _ =`. - -The null hypothesis is that `\.ok()` is used at least as much as `let -_ =`. A one-sided binomial test (e.g. 
`binom.test(c(2, 12), -alternative = "less")` in R) has p-value 0.007, leading me to reject -the null hypothesis and accept the alternative, that `let _ =` is used -more than `\.ok`. - -(Sorry for the frequentist analysis.) From 5b332c25863501b8f61d84525005f85794635e61 Mon Sep 17 00:00:00 2001 From: iopq Date: Sun, 18 Jun 2017 03:47:20 -0700 Subject: [PATCH 1188/1195] Update 0000-must-use-functions.md --- text/0000-must-use-functions.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/text/0000-must-use-functions.md b/text/0000-must-use-functions.md index 2dcd039eccd..2f96ee8b9c4 100644 --- a/text/0000-must-use-functions.md +++ b/text/0000-must-use-functions.md @@ -44,7 +44,12 @@ test.rs:6 returns_result(); One of the most important use-cases for this would be annotating `PartialEq::{eq, ne}` with `#[must_use]`. There's a bug in Android where instead of `modem_reset_flag = 0;` the file affected has `modem_reset_flag == 0;`. -Rust does not do better in this case. If you wrote `modem_reset_flag == false;` the compiler would be perfectly happy and wouldn't warn you. +Rust does not do better in this case. If you wrote `modem_reset_flag == false;` the compiler would be perfectly happy and wouldn't warn you. 
By marking this function `#[must_use]` the compiler would complain about things like: + +``` + modem_reset_flag == false; //warning + modem_reset_flag = false; //ok +``` See further discussion in [#1812.](https://github.com/rust-lang/rfcs/pull/1812) From a2c640b45e7faa5cc97d3ac7dffbb4cbf3674e5e Mon Sep 17 00:00:00 2001 From: iopq Date: Sun, 18 Jun 2017 04:23:50 -0700 Subject: [PATCH 1189/1195] Update 0000-must-use-functions.md --- text/0000-must-use-functions.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/text/0000-must-use-functions.md b/text/0000-must-use-functions.md index 2f96ee8b9c4..d7ee2d7be66 100644 --- a/text/0000-must-use-functions.md +++ b/text/0000-must-use-functions.md @@ -78,6 +78,14 @@ fn qux() { } ``` +The primary motivation is to mark `PartialEq` functions as `#[must_use]`: + +``` +#[must_use = "the result of testing for equality should not be discarded"] +fn eq(&self, other: &Rhs) -> bool; +``` + +The same thing for `ne`, and also `lt`, `gt`, `ge`, `gt` in `PartialOrd`. There is no reason to discard the results of those operations. # Drawbacks From ab711951bacff2852f9732dc44630f862a7f96c5 Mon Sep 17 00:00:00 2001 From: iopq Date: Sun, 18 Jun 2017 04:28:46 -0700 Subject: [PATCH 1190/1195] Update 0000-must-use-functions.md --- text/0000-must-use-functions.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-must-use-functions.md b/text/0000-must-use-functions.md index d7ee2d7be66..a9fde67764f 100644 --- a/text/0000-must-use-functions.md +++ b/text/0000-must-use-functions.md @@ -85,7 +85,7 @@ The primary motivation is to mark `PartialEq` functions as `#[must_use]`: fn eq(&self, other: &Rhs) -> bool; ``` -The same thing for `ne`, and also `lt`, `gt`, `ge`, `gt` in `PartialOrd`. There is no reason to discard the results of those operations. +The same thing for `ne`, and also `lt`, `gt`, `ge`, `gt` in `PartialOrd`. There is no reason to discard the results of those operations. 
This means the `impl`s of these functions are not changed, it still issues a warning even for a custom `impl`. # Drawbacks @@ -111,6 +111,8 @@ improving/expanding lints. - Adjust the rule to propagate `#[must_used]`ness through parentheses and blocks, so that `(foo());`, `{ foo() };` and even `if cond { foo() } else { 0 };` are linted. + +- Should we let particular `impl`s of a function have this attribute? Current design allows you to attach it inside the declaration of the trait. # Unresolved questions From 8fb6b5daf2c0150c57af0c99dab258bca2283537 Mon Sep 17 00:00:00 2001 From: iopq Date: Mon, 19 Jun 2017 14:06:03 -0700 Subject: [PATCH 1191/1195] Update 0000-must-use-functions.md --- text/0000-must-use-functions.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-must-use-functions.md b/text/0000-must-use-functions.md index a9fde67764f..cca88c24df6 100644 --- a/text/0000-must-use-functions.md +++ b/text/0000-must-use-functions.md @@ -47,8 +47,8 @@ There's a bug in Android where instead of `modem_reset_flag = 0;` the file affec Rust does not do better in this case. If you wrote `modem_reset_flag == false;` the compiler would be perfectly happy and wouldn't warn you. 
By marking this function `#[must_use]` the compiler would complain about things like: ``` - modem_reset_flag == false; //warning - modem_reset_flag = false; //ok + modem_reset_flag == returns_bool(); //warning + modem_reset_flag = returns_bool(); //ok ``` See further discussion in [#1812.](https://github.com/rust-lang/rfcs/pull/1812) From 2711ff86bd000ac626bbb4ecda6bffd7c7682180 Mon Sep 17 00:00:00 2001 From: iopq Date: Mon, 19 Jun 2017 14:06:58 -0700 Subject: [PATCH 1192/1195] Update 0000-must-use-functions.md --- text/0000-must-use-functions.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-must-use-functions.md b/text/0000-must-use-functions.md index cca88c24df6..a9fde67764f 100644 --- a/text/0000-must-use-functions.md +++ b/text/0000-must-use-functions.md @@ -47,8 +47,8 @@ There's a bug in Android where instead of `modem_reset_flag = 0;` the file affec Rust does not do better in this case. If you wrote `modem_reset_flag == false;` the compiler would be perfectly happy and wouldn't warn you. By marking this function `#[must_use]` the compiler would complain about things like: ``` - modem_reset_flag == returns_bool(); //warning - modem_reset_flag = returns_bool(); //ok + modem_reset_flag == false; //warning + modem_reset_flag = false; //ok ``` See further discussion in [#1812.](https://github.com/rust-lang/rfcs/pull/1812) From 30f8ab03c83f63e485ca081ae00e7cf99e63a116 Mon Sep 17 00:00:00 2001 From: iopq Date: Mon, 19 Jun 2017 14:07:48 -0700 Subject: [PATCH 1193/1195] Update 0000-must-use-functions.md --- text/0000-must-use-functions.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-must-use-functions.md b/text/0000-must-use-functions.md index a9fde67764f..eed42ba57cf 100644 --- a/text/0000-must-use-functions.md +++ b/text/0000-must-use-functions.md @@ -44,7 +44,7 @@ test.rs:6 returns_result(); One of the most important use-cases for this would be annotating `PartialEq::{eq, ne}` with `#[must_use]`. 
There's a bug in Android where instead of `modem_reset_flag = 0;` the file affected has `modem_reset_flag == 0;`. -Rust does not do better in this case. If you wrote `modem_reset_flag == false;` the compiler would be perfectly happy and wouldn't warn you. By marking this function `#[must_use]` the compiler would complain about things like: +Rust does not do better in this case. If you wrote `modem_reset_flag == false;` the compiler would be perfectly happy and wouldn't warn you. By marking `PartialEq` `#[must_use]` the compiler would complain about things like: ``` modem_reset_flag == false; //warning From 6f7aa4237ebb652bc7b393f71131ab1b97127b23 Mon Sep 17 00:00:00 2001 From: iopq Date: Mon, 10 Jul 2017 20:58:51 -0700 Subject: [PATCH 1194/1195] updated summary --- text/0000-must-use-functions.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-must-use-functions.md b/text/0000-must-use-functions.md index eed42ba57cf..4f4aa8e6344 100644 --- a/text/0000-must-use-functions.md +++ b/text/0000-must-use-functions.md @@ -7,7 +7,7 @@ Support the `#[must_use]` attribute on arbitrary functions, to make the compiler lint when a call to such a function is ignored. Mark -`Result::{ok, err}` `#[must_use]`. +`PartialEq::{eq, ne}` `#[must_use]` as well as `PartialOrd::{lt, gt, le, ge}`. # Motivation @@ -85,7 +85,7 @@ The primary motivation is to mark `PartialEq` functions as `#[must_use]`: fn eq(&self, other: &Rhs) -> bool; ``` -The same thing for `ne`, and also `lt`, `gt`, `ge`, `gt` in `PartialOrd`. There is no reason to discard the results of those operations. This means the `impl`s of these functions are not changed, it still issues a warning even for a custom `impl`. +The same thing for `ne`, and also `lt`, `gt`, `ge`, `le` in `PartialOrd`. There is no reason to discard the results of those operations. This means the `impl`s of these functions are not changed, it still issues a warning even for a custom `impl`. 
# Drawbacks From f4b68532206f0a3e0664877841b407ab1302c79a Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Mon, 17 Jul 2017 15:41:04 -0700 Subject: [PATCH 1195/1195] RFC 1940 is Allow `#[must_use]` on functions --- ...{0000-must-use-functions.md => 1940-must-use-functions.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-must-use-functions.md => 1940-must-use-functions.md} (97%) diff --git a/text/0000-must-use-functions.md b/text/1940-must-use-functions.md similarity index 97% rename from text/0000-must-use-functions.md rename to text/1940-must-use-functions.md index 4f4aa8e6344..f631e911876 100644 --- a/text/0000-must-use-functions.md +++ b/text/1940-must-use-functions.md @@ -1,7 +1,7 @@ - Feature Name: none? - Start Date: 2015-02-18 -- RFC PR: [rust-lang/rfcs#886](https://github.com/rust-lang/rfcs/pull/886) -- Rust Issue: (leave this empty) +- RFC PR: https://github.com/rust-lang/rfcs/pull/1940 +- Rust Issue: https://github.com/rust-lang/rust/issues/43302 # Summary
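As a rough illustration of the behaviour this RFC describes, the sketch below tags an ordinary function with `#[must_use]`. The function name `checksum` and its message string are hypothetical and chosen only for this example; they do not come from the RFC text.

```rust
// Hypothetical example: `checksum` is an illustrative name, not part of the RFC.

/// Sums the bytes with wrap-around. Calling this and throwing the
/// result away is almost certainly a mistake, so it is tagged `#[must_use]`.
#[must_use = "the checksum is the only reason to call this function"]
fn checksum(bytes: &[u8]) -> u8 {
    bytes.iter().fold(0u8, |acc, b| acc.wrapping_add(*b))
}

fn main() {
    let data = [1u8, 2, 3];

    // Uncommenting the next line discards the result with a semicolon,
    // which is the situation the `unused_must_use` lint is meant to flag:
    // checksum(&data);

    // Binding and using the result is fine.
    let sum = checksum(&data);
    println!("checksum = {}", sum);

    // `let _ =` is the usual way to say the result is ignored on purpose.
    let _ = checksum(&data);
}
```

The same attribute placed on `PartialEq::eq` is what would let a stray `modem_reset_flag == false;` statement be reported instead of passing silently.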