Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Low level struct updates #5602

Merged
merged 28 commits into from
Jan 13, 2022
Merged

Low level struct updates #5602

merged 28 commits into from
Jan 13, 2022

Conversation

jaredpar
Copy link
Member

@jaredpar jaredpar commented Jan 3, 2022

Updates to the low level struct improvements document. Changes:

  1. Tried to put the compat issue up front. Want readers to understand that as a separate concept as it comes up many times in the feature design
  2. Solved the issue of how non-constructors could return ref struct which had ref fields
  3. Have firm rules now about which life time annotations we will support and how they interact with the existing rules

@jaredpar jaredpar requested a review from a team as a code owner January 3, 2022 20:38

- Allow `ref struct` types to declare `ref` fields.
- Allow the runtime to fully define `Span<T>` using the C# type system and
remove special case type like `ByReference<T>`
- Allow the runtime to fully define `Span<T>` using the C# type system and remove special case type like `ByReference<T>`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are tehre any other speical types the runtime has that they could get rid of as well now? (just curious)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No other special types but it's likely that once we have these features we can essentially eliminate the concept of restricted types. They are expressable now in terms of a ref struct with a ref field.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW an existing special\restricted types being discussed for supporting dotnet/runtime#45152 (ref struct + reflection with related "fast invoke" feature) is TypedReference which contains a loosely-typed ByReference expressed as a ByReference<byte>.

In order to support dotnet/runtime#45152, a static TypedReference FromRef<T>(ref T value) factory method is proposed which will create a TypedReference that references a ref struct. Today calling such a factory method is not possible with a ref struct since a ref struct type can't be used as a generic parameter, however that is a different issue; also note the method may be called "FromRefStruct" if necessary to distinguish between a ref struct and non-ref-structs so a new constraint like where T : ref struct can be added if necessary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Today calling such a factory method is not possible with a ref struct since a ref struct type can't be used as a generic parameter, however that is a different issue;

There is going to be a proposal soon-ish about how to make ref struct work with generic type arguments. I don't think we fully need that though to enable the TypedReference support. I think as a stop gap measure we could always just adjust the intrinsic __makeref to handle ref struct.

return CreateSpan(ref local);
// Okay
Span<T> span = stackalloc int[1];
return CreateSpan<int>(ref span[0]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait... why is this ok? :) furthermore... if it was legal before... woudl it be bad to become illegal?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was legal because there was no mechanism by which CreateSpan(ref int local) could every capture and return the ref. Basically there is no way given the rules of C# and the safe APIs of dotnet/runtime to make the returned Span<T> refer to the passed in value.

Yes you could do this using MemoryMarshal or Unsafe but those are APIs that are considered unsafe. They are essentially outside the bounds of span safety guidelines (and are documented as such)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jaredpar, I think the "issue" here is that the code is confusing to read.

If you were to think about this from the perspective of the framework design guidelines and how the BCL names things; the "logical assumption" of CreateSpan<int>(ref T) is that it creates a span over the ref input and so the readers initial "assumption" is that CreateSpan<int>(stackalloc) is returning a span that captures the stack allocation.

The language prevents this today (outside of unsafe code), but its still not how the code reads. This is somewhat made more "confusing" by the fact that part of this doc is meant to enable scenarios where CreateSpan(ref T) capturing the input is possible.

Perhaps it would be more clear if this was CreateSpanThatDoesntCaptureTheInput (e.g. a method without RefThisEscapes) and CreateSpanThatCapturesTheInput (e.g. a method with RefThisEscapes)?
-- or better names, because names are hard and my suggestions are terrible 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal of this section though is to really drive home the point that

  1. This method creates a Span<T>
  2. It can never, under any circumstances, capture that input value

If I name it say CreateSpanThatDoesntCaptureTheInput it might imply that this particular method does not capture the input but maybe some other method could. Where the problem is there is just no way this can happen, no matter the name or implementatoin.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I name it say CreateSpanThatDoesntCaptureTheInput it might imply that this particular method does not capture the input but maybe some other method could. Where the problem is there is just no way this can happen, no matter the name or implementatoin.

I thought that was the point of RefThisEscapes? RefThisEscapes means that you can have a span that captures the ref input. The language would then block it in the case of stackalloc because its unsafe.

That is, there are at least four basic scenarios to consider here, three of which are legal:

// this is legal as `createdSpan` can't capture `ref span[0]` as it isn't attributed with `RefThisEscapes`
Span<T> stackSpan = stackalloc int[1];
var createdSpan = CreateSpanWithoutRefThisEscapes<int>(ref span[0]);

// this is legal as `createdSpan` can't capture `ref span[0]` as it isn't attributed with `RefThisEscapes`
Span<T> stackSpan = stackalloc int[1];
return CreateSpanWithoutRefThisEscapes<int>(ref span[0]);

// this is legal as even though `createdSpan` can capture `ref span[0]` since it is attributed with `RefThisEscapes`, it isn't escaping the scope of `stackSpan`
Span<T> stackSpan = stackalloc int[1];
var createdSpan = CreateSpanWithRefThisEscapes<int>(ref span[0]);

// this is ILLEGAL as `createdSpan` can capture `ref span[0]` and if it did, it would allow `stackSpan` to escape its scope
Span<T> stackSpan = stackalloc int[1];
return CreateSpanWithRefThisEscapes<int>(ref span[0]);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RefThisEscapes means that you can have a span that captures the ref input

No. The only thing that RefThisEscapes alters is that it makes it possible to return this by-ref from instance methods on struct. It does not change your ability to capture a ref as a field.

Copy link
Member

@tannergooding tannergooding Jan 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm getting the attributes mixed up then?

This allows ref fields, which means that you can functionally new Span(ref param) and the created Span will have the scope of the passed input.

So if the CreateSpan call uses that method and returns the span, its lifetime is the same as the input and the compiler does the right analysis.

By default the compiler blocks it, due to existing back-compat rules, etc.

The user manually opts in to supporting that scenario via one of the attributes that are covered by this spec, which is a "breaking change" for existing APIs.

}
```

The reason that all of the above samples are legal is because in the existing design there is no way for the return `Span<T>` to store a reference to the input state of the method call. This is because the span safety rules explicitly depend on `Span<T>` not having a constructor which takes a `ref` parameter and stores it as a field.
All of the above `return` statements are legal by our existing rules because the return of `CreateSpan` is always *safe-to-escape* to the *calling method*. The challenge in this proposal, which comes up many times, is they **must** remain legal in the version of the language which implements these features and / or when `Span<T>` moves to using `ref` fields. The above patterns can legally exist today and cannot become errors when moving to a new language version.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and cannot become errors when moving to a new language version.

Would it be ok for them to become errors if there was true memory unsafety before?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning wave maybe?

There is, on the BCL side, the interesting case of MemoryMarshal.CreateSpan() which functionally works like the proposed constructor (and via an internal Span(ref T ptr, int length) constructor):

[RefFieldEscapes]
public Span(ref T value) { }

The BCL will need to decide how it works moving forward, including whether it starts erroring just like the proposed constructor, or continues allowing users to do "unsafe" things (potentially with an analyzer, etc).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be ok for them to become errors if there was true memory unsafety before?

This pattern is completely safe though. The only way this could be unsafe is if Span<T> had a constructor in the form of Span(ref T value) which is explicitly disallowed by the span safety rules. Yes Span<T> does have this CTOR but it's internal and known to be unsafe.

So there is nothing wrong with these patterns. They are safe, legal and honestly expected. The issue is that we must maintain those properties going forward.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning wave maybe?

How would a warning wave help though? Consider that by adding [RefFieldEscapes] to a method you change the semantics from the perspective of the caller. Or consider it this way:

Span<int> MyMethod() {
  int local = 42;
  Span<int> span = CreateSpan(ref local); 
  return span;
}

This is 100% legal today. If we suddenly warn customers into using [RefFieldsEscape] on such methods then it suddenly breaks callers of that method.

Copy link
Member

@CyrusNajmabadi CyrusNajmabadi Jan 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just so i understand with that example. You're sayign that that code coudl exist and have legit uses (for example, not captuing 'local' within the returned span). And as such, because of hte legal reasons, we couldn't jsut always start warning/error'ing here.

however, what we can do is have CreateSpan announce its intentions about what happens with the passed in 'ref'. If it ignored it (as an example) it would just do nothing, and the above would continue to be fine. However, if it ends up pushing that parameter into the span it returns, then it shoudl have this attribute (virally spread hopefully from Span<T> inside it) so that hte caller can now error in that case.

--

In other words, the attribute allows cooperation between the caller and callee here on legal cases that should continue working, vs bad cases that ideally should break because they're entire memory unsafe. Is that right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is 100% legal today.

Right, and in the case where CreateSpan isn't updated to use some method/constructor marked RefFieldsEscape to create the returned span, it continues being "safe".

  • If CreateSpan were updated to use [RefFieldEscapes] public Span(ref T value) { }, then I would assume the compiler errors as it expects CreateSpan to then be equivalently annotated, etc. This would be a breaking change for a library to make and therefore not recommended.

There are, however, scenarios like with MemoryMarshal.CreateSpan where its known to be "unsafe" and where the compiler (or an analyzer) could surface a warning that it functionally works like [RefFieldEscapes] public Span(ref T value) { } which would help catch problematic cases (particularly for users transitioning from "old unsafe logic" to "new safe logic").

You could also consider cases like return new Span(&local, 1) where its unsafe and legal, but its also a detectable problem and where a warning being surfaced is probably a "good idea" (maybe these are better suited to a library analyzer though)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just so i understand with that example. You're sayign that that code coudl exist and have legit uses (for example, not captuing 'local' within the returned span). And as such, because of hte legal reasons, we couldn't jsut always start warning/error'ing here.

Exactly. It's perfectly okay to create such an API and give it legitamate uses. That exact pattern exists a few places in dotnet/runtime today (SecureString and some parts of DateTime parsing I was able to find).

In other words, the attribute allows cooperation between the caller and callee here on legal cases that should continue working, vs bad cases that ideally should break because they're entire memory unsafe. Is that right?

yep.

Copy link
Member Author

@jaredpar jaredpar Jan 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are, however, scenarios like with MemoryMarshal.CreateSpan where its known to be "unsafe" and where the compiler (or an analyzer) could surface a warning ...

I'd actually vote in net7.0 that we obsolete this API. Basically say "use new Span<T>" instead please. That would force the caller to use unsafe in the cases where actual unsafe behavior was still needed.

I'm not sure how feasible that is though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep.

Gracias! you probably did write all that out in the PR. but it helped to get that clarity here :)

}
// Error: if this works it breaks the above compat requirements because the implementation
// is now capturing `ref` state.
return Span<T>(ref parameter);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm not sure if this is a great example. since the ref is captured, but only in somethign passed back to the caller, this is safe right (and i'm not sure how it breaks a compat requirement "if this works").

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me try expanding that sample out a bit. Essentially getting through the parts that will work (calling the ctor) and the parts that won't (returning it).

- Else the *safe-to-escape* scope is to the *current method*
- Else the *safe-to-escape* scope of the return is the minimum of
- The existing *safe-to-escape* calculation for method invocation
- All of the *ref-safe-to-escape* values of `ref` and `in` arguments
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whooboy. this last part went entirely over my head :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That language takes some getting used to but it's how the original span safety doc was written. I want this doc to provide a map for how these features will fit into our existing spec. Essentially what changes do we need to make to extend our existing rules.

jaredpar and others added 4 commits January 3, 2022 14:23
Co-authored-by: CyrusNajmabadi <cyrus.najmabadi@gmail.com>
@RikkiGibson RikkiGibson self-assigned this Jan 4, 2022

### Provide struct this escape annotation
The rules for the scope of `this` in a `struct` limit the *ref-safe-to-escape* scope to the current method. That means neither `this`, nor any of its fields can return by reference to the caller.
### Provide lifetime annotations
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should have another annotation for extremely specialized cases where the method implementor wants to prohibit escape, but its not a capturing escape. For instance, the StackAlloc api I proposed in a hackathon a few months ago.

class RuntimeHelpers
{
    static Span<T> StackAlloc<T>(int count) {... }
}

Ideally there would be an annotation that allows specifying that the Span returned from StackAlloc is not safe to escape.

OTOH, this may be so incredibly niche that we should just special case the function and move on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this too. I think it would come down to how many of these functions we ended up creating. If it's only going ot be a small number, say < 5, and they're all effectively runtime intrinsics then possibly we just special case them. If it's more than that though then I'd favor adding an annotation for them.

Co-authored-by: Charles Stoner <10732005+cston@users.noreply.github.com>
Co-authored-by: Aaron Robinson <arobins@microsoft.com>
{
int local = 0;
Span<T> span = new(ref local);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming the constructor has [RefFieldsEscape], doesn't this fit the escape scope model since local and span both have the same scope?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. If [RefFieldEscapes] is on the ctor then both local and span have the same safe-to-escape scope

@cston
Copy link
Member

cston commented Jan 7, 2022

    Span<int> span = stakalloc int[42];

stackalloc


Refers to: proposals/low-level-struct-improvements.md:851 in d10b06d. [](commit_id = d10b06d, deletion_comment = False)

Co-authored-by: Charles Stoner <10732005+cston@users.noreply.github.com>

// Ref assembly
ref struct S<T> {
object _o; // force unmanaged
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the additional field change the size of the ref struct for the consumer?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally size is calculated at runtime based on the impl assembly, not the reference assembly. It's generally considered acceptable for reference assemblies to add / remove fields. Been a common practice in GenAPI for a number of years.

jaredpar and others added 2 commits January 13, 2022 11:55
Co-authored-by: Charles Stoner <10732005+cston@users.noreply.github.com>
Co-authored-by: Charles Stoner <10732005+cston@users.noreply.github.com>
@jaredpar
Copy link
Member Author

Thanks everyone for their feedback on the proposal. I'm going to proceed with the merge as more detailed planning around this feature is starting to kick off. Need a merged proposal to allow better linking into relevant sections of the doc.

Please don't let that stop you from continuing to ask questions, file issues or suggest changes. More than happy to keep the convo going on this!

@jaredpar jaredpar merged commit b696e13 into dotnet:main Jan 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

8 participants