Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose an attribute to allow sizing a fixed sized buffer based on type and element count. #12320

Closed
tannergooding opened this issue Mar 22, 2019 · 37 comments
Labels
area-TypeSystem-coreclr Bottom Up Work Not part of a theme, epic, or user story
Milestone

Comments

@tannergooding
Copy link
Member

Today, C# emulates fixed-sized buffers by emitting an explicitly sized struct with a single-field. This works decently enough for primitive types which have a constant size across varying platforms, but it does not work well for user-defined structs which may have different packing or different sizes across varying platforms/architectures.

This can be worked around by explicitly declaring a struct that contains x elements of type T, but this can quickly become overly verbose for fixed-sized buffers that contain many elements (e.g. 128 or even 1025).

I propose we expose an attribute that the VM will recognize and which it will use to dynamically size the type based on the information given.

This will allow a user to define something similar to:

public struct DXGI_RGB
{
    public float Red;
    public float Green;
    public float Blue;
}

public struct DXGI_GAMMA_CONTROL
{
    public DXGI_RGB Scale;
    public DXGI_RGB Offset;
    public _GammaCurve_e__FixedBuffer GammaCurve;    // DXGI_RGB[1025]

    [FixedSizedBuffer(typeof(DXGI_RGB), count: 1025)]
    public struct _GammaCurve_e__FixedBuffer
    {
        private DXGI_RGB e0;

        public ref DXGI_RGB this[int index] => ref AsSpan()[index];

        public Span<DXGI_RGB> AsSpan() => MemoryMarshal.CreateSpan(ref e0, 1025);
    }
}

This would also be beneficial for the fixed-sized-buffers proposal in C#, as it would avoid the metadata bloat problem that exists: https://github.com/dotnet/csharplang/blob/725763343ad44a9251b03814e6897d87fe553769/proposals/fixed-sized-buffers.md

@tannergooding
Copy link
Member Author

CC. @jkotas, @jaredpar
Also CC. @davidwrighton, @VSadov who (IIRC) had previous thoughts in this area

Does this sound like something that would be feasible?

@tannergooding
Copy link
Member Author

Such an attribute might end up looking like:

[AttributeUsage(AttributeTargets.Struct, AllowMultiple = false, Inherited = false)]
public sealed class FixedSizedBufferAttribute : Attribute
{
    public FixedSizedBufferAttribute(Type type, int count)
    {
        Type = type;
        Count = count;
    }

    public Type Type { get; }

    public int Count { get; }
}

@tannergooding
Copy link
Member Author

This wouldn't quite work for pointer types, since you can't have typeof(T*), but maybe that could be resolved via an additional property: IsPointerType?

@jaredpar
Copy link
Member

        public ref DXGI_RGB this[int index] => ref AsSpan()[index];

        public Span<DXGI_RGB> AsSpan() => MemoryMarshal.CreateSpan(ref e0, 1025);

For anyone reading ... please don't do this in practice. It violates our rules around lifetimes. Assuming @tannergooding just meant this as an example of how to implement this. Not as guidance for how to AV your application 😄

This seems sensible from the language side. But it does mean that we'd end up tying this feature to MemoryMarshal.CreateSpan which would limit its use down level.

@tannergooding
Copy link
Member Author

Assuming @tannergooding just meant this as an example of how to implement this.

Right, Ideally this would all be handled by the compiler safely and behind the scenes (similar to how indexing a single element in a regular fixed sized buffer no longer requires pinning).

@jkotas
Copy link
Member

jkotas commented Mar 22, 2019

This would not work for generics. Attribute arguments cannot use type parameters.

you can't have typeof(T*)

FWIW, this is artificial C# limitation. You can have typeof(T*) in IL just fine.

@jaredpar
Copy link
Member

FWIW, this is artificial C# limitation. You can have typeof(T*) in IL just fine.

Is the behavior here to just return a Type object that represents a specific pointer type or general pointers? If it's just a limitation on our side maybe we should remove it when we remove the limitation of sizeof in unsafe code. Same theme of removing an artificial limitation.

@tannergooding
Copy link
Member Author

This would not work for generics

That's a good point. I wonder if there is a good/clever way to workaround this limitation.

Not supporting generics would probably still cover most interesting interop scenarios, but would make writing certain types of managed performance optimizations a bit troublesome.

FWIW, this is artificial C# limitation. You can have typeof(T*) in IL just fine.

@jaredpar, is this something that I should log a language request for?

@jaredpar
Copy link
Member

@tannergooding

is this something that I should log a language request for?

See https://github.com/dotnet/coreclr/issues/23408#issuecomment-475690498

@VSadov
Copy link
Member

VSadov commented Mar 22, 2019

A while ago we had a proposal for "Inlined Indexable Data", aka "struct arrays".
A feature is similar to arrays in Go but a bit more fitting for C#.
There was even a working prototype, but not a lot of interest at the time.

I think you are really asking for struct arrays :-).

With that you could do the following:

public struct DXGI_RGB
{
    public float Red;
    public float Green;
    public float Blue;
}

public struct DXGI_GAMMA_CONTROL
{
    public DXGI_RGB Scale;
    public DXGI_RGB Offset;

    // IID variable. Can also be a class field, a parameter, a local, a threadstatic or whatever.
    // DXGI_RGB[1025] is just a sugared struct that statically knows its size and can be indexed.
    public DXGI_RGB[1025] data;     
}

DXGI_GAMMA_CONTROL ctrl = default;

Span<DXGI_RGB> span = ctrl.data[..]; //span escape is tied to ref-escape of ctrl, so this is safe.
ManagedHelper(span);

fixed (void* pData = &ctrl.data)
{
      NativeHelper(pData, ctrl.data.Length);
}

. . . 

I think struct arrays could be very useful, in particular for vector/matrix stuff like in graphics and ML.
If there is more interest now, I may resurrect that.

On the language side the feature is somewhat similar to tuples. There are N predefined structs and for N+ we nest. Perhaps a bit simpler than tuples, since there is only one TElement for the whole thing. There is much less type instantiation involved since nesting is tree-like not lists-like.

There could be some support needed on the runtime side -

  • The prototype did not require any extras, but there were concerns with packing. Sequential packing of generic fields is not documented, but we would need to ensure that for underlying structs.
  • As usual some extra magic like hoisting range checks could be nice too.
  • Possibly a different mechanism to express the structs. I just used what worked already.

@tannergooding
Copy link
Member Author

I think you are really asking for struct arrays :-)

Yes, struct (or "value") arrays are basically a natural extension of the fixed-sized buffer proposal. I remember you and I had talked about them a couple of times in the past.

I think one of the "key" differentiators is that fixed-sized buffers are largely used for interop code today and the primary goal of this extension is to remove what can be a pain point for interop code. While value arrays are likely more oriented towards managed code.

I do think this distinction becomes less important if the marshaller is able to deal with generic types (CoreCLR doesn't today, but Mono does) and if it can treat them as blittable (assuming the equivalent non-generic struct would be blittable), but until that happens, it is actually a fairly important distinction to have.

@jkoritzinsky
Copy link
Member

There is an important note relating to fixed-buffers and interop: Up until a few months ago, very little effort was put into enabling good interop scenarios for fixed-buffers. As a result (for compat reasons), we don't actually correctly support non-blittable fixed buffers, and likely never will for compat reasons (see dotnet/coreclr#20194, dotnet/coreclr#20375, dotnet/coreclr#20558, dotnet/coreclr#20575).

@tannergooding
Copy link
Member Author

@jkoritzinsky, wouldn't this new attribute allow it without breaking back-compat? That is, given that it requires explicit VM support and explicitly lists both the type and element count, it should always be marshallable correctly?

@tannergooding
Copy link
Member Author

(unlike the existing attributes, which really relies on heuristics)

@jkoritzinsky
Copy link
Member

If Roslyn were to continue marking char and bool fixed buffers in the old manner then we could easily distinguish between legacy non-blittable fixed buffers and the new fixed buffers, then we wouldn't have that problem.

The problem is that some people may have been incorrectly using Marshal.SizeOf<T> for types containing fixed char buffers. With the current behavior, it would be "correct" in the sense that it would be at least the value of sizeof(). However, if we were to allow char fixed buffers to be accurately calculated, then the value returned by Marshal.SizeOf<T> would (correctly) return a smaller value for ANSI fixed char buffers. But for people who currently use Marshal.SizeOf<T> instead of sizeof() for structures containing ANSI fixed char buffers, they might get a buffer overrun if they allocate space based off the Marshal.SizeOf<T>() route (since it doesn't require unsafe).

If Roslyn were to only mark non-primitive fixed buffers with the new attribute, we could more easily detect it and accurately account for it. However, it would be more difficult to correctly detect a char fixed buffer and fall back to the "legacy" behavior with the new system.

@jkoritzinsky
Copy link
Member

@jkotas can chime in on the back-compat risk of non-blittable char fixed buffers.

@jaredpar
Copy link
Member

In terms of compat. I don't see any change to how we generate existing fixed sized buffers. That is when the fixed is used in an unsafe context then we'd do the same thing. But if the unsafe is removed I'd prefer to generate everything using a new strategy. It would be odd to special case bool and char also not really possible as we'd like to allow type parameters here too which could later be instantiated as those types

@jkoritzinsky
Copy link
Member

If unsafe is removed then we're in new territory I think (there are no C# programs with "non-unsafe" fixed buffers). In that case, we don't need to be bound to back-compat IMO and we can have full correctness.

@jkotas
Copy link
Member

jkotas commented Mar 22, 2019

I wonder if there is a good/clever way to workaround this limitation.

Simple way is "InlineArrayAttribute" on field that repeats the given field N times in the layout, e.g.:

[AttributeUsage(AttributeTargets.Struct, AllowMultiple = false, Inherited = false)]
public sealed class InlineArrayAttribute : Attribute
{
    public InlineArrayAttribute (int length)
    {
        Length = length;
    }

    public int Length { get; }
}

With this attribute public struct DXGI_GAMMA_CONTROL { ...; public DXGI_RGB[1025] data; } would be get compiled into public struct DXGI_GAMMA_CONTROL { ...; [InlineArrayAttribute(1025)] public DXGI_RGB data; }.

A more complex and ambitious option would be to allow const generic arguments: dotnet/csharplang#749 .

here to just return a Type object that represents a specific pointer type or general pointers?

@jaredpar I am not sure what the question is. Here are a few related comments from the Unmanaged constructed types discussion: dotnet/csharplang#1744 (comment) dotnet/csharplang#1744 (comment)

non-blittable

Non-blittable structs are a pit of failure in interop. I do not think we would want to do anything for non-blittable types here (maybe just explicitly throw for them in the runtime to make it clear that they are not expected to work).

we'd end up tying this feature to MemoryMarshal.CreateSpan which would limit its use down level

None of this runtime support would work down level, and our stated strategy is pick the right designs for new features and not limit them by what works downlevel.

@VSadov
Copy link
Member

VSadov commented Mar 22, 2019

Just wondering - could InlineArrayAttribute work with any element type? (managed, generic, ..)
I am asking because it could be a good emit strategy for struct arrays as well.

Yeah. const generic arguments would be a much more general solution to these problems. :-) That would be a massive change though.

@jkotas
Copy link
Member

jkotas commented Mar 22, 2019

could InlineArrayAttribute work with any element type? (managed, generic, ..)

Yes

@VSadov
Copy link
Member

VSadov commented Mar 22, 2019

could InlineArrayAttribute work with any element type? (managed, generic, ..)

Yes

I immediately like the idea.

@tannergooding
Copy link
Member Author

As do I 👍

@jaredpar
Copy link
Member

@jkotas

I am not sure what the question is. Here are a few related comments from the Unmanaged constructed types discussion

Up until this point I hadn't really thought of Type instances that represent pointers. Didn't know if there was a distinction between typeof(void*) and typeof(int*). That was the crux of my question.

@jkotas
Copy link
Member

jkotas commented Mar 25, 2019

Type instances representing pointers are specific, e.g. Console.WriteLine(typeof(int*) == typeof(void*)); prints false.

@VSadov
Copy link
Member

VSadov commented Mar 25, 2019

void* and int* and even void*** are real "storage types".
One can have a field or a local of such type, can overload a method on a difference, can construct System.Type that represent one of these.

The confusion typically arises because for the purpose of computations (as opposed to storage) all pointer values are treated as native unsigned int, which also makes them easily convertible to each other.

They are real types though from the metadata and reflection point of view.

There are some limitations - cannot box, cannot instantiate a generic type, etc.. , but those are just special cases, not because these are not real types.

@davidwrighton
Copy link
Member

I'm not actually convinced that constant integer arguments would be a massive change at the runtime level. For instance, you could do something like this...

interface INumber<T> { T GetValue(); }
struct _0 : INumber<uint> { uint GetValue() <= 0; }
struct _0<T> where T:INumber<uint> : INumber<uint> { uint GetValue() { T t = default(T); t.GetValue() * 10 + 0; }
struct _1 : INumber<uint> { uint GetValue() <= 1; }
struct _1<T> where T:INumber<uint> : INumber<uint> { uint GetValue() { T t = default(T); t.GetValue() * 10 + 1; }
struct _2 : INumber<uint> { uint GetValue() <= 3; }
struct _2<T> where T:INumber<uint> : INumber<uint> { uint GetValue() { T t = default(T); t.GetValue() * 10 + 2; }
// And so on, for all the other decimal values


// This struct has special implementation in the type loader, in that it requires N to be one of the types defined above, and that it makes itself bit enough.
// Also its interop treatment would be special. Something like, if T is blittable, then InlineArray<T, Anything> is blittable, otherwise, its not marshalable, or somethign to that effect.
struct InlineArray<T, N> where N:INumber<uint>
{
    private T _t;

    // Access via api something like this, appropriately annotated so that the C# compiler can do something safe.
    public Span<T> AsSpan() { N num = default(N); MemberyHelpers.CreateSpan(ref _t, num.GetValue()); }
}

// We could then encode an array of size 21 by something like this
struct WithLargeInlineArray
{
    InlineArray<int, _1<_2>> _field;
}
// I don't think this is a particularly pretty encoding, but it feels hideable by C# should we want it.

This idea wouldn't allow arrays of pointers, and array's of ref structs would continue to be impossible, but it would generally work for a lot of other stuff. I actually think this would be simpler in cost to implement in the runtime as we could put all the special casing into the load of the InlineArray type instead of making the InlineArrayAttribute have to work in all the places we do field layout, field handling, interop, etc. Of course, its a much bigger conceptual idea than just an inline array, so that may be a good reason not to do it.

@tannergooding
Copy link
Member Author

This wouldn't quite work for pointer types, since you can't have typeof(T*), but maybe that could be resolved via an additional property: IsPointerType?

Seems this works against 2.9.0 and current master bits for both typeof(T*) (when T is properly constrained) and things like typeof(int*) or typeof(MyStruct*) (when MyStruct is unmanaged): https://sharplab.io/#v2:D4AQDABCCMDcCwAoEBmKAmCBhAPAFQD4IB3ACwFMAnciPCALggFcA7AWwEMWOBzcgEwgBvJBDFQ0IACwQAsgAoAlMNHi1MAJzyALgE8ADuQD2AM3l4AVIsUJEagL5J7QA===

Not sure what lead me to believe it wasn't working. Possibly something misconfigured when I was testing things out.

@tannergooding
Copy link
Member Author

Ah, I think I realize what was wrong.

while you can have typeof(int*) and you can have a T*, you can't have Span<int*>; which means you can't have fixed ID3D12Resource* renderTargets[2]; that means you can't use the Span<T> trick above to deal with this in the interim and the compiler wouldn't be able to do it either (as it looks to be illegal in IL).

@VSadov
Copy link
Member

VSadov commented Apr 18, 2019

FWIW here is the rough implementation of ValueArray in the prototype.

For the simplicity in the prototype the branching factor is 4.
That is just an arbitrary number. Actual implementation would probably use 8 or 16.

namespace System
{
    public interface IValueArray<T>
    {
        int Length { get; }
    }

#pragma warning disable 169 //not writing to fields, ever.

    public struct ValueArray1<T> : IValueArray<T>
    {
        // compiler uses this when read/write LValue is needed
        public static ref T ItemRef(ref ValueArray1<T> array, int index)
            => ref ValueArrayHelpers.ItemRefImpl<T, ValueArray1<T>>(ref array, index);

        // compiler uses this when readable LValue is needed
        public static ref readonly T ItemRefReadonly(in ValueArray1<T> array, int index)
            => ref ValueArrayHelpers.ItemRefReadonlyImpl<T, ValueArray1<T>>(in array, index);

        public int Length => 1;

        // data
        private T item0;

        public T this[int i]
        {
            get
            {
                return ItemRef(ref this, i);
            }
            set
            {
                ItemRef(ref this, i) = value;
            }
        }
    }

    public struct ValueArray2<T> : IValueArray<T>
    {
        // compiler uses this when read/write LValue is needed
        public static ref T ItemRef(ref ValueArray2<T> array, int index)
            => ref ValueArrayHelpers.ItemRefImpl<T, ValueArray2<T>>(ref array, index);

        // compiler uses this when readable LValue is needed
        public static ref readonly T ItemRefReadonly(in ValueArray2<T> array, int index)
            => ref ValueArrayHelpers.ItemRefReadonlyImpl<T, ValueArray2<T>>(in array, index);

        public int Length => 2;

        // data
        private T item0;
        private T item1;

        public T this[int i]
        {
            get
            {
                return ItemRef(ref this, i);
            }
            set
            {
                ItemRef(ref this, i) = value;
            }
        }
    }

    public struct ValueArray3<T> : IValueArray<T>
    {
        // compiler uses this when read/write LValue is needed
        public static ref T ItemRef(ref ValueArray3<T> array, int index)
            => ref ValueArrayHelpers.ItemRefImpl<T, ValueArray3<T>>(ref array, index);

        // compiler uses this when readable LValue is needed
        public static ref readonly T ItemRefReadonly(in ValueArray3<T> array, int index)
           => ref ValueArrayHelpers.ItemRefReadonlyImpl<T, ValueArray3<T>>(in array, index);

        public int Length => 3;

        // data
        private T item0;
        private T item1;
        private T item2;

        public T this[int i]
        {
            get
            {
                return ItemRef(ref this, i);
            }
            set
            {
                ItemRef(ref this, i) = value;
            }
        }
    }

    public struct ValueArrayN<T, T1, T2, T3, T4> : IValueArray<T>
        where T1 : IValueArray<T>
        where T2 : IValueArray<T>
        where T3 : IValueArray<T>
        where T4 : IValueArray<T>
    {
        // compiler uses this when read/write LValue is needed
        public static ref T ItemRef(ref ValueArrayN<T, T1, T2, T3, T4> array, int index)
            => ref ValueArrayHelpers.ItemRefImpl<T, ValueArrayN<T, T1, T2, T3, T4>>(ref array, index);

        // compiler uses this when readable LValue is needed
        public static ref readonly T ItemRefReadonly(in ValueArrayN<T, T1, T2, T3, T4> array, int index)
            => ref ValueArrayHelpers.ItemRefReadonlyImpl<T, ValueArrayN<T, T1, T2, T3, T4>>(in array, index);

        private static int _length = CheckAndComputeLength();

        private static int CheckAndComputeLength()
        {
            Check<T1>();
            Check<T2>();
            Check<T3>();
            Check<T4>();

            return default(T1).Length + default(T2).Length + default(T3).Length + default(T4).Length;
        }

        private static void Check<U>()
        {
            if (typeof(U) != typeof(ValueArray1<T>) &&
                typeof(U) != typeof(ValueArray2<T>) &&
                typeof(U) != typeof(ValueArray3<T>) &&
                typeof(U).GetGenericTypeDefinition() != typeof(ValueArrayN<,,,,>))
            {
                throw new InvalidOperationException("unexpcted instantiation");
            }
        }

        public int Length => _length;

        // Actual data is stored here
        private T1 items0;
        private T2 items1;
        private T3 items2;
        private T4 items3;

        public T this[int i]
        {
            get
            {
                return ItemRef(ref this, i);
            }
            set
            {
                ItemRef(ref this, i) = value;
            }
        }
    }

    // range-checked indexing via ref math
    internal static class ValueArrayHelpers
    {
        internal static ref TElement ItemRefImpl<TElement, TArray>(ref TArray array, int index)
            where TArray : struct, IValueArray<TElement>
        {
            if ((uint)index >= (uint)array.Length) throw new IndexOutOfRangeException();

            ref TElement firstElement = ref System.Runtime.CompilerServices.Unsafe.As<TArray, TElement>(ref array);
            return ref System.Runtime.CompilerServices.Unsafe.Add(ref firstElement, index);
        }

        internal static ref readonly TElement ItemRefReadonlyImpl<TElement, TArray>(in TArray array, int index)
            where TArray : struct, IValueArray<TElement>
        {
            return ref ItemRefImpl<TElement, TArray>(ref System.Runtime.CompilerServices.Unsafe.AsRef(in array), index);
        }
    }
}

@VSadov
Copy link
Member

VSadov commented Apr 18, 2019

Here are some examples of how compiler would translate things like int[10] x = default; x[5] = 123 to use ValueArray as defined above.

The tests use a simpler version of ValueArray with some helpers and validation missing since that is not necessary when testing compiler.

https://github.com/VSadov/roslyn/blob/0763838ece22cb86d3ac09df948fe2182bf4dbe3/src/Compilers/CSharp/Test/Emit/CodeGen/FixedSizeBufferTests.cs#L732

@VSadov
Copy link
Member

VSadov commented Apr 18, 2019

There is some conceptual similarity with @davidwrighton suggestion - map arrays to some generic nested structs.

The main difference is whether to have the contained data as actual fields and thus having VA_1, VA_2,.., VA_N types. vs
having magic INumber type arguments that have effect of "repeat N times" - So you have just one VA type and nest the type arguments.

@tannergooding
Copy link
Member Author

I think the primary problem with those approaches is that it doesn't work with any T and looks like it would add a significant amount of metadata bloat; even if that could ultimately be normalized in some ways.

I think the attribute based approach (like the one @jkotas or I suggested) might be overall better since it should work with any T (even ref structs and pointer types). There may be some limitations around certain types in that they wouldn't work with Span<T> (ref structs and pointer types), but I think the compiler could ultimately make those work without having two completely separate support mechanisms...

@VSadov
Copy link
Member

VSadov commented Apr 18, 2019

InlineArrayAttribute(N) on a field with a meaning "repeat N times" is certainly more versatile.

The approaches just try to varying degrees reduce the changes to the runtime and when possible reuse the mechanisms of type construction/equality that what we already have - i.e. generics.

@davidwrighton
Copy link
Member

One thing to be aware of is that checking for the presence of an attribute on a field is actually a very expensive operation to do during type loading. For instance, checking for the presence of the ThreadStaticAttribute takes about 0.2% of the startup time of applications, and an attribute that had some effect on all field types (not just statics) would have a larger impact. It might reach as high as 0.5% of startup time sacrificed for this feature. If we can place the information about the size of the field into a type, then we won't need to do anywhere near as much checking, which can reduce the runtime costs when the feature isn't in use by a very large amount.

@lostmsu
Copy link

lostmsu commented Apr 27, 2019

@VSadov I actually started implementing something similar a while ago: https://github.com/losttech/TypeNum . How would you compare that to your approach? Any plan to publish Nuget package?

@jkotas
Copy link
Member

jkotas commented Nov 10, 2021

Duplicate of #61135

@jkotas jkotas marked this as a duplicate of #61135 Nov 10, 2021
@jkotas jkotas closed this as completed Nov 10, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Dec 10, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-TypeSystem-coreclr Bottom Up Work Not part of a theme, epic, or user story
Projects
None yet
Development

No branches or pull requests

8 participants