Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Updating the JIT to support marshaling blittable generics. #23899

Closed
wants to merge 5 commits into from
Closed

Updating the JIT to support marshaling blittable generics. #23899

wants to merge 5 commits into from

Conversation

tannergooding
Copy link
Member

@tannergooding
Copy link
Member Author

tannergooding commented Apr 11, 2019

Still TODO:

  • Add tests covering array marshaling
  • Add tests covering Layout Class marshaling
  • Add tests covering COM interop
  • Add tests covering Vector64<T> marshaling
  • Add tests covering Vector128<T> marshaling
  • Add tests covering Vector256<T> marshaling
  • Add tests covering marshaling generic pointers
  • Add tests covering marshaling ref/out blittable generics
  • Add negative tests covering non-blittable generics
  • Add negative tests explicitly covering bool and char
  • Add negative tests covering Nullable<T>
  • Add negative tests covering Vector<T>

@tannergooding
Copy link
Member Author

Responded to feedback so far; added tests covering arrays, points, ref, and out.

I'm going to be camping until Monday, but I'll see how far I can get this tonight.

@jkoritzinsky
Copy link
Member

Can you make sure to add a test for a non-generic struct containing a field that has a type of an instantiated generic struct?

@tannergooding
Copy link
Member Author

CC. @CarolEidt. You might be interested in this from the perspective of the vector ABI support

@tannergooding
Copy link
Member Author

For the Vector64<T>, Vector128<T>, and Vector256<T> code; we will need tests disabled on some platforms until the JIT ABI can be fixed. For example, the JIT for Windows x64 is not currently returning __m128 in register; but it should be.

Will also need to figure out how to handle cases where a given platform doesn't quite support a given type. For example, ARM does not support Vector128<double>, but ARM64 does. This doesn't really matter from an ABI perspective (since all vectors are passed/returned the same), so it might be okay to skip one of the T for a given platform, provided we have a few other samples that do work...

@tannergooding tannergooding mentioned this pull request Apr 14, 2019
@jakobbotsch
Copy link
Member

You can gain some coverage for marshalling Vector128<T> and Vector256<T> by removing the following array in ABI stress:

// We cannot marshal generic types so we cannot just use all types for pinvokees.
// This can be relaxed once https://github.com/dotnet/coreclr/pull/23899 is merged.
private static readonly TypeEx[] s_pinvokeeCandidateArgTypes =
new[]
{
typeof(byte), typeof(short), typeof(int), typeof(long),
typeof(float), typeof(double),
typeof(S1P), typeof(S2P), typeof(S2U), typeof(S3U),
typeof(S4P), typeof(S4U), typeof(S5U), typeof(S6U),
typeof(S7U), typeof(S8P), typeof(S8U), typeof(S9U),
typeof(S10U), typeof(S11U), typeof(S12U), typeof(S13U),
typeof(S14U), typeof(S15U), typeof(S16U), typeof(S17U),
typeof(S31U), typeof(S32U),
typeof(Hfa1), typeof(Hfa2),
}.Select(t => new TypeEx(t)).ToArray();

Adding Vector64<T> would be fine too.

@tannergooding
Copy link
Member Author

Rebased onto current head.

@jkoritzinsky
Copy link
Member

You might need to add an explicit block for Span<T> and ReadOnlySpan<T>. This change likely will make them seem blittable (which they shouldn't be).

@tannergooding
Copy link
Member Author

You might need to add an explicit block for Span and ReadOnlySpan

I'm guessing this is because ByReference<T> only contains an IntPtr internally, correct?

@jkoritzinsky
Copy link
Member

Yep that's why. We actually probably just want to prohibit ByReference<T> since that would handle both cases.

@jkotas
Copy link
Member

jkotas commented Oct 4, 2019

For example, the JIT for Windows x64 is not currently returning __m128 in register; but it should be.

We should keep blocking Vector128 and similar cases for interop until this is fixed.

@tannergooding
Copy link
Member Author

We should keep blocking Vector128 and similar cases for interop until this is fixed.

I'll update to explicitly block these and ByReference<T> for the time being

@jkoritzinsky
Copy link
Member

Can you also ensure that Nullable<T> is still blocked outside of WinRT scenarios so that we aren't locked into the default behavior in case we decide to do something special with the type?

@maryamariyan
Copy link
Member

Thank you for your contribution. As announced in dotnet/coreclr#27549 this repository will be moving to dotnet/runtime on November 13. If you would like to continue working on this PR after this date, the easiest way to move the change to dotnet/runtime is:

  1. In your coreclr repository clone, create patch by running git format-patch origin
  2. In your runtime repository clone, apply the patch by running git apply --directory src/coreclr <path to the patch created in step 1>

@AaronRobinsonMSFT
Copy link
Member

@tannergooding What are you thoughts for this PR with respect to repo consolidation? I am going to assume you would like this in prior to that happening. Please confirm so @jkoritzinsky and myself can prioritize reviewing this PR.

@tannergooding
Copy link
Member Author

@AaronRobinsonMSFT, I'm not particularly concerned either way. I just had some time to finally add some of the tests that were missing.

If I happen to get all needed test coverage and CI passing, then I'm fine with it being reviewed and merged before the consolidation.

Otherwise, I'm also perfectly comfortable shuffling these changes over into a new PR 😄

@AaronRobinsonMSFT
Copy link
Member

I looked at the product changes and they seem reasonable with no concerns from me. I want to peek a bit more at some of the tests themselves, but otherwise if the CI is green and coverage is what has been agreed upon I am signed off. Thanks for driving this work - much appreciated!

@tannergooding
Copy link
Member Author

coverage is what has been agreed upon I am signed off

I still need tests validating Nullable<T> is blocked for non-WinRT scenarios (and I believe that will require a product change as well).

Ideally, I would also have tests covering (validating they don't work) generic COM types, generic classes with LayoutKind.Sequential, and explicit tests covering MyStruct<ReferenceType>.

Right now I'm trying to validate I have the #includes right for Vector64/128/256 on ARM/ARM64. Namely the headers for Windows vs Unix differed for ARM64 (just fixed) and I believe clang is complaining about NEON not being enabled for ARM32.

@tannergooding tannergooding added the post-consolidation PRs which will be hand ported to dotnet/runtime label Nov 7, 2019
@tannergooding
Copy link
Member Author

tannergooding commented Nov 7, 2019

It looks like for ARM32 we target -mfpu=vfpv3 (https://github.com/dotnet/coreclr/blob/master/configurecompiler.cmake#L542) and even support softfp as a fallback.

arm_neon.h therefore can't be included as this needs to be -mfpu=neon instead. Is there any particular reason we target armv7-a but don't also target neon? It was my understanding that, outside of some edge scenarios, armv7a implied neon support (maybe @TamarChristinaArm could comment on this)...

Edit: Looks like GCC documents -mfpu=neon to be an alias for -mfpu=neon-vfpv3 (https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html). LLVM doesn't look to explicitly document these switches so I would guess they are part of the switches that are compatible with GCC

@tannergooding
Copy link
Member Author

tannergooding commented Nov 8, 2019

@CarolEidt, were you planning on having Vector<T> an ABI primitive (like Vector128<T>/Vector256<T>) or a user-defined struct (like Vector2/Vector3/Vector4)?

Just want to make sure I add the appropriate tests here and disallow it for the time being if necessary 😄

@jkotas
Copy link
Member

jkotas commented Nov 8, 2019

Vector<T> should be prohibited from interop. There is no way to use it for interop in reasonable way since it has unpredictable size.

@tannergooding
Copy link
Member Author

@jkotas, what would be the appropriate way to handle Vector<T>*.

That is, as of C# 8 since Vector<T>* is now legal code (and prior to that in IL), so public static extern void Method(Vector<int>* pValue) is currently valid (even without this change).

Should that just be allowed and people are on their own or should we poison even pointers to Vector<T> from being marshalled (this would be a technically breaking change, but I doubt anyone would be broken in practice)?

In a somewhat similar vein, Vector128<T> is "valid" even for unsupported T. For example Vector128<bool> or even Vector128<Vector256<int>> is "fine". This is true even in managed code where = default and = new Vector128<T>() works, but all instance methods will throw due to an unsupported T.

@jkotas
Copy link
Member

jkotas commented Nov 8, 2019

we poison even pointers to Vector from being marshalled

Unmanaged pointers to anything are fine with me.

Do we prohibit any unmanaged pointers in interop today? I would not mind removing any restrictions if there are any.

@tannergooding
Copy link
Member Author

We prohibit bool* and pointers to anything containing a bool. It gives Pointers cannot reference marshaled structures. Use ByRef instead.'

image

There may be others, but that is the one that immediately pops to mind.

@jkoritzinsky
Copy link
Member

We prohibit unmanaged pointers to non-blittable types in parameters and return values.

We've never been able to do so to structs because of the fact that the checking happened in the type loader and we'd end up with a seemingly cyclical type. For example, in the type below:

unsafe struct D
{
     D* ptr;
}

We couldn't validate if D was blittable since we were still loading D.

In the parameter and return value case, the types are fully loaded so we can do the validation.

If we want to allow marshalling unmanaged pointers to nonblittable "unmanaged" types as parameters and return values, I'd be ok with removing that restriction.

@CarolEidt
Copy link

On x86 we can always pass both Vector128 and Vector64 correctly even if there are no intrinsics defined/available on those types.

Is that true even when FEATURE_SIMD is disabled? If so, that's great and helps things 😄

It is not currently true, but I think that we should make it so. What I meant by "We can always ..." was "We should be able to always ..."

However, it's worth noting we can't currently pass Vector64 correctly. It technically corresponds to __m64 and by value returns (on 32-bit Windows at least) needs to be passed in mm0. We have no support for in the JIT today for these MMX registers as MMX is considered legacy and is effectively deprecated.

Right - I'd forgotten that the MMX registers are aliased to the FP stack registers :-( Although we could theoretically support that (we currently have to return float values on the stack for x86, it probably isn't worth it.

@tannergooding
Copy link
Member Author

tannergooding commented Nov 9, 2019

There looks to be some kind of bug in Unsafe.AddByteOffset and Unsafe.Add:
image

Still digging in.

@tannergooding
Copy link
Member Author

tannergooding commented Nov 9, 2019

I forgot to clear the MMX state for the Vector64 native code (which uses __m64), tests were passing locally after I fixed that.

@tannergooding
Copy link
Member Author

tannergooding commented Nov 10, 2019

Ok, I should just need to add explicit tests covering (and ensuring they don't work) classes with LayoutKind.Sequential and generic COM types now and then all requested scenarios should be covered.

I'm going to log an issue for enabling unmanaged pointers to non-blittable types (#23899 (comment)) and follow up with it in a subsequent PR.

@tannergooding tannergooding marked this pull request as ready for review November 10, 2019 21:29
@tannergooding
Copy link
Member Author

CC. @AaronRobinsonMSFT, @jkoritzinsky, @jkotas.

I believe this should be ready for review now. I've done the following:

@tannergooding
Copy link
Member Author

The vast majority of changes here are tests covering the same set of scenarios:

  • ref/out/in/pointer, by value, and array parameters
  • ref/ref readonly returns

The types covered by the tests then touch:

  • float/double, uint/long, and bool/char

These validate that various important ABI conventions are being correctly handled (such as HFA/HVA, two field structs that can be passed in register, etc).

|| m_pMT->HasSameTypeDefAs(MscorlibBinder::GetClass(CLASS__VECTOR64T))
|| m_pMT->HasSameTypeDefAs(MscorlibBinder::GetClass(CLASS__VECTOR128T))
|| m_pMT->HasSameTypeDefAs(MscorlibBinder::GetClass(CLASS__VECTOR256T))
#ifndef CROSSGEN_COMPILE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this ifndef necessary? We try to avoid logic differences like this for crossgen.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vector<T> is blocked from even being type loaded during crossgen (due to its variable size): https://github.com/dotnet/coreclr/blob/master/src/vm/methodtablebuilder.cpp#L1182

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, make sense. I have not realized that this is to avoid failing here 100% of time during crossgen. Comment may be nice

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a matching change to be done for this in crossgen2?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure. Looking at crossgen2, I don't see any specific handling of Vector<T> and disallowing it as a type.

I did find a section for the FieldLayoutAlgorithm: https://github.com/dotnet/coreclr/blob/master/src/tools/crossgen2/ILCompiler.ReadyToRun/Compiler/ReadyToRunCompilerContext.cs#L50
However, that field layout only supports Vector64/128/256<T>: https://github.com/dotnet/coreclr/blob/master/src/tools/crossgen2/Common/Compiler/VectorFieldLayoutAlgorithm.cs

The only logic I can find for Vector<T> is in the SystemVStructClassificator where it doesn't treat it as an EightBytes struct: https://github.com/dotnet/coreclr/blob/master/src/tools/crossgen2/Common/JitInterface/SystemVStructClassificator.cs#L273

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the easiest way to test Crossgen2 on an specific test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed up a commit which I believe handles this, but it's not been tested (and doesn't look to be covered by CI currently).

@TamarChristinaArm
Copy link

Please correct me if I'm wrong, but my understanding of SVE is that there are a number of restrictions on them including treating them as incomplete types in higher level languages (and not allowing sizeof or use in arrays, etc).

That's correct, the SVE ACLE types we have are incomplete, sizeless and definite.

Additionally, the size can be changed at execution time by modifying ZCR_ELx.LEN

Not by anything user-mode. While technically the kernel is allowed to change it, the expectation is that it won't do this after the process has started. You can have processes with different VLs on the same system though.

However, the reciever/return value is always going to be the "sizeless" type and the instructions themsleves are always the same, regardless of the "actual size"; is that correct?

Yeah, "sizeless" of a specific type. So the way I saw it was that Vector<T> would represent all the SVE sizeless types, e.g. svint8_t would be Vector<Int8>. As in, you don't know how large the vector is but you do know it's element size.

SVE also allows compiling code for a specific VL, in which case in ACLE you are then allowed to cast between the incomplete and a known complete type.

I believe that differs from x86 (and therefore Vector, it being cross-platform) in that on x86 you have different instructions, different types, and even different registers (even if some are subsets of others) for Vector128 vs Vector256.

Ah, wait, I think I'm missing something here.. Is Vector<T> not an actual type but a name for the "grouping" of all Vector*<T> types? But yes for SVE you wouldn't have different function or registers for different VLs.

If/when we expose SVE, I would imagine it would need to be an ARM specific type (under S.R.I.Arm) rather than a cross-platform type (under S.R.Intrinsics, like Vector64, Vector128, and Vector256) and there would need to be very specialized handling to ensure it always goes through the appropriate instructions for access, etc.

Hmm perhaps... that said there are other ISAs other than SVE which have the same understanding as VL agnostic types. So I wouldn't necessarily say this type would need to be SVE specific.

@tannergooding
Copy link
Member Author

@TamarChristinaArm, I'ved forked the discussion off to: https://github.com/dotnet/coreclr/issues/27814

@tannergooding
Copy link
Member Author

Closing as this has been ported to dotnet/runtime repo.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-Interop post-consolidation PRs which will be hand ported to dotnet/runtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Marshal should be able to handle generic types
8 participants