[API Proposal]: Add AVX10v2 API to add Avx10.2 support #109083
Comments
Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics |
The following instructions, which are part of Avx10.2, are not mentioned above. They fall mostly into two groups: 16-bit floating point and FMA instructions.
|
Haven't finished going through the list, but as initial feedback:
|
Thank you. I will leave you a comment when I have made all required changes. |
@tannergooding Thanks for the review. About the nomenclature for These instructions convert four, eight or sixteen packed single-precision floating-point values in the Let me know what you think. |
I'll need to think about it more. It is important we document the behavior which is conversion to It's functionally doing a |
True. I was thinking on similar lines |
I have updated the names and made the other changes. For the and |
Hi Tanner - have you decided on how you want the 'widen' APIs to be named? |
I think we should default to the verbose name, which is the most consistent with our other APIs and the least problematic. We'll likely discuss some of the alternatives in API review and it wouldn't hurt to have them listed. Notably we have |
I've updated the names for these. Do you think it makes sense to have 'Widen' in the name for the accumulated dot product ones as well? |
@tannergooding What are the next steps for this? |
I've filtered out the
namespace System.Runtime.Intrinsics.X86
{
/// <summary>Provides access to X86 AVX10.2 hardware instructions via intrinsics</summary>
[Intrinsic]
[CLSCompliant(false)]
public abstract class Avx10v2 : Avx10v1
{
// VPDPBSSD xmm1{k1}{z}, xmm2, xmm3/m128/m32bcst
public static Vector128<int> AccumulatedByteDotProduct(Vector128<sbyte> left, Vector128<sbyte> right) => AccumulatedByteDotProduct(left, right, acc);
// VPDPBSUD xmm1{k1}{z}, xmm2, xmm3/m128/m32bcst
public static Vector128<int> AccumulatedByteDotProduct(Vector128<sbyte> left, Vector128<byte> right) => AccumulatedByteDotProduct(left, right, acc);
// VPDPBUUD xmm1{k1}{z}, xmm2, xmm3/m128/m32bcst
public static Vector128<int> AccumulatedByteDotProduct(Vector128<byte> left, Vector128<byte> right) => AccumulatedByteDotProduct(left, right, acc);
// VPDPBSSD ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst
public static Vector256<int> AccumulatedByteDotProduct(Vector256<sbyte> left, Vector256<sbyte> right) => AccumulatedByteDotProduct(left, right, acc);
// VPDPBSUD ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst
public static Vector256<int> AccumulatedByteDotProduct(Vector256<sbyte> left, Vector256<byte> right) => AccumulatedByteDotProduct(left, right, acc);
// VPDPBUUD ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst
public static Vector256<int> AccumulatedByteDotProduct(Vector256<byte> left, Vector256<byte> right) => AccumulatedByteDotProduct(left, right, acc);
// VPDPBSSDS xmm1{k1}{z}, xmm2, xmm3/m128/m32bcst
public static Vector128<int> AccumulatedByteDotProductWithSaturation(Vector128<sbyte> left, Vector128<sbyte> right) => AccumulatedByteDotProductWithSaturation(left, right, acc);
// VPDPBSUDS xmm1{k1}{z}, xmm2, xmm3/m128/m32bcst
public static Vector128<int> AccumulatedByteDotProductWithSaturation(Vector128<sbyte> left, Vector128<byte> right) => AccumulatedByteDotProductWithSaturation(left, right, acc);
// VPDPBUUDS xmm1{k1}{z}, xmm2, xmm3/m128/m32bcst
public static Vector128<int> AccumulatedByteDotProductWithSaturation(Vector128<byte> left, Vector128<byte> right) => AccumulatedByteDotProductWithSaturation(left, right, acc);
// VPDPBSSDS ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst
public static Vector256<int> AccumulatedByteDotProductWithSaturation(Vector256<sbyte> left, Vector256<sbyte> right) => AccumulatedByteDotProductWithSaturation(left, right, acc);
// VPDPBSUDS ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst
public static Vector256<int> AccumulatedByteDotProductWithSaturation(Vector256<sbyte> left, Vector256<byte> right) => AccumulatedByteDotProductWithSaturation(left, right, acc);
// VPDPBUUDS ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst
public static Vector256<int> AccumulatedByteDotProductWithSaturation(Vector256<byte> left, Vector256<byte> right) => AccumulatedByteDotProductWithSaturation(left, right, acc);
// VPDPWSUD xmm1{k1}{z}, xmm2, xmm3/m128/m32bcst
public static Vector128<int> AccumulatedInt16DotProduct(Vector128<short> left, Vector128<ushort> right) => AccumulatedInt16DotProduct(left, right, acc);
// VPDPWUSD xmm1{k1}{z}, xmm2, xmm3/m128/m32bcst
public static Vector128<int> AccumulatedInt16DotProduct(Vector128<ushort> left, Vector128<short> right) => AccumulatedInt16DotProduct(left, right, acc);
// VPDPWUUD xmm1{k1}{z}, xmm2, xmm3/m128/m32bcst
public static Vector128<int> AccumulatedInt16DotProduct(Vector128<ushort> left, Vector128<ushort> right) => AccumulatedInt16DotProduct(left, right, acc);
// VPDPWSUD ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst
public static Vector256<int> AccumulatedInt16DotProduct(Vector256<short> left, Vector256<ushort> right) => AccumulatedInt16DotProduct(left, right, acc);
// VPDPWUSD ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst
public static Vector256<int> AccumulatedInt16DotProduct(Vector256<ushort> left, Vector256<short> right) => AccumulatedInt16DotProduct(left, right, acc);
// VPDPWUUD ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst
public static Vector256<int> AccumulatedInt16DotProduct(Vector256<ushort> left, Vector256<ushort> right) => AccumulatedInt16DotProduct(left, right, acc);
// VPDPWSUDS xmm1{k1}{z}, xmm2, xmm3/m128/m32bcst
public static Vector128<int> AccumulatedInt16DotProductWithSaturation(Vector128<short> left, Vector128<ushort> right) => AccumulatedInt16DotProductWithSaturation(left, right, acc);
// VPDPWUSDS xmm1{k1}{z}, xmm2, xmm3/m128/m32bcst
public static Vector128<int> AccumulatedInt16DotProductWithSaturation(Vector128<ushort> left, Vector128<short> right) => AccumulatedInt16DotProductWithSaturation(left, right, acc);
// VPDPWUUDS xmm1{k1}{z}, xmm2, xmm3/m128/m32bcst
public static Vector128<int> AccumulatedInt16DotProductWithSaturation(Vector128<ushort> left, Vector128<ushort> right) => AccumulatedInt16DotProductWithSaturation(left, right, acc);
// VPDPWSUDS ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst
public static Vector256<int> AccumulatedInt16DotProductWithSaturation(Vector256<short> left, Vector256<ushort> right) => AccumulatedInt16DotProductWithSaturation(left, right, acc);
// VPDPWUSDS ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst
public static Vector256<int> AccumulatedInt16DotProductWithSaturation(Vector256<ushort> left, Vector256<short> right) => AccumulatedInt16DotProductWithSaturation(left, right, acc);
// VPDPWUUDS ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst
public static Vector256<int> AccumulatedInt16DotProductWithSaturation(Vector256<ushort> left, Vector256<ushort> right) => AccumulatedInt16DotProductWithSaturation(left, right, acc);
[Intrinsic]
public abstract class V512 : Avx10v1.V512
{
// VPDPWSUD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> AccumulatedInt16DotProduct(Vector512<short> left, Vector512<ushort> right) => AccumulatedInt16DotProduct(left, right, acc);
// VPDPWUSD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> AccumulatedInt16DotProduct(Vector512<ushort> left, Vector512<short> right) => AccumulatedInt16DotProduct(left, right, acc);
// VPDPWUUD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> AccumulatedInt16DotProduct(Vector512<ushort> left, Vector512<ushort> right) => AccumulatedInt16DotProduct(left, right, acc);
// VPDPWSUDS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> AccumulatedInt16DotProductWithSaturation(Vector512<short> left, Vector512<ushort> right) => AccumulatedInt16DotProductWithSaturation(left, right, acc);
// VPDPWUSDS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> AccumulatedInt16DotProductWithSaturation(Vector512<ushort> left, Vector512<short> right) => AccumulatedInt16DotProductWithSaturation(left, right, acc);
// VPDPWUUDS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> AccumulatedInt16DotProductWithSaturation(Vector512<ushort> left, Vector512<ushort> right) => AccumulatedInt16DotProductWithSaturation(left, right, acc);
// VPDPBSSD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> AccumulatedByteDotProduct(Vector512<sbyte> left, Vector512<sbyte> right) => AccumulatedByteDotProduct(left, right, acc);
// VPDPBSUD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> AccumulatedByteDotProduct(Vector512<sbyte> left, Vector512<byte> right) => AccumulatedByteDotProduct(left, right, acc);
// VPDPBUUD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> AccumulatedByteDotProduct(Vector512<byte> left, Vector512<byte> right) => AccumulatedByteDotProduct(left, right, acc);
// VPDPBSSDS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> AccumulatedByteDotProductWithSaturation(Vector512<sbyte> left, Vector512<sbyte> right) => AccumulatedByteDotProductWithSaturation(left, right, acc);
// VPDPBSUDS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> AccumulatedByteDotProductWithSaturation(Vector512<sbyte> left, Vector512<byte> right) => AccumulatedByteDotProductWithSaturation(left, right, acc);
// VPDPBUUDS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
public static Vector512<int> AccumulatedByteDotProductWithSaturation(Vector512<byte> left, Vector512<byte> right) => AccumulatedByteDotProductWithSaturation(left, right, acc);
}
}
} |
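For readers unfamiliar with the dot-product-accumulate family listed above, here is a minimal scalar sketch of what a single 32-bit lane of `VPDPBSSD` and `VPDPBSSDS` computes. It is an illustration only, not the proposed API: the class name `DotProductSketch`, the method names, and the position of the accumulator argument are all hypothetical, and the listing above does not declare an accumulator parameter even though the stub bodies pass `acc`.

```csharp
using System;

// Hypothetical scalar model, for illustration only; not part of the proposal.
public static class DotProductSketch
{
    // One 32-bit lane of VPDPBSSD: multiply four signed-byte pairs,
    // sum the products, and add the sum to the accumulator.
    // The non-saturating form wraps on 32-bit overflow.
    public static int AccumulateLane(int acc, sbyte[] left, sbyte[] right)
    {
        if (left.Length != 4 || right.Length != 4)
            throw new ArgumentException("Each lane holds exactly four bytes.");

        int sum = acc;
        for (int i = 0; i < 4; i++)
        {
            sum = unchecked(sum + left[i] * right[i]);
        }
        return sum;
    }

    // One 32-bit lane of VPDPBSSDS: same computation, but the final
    // result is clamped to the signed 32-bit range instead of wrapping.
    public static int AccumulateLaneWithSaturation(int acc, sbyte[] left, sbyte[] right)
    {
        if (left.Length != 4 || right.Length != 4)
            throw new ArgumentException("Each lane holds exactly four bytes.");

        long sum = acc;
        for (int i = 0; i < 4; i++)
        {
            sum += (long)left[i] * right[i];
        }
        return (int)Math.Clamp(sum, int.MinValue, int.MaxValue);
    }
}
```

A `Vector128<int>` result holds four such lanes, `Vector256<int>` eight, and `Vector512<int>` sixteen. The mixed-sign and unsigned variants (`VPDPBSUD`, `VPDPBUUD`) differ only in how each byte operand is extended before the multiply.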
namespace System.Runtime.Intrinsics.X86
{
/// <summary>Provides access to X86 AVX10.2 hardware instructions via intrinsics</summary>
[Intrinsic]
[CLSCompliant(false)]
public abstract class Avx10v2 : Avx10v1
{
internal Avx10v2() { }
public static new bool IsSupported { get => IsSupported; }
// VMINMAXPD xmm1{k1}{z}, xmm2, xmm3/m128/m64bcst, imm8
public static Vector128<double> MinMax(Vector128<double> left, Vector128<double> right, [ConstantExpected] byte control) => MinMax(left, right, control);
// VMINMAXPD ymm1{k1}{z}, ymm2, ymm3/m256/m64bcst {sae}, imm8
public static Vector256<double> MinMax(Vector256<double> left, Vector256<double> right, [ConstantExpected] byte control) => MinMax(left, right, control);
// VMINMAXPS xmm1{k1}{z}, xmm2, xmm3/m128/m32bcst, imm8
public static Vector128<float> MinMax(Vector128<float> left, Vector128<float> right, [ConstantExpected] byte control) => MinMax(left, right, control);
// VMINMAXPS ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst {sae}, imm8
public static Vector256<float> MinMax(Vector256<float> left, Vector256<float> right, [ConstantExpected] byte control) => MinMax(left, right, control);
// VMINMAXSD xmm1{k1}{z}, xmm2, xmm3/m64 {sae}, imm8
public static Vector128<double> MinMaxScalar(Vector128<double> left, Vector128<double> right, [ConstantExpected] byte control) => MinMaxScalar(left, right, control);
// VMINMAXSS xmm1{k1}{z}, xmm2, xmm3/m32 {sae}, imm8
public static Vector128<float> MinMaxScalar(Vector128<float> left, Vector128<float> right, [ConstantExpected] byte control) => MinMaxScalar(left, right, control);
// VADDPD ymm1{k1}{z}, ymm2, ymm3/m256/m64bcst {er}
public static Vector256<double> Add(Vector256<double> left, Vector256<double> right, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => Add(left, right, mode);
// VADDPS ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst {er}
public static Vector256<float> Add(Vector256<float> left, Vector256<float> right, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => Add(left, right, mode);
// VDIVPD ymm1{k1}{z}, ymm2, ymm3/m256/m64bcst {er}
public static Vector256<double> Divide(Vector256<double> left, Vector256<double> right, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => Divide(left, right, mode);
// VDIVPS ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst {er}
public static Vector256<float> Divide(Vector256<float> left, Vector256<float> right, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => Divide(left, right, mode);
// VCVTPS2IBS xmm1{k1}{z}, xmm2/m128/m32bcst
public static Vector128<int> ConvertToByteWithSaturationAndWidenToInt32(Vector128<float> value) => ConvertToByteWithSaturationAndWidenToInt32(value);
// VCVTPS2IBS ymm1{k1}{z}, ymm2/m256/m32bcst {er}
public static Vector256<int> ConvertToByteWithSaturationAndWidenToInt32(Vector256<float> value) => ConvertToByteWithSaturationAndWidenToInt32(value);
// VCVTPS2IBS ymm1{k1}{z}, ymm2/m256/m32bcst {er}
public static Vector256<int> ConvertToByteWithSaturationAndWidenToInt32(Vector256<float> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => ConvertToByteWithSaturationAndWidenToInt32(value, mode);
// VCVTPS2IUBS xmm1{k1}{z}, xmm2/m128/m32bcst
public static Vector128<uint> ConvertToByteWithSaturationAndWidenToUInt32(Vector128<float> value) => ConvertToByteWithSaturationAndWidenToUInt32(value);
// VCVTPS2IUBS ymm1{k1}{z}, ymm2/m256/m32bcst {er}
public static Vector256<uint> ConvertToByteWithSaturationAndWidenToUInt32(Vector256<float> value) => ConvertToByteWithSaturationAndWidenToUInt32(value);
// VCVTPS2IUBS ymm1{k1}{z}, ymm2/m256/m32bcst {er}
public static Vector256<uint> ConvertToByteWithSaturationAndWidenToUInt32(Vector256<float> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => ConvertToByteWithSaturationAndWidenToUInt32(value, mode);
// VCVTTPS2IBS xmm1{k1}{z}, xmm2/m128/m32bcst
public static Vector128<int> ConvertToByteWithTruncatedSaturationAndWidenToInt32(Vector128<float> value) => ConvertToByteWithTruncatedSaturationAndWidenToInt32(value);
// VCVTTPS2IBS ymm1{k1}{z}, ymm2/m256/m32bcst {sae}
public static Vector256<int> ConvertToByteWithTruncatedSaturationAndWidenToInt32(Vector256<float> value) => ConvertToByteWithTruncatedSaturationAndWidenToInt32(value);
// VCVTTPS2IUBS xmm1{k1}{z}, xmm2/m128/m32bcst
public static Vector128<uint> ConvertToByteWithTruncatedSaturationAndWidenToUInt32(Vector128<float> value) => ConvertToByteWithTruncatedSaturationAndWidenToUInt32(value);
// VCVTTPS2IUBS ymm1{k1}{z}, ymm2/m256/m32bcst {sae}
public static Vector256<uint> ConvertToByteWithTruncatedSaturationAndWidenToUInt32(Vector256<float> value) => ConvertToByteWithTruncatedSaturationAndWidenToUInt32(value);
// VMOVD xmm1, xmm2/m32
public static Vector128<uint> ConvertScalarToVector128UInt32(Vector128<uint> value) => ConvertScalarToVector128UInt32(value);
// VMOVW xmm1, xmm2/m16
public static Vector128<ushort> ConvertScalarToVector128UInt16(Vector128<ushort> value) => ConvertScalarToVector128UInt16(value);
// The instructions below are those where
// embedded rounding support has been added
// to the existing API
// VCVTDQ2PS ymm1{k1}{z}, ymm2/m256/m32bcst {er}
public static Vector256<float> ConvertToVector256Single(Vector256<int> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => ConvertToVector256Single(value, mode);
// VCVTPD2DQ xmm1{k1}{z}, ymm2/m256/m64bcst {er}
public static Vector128<int> ConvertToVector128Int32(Vector256<double> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => ConvertToVector128Int32(value, mode);
// VCVTPD2PS xmm1{k1}{z}, ymm2/m256/m64bcst {er}
public static Vector128<float> ConvertToVector128Single(Vector256<double> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => ConvertToVector128Single(value, mode);
// VCVTPD2QQ ymm1{k1}{z}, ymm2/m256/m64bcst {er}
public static Vector256<long> ConvertToVector256Int64(Vector256<double> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => ConvertToVector256Int64(value, mode);
// VCVTPD2UDQ xmm1{k1}{z}, ymm2/m256/m64bcst {er}
public static Vector128<uint> ConvertToVector128UInt32(Vector256<double> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => ConvertToVector128UInt32(value, mode);
// VCVTPD2UQQ ymm1{k1}{z}, ymm2/m256/m64bcst {er}
public static Vector256<ulong> ConvertToVector256UInt64(Vector256<double> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => ConvertToVector256UInt64(value, mode);
// VCVTPS2DQ ymm1{k1}{z}, ymm2/m256/m32bcst {er}
public static Vector256<int> ConvertToVector256Int32(Vector256<float> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => ConvertToVector256Int32(value, mode);
// VCVTPS2QQ ymm1{k1}{z}, xmm2/m128/m32bcst {er}
public static Vector256<long> ConvertToVector256Int64(Vector128<float> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => ConvertToVector256Int64(value, mode);
// VCVTPS2UDQ ymm1{k1}{z}, ymm2/m256/m32bcst {er}
public static Vector256<uint> ConvertToVector256UInt32(Vector256<float> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => ConvertToVector256UInt32(value, mode);
// VCVTPS2UQQ ymm1{k1}{z}, xmm2/m128/m32bcst {er}
public static Vector256<ulong> ConvertToVector256UInt64(Vector128<float> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => ConvertToVector256UInt64(value, mode);
// VCVTQQ2PS xmm1{k1}{z}, ymm2/m256/m64bcst {er}
public static Vector128<float> ConvertToVector128Single(Vector256<long> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => ConvertToVector128Single(value, mode);
// VCVTQQ2PD ymm1{k1}{z}, ymm2/m256/m64bcst {er}
public static Vector256<double> ConvertToVector256Double(Vector256<long> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => ConvertToVector256Double(value, mode);
// VCVTUDQ2PS ymm1{k1}{z}, ymm2/m256/m32bcst {er}
public static Vector256<float> ConvertToVector256Single(Vector256<uint> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => ConvertToVector256Single(value, mode);
// VCVTUQQ2PS xmm1{k1}{z}, ymm2/m256/m64bcst {er}
public static Vector128<float> ConvertToVector128Single(Vector256<ulong> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => ConvertToVector128Single(value, mode);
// VCVTUQQ2PD ymm1{k1}{z}, ymm2/m256/m64bcst {er}
public static Vector256<double> ConvertToVector256Double(Vector256<ulong> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => ConvertToVector256Double(value, mode);
// VMULPD ymm1{k1}{z}, ymm2, ymm3/m256/m64bcst {er}
public static Vector256<double> Multiply(Vector256<double> left, Vector256<double> right, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => Multiply(left, right, mode);
// VMULPS ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst {er}
public static Vector256<float> Multiply(Vector256<float> left, Vector256<float> right, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => Multiply(left, right, mode);
// VSCALEFPD ymm1{k1}{z}, ymm2, ymm3/m256/m64bcst {er}
public static Vector256<double> Scale(Vector256<double> left, Vector256<double> right, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => Scale(left, right, mode);
// VSCALEFPS ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst {er}
public static Vector256<float> Scale(Vector256<float> left, Vector256<float> right, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => Scale(left, right, mode);
// VSQRTPD ymm1{k1}{z}, ymm2/m256/m64bcst {er}
public static Vector256<double> Sqrt(Vector256<double> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => Sqrt(value, mode);
// VSQRTPS ymm1{k1}{z}, ymm2/m256/m32bcst {er}
public static Vector256<float> Sqrt(Vector256<float> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => Sqrt(value, mode);
// VSUBPD ymm1{k1}{z}, ymm2, ymm3/m256/m64bcst {er}
public static Vector256<double> Subtract(Vector256<double> left, Vector256<double> right, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => Subtract(left, right, mode);
// VSUBPS ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst {er}
public static Vector256<float> Subtract(Vector256<float> left, Vector256<float> right, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => Subtract(left, right, mode);
[Intrinsic]
public new abstract class X64 : Avx10v1.X64
{
internal X64() { }
public static new bool IsSupported { get => IsSupported; }
}
[Intrinsic]
public abstract class V512 : Avx10v1.V512
{
internal V512() { }
public static new bool IsSupported { get => IsSupported; }
// VMINMAXPD zmm1{k1}{z}, zmm2, zmm3/m512/m64bcst {sae}, imm8
public static Vector512<double> MinMax(Vector512<double> left, Vector512<double> right, [ConstantExpected] byte control) => MinMax(left, right, control);
// VMINMAXPS zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst {sae}, imm8
public static Vector512<float> MinMax(Vector512<float> left, Vector512<float> right, [ConstantExpected] byte control) => MinMax(left, right, control);
// VCVTPS2IBS zmm1{k1}{z}, zmm2/m512/m32bcst {er}
public static Vector512<int> ConvertToByteWithSaturationAndWidenToInt32(Vector512<float> value) => ConvertToByteWithSaturationAndWidenToInt32(value);
// VCVTPS2IBS zmm1{k1}{z}, zmm2/m512/m32bcst {er}
public static Vector512<int> ConvertToByteWithSaturationAndWidenToInt32(Vector512<float> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => ConvertToByteWithSaturationAndWidenToInt32(value, mode);
// VCVTPS2IUBS zmm1{k1}{z}, zmm2/m512/m32bcst {er}
public static Vector512<uint> ConvertToByteWithSaturationAndWidenToUInt32(Vector512<float> value) => ConvertToByteWithSaturationAndWidenToUInt32(value);
// VCVTPS2IUBS zmm1{k1}{z}, zmm2/m512/m32bcst {er}
public static Vector512<uint> ConvertToByteWithSaturationAndWidenToUInt32(Vector512<float> value, [ConstantExpected(Max = FloatRoundingMode.ToZero)] FloatRoundingMode mode) => ConvertToByteWithSaturationAndWidenToUInt32(value, mode);
// VCVTTPS2IBS zmm1{k1}{z}, zmm2/m512/m32bcst {sae}
public static Vector512<int> ConvertToByteWithTruncatedSaturationAndWidenToInt32(Vector512<float> value) => ConvertToByteWithTruncatedSaturationAndWidenToInt32(value);
// VCVTTPS2IUBS zmm1{k1}{z}, zmm2/m512/m32bcst {sae}
public static Vector512<uint> ConvertToByteWithTruncatedSaturationAndWidenToUInt32(Vector512<float> value) => ConvertToByteWithTruncatedSaturationAndWidenToUInt32(value);
// This is a 512-bit extension of the previously existing 128/256-bit intrinsic
// VMPSADBW zmm1{k1}{z}, zmm2, zmm3/m512, imm8
public static Vector512<ushort> MultipleSumAbsoluteDifferences(Vector512<byte> left, Vector512<byte> right, [ConstantExpected] byte mask) => MultipleSumAbsoluteDifferences(left, right, mask);
[Intrinsic]
public new abstract class X64 : Avx10v1.V512.X64
{
internal X64() { }
public static new bool IsSupported { get => IsSupported; }
}
}
}
} |
I would also like to discuss the following APIs
and would like to change them to
|
@khushal1996 the proposed signatures don't match the .NET naming conventions (we'd still use In particular we already expose the existing Likewise while we expose some instructions like The new |
Thanks @tannergooding
|
In this case, since it's a move and no conversions are possible, it'd just be 2-4 |
Thanks @tannergooding. Concluding this discussion with the addition of the following APIs
|
@tannergooding Assuming we are planning on this, do we add this to the proposal somewhere now that the API name has changed from the original approved proposal? |
We'd need to extract them to their own proposal. API review is done for the year so we should get to them sometime in January. We shouldn't be blocked on doing any work around the other approved APIs in the meantime, however. |
Since the APIs listed in description and #109083 (comment) are different from what we agreed upon, just to confirm, for APIs for
Let me know if this is correct |
They should be matching the naming, particularly the first two. The latter two match the general format but were changed from This should match what is under the approving comment here: #109083 (comment) |
I think there is a mismatch between what the approving comment says and what we were actually discussing here #109083 (comment). Our discussion had no conclusion and the approving comment went in the wrong direction because of the original proposal. There is no such thing as |
This is a bit of a nomenclature thing. The result of a It is only the result of an
The spec gives the following:
The nuance is then the |
True. I will change the APIs to the following
|
@khushal1996 this proposal was fully handled in #111209 and so can be closed now, correct? |
@tannergooding yes, this proposal can be considered fully handled and we will have 2 proposals for the remaining APIs |
Hi @tannergooding. I was going through the pending APIs and we are yet to add these 4 APIs for AVX10.2. As discussed, I will open a new issue to discuss them and we can take it forward. Let me know if there are any concerns. |
No concerns, thanks for following up! |
Background and motivation
Intel has announced the features available in the next version of Avx10 (10.2). In order to support this, .NET needs to expand the Avx10 library to include the new APIs.
Avx10.2 spec. Section 7 - 14 in this spec goes over the newly added instructions. A couple of interesting features here are MinMax and saturating conversions.
As part of the original API Proposal, the proposed design was for future Avx10 versions to have their own classes which inherit from Avx10v1.
API Proposal
API Usage
Alternative Designs
No response
Risks
No response