To report problems with this extension, please open a new issue at:
- Alexey Sotkin, Intel
- Dounia Khaldi, Intel
- Mateusz Belicki, Intel
- Dmitry Sidorov, Intel
- Ben Ashbaugh, Intel
- Greg Lueck, Intel
- Victor Mustya, Intel
- Arvind Sudarsanam, Intel
Working Draft
This is a preview extension specification, intended to provide early access to a feature for review and community feedback. When the feature matures, this specification may be released as a formal extension.
Because the interfaces defined by this specification are not final and are subject to change, they are not intended to be used by shipping software products. If you are interested in using this feature in your software product, please let us know!
This extension is written against the SPIR-V Specification, Version 1.6 Revision 2.
This extension is written against SPV_KHR_cooperative_matrix extension specification Revision 3.
This extension is written against SPV_INTEL_bfloat16_conversion extension specification Revision 1.
This extension is written against SPV_INTEL_tensor_float32_rounding extension specification Revision 2.
This extension requires SPIR-V 1.0.
This extension adds new capabilities to SPV_KHR_cooperative_matrix, such as special interpretations of a matrix's element type and a Packed layout to support Intel VNNI instructions. It also adds instructions for element-wise function application, for querying the coordinate of a matrix element, and for matrix prefetch, and it describes how to specify the cache level for matrix load and store instructions.
To use this extension within a SPIR-V module, the appropriate OpExtension must be present in the module:
OpExtension "SPV_INTEL_joint_matrix"
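As an illustration of how this declaration appears in a binary module, the following minimal Python sketch packs the OpExtension instruction into 32-bit words. It assumes the core SPIR-V binary rules (opcode 10 for OpExtension, word count in the high 16 bits of the first word, literal strings stored as nul-terminated UTF-8 padded to a word boundary). The helper names `encode_literal_string` and `encode_op_extension` are hypothetical and are not defined by this extension.

```python
import struct

OP_EXTENSION = 10  # opcode of OpExtension in the core SPIR-V specification


def encode_literal_string(text: str) -> list[int]:
    """Pack a literal string: UTF-8, nul-terminated, zero-padded to 32-bit words."""
    data = text.encode("utf-8") + b"\x00"
    data += b"\x00" * (-len(data) % 4)  # pad to a multiple of four bytes
    # Earlier characters occupy the lower-order bytes of each word.
    return list(struct.unpack(f"<{len(data) // 4}I", data))


def encode_op_extension(name: str) -> list[int]:
    """Return the words of an OpExtension instruction naming `name` (hypothetical helper)."""
    operands = encode_literal_string(name)
    word_count = 1 + len(operands)  # the leading word counts itself
    return [(word_count << 16) | OP_EXTENSION, *operands]


words = encode_op_extension("SPV_INTEL_joint_matrix")
print([hex(w) for w in words])
```

Under these assumptions the instruction occupies seven words: one leading word followed by six words holding the 22-character name and its terminator.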
This extension introduces new capabilities:
- PackedCooperativeMatrixINTEL
- CooperativeMatrixInvocationInstructionsINTEL
- CooperativeMatrixTF32ComponentTypeINTEL
- CooperativeMatrixBFloat16ComponentTypeINTEL
- CooperativeMatrixCheckedInstructionsINTEL
- CooperativeMatrixPrefetchINTEL
Instructions added under the CooperativeMatrixInvocationInstructionsINTEL capability:
- OpCooperativeMatrixGetElementCoordINTEL
- OpCooperativeMatrixApplyFunctionINTEL
Instructions added under the CooperativeMatrixPrefetchINTEL capability:
OpCooperativeMatrixPrefetchINTEL
Instructions added under the CooperativeMatrixCheckedInstructionsINTEL capability:
- OpCooperativeMatrixLoadCheckedINTEL
- OpCooperativeMatrixStoreCheckedINTEL
- OpCooperativeMatrixConstructCheckedINTEL
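The grouping above can be read as a lookup from each new instruction to the capability that enables it. The Python sketch below transcribes that grouping and derives the capabilities a module would need for a given set of instructions. The table name `ENABLING_CAPABILITY` and the function `required_capabilities` are hypothetical helpers for illustration, not part of any SPIR-V tool.

```python
# Enabling capability for each instruction, transcribed from the lists above.
ENABLING_CAPABILITY = {
    "OpCooperativeMatrixGetElementCoordINTEL": "CooperativeMatrixInvocationInstructionsINTEL",
    "OpCooperativeMatrixApplyFunctionINTEL": "CooperativeMatrixInvocationInstructionsINTEL",
    "OpCooperativeMatrixPrefetchINTEL": "CooperativeMatrixPrefetchINTEL",
    "OpCooperativeMatrixLoadCheckedINTEL": "CooperativeMatrixCheckedInstructionsINTEL",
    "OpCooperativeMatrixStoreCheckedINTEL": "CooperativeMatrixCheckedInstructionsINTEL",
    "OpCooperativeMatrixConstructCheckedINTEL": "CooperativeMatrixCheckedInstructionsINTEL",
}


def required_capabilities(instructions):
    """Return the sorted capabilities needed by the given opcode names.

    Opcode names that this extension does not define are ignored.
    """
    return sorted(
        {ENABLING_CAPABILITY[op] for op in instructions if op in ENABLING_CAPABILITY}
    )


used = [
    "OpCooperativeMatrixPrefetchINTEL",
    "OpCooperativeMatrixLoadCheckedINTEL",
    "OpCooperativeMatrixStoreCheckedINTEL",
    "OpFAdd",  # core instruction, not covered by this table
]
print(required_capabilities(used))
```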
Name | Token number |
---|---|
PackedCooperativeMatrixINTEL | 6434 |
CooperativeMatrixInvocationInstructionsINTEL | 6435 |
CooperativeMatrixTF32ComponentTypeINTEL | 6436 |
CooperativeMatrixBFloat16ComponentTypeINTEL | 6437 |
CooperativeMatrixPrefetchINTEL | 6411 |
CooperativeMatrixCheckedInstructionsINTEL | 6192 |
OpCooperativeMatrixGetElementCoordINTEL | 6440 |
OpCooperativeMatrixApplyFunctionINTEL | 6448 |
OpCooperativeMatrixPrefetchINTEL | 6449 |
OpCooperativeMatrixLoadCheckedINTEL | 6193 |
OpCooperativeMatrixStoreCheckedINTEL | 6194 |
OpCooperativeMatrixConstructCheckedINTEL | 6195 |
Modify section 3.X, Cooperative Matrix Layout, adding the PackedINTEL layout:
Value | Layout | Enabling capability |
---|---|---|
0x2 | PackedINTEL | PackedCooperativeMatrixINTEL |
Modify section 3.X, Cooperative Matrix Operands, adding new entries to the table to specify the Component Type Interpretation:
Value | Interpretation | Enabling capability |
---|---|---|
0x20 | MatrixAAndBTF32ComponentsINTEL | CooperativeMatrixTF32ComponentTypeINTEL |
0x40 | MatrixAAndBBFloat16ComponentsINTEL | CooperativeMatrixBFloat16ComponentTypeINTEL |
0x80 | MatrixCBFloat16ComponentsINTEL | CooperativeMatrixBFloat16ComponentTypeINTEL |
0x100 | MatrixResultBFloat16ComponentsINTEL | CooperativeMatrixBFloat16ComponentTypeINTEL |
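Because each interpretation in the table above has a distinct single-bit value, several of them can be combined into one operand mask with a bitwise OR. The Python sketch below models this with `enum.IntFlag` and reports which capabilities a given mask would require. The class name `InterpretationINTEL` and the helper `capabilities_for` are hypothetical; only the bit values and capability names come from the table. The sketch does not check whether a particular combination is meaningful for a given instruction.

```python
from enum import IntFlag


class InterpretationINTEL(IntFlag):  # hypothetical name for the new operand bits
    MatrixAAndBTF32ComponentsINTEL = 0x20
    MatrixAAndBBFloat16ComponentsINTEL = 0x40
    MatrixCBFloat16ComponentsINTEL = 0x80
    MatrixResultBFloat16ComponentsINTEL = 0x100


# Enabling capability for each bit, transcribed from the table above.
ENABLING = {
    InterpretationINTEL.MatrixAAndBTF32ComponentsINTEL:
        "CooperativeMatrixTF32ComponentTypeINTEL",
    InterpretationINTEL.MatrixAAndBBFloat16ComponentsINTEL:
        "CooperativeMatrixBFloat16ComponentTypeINTEL",
    InterpretationINTEL.MatrixCBFloat16ComponentsINTEL:
        "CooperativeMatrixBFloat16ComponentTypeINTEL",
    InterpretationINTEL.MatrixResultBFloat16ComponentsINTEL:
        "CooperativeMatrixBFloat16ComponentTypeINTEL",
}


def capabilities_for(mask: int) -> set[str]:
    """Return the capabilities enabling every interpretation bit set in `mask`."""
    return {cap for flag, cap in ENABLING.items() if mask & flag}


# A and B hold bfloat16 data and the result is also interpreted as bfloat16.
mask = (InterpretationINTEL.MatrixAAndBBFloat16ComponentsINTEL
        | InterpretationINTEL.MatrixResultBFloat16ComponentsINTEL)
print(hex(mask), capabilities_for(mask))
```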
Modify Section 3.31, Capability, adding rows to the Capability table:
Value | Capability | Implicitly Declares |
---|---|---|
6434 | PackedCooperativeMatrixINTEL | CooperativeMatrixKHR |
6435 | CooperativeMatrixInvocationInstructionsINTEL | CooperativeMatrixKHR |
6436 | CooperativeMatrixTF32ComponentTypeINTEL | CooperativeMatrixKHR |
6437 | CooperativeMatrixBFloat16ComponentTypeINTEL | CooperativeMatrixKHR |
6411 | CooperativeMatrixPrefetchINTEL | CooperativeMatrixKHR |
6192 | CooperativeMatrixCheckedInstructionsINTEL | CooperativeMatrixKHR |
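To make the "Implicitly Declares" column concrete, the following Python sketch records each new capability with its token and the capability it implies, then computes the full set of capabilities in effect for a module. It also shows how an OpCapability instruction for one of these tokens would be encoded, assuming the core SPIR-V opcode 17 and a fixed word count of 2. The names `CAPABILITIES`, `declared_capabilities`, and `encode_op_capability` are hypothetical.

```python
OP_CAPABILITY = 17  # opcode of OpCapability in the core SPIR-V specification

# name -> (token, implicitly declared capability), transcribed from the table above.
CAPABILITIES = {
    "PackedCooperativeMatrixINTEL": (6434, "CooperativeMatrixKHR"),
    "CooperativeMatrixInvocationInstructionsINTEL": (6435, "CooperativeMatrixKHR"),
    "CooperativeMatrixTF32ComponentTypeINTEL": (6436, "CooperativeMatrixKHR"),
    "CooperativeMatrixBFloat16ComponentTypeINTEL": (6437, "CooperativeMatrixKHR"),
    "CooperativeMatrixPrefetchINTEL": (6411, "CooperativeMatrixKHR"),
    "CooperativeMatrixCheckedInstructionsINTEL": (6192, "CooperativeMatrixKHR"),
}


def declared_capabilities(names):
    """Return the explicitly listed capabilities plus those they implicitly declare."""
    result = set(names)
    for name in names:
        if name in CAPABILITIES:
            result.add(CAPABILITIES[name][1])
    return result


def encode_op_capability(name):
    """Return the two words of an OpCapability instruction for a capability of this extension."""
    token, _implied = CAPABILITIES[name]
    return [(2 << 16) | OP_CAPABILITY, token]


print(sorted(declared_capabilities(["CooperativeMatrixPrefetchINTEL"])))
print([hex(w) for w in encode_op_capability("PackedCooperativeMatrixINTEL")])
```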
Modify OpCooperativeMatrixLoadKHR, adding:
Note: To specify the cache level for OpCooperativeMatrixLoadKHR, use the CacheControlLoadINTEL decoration from the SPV_INTEL_cache_controls extension.
Modify OpCooperativeMatrixStoreKHR, adding:
Note: To specify the cache level for OpCooperativeMatrixStoreKHR, use the CacheControlStoreINTEL decoration from the SPV_INTEL_cache_controls extension.
OpCooperativeMatrixLoadCheckedINTEL
Capability: CooperativeMatrixCheckedInstructionsINTEL
9+variable | 6193 | <id> | Result <id> | <id> | <id> | <id> | <id> | <id> | <id> | Optional <id> | Optional |
OpCooperativeMatrixStoreCheckedINTEL
Capability: CooperativeMatrixCheckedInstructionsINTEL
8+variable | 6194 | <id> | <id> | <id> | <id> | <id> | <id> | <id> | Optional <id> | Optional |
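The "9+variable" and "8+variable" entries give the minimum word count of each instruction; optional operands add further words. As a rough sketch of how a consumer might sanity-check this, the Python code below splits the first word of an instruction into word count and opcode (following the core SPIR-V layout, with the word count in the high 16 bits) and compares the count with the minimum listed above. The names `MIN_WORD_COUNT`, `split_first_word`, and `has_valid_word_count` are hypothetical.

```python
# opcode -> (name, minimum word count), transcribed from the instruction layouts above.
MIN_WORD_COUNT = {
    6193: ("OpCooperativeMatrixLoadCheckedINTEL", 9),
    6194: ("OpCooperativeMatrixStoreCheckedINTEL", 8),
}


def split_first_word(word: int) -> tuple[int, int]:
    """Split an instruction's first word into (word count, opcode)."""
    return word >> 16, word & 0xFFFF


def has_valid_word_count(first_word: int) -> bool:
    """Check that a checked load/store has at least its mandatory words.

    Raises KeyError for opcodes other than the two covered by this sketch.
    """
    word_count, opcode = split_first_word(first_word)
    _name, minimum = MIN_WORD_COUNT[opcode]
    return word_count >= minimum


load_with_one_optional = (10 << 16) | 6193  # mandatory words plus one optional operand
truncated_store = (7 << 16) | 6194          # one word short of the minimum
print(has_valid_word_count(load_with_one_optional), has_valid_word_count(truncated_store))
```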
If CooperativeMatrixBFloat16ComponentTypeINTEL and BFloat16ConversionINTEL capabilities are declared, then allow cooperative matrix types for the following conversion instructions (if the component types are appropriate): OpConvertFToBF16INTEL, OpConvertBF16ToFINTEL (See also: SPV_INTEL_bfloat16_conversion extension).
If CooperativeMatrixTF32ComponentTypeINTEL and TensorFloat32RoundingINTEL capabilities are declared, then allow cooperative matrix types for the following conversion instructions (if the component types are appropriate): OpRoundFToTF32INTEL (See also: SPV_INTEL_tensor_float32_rounding extension).
Rev | Date | Author | Changes |
---|---|---|---|
1 | 2021-02-16 | Alexey Sotkin | Initial revision |
2 | 2021-09-06 | Dmitry Sidorov | Split OpJointMatrixMadINTEL instruction into 4 |
3 | 2021-12-28 | Dmitry Sidorov | Add Joint matrix to Composite definition |
4 | 2022-03-10 | Dmitry Sidorov | Add OpJointMatrixWorkItemLengthINTEL instruction |
5 | 2022-04-01 | Dmitry Sidorov | Add Use parameter to TypeJointMatrixINTEL |
6 | 2022-09-07 | Dmitry Sidorov | Make the Use parameter mandatory |
7 | 2022-10-13 | Dmitry Sidorov | Add ComponentTypeInterpretation decoration and OpJointMatrixGetElementCoordINTEL |
8 | 2022-12-02 | Dmitry Sidorov | Remove Scope from the instructions and Layout from the type |
9 | 2022-12-07 | Dmitry Sidorov | Split main capability into 3 |
10 | 2023-02-01 | Dmitry Sidorov | Move ComponentTypeInterpretation to an optional type parameter |
11 | 2023-07-05 | Dmitry Sidorov | Update on top of SPV_KHR_cooperative_matrix |
12 | 2023-09-25 | Dmitry Sidorov | Add apply function instruction |
13 | 2023-09-25 | Dmitry Sidorov | Add conversion instructions for tf32 and bf16 |
14 | 2023-10-11 | Dmitry Sidorov | Add matrix prefetch instruction |
15 | 2023-11-06 | Dmitry Sidorov | Put deprecation note on OpCooperativeMatrixGetElementCoordINTEL |
16 | 2023-11-06 | Dmitry Sidorov | Add checked load, store and construct instructions |