Add simd_extmul_* support for x64 #3084
Conversation
Force-pushed from ba3d91f to 71574cd, then to 6ccd1d8.
This looks good to me with the minor comments about inlining and reordering the lowerings. @akirilov-arm, the code_translator.rs
part should look OK for the aarch64 backend, right?
        insn: swiden1_high,
        input: 0,
    },
];
Not sure why swiden_input is needed: couldn't the InsnInputs be created inline?
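As a minimal, self-contained sketch of that suggestion (stand-in types; the real InsnInput and instruction handle live in the x64 backend's lower.rs, and swiden1_high is just an illustrative name here), the inputs could be built at the use site instead of through a named swiden_input array:

```rust
// Stand-ins for the backend's instruction handle and InsnInput wrapper.
#[derive(Clone, Copy, Debug)]
struct IRInst(u32);

#[derive(Clone, Copy, Debug)]
struct InsnInput {
    insn: IRInst,
    input: usize,
}

fn main() {
    // Hypothetical instruction that was matched as the swiden_high source.
    let swiden1_high = IRInst(42);

    // Constructed inline where it is consumed; no intermediate array needed.
    let lhs = InsnInput { insn: swiden1_high, input: 0 };
    println!("insn {:?}, input {}", lhs.insn, lhs.input);
}
```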
        dst,
    ));
}
_ => panic!("Unsupported extmul_low_signed type"),
Initially I thought that this matching was too greedy, i.e. that it would match sequences it shouldn't and then crash because the types are wrong. I now think it is OK, because swiden_high only really allows the following types: I8X16, I16X8, I32X4. I think it might be worthwhile to mention this assumption in a comment at the top so that future readers aren't confused.
v0 = swiden.i64
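One way that assumption could be written down and checked, as a hedged sketch rather than the PR's actual code (it assumes a Cargo dependency on cranelift_codegen for the ir::types constants the lowering already uses, and the helper name is made up for illustration):

```rust
use cranelift_codegen::ir::types::{self, Type};

/// swiden_high / swiden_low / uwiden_high / uwiden_low only accept these
/// vector types, so the extmul match arms cover every case that can reach
/// them and the panic! fallbacks are unreachable for valid CLIF.
fn is_widenable_input(ty: Type) -> bool {
    ty == types::I8X16 || ty == types::I16X8 || ty == types::I32X4
}

fn main() {
    assert!(is_widenable_input(types::I16X8));
    assert!(!is_widenable_input(types::I64X2));
    println!("widenable-input assumption holds");
}
```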
@@ -1662,7 +1662,348 @@ fn lower_insn_to_regs<C: LowerCtx<I = Inst>>(

Opcode::Imul => {
    let ty = ty.unwrap();
    if ty == types::I64X2 {
// First check for ext_mul_* instructions. Where possible ext_mul_* lowerings |
I think the order of the lowerings should be reversed: scalar lowerings first, then vector lowerings, then pattern-matching lowerings. As it stands in this PR, every imul.i64 is going to require a bunch of matching calls and type comparisons before it ever gets lowered. I think scalar multiplication is going to be the most common case, then plain vector multiplication, so we would want to make those common paths shorter. (Not sure exactly how much this affects compile time, but hopefully you see what I mean.)
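A stand-in sketch of that proposed check order (illustrative types and return values, not the backend's real lowering code): the cheap, common cases are tested first so a scalar imul never pays for the extmul pattern matching.

```rust
#[derive(Clone, Copy, PartialEq)]
enum Ty {
    I64,
    I64X2,
}

fn lower_imul(ty: Ty, has_widening_source: bool) -> &'static str {
    if ty == Ty::I64 {
        // 1. Scalar multiplication first: assumed to be the most common case.
        "scalar mul"
    } else if !has_widening_source {
        // 2. Plain vector multiplication next.
        "plain vector mul"
    } else {
        // 3. Pattern-matched lowerings (extmul via swiden/uwiden sources) last.
        "extmul"
    }
}

fn main() {
    assert_eq!(lower_imul(Ty::I64, false), "scalar mul");
    assert_eq!(lower_imul(Ty::I64X2, false), "plain vector mul");
    assert_eq!(lower_imul(Ty::I64X2, true), "extmul");
}
```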
Sure, I agree. That was my original thought, but I never reverted the initial implementation. Looking at this again, I see why I naturally put the extmul lowerings first: because we share all these instructions with imul, the current checks such as ty == types::I64X2 and ty.lane_count() > 1 are entered before we have a chance to check for an opcode source. I think the best we can do is have a matches_input_any at the top and branch from there; that is what I will do before merging. I suppose if there is another/better solution we can refactor later.
Also note that the extmul code must remain at the top as the fall-through target, since I am using if let Some(..) to check for opcode sources and there is no "if not let Some(..)". Still, there is just one branch to reach the other lowerings.
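A self-contained sketch of that branch-at-the-top structure (stand-in Opcode enum and a simplified matches_input_any; the real helper in the x64 backend operates on a lowering context and an InsnInput), modelling just the control flow:

```rust
#[derive(Clone, Copy, PartialEq)]
enum Opcode {
    SwidenHigh,
    SwidenLow,
    UwidenHigh,
    UwidenLow,
    Iconst,
}

// Returns the matched opcode if the input's defining instruction is one of `ops`.
fn matches_input_any(source_op: Opcode, ops: &[Opcode]) -> Option<Opcode> {
    ops.contains(&source_op).then(|| source_op)
}

fn lower_imul(input_source: Opcode) -> &'static str {
    const WIDENING: &[Opcode] = &[
        Opcode::SwidenHigh,
        Opcode::SwidenLow,
        Opcode::UwidenHigh,
        Opcode::UwidenLow,
    ];
    // One check at the top: the extmul lowerings stay first as the fall-through
    // target of the `if let Some(..)`, and everything else takes a single branch.
    if let Some(op) = matches_input_any(input_source, WIDENING) {
        match op {
            Opcode::SwidenHigh | Opcode::SwidenLow => "signed extmul lowering",
            _ => "unsigned extmul lowering",
        }
    } else {
        "scalar / plain vector imul lowering"
    }
}

fn main() {
    assert_eq!(lower_imul(Opcode::SwidenHigh), "signed extmul lowering");
    assert_eq!(lower_imul(Opcode::Iconst), "scalar / plain vector imul lowering");
}
```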
@abrown I will let @sparker-arm comment because he has done the same changes in PR #3070.
Force-pushed from 384b69b to 02abbe8.