Commit
[OPS/BLOCKSPARSE] remove unnecessary mask (#1351)
This PR applies a minor patch that removes unnecessary masks in `_dsd_kernel()`.

### Details

`offs_bn` is defined as follows and is not updated afterwards.

```py
offs_bn = pid_m * TILE_N + tl.arange(0, TILE_N)
offs_bn = tl.max_contiguous(tl.multiple_of(offs_bn % DS0, TILE_N), TILE_N)
```

Because `offs_bn` is reduced modulo `DS0`, every element of it is already less than `DS0`, so this mask is always `True`.

```py
b = tl.load(pb, mask=offs_bn[None, :] < DS0)
```

This PR removes this mask (as well as explicit `mask=True`).
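The argument can be checked outside of Triton with a minimal plain-Python sketch. The helper names `wrapped_offsets` and `mask_is_redundant` are hypothetical and not part of the kernel; the sketch assumes `DS0` is a positive integer and `pid_m` is non-negative, and it models `tl.arange(0, TILE_N)` with `range(tile_n)`.

```python
def wrapped_offsets(pid_m, tile_n, ds0):
    """Plain-Python stand-in for (pid_m * TILE_N + tl.arange(0, TILE_N)) % DS0.

    Hypothetical helper for illustration only; assumes ds0 > 0 and pid_m >= 0.
    """
    return [(pid_m * tile_n + i) % ds0 for i in range(tile_n)]


def mask_is_redundant(pid_m, tile_n, ds0):
    """Return True if the mask `offs_bn < DS0` holds for every offset."""
    offs_bn = wrapped_offsets(pid_m, tile_n, ds0)
    return all(off < ds0 for off in offs_bn)


if __name__ == "__main__":
    # Sweep a few tile sizes, dimensions, and program ids, including
    # cases where the tile runs past DS0 and wraps around.
    for tile_n in (16, 32, 64):
        for ds0 in (1, 7, 16, 40, 100):
            for pid_m in range(8):
                assert mask_is_redundant(pid_m, tile_n, ds0)
    print("mask is always True for the sampled configurations")
```

Since the remainder of a non-negative integer modulo a positive `DS0` always lies in `[0, DS0)`, the comparison can never be false, which is why dropping the mask does not change what is loaded.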