
Update spmm naive kernel for warpsize 64 #50

Merged: pnunna93 merged 2 commits into rocm_enabled_multi_backend from spmm_naive_warpsize_64 on Jan 4, 2025

Conversation

pnunna93 (Collaborator) commented on Nov 1, 2024

This PR adds a macro to set the warp size based on the GPU for the spmm naive kernel.
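For context, AMD GCN/CDNA GPUs execute in wavefronts of 64 threads, while NVIDIA GPUs use warps of 32, so warp-level code cannot hard-code a width of 32. The sketch below illustrates the general pattern; the warp_reduce_sum helper is hypothetical and is not the PR's actual spmm kernel code.

#include <hip/hip_runtime.h>

// Pick the warp (wavefront) size at compile time: 64 on AMD GCN/CDNA,
// 32 otherwise. __AMDGCN_WAVEFRONT_SIZE is predefined by the AMD compiler.
#if defined(__HIP_PLATFORM_AMD__)
#define WARP_SIZE __AMDGCN_WAVEFRONT_SIZE
#else
#define WARP_SIZE 32
#endif

// Hypothetical warp-level sum reduction whose shuffle distances are
// derived from WARP_SIZE rather than a hard-coded 32.
__device__ float warp_reduce_sum(float val) {
    for (int offset = WARP_SIZE / 2; offset > 0; offset /= 2)
        val += __shfl_down(val, offset);
    return val;
}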

pnunna93 requested a review from lcskrishna on November 1, 2024 at 21:43
lcskrishna commented:

@pnunna93 Did you run the unit tests?

csrc/kernels.hip Outdated
@@ -2853,6 +2853,7 @@ template <int THREADS, int ITEMS_PER_THREAD, int TILE_ROWS, int TILE_COLS, int T
 #define DENORM 1.0f/127.0f
 #define MAX_SPARSE_COUNT 32
 #define SMEM_SIZE 8*256
+#define WARP_SIZE __AMDGCN_WAVEFRONT_SIZE


lcskrishna commented on this line: Instead of __AMDGCN_WAVEFRONT_SIZE, can we use the warpSize defined in the hip/hip_runtime.h header file?

#define WARP_SIZE warpSize
For example: https://github.com/pytorch/pytorch/blob/2cefbb71cf65696e28ee1bdfce06d6846285e5fb/c10/macros/Macros.h#L322
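For reference, the linked PyTorch macro follows roughly the pattern below; this is a sketch of that Macros.h section rather than a verbatim copy. On ROCm it defers to warpSize from hip/hip_runtime.h, while on CUDA the warp size is always 32.

// Sketch of the PyTorch-style pattern (assumed, not copied verbatim):
#if defined(USE_ROCM)
#define C10_WARP_SIZE warpSize // from hip/hip_runtime.h: 64 or 32 by arch
#else
#define C10_WARP_SIZE 32
#endif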

pnunna93 (Collaborator, Author) replied:

Changed it. I have also modified the 4-bit gemm kernel to use the same.

pnunna93 (Collaborator, Author) commented on Nov 8, 2024

> @pnunna93 Did you run the unit tests?

Yes. This is the latest log: BnB_UT_Summary_1108.log

pnunna93 merged commit e4fe8b5 into rocm_enabled_multi_backend on Jan 4, 2025
10 of 26 checks passed
pnunna93 deleted the spmm_naive_warpsize_64 branch on January 4, 2025 at 00:15