
initial grok #169

Merged: 52 commits, Sep 26, 2024

Changes from 1 commit

Commits (52)
ba87a04  initial grok (dan-garvey, Sep 5, 2024)
de9842b  use name prefix instead of new dataclass (dan-garvey, Sep 5, 2024)
b7965b1  some hacks (dan-garvey, Sep 6, 2024)
6d3d261  more hack (dan-garvey, Sep 6, 2024)
b5f535d  fix moe-ffn (dan-garvey, Sep 6, 2024)
4095db0  Add in some missing grok specific model structure and constants (KyleHerndon, Sep 9, 2024)
e71630a  Add attn_output_norm layer (archana-ramalingam, Sep 12, 2024)
5772a3d  Update MOE block in decode (archana-ramalingam, Sep 12, 2024)
3f2914a  Some fixes to the grok model (KyleHerndon, Sep 12, 2024)
7c2e133  Merge branch 'main' into grokstar (archana-ramalingam, Sep 12, 2024)
e1261f5  Revert "Merge branch 'main' into grokstar" (archana-ramalingam, Sep 12, 2024)
a242bde  Fix merging main changes (archana-ramalingam, Sep 12, 2024)
bb40d12  Update tensor trace names (archana-ramalingam, Sep 12, 2024)
cfa8420  Update moe block test (archana-ramalingam, Sep 12, 2024)
325696f  Update paged attention block with grok changes (archana-ramalingam, Sep 12, 2024)
48fce0c  Update paged attention block with grok changes (archana-ramalingam, Sep 12, 2024)
d9e787c  Add use_grok to MOE block (archana-ramalingam, Sep 12, 2024)
ab084cc  Use use_grok in MOE block (archana-ramalingam, Sep 13, 2024)
29e3603  Change MOE activation from silu to gelu for Grok (archana-ramalingam, Sep 13, 2024)
0670e1d  Allow router weight norm for all MOEs (archana-ramalingam, Sep 13, 2024)
a4be20b  Update llm_configs to support llama and grok architectures (archana-ramalingam, Sep 13, 2024)
3049f87  Remove comment (archana-ramalingam, Sep 13, 2024)
b8240c8  Add optional params for Grok (archana-ramalingam, Sep 13, 2024)
5bf30e0  Add all models supported in sharktank (archana-ramalingam, Sep 13, 2024)
d970944  Make rope_freq_base mandatory param (archana-ramalingam, Sep 13, 2024)
b1fd818  small refactor/cleanup (dan-garvey, Sep 24, 2024)
85e2f87  more cleanup (dan-garvey, Sep 24, 2024)
7ed9a23  this shouldn't have been unrebased?? (dan-garvey, Sep 24, 2024)
3510634  fix use_hf args (dan-garvey, Sep 24, 2024)
a4ff36a  Make use_grok optional in MOE and Attention blocks (archana-ramalingam, Sep 24, 2024)
940db2f  Add use_grok to moe_block_test (archana-ramalingam, Sep 24, 2024)
bb2f5a1  fix kv cache test (dan-garvey, Sep 24, 2024)
b6e52eb  Add PreGatherMoeBlock to import from layers (archana-ramalingam, Sep 24, 2024)
b790cb5  Add MOE block export for prefill + decode (archana-ramalingam, Sep 24, 2024)
19218f3  Fix architecture variable (archana-ramalingam, Sep 24, 2024)
7deb42a  Fix imports (archana-ramalingam, Sep 24, 2024)
1b6cb6d  Fix rope_freq_base (archana-ramalingam, Sep 24, 2024)
43b20c4  fix flaky test (dan-garvey, Sep 24, 2024)
d938a08  Merge branch 'main' into grokstar (archana-ramalingam, Sep 24, 2024)
6aeeb4f  Add short versions for args (archana-ramalingam, Sep 24, 2024)
cac489c  Remove use_hf and use_grok options from llama (archana-ramalingam, Sep 24, 2024)
d5c27fe  Move create_kv_cache to utils folder (archana-ramalingam, Sep 24, 2024)
10d6c87  Fix error (archana-ramalingam, Sep 24, 2024)
4816c93  Merge branch 'main' into grokstar (archana-ramalingam, Sep 25, 2024)
124503f  revert addition of dtype arg (dan-garvey, Sep 25, 2024)
46c6eb6  Merge branch 'main' into grokstar (dan-garvey, Sep 25, 2024)
f3a8fb1  Remove attention_dtype (archana-ramalingam, Sep 25, 2024)
dcc1e8f  Merge branch 'main' into grokstar (dan-garvey, Sep 25, 2024)
88e38e2  fix missing parenth (dan-garvey, Sep 25, 2024)
430045b  correctly rebase T_T (dan-garvey, Sep 25, 2024)
f0a3e31  nonstrict (dan-garvey, Sep 26, 2024)
e5dc9e9  Merge branch 'main' into grokstar (dan-garvey, Sep 26, 2024)
Remove attention_dtype
archana-ramalingam committed Sep 25, 2024
commit f3a8fb18612b60cb2748ffa5ac423104c3833300
1 change: 0 additions & 1 deletion sharktank/sharktank/examples/paged_llm_v1.py
@@ -233,7 +233,6 @@ def main():
 
     device = torch.device(args.device) if args.device else None
     activation_dtype = getattr(torch, args.activation_dtype)
-    attention_dtype = getattr(torch, args.attention_dtype)
     assert isinstance(activation_dtype, torch.dtype)
 
     dataset = cli.get_input_dataset(args)

On the removed line, the Member Author commented: nice catch
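
For context on the line that stays, args.activation_dtype is a plain string parsed from the CLI, and getattr resolves it against the torch module. A minimal sketch of that pattern (the flag value "float16" and the variable names are illustrative, not taken from the PR):

import torch

# Stand-in for the parsed args.activation_dtype value, e.g. "float16".
activation_dtype_name = "float16"

# getattr(torch, "float16") returns the torch.float16 dtype object, which is
# why the assert kept in the diff can check isinstance(..., torch.dtype).
activation_dtype = getattr(torch, activation_dtype_name)
assert isinstance(activation_dtype, torch.dtype)

# A name that is not an attribute of torch raises AttributeError here rather
# than silently producing a non-dtype value.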