[DO NOT MERGE] Orca prefix sharing benchmark #41

Closed
wants to merge 108 commits into from
5d25a8a
[WIP] Implement orca
WoosukKwon Mar 12, 2023
84253db
Merge branch 'main' into orca
WoosukKwon Mar 12, 2023
935d4ea
Add benchmarking script
WoosukKwon Mar 12, 2023
6c77ccc
Improve benchmark.py
WoosukKwon Mar 12, 2023
4112701
Add dataset args
WoosukKwon Mar 12, 2023
778fb09
Handle zero-input case
WoosukKwon Mar 12, 2023
388f2d3
Merge branch 'benchmark' into orca
WoosukKwon Mar 12, 2023
e7ba3e3
Revert the change in frontend
WoosukKwon Mar 13, 2023
095387f
Minor fix in scheduler
WoosukKwon Mar 13, 2023
a3f39b9
Minor fix
WoosukKwon Mar 13, 2023
4b9e802
Collect performance stats
WoosukKwon Mar 13, 2023
d3d6883
Save performance stats for benchmarking
WoosukKwon Mar 13, 2023
25046c8
Merge branch 'benchmark' into orca
WoosukKwon Mar 13, 2023
4dbb6f5
Add a script to visualize stats
WoosukKwon Mar 13, 2023
ef9f20f
Merge branch 'benchmark' into orca
WoosukKwon Mar 13, 2023
804c087
tmp -> orca
WoosukKwon Mar 13, 2023
42dddbb
Implement get_num_free_blocks for buddy allocator
WoosukKwon Mar 13, 2023
2472619
Fix CPU cache usage to 0
WoosukKwon Mar 13, 2023
a9b7f48
Minor fix in plot_stats.py
WoosukKwon Mar 13, 2023
3b901ac
Merge branch 'benchmark' into orca
WoosukKwon Mar 13, 2023
0ee753c
Remove unused
WoosukKwon Mar 13, 2023
cbd3f0f
Add length estimator
WoosukKwon Mar 13, 2023
6c9c1c3
Merge branch 'main' into benchmark
WoosukKwon Mar 13, 2023
adba37c
Update benchmark.py
WoosukKwon Mar 13, 2023
7384e8e
Fix plot_stats.py
WoosukKwon Mar 13, 2023
21113ce
Merge branch 'benchmark' into orca
WoosukKwon Mar 13, 2023
c56cdbc
Add back #input arrivals
WoosukKwon Mar 14, 2023
2cde267
Add beam search in benchmark.py
WoosukKwon Mar 16, 2023
a70ee0c
Minor
WoosukKwon Mar 16, 2023
8970972
Print avg input seqlen
WoosukKwon Mar 17, 2023
5b2c348
Merge branch 'benchmark' into orca
WoosukKwon Mar 17, 2023
c2904f5
Support parallel generation
WoosukKwon Mar 17, 2023
1193bb1
Fix max batch size to avoid deadlock
WoosukKwon Mar 17, 2023
a70843e
Bug fix
WoosukKwon Mar 17, 2023
562845f
Raise assertion error for deadlock
WoosukKwon Mar 17, 2023
ebab871
Add trace generator
WoosukKwon Apr 6, 2023
3de69a4
Add functions to collect stats
WoosukKwon Apr 6, 2023
3bc7fd8
Add main experiment script
WoosukKwon Apr 6, 2023
b5d6073
Add to gitignore
WoosukKwon Apr 6, 2023
7ff16a5
Save more info
WoosukKwon Apr 6, 2023
a0aad23
Minor
WoosukKwon Apr 6, 2023
59b4155
Minor
WoosukKwon Apr 6, 2023
ccb9826
Add timestamps & num_preemption
WoosukKwon Apr 6, 2023
1a6f707
Save arrival & finish time
WoosukKwon Apr 6, 2023
ae21da2
Add script to visualize stats
WoosukKwon Apr 6, 2023
980f9c9
More colors
WoosukKwon Apr 6, 2023
6b4db61
Minor
WoosukKwon Apr 6, 2023
195e7fb
Minor
WoosukKwon Apr 6, 2023
f5350bd
Refactor
WoosukKwon Apr 6, 2023
4f38abb
Add png to gitignore
WoosukKwon Apr 6, 2023
af6e724
Merge branch 'experiment' into orca
WoosukKwon Apr 6, 2023
78f223f
Fix
WoosukKwon Apr 6, 2023
3331d21
Add assertions
WoosukKwon Apr 6, 2023
6eb1708
Minor
WoosukKwon Apr 6, 2023
ee1d960
Add --len-estimator
WoosukKwon Apr 6, 2023
c504646
Fix output_dir
WoosukKwon Apr 6, 2023
4640039
Merge branch 'experiment' into orca
WoosukKwon Apr 6, 2023
4fa0bcd
Implement beam search
WoosukKwon Apr 6, 2023
d618397
Minor
WoosukKwon Apr 6, 2023
d05cd9e
Fix a bug in beam search
WoosukKwon Apr 6, 2023
02b4cc4
Fix output_dir
WoosukKwon Apr 7, 2023
c01e6f1
Merge branch 'experiment' into orca
WoosukKwon Apr 7, 2023
c0381db
Fix bug
WoosukKwon Apr 7, 2023
9e27133
Merge branch 'experiment' into orca
WoosukKwon Apr 7, 2023
435c09a
OPT-60B -> OPT-66B
WoosukKwon Apr 7, 2023
f85a78b
Merge branch 'experiment' into orca
WoosukKwon Apr 7, 2023
5110a64
n8 -> n3
WoosukKwon Apr 7, 2023
0c89cbb
Merge branch 'experiment' into orca
WoosukKwon Apr 7, 2023
ea1729c
Add max-num-sequences
WoosukKwon Apr 7, 2023
e9427b6
Add max-num-sequences
WoosukKwon Apr 7, 2023
a3ab3f6
Shorten sampling dir name
WoosukKwon Apr 7, 2023
11576fc
max_num_seqs 128 -> 256
WoosukKwon Apr 7, 2023
3353a23
Minor
WoosukKwon Apr 7, 2023
6957e65
Merge branch 'experiment' into orca
WoosukKwon Apr 7, 2023
580afe9
Merge branch 'experiment' into orca
WoosukKwon Apr 7, 2023
a832499
Merge branch 'main' into experiment
WoosukKwon Apr 8, 2023
244e6ed
Merge branch 'experiment' into orca
WoosukKwon Apr 8, 2023
69618a3
Fix bug in trace generator
WoosukKwon Apr 8, 2023
f76ace0
Merge branch 'experiment' into orca
WoosukKwon Apr 8, 2023
fe15d81
Add n6 & n6-beam
WoosukKwon Apr 8, 2023
b3b715e
Merge branch 'experiment' into orca
WoosukKwon Apr 8, 2023
16efbce
Merge branch 'main' into experiment
WoosukKwon Apr 9, 2023
613d4e9
Merge branch 'experiment' into orca
WoosukKwon Apr 9, 2023
833c9b9
Add OPT-175B
WoosukKwon Apr 9, 2023
d87c728
Minor
WoosukKwon Apr 9, 2023
bc80eed
Merge branch 'experiment' into orca
WoosukKwon Apr 9, 2023
f29ce01
Merge branch 'experiment' into orca
WoosukKwon Apr 9, 2023
ce1ba32
Collect memory stats
WoosukKwon Apr 12, 2023
683e6a4
Merge branch 'experiment' into orca
WoosukKwon Apr 12, 2023
fb6dce6
Fix bug in buddy allocator
WoosukKwon Apr 12, 2023
7c6b4cb
Add option to collect memory stats
WoosukKwon Apr 12, 2023
53a2114
Merge branch 'main' into orca
WoosukKwon Apr 12, 2023
f7f75f5
Add timeout
WoosukKwon Apr 15, 2023
2a68d56
Merge branch 'main' into orca
WoosukKwon Apr 15, 2023
c542ea1
Fix memory breakdown
WoosukKwon Apr 17, 2023
3b803f1
integrate trace
suquark Apr 17, 2023
3c789de
add orca benchmark
suquark Apr 17, 2023
00ce7dc
restore original benchmark
suquark Apr 17, 2023
cba33f9
add arg
suquark Apr 17, 2023
c5fb212
fix
suquark Apr 17, 2023
811d7d5
fix
suquark Apr 17, 2023
47baaa2
add comments
suquark Apr 17, 2023
d5dc338
update script
suquark Apr 17, 2023
87766dd
update benchmark script
suquark Apr 17, 2023
f5069ee
chmod
suquark Apr 17, 2023
c9a15cd
update benchmark script
suquark Apr 17, 2023
311da11
update benchmark
suquark Apr 17, 2023
951e7ad
print prefix length
suquark Apr 17, 2023
Fix CPU cache usage to 0
WoosukKwon committed Mar 13, 2023
commit 24726191a46d2ff5dbab36119f4fba4843bd6938
4 changes: 1 addition & 3 deletions cacheflow/master/scheduler.py
@@ -208,9 +208,7 @@ def step(self) -> None:
         free_gpu_blocks = self.block_manager.gpu_allocator.get_num_free_blocks()
         self.gpu_blocks_usage.append(
             (self.num_gpu_blocks - free_gpu_blocks) / self.num_gpu_blocks)
-        free_cpu_blocks = self.block_manager.cpu_allocator.get_num_free_blocks()
-        self.cpu_blocks_usage.append(
-            (self.num_cpu_blocks - free_cpu_blocks) / self.num_cpu_blocks)
+        self.cpu_blocks_usage.append(0)
         self.requests_received.append(num_requests)

         self.controllers[0].execute_stage(
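For readers following the change, the per-step bookkeeping in this hunk can be sketched outside the scheduler as below. This is a minimal illustration, not the cacheflow code: `block_usage` and `UsageStats` are hypothetical names, and the guard for a zero-sized pool is an assumption added for the sketch, not something the commit states.

```python
from dataclasses import dataclass, field
from typing import List


def block_usage(num_total_blocks: int, num_free_blocks: int) -> float:
    """Fraction of blocks in use, mirroring (total - free) / total in the diff."""
    if num_total_blocks <= 0:
        # Assumption for this sketch only: treat an empty pool as 0 usage.
        return 0.0
    return (num_total_blocks - num_free_blocks) / num_total_blocks


@dataclass
class UsageStats:
    """Hypothetical stand-in for the per-step lists kept by the scheduler."""
    gpu_blocks_usage: List[float] = field(default_factory=list)
    cpu_blocks_usage: List[float] = field(default_factory=list)

    def record(self, num_gpu_blocks: int, free_gpu_blocks: int) -> None:
        # GPU usage is still computed from the allocator's free-block count.
        self.gpu_blocks_usage.append(
            block_usage(num_gpu_blocks, free_gpu_blocks))
        # As in the commit, CPU usage is recorded as a constant 0.
        self.cpu_blocks_usage.append(0)


stats = UsageStats()
stats.record(num_gpu_blocks=100, free_gpu_blocks=25)
print(stats.gpu_blocks_usage, stats.cpu_blocks_usage)
```

Appending a constant keeps `cpu_blocks_usage` the same length as `gpu_blocks_usage`, so any code that reads the two lists step by step should not need to change.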