[BugFix][Refactor] Fix bugs and refine codes for large scale simulator test #93

Merged: s5u13b merged 59 commits into main from request_timestamps on Feb 19, 2025

@s5u13b s5u13b commented Jan 16, 2025

  1. Simplify the request timestamps implementation and add metrics
  2. Set max-instances for the auto_scale_up loop
  3. Support retrying address binding for the zmq server
  4. Support power-of-k-choice for dispatch
  5. Change num_cpus of ProxyActor from 1 to 0
  6. Fix some bugs: abort in AsyncStream, host in glocal launch mode, simulator in global launch mode
  7. Reorganize simulator files
  8. Reorganize the global_scheduler directory
  9. Reorder manager and launcher functions
  10. Refine ray_env in conftest
  11. Other minor changes
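
For readers unfamiliar with item 4, here is a minimal sketch of the classic power-of-k-choices idea: sample k instances at random and dispatch to the least loaded of them. This is an illustration only, not the code in this PR, whose implementation may differ. The function name `dispatch_power_of_k`, the `instance_loads` mapping, and the `rng` parameter are all hypothetical.

```python
import random
from typing import Dict, Optional


def dispatch_power_of_k(
    instance_loads: Dict[str, float],
    k: int = 2,
    rng: Optional[random.Random] = None,
) -> str:
    """Pick a target instance with the power-of-k-choices heuristic.

    Hypothetical helper: instance_loads maps an instance id to a load
    value where lower means less loaded.
    """
    if not instance_loads:
        raise ValueError("no instances available for dispatch")
    if k < 1:
        raise ValueError("k must be at least 1")
    rng = rng or random.Random()
    # Sample up to k distinct candidates without replacement.
    candidates = rng.sample(sorted(instance_loads), min(k, len(instance_loads)))
    # Dispatch to the least-loaded candidate among the sampled ones.
    return min(candidates, key=lambda instance_id: instance_loads[instance_id])
```

With k equal to the number of instances this degenerates to always picking the globally least-loaded instance, and with k = 1 it is plain random dispatch. Small values of k in between trade a little load imbalance for less herding onto a single instance.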

@s5u13b s5u13b changed the title from "[Observability] Refine request timestamps implementation and add more metrics" to "[WIP][Observability] Refine request timestamps implementation and add more metrics" Jan 16, 2025

prefill p25 p50 p75 p95 p99 mean
latency(ms) 10578.32 73505.90 133588.45 170028.33 171892.32 75987.91
decode p25 p50 p75 p95 p99 mean
latency(ms) 50.86 56.09 70.64 147.21 390.72 73.09
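
The p25 to p99 and mean columns in the latency tables in this thread are standard summary statistics over per-request latencies. The script that produced these tables is not shown here, so the following is only a sketch of how such a row could be computed, assuming linear interpolation between ranks. The helper names `percentile` and `summarize_latencies` are hypothetical.

```python
from typing import Dict, Sequence


def percentile(sorted_values: Sequence[float], q: float) -> float:
    """Linearly interpolated q-th percentile (0 <= q <= 100) of sorted data."""
    if not sorted_values:
        raise ValueError("cannot take a percentile of empty data")
    rank = (len(sorted_values) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = rank - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * frac


def summarize_latencies(latencies_ms: Sequence[float]) -> Dict[str, float]:
    """Return the p25/p50/p75/p95/p99/mean summary for latencies in ms."""
    ordered = sorted(latencies_ms)
    summary = {f"p{q}": percentile(ordered, q) for q in (25, 50, 75, 95, 99)}
    summary["mean"] = sum(ordered) / len(ordered)
    return summary
```

A large gap between p50 and p99, as in the decode rows, indicates a long tail: most requests are fast while a small fraction are much slower.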

migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 232.00 MB 256.00 MB 312.00 MB 352.00 MB 448.00 MB 472.00 MB 528.00 MB
rayrpc_speed(GB/s) 1.05 1.50 1.78 1.93 2.04 2.12 2.15 2.13 2.24 2.31 2.34 2.29 2.45 2.43 2.37 2.43 2.44 2.43 2.53 2.57 2.50 2.52 2.52 2.54 2.56 2.58 2.47 3.04 3.00 2.76 3.10 3.18 2.98
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 224.00 MB 232.00 MB 248.00 MB 264.00 MB 368.00 MB 416.00 MB 472.00 MB
gloo_speed(GB/s) 1.00 1.62 2.08 2.31 2.50 2.67 2.84 2.81 2.98 2.91 2.87 3.12 3.24 3.14 3.37 2.86 2.85 2.75 2.62 2.22 2.29 2.82 2.12 4.19 2.79 3.45 3.33 2.84 2.99 2.81 1.40 2.95 2.79 0.88

@s5u13b s5u13b changed the title from "[WIP][Observability] Refine request timestamps implementation and add more metrics" to "[Observability] Refine request timestamps implementation and add more metrics" Jan 17, 2025
@s5u13b s5u13b force-pushed the request_timestamps branch from 5501476 to 8ec7ba7 Compare January 20, 2025 04:27

prefill p25 p50 p75 p95 p99 mean
latency(ms) 15971.81 75729.48 117620.09 161436.13 189060.75 71825.76
decode p25 p50 p75 p95 p99 mean
latency(ms) 52.30 56.66 69.52 117.12 372.92 80.02

migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 232.00 MB 264.00 MB 336.00 MB 384.00 MB 472.00 MB 544.00 MB
rayrpc_speed(GB/s) 1.03 1.51 1.74 1.93 2.02 2.12 2.15 2.24 2.22 2.25 2.32 2.34 2.45 2.48 2.50 2.43 2.45 2.40 2.45 2.56 2.49 2.56 2.61 2.55 2.49 2.70 2.55 2.59 2.81 3.08 3.01 3.31 3.28
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 232.00 MB 240.00 MB 256.00 MB 368.00 MB
gloo_speed(GB/s) 1.04 1.74 2.22 2.47 2.64 2.81 3.12 3.20 3.18 3.41 3.54 3.20 3.54 3.40 3.37 3.17 3.14 2.44 2.73 2.66 3.06 3.07 1.88 3.21 2.99 2.35 2.04 4.90 2.55 0.48 2.44

prefill p25 p50 p75 p95 p99 mean
latency(ms) 3505.96 75484.03 129908.60 162583.37 185617.71 72493.51
decode p25 p50 p75 p95 p99 mean
latency(ms) 51.51 55.91 70.52 121.72 346.31 73.13

migration_size 1.59 GB 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 224.00 MB 240.00 MB 256.00 MB 280.00 MB
rayrpc_speed(GB/s) 3.91 1.05 1.54 1.81 1.91 2.02 2.11 2.17 2.25 2.33 2.36 2.45 2.45 2.36 2.44 2.55 2.52 2.36 2.58 2.53 2.55 2.63 2.75 2.50 2.17 2.63 2.70 2.80 2.72 2.68 2.87 2.76
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 280.00 MB 448.00 MB
gloo_speed(GB/s) 1.00 1.68 2.06 2.30 2.60 2.82 2.94 2.94 2.86 2.89 2.96 3.54 3.13 3.25 3.63 2.73 2.52 2.39 2.72 2.77 2.66 2.38 3.13 2.53 3.04 1.61 2.69 2.30 1.35

prefill p25 p50 p75 p95 p99 mean
latency(ms) 15731.92 68576.88 122544.35 191770.90 192504.54 76629.42
decode p25 p50 p75 p95 p99 mean
latency(ms) 49.42 54.94 68.49 107.64 222.68 65.71

migration_size 1.59 GB 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 224.00 MB 232.00 MB 256.00 MB 264.00 MB 288.00 MB 560.00 MB
rayrpc_speed(GB/s) 3.50 1.03 1.53 1.78 1.92 1.99 2.07 2.11 2.14 2.22 2.18 2.29 2.25 2.35 2.26 2.44 2.40 2.51 2.47 2.47 2.57 2.48 2.58 2.46 2.54 2.36 2.46 2.71 2.74 2.72 2.68 2.61 2.99 3.29
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 216.00 MB 224.00 MB 312.00 MB 336.00 MB 416.00 MB
gloo_speed(GB/s) 0.99 1.66 2.09 2.26 2.41 2.76 2.83 2.96 3.10 2.95 3.27 3.46 3.26 3.39 3.31 2.67 3.21 3.18 2.19 2.86 1.66 2.88 0.92 2.74 3.43 3.01 3.12 1.80 1.45 0.95

migration_size 1.59 GB 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 224.00 MB 232.00 MB 240.00 MB 264.00 MB 272.00 MB 280.00 MB 328.00 MB 376.00 MB 416.00 MB 600.00 MB 656.00 MB
rayrpc_speed(GB/s) 3.45 1.03 1.52 1.80 1.92 2.07 2.10 2.13 2.18 2.27 2.29 2.29 2.34 2.37 2.46 2.42 2.47 2.52 2.49 2.52 2.56 2.59 2.54 2.53 2.33 2.44 2.51 2.71 2.62 2.59 2.69 2.67 2.62 2.80 3.00 3.10 3.23 3.22 3.64
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 200.00 MB 224.00 MB 232.00 MB 256.00 MB 416.00 MB
gloo_speed(GB/s) 1.04 1.74 2.13 2.46 2.73 2.91 2.94 3.02 3.00 3.36 3.07 3.42 3.25 3.42 3.12 2.86 3.01 3.14 2.20 3.14 2.23 3.94 1.98 2.90 3.12 4.37 3.33 3.39

prefill p25 p50 p75 p95 p99 mean
latency(ms) 14642.17 80519.75 130939.61 181272.94 181582.98 80402.36
decode p25 p50 p75 p95 p99 mean
latency(ms) 50.12 53.89 62.69 104.63 190.77 63.56

prefill p25 p50 p75 p95 p99 mean
latency(ms) 15635.69 74445.50 125064.97 185430.89 210975.15 76822.74
decode p25 p50 p75 p95 p99 mean
latency(ms) 50.45 54.65 65.08 95.30 191.98 62.26

migration_size 1.59 GB 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 208.00 MB 224.00 MB 232.00 MB 248.00 MB 264.00 MB 272.00 MB 280.00 MB 408.00 MB 416.00 MB 536.00 MB
rayrpc_speed(GB/s) 3.58 1.02 1.50 1.77 1.96 1.99 2.04 2.12 2.11 2.21 2.29 2.27 2.34 2.29 2.37 2.40 2.41 2.45 2.50 2.49 2.42 2.47 2.58 2.61 2.56 2.52 2.77 2.81 2.90 2.66 2.72 3.04 3.37 3.10 3.27
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 224.00 MB 272.00 MB 400.00 MB
gloo_speed(GB/s) 1.02 1.67 2.09 2.41 2.68 2.68 2.87 2.90 2.88 3.02 2.98 3.06 3.41 3.23 3.51 2.62 2.81 2.62 2.88 2.28 2.04 1.93 3.07 2.39 2.51 1.04 3.10 1.29

@s5u13b s5u13b changed the title from "[Observability] Refine request timestamps implementation and add more metrics" to "[Observability] Refine request timestamps implementation and add more metrics & Reorg simulator files" Jan 21, 2025

migration_size 1.59 GB 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 224.00 MB 232.00 MB 240.00 MB 264.00 MB 352.00 MB 480.00 MB 568.00 MB
rayrpc_speed(GB/s) 3.65 1.02 1.51 1.73 1.90 2.03 2.02 2.12 2.15 2.21 2.22 2.28 2.29 2.34 2.41 2.42 2.43 2.37 2.45 2.50 2.49 2.48 2.51 2.55 2.52 2.56 2.66 2.48 2.88 2.55 2.66 2.78 3.12 3.43
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 200.00 MB 224.00 MB 232.00 MB 376.00 MB
gloo_speed(GB/s) 0.99 1.66 2.07 2.39 2.53 2.79 2.87 2.99 2.85 3.22 3.23 3.57 3.22 3.23 3.15 2.95 2.23 2.55 2.74 3.05 1.87 2.17 3.44 3.00 3.41 3.23 2.92

prefill p25 p50 p75 p95 p99 mean
latency(ms) 15374.72 81117.49 134885.05 171695.47 188812.34 76513.39
decode p25 p50 p75 p95 p99 mean
latency(ms) 51.41 55.83 67.09 104.81 298.59 67.25

@s5u13b s5u13b changed the title from "[Observability] Refine request timestamps implementation and add more metrics & Reorg simulator files" to "[Observability] Refine request timestamps implementation and add more metrics" Jan 22, 2025

migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 232.00 MB 248.00 MB 264.00 MB 280.00 MB 336.00 MB 360.00 MB 480.00 MB 544.00 MB
rayrpc_speed(GB/s) 1.03 1.54 1.80 1.89 2.03 2.06 2.15 2.14 2.28 2.22 2.38 2.30 2.38 2.38 2.39 2.46 2.43 2.33 2.42 2.56 2.46 2.66 2.33 2.62 2.38 2.74 2.59 2.73 2.79 2.92 2.89 3.17 2.99 3.15 3.22
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 232.00 MB 312.00 MB 328.00 MB 432.00 MB
gloo_speed(GB/s) 1.03 1.72 2.17 2.39 2.56 2.85 3.03 2.96 3.02 3.12 3.13 3.15 3.26 2.93 3.43 2.73 2.86 3.03 2.64 2.70 2.42 3.27 2.04 3.46 2.44 2.72 2.09 3.18 2.30 3.43

prefill p25 p50 p75 p95 p99 mean
latency(ms) 16465.44 78088.89 130741.60 166650.61 173568.97 76939.03
decode p25 p50 p75 p95 p99 mean
latency(ms) 50.74 55.46 64.29 100.58 165.65 62.31

@s5u13b s5u13b force-pushed the request_timestamps branch from 1a49a08 to d2894ca Compare February 7, 2025 11:38

github-actions bot commented Feb 7, 2025

prefill p25 p50 p75 p95 p99 mean
latency(ms) 3106.97 4274.23 26191.15 88149.30 123252.74 21229.58
decode p25 p50 p75 p95 p99 mean
latency(ms) 71.95 106.97 150.60 1924.35 23199.80 1105.10

github-actions bot commented Feb 7, 2025

migration_size 1.59 GB 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 200.00 MB 208.00 MB 248.00 MB 256.00 MB 272.00 MB 280.00 MB 312.00 MB 320.00 MB
rayrpc_speed(GB/s) 3.69 0.89 1.36 1.64 1.83 1.93 2.07 2.15 2.13 2.20 2.27 2.25 2.32 2.31 2.34 2.41 2.41 2.41 2.57 2.44 2.63 2.68 2.51 2.59 2.69 2.69 2.76 2.96 2.78 2.93 3.12 2.96
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 232.00 MB 312.00 MB 392.00 MB
gloo_speed(GB/s) 0.87 1.45 1.88 2.11 2.24 2.41 2.42 2.44 2.67 2.64 2.45 2.78 2.65 2.70 2.60 2.39 2.61 1.74 1.75 2.64 2.11 2.91 2.92 0.91 2.71 2.67 2.52 2.29 2.42
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 224.00 MB 232.00 MB 280.00 MB 320.00 MB 400.00 MB 464.00 MB
nccl_speed(GB/s) 0.20 0.40 0.67 0.81 0.96 1.21 1.44 1.59 1.66 2.04 1.92 2.13 1.98 2.06 2.34 2.34 2.78 3.03 2.56 3.00 3.30 2.97 3.68 2.10 3.64 2.97 1.98 2.84 4.06 3.62

@s5u13b s5u13b force-pushed the request_timestamps branch from 00d3273 to 2c4cc50 Compare February 12, 2025 07:57
@s5u13b s5u13b changed the title from "[Observability] Refine request timestamps implementation and add more metrics" to "[Observability][GlobalScheduler][BugFix] Simplify request timestamps implementation & Support power-of-k-choice for dispatch & Fix some bugs and refine codes for large scale simulator test" Feb 12, 2025
@s5u13b s5u13b changed the title from "[Observability][GlobalScheduler][BugFix] Simplify request timestamps implementation & Support power-of-k-choice for dispatch & Fix some bugs and refine codes for large scale simulator test" to "[BugFix][Refactor] Fix some bugs and refine codes for large scale simulator test" Feb 12, 2025
@s5u13b s5u13b requested a review from zhypku February 12, 2025 08:34
commit 48c674b
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 09:41:05 2025 +0000

    Fix lint

commit 322862b
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 09:39:31 2025 +0000

    Fix entrypoints unit test

commit 75af824
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 08:07:26 2025 +0000

    Fix lint

commit 2818c8d
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 08:06:08 2025 +0000

    Fix cr

commit a172468
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 07:01:07 2025 +0000

    Fix lint

commit 3f863b2
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 06:54:18 2025 +0000

    Add back timestamp

commit 2e53b24
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 06:45:16 2025 +0000

    Fix lint

commit eea1a3a
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 06:37:30 2025 +0000

    Add back timestamps

commit b4a45ef
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 06:21:48 2025 +0000

    Remove old filter

commit f2df197
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 06:12:53 2025 +0000

    Add _process_model_outputs back

commit a51cf25
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 03:46:45 2025 +0000

    Fix abort

commit 1058ec0
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 02:43:14 2025 +0000

    Remove blank todo

commit 670018e
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 02:36:27 2025 +0000

    Filter out migrating request

commit fa2fc9c
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Jan 24 06:25:35 2025 +0000

    Remove process_model_outputs request timestamps

commit 2a980ca
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Jan 24 06:10:49 2025 +0000

    Fix linting

commit 78a1ab4
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Jan 24 05:30:15 2025 +0000

    Fix request leaking bug of migration

commit 774205b
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Jan 24 03:11:08 2025 +0000

    Fix

commit 814521e
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Jan 24 02:57:20 2025 +0000

    Minors

commit b3f0688
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Jan 24 01:56:09 2025 +0000

    Change ci timeout-minutes
@s5u13b s5u13b force-pushed the request_timestamps branch from 703d4da to 58e2647 Compare February 19, 2025 07:19

prefill p25 p50 p75 p95 p99 mean
latency(ms) 2338.43 3059.49 4426.76 80621.01 125077.36 17401.38
decode p25 p50 p75 p95 p99 mean
latency(ms) 87.50 120.55 178.58 1040.41 1692.18 238.98

migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 224.00 MB 232.00 MB 248.00 MB 256.00 MB 272.00 MB 280.00 MB 312.00 MB 320.00 MB 352.00 MB 456.00 MB
rayrpc_speed(GB/s) 0.90 1.34 1.66 1.81 1.89 2.00 2.12 2.14 2.18 2.38 2.37 2.33 2.37 2.30 2.33 2.44 2.46 2.54 2.41 2.57 2.60 2.55 2.59 2.61 2.92 2.75 2.65 2.72 2.86 2.74 2.98 2.99 2.24 2.69 2.89 2.97 2.83
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 192.00 MB 200.00 MB 232.00 MB 240.00 MB 280.00 MB 312.00 MB 320.00 MB
gloo_speed(GB/s) 0.87 1.45 1.86 2.07 2.16 2.44 2.65 2.65 2.70 2.79 2.77 2.77 2.81 2.64 3.19 2.63 2.47 2.20 2.90 2.45 2.55 0.63 2.90 1.89 2.55 2.96 2.38 0.42 2.97
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 216.00 MB 232.00 MB 248.00 MB 280.00 MB
nccl_speed(GB/s) 0.20 0.50 0.74 0.88 1.09 1.22 1.45 1.55 1.72 1.84 2.11 2.28 2.24 1.81 2.78 2.49 2.86 2.79 2.59 2.94 2.54 2.66 3.45 3.36 3.93 2.03 2.08 3.61 1.21

@s5u13b s5u13b changed the title from "[BugFix][Refactor] Fix some bugs and refine codes for large scale simulator test" to "[BugFix][Refactor] Fix bugs and refine codes for large scale simulator test" Feb 19, 2025
@s5u13b s5u13b merged commit ff26344 into main Feb 19, 2025
14 checks passed
@s5u13b s5u13b deleted the request_timestamps branch February 19, 2025 10:38