[BugFix] Pass type directly during reduction #1223

Open
wants to merge 4 commits into
base: gh/vmoens/47/base
from

@vmoens (Contributor) commented Feb 19, 2025

[ghstack-poisoned]
@facebook-github-bot added the CLA Signed label Feb 19, 2025
vmoens added a commit that referenced this pull request Feb 19, 2025
ghstack-source-id: 6e10ea39a5e74e66f052f06d7709044e32ae01dd
Pull Request resolved: #1223
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 19, 2025
ghstack-source-id: 2a0f011758991f07958b2b1742d3d2136b6e9fb8
Pull Request resolved: #1223

github-actions bot commented Feb 19, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}18$. Worsened: $\large\color{#d91a1a}6$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 39.8850μs 20.5048μs 48.7690 KOps/s 47.7345 KOps/s $\color{#35bf28}+2.17\%$
test_plain_set_stack_nested 47.9800μs 20.4873μs 48.8106 KOps/s 47.5427 KOps/s $\color{#35bf28}+2.67\%$
test_plain_set_nested_inplace 60.6610μs 22.4488μs 44.5459 KOps/s 44.1894 KOps/s $\color{#35bf28}+0.81\%$
test_plain_set_stack_nested_inplace 77.0040μs 22.2510μs 44.9419 KOps/s 43.0229 KOps/s $\color{#35bf28}+4.46\%$
test_items 18.6850μs 4.1943μs 238.4177 KOps/s 234.6094 KOps/s $\color{#35bf28}+1.62\%$
test_items_nested 0.5353ms 0.4044ms 2.4731 KOps/s 2.4532 KOps/s $\color{#35bf28}+0.81\%$
test_items_nested_locked 0.5807ms 0.4052ms 2.4678 KOps/s 2.4770 KOps/s $\color{#d91a1a}-0.37\%$
test_items_nested_leaf 0.1337ms 75.3616μs 13.2694 KOps/s 12.8418 KOps/s $\color{#35bf28}+3.33\%$
test_items_stack_nested 0.7252ms 0.4078ms 2.4521 KOps/s 2.4292 KOps/s $\color{#35bf28}+0.94\%$
test_items_stack_nested_leaf 0.1379ms 78.8650μs 12.6799 KOps/s 12.8380 KOps/s $\color{#d91a1a}-1.23\%$
test_items_stack_nested_locked 0.7015ms 0.4085ms 2.4478 KOps/s 2.4416 KOps/s $\color{#35bf28}+0.25\%$
test_keys 15.4890μs 4.1186μs 242.8023 KOps/s 286.1688 KOps/s $\textbf{\color{#d91a1a}-15.15\%}$
test_keys_nested 0.2660ms 0.1618ms 6.1810 KOps/s 6.1381 KOps/s $\color{#35bf28}+0.70\%$
test_keys_nested_locked 1.6958ms 0.1686ms 5.9315 KOps/s 5.9164 KOps/s $\color{#35bf28}+0.26\%$
test_keys_nested_leaf 0.2366ms 0.1418ms 7.0521 KOps/s 7.0685 KOps/s $\color{#d91a1a}-0.23\%$
test_keys_stack_nested 0.2650ms 0.1637ms 6.1082 KOps/s 6.0949 KOps/s $\color{#35bf28}+0.22\%$
test_keys_stack_nested_leaf 0.2606ms 0.1428ms 7.0022 KOps/s 6.9933 KOps/s $\color{#35bf28}+0.13\%$
test_keys_stack_nested_locked 0.2433ms 0.1697ms 5.8940 KOps/s 5.8850 KOps/s $\color{#35bf28}+0.15\%$
test_values 4.9912μs 1.0635μs 940.2531 KOps/s 640.5968 KOps/s $\textbf{\color{#35bf28}+46.78\%}$
test_values_nested 0.1083ms 61.6920μs 16.2096 KOps/s 16.3225 KOps/s $\color{#d91a1a}-0.69\%$
test_values_nested_locked 0.1114ms 61.3355μs 16.3038 KOps/s 16.2468 KOps/s $\color{#35bf28}+0.35\%$
test_values_nested_leaf 0.1375ms 70.5952μs 14.1653 KOps/s 14.2925 KOps/s $\color{#d91a1a}-0.89\%$
test_values_stack_nested 0.1193ms 62.4272μs 16.0187 KOps/s 16.3552 KOps/s $\color{#d91a1a}-2.06\%$
test_values_stack_nested_leaf 0.1245ms 70.0448μs 14.2766 KOps/s 13.6116 KOps/s $\color{#35bf28}+4.89\%$
test_values_stack_nested_locked 0.1121ms 62.4228μs 16.0198 KOps/s 16.3534 KOps/s $\color{#d91a1a}-2.04\%$
test_membership 22.4620μs 0.9017μs 1.1090 MOps/s 1.1534 MOps/s $\color{#d91a1a}-3.85\%$
test_membership_nested 27.9020μs 2.9280μs 341.5293 KOps/s 345.2147 KOps/s $\color{#d91a1a}-1.07\%$
test_membership_nested_leaf 41.8480μs 2.9282μs 341.5123 KOps/s 346.5580 KOps/s $\color{#d91a1a}-1.46\%$
test_membership_stacked_nested 17.4430μs 2.9595μs 337.8905 KOps/s 345.8242 KOps/s $\color{#d91a1a}-2.29\%$
test_membership_stacked_nested_leaf 22.8030μs 2.9589μs 337.9690 KOps/s 344.3473 KOps/s $\color{#d91a1a}-1.85\%$
test_membership_nested_last 49.1820μs 4.3672μs 228.9773 KOps/s 232.2639 KOps/s $\color{#d91a1a}-1.42\%$
test_membership_nested_leaf_last 46.5870μs 4.3846μs 228.0731 KOps/s 233.1887 KOps/s $\color{#d91a1a}-2.19\%$
test_membership_stacked_nested_last 24.8560μs 4.3764μs 228.4992 KOps/s 228.2309 KOps/s $\color{#35bf28}+0.12\%$
test_membership_stacked_nested_leaf_last 30.8280μs 4.4269μs 225.8896 KOps/s 232.3913 KOps/s $\color{#d91a1a}-2.80\%$
test_nested_getleaf 47.8800μs 10.6021μs 94.3209 KOps/s 93.8512 KOps/s $\color{#35bf28}+0.50\%$
test_nested_get 46.6070μs 10.1372μs 98.6466 KOps/s 100.5741 KOps/s $\color{#d91a1a}-1.92\%$
test_stacked_getleaf 40.2460μs 10.5398μs 94.8785 KOps/s 95.1630 KOps/s $\color{#d91a1a}-0.30\%$
test_stacked_get 52.9990μs 9.9516μs 100.4863 KOps/s 100.1030 KOps/s $\color{#35bf28}+0.38\%$
test_nested_getitemleaf 51.8370μs 11.1267μs 89.8743 KOps/s 91.0583 KOps/s $\color{#d91a1a}-1.30\%$
test_nested_getitem 0.1469ms 10.7018μs 93.4423 KOps/s 95.0640 KOps/s $\color{#d91a1a}-1.71\%$
test_stacked_getitemleaf 0.3815ms 11.3572μs 88.0502 KOps/s 90.8626 KOps/s $\color{#d91a1a}-3.10\%$
test_stacked_getitem 52.4380μs 10.6809μs 93.6251 KOps/s 95.0921 KOps/s $\color{#d91a1a}-1.54\%$
test_lock_nested 0.6633ms 0.4047ms 2.4707 KOps/s 2.4389 KOps/s $\color{#35bf28}+1.31\%$
test_lock_stack_nested 0.9342ms 0.4141ms 2.4152 KOps/s 2.3462 KOps/s $\color{#35bf28}+2.94\%$
test_unlock_nested 0.5425ms 0.3274ms 3.0548 KOps/s 2.9764 KOps/s $\color{#35bf28}+2.63\%$
test_unlock_stack_nested 0.5196ms 0.3339ms 2.9951 KOps/s 2.9095 KOps/s $\color{#35bf28}+2.94\%$
test_flatten_speed 0.1923ms 98.5486μs 10.1473 KOps/s 9.8732 KOps/s $\color{#35bf28}+2.78\%$
test_unflatten_speed 0.8536ms 0.5190ms 1.9267 KOps/s 1.9408 KOps/s $\color{#d91a1a}-0.73\%$
test_common_ops 4.8152ms 0.7831ms 1.2771 KOps/s 1.2657 KOps/s $\color{#35bf28}+0.90\%$
test_creation 27.7920μs 2.5141μs 397.7565 KOps/s 394.4056 KOps/s $\color{#35bf28}+0.85\%$
test_creation_empty 52.6250μs 11.8516μs 84.3765 KOps/s 79.2883 KOps/s $\textbf{\color{#35bf28}+6.42\%}$
test_creation_nested_1 42.4490μs 14.5569μs 68.6959 KOps/s 64.8173 KOps/s $\textbf{\color{#35bf28}+5.98\%}$
test_creation_nested_2 43.4310μs 19.0819μs 52.4057 KOps/s 46.4253 KOps/s $\textbf{\color{#35bf28}+12.88\%}$
test_clone 70.7920μs 13.3816μs 74.7293 KOps/s 75.0554 KOps/s $\color{#d91a1a}-0.43\%$
test_getitem[int] 0.8463ms 12.7611μs 78.3629 KOps/s 78.3668 KOps/s $-0.00\%$
test_getitem[slice_int] 0.1275ms 24.1345μs 41.4345 KOps/s 42.1752 KOps/s $\color{#d91a1a}-1.76\%$
test_getitem[range] 0.1664ms 50.5470μs 19.7836 KOps/s 20.1707 KOps/s $\color{#d91a1a}-1.92\%$
test_getitem[tuple] 0.1612ms 20.4325μs 48.9416 KOps/s 48.6161 KOps/s $\color{#35bf28}+0.67\%$
test_getitem[list] 0.1801ms 45.0474μs 22.1989 KOps/s 22.0837 KOps/s $\color{#35bf28}+0.52\%$
test_setitem_dim[int] 56.9760μs 24.9631μs 40.0591 KOps/s 39.5264 KOps/s $\color{#35bf28}+1.35\%$
test_setitem_dim[slice_int] 0.1022ms 50.0825μs 19.9671 KOps/s 19.8976 KOps/s $\color{#35bf28}+0.35\%$
test_setitem_dim[range] 0.1275ms 75.6612μs 13.2168 KOps/s 13.1740 KOps/s $\color{#35bf28}+0.32\%$
test_setitem_dim[tuple] 71.2020μs 39.6721μs 25.2066 KOps/s 24.5678 KOps/s $\color{#35bf28}+2.60\%$
test_setitem 78.5560μs 20.1849μs 49.5419 KOps/s 47.1589 KOps/s $\textbf{\color{#35bf28}+5.05\%}$
test_set 68.1780μs 19.5965μs 51.0295 KOps/s 48.6505 KOps/s $\color{#35bf28}+4.89\%$
test_set_shared 4.2196ms 0.1792ms 5.5808 KOps/s 5.5045 KOps/s $\color{#35bf28}+1.39\%$
test_update 0.1152ms 22.8413μs 43.7803 KOps/s 41.1430 KOps/s $\textbf{\color{#35bf28}+6.41\%}$
test_update_nested 84.0970μs 33.2093μs 30.1121 KOps/s 28.3662 KOps/s $\textbf{\color{#35bf28}+6.15\%}$
test_update__nested 0.5037ms 33.7272μs 29.6496 KOps/s 30.3214 KOps/s $\color{#d91a1a}-2.22\%$
test_set_nested 77.7120μs 21.5919μs 46.3137 KOps/s 40.4833 KOps/s $\textbf{\color{#35bf28}+14.40\%}$
test_set_nested_new 78.9780μs 26.3499μs 37.9508 KOps/s 36.8924 KOps/s $\color{#35bf28}+2.87\%$
test_select 98.7050μs 42.7016μs 23.4183 KOps/s 23.6909 KOps/s $\color{#d91a1a}-1.15\%$
test_select_nested 0.1261ms 63.3203μs 15.7927 KOps/s 16.0330 KOps/s $\color{#d91a1a}-1.50\%$
test_exclude_nested 0.1375ms 81.5693μs 12.2595 KOps/s 12.5335 KOps/s $\color{#d91a1a}-2.19\%$
test_empty[True] 0.7392ms 0.4123ms 2.4256 KOps/s 2.5035 KOps/s $\color{#d91a1a}-3.11\%$
test_empty[False] 11.6540μs 1.3610μs 734.7709 KOps/s 728.5810 KOps/s $\color{#35bf28}+0.85\%$
test_unbind_speed 0.3539ms 0.2698ms 3.7068 KOps/s 3.6723 KOps/s $\color{#35bf28}+0.94\%$
test_unbind_speed_stack0 0.5228ms 0.2671ms 3.7437 KOps/s 3.6828 KOps/s $\color{#35bf28}+1.65\%$
test_unbind_speed_stack1 0.1011s 0.7221ms 1.3849 KOps/s 1.2197 KOps/s $\textbf{\color{#35bf28}+13.55\%}$
test_split 0.1006s 1.7536ms 570.2457 Ops/s 574.4426 Ops/s $\color{#d91a1a}-0.73\%$
test_chunk 0.1021s 1.7507ms 571.1974 Ops/s 631.7694 Ops/s $\textbf{\color{#d91a1a}-9.59\%}$
test_consolidate_njt[False-None] 8.4555ms 8.2012ms 121.9327 Ops/s 109.4120 Ops/s $\textbf{\color{#35bf28}+11.44\%}$
test_creation[device0] 0.2223ms 91.4560μs 10.9342 KOps/s 10.7784 KOps/s $\color{#35bf28}+1.45\%$
test_creation_from_tensor 3.6087ms 94.8949μs 10.5380 KOps/s 10.3299 KOps/s $\color{#35bf28}+2.01\%$
test_add_one[memmap_tensor0] 0.1132ms 5.0658μs 197.4009 KOps/s 188.1869 KOps/s $\color{#35bf28}+4.90\%$
test_contiguous[memmap_tensor0] 10.8500μs 0.5149μs 1.9423 MOps/s 1.9459 MOps/s $\color{#d91a1a}-0.18\%$
test_stack[memmap_tensor0] 25.6570μs 3.4219μs 292.2347 KOps/s 284.6706 KOps/s $\color{#35bf28}+2.66\%$
test_memmaptd_index 0.9541ms 0.2273ms 4.3993 KOps/s 4.3199 KOps/s $\color{#35bf28}+1.84\%$
test_memmaptd_index_astensor 0.4549ms 0.3123ms 3.2017 KOps/s 3.1796 KOps/s $\color{#35bf28}+0.69\%$
test_memmaptd_index_op 1.3660ms 0.5825ms 1.7167 KOps/s 1.6608 KOps/s $\color{#35bf28}+3.37\%$
test_serialize_model 0.1287s 0.1155s 8.6610 Ops/s 8.6532 Ops/s $\color{#35bf28}+0.09\%$
test_serialize_model_pickle 0.4465s 0.3913s 2.5558 Ops/s 2.5180 Ops/s $\color{#35bf28}+1.50\%$
test_serialize_weights 0.1248s 0.1146s 8.7260 Ops/s 8.8421 Ops/s $\color{#d91a1a}-1.31\%$
test_serialize_weights_returnearly 0.1745s 0.1590s 6.2882 Ops/s 6.1184 Ops/s $\color{#35bf28}+2.78\%$
test_serialize_weights_pickle 0.5389s 0.4752s 2.1045 Ops/s 1.2207 Ops/s $\textbf{\color{#35bf28}+72.39\%}$
test_serialize_weights_filesystem 0.2393s 0.1545s 6.4711 Ops/s 7.0373 Ops/s $\textbf{\color{#d91a1a}-8.05\%}$
test_serialize_model_filesystem 0.1582s 0.1477s 6.7715 Ops/s 6.8512 Ops/s $\color{#d91a1a}-1.16\%$
test_reshape_pytree 0.2958ms 27.8503μs 35.9062 KOps/s 38.5249 KOps/s $\textbf{\color{#d91a1a}-6.80\%}$
test_reshape_td 76.2120μs 32.4102μs 30.8545 KOps/s 31.1467 KOps/s $\color{#d91a1a}-0.94\%$
test_view_pytree 78.0440μs 25.4872μs 39.2354 KOps/s 38.3536 KOps/s $\color{#35bf28}+2.30\%$
test_view_td 90.7100μs 40.1389μs 24.9135 KOps/s 24.7819 KOps/s $\color{#35bf28}+0.53\%$
test_unbind_pytree 66.3540μs 29.2150μs 34.2289 KOps/s 34.0076 KOps/s $\color{#35bf28}+0.65\%$
test_unbind_td 0.3017ms 39.5112μs 25.3093 KOps/s 24.0675 KOps/s $\textbf{\color{#35bf28}+5.16\%}$
test_split_pytree 86.9130μs 28.7654μs 34.7639 KOps/s 32.9528 KOps/s $\textbf{\color{#35bf28}+5.50\%}$
test_split_td 0.1969ms 45.1837μs 22.1319 KOps/s 22.6602 KOps/s $\color{#d91a1a}-2.33\%$
test_add_pytree 77.6150μs 35.4837μs 28.1820 KOps/s 28.0182 KOps/s $\color{#35bf28}+0.58\%$
test_add_td 0.2189ms 57.8149μs 17.2966 KOps/s 17.5886 KOps/s $\color{#d91a1a}-1.66\%$
test_compile_add_one_nested[tensordict-compile] 0.1639ms 67.1738μs 14.8867 KOps/s 15.1355 KOps/s $\color{#d91a1a}-1.64\%$
test_compile_add_one_nested[tensordict-eager] 0.3321ms 0.1695ms 5.9011 KOps/s 5.8407 KOps/s $\color{#35bf28}+1.04\%$
test_compile_add_one_nested[pytree-compile] 0.1430ms 46.0206μs 21.7294 KOps/s 21.6704 KOps/s $\color{#35bf28}+0.27\%$
test_compile_add_one_nested[pytree-eager] 0.2602ms 0.1179ms 8.4784 KOps/s 8.4505 KOps/s $\color{#35bf28}+0.33\%$
test_compile_copy_nested[tensordict-compile] 83.0350μs 28.2003μs 35.4606 KOps/s 35.8259 KOps/s $\color{#d91a1a}-1.02\%$
test_compile_copy_nested[tensordict-eager] 0.1128ms 57.5370μs 17.3801 KOps/s 17.2363 KOps/s $\color{#35bf28}+0.83\%$
test_compile_copy_nested[pytree-compile] 0.1606ms 80.2090μs 12.4674 KOps/s 12.3672 KOps/s $\color{#35bf28}+0.81\%$
test_compile_copy_nested[pytree-eager] 0.1259ms 65.5538μs 15.2546 KOps/s 14.9079 KOps/s $\color{#35bf28}+2.33\%$
test_compile_add_one_flat[tensordict-compile] 0.1786ms 0.1077ms 9.2831 KOps/s 9.3678 KOps/s $\color{#d91a1a}-0.90\%$
test_compile_add_one_flat[tensordict-eager] 0.6671ms 0.2184ms 4.5789 KOps/s 4.4129 KOps/s $\color{#35bf28}+3.76\%$
test_compile_add_one_flat[tensorclass-compile] 0.1100ms 47.6021μs 21.0075 KOps/s 21.7331 KOps/s $\color{#d91a1a}-3.34\%$
test_compile_add_one_flat[tensorclass-eager] 0.7277ms 68.6080μs 14.5756 KOps/s 14.6421 KOps/s $\color{#d91a1a}-0.45\%$
test_compile_add_one_flat[pytree-compile] 0.1782ms 0.1008ms 9.9186 KOps/s 9.9629 KOps/s $\color{#d91a1a}-0.44\%$
test_compile_add_one_flat[pytree-eager] 1.1920ms 0.2062ms 4.8485 KOps/s 4.9660 KOps/s $\color{#d91a1a}-2.37\%$
test_compile_add_self_flat[tensordict-eager] 0.6111ms 0.2357ms 4.2430 KOps/s 4.3106 KOps/s $\color{#d91a1a}-1.57\%$
test_compile_add_self_flat[tensordict-compile] 0.1950ms 0.1079ms 9.2693 KOps/s 9.1988 KOps/s $\color{#35bf28}+0.77\%$
test_compile_add_self_flat[tensorclass-eager] 0.6001ms 62.2908μs 16.0537 KOps/s 16.1418 KOps/s $\color{#d91a1a}-0.55\%$
test_compile_add_self_flat[tensorclass-compile] 0.1009ms 48.8834μs 20.4568 KOps/s 20.3219 KOps/s $\color{#35bf28}+0.66\%$
test_compile_add_self_flat[pytree-eager] 0.2406ms 0.1577ms 6.3418 KOps/s 6.3404 KOps/s $\color{#35bf28}+0.02\%$
test_compile_add_self_flat[pytree-compile] 0.2623ms 0.1018ms 9.8217 KOps/s 9.9723 KOps/s $\color{#d91a1a}-1.51\%$
test_compile_copy_flat[tensordict-compile] 53.8110μs 21.5170μs 46.4749 KOps/s 46.2204 KOps/s $\color{#35bf28}+0.55\%$
test_compile_copy_flat[tensordict-eager] 0.1302ms 66.5840μs 15.0186 KOps/s 15.0179 KOps/s $+0.00\%$
test_compile_copy_flat[pytree-compile] 0.5049ms 86.1777μs 11.6039 KOps/s 12.1669 KOps/s $\color{#d91a1a}-4.63\%$
test_compile_copy_flat[pytree-eager] 0.1398ms 67.6328μs 14.7857 KOps/s 14.2558 KOps/s $\color{#35bf28}+3.72\%$
test_compile_assign_and_add[tensordict-compile] 0.7922ms 0.2143ms 4.6668 KOps/s 4.6176 KOps/s $\color{#35bf28}+1.07\%$
test_compile_assign_and_add[tensordict-eager] 1.8737ms 1.3583ms 736.2081 Ops/s 727.1543 Ops/s $\color{#35bf28}+1.25\%$
test_compile_assign_and_add[pytree-compile] 0.3062ms 0.2119ms 4.7191 KOps/s 4.7600 KOps/s $\color{#d91a1a}-0.86\%$
test_compile_assign_and_add[pytree-eager] 0.8908ms 0.8135ms 1.2293 KOps/s 1.2231 KOps/s $\color{#35bf28}+0.51\%$
test_compile_assign_and_add_stack[compile] 0.6383ms 0.4578ms 2.1842 KOps/s 2.2034 KOps/s $\color{#d91a1a}-0.87\%$
test_compile_assign_and_add_stack[eager] 2.9344ms 2.7140ms 368.4594 Ops/s 356.3522 Ops/s $\color{#35bf28}+3.40\%$
test_compile_indexing[tensor-tensordict-compile] 0.1051ms 38.3125μs 26.1012 KOps/s 26.8075 KOps/s $\color{#d91a1a}-2.63\%$
test_compile_indexing[tensor-tensordict-eager] 0.5736ms 33.5731μs 29.7858 KOps/s 29.2009 KOps/s $\color{#35bf28}+2.00\%$
test_compile_indexing[tensor-tensorclass-compile] 79.5690μs 32.2906μs 30.9688 KOps/s 32.6874 KOps/s $\textbf{\color{#d91a1a}-5.26\%}$
test_compile_indexing[tensor-tensorclass-eager] 0.3034ms 23.2104μs 43.0840 KOps/s 43.8223 KOps/s $\color{#d91a1a}-1.68\%$
test_compile_indexing[tensor-pytree-compile] 0.1009ms 31.9755μs 31.2739 KOps/s 32.0431 KOps/s $\color{#d91a1a}-2.40\%$
test_compile_indexing[tensor-pytree-eager] 84.1670μs 23.7056μs 42.1842 KOps/s 43.6599 KOps/s $\color{#d91a1a}-3.38\%$
test_compile_indexing[slice-tensordict-compile] 0.1304ms 54.5326μs 18.3377 KOps/s 19.5877 KOps/s $\textbf{\color{#d91a1a}-6.38\%}$
test_compile_indexing[slice-tensordict-eager] 0.3654ms 20.2705μs 49.3327 KOps/s 49.6106 KOps/s $\color{#d91a1a}-0.56\%$
test_compile_indexing[slice-tensorclass-compile] 0.1086ms 46.8924μs 21.3254 KOps/s 22.4418 KOps/s $\color{#d91a1a}-4.97\%$
test_compile_indexing[slice-tensorclass-eager] 60.4520μs 18.6948μs 53.4909 KOps/s 54.4442 KOps/s $\color{#d91a1a}-1.75\%$
test_compile_indexing[slice-pytree-compile] 0.1089ms 47.5922μs 21.0118 KOps/s 22.0481 KOps/s $\color{#d91a1a}-4.70\%$
test_compile_indexing[slice-pytree-eager] 59.8310μs 18.7344μs 53.3779 KOps/s 53.7702 KOps/s $\color{#d91a1a}-0.73\%$
test_compile_indexing[int-tensordict-compile] 0.1398ms 54.9686μs 18.1922 KOps/s 18.4957 KOps/s $\color{#d91a1a}-1.64\%$
test_compile_indexing[int-tensordict-eager] 0.8875ms 20.1624μs 49.5972 KOps/s 50.2575 KOps/s $\color{#d91a1a}-1.31\%$
test_compile_indexing[int-tensorclass-compile] 0.1080ms 47.5647μs 21.0240 KOps/s 22.0219 KOps/s $\color{#d91a1a}-4.53\%$
test_compile_indexing[int-tensorclass-eager] 54.1210μs 18.6478μs 53.6256 KOps/s 54.2379 KOps/s $\color{#d91a1a}-1.13\%$
test_compile_indexing[int-pytree-compile] 0.5884ms 47.3604μs 21.1147 KOps/s 21.3672 KOps/s $\color{#d91a1a}-1.18\%$
test_compile_indexing[int-pytree-eager] 74.6190μs 18.7988μs 53.1949 KOps/s 54.7519 KOps/s $\color{#d91a1a}-2.84\%$
test_mod_add[eager] 0.1045ms 34.6082μs 28.8949 KOps/s 29.1898 KOps/s $\color{#d91a1a}-1.01\%$
test_mod_add[compile] 0.1532ms 63.8922μs 15.6514 KOps/s 15.6641 KOps/s $\color{#d91a1a}-0.08\%$
test_mod_add[compile-overhead] 0.1346ms 63.1456μs 15.8364 KOps/s 16.0312 KOps/s $\color{#d91a1a}-1.22\%$
test_mod_wrap[eager] 0.4522ms 0.2205ms 4.5342 KOps/s 4.3958 KOps/s $\color{#35bf28}+3.15\%$
test_mod_wrap[compile] 1.6467ms 0.2227ms 4.4904 KOps/s 4.2943 KOps/s $\color{#35bf28}+4.57\%$
test_mod_wrap[compile-overhead] 0.4411ms 0.2216ms 4.5124 KOps/s 4.1871 KOps/s $\textbf{\color{#35bf28}+7.77\%}$
test_mod_wrap_and_backward[eager] 15.2552ms 11.4430ms 87.3894 Ops/s 90.6309 Ops/s $\color{#d91a1a}-3.58\%$
test_mod_wrap_and_backward[compile] 13.8527ms 11.2881ms 88.5887 Ops/s 91.9739 Ops/s $\color{#d91a1a}-3.68\%$
test_mod_wrap_and_backward[compile-overhead] 15.8753ms 11.3926ms 87.7760 Ops/s 90.6443 Ops/s $\color{#d91a1a}-3.16\%$
test_seq_add[eager] 0.1999ms 0.1170ms 8.5465 KOps/s 8.6379 KOps/s $\color{#d91a1a}-1.06\%$
test_seq_add[compile] 0.1768ms 75.9183μs 13.1720 KOps/s 13.5876 KOps/s $\color{#d91a1a}-3.06\%$
test_seq_add[compile-overhead] 0.3011ms 71.7022μs 13.9466 KOps/s 13.8431 KOps/s $\color{#35bf28}+0.75\%$
test_seq_wrap[eager] 0.6083ms 0.4401ms 2.2725 KOps/s 2.2243 KOps/s $\color{#35bf28}+2.17\%$
test_seq_wrap[compile] 0.4655ms 0.2377ms 4.2074 KOps/s 4.1371 KOps/s $\color{#35bf28}+1.70\%$
test_seq_wrap[compile-overhead] 0.3355ms 0.2363ms 4.2314 KOps/s 4.1749 KOps/s $\color{#35bf28}+1.35\%$
test_func_call_runtime[False-eager] 0.9513ms 0.5441ms 1.8378 KOps/s 1.7666 KOps/s $\color{#35bf28}+4.03\%$
test_func_call_runtime[False-compile] 0.5980ms 0.4383ms 2.2813 KOps/s 2.2526 KOps/s $\color{#35bf28}+1.27\%$
test_func_call_runtime[False-compile-overhead] 0.5205ms 0.4379ms 2.2836 KOps/s 2.2593 KOps/s $\color{#35bf28}+1.07\%$
test_func_call_runtime[True-eager] 0.8961ms 0.7570ms 1.3210 KOps/s 1.2957 KOps/s $\color{#35bf28}+1.95\%$
test_func_call_runtime[True-compile] 2.1094ms 0.4628ms 2.1607 KOps/s 2.1418 KOps/s $\color{#35bf28}+0.88\%$
test_func_call_runtime[True-compile-overhead] 0.8775ms 0.4604ms 2.1718 KOps/s 2.1334 KOps/s $\color{#35bf28}+1.80\%$
test_func_call_cm_runtime[False-eager] 0.9403ms 0.5410ms 1.8483 KOps/s 1.8098 KOps/s $\color{#35bf28}+2.13\%$
test_func_call_cm_runtime[False-compile] 0.5755ms 0.4370ms 2.2885 KOps/s 2.2558 KOps/s $\color{#35bf28}+1.45\%$
test_func_call_cm_runtime[False-compile-overhead] 0.7459ms 0.4375ms 2.2858 KOps/s 2.2508 KOps/s $\color{#35bf28}+1.56\%$
test_func_call_cm_runtime[True-eager] 1.4318ms 0.8946ms 1.1178 KOps/s 1.0896 KOps/s $\color{#35bf28}+2.59\%$
test_func_call_cm_runtime[True-compile] 0.9547ms 0.7962ms 1.2560 KOps/s 1.2076 KOps/s $\color{#35bf28}+4.01\%$
test_func_call_cm_runtime[True-compile-overhead] 1.0039ms 0.7990ms 1.2515 KOps/s 1.2029 KOps/s $\color{#35bf28}+4.04\%$
test_vmap_func_call_cm_runtime[eager] 3.0207ms 1.8909ms 528.8469 Ops/s 515.5253 Ops/s $\color{#35bf28}+2.58\%$
test_vmap_func_call_cm_runtime[compile] 0.9374ms 0.5334ms 1.8749 KOps/s 1.8646 KOps/s $\color{#35bf28}+0.55\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.8625ms 0.5313ms 1.8821 KOps/s 1.8679 KOps/s $\color{#35bf28}+0.76\%$
test_distributed 0.2245ms 0.1232ms 8.1140 KOps/s 7.8014 KOps/s $\color{#35bf28}+4.01\%$
test_tdmodule 0.5294ms 26.7239μs 37.4197 KOps/s 36.5522 KOps/s $\color{#35bf28}+2.37\%$
test_tdmodule_dispatch 74.0280μs 47.8735μs 20.8884 KOps/s 20.2603 KOps/s $\color{#35bf28}+3.10\%$
test_tdseq 45.2850μs 27.8031μs 35.9672 KOps/s 33.6787 KOps/s $\textbf{\color{#35bf28}+6.80\%}$
test_tdseq_dispatch 0.1255ms 54.5873μs 18.3193 KOps/s 18.4617 KOps/s $\color{#d91a1a}-0.77\%$
test_instantiation_functorch 2.3550ms 1.5107ms 661.9292 Ops/s 660.5450 Ops/s $\color{#35bf28}+0.21\%$
test_exec_functorch 0.3438ms 0.1783ms 5.6073 KOps/s 5.5380 KOps/s $\color{#35bf28}+1.25\%$
test_exec_functional_call 0.3164ms 0.1737ms 5.7570 KOps/s 5.7275 KOps/s $\color{#35bf28}+0.52\%$
test_exec_td_decorator 0.4943ms 0.2351ms 4.2538 KOps/s 4.2530 KOps/s $\color{#35bf28}+0.02\%$
test_vmap_mlp_speed_decorator[True-True] 0.8579ms 0.6548ms 1.5272 KOps/s 1.5093 KOps/s $\color{#35bf28}+1.19\%$
test_vmap_mlp_speed_decorator[True-False] 1.0240ms 0.6633ms 1.5076 KOps/s 1.4806 KOps/s $\color{#35bf28}+1.82\%$
test_vmap_mlp_speed_decorator[False-True] 0.7429ms 0.5251ms 1.9045 KOps/s 1.8674 KOps/s $\color{#35bf28}+1.99\%$
test_vmap_mlp_speed_decorator[False-False] 0.9551ms 0.5288ms 1.8912 KOps/s 1.8654 KOps/s $\color{#35bf28}+1.39\%$
test_to_module_speed[True] 1.8557ms 1.3050ms 766.3047 Ops/s 764.5533 Ops/s $\color{#35bf28}+0.23\%$
test_to_module_speed[False] 1.4532ms 1.2788ms 781.9869 Ops/s 775.6743 Ops/s $\color{#35bf28}+0.81\%$
test_tc_init 82.9350μs 45.3374μs 22.0568 KOps/s 21.6296 KOps/s $\color{#35bf28}+1.98\%$
test_tc_init_nested 0.1551ms 89.1290μs 11.2197 KOps/s 10.8892 KOps/s $\color{#35bf28}+3.03\%$
test_tc_first_layer_tensor 18.3040μs 1.5699μs 636.9640 KOps/s 645.6686 KOps/s $\color{#d91a1a}-1.35\%$
test_tc_first_layer_nontensor 25.1870μs 4.7424μs 210.8631 KOps/s 213.8098 KOps/s $\color{#d91a1a}-1.38\%$
test_tc_second_layer_tensor 42.5200μs 2.9598μs 337.8656 KOps/s 350.0351 KOps/s $\color{#d91a1a}-3.48\%$
test_tc_second_layer_nontensor 27.5920μs 6.1993μs 161.3088 KOps/s 166.2020 KOps/s $\color{#d91a1a}-2.94\%$
test_unbind 0.2332s 12.8928ms 77.5630 Ops/s 68.1804 Ops/s $\textbf{\color{#35bf28}+13.76\%}$
test_full_like 8.8565ms 6.5159ms 153.4701 Ops/s 133.3698 Ops/s $\textbf{\color{#35bf28}+15.07\%}$
test_zeros_like 5.0391ms 2.6270ms 380.6604 Ops/s 222.6627 Ops/s $\textbf{\color{#35bf28}+70.96\%}$
test_ones_like 4.6784ms 3.2041ms 312.0983 Ops/s 320.7666 Ops/s $\color{#d91a1a}-2.70\%$
test_clone 6.0001ms 4.7928ms 208.6451 Ops/s 205.8061 Ops/s $\color{#35bf28}+1.38\%$
test_squeeze 59.7510μs 12.1628μs 82.2181 KOps/s 78.8345 KOps/s $\color{#35bf28}+4.29\%$
test_unsqueeze 0.1650ms 90.4950μs 11.0503 KOps/s 10.6375 KOps/s $\color{#35bf28}+3.88\%$
test_split 0.4795ms 0.1933ms 5.1745 KOps/s 5.2276 KOps/s $\color{#d91a1a}-1.01\%$
test_permute 0.4128ms 0.1975ms 5.0635 KOps/s 4.9725 KOps/s $\color{#35bf28}+1.83\%$
test_stack 26.8243ms 24.4554ms 40.8908 Ops/s 40.0737 Ops/s $\color{#35bf28}+2.04\%$
test_cat 44.8406ms 25.1761ms 39.7203 Ops/s 40.5086 Ops/s $\color{#d91a1a}-1.95\%$
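
For readers checking the numbers, the Change column above appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below reproduces that arithmetic for three rows of the CPU table. The function names are hypothetical, and the 5% cutoff for counting a benchmark as improved or worsened is an assumption inferred from which rows are rendered in bold (18 bold gains and 6 bold losses, matching the summary line); it is not taken from the benchmark workflow itself.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption, not a documented value."""
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) in KOps/s, copied from the CPU table above.
rows = {
    "test_values": (940.2531, 640.5968),
    "test_keys": (242.8023, 286.1688),
    "test_items": (238.4177, 234.6094),
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under these assumptions the three rows come out at +46.78%, -15.15% and +1.62%, matching the table, with only the first two crossing the assumed threshold.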

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 19, 2025
ghstack-source-id: 9ca5bf2a5bc1f3fd88c29360fb088836ce35e8a7
Pull Request resolved: #1223

github-actions bot commented Feb 19, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}11$. Worsened: $\large\color{#d91a1a}7$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 33.1300μs 12.5570μs 79.6366 KOps/s 79.1701 KOps/s $\color{#35bf28}+0.59\%$
test_plain_set_stack_nested 45.4200μs 12.6355μs 79.1420 KOps/s 78.5162 KOps/s $\color{#35bf28}+0.80\%$
test_plain_set_nested_inplace 38.6700μs 13.6957μs 73.0155 KOps/s 72.6685 KOps/s $\color{#35bf28}+0.48\%$
test_plain_set_stack_nested_inplace 42.0900μs 13.5604μs 73.7442 KOps/s 73.2678 KOps/s $\color{#35bf28}+0.65\%$
test_items 25.1000μs 2.8982μs 345.0403 KOps/s 330.5454 KOps/s $\color{#35bf28}+4.39\%$
test_items_nested 0.4142ms 0.3652ms 2.7384 KOps/s 2.7136 KOps/s $\color{#35bf28}+0.91\%$
test_items_nested_locked 0.4066ms 0.3665ms 2.7285 KOps/s 2.7319 KOps/s $\color{#d91a1a}-0.13\%$
test_items_nested_leaf 89.7210μs 60.4545μs 16.5414 KOps/s 16.5910 KOps/s $\color{#d91a1a}-0.30\%$
test_items_stack_nested 0.4035ms 0.3650ms 2.7396 KOps/s 2.7501 KOps/s $\color{#d91a1a}-0.38\%$
test_items_stack_nested_leaf 88.0410μs 62.2274μs 16.0701 KOps/s 15.9832 KOps/s $\color{#35bf28}+0.54\%$
test_items_stack_nested_locked 0.4184ms 0.3637ms 2.7493 KOps/s 2.7532 KOps/s $\color{#d91a1a}-0.14\%$
test_keys 29.0100μs 3.4499μs 289.8641 KOps/s 290.4680 KOps/s $\color{#d91a1a}-0.21\%$
test_keys_nested 0.1247ms 88.1331μs 11.3465 KOps/s 11.3732 KOps/s $\color{#d91a1a}-0.24\%$
test_keys_nested_locked 0.7403ms 94.0278μs 10.6351 KOps/s 10.7402 KOps/s $\color{#d91a1a}-0.98\%$
test_keys_nested_leaf 0.1022ms 79.5002μs 12.5786 KOps/s 12.6613 KOps/s $\color{#d91a1a}-0.65\%$
test_keys_stack_nested 0.1184ms 89.4472μs 11.1798 KOps/s 11.3032 KOps/s $\color{#d91a1a}-1.09\%$
test_keys_stack_nested_leaf 0.1076ms 80.6979μs 12.3919 KOps/s 12.5131 KOps/s $\color{#d91a1a}-0.97\%$
test_keys_stack_nested_locked 0.1217ms 95.0318μs 10.5228 KOps/s 10.5205 KOps/s $\color{#35bf28}+0.02\%$
test_values 6.0683μs 0.8604μs 1.1622 MOps/s 1.1699 MOps/s $\color{#d91a1a}-0.66\%$
test_values_nested 61.5910μs 37.2829μs 26.8219 KOps/s 26.6680 KOps/s $\color{#35bf28}+0.58\%$
test_values_nested_locked 64.1610μs 39.0512μs 25.6074 KOps/s 25.4169 KOps/s $\color{#35bf28}+0.75\%$
test_values_nested_leaf 67.8510μs 42.2505μs 23.6684 KOps/s 23.5142 KOps/s $\color{#35bf28}+0.66\%$
test_values_stack_nested 65.9000μs 38.2393μs 26.1511 KOps/s 26.4199 KOps/s $\color{#d91a1a}-1.02\%$
test_values_stack_nested_leaf 65.2110μs 42.7629μs 23.3848 KOps/s 23.3657 KOps/s $\color{#35bf28}+0.08\%$
test_values_stack_nested_locked 66.5010μs 39.8273μs 25.1084 KOps/s 25.2497 KOps/s $\color{#d91a1a}-0.56\%$
test_membership 2.0295μs 0.5263μs 1.9000 MOps/s 1.8530 MOps/s $\color{#35bf28}+2.53\%$
test_membership_nested 37.9600μs 2.1031μs 475.4785 KOps/s 473.2621 KOps/s $\color{#35bf28}+0.47\%$
test_membership_nested_leaf 19.5600μs 2.0648μs 484.3063 KOps/s 491.6508 KOps/s $\color{#d91a1a}-1.49\%$
test_membership_stacked_nested 34.8300μs 2.1225μs 471.1487 KOps/s 467.1690 KOps/s $\color{#35bf28}+0.85\%$
test_membership_stacked_nested_leaf 32.0600μs 2.1245μs 470.6892 KOps/s 467.5773 KOps/s $\color{#35bf28}+0.67\%$
test_membership_nested_last 26.7910μs 3.1389μs 318.5842 KOps/s 320.0244 KOps/s $\color{#d91a1a}-0.45\%$
test_membership_nested_leaf_last 41.8800μs 3.1386μs 318.6179 KOps/s 319.9374 KOps/s $\color{#d91a1a}-0.41\%$
test_membership_stacked_nested_last 31.2200μs 4.1182μs 242.8248 KOps/s 242.4356 KOps/s $\color{#35bf28}+0.16\%$
test_membership_stacked_nested_leaf_last 32.1000μs 4.1326μs 241.9756 KOps/s 243.7340 KOps/s $\color{#d91a1a}-0.72\%$
test_nested_getleaf 33.1000μs 6.2424μs 160.1949 KOps/s 162.3213 KOps/s $\color{#d91a1a}-1.31\%$
test_nested_get 33.3300μs 5.9340μs 168.5207 KOps/s 170.4022 KOps/s $\color{#d91a1a}-1.10\%$
test_stacked_getleaf 49.0910μs 6.3298μs 157.9827 KOps/s 164.0702 KOps/s $\color{#d91a1a}-3.71\%$
test_stacked_get 86.6010μs 5.7952μs 172.5558 KOps/s 171.2912 KOps/s $\color{#35bf28}+0.74\%$
test_nested_getitemleaf 34.0700μs 6.3767μs 156.8206 KOps/s 154.2281 KOps/s $\color{#35bf28}+1.68\%$
test_nested_getitem 29.9910μs 6.1189μs 163.4291 KOps/s 165.1074 KOps/s $\color{#d91a1a}-1.02\%$
test_stacked_getitemleaf 37.6910μs 6.3736μs 156.8961 KOps/s 155.4861 KOps/s $\color{#35bf28}+0.91\%$
test_stacked_getitem 37.4500μs 6.0214μs 166.0747 KOps/s 166.7937 KOps/s $\color{#d91a1a}-0.43\%$
test_lock_nested 0.4010ms 0.3367ms 2.9702 KOps/s 2.9679 KOps/s $\color{#35bf28}+0.08\%$
test_lock_stack_nested 0.4189ms 0.3420ms 2.9242 KOps/s 2.8863 KOps/s $\color{#35bf28}+1.31\%$
test_unlock_nested 0.3493ms 0.2816ms 3.5507 KOps/s 3.5313 KOps/s $\color{#35bf28}+0.55\%$
test_unlock_stack_nested 0.3111ms 0.2809ms 3.5598 KOps/s 3.5172 KOps/s $\color{#35bf28}+1.21\%$
test_flatten_speed 0.1129ms 77.2906μs 12.9382 KOps/s 12.8772 KOps/s $\color{#35bf28}+0.47\%$
test_unflatten_speed 0.3673ms 0.3231ms 3.0948 KOps/s 3.1082 KOps/s $\color{#d91a1a}-0.43\%$
test_common_ops 0.7480ms 0.6189ms 1.6159 KOps/s 1.6137 KOps/s $\color{#35bf28}+0.13\%$
test_creation 0.1267ms 1.7607μs 567.9610 KOps/s 566.9679 KOps/s $\color{#35bf28}+0.18\%$
test_creation_empty 38.2300μs 8.4816μs 117.9026 KOps/s 115.0093 KOps/s $\color{#35bf28}+2.52\%$
test_creation_nested_1 37.1700μs 10.1111μs 98.9014 KOps/s 96.1851 KOps/s $\color{#35bf28}+2.82\%$
test_creation_nested_2 48.7710μs 13.0141μs 76.8397 KOps/s 76.1703 KOps/s $\color{#35bf28}+0.88\%$
test_clone 51.2910μs 10.5303μs 94.9644 KOps/s 92.5007 KOps/s $\color{#35bf28}+2.66\%$
test_getitem[int] 1.1259ms 10.4895μs 95.3333 KOps/s 94.4960 KOps/s $\color{#35bf28}+0.89\%$
test_getitem[slice_int] 0.1076ms 20.6850μs 48.3443 KOps/s 48.3081 KOps/s $\color{#35bf28}+0.07\%$
test_getitem[range] 0.1319ms 38.3790μs 26.0559 KOps/s 26.2403 KOps/s $\color{#d91a1a}-0.70\%$
test_getitem[tuple] 0.1048ms 17.9670μs 55.6576 KOps/s 55.4325 KOps/s $\color{#35bf28}+0.41\%$
test_getitem[list] 0.1545ms 35.6608μs 28.0420 KOps/s 27.9282 KOps/s $\color{#35bf28}+0.41\%$
test_setitem_dim[int] 46.9400μs 19.2545μs 51.9358 KOps/s 48.4558 KOps/s $\textbf{\color{#35bf28}+7.18\%}$
test_setitem_dim[slice_int] 69.4410μs 38.8755μs 25.7231 KOps/s 24.8987 KOps/s $\color{#35bf28}+3.31\%$
test_setitem_dim[range] 76.3110μs 54.2038μs 18.4489 KOps/s 17.6594 KOps/s $\color{#35bf28}+4.47\%$
test_setitem_dim[tuple] 53.5410μs 33.1225μs 30.1910 KOps/s 29.7849 KOps/s $\color{#35bf28}+1.36\%$
test_setitem 60.1300μs 15.2768μs 65.4586 KOps/s 64.6335 KOps/s $\color{#35bf28}+1.28\%$
test_set 79.0910μs 14.6560μs 68.2315 KOps/s 66.2499 KOps/s $\color{#35bf28}+2.99\%$
test_set_shared 0.5071ms 0.1572ms 6.3610 KOps/s 6.2850 KOps/s $\color{#35bf28}+1.21\%$
test_update 0.3195ms 17.7533μs 56.3275 KOps/s 54.8607 KOps/s $\color{#35bf28}+2.67\%$
test_update_nested 70.3900μs 23.4074μs 42.7215 KOps/s 41.6193 KOps/s $\color{#35bf28}+2.65\%$
test_update__nested 0.5006ms 25.8303μs 38.7143 KOps/s 38.6343 KOps/s $\color{#35bf28}+0.21\%$
test_set_nested 76.8300μs 16.0207μs 62.4193 KOps/s 61.1013 KOps/s $\color{#35bf28}+2.16\%$
test_set_nested_new 79.9810μs 18.2057μs 54.9280 KOps/s 53.7992 KOps/s $\color{#35bf28}+2.10\%$
test_select 78.9510μs 30.2554μs 33.0519 KOps/s 32.2510 KOps/s $\color{#35bf28}+2.48\%$
test_select_nested 73.6810μs 43.7972μs 22.8325 KOps/s 22.6436 KOps/s $\color{#35bf28}+0.83\%$
test_exclude_nested 88.4910μs 63.0202μs 15.8679 KOps/s 15.7881 KOps/s $\color{#35bf28}+0.51\%$
test_empty[True] 0.3350ms 0.2952ms 3.3881 KOps/s 3.3974 KOps/s $\color{#d91a1a}-0.27\%$
test_empty[False] 2.1315μs 0.8444μs 1.1842 MOps/s 1.1652 MOps/s $\color{#35bf28}+1.63\%$
test_to 86.6310μs 56.1034μs 17.8242 KOps/s 16.5078 KOps/s $\textbf{\color{#35bf28}+7.97\%}$
test_to_nonblocking 77.7000μs 47.0734μs 21.2434 KOps/s 21.4596 KOps/s $\color{#d91a1a}-1.01\%$
test_unbind_speed 0.2998ms 0.2409ms 4.1506 KOps/s 4.1543 KOps/s $\color{#d91a1a}-0.09\%$
test_unbind_speed_stack0 0.3064ms 0.2393ms 4.1787 KOps/s 4.1166 KOps/s $\color{#35bf28}+1.51\%$
test_unbind_speed_stack1 92.9789ms 0.7306ms 1.3687 KOps/s 1.3563 KOps/s $\color{#35bf28}+0.91\%$
test_split 1.5723ms 1.4528ms 688.3268 Ops/s 626.6267 Ops/s $\textbf{\color{#35bf28}+9.85\%}$
test_chunk 94.7964ms 1.7451ms 573.0374 Ops/s 624.4282 Ops/s $\textbf{\color{#d91a1a}-8.23\%}$
test_consolidate[False-None] 2.8562ms 2.7772ms 360.0725 Ops/s 365.1427 Ops/s $\color{#d91a1a}-1.39\%$
test_consolidate[default-None] 1.8133ms 1.7286ms 578.5015 Ops/s 590.5399 Ops/s $\color{#d91a1a}-2.04\%$
test_consolidate[reduce-overhead-None] 1.8570ms 1.7768ms 562.8189 Ops/s 576.4568 Ops/s $\color{#d91a1a}-2.37\%$
test_consolidate_njt[False-None] 6.9526ms 6.5928ms 151.6813 Ops/s 152.0302 Ops/s $\color{#d91a1a}-0.23\%$
test_to[False-False-None] 1.8097ms 1.7298ms 578.0873 Ops/s 587.5137 Ops/s $\color{#d91a1a}-1.60\%$
test_to[True-False-None] 1.6087ms 1.3876ms 720.6923 Ops/s 731.1790 Ops/s $\color{#d91a1a}-1.43\%$
test_to[within-False-None] 0.2951s 5.4883ms 182.2062 Ops/s 237.8970 Ops/s $\textbf{\color{#d91a1a}-23.41\%}$
test_to[True-default-None] 5.7136ms 5.3258ms 187.7655 Ops/s 188.2238 Ops/s $\color{#d91a1a}-0.24\%$
test_to_njt[False-False-None] 7.1418ms 6.9989ms 142.8804 Ops/s 143.9772 Ops/s $\color{#d91a1a}-0.76\%$
test_to_njt[True-False-None] 5.7181ms 5.5740ms 179.4031 Ops/s 178.5933 Ops/s $\color{#35bf28}+0.45\%$
test_to_njt[within-False-None] 13.2585ms 12.4269ms 80.4709 Ops/s 81.8274 Ops/s $\color{#d91a1a}-1.66\%$
test_creation[device0] 0.4637ms 80.1888μs 12.4706 KOps/s 12.5774 KOps/s $\color{#d91a1a}-0.85\%$
test_creation_from_tensor 0.4783ms 86.7402μs 11.5287 KOps/s 11.5858 KOps/s $\color{#d91a1a}-0.49\%$
test_add_one[memmap_tensor0] 0.4686ms 6.7563μs 148.0091 KOps/s 148.0512 KOps/s $\color{#d91a1a}-0.03\%$
test_contiguous[memmap_tensor0] 1.9370μs 0.4188μs 2.3875 MOps/s 2.3119 MOps/s $\color{#35bf28}+3.27\%$
test_stack[memmap_tensor0] 41.1810μs 4.2818μs 233.5457 KOps/s 231.6412 KOps/s $\color{#35bf28}+0.82\%$
test_memmaptd_index 1.7834ms 0.2469ms 4.0498 KOps/s 4.1237 KOps/s $\color{#d91a1a}-1.79\%$
test_memmaptd_index_astensor 0.4394ms 0.3084ms 3.2431 KOps/s 3.3087 KOps/s $\color{#d91a1a}-1.98\%$
test_memmaptd_index_op 0.7409ms 0.5884ms 1.6996 KOps/s 1.7032 KOps/s $\color{#d91a1a}-0.21\%$
test_serialize_model 0.1307s 0.1301s 7.6875 Ops/s 7.6686 Ops/s $\color{#35bf28}+0.25\%$
test_serialize_model_pickle 1.3494s 1.2096s 0.8267 Ops/s 0.8263 Ops/s $\color{#35bf28}+0.05\%$
test_serialize_weights 0.2782s 0.1507s 6.6376 Ops/s 7.7093 Ops/s $\textbf{\color{#d91a1a}-13.90\%}$
test_serialize_weights_returnearly 0.3337s 54.1093ms 18.4811 Ops/s 15.6169 Ops/s $\textbf{\color{#35bf28}+18.34\%}$
test_serialize_weights_pickle 1.3483s 1.1830s 0.8453 Ops/s 0.8224 Ops/s $\color{#35bf28}+2.79\%$
test_reshape_pytree 58.5300μs 22.5463μs 44.3532 KOps/s 44.0217 KOps/s $\color{#35bf28}+0.75\%$
test_reshape_td 63.5510μs 27.2206μs 36.7369 KOps/s 36.3202 KOps/s $\color{#35bf28}+1.15\%$
test_view_pytree 48.9500μs 21.8831μs 45.6973 KOps/s 44.7474 KOps/s $\color{#35bf28}+2.12\%$
test_view_td 65.1700μs 32.7944μs 30.4930 KOps/s 28.1435 KOps/s $\textbf{\color{#35bf28}+8.35\%}$
test_unbind_pytree 53.5600μs 27.7152μs 36.0813 KOps/s 35.2475 KOps/s $\color{#35bf28}+2.37\%$
test_unbind_td 0.5329ms 37.3236μs 26.7927 KOps/s 26.4049 KOps/s $\color{#35bf28}+1.47\%$
test_split_pytree 55.8200μs 29.7693μs 33.5917 KOps/s 33.7084 KOps/s $\color{#d91a1a}-0.35\%$
test_split_td 0.7255ms 38.6030μs 25.9048 KOps/s 25.2477 KOps/s $\color{#35bf28}+2.60\%$
test_add_pytree 72.0210μs 34.6800μs 28.8351 KOps/s 28.0797 KOps/s $\color{#35bf28}+2.69\%$
test_add_td 87.6810μs 48.4438μs 20.6425 KOps/s 17.5370 KOps/s $\textbf{\color{#35bf28}+17.71\%}$
test_compile_add_one_nested[tensordict-compile] 0.1803ms 0.1221ms 8.1899 KOps/s 7.8578 KOps/s $\color{#35bf28}+4.23\%$
test_compile_add_one_nested[tensordict-eager] 0.2296ms 0.1338ms 7.4757 KOps/s 7.4044 KOps/s $\color{#35bf28}+0.96\%$
test_compile_add_one_nested[pytree-compile] 0.1379ms 95.9642μs 10.4206 KOps/s 10.2090 KOps/s $\color{#35bf28}+2.07\%$
test_compile_add_one_nested[pytree-eager] 0.9772ms 0.1516ms 6.5981 KOps/s 6.6628 KOps/s $\color{#d91a1a}-0.97\%$
test_compile_copy_nested[tensordict-compile] 77.6610μs 24.5493μs 40.7344 KOps/s 40.0518 KOps/s $\color{#35bf28}+1.70\%$
test_compile_copy_nested[tensordict-eager] 62.9500μs 29.4923μs 33.9072 KOps/s 33.5207 KOps/s $\color{#35bf28}+1.15\%$
test_compile_copy_nested[pytree-compile] 0.3793ms 63.8359μs 15.6652 KOps/s 15.2968 KOps/s $\color{#35bf28}+2.41\%$
test_compile_copy_nested[pytree-eager] 79.7900μs 48.9212μs 20.4411 KOps/s 20.2183 KOps/s $\color{#35bf28}+1.10\%$
test_compile_add_one_flat[tensordict-compile] 0.1874ms 0.1447ms 6.9129 KOps/s 6.9668 KOps/s $\color{#d91a1a}-0.77\%$
test_compile_add_one_flat[tensordict-eager] 0.3060ms 0.2169ms 4.6108 KOps/s 4.6596 KOps/s $\color{#d91a1a}-1.05\%$
test_compile_add_one_flat[tensorclass-compile] 0.1371ms 99.5618μs 10.0440 KOps/s 10.1670 KOps/s $\color{#d91a1a}-1.21\%$
test_compile_add_one_flat[tensorclass-eager] 0.1140ms 55.5123μs 18.0140 KOps/s 17.7005 KOps/s $\color{#35bf28}+1.77\%$
test_compile_add_one_flat[pytree-compile] 0.1821ms 0.1369ms 7.3047 KOps/s 7.3466 KOps/s $\color{#d91a1a}-0.57\%$
test_compile_add_one_flat[pytree-eager] 0.5877ms 0.4879ms 2.0496 KOps/s 2.0730 KOps/s $\color{#d91a1a}-1.13\%$
test_compile_add_self_flat[tensordict-eager] 0.3875ms 0.2602ms 3.8434 KOps/s 3.8411 KOps/s $\color{#35bf28}+0.06\%$
test_compile_add_self_flat[tensordict-compile] 0.1884ms 0.1462ms 6.8394 KOps/s 6.9587 KOps/s $\color{#d91a1a}-1.71\%$
test_compile_add_self_flat[tensorclass-eager] 0.1802ms 71.2530μs 14.0345 KOps/s 14.6954 KOps/s $\color{#d91a1a}-4.50\%$
test_compile_add_self_flat[tensorclass-compile] 0.1614ms 98.8583μs 10.1155 KOps/s 10.0645 KOps/s $\color{#35bf28}+0.51\%$
test_compile_add_self_flat[pytree-eager] 0.4499ms 0.4117ms 2.4292 KOps/s 2.4748 KOps/s $\color{#d91a1a}-1.84\%$
test_compile_add_self_flat[pytree-compile] 0.1739ms 0.1352ms 7.3951 KOps/s 7.3896 KOps/s $\color{#35bf28}+0.07\%$
test_compile_copy_flat[tensordict-compile] 49.0610μs 18.7013μs 53.4723 KOps/s 53.2675 KOps/s $\color{#35bf28}+0.38\%$
test_compile_copy_flat[tensordict-eager] 65.1500μs 31.5165μs 31.7294 KOps/s 31.8655 KOps/s $\color{#d91a1a}-0.43\%$
test_compile_copy_flat[pytree-compile] 0.1145ms 70.5343μs 14.1775 KOps/s 14.3641 KOps/s $\color{#d91a1a}-1.30\%$
test_compile_copy_flat[pytree-eager] 0.2387ms 52.5451μs 19.0313 KOps/s 19.1129 KOps/s $\color{#d91a1a}-0.43\%$
test_compile_assign_and_add[tensordict-compile] 1.6235ms 0.3922ms 2.5500 KOps/s 2.1611 KOps/s $\textbf{\color{#35bf28}+18.00\%}$
test_compile_assign_and_add[tensordict-eager] 2.8735ms 2.6373ms 379.1824 Ops/s 379.1402 Ops/s $\color{#35bf28}+0.01\%$
test_compile_assign_and_add[pytree-compile] 1.5977ms 0.4323ms 2.3134 KOps/s 2.2594 KOps/s $\color{#35bf28}+2.39\%$
test_compile_assign_and_add[pytree-eager] 3.0276ms 2.6388ms 378.9533 Ops/s 381.9982 Ops/s $\color{#d91a1a}-0.80\%$
test_compile_indexing[tensor-tensordict-compile] 0.5496ms 0.1201ms 8.3253 KOps/s 8.5128 KOps/s $\color{#d91a1a}-2.20\%$
test_compile_indexing[tensor-tensordict-eager] 0.5637ms 84.6004μs 11.8203 KOps/s 11.6824 KOps/s $\color{#35bf28}+1.18\%$
test_compile_indexing[tensor-tensorclass-compile] 0.6545ms 0.1138ms 8.7840 KOps/s 8.8105 KOps/s $\color{#d91a1a}-0.30\%$
test_compile_indexing[tensor-tensorclass-eager] 0.4692ms 71.5230μs 13.9815 KOps/s 13.8635 KOps/s $\color{#35bf28}+0.85\%$
test_compile_indexing[tensor-pytree-compile] 0.1743ms 0.1148ms 8.7119 KOps/s 8.7277 KOps/s $\color{#d91a1a}-0.18\%$
test_compile_indexing[tensor-pytree-eager] 0.4907ms 72.0632μs 13.8767 KOps/s 13.8734 KOps/s $\color{#35bf28}+0.02\%$
test_compile_indexing[slice-tensordict-compile] 0.1664ms 0.1041ms 9.6100 KOps/s 9.9534 KOps/s $\color{#d91a1a}-3.45\%$
test_compile_indexing[slice-tensordict-eager] 0.4158ms 17.2604μs 57.9361 KOps/s 57.5758 KOps/s $\color{#35bf28}+0.63\%$
test_compile_indexing[slice-tensorclass-compile] 0.5048ms 95.7785μs 10.4408 KOps/s 10.3908 KOps/s $\color{#35bf28}+0.48\%$
test_compile_indexing[slice-tensorclass-eager] 71.8100μs 15.6984μs 63.7010 KOps/s 63.6035 KOps/s $\color{#35bf28}+0.15\%$
test_compile_indexing[slice-pytree-compile] 0.5041ms 99.5899μs 10.0412 KOps/s 10.1192 KOps/s $\color{#d91a1a}-0.77\%$
test_compile_indexing[slice-pytree-eager] 0.4070ms 15.6538μs 63.8824 KOps/s 64.7475 KOps/s $\color{#d91a1a}-1.34\%$
test_compile_indexing[int-tensordict-compile] 0.5152ms 0.1056ms 9.4694 KOps/s 9.8573 KOps/s $\color{#d91a1a}-3.93\%$
test_compile_indexing[int-tensordict-eager] 0.5593ms 17.2024μs 58.1315 KOps/s 58.7122 KOps/s $\color{#d91a1a}-0.99\%$
test_compile_indexing[int-tensorclass-compile] 0.4967ms 95.9531μs 10.4218 KOps/s 10.3226 KOps/s $\color{#35bf28}+0.96\%$
test_compile_indexing[int-tensorclass-eager] 51.0100μs 15.7284μs 63.5793 KOps/s 64.2241 KOps/s $\color{#d91a1a}-1.00\%$
test_compile_indexing[int-pytree-compile] 0.4896ms 96.4307μs 10.3701 KOps/s 10.3038 KOps/s $\color{#35bf28}+0.64\%$
test_compile_indexing[int-pytree-eager] 0.4037ms 15.5472μs 64.3202 KOps/s 64.5883 KOps/s $\color{#d91a1a}-0.42\%$
test_mod_add[eager] 0.4384ms 40.3971μs 24.7542 KOps/s 26.2611 KOps/s $\textbf{\color{#d91a1a}-5.74\%}$
test_mod_add[compile] 0.4833ms 81.3039μs 12.2995 KOps/s 11.5096 KOps/s $\textbf{\color{#35bf28}+6.86\%}$
test_mod_add[compile-overhead] 0.3310ms 0.1689ms 5.9214 KOps/s 5.1342 KOps/s $\textbf{\color{#35bf28}+15.33\%}$
test_mod_wrap[eager] 0.6642ms 0.2492ms 4.0125 KOps/s 3.9414 KOps/s $\color{#35bf28}+1.80\%$
test_mod_wrap[compile] 0.5871ms 0.2975ms 3.3613 KOps/s 3.4744 KOps/s $\color{#d91a1a}-3.26\%$
test_mod_wrap[compile-overhead] 7.4457ms 3.8536ms 259.5002 Ops/s 264.0500 Ops/s $\color{#d91a1a}-1.72\%$
test_mod_wrap_and_backward[eager] 1.5086ms 1.3583ms 736.1985 Ops/s 685.3425 Ops/s $\textbf{\color{#35bf28}+7.42\%}$
test_mod_wrap_and_backward[compile] 1.3666ms 1.2746ms 784.5444 Ops/s 777.7880 Ops/s $\color{#35bf28}+0.87\%$
test_mod_wrap_and_backward[compile-overhead] 1.3693ms 0.9329ms 1.0719 KOps/s 1.0712 KOps/s $\color{#35bf28}+0.07\%$
test_seq_add[eager] 0.1879ms 0.1197ms 8.3514 KOps/s 8.1025 KOps/s $\color{#35bf28}+3.07\%$
test_seq_add[compile] 0.1789ms 93.6966μs 10.6727 KOps/s 10.6812 KOps/s $\color{#d91a1a}-0.08\%$
test_seq_add[compile-overhead] 0.2102ms 0.1301ms 7.6873 KOps/s 7.6153 KOps/s $\color{#35bf28}+0.94\%$
test_seq_wrap[eager] 0.5161ms 0.4365ms 2.2908 KOps/s 2.3005 KOps/s $\color{#d91a1a}-0.42\%$
test_seq_wrap[compile] 0.3685ms 0.3058ms 3.2705 KOps/s 3.2507 KOps/s $\color{#35bf28}+0.61\%$
test_seq_wrap[compile-overhead] 0.2869ms 0.2260ms 4.4241 KOps/s 4.3604 KOps/s $\color{#35bf28}+1.46\%$
test_func_call_runtime[False-eager] 0.8687ms 0.7897ms 1.2662 KOps/s 1.3346 KOps/s $\textbf{\color{#d91a1a}-5.12\%}$
test_func_call_runtime[False-compile] 0.9588ms 0.7538ms 1.3266 KOps/s 1.3269 KOps/s $\color{#d91a1a}-0.02\%$
test_func_call_runtime[False-compile-overhead] 0.4257ms 0.3669ms 2.7254 KOps/s 2.7041 KOps/s $\color{#35bf28}+0.78\%$
test_func_call_runtime[True-eager] 0.9717ms 0.9073ms 1.1021 KOps/s 1.1009 KOps/s $\color{#35bf28}+0.11\%$
test_func_call_runtime[True-compile] 0.8871ms 0.7829ms 1.2773 KOps/s 1.2504 KOps/s $\color{#35bf28}+2.15\%$
test_func_call_runtime[True-compile-overhead] 0.4410ms 0.3870ms 2.5839 KOps/s 2.5688 KOps/s $\color{#35bf28}+0.59\%$
test_func_call_cm_runtime[False-eager] 0.8454ms 0.7795ms 1.2829 KOps/s 1.3456 KOps/s $\color{#d91a1a}-4.66\%$
test_func_call_cm_runtime[False-compile] 1.1070ms 0.7567ms 1.3216 KOps/s 1.3282 KOps/s $\color{#d91a1a}-0.50\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4189ms 0.3683ms 2.7153 KOps/s 2.7073 KOps/s $\color{#35bf28}+0.30\%$
test_func_call_cm_runtime[True-eager] 1.1044ms 1.0048ms 995.2661 Ops/s 976.3317 Ops/s $\color{#35bf28}+1.94\%$
test_func_call_cm_runtime[True-compile] 1.1076ms 1.0060ms 994.0393 Ops/s 989.6385 Ops/s $\color{#35bf28}+0.44\%$
test_func_call_cm_runtime[True-compile-overhead] 1.0689ms 1.0028ms 997.2021 Ops/s 983.9413 Ops/s $\color{#35bf28}+1.35\%$
test_vmap_func_call_cm_runtime[eager] 2.5282ms 2.1035ms 475.4059 Ops/s 467.4250 Ops/s $\color{#35bf28}+1.71\%$
test_vmap_func_call_cm_runtime[compile] 0.9862ms 0.8214ms 1.2174 KOps/s 1.2034 KOps/s $\color{#35bf28}+1.17\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5702ms 0.4183ms 2.3908 KOps/s 2.3660 KOps/s $\color{#35bf28}+1.05\%$
test_distributed 3.0196ms 0.1933ms 5.1727 KOps/s 8.4654 KOps/s $\textbf{\color{#d91a1a}-38.90\%}$
test_tdmodule 57.0110μs 20.7861μs 48.1090 KOps/s 48.2111 KOps/s $\color{#d91a1a}-0.21\%$
test_tdmodule_dispatch 63.8900μs 37.1617μs 26.9094 KOps/s 27.2221 KOps/s $\color{#d91a1a}-1.15\%$
test_tdseq 42.5500μs 21.8762μs 45.7119 KOps/s 48.0281 KOps/s $\color{#d91a1a}-4.82\%$
test_tdseq_dispatch 78.0010μs 41.0342μs 24.3699 KOps/s 25.6122 KOps/s $\color{#d91a1a}-4.85\%$
test_instantiation_functorch 1.6785ms 1.5456ms 646.9935 Ops/s 637.6680 Ops/s $\color{#35bf28}+1.46\%$
test_exec_functorch 0.2011ms 0.1444ms 6.9256 KOps/s 6.9161 KOps/s $\color{#35bf28}+0.14\%$
test_exec_functional_call 0.2124ms 0.1375ms 7.2724 KOps/s 7.1023 KOps/s $\color{#35bf28}+2.40\%$
test_exec_td_decorator 0.3781ms 0.1901ms 5.2605 KOps/s 5.2424 KOps/s $\color{#35bf28}+0.35\%$
test_vmap_mlp_speed_decorator[True-True] 0.8389ms 0.6886ms 1.4521 KOps/s 1.4430 KOps/s $\color{#35bf28}+0.63\%$
test_vmap_mlp_speed_decorator[True-False] 0.8189ms 0.6913ms 1.4465 KOps/s 1.4423 KOps/s $\color{#35bf28}+0.29\%$
test_vmap_mlp_speed_decorator[False-True] 0.7160ms 0.6004ms 1.6655 KOps/s 1.6592 KOps/s $\color{#35bf28}+0.38\%$
test_vmap_mlp_speed_decorator[False-False] 0.7423ms 0.6000ms 1.6666 KOps/s 1.6549 KOps/s $\color{#35bf28}+0.70\%$
test_vmap_transformer_speed_decorator[True-True] 19.9385ms 19.3135ms 51.7772 Ops/s 51.7390 Ops/s $\color{#35bf28}+0.07\%$
test_vmap_transformer_speed_decorator[True-False] 19.4224ms 19.3268ms 51.7416 Ops/s 51.8224 Ops/s $\color{#d91a1a}-0.16\%$
test_vmap_transformer_speed_decorator[False-True] 19.2109ms 19.1082ms 52.3334 Ops/s 52.2726 Ops/s $\color{#35bf28}+0.12\%$
test_vmap_transformer_speed_decorator[False-False] 19.5489ms 19.1547ms 52.2065 Ops/s 52.2559 Ops/s $\color{#d91a1a}-0.09\%$
test_to_module_speed[True] 1.4609ms 0.9881ms 1.0120 KOps/s 1.0155 KOps/s $\color{#d91a1a}-0.34\%$
test_to_module_speed[False] 1.3852ms 0.9677ms 1.0334 KOps/s 1.0376 KOps/s $\color{#d91a1a}-0.40\%$
test_tc_init 58.5100μs 36.1058μs 27.6964 KOps/s 26.9386 KOps/s $\color{#35bf28}+2.81\%$
test_tc_init_nested 0.1667ms 71.9176μs 13.9048 KOps/s 13.2733 KOps/s $\color{#35bf28}+4.76\%$
test_tc_first_layer_tensor 21.3300μs 0.7912μs 1.2639 MOps/s 1.2587 MOps/s $\color{#35bf28}+0.42\%$
test_tc_first_layer_nontensor 22.3110μs 2.2220μs 450.0496 KOps/s 447.0886 KOps/s $\color{#35bf28}+0.66\%$
test_tc_second_layer_tensor 13.1852μs 1.4022μs 713.1722 KOps/s 713.5022 KOps/s $\color{#d91a1a}-0.05\%$
test_tc_second_layer_nontensor 34.8800μs 2.9491μs 339.0857 KOps/s 338.0095 KOps/s $\color{#35bf28}+0.32\%$
test_unbind 0.2135s 12.1936ms 82.0102 Ops/s 139.3735 Ops/s $\textbf{\color{#d91a1a}-41.16\%}$
test_full_like 9.4743ms 9.2124ms 108.5488 Ops/s 108.2303 Ops/s $\color{#35bf28}+0.29\%$
test_zeros_like 4.7086ms 4.1993ms 238.1324 Ops/s 231.3950 Ops/s $\color{#35bf28}+2.91\%$
test_ones_like 5.0022ms 4.3329ms 230.7925 Ops/s 231.1030 Ops/s $\color{#d91a1a}-0.13\%$
test_clone 11.3711ms 9.1209ms 109.6377 Ops/s 69.0717 Ops/s $\textbf{\color{#35bf28}+58.73\%}$
test_squeeze 59.2300μs 10.0215μs 99.7857 KOps/s 99.7274 KOps/s $\color{#35bf28}+0.06\%$
test_unsqueeze 0.1217ms 77.0992μs 12.9703 KOps/s 13.0711 KOps/s $\color{#d91a1a}-0.77\%$
test_split 0.3825ms 0.1696ms 5.8959 KOps/s 6.0956 KOps/s $\color{#d91a1a}-3.28\%$
test_permute 0.2494ms 0.1962ms 5.0967 KOps/s 5.1224 KOps/s $\color{#d91a1a}-0.50\%$
test_stack 50.8269ms 50.4611ms 19.8173 Ops/s 19.7246 Ops/s $\color{#35bf28}+0.47\%$
test_cat 50.9674ms 50.4291ms 19.8298 Ops/s 19.7985 Ops/s $\color{#35bf28}+0.16\%$
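
For readers checking the numbers, the Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below recomputes it for three rows copied from the table. The formula is inferred from the figures, not taken from the benchmark tooling, and `relative_change` is a hypothetical helper name. Both values must be in the same unit (for example, both in KOps/s) before dividing.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percentage change in throughput of the PR relative to repo HEAD.

    Assumed formula (inferred from the table, not from the benchmark bot):
    (ops_pr / ops_head - 1) * 100.
    """
    return (ops_pr / ops_head - 1.0) * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from the table rows above.
rows = {
    "test_clone": (109.6377, 69.0717),       # Ops/s
    "test_unbind": (82.0102, 139.3735),      # Ops/s
    "test_distributed": (5.1727, 8.4654),    # KOps/s
}

for name, (ops_pr, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops_pr, ops_head):+.2f}%")
```

A positive value means the PR ran more operations per second than HEAD. Single-run differences of a few percent are usually within benchmark noise, so the large swings (`test_clone`, `test_unbind`, `test_distributed`) are the rows worth re-running.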

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 20, 2025
ghstack-source-id: 526f9ce8202fc48bc64ed7c8094c9c72f3bc4a71
Pull Request resolved: #1223
@vmoens vmoens added the bug Something isn't working label Feb 20, 2025
Labels
bug Something isn't working CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
2 participants