[BugFix] Enforce zip(..., strict=True) in TDModules #1212

Merged Feb 6, 2025 (2 commits)

vmoens (Contributor) commented Feb 6, 2025

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 6, 2025
ghstack-source-id: 8515d8393c9b6f3deb1c76b2161fab58599e4945
Pull Request resolved: #1212
@facebook-github-bot added the CLA Signed label Feb 6, 2025
@vmoens added the bug and suitable for minor labels Feb 6, 2025
github-actions bot commented Feb 6, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}11$. Worsened: $\large\color{#d91a1a}14$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 53.8510μs 20.9333μs 47.7708 KOps/s 48.8313 KOps/s $\color{#d91a1a}-2.17\%$
test_plain_set_stack_nested 61.3460μs 20.9120μs 47.8195 KOps/s 47.9966 KOps/s $\color{#d91a1a}-0.37\%$
test_plain_set_nested_inplace 63.5600μs 22.8847μs 43.6973 KOps/s 44.2429 KOps/s $\color{#d91a1a}-1.23\%$
test_plain_set_stack_nested_inplace 82.7360μs 22.6770μs 44.0976 KOps/s 44.3642 KOps/s $\color{#d91a1a}-0.60\%$
test_items 25.6590μs 4.1841μs 239.0013 KOps/s 241.5190 KOps/s $\color{#d91a1a}-1.04\%$
test_items_nested 0.7118ms 0.4054ms 2.4666 KOps/s 2.4477 KOps/s $\color{#35bf28}+0.77\%$
test_items_nested_locked 0.7387ms 0.4060ms 2.4629 KOps/s 2.4739 KOps/s $\color{#d91a1a}-0.45\%$
test_items_nested_leaf 0.1485ms 75.8872μs 13.1774 KOps/s 13.1239 KOps/s $\color{#35bf28}+0.41\%$
test_items_stack_nested 0.4864ms 0.4058ms 2.4640 KOps/s 2.4636 KOps/s $\color{#35bf28}+0.02\%$
test_items_stack_nested_leaf 0.1560ms 79.2404μs 12.6198 KOps/s 12.7124 KOps/s $\color{#d91a1a}-0.73\%$
test_items_stack_nested_locked 0.6822ms 0.4071ms 2.4561 KOps/s 2.4613 KOps/s $\color{#d91a1a}-0.21\%$
test_keys 21.2800μs 3.5294μs 283.3375 KOps/s 283.1453 KOps/s $\color{#35bf28}+0.07\%$
test_keys_nested 0.2690ms 0.1672ms 5.9798 KOps/s 6.1610 KOps/s $\color{#d91a1a}-2.94\%$
test_keys_nested_locked 1.9072ms 0.1741ms 5.7446 KOps/s 5.9314 KOps/s $\color{#d91a1a}-3.15\%$
test_keys_nested_leaf 0.2274ms 0.1456ms 6.8689 KOps/s 7.0387 KOps/s $\color{#d91a1a}-2.41\%$
test_keys_stack_nested 0.2947ms 0.1670ms 5.9875 KOps/s 6.0909 KOps/s $\color{#d91a1a}-1.70\%$
test_keys_stack_nested_leaf 0.2323ms 0.1439ms 6.9502 KOps/s 7.0401 KOps/s $\color{#d91a1a}-1.28\%$
test_keys_stack_nested_locked 0.2855ms 0.1706ms 5.8614 KOps/s 5.9043 KOps/s $\color{#d91a1a}-0.73\%$
test_values 15.3190μs 1.1575μs 863.9030 KOps/s 872.0078 KOps/s $\color{#d91a1a}-0.93\%$
test_values_nested 0.1213ms 61.6542μs 16.2195 KOps/s 16.2724 KOps/s $\color{#d91a1a}-0.32\%$
test_values_nested_locked 0.1289ms 61.3872μs 16.2900 KOps/s 16.2611 KOps/s $\color{#35bf28}+0.18\%$
test_values_nested_leaf 0.1218ms 70.3469μs 14.2153 KOps/s 14.1677 KOps/s $\color{#35bf28}+0.34\%$
test_values_stack_nested 0.1183ms 63.0287μs 15.8658 KOps/s 15.9617 KOps/s $\color{#d91a1a}-0.60\%$
test_values_stack_nested_leaf 0.2328ms 71.4859μs 13.9888 KOps/s 13.9865 KOps/s $\color{#35bf28}+0.02\%$
test_values_stack_nested_locked 0.1229ms 62.9709μs 15.8803 KOps/s 15.9690 KOps/s $\color{#d91a1a}-0.56\%$
test_membership 25.1770μs 0.8660μs 1.1547 MOps/s 1.3768 MOps/s $\textbf{\color{#d91a1a}-16.13\%}$
test_membership_nested 33.0220μs 2.8957μs 345.3392 KOps/s 347.0686 KOps/s $\color{#d91a1a}-0.50\%$
test_membership_nested_leaf 34.5650μs 2.8716μs 348.2412 KOps/s 346.9198 KOps/s $\color{#35bf28}+0.38\%$
test_membership_stacked_nested 38.0920μs 2.8169μs 354.9990 KOps/s 349.0537 KOps/s $\color{#35bf28}+1.70\%$
test_membership_stacked_nested_leaf 48.1610μs 2.9008μs 344.7304 KOps/s 348.4254 KOps/s $\color{#d91a1a}-1.06\%$
test_membership_nested_last 26.6500μs 4.3579μs 229.4696 KOps/s 231.1831 KOps/s $\color{#d91a1a}-0.74\%$
test_membership_nested_leaf_last 57.6390μs 4.3565μs 229.5432 KOps/s 229.8804 KOps/s $\color{#d91a1a}-0.15\%$
test_membership_stacked_nested_last 29.3650μs 4.3552μs 229.6085 KOps/s 229.9304 KOps/s $\color{#d91a1a}-0.14\%$
test_membership_stacked_nested_leaf_last 61.6160μs 4.3629μs 229.2054 KOps/s 230.4563 KOps/s $\color{#d91a1a}-0.54\%$
test_nested_getleaf 40.5270μs 10.4759μs 95.4571 KOps/s 95.1955 KOps/s $\color{#35bf28}+0.27\%$
test_nested_get 67.3470μs 9.9124μs 100.8836 KOps/s 99.9722 KOps/s $\color{#35bf28}+0.91\%$
test_stacked_getleaf 66.0540μs 10.3629μs 96.4982 KOps/s 95.7661 KOps/s $\color{#35bf28}+0.76\%$
test_stacked_get 38.8830μs 9.8691μs 101.3262 KOps/s 100.9439 KOps/s $\color{#35bf28}+0.38\%$
test_nested_getitemleaf 55.8750μs 11.0136μs 90.7969 KOps/s 89.9908 KOps/s $\color{#35bf28}+0.90\%$
test_nested_getitem 49.8440μs 10.4363μs 95.8190 KOps/s 92.6213 KOps/s $\color{#35bf28}+3.45\%$
test_stacked_getitemleaf 62.5880μs 10.9967μs 90.9365 KOps/s 91.2508 KOps/s $\color{#d91a1a}-0.34\%$
test_stacked_getitem 63.3190μs 10.3946μs 96.2042 KOps/s 93.6108 KOps/s $\color{#35bf28}+2.77\%$
test_lock_nested 0.6783ms 0.4096ms 2.4413 KOps/s 2.4065 KOps/s $\color{#35bf28}+1.45\%$
test_lock_stack_nested 0.6887ms 0.4222ms 2.3684 KOps/s 2.3460 KOps/s $\color{#35bf28}+0.96\%$
test_unlock_nested 0.6728ms 0.3397ms 2.9436 KOps/s 2.9634 KOps/s $\color{#d91a1a}-0.67\%$
test_unlock_stack_nested 0.5406ms 0.3404ms 2.9374 KOps/s 2.8962 KOps/s $\color{#35bf28}+1.42\%$
test_flatten_speed 0.1765ms 99.9619μs 10.0038 KOps/s 9.8214 KOps/s $\color{#35bf28}+1.86\%$
test_unflatten_speed 0.8664ms 0.5208ms 1.9202 KOps/s 1.9482 KOps/s $\color{#d91a1a}-1.44\%$
test_common_ops 1.0025ms 0.7951ms 1.2576 KOps/s 1.1953 KOps/s $\textbf{\color{#35bf28}+5.21\%}$
test_creation 28.5030μs 2.5292μs 395.3797 KOps/s 403.1710 KOps/s $\color{#d91a1a}-1.93\%$
test_creation_empty 42.0800μs 12.6250μs 79.2081 KOps/s 79.6598 KOps/s $\color{#d91a1a}-0.57\%$
test_creation_nested_1 56.0760μs 15.6422μs 63.9295 KOps/s 64.4009 KOps/s $\color{#d91a1a}-0.73\%$
test_creation_nested_2 60.7840μs 20.0226μs 49.9435 KOps/s 49.8003 KOps/s $\color{#35bf28}+0.29\%$
test_clone 72.5370μs 13.5071μs 74.0349 KOps/s 72.7797 KOps/s $\color{#35bf28}+1.72\%$
test_getitem[int] 0.9021ms 13.1814μs 75.8643 KOps/s 77.4506 KOps/s $\color{#d91a1a}-2.05\%$
test_getitem[slice_int] 0.1342ms 24.6628μs 40.5470 KOps/s 42.2446 KOps/s $\color{#d91a1a}-4.02\%$
test_getitem[range] 0.2034ms 51.4173μs 19.4487 KOps/s 20.2203 KOps/s $\color{#d91a1a}-3.82\%$
test_getitem[tuple] 0.1256ms 20.4786μs 48.8314 KOps/s 49.8936 KOps/s $\color{#d91a1a}-2.13\%$
test_getitem[list] 0.2372ms 46.5333μs 21.4900 KOps/s 21.8014 KOps/s $\color{#d91a1a}-1.43\%$
test_setitem_dim[int] 60.9840μs 25.5355μs 39.1612 KOps/s 38.1613 KOps/s $\color{#35bf28}+2.62\%$
test_setitem_dim[slice_int] 93.4660μs 50.6914μs 19.7272 KOps/s 19.7017 KOps/s $\color{#35bf28}+0.13\%$
test_setitem_dim[range] 0.1457ms 76.9308μs 12.9987 KOps/s 12.8228 KOps/s $\color{#35bf28}+1.37\%$
test_setitem_dim[tuple] 81.3530μs 40.6284μs 24.6133 KOps/s 24.3621 KOps/s $\color{#35bf28}+1.03\%$
test_setitem 93.3360μs 21.0497μs 47.5067 KOps/s 46.3305 KOps/s $\color{#35bf28}+2.54\%$
test_set 78.7590μs 20.4248μs 48.9600 KOps/s 47.7451 KOps/s $\color{#35bf28}+2.54\%$
test_set_shared 4.7504ms 0.1798ms 5.5613 KOps/s 5.5043 KOps/s $\color{#35bf28}+1.04\%$
test_update 0.1391ms 24.0032μs 41.6612 KOps/s 40.7271 KOps/s $\color{#35bf28}+2.29\%$
test_update_nested 81.7640μs 33.8460μs 29.5456 KOps/s 28.5525 KOps/s $\color{#35bf28}+3.48\%$
test_update__nested 1.2035ms 34.0907μs 29.3335 KOps/s 28.9386 KOps/s $\color{#35bf28}+1.36\%$
test_set_nested 57.0070μs 22.4566μs 44.5303 KOps/s 43.0396 KOps/s $\color{#35bf28}+3.46\%$
test_set_nested_new 0.1110ms 27.4585μs 36.4185 KOps/s 36.2649 KOps/s $\color{#35bf28}+0.42\%$
test_select 0.1279ms 43.6623μs 22.9031 KOps/s 22.7704 KOps/s $\color{#35bf28}+0.58\%$
test_select_nested 0.1090ms 62.7543μs 15.9352 KOps/s 15.9754 KOps/s $\color{#d91a1a}-0.25\%$
test_exclude_nested 0.1707ms 80.9927μs 12.3468 KOps/s 12.4551 KOps/s $\color{#d91a1a}-0.87\%$
test_empty[True] 0.5387ms 0.4100ms 2.4393 KOps/s 2.4835 KOps/s $\color{#d91a1a}-1.78\%$
test_empty[False] 7.1032μs 1.3858μs 721.6239 KOps/s 725.1353 KOps/s $\color{#d91a1a}-0.48\%$
test_unbind_speed 0.3359ms 0.2686ms 3.7223 KOps/s 3.6569 KOps/s $\color{#35bf28}+1.79\%$
test_unbind_speed_stack0 0.5897ms 0.2658ms 3.7619 KOps/s 3.7168 KOps/s $\color{#35bf28}+1.21\%$
test_unbind_speed_stack1 99.0742ms 0.7221ms 1.3848 KOps/s 1.2542 KOps/s $\textbf{\color{#35bf28}+10.41\%}$
test_split 0.1101s 1.8031ms 554.6113 Ops/s 564.4595 Ops/s $\color{#d91a1a}-1.74\%$
test_chunk 98.9448ms 1.7748ms 563.4366 Ops/s 620.0436 Ops/s $\textbf{\color{#d91a1a}-9.13\%}$
test_consolidate_njt[False-None] 8.5912ms 8.2694ms 120.9277 Ops/s 108.6097 Ops/s $\textbf{\color{#35bf28}+11.34\%}$
test_creation[device0] 3.8462ms 93.4718μs 10.6984 KOps/s 10.9346 KOps/s $\color{#d91a1a}-2.16\%$
test_creation_from_tensor 0.2739ms 93.9709μs 10.6416 KOps/s 10.4711 KOps/s $\color{#35bf28}+1.63\%$
test_add_one[memmap_tensor0] 84.0880μs 4.8029μs 208.2089 KOps/s 188.0603 KOps/s $\textbf{\color{#35bf28}+10.71\%}$
test_contiguous[memmap_tensor0] 13.7970μs 0.5307μs 1.8843 MOps/s 1.9527 MOps/s $\color{#d91a1a}-3.50\%$
test_stack[memmap_tensor0] 26.9910μs 3.5658μs 280.4446 KOps/s 286.1820 KOps/s $\color{#d91a1a}-2.00\%$
test_memmaptd_index 1.3119ms 0.2336ms 4.2808 KOps/s 4.3205 KOps/s $\color{#d91a1a}-0.92\%$
test_memmaptd_index_astensor 0.5952ms 0.3187ms 3.1380 KOps/s 3.1470 KOps/s $\color{#d91a1a}-0.29\%$
test_memmaptd_index_op 0.8199ms 0.5925ms 1.6878 KOps/s 1.6540 KOps/s $\color{#35bf28}+2.05\%$
test_serialize_model 0.2215s 0.1306s 7.6569 Ops/s 8.7886 Ops/s $\textbf{\color{#d91a1a}-12.88\%}$
test_serialize_model_pickle 0.4927s 0.4050s 2.4692 Ops/s 2.5398 Ops/s $\color{#d91a1a}-2.78\%$
test_serialize_weights 0.1348s 0.1158s 8.6350 Ops/s 8.7871 Ops/s $\color{#d91a1a}-1.73\%$
test_serialize_weights_returnearly 0.1730s 0.1610s 6.2094 Ops/s 6.3814 Ops/s $\color{#d91a1a}-2.70\%$
test_serialize_weights_pickle 0.6200s 0.4604s 2.1719 Ops/s 1.1990 Ops/s $\textbf{\color{#35bf28}+81.14\%}$
test_serialize_weights_filesystem 0.2521s 0.1595s 6.2694 Ops/s 7.0644 Ops/s $\textbf{\color{#d91a1a}-11.25\%}$
test_serialize_model_filesystem 0.1633s 0.1481s 6.7538 Ops/s 7.1508 Ops/s $\textbf{\color{#d91a1a}-5.55\%}$
test_reshape_pytree 76.5840μs 26.0512μs 38.3860 KOps/s 36.8780 KOps/s $\color{#35bf28}+4.09\%$
test_reshape_td 96.2120μs 32.5220μs 30.7484 KOps/s 31.4936 KOps/s $\color{#d91a1a}-2.37\%$
test_view_pytree 76.2640μs 26.1785μs 38.1993 KOps/s 38.9949 KOps/s $\color{#d91a1a}-2.04\%$
test_view_td 87.3950μs 38.7048μs 25.8366 KOps/s 26.9321 KOps/s $\color{#d91a1a}-4.07\%$
test_unbind_pytree 79.7310μs 29.4873μs 33.9129 KOps/s 33.6184 KOps/s $\color{#35bf28}+0.88\%$
test_unbind_td 0.3483ms 39.7408μs 25.1631 KOps/s 24.8447 KOps/s $\color{#35bf28}+1.28\%$
test_split_pytree 93.6760μs 29.0556μs 34.4168 KOps/s 34.1243 KOps/s $\color{#35bf28}+0.86\%$
test_split_td 0.5653ms 47.2305μs 21.1727 KOps/s 21.9025 KOps/s $\color{#d91a1a}-3.33\%$
test_add_pytree 0.1505ms 35.2973μs 28.3308 KOps/s 27.9936 KOps/s $\color{#35bf28}+1.20\%$
test_add_td 0.1678ms 57.7832μs 17.3061 KOps/s 17.1213 KOps/s $\color{#35bf28}+1.08\%$
test_compile_add_one_nested[tensordict-compile] 0.1373ms 67.0936μs 14.9046 KOps/s 14.9539 KOps/s $\color{#d91a1a}-0.33\%$
test_compile_add_one_nested[tensordict-eager] 0.3276ms 0.1739ms 5.7489 KOps/s 5.7902 KOps/s $\color{#d91a1a}-0.71\%$
test_compile_add_one_nested[pytree-compile] 0.1166ms 46.1948μs 21.6475 KOps/s 21.9751 KOps/s $\color{#d91a1a}-1.49\%$
test_compile_add_one_nested[pytree-eager] 0.1921ms 0.1178ms 8.4892 KOps/s 8.2839 KOps/s $\color{#35bf28}+2.48\%$
test_compile_copy_nested[tensordict-compile] 82.2450μs 28.4278μs 35.1769 KOps/s 36.0621 KOps/s $\color{#d91a1a}-2.45\%$
test_compile_copy_nested[tensordict-eager] 0.1202ms 58.5064μs 17.0921 KOps/s 16.7634 KOps/s $\color{#35bf28}+1.96\%$
test_compile_copy_nested[pytree-compile] 0.2116ms 80.5937μs 12.4079 KOps/s 12.4629 KOps/s $\color{#d91a1a}-0.44\%$
test_compile_copy_nested[pytree-eager] 0.1281ms 65.6102μs 15.2415 KOps/s 14.9595 KOps/s $\color{#35bf28}+1.89\%$
test_compile_add_one_flat[tensordict-compile] 0.1756ms 0.1073ms 9.3164 KOps/s 9.4471 KOps/s $\color{#d91a1a}-1.38\%$
test_compile_add_one_flat[tensordict-eager] 0.4261ms 0.2176ms 4.5949 KOps/s 4.6396 KOps/s $\color{#d91a1a}-0.96\%$
test_compile_add_one_flat[tensorclass-compile] 98.3350μs 47.5691μs 21.0221 KOps/s 21.3904 KOps/s $\color{#d91a1a}-1.72\%$
test_compile_add_one_flat[tensorclass-eager] 0.1599ms 68.0141μs 14.7028 KOps/s 14.3670 KOps/s $\color{#35bf28}+2.34\%$
test_compile_add_one_flat[pytree-compile] 0.1804ms 0.1007ms 9.9336 KOps/s 10.0587 KOps/s $\color{#d91a1a}-1.24\%$
test_compile_add_one_flat[pytree-eager] 0.6507ms 0.2077ms 4.8143 KOps/s 4.9497 KOps/s $\color{#d91a1a}-2.74\%$
test_compile_add_self_flat[tensordict-eager] 0.6117ms 0.2381ms 4.1995 KOps/s 4.3321 KOps/s $\color{#d91a1a}-3.06\%$
test_compile_add_self_flat[tensordict-compile] 0.2320ms 0.1129ms 8.8577 KOps/s 9.2897 KOps/s $\color{#d91a1a}-4.65\%$
test_compile_add_self_flat[tensorclass-eager] 1.3172ms 65.3268μs 15.3077 KOps/s 16.2233 KOps/s $\textbf{\color{#d91a1a}-5.64\%}$
test_compile_add_self_flat[tensorclass-compile] 0.2041ms 48.7979μs 20.4927 KOps/s 20.7380 KOps/s $\color{#d91a1a}-1.18\%$
test_compile_add_self_flat[pytree-eager] 0.2588ms 0.1569ms 6.3725 KOps/s 6.3055 KOps/s $\color{#35bf28}+1.06\%$
test_compile_add_self_flat[pytree-compile] 0.2073ms 0.1003ms 9.9675 KOps/s 9.8228 KOps/s $\color{#35bf28}+1.47\%$
test_compile_copy_flat[tensordict-compile] 65.7840μs 21.8805μs 45.7027 KOps/s 46.7537 KOps/s $\color{#d91a1a}-2.25\%$
test_compile_copy_flat[tensordict-eager] 0.1269ms 67.4151μs 14.8335 KOps/s 14.7558 KOps/s $\color{#35bf28}+0.53\%$
test_compile_copy_flat[pytree-compile] 0.1766ms 80.7527μs 12.3835 KOps/s 12.3812 KOps/s $\color{#35bf28}+0.02\%$
test_compile_copy_flat[pytree-eager] 0.1352ms 67.1688μs 14.8879 KOps/s 14.8435 KOps/s $\color{#35bf28}+0.30\%$
test_compile_assign_and_add[tensordict-compile] 0.4222ms 0.2161ms 4.6280 KOps/s 4.6191 KOps/s $\color{#35bf28}+0.19\%$
test_compile_assign_and_add[tensordict-eager] 1.5656ms 1.3639ms 733.2098 Ops/s 720.1438 Ops/s $\color{#35bf28}+1.81\%$
test_compile_assign_and_add[pytree-compile] 0.2871ms 0.2111ms 4.7368 KOps/s 4.7165 KOps/s $\color{#35bf28}+0.43\%$
test_compile_assign_and_add[pytree-eager] 1.4329ms 0.8264ms 1.2100 KOps/s 1.2105 KOps/s $\color{#d91a1a}-0.04\%$
test_compile_assign_and_add_stack[compile] 0.8332ms 0.4622ms 2.1637 KOps/s 2.2108 KOps/s $\color{#d91a1a}-2.13\%$
test_compile_assign_and_add_stack[eager] 3.0513ms 2.7492ms 363.7424 Ops/s 354.1209 Ops/s $\color{#35bf28}+2.72\%$
test_compile_indexing[tensor-tensordict-compile] 0.1539ms 41.2943μs 24.2164 KOps/s 26.1915 KOps/s $\textbf{\color{#d91a1a}-7.54\%}$
test_compile_indexing[tensor-tensordict-eager] 0.8489ms 34.2752μs 29.1756 KOps/s 30.0706 KOps/s $\color{#d91a1a}-2.98\%$
test_compile_indexing[tensor-tensorclass-compile] 76.3340μs 31.4911μs 31.7551 KOps/s 33.0660 KOps/s $\color{#d91a1a}-3.96\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1073ms 23.1135μs 43.2648 KOps/s 42.9805 KOps/s $\color{#35bf28}+0.66\%$
test_compile_indexing[tensor-pytree-compile] 0.1201ms 33.8721μs 29.5228 KOps/s 32.9896 KOps/s $\textbf{\color{#d91a1a}-10.51\%}$
test_compile_indexing[tensor-pytree-eager] 63.6300μs 23.7498μs 42.1056 KOps/s 43.5451 KOps/s $\color{#d91a1a}-3.31\%$
test_compile_indexing[slice-tensordict-compile] 0.1384ms 56.0952μs 17.8268 KOps/s 19.2118 KOps/s $\textbf{\color{#d91a1a}-7.21\%}$
test_compile_indexing[slice-tensordict-eager] 0.3335ms 20.2612μs 49.3555 KOps/s 50.0626 KOps/s $\color{#d91a1a}-1.41\%$
test_compile_indexing[slice-tensorclass-compile] 0.1482ms 47.0439μs 21.2567 KOps/s 22.1005 KOps/s $\color{#d91a1a}-3.82\%$
test_compile_indexing[slice-tensorclass-eager] 84.9100μs 18.2739μs 54.7228 KOps/s 53.4045 KOps/s $\color{#35bf28}+2.47\%$
test_compile_indexing[slice-pytree-compile] 0.1228ms 47.6665μs 20.9791 KOps/s 21.6090 KOps/s $\color{#d91a1a}-2.92\%$
test_compile_indexing[slice-pytree-eager] 96.7720μs 18.3789μs 54.4103 KOps/s 52.9444 KOps/s $\color{#35bf28}+2.77\%$
test_compile_indexing[int-tensordict-compile] 0.1403ms 58.0675μs 17.2213 KOps/s 18.7016 KOps/s $\textbf{\color{#d91a1a}-7.92\%}$
test_compile_indexing[int-tensordict-eager] 1.0833ms 20.2742μs 49.3237 KOps/s 49.8746 KOps/s $\color{#d91a1a}-1.10\%$
test_compile_indexing[int-tensorclass-compile] 0.1067ms 47.6662μs 20.9792 KOps/s 21.6362 KOps/s $\color{#d91a1a}-3.04\%$
test_compile_indexing[int-tensorclass-eager] 85.1800μs 18.4072μs 54.3266 KOps/s 53.8348 KOps/s $\color{#35bf28}+0.91\%$
test_compile_indexing[int-pytree-compile] 0.1316ms 47.7922μs 20.9239 KOps/s 21.6709 KOps/s $\color{#d91a1a}-3.45\%$
test_compile_indexing[int-pytree-eager] 95.4930μs 18.3285μs 54.5598 KOps/s 54.3258 KOps/s $\color{#35bf28}+0.43\%$
test_mod_add[eager] 89.2280μs 36.3004μs 27.5479 KOps/s 28.1124 KOps/s $\color{#d91a1a}-2.01\%$
test_mod_add[compile] 0.1419ms 65.7761μs 15.2031 KOps/s 15.2074 KOps/s $\color{#d91a1a}-0.03\%$
test_mod_add[compile-overhead] 0.1411ms 65.3984μs 15.2909 KOps/s 15.4762 KOps/s $\color{#d91a1a}-1.20\%$
test_mod_wrap[eager] 0.3423ms 0.2171ms 4.6058 KOps/s 4.3678 KOps/s $\textbf{\color{#35bf28}+5.45\%}$
test_mod_wrap[compile] 2.2420ms 0.2276ms 4.3935 KOps/s 4.2958 KOps/s $\color{#35bf28}+2.27\%$
test_mod_wrap[compile-overhead] 0.4413ms 0.2250ms 4.4444 KOps/s 4.4386 KOps/s $\color{#35bf28}+0.13\%$
test_mod_wrap_and_backward[eager] 18.8012ms 12.7906ms 78.1824 Ops/s 91.9392 Ops/s $\textbf{\color{#d91a1a}-14.96\%}$
test_mod_wrap_and_backward[compile] 14.9818ms 12.7216ms 78.6067 Ops/s 84.8104 Ops/s $\textbf{\color{#d91a1a}-7.31\%}$
test_mod_wrap_and_backward[compile-overhead] 14.1621ms 11.7918ms 84.8045 Ops/s 85.2968 Ops/s $\color{#d91a1a}-0.58\%$
test_seq_add[eager] 0.2160ms 0.1162ms 8.6083 KOps/s 8.5391 KOps/s $\color{#35bf28}+0.81\%$
test_seq_add[compile] 0.1388ms 77.3195μs 12.9334 KOps/s 12.9313 KOps/s $\color{#35bf28}+0.02\%$
test_seq_add[compile-overhead] 0.1582ms 75.9377μs 13.1687 KOps/s 13.1198 KOps/s $\color{#35bf28}+0.37\%$
test_seq_wrap[eager] 0.8461ms 0.4486ms 2.2292 KOps/s 2.1941 KOps/s $\color{#35bf28}+1.60\%$
test_seq_wrap[compile] 0.7049ms 0.2489ms 4.0177 KOps/s 4.0773 KOps/s $\color{#d91a1a}-1.46\%$
test_seq_wrap[compile-overhead] 0.4520ms 0.2436ms 4.1058 KOps/s 4.0830 KOps/s $\color{#35bf28}+0.56\%$
test_func_call_runtime[False-eager] 0.8135ms 0.5354ms 1.8677 KOps/s 1.8316 KOps/s $\color{#35bf28}+1.97\%$
test_func_call_runtime[False-compile] 0.6263ms 0.4426ms 2.2593 KOps/s 2.2325 KOps/s $\color{#35bf28}+1.20\%$
test_func_call_runtime[False-compile-overhead] 0.9726ms 0.4457ms 2.2437 KOps/s 2.2564 KOps/s $\color{#d91a1a}-0.56\%$
test_func_call_runtime[True-eager] 0.9293ms 0.7488ms 1.3355 KOps/s 1.3248 KOps/s $\color{#35bf28}+0.80\%$
test_func_call_runtime[True-compile] 0.6321ms 0.4609ms 2.1697 KOps/s 2.1537 KOps/s $\color{#35bf28}+0.74\%$
test_func_call_runtime[True-compile-overhead] 0.5728ms 0.4598ms 2.1750 KOps/s 2.1390 KOps/s $\color{#35bf28}+1.68\%$
test_func_call_cm_runtime[False-eager] 0.7540ms 0.5255ms 1.9028 KOps/s 1.8229 KOps/s $\color{#35bf28}+4.38\%$
test_func_call_cm_runtime[False-compile] 0.7869ms 0.4379ms 2.2835 KOps/s 2.2494 KOps/s $\color{#35bf28}+1.52\%$
test_func_call_cm_runtime[False-compile-overhead] 0.6398ms 0.4351ms 2.2983 KOps/s 2.2525 KOps/s $\color{#35bf28}+2.03\%$
test_func_call_cm_runtime[True-eager] 1.0288ms 0.8861ms 1.1285 KOps/s 1.1084 KOps/s $\color{#35bf28}+1.82\%$
test_func_call_cm_runtime[True-compile] 0.9106ms 0.7785ms 1.2845 KOps/s 1.2446 KOps/s $\color{#35bf28}+3.21\%$
test_func_call_cm_runtime[True-compile-overhead] 1.2147ms 0.7886ms 1.2680 KOps/s 1.2300 KOps/s $\color{#35bf28}+3.09\%$
test_vmap_func_call_cm_runtime[eager] 2.5536ms 1.8827ms 531.1438 Ops/s 515.6853 Ops/s $\color{#35bf28}+3.00\%$
test_vmap_func_call_cm_runtime[compile] 1.0696ms 0.5374ms 1.8609 KOps/s 1.8529 KOps/s $\color{#35bf28}+0.44\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.6632ms 0.5330ms 1.8760 KOps/s 1.8317 KOps/s $\color{#35bf28}+2.42\%$
test_distributed 0.2673ms 0.1226ms 8.1574 KOps/s 8.0289 KOps/s $\color{#35bf28}+1.60\%$
test_tdmodule 64.3410μs 26.5737μs 37.6312 KOps/s 37.0008 KOps/s $\color{#35bf28}+1.70\%$
test_tdmodule_dispatch 82.1150μs 52.0278μs 19.2205 KOps/s 20.5672 KOps/s $\textbf{\color{#d91a1a}-6.55\%}$
test_tdseq 57.9590μs 28.0351μs 35.6695 KOps/s 33.4545 KOps/s $\textbf{\color{#35bf28}+6.62\%}$
test_tdseq_dispatch 93.6660μs 53.4957μs 18.6931 KOps/s 18.1775 KOps/s $\color{#35bf28}+2.84\%$
test_instantiation_functorch 1.7068ms 1.5039ms 664.9552 Ops/s 659.7755 Ops/s $\color{#35bf28}+0.79\%$
test_exec_functorch 0.3221ms 0.1749ms 5.7187 KOps/s 5.4823 KOps/s $\color{#35bf28}+4.31\%$
test_exec_functional_call 0.3348ms 0.1721ms 5.8108 KOps/s 5.8058 KOps/s $\color{#35bf28}+0.09\%$
test_exec_td_decorator 0.4393ms 0.2238ms 4.4681 KOps/s 4.1551 KOps/s $\textbf{\color{#35bf28}+7.53\%}$
test_vmap_mlp_speed_decorator[True-True] 1.1282ms 0.6551ms 1.5265 KOps/s 1.5386 KOps/s $\color{#d91a1a}-0.79\%$
test_vmap_mlp_speed_decorator[True-False] 0.8923ms 0.6506ms 1.5370 KOps/s 1.5269 KOps/s $\color{#35bf28}+0.66\%$
test_vmap_mlp_speed_decorator[False-True] 0.6959ms 0.5236ms 1.9098 KOps/s 1.8960 KOps/s $\color{#35bf28}+0.73\%$
test_vmap_mlp_speed_decorator[False-False] 0.7074ms 0.5218ms 1.9166 KOps/s 1.8928 KOps/s $\color{#35bf28}+1.26\%$
test_to_module_speed[True] 2.1640ms 1.3214ms 756.7916 Ops/s 763.7252 Ops/s $\color{#d91a1a}-0.91\%$
test_to_module_speed[False] 2.0853ms 1.2926ms 773.6502 Ops/s 780.3354 Ops/s $\color{#d91a1a}-0.86\%$
test_tc_init 79.0590μs 48.0857μs 20.7962 KOps/s 21.3406 KOps/s $\color{#d91a1a}-2.55\%$
test_tc_init_nested 0.2205ms 96.3495μs 10.3789 KOps/s 10.7309 KOps/s $\color{#d91a1a}-3.28\%$
test_tc_first_layer_tensor 36.5990μs 1.5377μs 650.3174 KOps/s 654.4991 KOps/s $\color{#d91a1a}-0.64\%$
test_tc_first_layer_nontensor 22.9630μs 4.7213μs 211.8080 KOps/s 217.5134 KOps/s $\color{#d91a1a}-2.62\%$
test_tc_second_layer_tensor 41.2880μs 2.9135μs 343.2344 KOps/s 357.2334 KOps/s $\color{#d91a1a}-3.92\%$
test_tc_second_layer_nontensor 23.7150μs 6.0594μs 165.0332 KOps/s 170.0315 KOps/s $\color{#d91a1a}-2.94\%$
test_unbind 0.2149s 14.3637ms 69.6201 Ops/s 77.0477 Ops/s $\textbf{\color{#d91a1a}-9.64\%}$
test_full_like 8.4826ms 6.8251ms 146.5188 Ops/s 132.1244 Ops/s $\textbf{\color{#35bf28}+10.89\%}$
test_zeros_like 4.3857ms 2.6421ms 378.4804 Ops/s 358.5789 Ops/s $\textbf{\color{#35bf28}+5.55\%}$
test_ones_like 3.6462ms 3.0646ms 326.3055 Ops/s 302.5821 Ops/s $\textbf{\color{#35bf28}+7.84\%}$
test_clone 5.1253ms 4.7593ms 210.1164 Ops/s 206.3418 Ops/s $\color{#35bf28}+1.83\%$
test_squeeze 60.8610μs 12.3737μs 80.8168 KOps/s 82.6143 KOps/s $\color{#d91a1a}-2.18\%$
test_unsqueeze 0.1595ms 92.1464μs 10.8523 KOps/s 11.1909 KOps/s $\color{#d91a1a}-3.03\%$
test_split 0.4704ms 0.1944ms 5.1433 KOps/s 5.2014 KOps/s $\color{#d91a1a}-1.12\%$
test_permute 0.3387ms 0.1996ms 5.0110 KOps/s 5.0555 KOps/s $\color{#d91a1a}-0.88\%$
test_stack 31.4467ms 23.5888ms 42.3929 Ops/s 41.7058 Ops/s $\color{#35bf28}+1.65\%$
test_cat 25.6654ms 23.2408ms 43.0278 Ops/s 41.4973 Ops/s $\color{#35bf28}+3.69\%$
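
The Change column above appears to be the relative difference between the Ops value and the Ops on Repo HEAD value. The sketch below reproduces a few rows under that reading. The function names are hypothetical, and the 5% cut-off for counting a benchmark as improved or worsened is an assumption inferred from which rows are bold in the table, not something the bot states.

```python
def relative_change(ops: float, ops_on_head: float) -> float:
    """Percent change of this run's throughput relative to repo HEAD."""
    return (ops - ops_on_head) / ops_on_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a benchmark; the 5% threshold is an assumed value."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# (name, Ops, Ops on Repo HEAD) copied from the CPU table; each pair
# shares the same unit, so the ratio is unit-free.
rows = [
    ("test_plain_set_nested", 47.7708, 48.8313),
    ("test_membership", 1.1547, 1.3768),
    ("test_common_ops", 1.2576, 1.1953),
]

for name, ops, head in rows:
    change = relative_change(ops, head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under this reading the three rows come out at -2.17%, -16.13% and +5.21%, matching the table.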

github-actions bot commented Feb 6, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}51$. Worsened: $\large\color{#d91a1a}12$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 34.9410μs 11.3443μs 88.1499 KOps/s 73.7386 KOps/s $\textbf{\color{#35bf28}+19.54\%}$
test_plain_set_stack_nested 43.8610μs 11.4233μs 87.5403 KOps/s 74.9246 KOps/s $\textbf{\color{#35bf28}+16.84\%}$
test_plain_set_nested_inplace 46.5000μs 12.4216μs 80.5047 KOps/s 69.0263 KOps/s $\textbf{\color{#35bf28}+16.63\%}$
test_plain_set_stack_nested_inplace 42.5310μs 12.5029μs 79.9816 KOps/s 67.5499 KOps/s $\textbf{\color{#35bf28}+18.40\%}$
test_items 22.4100μs 2.9901μs 334.4325 KOps/s 344.3494 KOps/s $\color{#d91a1a}-2.88\%$
test_items_nested 0.4153ms 0.3706ms 2.6981 KOps/s 2.7268 KOps/s $\color{#d91a1a}-1.05\%$
test_items_nested_locked 0.3993ms 0.3734ms 2.6782 KOps/s 2.7441 KOps/s $\color{#d91a1a}-2.40\%$
test_items_nested_leaf 84.0810μs 59.4042μs 16.8338 KOps/s 16.9621 KOps/s $\color{#d91a1a}-0.76\%$
test_items_stack_nested 2.8936ms 0.3757ms 2.6619 KOps/s 2.7378 KOps/s $\color{#d91a1a}-2.77\%$
test_items_stack_nested_leaf 97.0210μs 60.7154μs 16.4703 KOps/s 16.8073 KOps/s $\color{#d91a1a}-2.01\%$
test_items_stack_nested_locked 0.4131ms 0.3702ms 2.7013 KOps/s 2.7323 KOps/s $\color{#d91a1a}-1.13\%$
test_keys 23.0500μs 3.4592μs 289.0828 KOps/s 288.8354 KOps/s $\color{#35bf28}+0.09\%$
test_keys_nested 0.1218ms 89.1283μs 11.2198 KOps/s 11.2892 KOps/s $\color{#d91a1a}-0.61\%$
test_keys_nested_locked 0.7250ms 95.5117μs 10.4699 KOps/s 10.5866 KOps/s $\color{#d91a1a}-1.10\%$
test_keys_nested_leaf 0.1129ms 80.0278μs 12.4957 KOps/s 12.6091 KOps/s $\color{#d91a1a}-0.90\%$
test_keys_stack_nested 0.1197ms 89.6451μs 11.1551 KOps/s 11.2581 KOps/s $\color{#d91a1a}-0.92\%$
test_keys_stack_nested_leaf 0.1188ms 81.4310μs 12.2803 KOps/s 12.5847 KOps/s $\color{#d91a1a}-2.42\%$
test_keys_stack_nested_locked 0.1470ms 96.1369μs 10.4018 KOps/s 10.5030 KOps/s $\color{#d91a1a}-0.96\%$
test_values 6.8733μs 0.8520μs 1.1737 MOps/s 1.1739 MOps/s $\color{#d91a1a}-0.01\%$
test_values_nested 85.4410μs 37.9435μs 26.3550 KOps/s 26.7502 KOps/s $\color{#d91a1a}-1.48\%$
test_values_nested_locked 90.1110μs 39.3472μs 25.4148 KOps/s 25.6078 KOps/s $\color{#d91a1a}-0.75\%$
test_values_nested_leaf 0.1337ms 42.3877μs 23.5918 KOps/s 23.9066 KOps/s $\color{#d91a1a}-1.32\%$
test_values_stack_nested 77.4110μs 38.5448μs 25.9439 KOps/s 26.4399 KOps/s $\color{#d91a1a}-1.88\%$
test_values_stack_nested_leaf 81.8610μs 43.1330μs 23.1841 KOps/s 23.6338 KOps/s $\color{#d91a1a}-1.90\%$
test_values_stack_nested_locked 67.8810μs 39.8207μs 25.1126 KOps/s 25.4984 KOps/s $\color{#d91a1a}-1.51\%$
test_membership 1.6230μs 0.5101μs 1.9605 MOps/s 1.9873 MOps/s $\color{#d91a1a}-1.35\%$
test_membership_nested 28.7910μs 2.1222μs 471.2022 KOps/s 492.6942 KOps/s $\color{#d91a1a}-4.36\%$
test_membership_nested_leaf 16.7150μs 2.0306μs 492.4674 KOps/s 493.3532 KOps/s $\color{#d91a1a}-0.18\%$
test_membership_stacked_nested 40.0510μs 2.1206μs 471.5590 KOps/s 473.2419 KOps/s $\color{#d91a1a}-0.36\%$
test_membership_stacked_nested_leaf 24.7500μs 2.1119μs 473.5014 KOps/s 462.3722 KOps/s $\color{#35bf28}+2.41\%$
test_membership_nested_last 38.3300μs 3.2172μs 310.8315 KOps/s 320.1553 KOps/s $\color{#d91a1a}-2.91\%$
test_membership_nested_leaf_last 29.4400μs 3.1573μs 316.7305 KOps/s 318.4955 KOps/s $\color{#d91a1a}-0.55\%$
test_membership_stacked_nested_last 35.6000μs 6.8389μs 146.2222 KOps/s 318.6518 KOps/s $\textbf{\color{#d91a1a}-54.11\%}$
test_membership_stacked_nested_leaf_last 44.0210μs 6.8006μs 147.0463 KOps/s 319.6473 KOps/s $\textbf{\color{#d91a1a}-54.00\%}$
test_nested_getleaf 34.0900μs 6.1715μs 162.0346 KOps/s 161.4048 KOps/s $\color{#35bf28}+0.39\%$
test_nested_get 41.3100μs 5.8324μs 171.4568 KOps/s 168.5970 KOps/s $\color{#35bf28}+1.70\%$
test_stacked_getleaf 48.1910μs 6.1570μs 162.4172 KOps/s 161.8906 KOps/s $\color{#35bf28}+0.33\%$
test_stacked_get 36.8500μs 5.8715μs 170.3156 KOps/s 170.6999 KOps/s $\color{#d91a1a}-0.23\%$
test_nested_getitemleaf 29.9600μs 6.4369μs 155.3539 KOps/s 155.7804 KOps/s $\color{#d91a1a}-0.27\%$
test_nested_getitem 30.6100μs 6.2271μs 160.5897 KOps/s 161.8402 KOps/s $\color{#d91a1a}-0.77\%$
test_stacked_getitemleaf 43.3710μs 6.5886μs 151.7777 KOps/s 154.7392 KOps/s $\color{#d91a1a}-1.91\%$
test_stacked_getitem 32.3210μs 6.1344μs 163.0141 KOps/s 161.9394 KOps/s $\color{#35bf28}+0.66\%$
test_lock_nested 8.8174ms 0.3522ms 2.8391 KOps/s 2.8206 KOps/s $\color{#35bf28}+0.66\%$
test_lock_stack_nested 0.4354ms 0.3419ms 2.9251 KOps/s 2.8604 KOps/s $\color{#35bf28}+2.26\%$
test_unlock_nested 0.5386ms 0.2855ms 3.5022 KOps/s 3.5121 KOps/s $\color{#d91a1a}-0.28\%$
test_unlock_stack_nested 0.3367ms 0.2777ms 3.6006 KOps/s 3.4531 KOps/s $\color{#35bf28}+4.27\%$
test_flatten_speed 0.1129ms 77.0375μs 12.9807 KOps/s 12.9088 KOps/s $\color{#35bf28}+0.56\%$
test_unflatten_speed 0.3654ms 0.3295ms 3.0346 KOps/s 3.0825 KOps/s $\color{#d91a1a}-1.55\%$
test_common_ops 0.7555ms 0.5778ms 1.7307 KOps/s 1.4947 KOps/s $\textbf{\color{#35bf28}+15.79\%}$
test_creation 0.1136ms 1.7559μs 569.4967 KOps/s 587.7570 KOps/s $\color{#d91a1a}-3.11\%$
test_creation_empty 27.0800μs 6.5156μs 153.4771 KOps/s 99.3935 KOps/s $\textbf{\color{#35bf28}+54.41\%}$
test_creation_nested_1 30.3200μs 8.2287μs 121.5257 KOps/s 84.7385 KOps/s $\textbf{\color{#35bf28}+43.41\%}$
test_creation_nested_2 47.5310μs 10.8771μs 91.9361 KOps/s 68.4206 KOps/s $\textbf{\color{#35bf28}+34.37\%}$
test_clone 58.1410μs 10.6370μs 94.0117 KOps/s 86.8943 KOps/s $\textbf{\color{#35bf28}+8.19\%}$
test_getitem[int] 1.2873ms 10.5455μs 94.8274 KOps/s 91.9573 KOps/s $\color{#35bf28}+3.12\%$
test_getitem[slice_int] 0.1087ms 20.8337μs 47.9993 KOps/s 48.2119 KOps/s $\color{#d91a1a}-0.44\%$
test_getitem[range] 0.1382ms 37.0567μs 26.9857 KOps/s 26.3873 KOps/s $\color{#35bf28}+2.27\%$
test_getitem[tuple] 0.1071ms 18.0585μs 55.3756 KOps/s 53.8125 KOps/s $\color{#35bf28}+2.90\%$
test_getitem[list] 0.1322ms 33.5537μs 29.8029 KOps/s 29.5358 KOps/s $\color{#35bf28}+0.90\%$
test_setitem_dim[int] 57.0110μs 19.0272μs 52.5564 KOps/s 48.9652 KOps/s $\textbf{\color{#35bf28}+7.33\%}$
test_setitem_dim[slice_int] 78.4400μs 38.6099μs 25.9001 KOps/s 25.3157 KOps/s $\color{#35bf28}+2.31\%$
test_setitem_dim[range] 90.9110μs 53.2763μs 18.7701 KOps/s 18.5588 KOps/s $\color{#35bf28}+1.14\%$
test_setitem_dim[tuple] 52.4310μs 31.2135μs 32.0374 KOps/s 29.3524 KOps/s $\textbf{\color{#35bf28}+9.15\%}$
test_setitem 64.9910μs 14.3878μs 69.5034 KOps/s 57.8397 KOps/s $\textbf{\color{#35bf28}+20.17\%}$
test_set 62.4110μs 14.0654μs 71.0967 KOps/s 60.9970 KOps/s $\textbf{\color{#35bf28}+16.56\%}$
test_set_shared 0.5214ms 0.1666ms 6.0022 KOps/s 6.2251 KOps/s $\color{#d91a1a}-3.58\%$
test_update 0.4010ms 15.5710μs 64.2221 KOps/s 50.1540 KOps/s $\textbf{\color{#35bf28}+28.05\%}$
test_update_nested 78.0400μs 22.2364μs 44.9713 KOps/s 38.1439 KOps/s $\textbf{\color{#35bf28}+17.90\%}$
test_update__nested 0.5407ms 28.0970μs 35.5910 KOps/s 37.5250 KOps/s $\textbf{\color{#d91a1a}-5.15\%}$
test_set_nested 68.9210μs 16.2731μs 61.4510 KOps/s 56.2941 KOps/s $\textbf{\color{#35bf28}+9.16\%}$
test_set_nested_new 64.6810μs 19.6027μs 51.0133 KOps/s 48.5642 KOps/s $\textbf{\color{#35bf28}+5.04\%}$
test_select 88.5800μs 30.9948μs 32.2634 KOps/s 30.5621 KOps/s $\textbf{\color{#35bf28}+5.57\%}$
test_select_nested 78.2510μs 43.9568μs 22.7496 KOps/s 22.8212 KOps/s $\color{#d91a1a}-0.31\%$
test_exclude_nested 0.1033ms 64.3280μs 15.5453 KOps/s 15.6822 KOps/s $\color{#d91a1a}-0.87\%$
test_empty[True] 0.3486ms 0.3031ms 3.2996 KOps/s 3.3299 KOps/s $\color{#d91a1a}-0.91\%$
test_empty[False] 3.3991μs 0.8413μs 1.1887 MOps/s 1.2000 MOps/s $\color{#d91a1a}-0.95\%$
test_to 85.3510μs 57.6449μs 17.3476 KOps/s 15.5147 KOps/s $\textbf{\color{#35bf28}+11.81\%}$
test_to_nonblocking 90.6410μs 47.6275μs 20.9963 KOps/s 19.3672 KOps/s $\textbf{\color{#35bf28}+8.41\%}$
test_unbind_speed 0.2865ms 0.2368ms 4.2223 KOps/s 4.0010 KOps/s $\textbf{\color{#35bf28}+5.53\%}$
test_unbind_speed_stack0 0.3519ms 0.2359ms 4.2383 KOps/s 4.0580 KOps/s $\color{#35bf28}+4.44\%$
test_unbind_speed_stack1 92.8934ms 0.7284ms 1.3729 KOps/s 1.3422 KOps/s $\color{#35bf28}+2.29\%$
test_split 1.5821ms 1.4622ms 683.8949 Ops/s 629.7114 Ops/s $\textbf{\color{#35bf28}+8.60\%}$
test_chunk 95.4813ms 1.7303ms 577.9414 Ops/s 625.0394 Ops/s $\textbf{\color{#d91a1a}-7.54\%}$
test_consolidate[False-None] 3.3636ms 2.6853ms 372.4013 Ops/s 368.4247 Ops/s $\color{#35bf28}+1.08\%$
test_consolidate[default-None] 1.7522ms 1.6855ms 593.2818 Ops/s 603.3532 Ops/s $\color{#d91a1a}-1.67\%$
test_consolidate[reduce-overhead-None] 1.7974ms 1.7232ms 580.3218 Ops/s 595.1270 Ops/s $\color{#d91a1a}-2.49\%$
test_consolidate_njt[False-None] 6.9940ms 6.6503ms 150.3696 Ops/s 152.0889 Ops/s $\color{#d91a1a}-1.13\%$
test_to[False-False-None] 1.8208ms 1.7272ms 578.9700 Ops/s 574.7510 Ops/s $\color{#35bf28}+0.73\%$
test_to[True-False-None] 1.5663ms 1.3208ms 757.1003 Ops/s 747.6265 Ops/s $\color{#35bf28}+1.27\%$
test_to[within-False-None] 4.2326ms 4.1136ms 243.0940 Ops/s 239.4193 Ops/s $\color{#35bf28}+1.53\%$
test_to[True-default-None] 5.8267ms 5.4522ms 183.4131 Ops/s 177.4522 Ops/s $\color{#35bf28}+3.36\%$
test_to_njt[False-False-None] 7.3249ms 7.0166ms 142.5182 Ops/s 140.4201 Ops/s $\color{#35bf28}+1.49\%$
test_to_njt[True-False-None] 5.9812ms 5.5533ms 180.0734 Ops/s 175.1844 Ops/s $\color{#35bf28}+2.79\%$
test_to_njt[within-False-None] 12.9237ms 12.2764ms 81.4572 Ops/s 80.5714 Ops/s $\color{#35bf28}+1.10\%$
test_creation[device0] 0.4498ms 84.3632μs 11.8535 KOps/s 11.8487 KOps/s $\color{#35bf28}+0.04\%$
test_creation_from_tensor 0.4598ms 85.4832μs 11.6982 KOps/s 11.8112 KOps/s $\color{#d91a1a}-0.96\%$
test_add_one[memmap_tensor0] 0.4568ms 6.6389μs 150.6268 KOps/s 137.0051 KOps/s $\textbf{\color{#35bf28}+9.94\%}$
test_contiguous[memmap_tensor0] 2.0145μs 0.4172μs 2.3971 MOps/s 2.3844 MOps/s $\color{#35bf28}+0.53\%$
test_stack[memmap_tensor0] 41.5500μs 4.3876μs 227.9169 KOps/s 223.1432 KOps/s $\color{#35bf28}+2.14\%$
test_memmaptd_index 1.6174ms 0.2410ms 4.1500 KOps/s 4.1503 KOps/s $-0.01\%$
test_memmaptd_index_astensor 0.4349ms 0.3048ms 3.2809 KOps/s 3.2786 KOps/s $\color{#35bf28}+0.07\%$
test_memmaptd_index_op 0.6567ms 0.5472ms 1.8275 KOps/s 1.6118 KOps/s $\textbf{\color{#35bf28}+13.38\%}$
test_serialize_model 0.4538s 0.1763s 5.6708 Ops/s 7.7031 Ops/s $\textbf{\color{#d91a1a}-26.38\%}$
test_serialize_model_pickle 1.3454s 1.1912s 0.8395 Ops/s 0.8252 Ops/s $\color{#35bf28}+1.73\%$
test_serialize_weights 0.1313s 0.1298s 7.7021 Ops/s 7.7351 Ops/s $\color{#d91a1a}-0.43\%$
test_serialize_weights_returnearly 0.3368s 55.1331ms 18.1379 Ops/s 22.8396 Ops/s $\textbf{\color{#d91a1a}-20.59\%}$
test_serialize_weights_pickle 1.3741s 1.1947s 0.8370 Ops/s 0.8193 Ops/s $\color{#35bf28}+2.17\%$
test_reshape_pytree 59.7110μs 22.1821μs 45.0813 KOps/s 44.3987 KOps/s $\color{#35bf28}+1.54\%$
test_reshape_td 61.0600μs 27.3808μs 36.5220 KOps/s 35.4386 KOps/s $\color{#35bf28}+3.06\%$
test_view_pytree 49.6800μs 22.5238μs 44.3974 KOps/s 44.6533 KOps/s $\color{#d91a1a}-0.57\%$
test_view_td 71.9900μs 32.1827μs 31.0726 KOps/s 27.8505 KOps/s $\textbf{\color{#35bf28}+11.57\%}$
test_unbind_pytree 56.6910μs 28.0294μs 35.6768 KOps/s 35.1528 KOps/s $\color{#35bf28}+1.49\%$
test_unbind_td 0.7173ms 36.4772μs 27.4144 KOps/s 26.7809 KOps/s $\color{#35bf28}+2.37\%$
test_split_pytree 67.7810μs 31.4416μs 31.8050 KOps/s 32.8956 KOps/s $\color{#d91a1a}-3.32\%$
test_split_td 0.9177ms 38.5282μs 25.9550 KOps/s 25.0164 KOps/s $\color{#35bf28}+3.75\%$
test_add_pytree 81.8410μs 34.7495μs 28.7774 KOps/s 26.8524 KOps/s $\textbf{\color{#35bf28}+7.17\%}$
test_add_td 0.1824ms 47.4006μs 21.0968 KOps/s 18.3472 KOps/s $\textbf{\color{#35bf28}+14.99\%}$
test_compile_add_one_nested[tensordict-compile] 0.1829ms 0.1282ms 7.8014 KOps/s 7.6857 KOps/s $\color{#35bf28}+1.51\%$
test_compile_add_one_nested[tensordict-eager] 0.2519ms 0.1344ms 7.4413 KOps/s 7.3010 KOps/s $\color{#35bf28}+1.92\%$
test_compile_add_one_nested[pytree-compile] 0.1351ms 97.3163μs 10.2758 KOps/s 10.1922 KOps/s $\color{#35bf28}+0.82\%$
test_compile_add_one_nested[pytree-eager] 1.4343ms 0.1494ms 6.6928 KOps/s 6.4452 KOps/s $\color{#35bf28}+3.84\%$
test_compile_copy_nested[tensordict-compile] 60.4100μs 25.1777μs 39.7177 KOps/s 40.7287 KOps/s $\color{#d91a1a}-2.48\%$
test_compile_copy_nested[tensordict-eager] 72.3200μs 29.4899μs 33.9100 KOps/s 33.6491 KOps/s $\color{#35bf28}+0.78\%$
test_compile_copy_nested[pytree-compile] 98.9510μs 65.1671μs 15.3452 KOps/s 15.7283 KOps/s $\color{#d91a1a}-2.44\%$
test_compile_copy_nested[pytree-eager] 89.5800μs 49.4758μs 20.2119 KOps/s 20.2317 KOps/s $\color{#d91a1a}-0.10\%$
test_compile_add_one_flat[tensordict-compile] 0.2021ms 0.1430ms 6.9953 KOps/s 7.0468 KOps/s $\color{#d91a1a}-0.73\%$
test_compile_add_one_flat[tensordict-eager] 0.3167ms 0.2172ms 4.6041 KOps/s 4.5541 KOps/s $\color{#35bf28}+1.10\%$
test_compile_add_one_flat[tensorclass-compile] 0.1467ms 97.4922μs 10.2572 KOps/s 9.9880 KOps/s $\color{#35bf28}+2.70\%$
test_compile_add_one_flat[tensorclass-eager] 0.1146ms 55.3484μs 18.0674 KOps/s 16.8540 KOps/s $\textbf{\color{#35bf28}+7.20\%}$
test_compile_add_one_flat[pytree-compile] 0.1901ms 0.1407ms 7.1093 KOps/s 7.3482 KOps/s $\color{#d91a1a}-3.25\%$
test_compile_add_one_flat[pytree-eager] 0.5384ms 0.4776ms 2.0939 KOps/s 2.0001 KOps/s $\color{#35bf28}+4.69\%$
test_compile_add_self_flat[tensordict-eager] 0.4157ms 0.2621ms 3.8153 KOps/s 3.7964 KOps/s $\color{#35bf28}+0.50\%$
test_compile_add_self_flat[tensordict-compile] 0.2103ms 0.1445ms 6.9193 KOps/s 6.7947 KOps/s $\color{#35bf28}+1.83\%$
test_compile_add_self_flat[tensorclass-eager] 0.1649ms 67.8231μs 14.7442 KOps/s 14.0381 KOps/s $\textbf{\color{#35bf28}+5.03\%}$
test_compile_add_self_flat[tensorclass-compile] 0.1448ms 0.1003ms 9.9707 KOps/s 9.5957 KOps/s $\color{#35bf28}+3.91\%$
test_compile_add_self_flat[pytree-eager] 0.4770ms 0.4062ms 2.4618 KOps/s 2.4421 KOps/s $\color{#35bf28}+0.81\%$
test_compile_add_self_flat[pytree-compile] 0.1745ms 0.1367ms 7.3156 KOps/s 7.4194 KOps/s $\color{#d91a1a}-1.40\%$
test_compile_copy_flat[tensordict-compile] 45.0600μs 19.5587μs 51.1282 KOps/s 52.8573 KOps/s $\color{#d91a1a}-3.27\%$
test_compile_copy_flat[tensordict-eager] 67.4610μs 31.4959μs 31.7501 KOps/s 32.2290 KOps/s $\color{#d91a1a}-1.49\%$
test_compile_copy_flat[pytree-compile] 0.1050ms 70.6913μs 14.1460 KOps/s 14.2827 KOps/s $\color{#d91a1a}-0.96\%$
test_compile_copy_flat[pytree-eager] 0.1395ms 51.4032μs 19.4541 KOps/s 19.3362 KOps/s $\color{#35bf28}+0.61\%$
test_compile_assign_and_add[tensordict-compile] 1.6272ms 0.3931ms 2.5438 KOps/s 2.2335 KOps/s $\textbf{\color{#35bf28}+13.89\%}$
test_compile_assign_and_add[tensordict-eager] 2.9652ms 2.7324ms 365.9796 Ops/s 368.3649 Ops/s $\color{#d91a1a}-0.65\%$
test_compile_assign_and_add[pytree-compile] 1.5697ms 0.4276ms 2.3386 KOps/s 2.1996 KOps/s $\textbf{\color{#35bf28}+6.32\%}$
test_compile_assign_and_add[pytree-eager] 2.8119ms 2.6481ms 377.6232 Ops/s 369.6093 Ops/s $\color{#35bf28}+2.17\%$
test_compile_indexing[tensor-tensordict-compile] 0.5941ms 0.1208ms 8.2755 KOps/s 8.2725 KOps/s $\color{#35bf28}+0.04\%$
test_compile_indexing[tensor-tensordict-eager] 0.6424ms 85.4149μs 11.7076 KOps/s 11.6985 KOps/s $\color{#35bf28}+0.08\%$
test_compile_indexing[tensor-tensorclass-compile] 0.4365ms 0.1127ms 8.8733 KOps/s 8.8510 KOps/s $\color{#35bf28}+0.25\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1052ms 68.7010μs 14.5558 KOps/s 13.4912 KOps/s $\textbf{\color{#35bf28}+7.89\%}$
test_compile_indexing[tensor-pytree-compile] 0.1716ms 0.1078ms 9.2750 KOps/s 8.7332 KOps/s $\textbf{\color{#35bf28}+6.20\%}$
test_compile_indexing[tensor-pytree-eager] 0.1207ms 71.7924μs 13.9291 KOps/s 13.5088 KOps/s $\color{#35bf28}+3.11\%$
test_compile_indexing[slice-tensordict-compile] 0.1510ms 0.1040ms 9.6110 KOps/s 9.4226 KOps/s $\color{#35bf28}+2.00\%$
test_compile_indexing[slice-tensordict-eager] 0.1550ms 18.9258μs 52.8378 KOps/s 52.2490 KOps/s $\color{#35bf28}+1.13\%$
test_compile_indexing[slice-tensorclass-compile] 0.1556ms 96.8789μs 10.3222 KOps/s 10.4082 KOps/s $\color{#d91a1a}-0.83\%$
test_compile_indexing[slice-tensorclass-eager] 68.2910μs 16.1161μs 62.0499 KOps/s 61.9189 KOps/s $\color{#35bf28}+0.21\%$
test_compile_indexing[slice-pytree-compile] 0.1501ms 98.4032μs 10.1623 KOps/s 9.9141 KOps/s $\color{#35bf28}+2.50\%$
test_compile_indexing[slice-pytree-eager] 61.0610μs 16.2478μs 61.5469 KOps/s 61.2745 KOps/s $\color{#35bf28}+0.44\%$
test_compile_indexing[int-tensordict-compile] 0.1564ms 0.1062ms 9.4196 KOps/s 9.3314 KOps/s $\color{#35bf28}+0.94\%$
test_compile_indexing[int-tensordict-eager] 0.5852ms 18.5631μs 53.8703 KOps/s 54.5752 KOps/s $\color{#d91a1a}-1.29\%$
test_compile_indexing[int-tensorclass-compile] 0.1591ms 0.1013ms 9.8714 KOps/s 9.6548 KOps/s $\color{#35bf28}+2.24\%$
test_compile_indexing[int-tensorclass-eager] 64.8310μs 15.9995μs 62.5020 KOps/s 61.5621 KOps/s $\color{#35bf28}+1.53\%$
test_compile_indexing[int-pytree-compile] 0.1536ms 96.1248μs 10.4031 KOps/s 9.7468 KOps/s $\textbf{\color{#35bf28}+6.73\%}$
test_compile_indexing[int-pytree-eager] 42.8710μs 16.0195μs 62.4239 KOps/s 61.4608 KOps/s $\color{#35bf28}+1.57\%$
test_mod_add[eager] 0.1262ms 37.3214μs 26.7943 KOps/s 22.8649 KOps/s $\textbf{\color{#35bf28}+17.18\%}$
test_mod_add[compile] 0.4106ms 81.3505μs 12.2925 KOps/s 11.7005 KOps/s $\textbf{\color{#35bf28}+5.06\%}$
test_mod_add[compile-overhead] 0.3418ms 0.1714ms 5.8347 KOps/s 5.7598 KOps/s $\color{#35bf28}+1.30\%$
test_mod_wrap[eager] 0.3287ms 0.2487ms 4.0204 KOps/s 3.8829 KOps/s $\color{#35bf28}+3.54\%$
test_mod_wrap[compile] 0.6306ms 0.2957ms 3.3819 KOps/s 3.4748 KOps/s $\color{#d91a1a}-2.67\%$
test_mod_wrap[compile-overhead] 6.9695ms 3.7183ms 268.9401 Ops/s 268.1232 Ops/s $\color{#35bf28}+0.30\%$
test_mod_wrap_and_backward[eager] 1.5123ms 1.3613ms 734.5931 Ops/s 676.9530 Ops/s $\textbf{\color{#35bf28}+8.51\%}$
test_mod_wrap_and_backward[compile] 1.5055ms 1.2778ms 782.5868 Ops/s 722.3606 Ops/s $\textbf{\color{#35bf28}+8.34\%}$
test_mod_wrap_and_backward[compile-overhead] 1.4185ms 0.9337ms 1.0710 KOps/s 963.2584 Ops/s $\textbf{\color{#35bf28}+11.18\%}$
test_seq_add[eager] 0.1814ms 0.1246ms 8.0239 KOps/s 8.2413 KOps/s $\color{#d91a1a}-2.64\%$
test_seq_add[compile] 0.1447ms 94.5986μs 10.5710 KOps/s 11.4898 KOps/s $\textbf{\color{#d91a1a}-8.00\%}$
test_seq_add[compile-overhead] 0.1822ms 0.1371ms 7.2932 KOps/s 7.7384 KOps/s $\textbf{\color{#d91a1a}-5.75\%}$
test_seq_wrap[eager] 0.5584ms 0.4429ms 2.2576 KOps/s 2.3047 KOps/s $\color{#d91a1a}-2.04\%$
test_seq_wrap[compile] 0.3719ms 0.3143ms 3.1818 KOps/s 3.3094 KOps/s $\color{#d91a1a}-3.86\%$
test_seq_wrap[compile-overhead] 0.2791ms 0.2312ms 4.3257 KOps/s 4.4251 KOps/s $\color{#d91a1a}-2.25\%$
test_func_call_runtime[False-eager] 1.0820ms 0.7784ms 1.2847 KOps/s 1.3350 KOps/s $\color{#d91a1a}-3.77\%$
test_func_call_runtime[False-compile] 0.9246ms 0.7784ms 1.2847 KOps/s 1.3131 KOps/s $\color{#d91a1a}-2.17\%$
test_func_call_runtime[False-compile-overhead] 0.4349ms 0.3632ms 2.7531 KOps/s 2.7397 KOps/s $\color{#35bf28}+0.49\%$
test_func_call_runtime[True-eager] 0.9759ms 0.8953ms 1.1170 KOps/s 1.0872 KOps/s $\color{#35bf28}+2.74\%$
test_func_call_runtime[True-compile] 0.8808ms 0.7937ms 1.2600 KOps/s 1.2475 KOps/s $\color{#35bf28}+1.00\%$
test_func_call_runtime[True-compile-overhead] 0.5088ms 0.3959ms 2.5258 KOps/s 2.6035 KOps/s $\color{#d91a1a}-2.98\%$
test_func_call_cm_runtime[False-eager] 1.1210ms 0.7564ms 1.3220 KOps/s 1.2544 KOps/s $\textbf{\color{#35bf28}+5.39\%}$
test_func_call_cm_runtime[False-compile] 1.1348ms 0.7439ms 1.3442 KOps/s 1.3003 KOps/s $\color{#35bf28}+3.37\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4604ms 0.3732ms 2.6794 KOps/s 2.7075 KOps/s $\color{#d91a1a}-1.04\%$
test_func_call_cm_runtime[True-eager] 1.1748ms 1.0043ms 995.7074 Ops/s 973.5487 Ops/s $\color{#35bf28}+2.28\%$
test_func_call_cm_runtime[True-compile] 1.4088ms 0.9878ms 1.0124 KOps/s 998.1195 Ops/s $\color{#35bf28}+1.43\%$
test_func_call_cm_runtime[True-compile-overhead] 1.4087ms 0.9889ms 1.0112 KOps/s 982.0557 Ops/s $\color{#35bf28}+2.97\%$
test_vmap_func_call_cm_runtime[eager] 2.6056ms 2.1084ms 474.3037 Ops/s 470.5406 Ops/s $\color{#35bf28}+0.80\%$
test_vmap_func_call_cm_runtime[compile] 1.2080ms 0.8454ms 1.1829 KOps/s 1.1591 KOps/s $\color{#35bf28}+2.05\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.8354ms 0.4257ms 2.3488 KOps/s 2.3694 KOps/s $\color{#d91a1a}-0.87\%$
test_distributed 3.0709ms 0.2359ms 4.2390 KOps/s 8.3073 KOps/s $\textbf{\color{#d91a1a}-48.97\%}$
test_tdmodule 0.1712ms 19.5604μs 51.1237 KOps/s 47.4626 KOps/s $\textbf{\color{#35bf28}+7.71\%}$
test_tdmodule_dispatch 0.4325ms 34.8469μs 28.6969 KOps/s 26.4580 KOps/s $\textbf{\color{#35bf28}+8.46\%}$
test_tdseq 37.9410μs 19.2452μs 51.9610 KOps/s 44.8289 KOps/s $\textbf{\color{#35bf28}+15.91\%}$
test_tdseq_dispatch 58.7310μs 35.8927μs 27.8608 KOps/s 23.9665 KOps/s $\textbf{\color{#35bf28}+16.25\%}$
test_instantiation_functorch 1.9466ms 1.5721ms 636.1002 Ops/s 627.6643 Ops/s $\color{#35bf28}+1.34\%$
test_exec_functorch 0.1934ms 0.1422ms 7.0303 KOps/s 6.4288 KOps/s $\textbf{\color{#35bf28}+9.36\%}$
test_exec_functional_call 0.5199ms 0.1359ms 7.3570 KOps/s 6.5733 KOps/s $\textbf{\color{#35bf28}+11.92\%}$
test_exec_td_decorator 0.5900ms 0.1864ms 5.3638 KOps/s 4.9847 KOps/s $\textbf{\color{#35bf28}+7.60\%}$
test_vmap_mlp_speed_decorator[True-True] 1.1099ms 0.7198ms 1.3894 KOps/s 1.3723 KOps/s $\color{#35bf28}+1.25\%$
test_vmap_mlp_speed_decorator[True-False] 1.1148ms 0.6922ms 1.4447 KOps/s 1.4009 KOps/s $\color{#35bf28}+3.13\%$
test_vmap_mlp_speed_decorator[False-True] 1.0444ms 0.6360ms 1.5722 KOps/s 1.6478 KOps/s $\color{#d91a1a}-4.59\%$
test_vmap_mlp_speed_decorator[False-False] 0.7399ms 0.6292ms 1.5892 KOps/s 1.6237 KOps/s $\color{#d91a1a}-2.13\%$
test_vmap_transformer_speed_decorator[True-True] 19.3854ms 19.2637ms 51.9111 Ops/s 51.6513 Ops/s $\color{#35bf28}+0.50\%$
test_vmap_transformer_speed_decorator[True-False] 19.3586ms 19.2600ms 51.9211 Ops/s 51.5917 Ops/s $\color{#35bf28}+0.64\%$
test_vmap_transformer_speed_decorator[False-True] 19.2244ms 19.1130ms 52.3205 Ops/s 51.5296 Ops/s $\color{#35bf28}+1.53\%$
test_vmap_transformer_speed_decorator[False-False] 19.3283ms 19.1245ms 52.2888 Ops/s 51.5509 Ops/s $\color{#35bf28}+1.43\%$
test_to_module_speed[True] 1.4207ms 0.9725ms 1.0283 KOps/s 1.0380 KOps/s $\color{#d91a1a}-0.94\%$
test_to_module_speed[False] 1.1433ms 0.9474ms 1.0556 KOps/s 1.0472 KOps/s $\color{#35bf28}+0.80\%$
test_tc_init 87.7810μs 35.9337μs 27.8290 KOps/s 25.1544 KOps/s $\textbf{\color{#35bf28}+10.63\%}$
test_tc_init_nested 0.1076ms 73.0131μs 13.6962 KOps/s 12.5288 KOps/s $\textbf{\color{#35bf28}+9.32\%}$
test_tc_first_layer_tensor 5.2743μs 0.7005μs 1.4275 MOps/s 1.1888 MOps/s $\textbf{\color{#35bf28}+20.08\%}$
test_tc_first_layer_nontensor 29.1010μs 2.2605μs 442.3709 KOps/s 440.3719 KOps/s $\color{#35bf28}+0.45\%$
test_tc_second_layer_tensor 7.9250μs 1.4187μs 704.8944 KOps/s 701.4689 KOps/s $\color{#35bf28}+0.49\%$
test_tc_second_layer_nontensor 23.0400μs 3.0400μs 328.9454 KOps/s 334.6317 KOps/s $\color{#d91a1a}-1.70\%$
test_unbind 0.2195s 12.2590ms 81.5725 Ops/s 144.2281 Ops/s $\textbf{\color{#d91a1a}-43.44\%}$
test_full_like 9.7426ms 9.1609ms 109.1592 Ops/s 107.5303 Ops/s $\color{#35bf28}+1.51\%$
test_zeros_like 4.8653ms 4.3186ms 231.5580 Ops/s 114.9075 Ops/s $\textbf{\color{#35bf28}+101.52\%}$
test_ones_like 5.4531ms 4.3359ms 230.6328 Ops/s 230.8668 Ops/s $\color{#d91a1a}-0.10\%$
test_clone 7.0312ms 6.3979ms 156.3001 Ops/s 155.7481 Ops/s $\color{#35bf28}+0.35\%$
test_squeeze 61.8410μs 10.2518μs 97.5435 KOps/s 103.1150 KOps/s $\textbf{\color{#d91a1a}-5.40\%}$
test_unsqueeze 0.1257ms 74.1656μs 13.4833 KOps/s 13.6365 KOps/s $\color{#d91a1a}-1.12\%$
test_split 0.3788ms 0.1697ms 5.8932 KOps/s 6.2177 KOps/s $\textbf{\color{#d91a1a}-5.22\%}$
test_permute 0.2895ms 0.1873ms 5.3384 KOps/s 5.5541 KOps/s $\color{#d91a1a}-3.88\%$
test_stack 53.4303ms 51.9466ms 19.2505 Ops/s 19.8721 Ops/s $\color{#d91a1a}-3.13\%$
test_cat 50.8474ms 50.3313ms 19.8684 Ops/s 19.8797 Ops/s $\color{#d91a1a}-0.06\%$
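
The Change column appears to be the relative difference between the Ops measured on this PR and the Ops on repo HEAD. The snippet below is a minimal sketch under that assumption, not the benchmark bot's actual code; `to_ops_per_second` and `relative_change` are hypothetical helper names introduced for illustration. Since the table shows rounded throughput values, a recomputed percentage may differ from the reported one in the last digit.

```python
# Assumed formula: change = (ops_pr / ops_head - 1) * 100.
# The unit table covers the three suffixes that appear in the results.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def to_ops_per_second(text: str) -> float:
    """Convert a cell such as '69.5034 KOps/s' to plain operations per second."""
    value, unit = text.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(pr_ops: str, head_ops: str) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    return (to_ops_per_second(pr_ops) / to_ops_per_second(head_ops) - 1.0) * 100.0


# Rows copied from the table above: (name, Ops, Ops on Repo HEAD).
rows = [
    ("test_setitem", "69.5034 KOps/s", "57.8397 KOps/s"),
    ("test_distributed", "4.2390 KOps/s", "8.3073 KOps/s"),
    ("test_zeros_like", "231.5580 Ops/s", "114.9075 Ops/s"),
]

for name, pr_ops, head_ops in rows:
    print(f"{name}: {relative_change(pr_ops, head_ops):+.2f}%")
```

Rows that mix units (for example a KOps/s value against an Ops/s baseline) go through the same conversion before the ratio is taken.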

@vmoens vmoens merged commit 708f09c into gh/vmoens/48/base Feb 6, 2025
50 of 51 checks passed
vmoens added a commit that referenced this pull request Feb 6, 2025
ghstack-source-id: 8515d8393c9b6f3deb1c76b2161fab58599e4945
Pull Request resolved: #1212
@vmoens vmoens deleted the gh/vmoens/48/head branch February 6, 2025 13:53
Labels
bug, CLA Signed, suitable for minor