[BugFix] Proper auto-batch size for unbatched tensors #1213
Merged
vmoens added a commit that referenced this pull request on Feb 6, 2025:
ghstack-source-id: 1ad6616dfcdd55bd055512e96a1e942b27d02ec8
Pull Request resolved: #1213
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 57.1070μs | 20.9949μs | 47.6306 KOps/s | 47.3509 KOps/s | |
test_plain_set_stack_nested | 60.3330μs | 21.3570μs | 46.8230 KOps/s | 47.2402 KOps/s | |
test_plain_set_nested_inplace | 52.1180μs | 23.0887μs | 43.3112 KOps/s | 43.2040 KOps/s | |
test_plain_set_stack_nested_inplace | 77.2350μs | 23.0352μs | 43.4119 KOps/s | 43.2579 KOps/s | |
test_items | 22.9730μs | 4.2081μs | 237.6365 KOps/s | 239.3590 KOps/s | |
test_items_nested | 0.5575ms | 0.4090ms | 2.4450 KOps/s | 2.4625 KOps/s | |
test_items_nested_locked | 0.5758ms | 0.4086ms | 2.4476 KOps/s | 2.4693 KOps/s | |
test_items_nested_leaf | 0.1386ms | 76.5577μs | 13.0620 KOps/s | 12.7316 KOps/s | |
test_items_stack_nested | 0.5105ms | 0.4141ms | 2.4149 KOps/s | 2.4587 KOps/s | |
test_items_stack_nested_leaf | 0.1516ms | 76.7125μs | 13.0357 KOps/s | 12.6961 KOps/s | |
test_items_stack_nested_locked | 0.5792ms | 0.4117ms | 2.4292 KOps/s | 2.4623 KOps/s | |
test_keys | 23.6540μs | 3.6379μs | 274.8804 KOps/s | 283.1849 KOps/s | |
test_keys_nested | 0.3841ms | 0.1665ms | 6.0049 KOps/s | 6.0751 KOps/s | |
test_keys_nested_locked | 1.8287ms | 0.1716ms | 5.8274 KOps/s | 5.8874 KOps/s | |
test_keys_nested_leaf | 0.2346ms | 0.1442ms | 6.9348 KOps/s | 6.9952 KOps/s | |
test_keys_stack_nested | 0.2639ms | 0.1639ms | 6.1003 KOps/s | 5.9270 KOps/s | |
test_keys_stack_nested_leaf | 0.3663ms | 0.1446ms | 6.9142 KOps/s | 7.0398 KOps/s | |
test_keys_stack_nested_locked | 0.2956ms | 0.1706ms | 5.8610 KOps/s | 5.8567 KOps/s | |
test_values | 5.8368μs | 1.0471μs | 955.0585 KOps/s | 922.8459 KOps/s | |
test_values_nested | 0.1703ms | 64.6425μs | 15.4697 KOps/s | 16.0257 KOps/s | |
test_values_nested_locked | 0.1320ms | 62.6129μs | 15.9711 KOps/s | 14.9438 KOps/s | |
test_values_nested_leaf | 0.1665ms | 71.1531μs | 14.0542 KOps/s | 13.9788 KOps/s | |
test_values_stack_nested | 0.1238ms | 62.7405μs | 15.9387 KOps/s | 16.0849 KOps/s | |
test_values_stack_nested_leaf | 0.1243ms | 70.9114μs | 14.1021 KOps/s | 14.1068 KOps/s | |
test_values_stack_nested_locked | 0.1173ms | 63.0481μs | 15.8609 KOps/s | 16.1650 KOps/s | |
test_membership | 2.3865μs | 0.7362μs | 1.3583 MOps/s | 1.3925 MOps/s | |
test_membership_nested | 30.4670μs | 2.9125μs | 343.3439 KOps/s | 342.2051 KOps/s | |
test_membership_nested_leaf | 38.5090μs | 2.9186μs | 342.6277 KOps/s | 348.2251 KOps/s | |
test_membership_stacked_nested | 28.9340μs | 2.9232μs | 342.0875 KOps/s | 350.5871 KOps/s | |
test_membership_stacked_nested_leaf | 27.2710μs | 2.9090μs | 343.7634 KOps/s | 344.2109 KOps/s | |
test_membership_nested_last | 56.7260μs | 4.3027μs | 232.4112 KOps/s | 234.8880 KOps/s | |
test_membership_nested_leaf_last | 33.1120μs | 4.3829μs | 228.1578 KOps/s | 234.5166 KOps/s | |
test_membership_stacked_nested_last | 34.2850μs | 4.3258μs | 231.1693 KOps/s | 231.4955 KOps/s | |
test_membership_stacked_nested_leaf_last | 23.6840μs | 4.2784μs | 233.7318 KOps/s | 231.0090 KOps/s | |
test_nested_getleaf | 41.6080μs | 10.5816μs | 94.5040 KOps/s | 93.0397 KOps/s | |
test_nested_get | 55.4980μs | 9.9369μs | 100.6354 KOps/s | 97.5632 KOps/s | |
test_stacked_getleaf | 48.4700μs | 10.4552μs | 95.6464 KOps/s | 94.5557 KOps/s | |
test_stacked_get | 41.8590μs | 9.7382μs | 102.6881 KOps/s | 98.6425 KOps/s | |
test_nested_getitemleaf | 41.0070μs | 11.2888μs | 88.5836 KOps/s | 89.2181 KOps/s | |
test_nested_getitem | 34.4240μs | 10.6744μs | 93.6822 KOps/s | 92.3114 KOps/s | |
test_stacked_getitemleaf | 61.2750μs | 10.9861μs | 91.0239 KOps/s | 87.9566 KOps/s | |
test_stacked_getitem | 36.8790μs | 10.6455μs | 93.9360 KOps/s | 94.4582 KOps/s | |
test_lock_nested | 4.8462ms | 0.4182ms | 2.3909 KOps/s | 2.4561 KOps/s | |
test_lock_stack_nested | 0.7249ms | 0.4252ms | 2.3518 KOps/s | 2.3566 KOps/s | |
test_unlock_nested | 0.4830ms | 0.3389ms | 2.9510 KOps/s | 2.9957 KOps/s | |
test_unlock_stack_nested | 0.4230ms | 0.3453ms | 2.8963 KOps/s | 2.9095 KOps/s | |
test_flatten_speed | 0.1915ms | 0.1003ms | 9.9724 KOps/s | 9.8341 KOps/s | |
test_unflatten_speed | 0.6650ms | 0.5175ms | 1.9323 KOps/s | 1.9069 KOps/s | |
test_common_ops | 0.9938ms | 0.8425ms | 1.1870 KOps/s | 1.1856 KOps/s | |
test_creation | 43.4750μs | 2.4700μs | 404.8501 KOps/s | 409.4829 KOps/s | |
test_creation_empty | 40.8170μs | 13.1051μs | 76.3062 KOps/s | 77.1654 KOps/s | |
test_creation_nested_1 | 49.1920μs | 16.0730μs | 62.2162 KOps/s | 62.6387 KOps/s | |
test_creation_nested_2 | 50.8950μs | 20.4705μs | 48.8507 KOps/s | 48.5126 KOps/s | |
test_clone | 70.9840μs | 13.7898μs | 72.5173 KOps/s | 73.7750 KOps/s | |
test_getitem[int] | 0.8343ms | 13.0429μs | 76.6702 KOps/s | 78.9796 KOps/s | |
test_getitem[slice_int] | 0.1421ms | 24.6628μs | 40.5468 KOps/s | 41.2549 KOps/s | |
test_getitem[range] | 0.1899ms | 49.7749μs | 20.0905 KOps/s | 19.4154 KOps/s | |
test_getitem[tuple] | 0.1194ms | 19.9847μs | 50.0383 KOps/s | 49.4537 KOps/s | |
test_getitem[list] | 0.1538ms | 45.4943μs | 21.9808 KOps/s | 21.7180 KOps/s | |
test_setitem_dim[int] | 61.6460μs | 25.4827μs | 39.2423 KOps/s | 38.3277 KOps/s | |
test_setitem_dim[slice_int] | 0.1259ms | 53.8677μs | 18.5640 KOps/s | 19.0695 KOps/s | |
test_setitem_dim[range] | 0.1420ms | 80.5051μs | 12.4216 KOps/s | 12.6922 KOps/s | |
test_setitem_dim[tuple] | 75.1010μs | 41.8603μs | 23.8890 KOps/s | 23.3218 KOps/s | |
test_setitem | 71.2030μs | 22.1641μs | 45.1179 KOps/s | 46.0977 KOps/s | |
test_set | 72.4060μs | 21.4600μs | 46.5983 KOps/s | 47.3070 KOps/s | |
test_set_shared | 0.3999ms | 0.1837ms | 5.4430 KOps/s | 5.4130 KOps/s | |
test_update | 0.1166ms | 25.3697μs | 39.4170 KOps/s | 40.4770 KOps/s | |
test_update_nested | 98.9350μs | 35.6823μs | 28.0251 KOps/s | 28.2475 KOps/s | |
test_update__nested | 0.4387ms | 34.4127μs | 29.0591 KOps/s | 29.4901 KOps/s | |
test_set_nested | 78.0360μs | 23.6230μs | 42.3315 KOps/s | 43.5489 KOps/s | |
test_set_nested_new | 73.8290μs | 28.1466μs | 35.5283 KOps/s | 35.5630 KOps/s | |
test_select | 0.1077ms | 44.4116μs | 22.5166 KOps/s | 22.4954 KOps/s | |
test_select_nested | 0.1373ms | 63.6458μs | 15.7120 KOps/s | 15.8656 KOps/s | |
test_exclude_nested | 0.1685ms | 81.0528μs | 12.3376 KOps/s | 12.3626 KOps/s | |
test_empty[True] | 0.5984ms | 0.4117ms | 2.4292 KOps/s | 2.4616 KOps/s | |
test_empty[False] | 6.6300μs | 1.3860μs | 721.5146 KOps/s | 726.4998 KOps/s | |
test_unbind_speed | 0.4627ms | 0.2734ms | 3.6581 KOps/s | 3.7355 KOps/s | |
test_unbind_speed_stack0 | 0.3748ms | 0.2693ms | 3.7128 KOps/s | 3.7492 KOps/s | |
test_unbind_speed_stack1 | 0.1023s | 0.7435ms | 1.3449 KOps/s | 1.2426 KOps/s | |
test_split | 0.1049s | 1.7510ms | 571.1145 Ops/s | 562.8396 Ops/s | |
test_chunk | 0.1134s | 1.7863ms | 559.8285 Ops/s | 617.4815 Ops/s | |
test_consolidate_njt[False-None] | 8.6840ms | 8.1931ms | 122.0539 Ops/s | 107.4551 Ops/s | |
test_creation[device0] | 3.8754ms | 93.9538μs | 10.6435 KOps/s | 10.5794 KOps/s | |
test_creation_from_tensor | 0.2595ms | 94.5022μs | 10.5818 KOps/s | 10.5076 KOps/s | |
test_add_one[memmap_tensor0] | 83.9670μs | 5.2581μs | 190.1814 KOps/s | 186.5720 KOps/s | |
test_contiguous[memmap_tensor0] | 13.6150μs | 0.5071μs | 1.9722 MOps/s | 1.8946 MOps/s | |
test_stack[memmap_tensor0] | 30.0960μs | 3.7983μs | 263.2747 KOps/s | 270.9095 KOps/s | |
test_memmaptd_index | 1.1440ms | 0.2325ms | 4.3011 KOps/s | 4.3736 KOps/s | |
test_memmaptd_index_astensor | 0.4843ms | 0.3173ms | 3.1517 KOps/s | 3.1679 KOps/s | |
test_memmaptd_index_op | 0.8870ms | 0.6243ms | 1.6018 KOps/s | 1.5940 KOps/s | |
test_serialize_model | 0.2210s | 0.1329s | 7.5232 Ops/s | 8.6560 Ops/s | |
test_serialize_model_pickle | 0.4308s | 0.3889s | 2.5714 Ops/s | 2.5372 Ops/s | |
test_serialize_weights | 0.1285s | 0.1135s | 8.8115 Ops/s | 8.4364 Ops/s | |
test_serialize_weights_returnearly | 0.3594s | 0.1876s | 5.3313 Ops/s | 5.6365 Ops/s | |
test_serialize_weights_pickle | 0.5355s | 0.4207s | 2.3771 Ops/s | 2.5327 Ops/s | |
test_serialize_weights_filesystem | 0.2474s | 0.1554s | 6.4343 Ops/s | 7.0182 Ops/s | |
test_serialize_model_filesystem | 0.1547s | 0.1436s | 6.9643 Ops/s | 6.6335 Ops/s | |
test_reshape_pytree | 72.0750μs | 26.1560μs | 38.2321 KOps/s | 37.8392 KOps/s | |
test_reshape_td | 79.8190μs | 32.5921μs | 30.6823 KOps/s | 31.4498 KOps/s | |
test_view_pytree | 66.9150μs | 25.8773μs | 38.6439 KOps/s | 38.6282 KOps/s | |
test_view_td | 90.1580μs | 38.4297μs | 26.0215 KOps/s | 26.2487 KOps/s | |
test_unbind_pytree | 68.6580μs | 29.1583μs | 34.2956 KOps/s | 34.1627 KOps/s | |
test_unbind_td | 0.3363ms | 39.6499μs | 25.2208 KOps/s | 25.0255 KOps/s | |
test_split_pytree | 89.1970μs | 28.6044μs | 34.9596 KOps/s | 34.9636 KOps/s | |
test_split_td | 0.5907ms | 45.0840μs | 22.1808 KOps/s | 22.3819 KOps/s | |
test_add_pytree | 97.3830μs | 36.1285μs | 27.6789 KOps/s | 27.1092 KOps/s | |
test_add_td | 0.1438ms | 60.1986μs | 16.6117 KOps/s | 16.1146 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2063ms | 67.8793μs | 14.7320 KOps/s | 14.9304 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3061ms | 0.1715ms | 5.8321 KOps/s | 5.6990 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 96.5510μs | 46.4407μs | 21.5328 KOps/s | 21.5085 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2004ms | 0.1197ms | 8.3571 KOps/s | 8.2351 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 57.6280μs | 28.4809μs | 35.1113 KOps/s | 36.3469 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1252ms | 58.9449μs | 16.9650 KOps/s | 17.1440 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1906ms | 82.0468μs | 12.1882 KOps/s | 12.5092 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1348ms | 65.5904μs | 15.2461 KOps/s | 14.8708 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2462ms | 0.1083ms | 9.2296 KOps/s | 8.9758 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3691ms | 0.2179ms | 4.5894 KOps/s | 4.5901 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1117ms | 48.2675μs | 20.7179 KOps/s | 20.9094 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1908ms | 69.2666μs | 14.4370 KOps/s | 14.6627 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1869ms | 0.1021ms | 9.7897 KOps/s | 9.8070 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3275ms | 0.2051ms | 4.8746 KOps/s | 4.8958 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3713ms | 0.2333ms | 4.2864 KOps/s | 4.2783 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2150ms | 0.1115ms | 8.9652 KOps/s | 9.0390 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1421ms | 64.1538μs | 15.5875 KOps/s | 16.1087 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1299ms | 48.6225μs | 20.5666 KOps/s | 20.4847 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.3123ms | 0.1583ms | 6.3153 KOps/s | 6.2684 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2258ms | 0.1008ms | 9.9237 KOps/s | 9.6689 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 50.3750μs | 21.5255μs | 46.4566 KOps/s | 47.3016 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1341ms | 67.4743μs | 14.8205 KOps/s | 14.9753 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1602ms | 81.5523μs | 12.2621 KOps/s | 12.2914 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1292ms | 67.0616μs | 14.9117 KOps/s | 14.5980 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2983ms | 0.2167ms | 4.6141 KOps/s | 4.6269 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.7631ms | 1.4079ms | 710.2999 Ops/s | 716.8569 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3482ms | 0.2116ms | 4.7265 KOps/s | 4.7208 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.0078ms | 0.8374ms | 1.1942 KOps/s | 1.1968 KOps/s | |
test_compile_assign_and_add_stack[compile] | 1.0273ms | 0.4866ms | 2.0552 KOps/s | 2.1716 KOps/s | |
test_compile_assign_and_add_stack[eager] | 3.0436ms | 2.7773ms | 360.0634 Ops/s | 352.8001 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 97.2920μs | 39.1050μs | 25.5722 KOps/s | 26.0728 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5628ms | 33.0731μs | 30.2360 KOps/s | 29.7562 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 75.3510μs | 31.5108μs | 31.7351 KOps/s | 32.9986 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 82.7150μs | 22.9159μs | 43.6378 KOps/s | 44.0026 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 84.7490μs | 32.4947μs | 30.7742 KOps/s | 31.9310 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 88.6660μs | 22.8668μs | 43.7315 KOps/s | 43.4447 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1319ms | 54.1294μs | 18.4743 KOps/s | 18.3707 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.3799ms | 19.7023μs | 50.7555 KOps/s | 48.7564 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1157ms | 46.7913μs | 21.3715 KOps/s | 21.0231 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 77.2240μs | 18.2580μs | 54.7706 KOps/s | 52.8804 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1072ms | 47.1439μs | 21.2117 KOps/s | 20.7126 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 59.2110μs | 18.5488μs | 53.9118 KOps/s | 52.9155 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1293ms | 55.7975μs | 17.9219 KOps/s | 17.8321 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.0816ms | 19.8290μs | 50.4312 KOps/s | 50.3308 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1145ms | 47.4613μs | 21.0698 KOps/s | 20.8894 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 67.9480μs | 18.3744μs | 54.4234 KOps/s | 52.8649 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1141ms | 46.6165μs | 21.4516 KOps/s | 20.6856 KOps/s | |
test_compile_indexing[int-pytree-eager] | 56.9260μs | 18.2841μs | 54.6923 KOps/s | 53.6871 KOps/s | |
test_mod_add[eager] | 88.7560μs | 36.9388μs | 27.0718 KOps/s | 27.1944 KOps/s | |
test_mod_add[compile] | 0.1245ms | 66.9673μs | 14.9327 KOps/s | 15.1110 KOps/s | |
test_mod_add[compile-overhead] | 0.1259ms | 65.5926μs | 15.2456 KOps/s | 15.1043 KOps/s | |
test_mod_wrap[eager] | 0.3875ms | 0.2300ms | 4.3477 KOps/s | 4.2952 KOps/s | |
test_mod_wrap[compile] | 2.0261ms | 0.2345ms | 4.2650 KOps/s | 4.2648 KOps/s | |
test_mod_wrap[compile-overhead] | 0.4214ms | 0.2308ms | 4.3325 KOps/s | 4.2686 KOps/s | |
test_mod_wrap_and_backward[eager] | 16.8172ms | 12.9565ms | 77.1812 Ops/s | 73.9185 Ops/s | |
test_mod_wrap_and_backward[compile] | 14.5086ms | 11.9268ms | 83.8448 Ops/s | 86.4909 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 17.4350ms | 11.6809ms | 85.6101 Ops/s | 89.5651 Ops/s | |
test_seq_add[eager] | 0.2433ms | 0.1215ms | 8.2297 KOps/s | 7.9756 KOps/s | |
test_seq_add[compile] | 0.1493ms | 78.6100μs | 12.7210 KOps/s | 12.7766 KOps/s | |
test_seq_add[compile-overhead] | 0.1502ms | 78.0352μs | 12.8147 KOps/s | 13.2193 KOps/s | |
test_seq_wrap[eager] | 0.7818ms | 0.4619ms | 2.1650 KOps/s | 2.1327 KOps/s | |
test_seq_wrap[compile] | 0.4558ms | 0.2492ms | 4.0123 KOps/s | 4.0443 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4348ms | 0.2480ms | 4.0322 KOps/s | 4.0739 KOps/s | |
test_func_call_runtime[False-eager] | 0.8320ms | 0.5502ms | 1.8177 KOps/s | 1.7868 KOps/s | |
test_func_call_runtime[False-compile] | 0.5913ms | 0.4526ms | 2.2097 KOps/s | 2.2420 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6270ms | 0.4502ms | 2.2213 KOps/s | 2.2183 KOps/s | |
test_func_call_runtime[True-eager] | 0.8982ms | 0.7712ms | 1.2967 KOps/s | 1.2984 KOps/s | |
test_func_call_runtime[True-compile] | 0.9898ms | 0.4706ms | 2.1250 KOps/s | 2.1545 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 1.0208ms | 0.4715ms | 2.1209 KOps/s | 2.1124 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9346ms | 0.5546ms | 1.8030 KOps/s | 1.8008 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.5457ms | 0.4463ms | 2.2406 KOps/s | 2.2309 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.7330ms | 0.4439ms | 2.2527 KOps/s | 2.2178 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.8527ms | 0.9126ms | 1.0958 KOps/s | 1.0828 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.6044ms | 0.8175ms | 1.2233 KOps/s | 1.2249 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.1140ms | 0.8090ms | 1.2360 KOps/s | 1.2103 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5402ms | 1.9272ms | 518.8781 Ops/s | 512.7897 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.7495ms | 0.5415ms | 1.8469 KOps/s | 1.8291 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 1.0955ms | 0.5479ms | 1.8252 KOps/s | 1.8254 KOps/s | |
test_distributed | 1.3204ms | 0.1286ms | 7.7742 KOps/s | 7.7052 KOps/s | |
test_tdmodule | 84.7290μs | 28.1792μs | 35.4872 KOps/s | 35.6846 KOps/s | |
test_tdmodule_dispatch | 76.6330μs | 51.0532μs | 19.5874 KOps/s | 19.6791 KOps/s | |
test_tdseq | 56.0250μs | 29.7308μs | 33.6352 KOps/s | 32.8777 KOps/s | |
test_tdseq_dispatch | 74.6490μs | 55.4863μs | 18.0225 KOps/s | 17.6827 KOps/s | |
test_instantiation_functorch | 1.6083ms | 1.5146ms | 660.2616 Ops/s | 651.0226 Ops/s | |
test_exec_functorch | 0.3273ms | 0.1836ms | 5.4461 KOps/s | 5.5475 KOps/s | |
test_exec_functional_call | 0.3382ms | 0.1755ms | 5.6990 KOps/s | 5.6055 KOps/s | |
test_exec_td_decorator | 0.5138ms | 0.2358ms | 4.2409 KOps/s | 4.2233 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.2539ms | 0.6811ms | 1.4682 KOps/s | 1.4890 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9278ms | 0.6758ms | 1.4797 KOps/s | 1.4783 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8688ms | 0.5437ms | 1.8393 KOps/s | 1.8497 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7534ms | 0.5437ms | 1.8392 KOps/s | 1.8445 KOps/s | |
test_to_module_speed[True] | 2.1325ms | 1.3274ms | 753.3513 Ops/s | 759.0852 Ops/s | |
test_to_module_speed[False] | 1.8217ms | 1.2878ms | 776.5105 Ops/s | 773.7262 Ops/s | |
test_tc_init | 95.7390μs | 49.0205μs | 20.3996 KOps/s | 20.3739 KOps/s | |
test_tc_init_nested | 0.1901ms | 99.3319μs | 10.0673 KOps/s | 10.0808 KOps/s | |
test_tc_first_layer_tensor | 28.6630μs | 1.6113μs | 620.6139 KOps/s | 665.5847 KOps/s | |
test_tc_first_layer_nontensor | 23.4940μs | 4.7046μs | 212.5577 KOps/s | 218.9070 KOps/s | |
test_tc_second_layer_tensor | 34.5340μs | 2.8837μs | 346.7736 KOps/s | 362.4403 KOps/s | |
test_tc_second_layer_nontensor | 42.7500μs | 5.9779μs | 167.2833 KOps/s | 168.5727 KOps/s | |
test_unbind | 0.2541s | 14.3446ms | 69.7129 Ops/s | 75.7509 Ops/s | |
test_full_like | 9.7960ms | 7.7712ms | 128.6809 Ops/s | 117.8524 Ops/s | |
test_zeros_like | 4.5833ms | 3.0174ms | 331.4110 Ops/s | 367.2735 Ops/s | |
test_ones_like | 3.8613ms | 3.3535ms | 298.1966 Ops/s | 319.0202 Ops/s | |
test_clone | 6.5847ms | 5.4588ms | 183.1911 Ops/s | 196.2277 Ops/s | |
test_squeeze | 93.1650μs | 12.7253μs | 78.5837 KOps/s | 80.3976 KOps/s | |
test_unsqueeze | 0.1637ms | 93.1776μs | 10.7322 KOps/s | 10.9223 KOps/s | |
test_split | 0.3431ms | 0.1972ms | 5.0709 KOps/s | 5.0861 KOps/s | |
test_permute | 0.3500ms | 0.2021ms | 4.9484 KOps/s | 4.9232 KOps/s | |
test_stack | 27.9861ms | 25.5588ms | 39.1254 Ops/s | 38.6171 Ops/s | |
test_cat | 28.5121ms | 25.6800ms | 38.9409 Ops/s | 38.7201 Ops/s |
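
The Change column is empty in this capture of the table. As a rough aid for reading the numbers, the sketch below shows one way to derive a relative change from the Ops and Ops on Repo HEAD cells. The helper names `parse_ops` and `relative_change` are hypothetical, and defining the change as a percent difference relative to repo HEAD is an assumption, not necessarily what the benchmark bot computes.

```python
# Unit multipliers for the throughput strings that appear in the tables.
_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '47.6306 KOps/s' into operations per second."""
    value, unit = cell.strip().split()
    return float(value) * _UNITS[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Percent change of the PR throughput relative to repo HEAD (assumed definition)."""
    new, base = parse_ops(ops), parse_ops(ops_head)
    return (new - base) / base * 100.0


# First data row of the table above, copied verbatim.
row = "test_plain_set_nested | 57.1070μs | 20.9949μs | 47.6306 KOps/s | 47.3509 KOps/s | |"
cells = [c.strip() for c in row.split("|")]
change = relative_change(cells[3], cells[4])
print(f"{cells[0]}: {change:+.2f}%")
```

Differences of a few percent in either direction are common run-to-run noise in microbenchmarks, so a single row should be read with that in mind.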

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.2011ms | 11.2552μs | 88.8477 KOps/s | 77.0871 KOps/s | |
test_plain_set_stack_nested | 33.3710μs | 11.3554μs | 88.0640 KOps/s | 76.7191 KOps/s | |
test_plain_set_nested_inplace | 0.1771ms | 12.2472μs | 81.6515 KOps/s | 70.6160 KOps/s | |
test_plain_set_stack_nested_inplace | 41.0810μs | 12.3024μs | 81.2848 KOps/s | 71.4602 KOps/s | |
test_items | 71.9710μs | 2.8772μs | 347.5600 KOps/s | 345.3599 KOps/s | |
test_items_nested | 0.4222ms | 0.3690ms | 2.7100 KOps/s | 2.7281 KOps/s | |
test_items_nested_locked | 0.4187ms | 0.3647ms | 2.7416 KOps/s | 2.7519 KOps/s | |
test_items_nested_leaf | 80.4410μs | 58.9167μs | 16.9731 KOps/s | 16.9430 KOps/s | |
test_items_stack_nested | 0.4360ms | 0.3654ms | 2.7364 KOps/s | 2.7506 KOps/s | |
test_items_stack_nested_leaf | 92.6720μs | 60.2808μs | 16.5890 KOps/s | 16.5154 KOps/s | |
test_items_stack_nested_locked | 0.4150ms | 0.3714ms | 2.6926 KOps/s | 2.7261 KOps/s | |
test_keys | 29.5210μs | 3.5009μs | 285.6403 KOps/s | 284.1845 KOps/s | |
test_keys_nested | 0.1179ms | 88.6969μs | 11.2743 KOps/s | 11.4113 KOps/s | |
test_keys_nested_locked | 0.7495ms | 94.3335μs | 10.6007 KOps/s | 10.6893 KOps/s | |
test_keys_nested_leaf | 0.1060ms | 78.3669μs | 12.7605 KOps/s | 12.5692 KOps/s | |
test_keys_stack_nested | 0.1178ms | 89.0011μs | 11.2358 KOps/s | 11.1211 KOps/s | |
test_keys_stack_nested_leaf | 0.1062ms | 79.7269μs | 12.5428 KOps/s | 12.4132 KOps/s | |
test_keys_stack_nested_locked | 0.1294ms | 95.4871μs | 10.4726 KOps/s | 10.4282 KOps/s | |
test_values | 5.6383μs | 0.8630μs | 1.1588 MOps/s | 1.1763 MOps/s | |
test_values_nested | 67.3020μs | 37.5960μs | 26.5986 KOps/s | 26.1389 KOps/s | |
test_values_nested_locked | 0.1037ms | 39.1845μs | 25.5203 KOps/s | 24.6713 KOps/s | |
test_values_nested_leaf | 68.9610μs | 42.3267μs | 23.6258 KOps/s | 23.1690 KOps/s | |
test_values_stack_nested | 80.3810μs | 38.8513μs | 25.7392 KOps/s | 25.5797 KOps/s | |
test_values_stack_nested_leaf | 0.1163ms | 43.1716μs | 23.1634 KOps/s | 22.9729 KOps/s | |
test_values_stack_nested_locked | 72.6110μs | 40.0906μs | 24.9435 KOps/s | 24.6235 KOps/s | |
test_membership | 1.7211μs | 0.5486μs | 1.8228 MOps/s | 1.8196 MOps/s | |
test_membership_nested | 48.2910μs | 2.1072μs | 474.5746 KOps/s | 489.3804 KOps/s | |
test_membership_nested_leaf | 13.8705μs | 2.0603μs | 485.3695 KOps/s | 489.1017 KOps/s | |
test_membership_stacked_nested | 68.4810μs | 2.1183μs | 472.0857 KOps/s | 467.7627 KOps/s | |
test_membership_stacked_nested_leaf | 29.3600μs | 2.1341μs | 468.5753 KOps/s | 468.1121 KOps/s | |
test_membership_nested_last | 34.0700μs | 3.1650μs | 315.9512 KOps/s | 310.5232 KOps/s | |
test_membership_nested_leaf_last | 26.8200μs | 3.1573μs | 316.7243 KOps/s | 311.9119 KOps/s | |
test_membership_stacked_nested_last | 0.1387ms | 5.4840μs | 182.3488 KOps/s | 312.9915 KOps/s | |
test_membership_stacked_nested_leaf_last | 34.2010μs | 5.4773μs | 182.5712 KOps/s | 315.3807 KOps/s | |
test_nested_getleaf | 35.6600μs | 6.1719μs | 162.0236 KOps/s | 163.3556 KOps/s | |
test_nested_get | 35.1610μs | 5.8316μs | 171.4786 KOps/s | 172.5176 KOps/s | |
test_stacked_getleaf | 27.6210μs | 6.1680μs | 162.1260 KOps/s | 161.6157 KOps/s | |
test_stacked_get | 32.9910μs | 5.8756μs | 170.1953 KOps/s | 171.3769 KOps/s | |
test_nested_getitemleaf | 82.7410μs | 6.5020μs | 153.7988 KOps/s | 156.4412 KOps/s | |
test_nested_getitem | 35.2900μs | 6.1881μs | 161.6015 KOps/s | 163.2347 KOps/s | |
test_stacked_getitemleaf | 32.4310μs | 6.4040μs | 156.1532 KOps/s | 155.6216 KOps/s | |
test_stacked_getitem | 32.8010μs | 6.1166μs | 163.4905 KOps/s | 164.8941 KOps/s | |
test_lock_nested | 10.4080ms | 0.3526ms | 2.8359 KOps/s | 2.8966 KOps/s | |
test_lock_stack_nested | 0.4712ms | 0.3406ms | 2.9357 KOps/s | 2.9370 KOps/s | |
test_unlock_nested | 0.6474ms | 0.2831ms | 3.5327 KOps/s | 3.6299 KOps/s | |
test_unlock_stack_nested | 0.4464ms | 0.2786ms | 3.5890 KOps/s | 3.6013 KOps/s | |
test_flatten_speed | 0.1149ms | 75.7192μs | 13.2067 KOps/s | 13.2617 KOps/s | |
test_unflatten_speed | 0.5182ms | 0.3213ms | 3.1123 KOps/s | 3.0614 KOps/s | |
test_common_ops | 0.8439ms | 0.5826ms | 1.7165 KOps/s | 1.5605 KOps/s | |
test_creation | 0.1218ms | 1.7845μs | 560.3859 KOps/s | 560.3042 KOps/s | |
test_creation_empty | 33.4000μs | 6.4870μs | 154.1545 KOps/s | 102.1742 KOps/s | |
test_creation_nested_1 | 37.9710μs | 8.1518μs | 122.6718 KOps/s | 87.3806 KOps/s | |
test_creation_nested_2 | 40.4810μs | 10.9476μs | 91.3441 KOps/s | 69.9926 KOps/s | |
test_clone | 45.8200μs | 11.0915μs | 90.1594 KOps/s | 93.4204 KOps/s | |
test_getitem[int] | 1.2545ms | 11.1251μs | 89.8867 KOps/s | 94.5777 KOps/s | |
test_getitem[slice_int] | 0.1456ms | 21.7651μs | 45.9451 KOps/s | 48.8945 KOps/s | |
test_getitem[range] | 0.2103ms | 40.0821μs | 24.9488 KOps/s | 26.8865 KOps/s | |
test_getitem[tuple] | 0.1087ms | 18.6193μs | 53.7076 KOps/s | 55.8988 KOps/s | |
test_getitem[list] | 0.1438ms | 34.2434μs | 29.2027 KOps/s | 30.4922 KOps/s | |
test_setitem_dim[int] | 0.1405ms | 20.2527μs | 49.3761 KOps/s | 49.7825 KOps/s | |
test_setitem_dim[slice_int] | 65.2010μs | 38.9765μs | 25.6565 KOps/s | 25.9890 KOps/s | |
test_setitem_dim[range] | 95.8420μs | 55.5598μs | 17.9986 KOps/s | 18.8378 KOps/s | |
test_setitem_dim[tuple] | 67.8310μs | 33.1859μs | 30.1333 KOps/s | 30.7438 KOps/s | |
test_setitem | 0.1347ms | 14.5559μs | 68.7009 KOps/s | 63.3757 KOps/s | |
test_set | 55.7510μs | 14.0511μs | 71.1687 KOps/s | 65.1160 KOps/s | |
test_set_shared | 0.5789ms | 0.1613ms | 6.2003 KOps/s | 6.1507 KOps/s | |
test_update | 0.2158ms | 15.9326μs | 62.7643 KOps/s | 52.6241 KOps/s | |
test_update_nested | 58.3110μs | 21.2485μs | 47.0620 KOps/s | 40.9407 KOps/s | |
test_update__nested | 0.5766ms | 27.7576μs | 36.0262 KOps/s | 38.8164 KOps/s | |
test_set_nested | 68.2310μs | 15.5968μs | 64.1157 KOps/s | 60.5052 KOps/s | |
test_set_nested_new | 53.1510μs | 17.6255μs | 56.7361 KOps/s | 52.7113 KOps/s | |
test_select | 70.3610μs | 29.0523μs | 34.4207 KOps/s | 32.3890 KOps/s | |
test_select_nested | 67.0110μs | 44.2838μs | 22.5816 KOps/s | 22.7780 KOps/s | |
test_exclude_nested | 0.1316ms | 63.5439μs | 15.7372 KOps/s | 15.9850 KOps/s | |
test_empty[True] | 0.4427ms | 0.2951ms | 3.3889 KOps/s | 3.3572 KOps/s | |
test_empty[False] | 3.7740μs | 0.8381μs | 1.1932 MOps/s | 1.1931 MOps/s | |
test_to | 87.2310μs | 58.6451μs | 17.0517 KOps/s | 17.4923 KOps/s | |
test_to_nonblocking | 0.1952ms | 47.6951μs | 20.9665 KOps/s | 20.6021 KOps/s | |
test_unbind_speed | 0.2773ms | 0.2460ms | 4.0651 KOps/s | 4.1549 KOps/s | |
test_unbind_speed_stack0 | 0.2724ms | 0.2379ms | 4.2029 KOps/s | 4.1898 KOps/s | |
test_unbind_speed_stack1 | 0.1007s | 0.7384ms | 1.3542 KOps/s | 1.3555 KOps/s | |
test_split | 0.1118s | 1.6586ms | 602.9126 Ops/s | 621.1709 Ops/s | |
test_chunk | 0.1053s | 1.6377ms | 610.6090 Ops/s | 632.0932 Ops/s | |
test_consolidate[False-None] | 3.4773ms | 2.6947ms | 371.0938 Ops/s | 372.2742 Ops/s | |
test_consolidate[default-None] | 2.3837ms | 1.7372ms | 575.6464 Ops/s | 586.6357 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.9216ms | 1.7696ms | 565.0908 Ops/s | 577.4533 Ops/s | |
test_consolidate_njt[False-None] | 6.8647ms | 6.5208ms | 153.3564 Ops/s | 151.1578 Ops/s | |
test_to[False-False-None] | 0.3275s | 2.2882ms | 437.0327 Ops/s | 564.0187 Ops/s | |
test_to[True-False-None] | 1.5590ms | 1.3432ms | 744.5070 Ops/s | 749.8854 Ops/s | |
test_to[within-False-None] | 4.4843ms | 4.1512ms | 240.8939 Ops/s | 241.1613 Ops/s | |
test_to[True-default-None] | 6.0979ms | 5.3005ms | 188.6632 Ops/s | 192.1335 Ops/s | |
test_to_njt[False-False-None] | 7.3839ms | 6.9788ms | 143.2905 Ops/s | 141.6569 Ops/s | |
test_to_njt[True-False-None] | 5.8444ms | 5.4792ms | 182.5079 Ops/s | 178.3952 Ops/s | |
test_to_njt[within-False-None] | 13.0869ms | 12.1463ms | 82.3295 Ops/s | 80.7192 Ops/s | |
test_creation[device0] | 0.4649ms | 81.1177μs | 12.3278 KOps/s | 11.7668 KOps/s | |
test_creation_from_tensor | 0.5508ms | 85.8724μs | 11.6452 KOps/s | 11.2785 KOps/s | |
test_add_one[memmap_tensor0] | 0.4111ms | 7.1451μs | 139.9565 KOps/s | 145.3637 KOps/s | |
test_contiguous[memmap_tensor0] | 9.4507μs | 0.4165μs | 2.4007 MOps/s | 2.3650 MOps/s | |
test_stack[memmap_tensor0] | 0.1845ms | 4.7125μs | 212.2022 KOps/s | 230.1414 KOps/s | |
test_memmaptd_index | 1.7888ms | 0.2532ms | 3.9495 KOps/s | 4.1624 KOps/s | |
test_memmaptd_index_astensor | 0.4502ms | 0.3117ms | 3.2086 KOps/s | 3.3268 KOps/s | |
test_memmaptd_index_op | 0.7600ms | 0.5703ms | 1.7533 KOps/s | 1.6908 KOps/s | |
test_serialize_model | 0.1319s | 0.1306s | 7.6555 Ops/s | 7.6188 Ops/s | |
test_serialize_model_pickle | 2.8583s | 1.8101s | 0.5525 Ops/s | 0.8242 Ops/s | |
test_serialize_weights | 0.1313s | 0.1301s | 7.6881 Ops/s | 7.6647 Ops/s | |
test_serialize_weights_returnearly | 0.4055s | 66.4271ms | 15.0541 Ops/s | 22.8270 Ops/s | |
test_serialize_weights_pickle | 1.5857s | 1.2800s | 0.7813 Ops/s | 0.8174 Ops/s | |
test_reshape_pytree | 57.3110μs | 22.4552μs | 44.5332 KOps/s | 44.9605 KOps/s | |
test_reshape_td | 0.1859ms | 26.7394μs | 37.3980 KOps/s | 36.5625 KOps/s | |
test_view_pytree | 0.1822ms | 22.2641μs | 44.9154 KOps/s | 45.8537 KOps/s | |
test_view_td | 0.1933ms | 31.3700μs | 31.8776 KOps/s | 30.7310 KOps/s | |
test_unbind_pytree | 0.1680ms | 28.2910μs | 35.3469 KOps/s | 36.0203 KOps/s | |
test_unbind_td | 0.6312ms | 36.4684μs | 27.4210 KOps/s | 27.5293 KOps/s | |
test_split_pytree | 0.2253ms | 29.6928μs | 33.6782 KOps/s | 34.1080 KOps/s | |
test_split_td | 0.7176ms | 38.5058μs | 25.9701 KOps/s | 14.5238 KOps/s | |
test_add_pytree | 0.2281ms | 35.2957μs | 28.3321 KOps/s | 28.6584 KOps/s | |
test_add_td | 0.2412ms | 44.9560μs | 22.2440 KOps/s | 19.7658 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.3060ms | 0.1222ms | 8.1827 KOps/s | 7.5951 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.5260ms | 0.1309ms | 7.6406 KOps/s | 7.4545 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2388ms | 95.7200μs | 10.4471 KOps/s | 9.9692 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.0230ms | 0.1496ms | 6.6827 KOps/s | 6.2055 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.4494ms | 23.9008μs | 41.8396 KOps/s | 44.2021 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.4211ms | 29.1539μs | 34.3007 KOps/s | 33.6662 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4755ms | 64.8089μs | 15.4300 KOps/s | 15.0188 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.4469ms | 49.6476μs | 20.1420 KOps/s | 20.0541 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2893ms | 0.1422ms | 7.0341 KOps/s | 6.9944 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.6391ms | 0.2202ms | 4.5416 KOps/s | 4.5661 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2465ms | 0.1008ms | 9.9196 KOps/s | 9.9044 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4829ms | 58.3638μs | 17.1339 KOps/s | 17.6842 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2774ms | 0.1369ms | 7.3047 KOps/s | 7.3916 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.8673ms | 0.4864ms | 2.0561 KOps/s | 2.0742 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.6946ms | 0.2654ms | 3.7685 KOps/s | 3.7448 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.5425ms | 0.1436ms | 6.9641 KOps/s | 7.0496 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2133ms | 70.8439μs | 14.1155 KOps/s | 14.3900 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2523ms | 0.1010ms | 9.8983 KOps/s | 10.1864 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.5753ms | 0.4058ms | 2.4645 KOps/s | 2.4641 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.5277ms | 0.1345ms | 7.4336 KOps/s | 7.3800 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.4247ms | 18.4640μs | 54.1595 KOps/s | 57.6213 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.4255ms | 31.1080μs | 32.1461 KOps/s | 32.3713 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.4625ms | 69.8037μs | 14.3259 KOps/s | 14.2144 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1138ms | 50.8184μs | 19.6779 KOps/s | 19.5502 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6620ms | 0.4495ms | 2.2245 KOps/s | 2.1505 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.9207ms | 2.6442ms | 378.1917 Ops/s | 386.0796 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.6256ms | 0.4371ms | 2.2878 KOps/s | 2.2305 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.8806ms | 2.6560ms | 376.4991 Ops/s | 380.4933 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.3206ms | 0.1165ms | 8.5804 KOps/s | 8.3750 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5970ms | 80.2600μs | 12.4595 KOps/s | 12.4888 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.6887ms | 0.1106ms | 9.0387 KOps/s | 9.3306 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.2354ms | 67.8891μs | 14.7299 KOps/s | 14.5064 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2937ms | 0.1144ms | 8.7426 KOps/s | 9.0348 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.2575ms | 67.5284μs | 14.8086 KOps/s | 13.8068 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2800ms | 0.1062ms | 9.4148 KOps/s | 9.7720 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1541ms | 17.2351μs | 58.0213 KOps/s | 58.3665 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2489ms | 98.8125μs | 10.1202 KOps/s | 10.3344 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1463ms | 15.8985μs | 62.8989 KOps/s | 63.2284 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2550ms | 0.1010ms | 9.8994 KOps/s | 10.2527 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.1406ms | 16.0302μs | 62.3823 KOps/s | 64.1626 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2839ms | 0.1067ms | 9.3717 KOps/s | 9.5850 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.6634ms | 17.0171μs | 58.7645 KOps/s | 58.7605 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2526ms | 0.1016ms | 9.8415 KOps/s | 10.1995 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1535ms | 15.9291μs | 62.7783 KOps/s | 63.9376 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2493ms | 0.1022ms | 9.7846 KOps/s | 10.1830 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.1629ms | 15.9632μs | 62.6441 KOps/s | 63.1737 KOps/s | |
test_mod_add[eager] | 0.2221ms | 38.1758μs | 26.1946 KOps/s | 24.8495 KOps/s | |
test_mod_add[compile] | 0.2271ms | 80.6128μs | 12.4050 KOps/s | 11.6049 KOps/s | |
test_mod_add[compile-overhead] | 0.3505ms | 0.1751ms | 5.7104 KOps/s | 5.5694 KOps/s | |
test_mod_wrap[eager] | 0.4135ms | 0.2615ms | 3.8236 KOps/s | 3.8435 KOps/s | |
test_mod_wrap[compile] | 0.4764ms | 0.2990ms | 3.3448 KOps/s | 3.4371 KOps/s | |
test_mod_wrap[compile-overhead] | 7.5653ms | 3.9959ms | 250.2547 Ops/s | 399.0243 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.6253ms | 1.4492ms | 690.0491 Ops/s | 678.6411 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5754ms | 1.3861ms | 721.4400 Ops/s | 724.6930 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.7148ms | 1.0784ms | 927.2951 Ops/s | 941.1596 Ops/s | |
test_seq_add[eager] | 0.2973ms | 0.1147ms | 8.7183 KOps/s | 8.2161 KOps/s | |
test_seq_add[compile] | 0.2913ms | 88.7198μs | 11.2714 KOps/s | 11.2664 KOps/s | |
test_seq_add[compile-overhead] | 0.2869ms | 0.1308ms | 7.6478 KOps/s | 7.6793 KOps/s | |
test_seq_wrap[eager] | 0.6042ms | 0.4248ms | 2.3541 KOps/s | 2.2585 KOps/s | |
test_seq_wrap[compile] | 0.4815ms | 0.3025ms | 3.3060 KOps/s | 3.2861 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3770ms | 0.2269ms | 4.4079 KOps/s | 4.3877 KOps/s | |
test_func_call_runtime[False-eager] | 0.9221ms | 0.7436ms | 1.3449 KOps/s | 1.2808 KOps/s | |
test_func_call_runtime[False-compile] | 0.9796ms | 0.7388ms | 1.3535 KOps/s | 1.3293 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5102ms | 0.3681ms | 2.7164 KOps/s | 2.7055 KOps/s | |
test_func_call_runtime[True-eager] | 1.1172ms | 0.9089ms | 1.1002 KOps/s | 1.0826 KOps/s | |
test_func_call_runtime[True-compile] | 0.9302ms | 0.7575ms | 1.3201 KOps/s | 1.2793 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5221ms | 0.3875ms | 2.5807 KOps/s | 2.5554 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9371ms | 0.7439ms | 1.3443 KOps/s | 1.3196 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.9482ms | 0.7534ms | 1.3273 KOps/s | 1.3281 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5135ms | 0.3682ms | 2.7161 KOps/s | 2.6848 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1880ms | 1.0161ms | 984.2020 Ops/s | 969.8706 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.1955ms | 1.0031ms | 996.9571 Ops/s | 985.9188 Ops/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.2155ms | 1.0076ms | 992.4786 Ops/s | 980.7352 Ops/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5281ms | 2.1159ms | 472.6122 Ops/s | 466.6312 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9594ms | 0.8087ms | 1.2366 KOps/s | 1.2024 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5864ms | 0.4193ms | 2.3849 KOps/s | 2.3467 KOps/s | |
test_distributed | 4.0403ms | 0.1895ms | 5.2783 KOps/s | 7.9738 KOps/s | |
test_tdmodule | 94.1120μs | 19.1043μs | 52.3442 KOps/s | 44.1991 KOps/s | |
test_tdmodule_dispatch | 67.5310μs | 32.9359μs | 30.3620 KOps/s | 25.0918 KOps/s | |
test_tdseq | 40.9310μs | 19.5448μs | 51.1644 KOps/s | 44.8294 KOps/s | |
test_tdseq_dispatch | 56.8610μs | 36.2680μs | 27.5725 KOps/s | 24.5453 KOps/s | |
test_instantiation_functorch | 1.6901ms | 1.5752ms | 634.8533 Ops/s | 633.0756 Ops/s | |
test_exec_functorch | 0.2527ms | 0.1477ms | 6.7686 KOps/s | 6.9467 KOps/s | |
test_exec_functional_call | 0.2855ms | 0.1413ms | 7.0763 KOps/s | 7.2294 KOps/s | |
test_exec_td_decorator | 0.3754ms | 0.1912ms | 5.2307 KOps/s | 5.3309 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8888ms | 0.6885ms | 1.4524 KOps/s | 1.4296 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8513ms | 0.6896ms | 1.4501 KOps/s | 1.4315 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7661ms | 0.6034ms | 1.6573 KOps/s | 1.6173 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7960ms | 0.6025ms | 1.6598 KOps/s | 1.5877 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.2696ms | 19.3711ms | 51.6233 Ops/s | 51.3085 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.7340ms | 19.3579ms | 51.6586 Ops/s | 51.5398 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.6152ms | 19.2123ms | 52.0501 Ops/s | 51.6562 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.5396ms | 19.2416ms | 51.9707 Ops/s | 51.8442 Ops/s | |
test_to_module_speed[True] | 1.5002ms | 0.9587ms | 1.0431 KOps/s | 1.0255 KOps/s | |
test_to_module_speed[False] | 1.0591ms | 0.9393ms | 1.0646 KOps/s | 1.0561 KOps/s | |
test_tc_init | 82.3820μs | 34.1019μs | 29.3239 KOps/s | 27.2712 KOps/s | |
test_tc_init_nested | 0.1118ms | 67.1599μs | 14.8898 KOps/s | 13.9452 KOps/s | |
test_tc_first_layer_tensor | 23.0800μs | 0.8375μs | 1.1941 MOps/s | 1.1913 MOps/s | |
test_tc_first_layer_nontensor | 27.2300μs | 2.2658μs | 441.3531 KOps/s | 437.0780 KOps/s | |
test_tc_second_layer_tensor | 24.8300μs | 1.5397μs | 649.4580 KOps/s | 698.7636 KOps/s | |
test_tc_second_layer_nontensor | 47.5510μs | 2.9765μs | 335.9617 KOps/s | 328.8358 KOps/s | |
test_unbind | 0.2449s | 13.0896ms | 76.3964 Ops/s | 140.2193 Ops/s | |
test_full_like | 12.0507ms | 10.7174ms | 93.3061 Ops/s | 96.2215 Ops/s | |
test_zeros_like | 10.1103ms | 7.5683ms | 132.1308 Ops/s | 216.3332 Ops/s | |
test_ones_like | 5.5755ms | 4.7552ms | 210.2949 Ops/s | 211.9273 Ops/s | |
test_clone | 9.1430ms | 7.6361ms | 130.9566 Ops/s | 131.3813 Ops/s | |
test_squeeze | 0.1603ms | 9.7011μs | 103.0809 KOps/s | 103.4029 KOps/s | |
test_unsqueeze | 0.2142ms | 71.8271μs | 13.9223 KOps/s | 13.8785 KOps/s | |
test_split | 0.3871ms | 0.1593ms | 6.2777 KOps/s | 6.2864 KOps/s | |
test_permute | 0.3326ms | 0.1852ms | 5.4002 KOps/s | 5.6124 KOps/s | |
test_stack | 54.2895ms | 53.0875ms | 18.8368 Ops/s | 18.7559 Ops/s | |
test_cat | 53.6724ms | 52.8348ms | 18.9269 Ops/s | 18.9065 Ops/s |
vmoens added a commit that referenced this pull request on Feb 7, 2025
ghstack-source-id: 1ad6616dfcdd55bd055512e96a1e942b27d02ec8 Pull Request resolved: #1213
Labels
bug
Something isn't working
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
suitable for minor
Stack from ghstack (oldest at bottom):