[vcpkg] Make RealFilesystem::remove_all much, much faster, and start benchmarking #7570

Merged on Aug 7, 2019 (1 commit)

@strega-nil strega-nil commented Aug 7, 2019

It turns out that parallel file removal (#7228) hurts performance -- it generally causes about a 3x slowdown. Since I added the Catch2 testing framework in #7315, it's now easy to benchmark, and so I did. Results are in the follow-up comments.
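For readers without a Catch2 setup, a rough stand-in for this kind of measurement can be written with `std::chrono` alone. This is only a sketch: `make_flat_tree` and `time_remove_all` are hypothetical names, it times a single `std::filesystem::remove_all` call rather than the 100 samples Catch2 collects, and it is not the benchmark code from this PR.

```cpp
#include <chrono>
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: fills `dir` with `count` small files.
inline void make_flat_tree(const fs::path& dir, int count)
{
    fs::create_directories(dir);
    for (int i = 0; i < count; ++i)
    {
        std::ofstream(dir / ("file" + std::to_string(i) + ".txt")) << i;
    }
}

// Hypothetical helper: creates a fresh tree, then times one removal of it.
// Only the remove_all call sits between the two clock reads.
inline std::chrono::nanoseconds time_remove_all(const fs::path& dir, int file_count)
{
    make_flat_tree(dir, file_count);
    const auto start = std::chrono::steady_clock::now();
    fs::remove_all(dir);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}
```

A single sample like this is noisy; repeating the call and reporting a mean and standard deviation, as the tables in the follow-up comments do, gives a far more trustworthy number.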

Sorry for the broken code, I need to leave to feed my cat and will fix it at some point before EOD tomorrow.

@strega-nil
Contributor Author

Windows benchmark:

benchmark name                                  samples       iterations    estimated
                                                mean          low mean      high mean
                                                std dev       low std dev   high std dev
-------------------------------------------------------------------------------

Windows -- new code, experimental::filesystem file removal APIs
-------------------------------------------------------------------------------
small directory, no symlinks                            100            1    1.36532 s 
                                                 14.5197 ms   14.0631 ms   15.2857 ms 
                                                 2.96732 ms   2.07362 ms   4.82291 ms 

large directory, no symlinks                            100            1    1.51618 m 
                                                 919.815 ms   892.162 ms   948.154 ms 
                                                 142.304 ms   125.394 ms   164.287 ms

small directory, symlinks                               100            1    1.67181 s 
                                                 20.0017 ms   18.9441 ms   22.1276 ms 
                                                 7.30606 ms   4.41098 ms   13.7864 ms

large directory, symlinks                               100            1    1.93433 m 
                                                  1.21616 s    1.18261 s     1.2529 s 
                                                 179.248 ms   155.233 ms   211.348 ms

===============================================================================

Windows -- new code, Win32 file removal APIs
-------------------------------------------------------------------------------
small directory, no symlinks                            100            1   921.578 ms 
                                                 10.0654 ms   9.75478 ms    10.483 ms 
                                                 1.81859 ms   1.44119 ms   2.38577 ms

large directory, no symlinks                            100            1    1.21752 m 
                                                 691.345 ms   668.997 ms   717.393 ms 
                                                 122.546 ms   104.264 ms   149.243 ms 

small directory, symlinks                               100            1    1.15584 s 
                                                 12.2765 ms   11.8531 ms    12.947 ms 
                                                 2.67494 ms   1.89749 ms   3.98427 ms 

large directory, symlinks                               100            1    1.88496 m 
                                                  867.55 ms   834.645 ms   909.628 ms 
                                                  188.64 ms   152.074 ms   244.909 ms

===============================================================================

Windows -- old code
-------------------------------------------------------------------------------
small directory, no symlinks                            100            1    5.97863 s 
                                                 52.3063 ms   51.1884 ms   53.4849 ms 
                                                 5.87737 ms   5.28542 ms   6.66985 ms

large directory, no symlinks                            100            1    5.67639 m 
                                                  3.38404 s    3.30412 s    3.46134 s 
                                                 400.032 ms   349.271 ms   467.637 ms

small directory, symlinks                               100            1    7.69278 s 
                                                 74.8141 ms   72.5641 ms   77.5622 ms 
                                                  12.679 ms   10.4307 ms   15.7661 ms 

large directory, symlinks                               100            1    8.38528 m 
                                                  5.33924 s    5.21239 s    5.46853 s 
                                                 657.042 ms   567.943 ms   768.801 ms

===============================================================================

Windows -- experimental::filesystem
-------------------------------------------------------------------------------
small directory, no symlinks                            100            1    1.58408 s 
                                                 17.3231 ms   16.9278 ms   18.0017 ms 
                                                 2.56955 ms   1.68922 ms   3.95341 ms

large directory, no symlinks                            100            1    1.86649 m 
                                                  1.20479 s    1.16511 s     1.2479 s 
                                                  211.24 ms   184.978 ms   246.992 ms

@strega-nil
Contributor Author

WSL:

benchmark name                                  samples       iterations    estimated
                                                mean          low mean      high mean
                                                std dev       low std dev   high std dev
-------------------------------------------------------------------------------

WSL -- new code, experimental::filesystem file removal APIs
-------------------------------------------------------------------------------
small directory, no symlinks                            100            1   969.089 ms
                                                 10.7625 ms   10.4192 ms   11.1792 ms
                                                 1.92546 ms   1.63526 ms   2.43703 ms

large directory, no symlinks                            100            1    1.01362 m
                                                 816.256 ms   789.089 ms   844.171 ms
                                                 139.535 ms   122.914 ms   160.293 ms

small directory, symlinks                               100            1    1.57949 s
                                                 16.5445 ms   16.0957 ms   17.1105 ms
                                                  2.5621 ms   2.10547 ms   3.52658 ms

large directory, symlinks                               100            1    1.86719 m
                                                  1.12277 s    1.08478 s    1.16547 s
                                                 205.199 ms   176.685 ms    268.19 ms

===============================================================================

WSL -- new code, unix file removal APIs
-------------------------------------------------------------------------------
small directory, no symlinks                            100            1    1.25699 s
                                                  13.335 ms   12.9462 ms   13.7772 ms
                                                 2.13009 ms   1.87612 ms   2.42105 ms

large directory, no symlinks                            100            1    1.99072 m
                                                 807.767 ms   782.092 ms   835.008 ms
                                                 135.137 ms   119.467 ms   156.756 ms

small directory, symlinks                               100            1     1.7474 s
                                                 17.9516 ms   17.3578 ms   18.6353 ms
                                                 3.23356 ms    2.8123 ms   4.22177 ms

large directory, symlinks                               100            1    1.69178 m
                                                  1.04647 s    1.01282 s    1.08552 s
                                                 184.841 ms   156.257 ms   259.267 ms

===============================================================================

WSL -- old code
-------------------------------------------------------------------------------
small directory, no symlinks                            100            1     8.7871 s
                                                  85.173 ms   84.1377 ms   86.3339 ms
                                                 5.57359 ms   4.83268 ms   6.52703 ms

large directory, no symlinks                            100            1    9.61462 m
                                                  6.00963 s     5.8691 s    6.14369 s
                                                 700.966 ms   618.945 ms   817.319 ms

small directory, symlinks                               100            1    10.0322 s
                                                  106.84 ms   104.855 ms   109.179 ms
                                                 10.9884 ms   9.17304 ms   14.2424 ms

large directory, symlinks                               100            1    12.0166 m
                                                  7.55028 s    7.36628 s    7.74645 s
                                                 970.479 ms   841.272 ms    1.15951 s

===============================================================================

WSL -- experimental::filesystem
-------------------------------------------------------------------------------
small directory, no symlinks                            100            1     1.2865 s
                                                 16.2459 ms   15.5452 ms   17.1238 ms
                                                 3.97899 ms   3.31691 ms   5.19475 ms

large directory, no symlinks                            100            1    2.08604 m
                                                 957.696 ms   926.848 ms   991.947 ms
                                                 166.605 ms   143.924 ms   198.511 ms

small directory, symlinks                               100            1     1.7562 s
                                                 19.2352 ms   18.4872 ms   20.2999 ms
                                                 4.52481 ms   3.47153 ms   6.53988 ms

large directory, symlinks                               100            1    2.24866 m
                                                  1.32128 s    1.28821 s    1.35661 s
                                                 175.077 ms   151.212 ms   209.965 ms

@strega-nil
Contributor Author

macOS:

benchmark name                                  samples       iterations    estimated
                                                mean          low mean      high mean
                                                std dev       low std dev   high std dev
-------------------------------------------------------------------------------
macOS -- new code, experimental::filesystem file removal APIs
-------------------------------------------------------------------------------
small directory, no symlinks                            100            1   181.553 ms 
                                                 1.74245 ms   1.70887 ms   1.78472 ms 
                                                 192.034 us   156.196 us   240.753 us 
                                                                                      
large directory, no symlinks                            100            1    10.9128 s 
                                                 126.577 ms   120.115 ms   144.048 ms 
                                                 50.0075 ms   14.7951 ms   103.465 ms 
                                                                                      
small directory, symlinks                               100            1    252.48 ms 
                                                 2.23302 ms   2.19617 ms   2.27281 ms 
                                                 195.582 us   172.078 us   223.993 us 
                                                                                      
large directory, symlinks                               100            1    14.1338 s 
                                                 170.302 ms   163.912 ms   181.838 ms 
                                                 42.8113 ms   28.1574 ms   64.7644 ms 

===============================================================================

macOS -- new code, unix file removal APIs
-------------------------------------------------------------------------------
small directory, no symlinks                            100            1   169.444 ms 
                                                 1.60607 ms   1.58032 ms   1.64111 ms 
                                                 153.131 us    119.19 us   215.444 us 
                                                                                      
large directory, no symlinks                            100            1     10.062 s 
                                                 120.215 ms   113.865 ms   134.606 ms 
                                                 46.1967 ms   20.4014 ms   83.0354 ms 
                                                                                      
small directory, symlinks                               100            1   237.033 ms 
                                                 2.11555 ms     2.079 ms   2.15388 ms 
                                                 190.987 us   170.024 us   216.615 us 
                                                                                      
large directory, symlinks                               100            1    13.2841 s 
                                                 158.126 ms   152.492 ms    169.67 ms 
                                                 39.5632 ms   22.8409 ms   73.7222 ms

===============================================================================

macOS -- old code
-------------------------------------------------------------------------------
small directory, no symlinks                            100            1   380.816 ms 
                                                 3.80082 ms   3.77002 ms   3.83759 ms 
                                                 171.811 us   143.548 us   220.606 us 
                                                                                      
large directory, no symlinks                            100            1    26.0354 s 
                                                 255.124 ms   245.326 ms   268.469 ms 
                                                 57.7743 ms   43.2626 ms   83.0709 ms 
                                                                                      
small directory, symlinks                               100            1   570.966 ms 
                                                 6.01282 ms   5.72368 ms   7.00345 ms 
                                                 2.44931 ms   798.172 us   5.55886 ms 
                                                                                      
large directory, symlinks                               100            1    43.8827 s 
                                                 376.234 ms   361.874 ms   399.342 ms 
                                                  91.032 ms   64.0431 ms   157.049 ms

===============================================================================

macOS -- experimental::filesystem
-------------------------------------------------------------------------------

small directory, no symlinks                            100            1     177.5 ms 
                                                 1.67153 ms   1.64634 ms   1.70409 ms 
                                                 144.582 us   114.366 us   198.142 us 
                                                                                      
large directory, no symlinks                            100            1    10.6392 s 
                                                 121.892 ms    118.04 ms   133.184 ms 
                                                 30.8224 ms   13.2209 ms   67.1489 ms 
                                                                                      
small directory, symlinks                               100            1   251.983 ms 
                                                 2.24401 ms   2.20416 ms   2.28854 ms 
                                                 214.376 us   187.655 us   248.183 us 
                                                                                      
large directory, symlinks                               100            1    14.2726 s 
                                                 169.468 ms   163.525 ms   182.781 ms 
                                                 43.0249 ms   19.0991 ms    79.277 ms

@strega-nil strega-nil changed the title from "Make RealFilesystem::remove_all much, much faster, and start benchmarking" to "[vcpkg] Make RealFilesystem::remove_all much, much faster, and start benchmarking" Aug 7, 2019
@strega-nil strega-nil requested a review from ras0219-msft August 7, 2019 17:17
@strega-nil strega-nil self-assigned this Aug 7, 2019
@strega-nil strega-nil added the info:internal label ("This PR or Issue was filed by the vcpkg team.") Aug 7, 2019
I added benchmarks to measure how fast the parallel remove_all code was
-- it turns out, about 3x slower than stdfs::remove_all. Since this was
the case, I removed all of the parallelism and rewrote it serially, and
ended up about 30% faster than stdfs::remove_all (in addition to
supporting symlinks).

In addition, I made the following three orthogonal changes:
  - Simplified the work queue, basing it on Billy O'Neal's idea.
  - Fixed warnings on older versions of compilers in tests, by splitting
    the pragmas out of pch.h.
  - Ran clang-format on some files.

In fixing up remove_all, the following changes were made:
  - On Windows, regular symlinks and directory symlinks are distinct;
    as an example, to remove directory symlinks (and junctions, for that
    matter), one must use RemoveDirectory. Only on Windows, I added new
    `file_type` and `file_status` types, with `file_type` including a new
    `directory_symlink` enumerator, and `file_status` being exactly the
    same as the old one except using the new `file_type`. On Unix, I
    didn't make that change, since Unix does not make that distinction.
  - I added new `symlink_status` and `status` functions which use the
    new `file_status` on Windows.
  - I made `Filesystem::exists` call `fs::exists(status(p))`, as opposed
    to the old version which called `stdfs::exists` directly.
  - Added benchmarks to `vcpkg-test/files.cpp`. They test the
    performance of `remove_all` on small directories (~20 files), with
    symlinks and without, and on large directories (~2000 files), with
    symlinks and without.
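
To make the Windows distinction concrete, here is a rough sketch of how a `file_type` with a `directory_symlink` enumerator could drive the choice of removal call. Everything in it is an assumption for illustration: the `sketch` namespace, the enumerator list, the `removal_api` enum, and `removal_api_for` are hypothetical and are not vcpkg's declarations. The only Win32 facts relied on are that `RemoveDirectory` removes directories and directory symlinks, while `DeleteFile` removes regular files and file symlinks.

```cpp
namespace sketch
{
    // Hypothetical enumerator list; the real type may carry more cases.
    enum class file_type
    {
        none,
        not_found,
        regular,
        directory,
        symlink,           // symlink to a file
        directory_symlink, // Windows-only distinction
    };

    // Hypothetical: reduced to the one field this sketch needs.
    struct file_status
    {
        file_type type = file_type::none;
    };

    // Same rule std::filesystem::exists(file_status) applies:
    // the status is known and is not "not found".
    constexpr bool exists(file_status s) noexcept
    {
        return s.type != file_type::none && s.type != file_type::not_found;
    }

    // Which Win32 call would remove an entry of this type.
    enum class removal_api
    {
        none,
        delete_file,      // DeleteFile
        remove_directory, // RemoveDirectory
    };

    constexpr removal_api removal_api_for(file_status s) noexcept
    {
        switch (s.type)
        {
            case file_type::directory:
            case file_type::directory_symlink: return removal_api::remove_directory;
            case file_type::regular:
            case file_type::symlink: return removal_api::delete_file;
            default: return removal_api::none;
        }
    }
}
```

On Unix the `directory_symlink` case would simply never be produced, which matches the decision above to leave the Unix types unchanged.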
@strega-nil strega-nil merged commit e79f0dc into microsoft:master Aug 7, 2019
@strega-nil strega-nil deleted the benchmark branch August 7, 2019 23:51
strega-nil added a commit to strega-nil/vcpkg that referenced this pull request May 5, 2021