-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize FlatVector<T>::copyRanges #6566
Conversation
✅ Deploy Preview for meta-velox canceled.
|
This pull request was exported from Phabricator. Differential Revision: D49269500 |
Summary: array_constructor is very slow: facebookincubator#5958 (comment) array_constructor uses BaseVector::copyRanges, which is somewhat fast for arrays and maps, but very slow for primitive types: ``` FlatVector.h void copyRanges( const BaseVector* source, const folly::Range<const BaseVector::CopyRange*>& ranges) override { for (auto& range : ranges) { copy(source, range.targetIndex, range.sourceIndex, range.count); } } ``` FlatVector<T>::copy(source, rows, toSourceRow) is faster. Switching from copyRanges to copy speeds up array_constructor for primitive types and structs significantly. Yet, this change makes arrays and maps slower. The slowness is due to ArrayVector and MapVector not having implementation for copy(source, rows, toSourceRow). They rely on BaseVector::copy to translate rows + toSourceRow to ranges. This extra processing causes perf regression. Hence, we use copy for primitive types and structs of these and copyRanges for everything else. We also optimize FlatVector::copyRanges (which is used by Array/MapVector::copyRanges). ``` Before: array_constructor_ARRAY_nullfree#facebookincubator#1 16.80ms 59.53 array_constructor_ARRAY_nullfree#facebookincubator#2 27.02ms 37.01 array_constructor_ARRAY_nullfree#facebookincubator#3 38.03ms 26.30 array_constructor_ARRAY_nullfree##2_null 52.86ms 18.92 array_constructor_ARRAY_nullfree##2_const 54.97ms 18.19 array_constructor_ARRAY_nulls#facebookincubator#1 30.61ms 32.66 array_constructor_ARRAY_nulls#facebookincubator#2 55.01ms 18.18 array_constructor_ARRAY_nulls#facebookincubator#3 80.69ms 12.39 array_constructor_ARRAY_nulls##2_null 69.10ms 14.47 array_constructor_ARRAY_nulls##2_const 103.85ms 9.63 After: array_constructor_ARRAY_nullfree#facebookincubator#1 15.43ms 64.80 array_constructor_ARRAY_nullfree#facebookincubator#2 24.50ms 40.81 array_constructor_ARRAY_nullfree#facebookincubator#3 35.12ms 28.47 array_constructor_ARRAY_nullfree##2_null 54.52ms 18.34 array_constructor_ARRAY_nullfree##2_const 43.28ms 23.10 array_constructor_ARRAY_nulls#facebookincubator#1 28.60ms 34.96 array_constructor_ARRAY_nulls#facebookincubator#2 50.82ms 19.68 array_constructor_ARRAY_nulls#facebookincubator#3 70.31ms 14.22 array_constructor_ARRAY_nulls##2_null 64.43ms 15.52 array_constructor_ARRAY_nulls##2_const 80.71ms 12.39 Before: array_constructor_INTEGER_nullfree#facebookincubator#1 19.72ms 50.71 array_constructor_INTEGER_nullfree#facebookincubator#2 34.51ms 28.97 array_constructor_INTEGER_nullfree#facebookincubator#3 47.95ms 20.86 array_constructor_INTEGER_nullfree##2_null 58.68ms 17.04 array_constructor_INTEGER_nullfree##2_const 45.15ms 22.15 array_constructor_INTEGER_nulls#facebookincubator#1 29.99ms 33.34 array_constructor_INTEGER_nulls#facebookincubator#2 55.32ms 18.08 array_constructor_INTEGER_nulls#facebookincubator#3 78.53ms 12.73 array_constructor_INTEGER_nulls##2_null 72.24ms 13.84 array_constructor_INTEGER_nulls##2_const 71.13ms 14.06 After: array_constructor_INTEGER_nullfree#facebookincubator#1 3.49ms 286.59 array_constructor_INTEGER_nullfree#facebookincubator#2 7.91ms 126.46 array_constructor_INTEGER_nullfree#facebookincubator#3 11.99ms 83.41 array_constructor_INTEGER_nullfree##2_null 12.57ms 79.55 array_constructor_INTEGER_nullfree##2_const 11.03ms 90.67 array_constructor_INTEGER_nulls#facebookincubator#1 4.37ms 228.97 array_constructor_INTEGER_nulls#facebookincubator#2 9.99ms 100.14 array_constructor_INTEGER_nulls#facebookincubator#3 14.79ms 67.60 array_constructor_INTEGER_nulls##2_null 12.21ms 81.92 array_constructor_INTEGER_nulls##2_const 12.64ms 79.12 Before: array_constructor_MAP_nullfree#facebookincubator#1 17.34ms 57.65 array_constructor_MAP_nullfree#facebookincubator#2 29.84ms 33.51 array_constructor_MAP_nullfree#facebookincubator#3 41.51ms 24.09 array_constructor_MAP_nullfree##2_null 56.57ms 17.68 array_constructor_MAP_nullfree##2_const 71.68ms 13.95 array_constructor_MAP_nulls#facebookincubator#1 36.22ms 27.61 array_constructor_MAP_nulls#facebookincubator#2 68.18ms 14.67 array_constructor_MAP_nulls#facebookincubator#3 95.12ms 10.51 array_constructor_MAP_nulls##2_null 86.42ms 11.57 array_constructor_MAP_nulls##2_const 120.10ms 8.33 After: array_constructor_MAP_nullfree#facebookincubator#1 17.38ms 57.53 array_constructor_MAP_nullfree#facebookincubator#2 29.41ms 34.00 array_constructor_MAP_nullfree#facebookincubator#3 38.30ms 26.11 array_constructor_MAP_nullfree##2_null 58.52ms 17.09 array_constructor_MAP_nullfree##2_const 48.62ms 20.57 array_constructor_MAP_nulls#facebookincubator#1 30.60ms 32.68 array_constructor_MAP_nulls#facebookincubator#2 53.94ms 18.54 array_constructor_MAP_nulls#facebookincubator#3 86.48ms 11.56 array_constructor_MAP_nulls##2_null 69.53ms 14.38 array_constructor_MAP_nulls##2_const 87.56ms 11.42 Before: array_constructor_ROW_nullfree#facebookincubator#1 33.88ms 29.52 array_constructor_ROW_nullfree#facebookincubator#2 62.00ms 16.13 array_constructor_ROW_nullfree#facebookincubator#3 89.54ms 11.17 array_constructor_ROW_nullfree##2_null 78.46ms 12.75 array_constructor_ROW_nullfree##2_const 95.53ms 10.47 array_constructor_ROW_nulls#facebookincubator#1 44.11ms 22.67 array_constructor_ROW_nulls#facebookincubator#2 115.43ms 8.66 array_constructor_ROW_nulls#facebookincubator#3 173.61ms 5.76 array_constructor_ROW_nulls##2_null 130.40ms 7.67 array_constructor_ROW_nulls##2_const 169.97ms 5.88 After: array_constructor_ROW_nullfree#facebookincubator#1 5.64ms 177.44 array_constructor_ROW_nullfree#facebookincubator#2 14.40ms 69.44 array_constructor_ROW_nullfree#facebookincubator#3 21.46ms 46.59 array_constructor_ROW_nullfree##2_null 19.14ms 52.26 array_constructor_ROW_nullfree##2_const 18.60ms 53.77 array_constructor_ROW_nulls#facebookincubator#1 10.97ms 91.18 array_constructor_ROW_nulls#facebookincubator#2 18.29ms 54.67 array_constructor_ROW_nulls#facebookincubator#3 28.57ms 35.01 array_constructor_ROW_nulls##2_null 25.10ms 39.84 array_constructor_ROW_nulls##2_const 24.55ms 40.74 ``` Differential Revision: D49269500
06d22c3
to
4ffe92e
Compare
This pull request was exported from Phabricator. Differential Revision: D49269500 |
Summary: array_constructor is very slow: facebookincubator#5958 (comment) array_constructor uses BaseVector::copyRanges, which is somewhat fast for arrays and maps, but very slow for primitive types: ``` FlatVector.h void copyRanges( const BaseVector* source, const folly::Range<const BaseVector::CopyRange*>& ranges) override { for (auto& range : ranges) { copy(source, range.targetIndex, range.sourceIndex, range.count); } } ``` FlatVector<T>::copy(source, rows, toSourceRow) is faster. Switching from copyRanges to copy speeds up array_constructor for primitive types and structs significantly. Yet, this change makes arrays and maps slower. The slowness is due to ArrayVector and MapVector not having implementation for copy(source, rows, toSourceRow). They rely on BaseVector::copy to translate rows + toSourceRow to ranges. This extra processing causes perf regression. Hence, we use copy for primitive types and structs of these and copyRanges for everything else. We also optimize FlatVector::copyRanges (which is used by Array/MapVector::copyRanges). ``` Before: array_constructor_ARRAY_nullfree#facebookincubator#1 16.80ms 59.53 array_constructor_ARRAY_nullfree#facebookincubator#2 27.02ms 37.01 array_constructor_ARRAY_nullfree#facebookincubator#3 38.03ms 26.30 array_constructor_ARRAY_nullfree##2_null 52.86ms 18.92 array_constructor_ARRAY_nullfree##2_const 54.97ms 18.19 array_constructor_ARRAY_nulls#facebookincubator#1 30.61ms 32.66 array_constructor_ARRAY_nulls#facebookincubator#2 55.01ms 18.18 array_constructor_ARRAY_nulls#facebookincubator#3 80.69ms 12.39 array_constructor_ARRAY_nulls##2_null 69.10ms 14.47 array_constructor_ARRAY_nulls##2_const 103.85ms 9.63 After: array_constructor_ARRAY_nullfree#facebookincubator#1 15.43ms 64.80 array_constructor_ARRAY_nullfree#facebookincubator#2 24.50ms 40.81 array_constructor_ARRAY_nullfree#facebookincubator#3 35.12ms 28.47 array_constructor_ARRAY_nullfree##2_null 54.52ms 18.34 array_constructor_ARRAY_nullfree##2_const 43.28ms 23.10 array_constructor_ARRAY_nulls#facebookincubator#1 28.60ms 34.96 array_constructor_ARRAY_nulls#facebookincubator#2 50.82ms 19.68 array_constructor_ARRAY_nulls#facebookincubator#3 70.31ms 14.22 array_constructor_ARRAY_nulls##2_null 64.43ms 15.52 array_constructor_ARRAY_nulls##2_const 80.71ms 12.39 Before: array_constructor_INTEGER_nullfree#facebookincubator#1 19.72ms 50.71 array_constructor_INTEGER_nullfree#facebookincubator#2 34.51ms 28.97 array_constructor_INTEGER_nullfree#facebookincubator#3 47.95ms 20.86 array_constructor_INTEGER_nullfree##2_null 58.68ms 17.04 array_constructor_INTEGER_nullfree##2_const 45.15ms 22.15 array_constructor_INTEGER_nulls#facebookincubator#1 29.99ms 33.34 array_constructor_INTEGER_nulls#facebookincubator#2 55.32ms 18.08 array_constructor_INTEGER_nulls#facebookincubator#3 78.53ms 12.73 array_constructor_INTEGER_nulls##2_null 72.24ms 13.84 array_constructor_INTEGER_nulls##2_const 71.13ms 14.06 After: array_constructor_INTEGER_nullfree#facebookincubator#1 3.49ms 286.59 array_constructor_INTEGER_nullfree#facebookincubator#2 7.91ms 126.46 array_constructor_INTEGER_nullfree#facebookincubator#3 11.99ms 83.41 array_constructor_INTEGER_nullfree##2_null 12.57ms 79.55 array_constructor_INTEGER_nullfree##2_const 11.03ms 90.67 array_constructor_INTEGER_nulls#facebookincubator#1 4.37ms 228.97 array_constructor_INTEGER_nulls#facebookincubator#2 9.99ms 100.14 array_constructor_INTEGER_nulls#facebookincubator#3 14.79ms 67.60 array_constructor_INTEGER_nulls##2_null 12.21ms 81.92 array_constructor_INTEGER_nulls##2_const 12.64ms 79.12 Before: array_constructor_MAP_nullfree#facebookincubator#1 17.34ms 57.65 array_constructor_MAP_nullfree#facebookincubator#2 29.84ms 33.51 array_constructor_MAP_nullfree#facebookincubator#3 41.51ms 24.09 array_constructor_MAP_nullfree##2_null 56.57ms 17.68 array_constructor_MAP_nullfree##2_const 71.68ms 13.95 array_constructor_MAP_nulls#facebookincubator#1 36.22ms 27.61 array_constructor_MAP_nulls#facebookincubator#2 68.18ms 14.67 array_constructor_MAP_nulls#facebookincubator#3 95.12ms 10.51 array_constructor_MAP_nulls##2_null 86.42ms 11.57 array_constructor_MAP_nulls##2_const 120.10ms 8.33 After: array_constructor_MAP_nullfree#facebookincubator#1 17.38ms 57.53 array_constructor_MAP_nullfree#facebookincubator#2 29.41ms 34.00 array_constructor_MAP_nullfree#facebookincubator#3 38.30ms 26.11 array_constructor_MAP_nullfree##2_null 58.52ms 17.09 array_constructor_MAP_nullfree##2_const 48.62ms 20.57 array_constructor_MAP_nulls#facebookincubator#1 30.60ms 32.68 array_constructor_MAP_nulls#facebookincubator#2 53.94ms 18.54 array_constructor_MAP_nulls#facebookincubator#3 86.48ms 11.56 array_constructor_MAP_nulls##2_null 69.53ms 14.38 array_constructor_MAP_nulls##2_const 87.56ms 11.42 Before: array_constructor_ROW_nullfree#facebookincubator#1 33.88ms 29.52 array_constructor_ROW_nullfree#facebookincubator#2 62.00ms 16.13 array_constructor_ROW_nullfree#facebookincubator#3 89.54ms 11.17 array_constructor_ROW_nullfree##2_null 78.46ms 12.75 array_constructor_ROW_nullfree##2_const 95.53ms 10.47 array_constructor_ROW_nulls#facebookincubator#1 44.11ms 22.67 array_constructor_ROW_nulls#facebookincubator#2 115.43ms 8.66 array_constructor_ROW_nulls#facebookincubator#3 173.61ms 5.76 array_constructor_ROW_nulls##2_null 130.40ms 7.67 array_constructor_ROW_nulls##2_const 169.97ms 5.88 After: array_constructor_ROW_nullfree#facebookincubator#1 5.64ms 177.44 array_constructor_ROW_nullfree#facebookincubator#2 14.40ms 69.44 array_constructor_ROW_nullfree#facebookincubator#3 21.46ms 46.59 array_constructor_ROW_nullfree##2_null 19.14ms 52.26 array_constructor_ROW_nullfree##2_const 18.60ms 53.77 array_constructor_ROW_nulls#facebookincubator#1 10.97ms 91.18 array_constructor_ROW_nulls#facebookincubator#2 18.29ms 54.67 array_constructor_ROW_nulls#facebookincubator#3 28.57ms 35.01 array_constructor_ROW_nulls##2_null 25.10ms 39.84 array_constructor_ROW_nulls##2_const 24.55ms 40.74 ``` Differential Revision: D49269500
4ffe92e
to
881069c
Compare
This pull request was exported from Phabricator. Differential Revision: D49269500 |
Summary: array_constructor is very slow: facebookincubator#5958 (comment) array_constructor uses BaseVector::copyRanges, which is somewhat fast for arrays and maps, but very slow for primitive types: ``` FlatVector.h void copyRanges( const BaseVector* source, const folly::Range<const BaseVector::CopyRange*>& ranges) override { for (auto& range : ranges) { copy(source, range.targetIndex, range.sourceIndex, range.count); } } ``` FlatVector<T>::copy(source, rows, toSourceRow) is faster. Switching from copyRanges to copy speeds up array_constructor for primitive types and structs significantly. Yet, this change makes arrays and maps slower. The slowness is due to ArrayVector and MapVector not having implementation for copy(source, rows, toSourceRow). They rely on BaseVector::copy to translate rows + toSourceRow to ranges. This extra processing causes perf regression. Hence, we use copy for primitive types and structs of these and copyRanges for everything else. We also optimize FlatVector::copyRanges (which is used by Array/MapVector::copyRanges). ``` Before: array_constructor_ARRAY_nullfree#facebookincubator#1 16.80ms 59.53 array_constructor_ARRAY_nullfree#facebookincubator#2 27.02ms 37.01 array_constructor_ARRAY_nullfree#facebookincubator#3 38.03ms 26.30 array_constructor_ARRAY_nullfree##2_null 52.86ms 18.92 array_constructor_ARRAY_nullfree##2_const 54.97ms 18.19 array_constructor_ARRAY_nulls#facebookincubator#1 30.61ms 32.66 array_constructor_ARRAY_nulls#facebookincubator#2 55.01ms 18.18 array_constructor_ARRAY_nulls#facebookincubator#3 80.69ms 12.39 array_constructor_ARRAY_nulls##2_null 69.10ms 14.47 array_constructor_ARRAY_nulls##2_const 103.85ms 9.63 After: array_constructor_ARRAY_nullfree#facebookincubator#1 15.43ms 64.80 array_constructor_ARRAY_nullfree#facebookincubator#2 24.50ms 40.81 array_constructor_ARRAY_nullfree#facebookincubator#3 35.12ms 28.47 array_constructor_ARRAY_nullfree##2_null 54.52ms 18.34 array_constructor_ARRAY_nullfree##2_const 43.28ms 23.10 array_constructor_ARRAY_nulls#facebookincubator#1 28.60ms 34.96 array_constructor_ARRAY_nulls#facebookincubator#2 50.82ms 19.68 array_constructor_ARRAY_nulls#facebookincubator#3 70.31ms 14.22 array_constructor_ARRAY_nulls##2_null 64.43ms 15.52 array_constructor_ARRAY_nulls##2_const 80.71ms 12.39 Before: array_constructor_INTEGER_nullfree#facebookincubator#1 19.72ms 50.71 array_constructor_INTEGER_nullfree#facebookincubator#2 34.51ms 28.97 array_constructor_INTEGER_nullfree#facebookincubator#3 47.95ms 20.86 array_constructor_INTEGER_nullfree##2_null 58.68ms 17.04 array_constructor_INTEGER_nullfree##2_const 45.15ms 22.15 array_constructor_INTEGER_nulls#facebookincubator#1 29.99ms 33.34 array_constructor_INTEGER_nulls#facebookincubator#2 55.32ms 18.08 array_constructor_INTEGER_nulls#facebookincubator#3 78.53ms 12.73 array_constructor_INTEGER_nulls##2_null 72.24ms 13.84 array_constructor_INTEGER_nulls##2_const 71.13ms 14.06 After: array_constructor_INTEGER_nullfree#facebookincubator#1 3.49ms 286.59 array_constructor_INTEGER_nullfree#facebookincubator#2 7.91ms 126.46 array_constructor_INTEGER_nullfree#facebookincubator#3 11.99ms 83.41 array_constructor_INTEGER_nullfree##2_null 12.57ms 79.55 array_constructor_INTEGER_nullfree##2_const 11.03ms 90.67 array_constructor_INTEGER_nulls#facebookincubator#1 4.37ms 228.97 array_constructor_INTEGER_nulls#facebookincubator#2 9.99ms 100.14 array_constructor_INTEGER_nulls#facebookincubator#3 14.79ms 67.60 array_constructor_INTEGER_nulls##2_null 12.21ms 81.92 array_constructor_INTEGER_nulls##2_const 12.64ms 79.12 Before: array_constructor_MAP_nullfree#facebookincubator#1 17.34ms 57.65 array_constructor_MAP_nullfree#facebookincubator#2 29.84ms 33.51 array_constructor_MAP_nullfree#facebookincubator#3 41.51ms 24.09 array_constructor_MAP_nullfree##2_null 56.57ms 17.68 array_constructor_MAP_nullfree##2_const 71.68ms 13.95 array_constructor_MAP_nulls#facebookincubator#1 36.22ms 27.61 array_constructor_MAP_nulls#facebookincubator#2 68.18ms 14.67 array_constructor_MAP_nulls#facebookincubator#3 95.12ms 10.51 array_constructor_MAP_nulls##2_null 86.42ms 11.57 array_constructor_MAP_nulls##2_const 120.10ms 8.33 After: array_constructor_MAP_nullfree#facebookincubator#1 17.38ms 57.53 array_constructor_MAP_nullfree#facebookincubator#2 29.41ms 34.00 array_constructor_MAP_nullfree#facebookincubator#3 38.30ms 26.11 array_constructor_MAP_nullfree##2_null 58.52ms 17.09 array_constructor_MAP_nullfree##2_const 48.62ms 20.57 array_constructor_MAP_nulls#facebookincubator#1 30.60ms 32.68 array_constructor_MAP_nulls#facebookincubator#2 53.94ms 18.54 array_constructor_MAP_nulls#facebookincubator#3 86.48ms 11.56 array_constructor_MAP_nulls##2_null 69.53ms 14.38 array_constructor_MAP_nulls##2_const 87.56ms 11.42 Before: array_constructor_ROW_nullfree#facebookincubator#1 33.88ms 29.52 array_constructor_ROW_nullfree#facebookincubator#2 62.00ms 16.13 array_constructor_ROW_nullfree#facebookincubator#3 89.54ms 11.17 array_constructor_ROW_nullfree##2_null 78.46ms 12.75 array_constructor_ROW_nullfree##2_const 95.53ms 10.47 array_constructor_ROW_nulls#facebookincubator#1 44.11ms 22.67 array_constructor_ROW_nulls#facebookincubator#2 115.43ms 8.66 array_constructor_ROW_nulls#facebookincubator#3 173.61ms 5.76 array_constructor_ROW_nulls##2_null 130.40ms 7.67 array_constructor_ROW_nulls##2_const 169.97ms 5.88 After: array_constructor_ROW_nullfree#facebookincubator#1 5.64ms 177.44 array_constructor_ROW_nullfree#facebookincubator#2 14.40ms 69.44 array_constructor_ROW_nullfree#facebookincubator#3 21.46ms 46.59 array_constructor_ROW_nullfree##2_null 19.14ms 52.26 array_constructor_ROW_nullfree##2_const 18.60ms 53.77 array_constructor_ROW_nulls#facebookincubator#1 10.97ms 91.18 array_constructor_ROW_nulls#facebookincubator#2 18.29ms 54.67 array_constructor_ROW_nulls#facebookincubator#3 28.57ms 35.01 array_constructor_ROW_nulls##2_null 25.10ms 39.84 array_constructor_ROW_nulls##2_const 24.55ms 40.74 ``` Differential Revision: D49269500
881069c
to
5d2e8fc
Compare
This pull request was exported from Phabricator. Differential Revision: D49269500 |
Summary: array_constructor is very slow: facebookincubator#5958 (comment) array_constructor uses BaseVector::copyRanges, which is somewhat fast for arrays and maps, but very slow for primitive types: ``` FlatVector.h void copyRanges( const BaseVector* source, const folly::Range<const BaseVector::CopyRange*>& ranges) override { for (auto& range : ranges) { copy(source, range.targetIndex, range.sourceIndex, range.count); } } ``` FlatVector<T>::copy(source, rows, toSourceRow) is faster. Switching from copyRanges to copy speeds up array_constructor for primitive types and structs significantly. Yet, this change makes arrays and maps slower. The slowness is due to ArrayVector and MapVector not having implementation for copy(source, rows, toSourceRow). They rely on BaseVector::copy to translate rows + toSourceRow to ranges. This extra processing causes perf regression. Hence, we use copy for primitive types and structs of these and copyRanges for everything else. We also optimize FlatVector::copyRanges (which is used by Array/MapVector::copyRanges). ``` Before: array_constructor_ARRAY_nullfree#facebookincubator#1 16.80ms 59.53 array_constructor_ARRAY_nullfree#facebookincubator#2 27.02ms 37.01 array_constructor_ARRAY_nullfree#facebookincubator#3 38.03ms 26.30 array_constructor_ARRAY_nullfree##2_null 52.86ms 18.92 array_constructor_ARRAY_nullfree##2_const 54.97ms 18.19 array_constructor_ARRAY_nulls#facebookincubator#1 30.61ms 32.66 array_constructor_ARRAY_nulls#facebookincubator#2 55.01ms 18.18 array_constructor_ARRAY_nulls#facebookincubator#3 80.69ms 12.39 array_constructor_ARRAY_nulls##2_null 69.10ms 14.47 array_constructor_ARRAY_nulls##2_const 103.85ms 9.63 After: array_constructor_ARRAY_nullfree#facebookincubator#1 15.43ms 64.80 array_constructor_ARRAY_nullfree#facebookincubator#2 24.50ms 40.81 array_constructor_ARRAY_nullfree#facebookincubator#3 35.12ms 28.47 array_constructor_ARRAY_nullfree##2_null 54.52ms 18.34 array_constructor_ARRAY_nullfree##2_const 43.28ms 23.10 array_constructor_ARRAY_nulls#facebookincubator#1 28.60ms 34.96 array_constructor_ARRAY_nulls#facebookincubator#2 50.82ms 19.68 array_constructor_ARRAY_nulls#facebookincubator#3 70.31ms 14.22 array_constructor_ARRAY_nulls##2_null 64.43ms 15.52 array_constructor_ARRAY_nulls##2_const 80.71ms 12.39 Before: array_constructor_INTEGER_nullfree#facebookincubator#1 19.72ms 50.71 array_constructor_INTEGER_nullfree#facebookincubator#2 34.51ms 28.97 array_constructor_INTEGER_nullfree#facebookincubator#3 47.95ms 20.86 array_constructor_INTEGER_nullfree##2_null 58.68ms 17.04 array_constructor_INTEGER_nullfree##2_const 45.15ms 22.15 array_constructor_INTEGER_nulls#facebookincubator#1 29.99ms 33.34 array_constructor_INTEGER_nulls#facebookincubator#2 55.32ms 18.08 array_constructor_INTEGER_nulls#facebookincubator#3 78.53ms 12.73 array_constructor_INTEGER_nulls##2_null 72.24ms 13.84 array_constructor_INTEGER_nulls##2_const 71.13ms 14.06 After: array_constructor_INTEGER_nullfree#facebookincubator#1 3.49ms 286.59 array_constructor_INTEGER_nullfree#facebookincubator#2 7.91ms 126.46 array_constructor_INTEGER_nullfree#facebookincubator#3 11.99ms 83.41 array_constructor_INTEGER_nullfree##2_null 12.57ms 79.55 array_constructor_INTEGER_nullfree##2_const 11.03ms 90.67 array_constructor_INTEGER_nulls#facebookincubator#1 4.37ms 228.97 array_constructor_INTEGER_nulls#facebookincubator#2 9.99ms 100.14 array_constructor_INTEGER_nulls#facebookincubator#3 14.79ms 67.60 array_constructor_INTEGER_nulls##2_null 12.21ms 81.92 array_constructor_INTEGER_nulls##2_const 12.64ms 79.12 Before: array_constructor_MAP_nullfree#facebookincubator#1 17.34ms 57.65 array_constructor_MAP_nullfree#facebookincubator#2 29.84ms 33.51 array_constructor_MAP_nullfree#facebookincubator#3 41.51ms 24.09 array_constructor_MAP_nullfree##2_null 56.57ms 17.68 array_constructor_MAP_nullfree##2_const 71.68ms 13.95 array_constructor_MAP_nulls#facebookincubator#1 36.22ms 27.61 array_constructor_MAP_nulls#facebookincubator#2 68.18ms 14.67 array_constructor_MAP_nulls#facebookincubator#3 95.12ms 10.51 array_constructor_MAP_nulls##2_null 86.42ms 11.57 array_constructor_MAP_nulls##2_const 120.10ms 8.33 After: array_constructor_MAP_nullfree#facebookincubator#1 17.38ms 57.53 array_constructor_MAP_nullfree#facebookincubator#2 29.41ms 34.00 array_constructor_MAP_nullfree#facebookincubator#3 38.30ms 26.11 array_constructor_MAP_nullfree##2_null 58.52ms 17.09 array_constructor_MAP_nullfree##2_const 48.62ms 20.57 array_constructor_MAP_nulls#facebookincubator#1 30.60ms 32.68 array_constructor_MAP_nulls#facebookincubator#2 53.94ms 18.54 array_constructor_MAP_nulls#facebookincubator#3 86.48ms 11.56 array_constructor_MAP_nulls##2_null 69.53ms 14.38 array_constructor_MAP_nulls##2_const 87.56ms 11.42 Before: array_constructor_ROW_nullfree#facebookincubator#1 33.88ms 29.52 array_constructor_ROW_nullfree#facebookincubator#2 62.00ms 16.13 array_constructor_ROW_nullfree#facebookincubator#3 89.54ms 11.17 array_constructor_ROW_nullfree##2_null 78.46ms 12.75 array_constructor_ROW_nullfree##2_const 95.53ms 10.47 array_constructor_ROW_nulls#facebookincubator#1 44.11ms 22.67 array_constructor_ROW_nulls#facebookincubator#2 115.43ms 8.66 array_constructor_ROW_nulls#facebookincubator#3 173.61ms 5.76 array_constructor_ROW_nulls##2_null 130.40ms 7.67 array_constructor_ROW_nulls##2_const 169.97ms 5.88 After: array_constructor_ROW_nullfree#facebookincubator#1 5.64ms 177.44 array_constructor_ROW_nullfree#facebookincubator#2 14.40ms 69.44 array_constructor_ROW_nullfree#facebookincubator#3 21.46ms 46.59 array_constructor_ROW_nullfree##2_null 19.14ms 52.26 array_constructor_ROW_nullfree##2_const 18.60ms 53.77 array_constructor_ROW_nulls#facebookincubator#1 10.97ms 91.18 array_constructor_ROW_nulls#facebookincubator#2 18.29ms 54.67 array_constructor_ROW_nulls#facebookincubator#3 28.57ms 35.01 array_constructor_ROW_nulls##2_null 25.10ms 39.84 array_constructor_ROW_nulls##2_const 24.55ms 40.74 ``` Differential Revision: D49269500
5d2e8fc
to
39c2744
Compare
This pull request was exported from Phabricator. Differential Revision: D49269500 |
39c2744
to
9a61532
Compare
Summary: array_constructor is very slow: facebookincubator#5958 (comment) array_constructor uses BaseVector::copyRanges, which is somewhat fast for arrays and maps, but very slow for primitive types: ``` FlatVector.h void copyRanges( const BaseVector* source, const folly::Range<const BaseVector::CopyRange*>& ranges) override { for (auto& range : ranges) { copy(source, range.targetIndex, range.sourceIndex, range.count); } } ``` FlatVector<T>::copy(source, rows, toSourceRow) is faster. Switching from copyRanges to copy speeds up array_constructor for primitive types and structs significantly. Yet, this change makes arrays and maps slower. The slowness is due to ArrayVector and MapVector not having implementation for copy(source, rows, toSourceRow). They rely on BaseVector::copy to translate rows + toSourceRow to ranges. This extra processing causes perf regression. Hence, we use copy for primitive types and structs of these and copyRanges for everything else. We also optimize FlatVector::copyRanges (which is used by Array/MapVector::copyRanges). ``` Before: array_constructor_ARRAY_nullfree#facebookincubator#1 16.80ms 59.53 array_constructor_ARRAY_nullfree#facebookincubator#2 27.02ms 37.01 array_constructor_ARRAY_nullfree#facebookincubator#3 38.03ms 26.30 array_constructor_ARRAY_nullfree##2_null 52.86ms 18.92 array_constructor_ARRAY_nullfree##2_const 54.97ms 18.19 array_constructor_ARRAY_nulls#facebookincubator#1 30.61ms 32.66 array_constructor_ARRAY_nulls#facebookincubator#2 55.01ms 18.18 array_constructor_ARRAY_nulls#facebookincubator#3 80.69ms 12.39 array_constructor_ARRAY_nulls##2_null 69.10ms 14.47 array_constructor_ARRAY_nulls##2_const 103.85ms 9.63 After: array_constructor_ARRAY_nullfree#facebookincubator#1 15.43ms 64.80 array_constructor_ARRAY_nullfree#facebookincubator#2 24.50ms 40.81 array_constructor_ARRAY_nullfree#facebookincubator#3 35.12ms 28.47 array_constructor_ARRAY_nullfree##2_null 54.52ms 18.34 array_constructor_ARRAY_nullfree##2_const 43.28ms 23.10 array_constructor_ARRAY_nulls#facebookincubator#1 28.60ms 34.96 array_constructor_ARRAY_nulls#facebookincubator#2 50.82ms 19.68 array_constructor_ARRAY_nulls#facebookincubator#3 70.31ms 14.22 array_constructor_ARRAY_nulls##2_null 64.43ms 15.52 array_constructor_ARRAY_nulls##2_const 80.71ms 12.39 Before: array_constructor_INTEGER_nullfree#facebookincubator#1 19.72ms 50.71 array_constructor_INTEGER_nullfree#facebookincubator#2 34.51ms 28.97 array_constructor_INTEGER_nullfree#facebookincubator#3 47.95ms 20.86 array_constructor_INTEGER_nullfree##2_null 58.68ms 17.04 array_constructor_INTEGER_nullfree##2_const 45.15ms 22.15 array_constructor_INTEGER_nulls#facebookincubator#1 29.99ms 33.34 array_constructor_INTEGER_nulls#facebookincubator#2 55.32ms 18.08 array_constructor_INTEGER_nulls#facebookincubator#3 78.53ms 12.73 array_constructor_INTEGER_nulls##2_null 72.24ms 13.84 array_constructor_INTEGER_nulls##2_const 71.13ms 14.06 After: array_constructor_INTEGER_nullfree#facebookincubator#1 3.49ms 286.59 array_constructor_INTEGER_nullfree#facebookincubator#2 7.91ms 126.46 array_constructor_INTEGER_nullfree#facebookincubator#3 11.99ms 83.41 array_constructor_INTEGER_nullfree##2_null 12.57ms 79.55 array_constructor_INTEGER_nullfree##2_const 11.03ms 90.67 array_constructor_INTEGER_nulls#facebookincubator#1 4.37ms 228.97 array_constructor_INTEGER_nulls#facebookincubator#2 9.99ms 100.14 array_constructor_INTEGER_nulls#facebookincubator#3 14.79ms 67.60 array_constructor_INTEGER_nulls##2_null 12.21ms 81.92 array_constructor_INTEGER_nulls##2_const 12.64ms 79.12 Before: array_constructor_MAP_nullfree#facebookincubator#1 17.34ms 57.65 array_constructor_MAP_nullfree#facebookincubator#2 29.84ms 33.51 array_constructor_MAP_nullfree#facebookincubator#3 41.51ms 24.09 array_constructor_MAP_nullfree##2_null 56.57ms 17.68 array_constructor_MAP_nullfree##2_const 71.68ms 13.95 array_constructor_MAP_nulls#facebookincubator#1 36.22ms 27.61 array_constructor_MAP_nulls#facebookincubator#2 68.18ms 14.67 array_constructor_MAP_nulls#facebookincubator#3 95.12ms 10.51 array_constructor_MAP_nulls##2_null 86.42ms 11.57 array_constructor_MAP_nulls##2_const 120.10ms 8.33 After: array_constructor_MAP_nullfree#facebookincubator#1 17.38ms 57.53 array_constructor_MAP_nullfree#facebookincubator#2 29.41ms 34.00 array_constructor_MAP_nullfree#facebookincubator#3 38.30ms 26.11 array_constructor_MAP_nullfree##2_null 58.52ms 17.09 array_constructor_MAP_nullfree##2_const 48.62ms 20.57 array_constructor_MAP_nulls#facebookincubator#1 30.60ms 32.68 array_constructor_MAP_nulls#facebookincubator#2 53.94ms 18.54 array_constructor_MAP_nulls#facebookincubator#3 86.48ms 11.56 array_constructor_MAP_nulls##2_null 69.53ms 14.38 array_constructor_MAP_nulls##2_const 87.56ms 11.42 Before: array_constructor_ROW_nullfree#facebookincubator#1 33.88ms 29.52 array_constructor_ROW_nullfree#facebookincubator#2 62.00ms 16.13 array_constructor_ROW_nullfree#facebookincubator#3 89.54ms 11.17 array_constructor_ROW_nullfree##2_null 78.46ms 12.75 array_constructor_ROW_nullfree##2_const 95.53ms 10.47 array_constructor_ROW_nulls#facebookincubator#1 44.11ms 22.67 array_constructor_ROW_nulls#facebookincubator#2 115.43ms 8.66 array_constructor_ROW_nulls#facebookincubator#3 173.61ms 5.76 array_constructor_ROW_nulls##2_null 130.40ms 7.67 array_constructor_ROW_nulls##2_const 169.97ms 5.88 After: array_constructor_ROW_nullfree#facebookincubator#1 5.64ms 177.44 array_constructor_ROW_nullfree#facebookincubator#2 14.40ms 69.44 array_constructor_ROW_nullfree#facebookincubator#3 21.46ms 46.59 array_constructor_ROW_nullfree##2_null 19.14ms 52.26 array_constructor_ROW_nullfree##2_const 18.60ms 53.77 array_constructor_ROW_nulls#facebookincubator#1 10.97ms 91.18 array_constructor_ROW_nulls#facebookincubator#2 18.29ms 54.67 array_constructor_ROW_nulls#facebookincubator#3 28.57ms 35.01 array_constructor_ROW_nulls##2_null 25.10ms 39.84 array_constructor_ROW_nulls##2_const 24.55ms 40.74 ``` Differential Revision: D49269500
This pull request was exported from Phabricator. Differential Revision: D49269500 |
1 similar comment
This pull request was exported from Phabricator. Differential Revision: D49269500 |
Summary: Pull Request resolved: facebookincubator#6566 array_constructor is very slow: facebookincubator#5958 (comment) array_constructor uses BaseVector::copyRanges, which is somewhat fast for arrays and maps, but very slow for primitive types: ``` FlatVector.h void copyRanges( const BaseVector* source, const folly::Range<const BaseVector::CopyRange*>& ranges) override { for (auto& range : ranges) { copy(source, range.targetIndex, range.sourceIndex, range.count); } } ``` FlatVector<T>::copy(source, rows, toSourceRow) is faster. Switching from copyRanges to copy speeds up array_constructor for primitive types and structs significantly. Yet, this change makes arrays and maps slower. The slowness is due to ArrayVector and MapVector not having implementation for copy(source, rows, toSourceRow). They rely on BaseVector::copy to translate rows + toSourceRow to ranges. This extra processing causes perf regression. Hence, we use copy for primitive types and structs of these and copyRanges for everything else. We also optimize FlatVector::copyRanges (which is used by Array/MapVector::copyRanges). ``` Before: array_constructor_ARRAY_nullfree#facebookincubator#1 16.80ms 59.53 array_constructor_ARRAY_nullfree#facebookincubator#2 27.02ms 37.01 array_constructor_ARRAY_nullfree#facebookincubator#3 38.03ms 26.30 array_constructor_ARRAY_nullfree##2_null 52.86ms 18.92 array_constructor_ARRAY_nullfree##2_const 54.97ms 18.19 array_constructor_ARRAY_nulls#facebookincubator#1 30.61ms 32.66 array_constructor_ARRAY_nulls#facebookincubator#2 55.01ms 18.18 array_constructor_ARRAY_nulls#facebookincubator#3 80.69ms 12.39 array_constructor_ARRAY_nulls##2_null 69.10ms 14.47 array_constructor_ARRAY_nulls##2_const 103.85ms 9.63 After: array_constructor_ARRAY_nullfree#facebookincubator#1 15.43ms 64.80 array_constructor_ARRAY_nullfree#facebookincubator#2 24.50ms 40.81 array_constructor_ARRAY_nullfree#facebookincubator#3 35.12ms 28.47 array_constructor_ARRAY_nullfree##2_null 54.52ms 18.34 array_constructor_ARRAY_nullfree##2_const 43.28ms 23.10 array_constructor_ARRAY_nulls#facebookincubator#1 28.60ms 34.96 array_constructor_ARRAY_nulls#facebookincubator#2 50.82ms 19.68 array_constructor_ARRAY_nulls#facebookincubator#3 70.31ms 14.22 array_constructor_ARRAY_nulls##2_null 64.43ms 15.52 array_constructor_ARRAY_nulls##2_const 80.71ms 12.39 Before: array_constructor_INTEGER_nullfree#facebookincubator#1 19.72ms 50.71 array_constructor_INTEGER_nullfree#facebookincubator#2 34.51ms 28.97 array_constructor_INTEGER_nullfree#facebookincubator#3 47.95ms 20.86 array_constructor_INTEGER_nullfree##2_null 58.68ms 17.04 array_constructor_INTEGER_nullfree##2_const 45.15ms 22.15 array_constructor_INTEGER_nulls#facebookincubator#1 29.99ms 33.34 array_constructor_INTEGER_nulls#facebookincubator#2 55.32ms 18.08 array_constructor_INTEGER_nulls#facebookincubator#3 78.53ms 12.73 array_constructor_INTEGER_nulls##2_null 72.24ms 13.84 array_constructor_INTEGER_nulls##2_const 71.13ms 14.06 After: array_constructor_INTEGER_nullfree#facebookincubator#1 3.49ms 286.59 array_constructor_INTEGER_nullfree#facebookincubator#2 7.91ms 126.46 array_constructor_INTEGER_nullfree#facebookincubator#3 11.99ms 83.41 array_constructor_INTEGER_nullfree##2_null 12.57ms 79.55 array_constructor_INTEGER_nullfree##2_const 11.03ms 90.67 array_constructor_INTEGER_nulls#facebookincubator#1 4.37ms 228.97 array_constructor_INTEGER_nulls#facebookincubator#2 9.99ms 100.14 array_constructor_INTEGER_nulls#facebookincubator#3 14.79ms 67.60 array_constructor_INTEGER_nulls##2_null 12.21ms 81.92 array_constructor_INTEGER_nulls##2_const 12.64ms 79.12 Before: array_constructor_MAP_nullfree#facebookincubator#1 17.34ms 57.65 array_constructor_MAP_nullfree#facebookincubator#2 29.84ms 33.51 array_constructor_MAP_nullfree#facebookincubator#3 41.51ms 24.09 array_constructor_MAP_nullfree##2_null 56.57ms 17.68 array_constructor_MAP_nullfree##2_const 71.68ms 13.95 array_constructor_MAP_nulls#facebookincubator#1 36.22ms 27.61 array_constructor_MAP_nulls#facebookincubator#2 68.18ms 14.67 array_constructor_MAP_nulls#facebookincubator#3 95.12ms 10.51 array_constructor_MAP_nulls##2_null 86.42ms 11.57 array_constructor_MAP_nulls##2_const 120.10ms 8.33 After: array_constructor_MAP_nullfree#facebookincubator#1 17.38ms 57.53 array_constructor_MAP_nullfree#facebookincubator#2 29.41ms 34.00 array_constructor_MAP_nullfree#facebookincubator#3 38.30ms 26.11 array_constructor_MAP_nullfree##2_null 58.52ms 17.09 array_constructor_MAP_nullfree##2_const 48.62ms 20.57 array_constructor_MAP_nulls#facebookincubator#1 30.60ms 32.68 array_constructor_MAP_nulls#facebookincubator#2 53.94ms 18.54 array_constructor_MAP_nulls#facebookincubator#3 86.48ms 11.56 array_constructor_MAP_nulls##2_null 69.53ms 14.38 array_constructor_MAP_nulls##2_const 87.56ms 11.42 Before: array_constructor_ROW_nullfree#facebookincubator#1 33.88ms 29.52 array_constructor_ROW_nullfree#facebookincubator#2 62.00ms 16.13 array_constructor_ROW_nullfree#facebookincubator#3 89.54ms 11.17 array_constructor_ROW_nullfree##2_null 78.46ms 12.75 array_constructor_ROW_nullfree##2_const 95.53ms 10.47 array_constructor_ROW_nulls#facebookincubator#1 44.11ms 22.67 array_constructor_ROW_nulls#facebookincubator#2 115.43ms 8.66 array_constructor_ROW_nulls#facebookincubator#3 173.61ms 5.76 array_constructor_ROW_nulls##2_null 130.40ms 7.67 array_constructor_ROW_nulls##2_const 169.97ms 5.88 After: array_constructor_ROW_nullfree#facebookincubator#1 5.64ms 177.44 array_constructor_ROW_nullfree#facebookincubator#2 14.40ms 69.44 array_constructor_ROW_nullfree#facebookincubator#3 21.46ms 46.59 array_constructor_ROW_nullfree##2_null 19.14ms 52.26 array_constructor_ROW_nullfree##2_const 18.60ms 53.77 array_constructor_ROW_nulls#facebookincubator#1 10.97ms 91.18 array_constructor_ROW_nulls#facebookincubator#2 18.29ms 54.67 array_constructor_ROW_nulls#facebookincubator#3 28.57ms 35.01 array_constructor_ROW_nulls##2_null 25.10ms 39.84 array_constructor_ROW_nulls##2_const 24.55ms 40.74 ``` Differential Revision: D49269500 fbshipit-source-id: 1f5ff279be27a55f72930457991cefec0d39fcf4
9a61532
to
fde358b
Compare
This pull request was exported from Phabricator. Differential Revision: D49269500 |
Summary: Pull Request resolved: facebookincubator#6566 FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment) ``` FlatVector.h void copyRanges( const BaseVector* source, const folly::Range<const BaseVector::CopyRange*>& ranges) override { for (auto& range : ranges) { copy(source, range.targetIndex, range.sourceIndex, range.count); } } ``` This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same. An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps. ``` Before: array_constructor_ARRAY_nullfree#facebookincubator#1 15.80ms 63.30 array_constructor_ARRAY_nullfree#facebookincubator#2 25.59ms 39.08 array_constructor_ARRAY_nullfree#facebookincubator#3 34.49ms 28.99 array_constructor_ARRAY_nullfree##2_null 58.96ms 16.96 array_constructor_ARRAY_nullfree##2_const 54.31ms 18.41 array_constructor_ARRAY_nulls#facebookincubator#1 33.50ms 29.85 array_constructor_ARRAY_nulls#facebookincubator#2 59.05ms 16.93 array_constructor_ARRAY_nulls#facebookincubator#3 88.36ms 11.32 array_constructor_ARRAY_nulls##2_null 74.53ms 13.42 array_constructor_ARRAY_nulls##2_const 102.54ms 9.75 After: array_constructor_ARRAY_nullfree#facebookincubator#1 15.53ms 64.39 array_constructor_ARRAY_nullfree#facebookincubator#2 25.75ms 38.84 array_constructor_ARRAY_nullfree#facebookincubator#3 34.37ms 29.10 array_constructor_ARRAY_nullfree##2_null 59.63ms 16.77 array_constructor_ARRAY_nullfree##2_const 41.36ms 24.18 array_constructor_ARRAY_nulls#facebookincubator#1 30.51ms 32.78 array_constructor_ARRAY_nulls#facebookincubator#2 55.13ms 18.14 array_constructor_ARRAY_nulls#facebookincubator#3 77.93ms 12.83 array_constructor_ARRAY_nulls##2_null 68.84ms 14.53 array_constructor_ARRAY_nulls##2_const 83.91ms 11.92 Before: array_constructor_MAP_nullfree#facebookincubator#1 16.89ms 59.20 array_constructor_MAP_nullfree#facebookincubator#2 29.24ms 34.20 array_constructor_MAP_nullfree#facebookincubator#3 41.11ms 24.33 array_constructor_MAP_nullfree##2_null 61.98ms 16.14 array_constructor_MAP_nullfree##2_const 67.44ms 14.83 array_constructor_MAP_nulls#facebookincubator#1 37.00ms 27.03 array_constructor_MAP_nulls#facebookincubator#2 67.76ms 14.76 array_constructor_MAP_nulls#facebookincubator#3 100.88ms 9.91 array_constructor_MAP_nulls##2_null 84.22ms 11.87 array_constructor_MAP_nulls##2_const 122.55ms 8.16 After: array_constructor_MAP_nullfree#facebookincubator#1 17.07ms 58.57 array_constructor_MAP_nullfree#facebookincubator#2 28.39ms 35.23 array_constructor_MAP_nullfree#facebookincubator#3 42.34ms 23.62 array_constructor_MAP_nullfree##2_null 65.24ms 15.33 array_constructor_MAP_nullfree##2_const 49.94ms 20.03 array_constructor_MAP_nulls#facebookincubator#1 34.34ms 29.12 array_constructor_MAP_nulls#facebookincubator#2 55.23ms 18.11 array_constructor_MAP_nulls#facebookincubator#3 82.64ms 12.10 array_constructor_MAP_nulls##2_null 70.74ms 14.14 array_constructor_MAP_nulls##2_const 88.13ms 11.35 ``` Differential Revision: D49269500 fbshipit-source-id: 5b58c3942224962691e9b7f114f5134513328b5d
Summary: FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment) ``` FlatVector.h void copyRanges( const BaseVector* source, const folly::Range<const BaseVector::CopyRange*>& ranges) override { for (auto& range : ranges) { copy(source, range.targetIndex, range.sourceIndex, range.count); } } ``` This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same. An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps. ``` Before: array_constructor_ARRAY_nullfree##2_const 54.31ms 18.41 array_constructor_ARRAY_nulls#facebookincubator#1 33.50ms 29.85 array_constructor_ARRAY_nulls#facebookincubator#2 59.05ms 16.93 array_constructor_ARRAY_nulls#facebookincubator#3 88.36ms 11.32 array_constructor_ARRAY_nulls##2_null 74.53ms 13.42 array_constructor_ARRAY_nulls##2_const 102.54ms 9.75 After: array_constructor_ARRAY_nullfree##2_const 41.36ms 24.18 array_constructor_ARRAY_nulls#facebookincubator#1 30.51ms 32.78 array_constructor_ARRAY_nulls#facebookincubator#2 55.13ms 18.14 array_constructor_ARRAY_nulls#facebookincubator#3 77.93ms 12.83 array_constructor_ARRAY_nulls##2_null 68.84ms 14.53 array_constructor_ARRAY_nulls##2_const 83.91ms 11.92 Before: array_constructor_MAP_nullfree##2_const 67.44ms 14.83 array_constructor_MAP_nulls#facebookincubator#1 37.00ms 27.03 array_constructor_MAP_nulls#facebookincubator#2 67.76ms 14.76 array_constructor_MAP_nulls#facebookincubator#3 100.88ms 9.91 array_constructor_MAP_nulls##2_null 84.22ms 11.87 array_constructor_MAP_nulls##2_const 122.55ms 8.16 After: array_constructor_MAP_nullfree##2_const 49.94ms 20.03 array_constructor_MAP_nulls#facebookincubator#1 34.34ms 29.12 array_constructor_MAP_nulls#facebookincubator#2 55.23ms 18.11 array_constructor_MAP_nulls#facebookincubator#3 82.64ms 12.10 array_constructor_MAP_nulls##2_null 70.74ms 14.14 array_constructor_MAP_nulls##2_const 88.13ms 11.35 ``` Differential Revision: D49269500
3bb8e21
to
c5c14d5
Compare
This pull request was exported from Phabricator. Differential Revision: D49269500 |
Summary: FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment) ``` FlatVector.h void copyRanges( const BaseVector* source, const folly::Range<const BaseVector::CopyRange*>& ranges) override { for (auto& range : ranges) { copy(source, range.targetIndex, range.sourceIndex, range.count); } } ``` This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same. An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps. ``` Before: array_constructor_ARRAY_nullfree##2_const 54.31ms 18.41 array_constructor_ARRAY_nulls#facebookincubator#1 33.50ms 29.85 array_constructor_ARRAY_nulls#facebookincubator#2 59.05ms 16.93 array_constructor_ARRAY_nulls#facebookincubator#3 88.36ms 11.32 array_constructor_ARRAY_nulls##2_null 74.53ms 13.42 array_constructor_ARRAY_nulls##2_const 102.54ms 9.75 After: array_constructor_ARRAY_nullfree##2_const 41.36ms 24.18 array_constructor_ARRAY_nulls#facebookincubator#1 30.51ms 32.78 array_constructor_ARRAY_nulls#facebookincubator#2 55.13ms 18.14 array_constructor_ARRAY_nulls#facebookincubator#3 77.93ms 12.83 array_constructor_ARRAY_nulls##2_null 68.84ms 14.53 array_constructor_ARRAY_nulls##2_const 83.91ms 11.92 Before: array_constructor_MAP_nullfree##2_const 67.44ms 14.83 array_constructor_MAP_nulls#facebookincubator#1 37.00ms 27.03 array_constructor_MAP_nulls#facebookincubator#2 67.76ms 14.76 array_constructor_MAP_nulls#facebookincubator#3 100.88ms 9.91 array_constructor_MAP_nulls##2_null 84.22ms 11.87 array_constructor_MAP_nulls##2_const 122.55ms 8.16 After: array_constructor_MAP_nullfree##2_const 49.94ms 20.03 array_constructor_MAP_nulls#facebookincubator#1 34.34ms 29.12 array_constructor_MAP_nulls#facebookincubator#2 55.23ms 18.11 array_constructor_MAP_nulls#facebookincubator#3 82.64ms 12.10 array_constructor_MAP_nulls##2_null 70.74ms 14.14 array_constructor_MAP_nulls##2_const 88.13ms 11.35 ``` Differential Revision: D49269500
c5c14d5
to
6df9270
Compare
This pull request was exported from Phabricator. Differential Revision: D49269500 |
Summary: FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment) ``` FlatVector.h void copyRanges( const BaseVector* source, const folly::Range<const BaseVector::CopyRange*>& ranges) override { for (auto& range : ranges) { copy(source, range.targetIndex, range.sourceIndex, range.count); } } ``` This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same. An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps. ``` Before: array_constructor_ARRAY_nullfree##2_const 54.31ms 18.41 array_constructor_ARRAY_nulls#facebookincubator#1 33.50ms 29.85 array_constructor_ARRAY_nulls#facebookincubator#2 59.05ms 16.93 array_constructor_ARRAY_nulls#facebookincubator#3 88.36ms 11.32 array_constructor_ARRAY_nulls##2_null 74.53ms 13.42 array_constructor_ARRAY_nulls##2_const 102.54ms 9.75 After: array_constructor_ARRAY_nullfree##2_const 41.36ms 24.18 array_constructor_ARRAY_nulls#facebookincubator#1 30.51ms 32.78 array_constructor_ARRAY_nulls#facebookincubator#2 55.13ms 18.14 array_constructor_ARRAY_nulls#facebookincubator#3 77.93ms 12.83 array_constructor_ARRAY_nulls##2_null 68.84ms 14.53 array_constructor_ARRAY_nulls##2_const 83.91ms 11.92 Before: array_constructor_MAP_nullfree##2_const 67.44ms 14.83 array_constructor_MAP_nulls#facebookincubator#1 37.00ms 27.03 array_constructor_MAP_nulls#facebookincubator#2 67.76ms 14.76 array_constructor_MAP_nulls#facebookincubator#3 100.88ms 9.91 array_constructor_MAP_nulls##2_null 84.22ms 11.87 array_constructor_MAP_nulls##2_const 122.55ms 8.16 After: array_constructor_MAP_nullfree##2_const 49.94ms 20.03 array_constructor_MAP_nulls#facebookincubator#1 34.34ms 29.12 array_constructor_MAP_nulls#facebookincubator#2 55.23ms 18.11 array_constructor_MAP_nulls#facebookincubator#3 82.64ms 12.10 array_constructor_MAP_nulls##2_null 70.74ms 14.14 array_constructor_MAP_nulls##2_const 88.13ms 11.35 ``` Differential Revision: D49269500
6df9270
to
94076ec
Compare
This pull request was exported from Phabricator. Differential Revision: D49269500 |
1 similar comment
This pull request was exported from Phabricator. Differential Revision: D49269500 |
94076ec
to
5b60720
Compare
Summary: Pull Request resolved: facebookincubator#6566 FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment) ``` FlatVector.h void copyRanges( const BaseVector* source, const folly::Range<const BaseVector::CopyRange*>& ranges) override { for (auto& range : ranges) { copy(source, range.targetIndex, range.sourceIndex, range.count); } } ``` This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same. An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps. ``` Before: array_constructor_ARRAY_nullfree##2_const 54.31ms 18.41 array_constructor_ARRAY_nulls#facebookincubator#1 33.50ms 29.85 array_constructor_ARRAY_nulls#facebookincubator#2 59.05ms 16.93 array_constructor_ARRAY_nulls#facebookincubator#3 88.36ms 11.32 array_constructor_ARRAY_nulls##2_null 74.53ms 13.42 array_constructor_ARRAY_nulls##2_const 102.54ms 9.75 After: array_constructor_ARRAY_nullfree##2_const 41.36ms 24.18 array_constructor_ARRAY_nulls#facebookincubator#1 30.51ms 32.78 array_constructor_ARRAY_nulls#facebookincubator#2 55.13ms 18.14 array_constructor_ARRAY_nulls#facebookincubator#3 77.93ms 12.83 array_constructor_ARRAY_nulls##2_null 68.84ms 14.53 array_constructor_ARRAY_nulls##2_const 83.91ms 11.92 Before: array_constructor_MAP_nullfree##2_const 67.44ms 14.83 array_constructor_MAP_nulls#facebookincubator#1 37.00ms 27.03 array_constructor_MAP_nulls#facebookincubator#2 67.76ms 14.76 array_constructor_MAP_nulls#facebookincubator#3 100.88ms 9.91 array_constructor_MAP_nulls##2_null 84.22ms 11.87 array_constructor_MAP_nulls##2_const 122.55ms 8.16 After: array_constructor_MAP_nullfree##2_const 49.94ms 20.03 array_constructor_MAP_nulls#facebookincubator#1 34.34ms 29.12 array_constructor_MAP_nulls#facebookincubator#2 55.23ms 18.11 array_constructor_MAP_nulls#facebookincubator#3 82.64ms 12.10 array_constructor_MAP_nulls##2_null 70.74ms 14.14 array_constructor_MAP_nulls##2_const 88.13ms 11.35 ``` Differential Revision: D49269500 fbshipit-source-id: 8fc304cce31f782aef818e6d8ce052360af8e832
This pull request was exported from Phabricator. Differential Revision: D49269500 |
Summary: Pull Request resolved: facebookincubator#6566 FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment) ``` FlatVector.h void copyRanges( const BaseVector* source, const folly::Range<const BaseVector::CopyRange*>& ranges) override { for (auto& range : ranges) { copy(source, range.targetIndex, range.sourceIndex, range.count); } } ``` This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same. An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps. ``` Before: array_constructor_ARRAY_nullfree##2_const 54.31ms 18.41 array_constructor_ARRAY_nulls#facebookincubator#1 33.50ms 29.85 array_constructor_ARRAY_nulls#facebookincubator#2 59.05ms 16.93 array_constructor_ARRAY_nulls#facebookincubator#3 88.36ms 11.32 array_constructor_ARRAY_nulls##2_null 74.53ms 13.42 array_constructor_ARRAY_nulls##2_const 102.54ms 9.75 After: array_constructor_ARRAY_nullfree##2_const 41.36ms 24.18 array_constructor_ARRAY_nulls#facebookincubator#1 30.51ms 32.78 array_constructor_ARRAY_nulls#facebookincubator#2 55.13ms 18.14 array_constructor_ARRAY_nulls#facebookincubator#3 77.93ms 12.83 array_constructor_ARRAY_nulls##2_null 68.84ms 14.53 array_constructor_ARRAY_nulls##2_const 83.91ms 11.92 Before: array_constructor_MAP_nullfree##2_const 67.44ms 14.83 array_constructor_MAP_nulls#facebookincubator#1 37.00ms 27.03 array_constructor_MAP_nulls#facebookincubator#2 67.76ms 14.76 array_constructor_MAP_nulls#facebookincubator#3 100.88ms 9.91 array_constructor_MAP_nulls##2_null 84.22ms 11.87 array_constructor_MAP_nulls##2_const 122.55ms 8.16 After: array_constructor_MAP_nullfree##2_const 49.94ms 20.03 array_constructor_MAP_nulls#facebookincubator#1 34.34ms 29.12 array_constructor_MAP_nulls#facebookincubator#2 55.23ms 18.11 array_constructor_MAP_nulls#facebookincubator#3 82.64ms 12.10 array_constructor_MAP_nulls##2_null 70.74ms 14.14 array_constructor_MAP_nulls##2_const 88.13ms 11.35 ``` Differential Revision: D49269500 fbshipit-source-id: 054dcec94cc3768443440ffb57d58c96f9b1f006
5b60720
to
ab29e53
Compare
This pull request was exported from Phabricator. Differential Revision: D49269500 |
Summary: Pull Request resolved: facebookincubator#6566 FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment) ``` FlatVector.h void copyRanges( const BaseVector* source, const folly::Range<const BaseVector::CopyRange*>& ranges) override { for (auto& range : ranges) { copy(source, range.targetIndex, range.sourceIndex, range.count); } } ``` This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same. An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps. ``` Before: array_constructor_ARRAY_nullfree##2_const 54.31ms 18.41 array_constructor_ARRAY_nulls#facebookincubator#1 33.50ms 29.85 array_constructor_ARRAY_nulls#facebookincubator#2 59.05ms 16.93 array_constructor_ARRAY_nulls#facebookincubator#3 88.36ms 11.32 array_constructor_ARRAY_nulls##2_null 74.53ms 13.42 array_constructor_ARRAY_nulls##2_const 102.54ms 9.75 After: array_constructor_ARRAY_nullfree##2_const 41.36ms 24.18 array_constructor_ARRAY_nulls#facebookincubator#1 30.51ms 32.78 array_constructor_ARRAY_nulls#facebookincubator#2 55.13ms 18.14 array_constructor_ARRAY_nulls#facebookincubator#3 77.93ms 12.83 array_constructor_ARRAY_nulls##2_null 68.84ms 14.53 array_constructor_ARRAY_nulls##2_const 83.91ms 11.92 Before: array_constructor_MAP_nullfree##2_const 67.44ms 14.83 array_constructor_MAP_nulls#facebookincubator#1 37.00ms 27.03 array_constructor_MAP_nulls#facebookincubator#2 67.76ms 14.76 array_constructor_MAP_nulls#facebookincubator#3 100.88ms 9.91 array_constructor_MAP_nulls##2_null 84.22ms 11.87 array_constructor_MAP_nulls##2_const 122.55ms 8.16 After: array_constructor_MAP_nullfree##2_const 49.94ms 20.03 array_constructor_MAP_nulls#facebookincubator#1 34.34ms 29.12 array_constructor_MAP_nulls#facebookincubator#2 55.23ms 18.11 array_constructor_MAP_nulls#facebookincubator#3 82.64ms 12.10 array_constructor_MAP_nulls##2_null 70.74ms 14.14 array_constructor_MAP_nulls##2_const 88.13ms 11.35 ``` Reviewed By: laithsakka Differential Revision: D49269500 fbshipit-source-id: 2fdf6f2a717a3bc4bf54fb3d9bd970157a8c415b
ab29e53
to
b3e532f
Compare
Summary: Fixes facebookincubator#6611 Pull Request resolved: facebookincubator#6613 Differential Revision: D49362420 Pulled By: mbasmanova fbshipit-source-id: 5a78e1982b2607c400794a61cf368ab4f644d749
Summary: Pull Request resolved: facebookincubator#6566 FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment) ``` FlatVector.h void copyRanges( const BaseVector* source, const folly::Range<const BaseVector::CopyRange*>& ranges) override { for (auto& range : ranges) { copy(source, range.targetIndex, range.sourceIndex, range.count); } } ``` This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same. An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps. ``` Before: array_constructor_ARRAY_nullfree##2_const 54.31ms 18.41 array_constructor_ARRAY_nulls#facebookincubator#1 33.50ms 29.85 array_constructor_ARRAY_nulls#facebookincubator#2 59.05ms 16.93 array_constructor_ARRAY_nulls#facebookincubator#3 88.36ms 11.32 array_constructor_ARRAY_nulls##2_null 74.53ms 13.42 array_constructor_ARRAY_nulls##2_const 102.54ms 9.75 After: array_constructor_ARRAY_nullfree##2_const 41.36ms 24.18 array_constructor_ARRAY_nulls#facebookincubator#1 30.51ms 32.78 array_constructor_ARRAY_nulls#facebookincubator#2 55.13ms 18.14 array_constructor_ARRAY_nulls#facebookincubator#3 77.93ms 12.83 array_constructor_ARRAY_nulls##2_null 68.84ms 14.53 array_constructor_ARRAY_nulls##2_const 83.91ms 11.92 Before: array_constructor_MAP_nullfree##2_const 67.44ms 14.83 array_constructor_MAP_nulls#facebookincubator#1 37.00ms 27.03 array_constructor_MAP_nulls#facebookincubator#2 67.76ms 14.76 array_constructor_MAP_nulls#facebookincubator#3 100.88ms 9.91 array_constructor_MAP_nulls##2_null 84.22ms 11.87 array_constructor_MAP_nulls##2_const 122.55ms 8.16 After: array_constructor_MAP_nullfree##2_const 49.94ms 20.03 array_constructor_MAP_nulls#facebookincubator#1 34.34ms 29.12 array_constructor_MAP_nulls#facebookincubator#2 55.23ms 18.11 array_constructor_MAP_nulls#facebookincubator#3 82.64ms 12.10 array_constructor_MAP_nulls##2_null 70.74ms 14.14 array_constructor_MAP_nulls##2_const 88.13ms 11.35 ``` Reviewed By: laithsakka Differential Revision: D49269500 fbshipit-source-id: b7271a981d33885b2d4bc3219bc15f069bcd06d3
This pull request was exported from Phabricator. Differential Revision: D49269500 |
b3e532f
to
585bcbf
Compare
This pull request has been merged in bd8d20b. |
Summary: Pull Request resolved: facebookincubator#6566 FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment) ``` FlatVector.h void copyRanges( const BaseVector* source, const folly::Range<const BaseVector::CopyRange*>& ranges) override { for (auto& range : ranges) { copy(source, range.targetIndex, range.sourceIndex, range.count); } } ``` This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same. An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps. ``` Before: array_constructor_ARRAY_nullfree##2_const 54.31ms 18.41 array_constructor_ARRAY_nulls#facebookincubator#1 33.50ms 29.85 array_constructor_ARRAY_nulls#facebookincubator#2 59.05ms 16.93 array_constructor_ARRAY_nulls#facebookincubator#3 88.36ms 11.32 array_constructor_ARRAY_nulls##2_null 74.53ms 13.42 array_constructor_ARRAY_nulls##2_const 102.54ms 9.75 After: array_constructor_ARRAY_nullfree##2_const 41.36ms 24.18 array_constructor_ARRAY_nulls#facebookincubator#1 30.51ms 32.78 array_constructor_ARRAY_nulls#facebookincubator#2 55.13ms 18.14 array_constructor_ARRAY_nulls#facebookincubator#3 77.93ms 12.83 array_constructor_ARRAY_nulls##2_null 68.84ms 14.53 array_constructor_ARRAY_nulls##2_const 83.91ms 11.92 Before: array_constructor_MAP_nullfree##2_const 67.44ms 14.83 array_constructor_MAP_nulls#facebookincubator#1 37.00ms 27.03 array_constructor_MAP_nulls#facebookincubator#2 67.76ms 14.76 array_constructor_MAP_nulls#facebookincubator#3 100.88ms 9.91 array_constructor_MAP_nulls##2_null 84.22ms 11.87 array_constructor_MAP_nulls##2_const 122.55ms 8.16 After: array_constructor_MAP_nullfree##2_const 49.94ms 20.03 array_constructor_MAP_nulls#facebookincubator#1 34.34ms 29.12 array_constructor_MAP_nulls#facebookincubator#2 55.23ms 18.11 array_constructor_MAP_nulls#facebookincubator#3 82.64ms 12.10 array_constructor_MAP_nulls##2_null 70.74ms 14.14 array_constructor_MAP_nulls##2_const 88.13ms 11.35 ``` Reviewed By: laithsakka Differential Revision: D49269500 fbshipit-source-id: 7b702921202f4bb8d10a252eb0ab20f0e5792ae6
Summary: Pull Request resolved: facebookincubator#6566 FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment) ``` FlatVector.h void copyRanges( const BaseVector* source, const folly::Range<const BaseVector::CopyRange*>& ranges) override { for (auto& range : ranges) { copy(source, range.targetIndex, range.sourceIndex, range.count); } } ``` This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same. An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps. ``` Before: array_constructor_ARRAY_nullfree##2_const 54.31ms 18.41 array_constructor_ARRAY_nulls#facebookincubator#1 33.50ms 29.85 array_constructor_ARRAY_nulls#facebookincubator#2 59.05ms 16.93 array_constructor_ARRAY_nulls#facebookincubator#3 88.36ms 11.32 array_constructor_ARRAY_nulls##2_null 74.53ms 13.42 array_constructor_ARRAY_nulls##2_const 102.54ms 9.75 After: array_constructor_ARRAY_nullfree##2_const 41.36ms 24.18 array_constructor_ARRAY_nulls#facebookincubator#1 30.51ms 32.78 array_constructor_ARRAY_nulls#facebookincubator#2 55.13ms 18.14 array_constructor_ARRAY_nulls#facebookincubator#3 77.93ms 12.83 array_constructor_ARRAY_nulls##2_null 68.84ms 14.53 array_constructor_ARRAY_nulls##2_const 83.91ms 11.92 Before: array_constructor_MAP_nullfree##2_const 67.44ms 14.83 array_constructor_MAP_nulls#facebookincubator#1 37.00ms 27.03 array_constructor_MAP_nulls#facebookincubator#2 67.76ms 14.76 array_constructor_MAP_nulls#facebookincubator#3 100.88ms 9.91 array_constructor_MAP_nulls##2_null 84.22ms 11.87 array_constructor_MAP_nulls##2_const 122.55ms 8.16 After: array_constructor_MAP_nullfree##2_const 49.94ms 20.03 array_constructor_MAP_nulls#facebookincubator#1 34.34ms 29.12 array_constructor_MAP_nulls#facebookincubator#2 55.23ms 18.11 array_constructor_MAP_nulls#facebookincubator#3 82.64ms 12.10 array_constructor_MAP_nulls##2_null 70.74ms 14.14 array_constructor_MAP_nulls##2_const 88.13ms 11.35 ``` Reviewed By: laithsakka Differential Revision: D49269500 fbshipit-source-id: 7b702921202f4bb8d10a252eb0ab20f0e5792ae6
Summary: Pull Request resolved: facebookincubator#6566 FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment) ``` FlatVector.h void copyRanges( const BaseVector* source, const folly::Range<const BaseVector::CopyRange*>& ranges) override { for (auto& range : ranges) { copy(source, range.targetIndex, range.sourceIndex, range.count); } } ``` This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same. An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps. ``` Before: array_constructor_ARRAY_nullfree##2_const 54.31ms 18.41 array_constructor_ARRAY_nulls#facebookincubator#1 33.50ms 29.85 array_constructor_ARRAY_nulls#facebookincubator#2 59.05ms 16.93 array_constructor_ARRAY_nulls#facebookincubator#3 88.36ms 11.32 array_constructor_ARRAY_nulls##2_null 74.53ms 13.42 array_constructor_ARRAY_nulls##2_const 102.54ms 9.75 After: array_constructor_ARRAY_nullfree##2_const 41.36ms 24.18 array_constructor_ARRAY_nulls#facebookincubator#1 30.51ms 32.78 array_constructor_ARRAY_nulls#facebookincubator#2 55.13ms 18.14 array_constructor_ARRAY_nulls#facebookincubator#3 77.93ms 12.83 array_constructor_ARRAY_nulls##2_null 68.84ms 14.53 array_constructor_ARRAY_nulls##2_const 83.91ms 11.92 Before: array_constructor_MAP_nullfree##2_const 67.44ms 14.83 array_constructor_MAP_nulls#facebookincubator#1 37.00ms 27.03 array_constructor_MAP_nulls#facebookincubator#2 67.76ms 14.76 array_constructor_MAP_nulls#facebookincubator#3 100.88ms 9.91 array_constructor_MAP_nulls##2_null 84.22ms 11.87 array_constructor_MAP_nulls##2_const 122.55ms 8.16 After: array_constructor_MAP_nullfree##2_const 49.94ms 20.03 array_constructor_MAP_nulls#facebookincubator#1 34.34ms 29.12 array_constructor_MAP_nulls#facebookincubator#2 55.23ms 18.11 array_constructor_MAP_nulls#facebookincubator#3 82.64ms 12.10 array_constructor_MAP_nulls##2_null 70.74ms 14.14 array_constructor_MAP_nulls##2_const 88.13ms 11.35 ``` Reviewed By: laithsakka Differential Revision: D49269500 fbshipit-source-id: 7b702921202f4bb8d10a252eb0ab20f0e5792ae6
Summary: Pull Request resolved: facebookincubator#6566 FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment) ``` FlatVector.h void copyRanges( const BaseVector* source, const folly::Range<const BaseVector::CopyRange*>& ranges) override { for (auto& range : ranges) { copy(source, range.targetIndex, range.sourceIndex, range.count); } } ``` This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same. An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps. ``` Before: array_constructor_ARRAY_nullfree##2_const 54.31ms 18.41 array_constructor_ARRAY_nulls#facebookincubator#1 33.50ms 29.85 array_constructor_ARRAY_nulls#facebookincubator#2 59.05ms 16.93 array_constructor_ARRAY_nulls#facebookincubator#3 88.36ms 11.32 array_constructor_ARRAY_nulls##2_null 74.53ms 13.42 array_constructor_ARRAY_nulls##2_const 102.54ms 9.75 After: array_constructor_ARRAY_nullfree##2_const 41.36ms 24.18 array_constructor_ARRAY_nulls#facebookincubator#1 30.51ms 32.78 array_constructor_ARRAY_nulls#facebookincubator#2 55.13ms 18.14 array_constructor_ARRAY_nulls#facebookincubator#3 77.93ms 12.83 array_constructor_ARRAY_nulls##2_null 68.84ms 14.53 array_constructor_ARRAY_nulls##2_const 83.91ms 11.92 Before: array_constructor_MAP_nullfree##2_const 67.44ms 14.83 array_constructor_MAP_nulls#facebookincubator#1 37.00ms 27.03 array_constructor_MAP_nulls#facebookincubator#2 67.76ms 14.76 array_constructor_MAP_nulls#facebookincubator#3 100.88ms 9.91 array_constructor_MAP_nulls##2_null 84.22ms 11.87 array_constructor_MAP_nulls##2_const 122.55ms 8.16 After: array_constructor_MAP_nullfree##2_const 49.94ms 20.03 array_constructor_MAP_nulls#facebookincubator#1 34.34ms 29.12 array_constructor_MAP_nulls#facebookincubator#2 55.23ms 18.11 array_constructor_MAP_nulls#facebookincubator#3 82.64ms 12.10 array_constructor_MAP_nulls##2_null 70.74ms 14.14 array_constructor_MAP_nulls##2_const 88.13ms 11.35 ``` Reviewed By: laithsakka Differential Revision: D49269500 fbshipit-source-id: 7b702921202f4bb8d10a252eb0ab20f0e5792ae6
FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: #5958 (comment)
This change optimizes FlatVector::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same.
An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps.