ruby: Fix object cache lookups on 32-bit platforms #13494

stanhu · 2023-08-09T20:37:31Z

#13204 refactored the Ruby object cache to use a key of LL2NUM(key_val) instead of LL2NUM(key_val >> 2). On 32-bit systems, LL2NUM(key_val) returns inconsistent results because a large value has to be stored as a Bignum on the heap. This causes cache lookups to fail.

This commit restores the previous behavior of using ObjectCache_GetKey, which discards the lower 2 bits (these are always zero). This allows a key to be stored as a Fixnum on both 32-bit and 64-bit platforms.

As https://patshaughnessy.net/2014/1/9/how-big-is-a-bignum describes, a Fixnum uses:

* 1 bit for the FIXNUM_FLAG.
* 1 bit for the sign flag.

Therefore the largest possible Fixnum value on a 64-bit system is 4611686018427387903 (2^62 - 1). On a 32-bit system, the largest value is 1073741823 (2^30 - 1).

For example, a possible VALUE pointer address on a 32-bit system:

0xff5b4af8 => 4284173048

Dropping the lower 2 bits makes up for the loss of range to these flags. In the example above, we see that shifting by 2 turns the value into a 30-bit number, which can be represented as a Fixnum:

(0xff5b4af8 >> 2) => 1071043262

This bug can also manifest on a 64-bit system if the upper bits are 0xff.

Closes #13481

stanhu · 2023-08-09T21:14:55Z

ruby/ext/google/protobuf_c/protobuf.c

  VALUE key_val = (VALUE)key;
  PBRUBY_ASSERT((key_val & 3) == 0);
-  return rb_funcall(weak_obj_cache, item_try_add, 2, LL2NUM(key_val), val);
+  // Avoid overflow on 32-bit systems by discarding the bottom zeros


I think the issue isn't overflow. In https://github.com/ruby/ruby/blob/abd15ac775d41e6485f728fe0fad4cddf138d3ec/include/ruby/internal/arithmetic/long_long.h#L75-L86 it appears that if the value fits in a Fixnum, it is stored that way. I suspect that on a 32-bit system the value doesn't fit, so it has to be allocated on the heap as a Bignum.

Example values:

0xff5b4af8 => 4284173048
(0xff5b4af8 >> 2) => 1071043262

Every time we get a new key on a 32-bit system, LL2NUM(key_val) previously allocated a new Bignum, which has a different value.

Can you add a unit test or an assert so that we can't regress this in the future? Most code paths aren't picky about a Fixnum vs. a Bignum, but this one is.

I've confirmed that this is indeed a Bignum vs. Fixnum issue. I've added an assertion, though the assertion will only be triggered on a 32-bit system with the NDEBUG compile flag omitted. The entire test suite fails with v3.24.0 at the moment even without this assertion.

esrauchg · 2023-08-10T20:25:05Z

It would be great if we could get this PR merged this week so it can be released as soon as possible. Tests would be appreciated, but since this is a partial rollback of a recent change, getting it in this week without tests is likely still preferable.

stanhu · 2023-08-10T22:12:51Z

@esrauchg The best way to test this would be to build and run the Ruby tests on a 32-bit image, such as i386/ubuntu:latest. Unfortunately, I don't have the bandwidth to do this.

UPDATE: Never mind, done.

stanhu · 2023-08-15T00:50:16Z

.github/workflows/test_ruby.yml

@@ -43,6 +43,39 @@ jobs:
          bazel-cache: ruby_linux/${{ matrix.ruby }}_${{ matrix.bazel }}
          bazel: test //ruby/... //ruby/tests:ruby_version --test_env=KOKORO_RUBY_VERSION --test_env=BAZEL=true ${{ matrix.ffi == 'FFI' && '--//ruby:ffi=enabled --test_env=PROTOCOL_BUFFERS_RUBY_IMPLEMENTATION=FFI' || '' }}

+  linux-32bit:
+    name: Linux aarch64


Oops, there is a typo here.

stanhu requested a review from a team as a code owner August 9, 2023 20:37

stanhu requested review from JasonLunn and removed request for a team August 9, 2023 20:37

fowles requested a review from esrauchg August 9, 2023 20:41

esrauchg approved these changes Aug 9, 2023

JasonLunn added the ruby label Aug 9, 2023

stanhu commented Aug 9, 2023

stanhu force-pushed the sh-fix-ruby-protobuf-32bit branch from 84070be to 21b1bb6 Compare August 9, 2023 21:43

JasonLunn added the 🅰️ safe for tests label Aug 9, 2023

github-actions bot removed the 🅰️ safe for tests label Aug 9, 2023

stanhu mentioned this pull request Aug 10, 2023

[ruby] Unable to instantiate a descriptor in v3.24.0 on 32-bit system #13481

Closed

stanhu force-pushed the sh-fix-ruby-protobuf-32bit branch from 21b1bb6 to 8d3bd0e Compare August 11, 2023 00:59

stanhu requested a review from a team as a code owner August 11, 2023 00:59

stanhu requested review from haberman and removed request for a team August 11, 2023 00:59

ruby: Add CI job for linux-32bit

2606768

This will prevent regressions in a 32-bit environment.

stanhu force-pushed the sh-fix-ruby-protobuf-32bit branch from 8d3bd0e to 2606768 Compare August 11, 2023 01:01

zhangskz added the 🅰️ safe for tests label Aug 15, 2023

github-actions bot removed the 🅰️ safe for tests label Aug 15, 2023

Fix typo in Ruby 32-bit build

d63122a

stanhu commented Aug 15, 2023

copybara-service bot mentioned this pull request Aug 15, 2023

ruby: Fix object cache lookups on 32-bit platforms (#13494) #13551

Closed

copybara-service bot mentioned this pull request Aug 15, 2023

Editions: Migrate string_field_validation to a C++ feature #13553

Closed

copybara-service bot mentioned this pull request Aug 15, 2023

Editions: Include defaults for any features in the generated pool. #13541

Closed

copybara-service bot mentioned this pull request Aug 15, 2023

Editions: Remove string_field_validation feature. #13556

Closed

copybara-service bot mentioned this pull request Aug 15, 2023

ruby: Fix object cache lookups on 32-bit platforms (#13494) #13557

Closed

copybara-service bot closed this in 055c378 Aug 15, 2023

esrauchg mentioned this pull request Aug 17, 2023

ruby: Fix object cache lookups on 32-bit platforms (#13494) #13580

Merged
