Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Base62 encoding for @user_id is problematic for namespacing #11

Closed
archonic opened this issue Jan 19, 2019 · 13 comments · Fixed by #17
Closed

Base62 encoding for @user_id is problematic for namespacing #11

archonic opened this issue Jan 19, 2019 · 13 comments · Fixed by #17
Assignees

Comments

@archonic
Copy link

I'm using simple-feed in an application with apartment. I have 1 activity feed per resource in each account and don't want resources with the same ID to pollute feeds outside of their account. So I need to namespace the user_id. I've tried #{account.id}_#{resource.id} and #{account.id}x#{resource.id} but both are hitting this error when storing the event:

ArgumentError: comparison of String with 0 failed
/usr/local/bundle/gems/base62-rb-0.3.1/lib/base62-rb.rb:11:in `<'
/usr/local/bundle/gems/base62-rb-0.3.1/lib/base62-rb.rb:11:in `encode'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/key.rb:45:in `base62_user_id'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/key.rb:55:in `render_options'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/key.rb:37:in `block (2 levels) in define_key_methods'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/provider.rb:37:in `block in store'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/provider.rb:224:in `block (3 levels) in with_response_pipelined'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/response.rb:32:in `for'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/provider.rb:224:in `block (2 levels) in with_response_pipelined'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/provider.rb:244:in `block (3 levels) in batch_pipelined'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/provider.rb:243:in `each'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/provider.rb:243:in `block (2 levels) in batch_pipelined'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/driver.rb:145:in `block (3 levels) in with_pipelined'
/usr/local/bundle/gems/redis-4.0.3/lib/redis.rb:2319:in `block in pipelined'
/usr/local/bundle/gems/redis-4.0.3/lib/redis.rb:50:in `block in synchronize'
/usr/local/bundle/gems/redis-4.0.3/lib/redis.rb:50:in `synchronize'
/usr/local/bundle/gems/redis-4.0.3/lib/redis.rb:2316:in `pipelined'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/driver.rb:144:in `block (2 levels) in with_pipelined'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/driver.rb:136:in `block (2 levels) in with_redis'
/usr/local/bundle/gems/connection_pool-2.2.2/lib/connection_pool.rb:65:in `block (2 levels) in with'
/usr/local/bundle/gems/connection_pool-2.2.2/lib/connection_pool.rb:64:in `handle_interrupt'
/usr/local/bundle/gems/connection_pool-2.2.2/lib/connection_pool.rb:64:in `block in with'
/usr/local/bundle/gems/connection_pool-2.2.2/lib/connection_pool.rb:61:in `handle_interrupt'
/usr/local/bundle/gems/connection_pool-2.2.2/lib/connection_pool.rb:61:in `with'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/driver.rb:135:in `block in with_redis'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/driver.rb:162:in `with_retries'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/driver.rb:134:in `with_redis'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/driver.rb:143:in `block in with_pipelined'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/driver.rb:162:in `with_retries'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/driver.rb:142:in `with_pipelined'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/provider.rb:242:in `block in batch_pipelined'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/provider.rb:241:in `each'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/provider.rb:241:in `each_slice'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/provider.rb:241:in `batch_pipelined'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/provider.rb:223:in `block in with_response_pipelined'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/base/provider.rb:72:in `with_response'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/provider.rb:222:in `with_response_pipelined'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/redis/provider.rb:36:in `store'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers/proxy.rb:31:in `method_missing'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/feed.rb:12:in `block in <class:Feed>'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers.rb:32:in `block (3 levels) in define_provider_methods'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/activity/multi_user.rb:73:in `block in <class:MultiUser>'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers.rb:32:in `block (3 levels) in define_provider_methods'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/activity/single_user.rb:66:in `block in <class:SingleUser>'
/usr/local/bundle/gems/simple-feed-2.0.2/lib/simplefeed/providers.rb:32:in `block (3 levels) in define_provider_methods'
/app/app/services/activity_service.rb:60:in `block in create_activity'

Do you know how I can safely namespace feeds into their account while satisfying base62 encoding? From what I've read, everything in [a-z][A-Z][0-9] should be safe but that's not what I'm seeing.

@archonic
Copy link
Author

archonic commented Jan 19, 2019

I've solved this for now by adding an activity_feed_key method to the model file for the resource:

def activity_feed_key
  # Up to 1 billion accounts and resources
  # Leading 1 is required to keep leftpadded zeros
  # "1_000_000_001_000_000_001"
  ("1%09d" % account_id + "%09d" % id).to_i
end

Not ideal but works.

@kigster
Copy link
Owner

kigster commented Jan 22, 2019

When you define a feed you can optionally specify a namespace so that multiple feeds can be defined against a shared redis instance.

Are you using the namespace when defining your feeds?

Perhaps I misunderstand the problem.

If you could paste the code that defines your feeds, and then also show how you are storing the keys that would be helpful.

@kigster
Copy link
Owner

kigster commented Jan 22, 2019

Also, have you seen this page that shows how you can implement base62 serialization?

@archonic
Copy link
Author

I'm using the namespace to differentiate between accounts and documents. This is my simplefeed.rb initializer:

# frozen_string_literal: true

require "simplefeed"

SIMPLEFEED_URL = "#{ENV['REDIS_BASE_URL']}/simplefeed"

# Account feeds
SimpleFeed.define(:accounts) do |f|
  f.provider = SimpleFeed.provider(
    :redis,
    redis: -> { Redis.new(url: SIMPLEFEED_URL) },
    pool_size: 10
  )
  f.per_page   = 25  # default page size
  f.batch_size = 10  # default batch size
  f.namespace  = "a" # only needed if you use the same redis for more than one feed
end

# Document feeds
SimpleFeed.define(:documents) do |f|
  f.provider = SimpleFeed.provider(
    :redis,
    redis: -> { Redis.new(url: SIMPLEFEED_URL) },
    pool_size: 10
  )
  f.per_page   = 25  # default page size
  f.batch_size = 10  # default batch size
  f.namespace  = "d" # only needed if you use the same redis for more than one feed
end

The issue is that I want a unique namespace for each document, which is inside a tenant table, per account. The document namespace should be something like "d#{document.account_id}_#{document.id}". In an initializer, I can't make the namespace aware of IDs which are in the database.

@kigster
Copy link
Owner

kigster commented Jul 3, 2019

I still don't understand your case.

  • What does this mean — "each document is inside a tenant table per account"?
  • What do you store in the :accounts feed as user_id?
  • What do you store in the :documents feed as the user_id?
  • What is the minimum composite key you need to represent the "reader" of each feed? Is it unique per user only? Per account? Or per some @account + @account.documents?

I am sorry for asking so many questions, but your use-case is really not clear to me, and without the context it's hard to visualize your requirements.

If you can shed more light into this, and like I asked earlier, show examples of how you are reading from the feeds, and how you are writing to both feeds, that would make it possible for me to help you.

Thanks!

@kigster
Copy link
Owner

kigster commented Jul 3, 2019

Another option is to show mockups of what you are building — feel free to email them to me. My address is my github username AT google' free email service.

Thanks!

@archonic
Copy link
Author

archonic commented Jul 5, 2019

I'm using the Apartment gem with Postgres. Every account has effectively has its own database (schema). There are resources that are in the public schema (accounts for example) and resources that are per-schema (documents for example). The account IDs are unique across the application, so I've had no problems creating activity feeds for accounts. I just use their ID to write to their feeds. The document IDs are only unique on a per-account basis. The first document in every account has an ID of 1. In order to keep the activity feeds for documents from polluting eachother, I have to namespace their feed id on a per-account basis.

This is why I've needed this method on my document model:

def activity_feed_key
  # Up to 1 billion accounts and resources
  # Leading 1 is required to keep leftpadded zeros
  # "1_000_000_001_000_000_001"
  ("1%09d" % account_id + "%09d" % id).to_i
end

When writing and reading to the feed, I use activity_feed_key as the ID and that keeps every document feed independent. I could of course move activity feeds to Users. Users are in the public schema and have unique IDs. I don't need feeds to be unique to each user (at the moment) though, so that's why I was looking for a dynamic way to namespace. Is a dynamic namespace outside the scope of this gem?

@kigster
Copy link
Owner

kigster commented Jul 5, 2019

Thanks for this context, let me think about it.

@kigster kigster closed this as completed Oct 25, 2019
@rromanchuk
Copy link
Contributor

I think i'm missing some fundamental details. I havent used a bigint as an identifier in over a decade, i must be making incorrect assumptions when reading user_id, which to me implies a uuid represented as a string.

irb(main):006:0> activity = SimpleFeed.newsfeed.activity(SecureRandom.uuid)
irb(main):007:0> activity.store(value: "foo")
Traceback (most recent call last):
        1: from (irb):7
ArgumentError (comparison of String with 0 failed)
``

@rromanchuk
Copy link
Contributor

Just coming back to this now that i'm much more familiar now. @kigster i think @archonic problem is a lot simpler than the discussion surrounding it. You can't pass a string into the activity initializer.

This will fail every-time
SimpleFeed.get(:newsfeed).activity("1").store(value: "test")

ArgumentError (comparison of String with 0 failed)

It might be misleading because SimpleFeed.get(:newsfeed).activity("1") or SimpleFeed.get(:newsfeed).activity(["1", "2"]) returns the feed instance without warning. It's not until store where the redis provider explodes. Even base62 encodable strings fail, so i'm just coercing to integer, which is fine, but it adds a lot code smells.

def uuid_as_bigint
    UUID4(id).to_i
end

SimpleFeed.get(:newsfeed).activity(User.last.uuid_as_bigint).store(value: "test")

I'll dig into the source and figure out where my assumptions are incorrect. I'm not trying to do anything complicated with namespacing or event serialization, just trying to translate my identifiers create_table "users", id: :uuid, default: -> { "gen_random_uuid()" }) to a datatype the provider supports.

@kigster
Copy link
Owner

kigster commented May 20, 2020

This is great! And you are right, I think the code assumes integer data type for the user id.

I’ll take a pass at confirming it, and if confirmed — switching it to a string, but don’t let me stop you :)

@rromanchuk
Copy link
Contributor

yeah i just looked at the key template, i can actually easily provide my own base62 string direct from my uuid representation.

uuid.to_s(format: :base62)
# => "6s7exj9M9mGqLs5KhhWWaB"

I'll try patching, i see specs so it should be straight forward for a modified signature that takes an already encoded identifier

@kigster
Copy link
Owner

kigster commented May 23, 2020

Thanks, everyone, for participating and especially to @rromanchuk for helping me understand the problem. I now fully get the issue, which comes down to:

  • I assumed that any string user ID (composite or not) can always be converted to an integer (for proof, see Ruby's definition of the Object#hash. But I see now that this is perhaps less than ideal, and users of SF don't need to be thinking about a translation layer themselves, SimpleFeed can certainly help them.
  • The reason I wanted numeric IDs is mostly for base62 compaction, but it does feel like premature optimization to me now, so it should be based on the data type of the user_id.

Solution

Head over to #16 for the proposed solution and the corresponding PR #17

@kigster kigster linked a pull request May 23, 2020 that will close this issue
@kigster kigster self-assigned this May 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants