Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use with Apartment gem leads to wrong sequence being used on postgresql #233

Closed
ChrisHampel opened this issue Feb 9, 2016 · 7 comments
Closed
Labels

Comments

@ChrisHampel
Copy link

Our software has 1 PostgreSQL schema per company using our software using the Apartments gem most of the time it works but when there is a lot of activity we will get SQL where the apartment gem is selecting the right schema to insert data into, but this gem is saying to use the wrong schema's sequence for the primary key.

For example, we will be doing an import into schema 3 and it will look up the sequence for schema 1 which was used recently by another company

This either results in:

  • a very large id in a schema where barely any records exist
  • "PG::UniqueViolation: ERROR: duplicate key value violates unique constraint"
    neither of which is preferable since the first will lead to the second eventually unless we intervene and the other just stops saving of the new record all together.

It appears that the sequence being passed in here

connection_memo.next_value_for_sequence(sequence_name)
can be for the wrong schema.

For background information Apartments sets the current schema using the search_path and setting it to:

  • #{schema_name}, public
@jkowens
Copy link
Collaborator

jkowens commented Feb 15, 2016

@ChrisHampel can you provide some code to demonstrate your import process and maybe some details about your environment? My guess is that you have multiple threads and one is modifying the connection context out from beneath another one.

@ChrisHampel
Copy link
Author

we are running 2 puma servers with 16 max threads each, however this still happens with 1 server set to 1 max threads this also happens in rake tasks when we are specifically setting the schema.

this is our code, but for security reasons only included the applicable code.

vars.each do |var|
    instance_klass_attributes=collect_attributes(...stuff here...)
    instances.push instance_klass.new(instance_klass_attributes)
end
temp=instance_klass.import instances
instances=instance_klass.where(id: temp.ids)
unless instance_associations.nil?
    instance_associations.each do |association|
        asocs=[]
        instances.each do |instance|
            process_asoc(self, association).each do |hash|
            asoc=association[:klass].new(hash)
            asoc.send "#{association[:link].to_s}=", instance
            asocs.push asoc
        end
    end
    association[:klass].import asocs
end
unless instance_class_callbacks.nil?
  instance_class_callbacks.each do |callback|
    if instance_klass.respond_to?(:delay)
      instance_klass.delay.send(callback,instances.map(&:id))
    else
      instance_klass.send(callback,instances.map(&:id))
    end
  end
end
instances

the equivelent code we are currently running is due to this bug:

instances=[]
asocs={}
unless instance_associations.nil?
  instance_associations.each do |association|
    asocs["#{association[:key].to_s}_attributes".to_sym]=process_asoc(self, association)
  end
end
var.each do |var|
    instance_klass_attributes=collect_attributes(...stuff here...)
    asocs.each do |key,value|
      instance_klass_attributes[key]=value
    end
    instance=instance_klass.new(instance_klass_attributes)
    instance.save()
    instances.push instance
end

this is obviously orders of magnitude slower.

this all happens in a get request due to the dynamic nature of the data generation, infinite possible instance generated on demand.

we first noticed this when we were running a large scale data migration to the new system we built. in that we migrated each schema one at a time:

schemas.all.each do |schema|
    Apartment::Tenant.switch! schema.name
    ActiveRecord::Base.transaction do
        migration using your library in here(see code above)
    end
end

I hope this information is helpful.

@mtalcott
Copy link

mtalcott commented Jul 27, 2016

We're experiencing the same issue with the wrong schema being passed to nextval when using ActiveRecord. I think I can help shed some light on what's going on here, and suggest a few paths forward.

A little background

If you're using Apartment the way we are (the most common way, with Postgres schemas), search_path is set to ensure that the table from the current schema is selected, like @ChrisHampel mentioned. Queries generated by ActiveRecord and elsewhere work best when no schema qualifier is included with table names which allows search_path to find the correct table. For example, queries on the posts table would use posts, rather than schema.posts. Specifically referencing a schema in the table name can return tables in different schemas, rather than what is set in search_path, if a different schema name is specified. That's what's going on here.

The issue

The issue isn't really with activerecord-import, but rather ActiveRecord's behavior with sequence names (I looked at 4.2.7, which is what we're on). The reason some other schema is appearing in activerecord-import's insert statements is because of this caching within sequence_name. When you call Post.sequence_name, it returns a value that includes the current schema, which is because it's included in what Postgres gives back here. Because of the aforementioned caching, switching to another schema and calling Post.sequence_name again returns the cached value, with the previous schema.

Fix/workaround

I don't think that sequence_name should include any schema in front (even the current one), so that might be a bug to file on ActiveRecord. However, that actually might not be what most people think, taking a look a this issue.

ActiveRecord allows explicitly setting a sequence name, too, which is how I plan on working around this, since we're only using it on one table. I set the sequence_name to the value without a schema in front of it, and it works:

class Post < ActiveRecord::Base
  self.sequence_name = "posts_seq"
end

@ChrisHampel, since you're looping dynamically through lots of records, you could set these dynamically using this fallback behavior for default_sequence_name as a reference. I saw that setting sequence_name on associations actually sets it on the association's model, too.

@jkowens
Copy link
Collaborator

jkowens commented Jul 27, 2016

@mtalcott great find! Do you think calling #reset_sequence_name on the model is an option?

schemas.all.each do |schema|
    Apartment::Tenant.switch! schema.name
    ActiveRecord::Base.transaction do
        # I think this should set the sequence name with correct schema
        Post.reset_sequence_name

        Post.import posts
    end
end

@ChrisHampel
Copy link
Author

If I get a chance I will test this, but we ended up not using your library, at this time, due to this weirdness

@mtalcott
Copy link

Yep, I just verified that #reset_sequence_name also does the trick!

@jkowens jkowens changed the title use with Appartments gem leads to wrong sequence being used on postgresql Use with Apartment gem leads to wrong sequence being used on postgresql Oct 6, 2016
@jkowens jkowens mentioned this issue Apr 4, 2017
10 tasks
@caifara
Copy link

caifara commented Mar 13, 2018

See also this apartment issue. As this cache is set by Rails it is foremost a Rails problem. Next best place to solve this is the apartment gem. I wrote a hack while that issue is getting solved there (see the linked issue).

dillonwelch added a commit to dillonwelch/activerecord-import that referenced this issue Oct 18, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants