Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reimplemented labels as HStore field #3427

Merged
merged 1 commit into from
Dec 13, 2022
Merged

Conversation

mdellweg
Copy link
Member

This solves the imminent performance issues.

fixes #3400

@mdellweg mdellweg force-pushed the label_hstore branch 9 times, most recently from 796a5fb to 6a1a733 Compare November 28, 2022 17:36
@dralley
Copy link
Contributor

dralley commented Nov 28, 2022

I believe this is going to result in the same kind of downstream impact as the UUID change, which is that plugins with models that inherit directly from BaseModel will think they need migrations when used with a newer pulpcore, and also when they have applied the migrations and are used with older pulpcore.

@mdellweg
Copy link
Member Author

Yes, when we want to introduce it on the BaseModel. That's why i did not do that yet. But yes, we might be bitten again by the abstract base model.

@dralley
Copy link
Contributor

dralley commented Nov 29, 2022

Is the plan to put it on MasterModel only temporarily?

@mdellweg
Copy link
Member Author

Is the plan to put it on MasterModel only temporarily?

I don't have a final Plan. I can even just put it on the very master models we provide label functionality today.

@mdellweg mdellweg force-pushed the label_hstore branch 2 times, most recently from a5051ff to 97cd925 Compare November 30, 2022 15:52
@mdellweg mdellweg force-pushed the label_hstore branch 2 times, most recently from 0506d08 to 7c64ba9 Compare December 2, 2022 15:24
@mdellweg mdellweg marked this pull request as ready for review December 2, 2022 15:24
@mdellweg mdellweg force-pushed the label_hstore branch 2 times, most recently from 1cff650 to 26c4a22 Compare December 4, 2022 12:43
This solves the imminent performance issues.

fixes pulp#3400
@@ -273,13 +274,67 @@ class CharInFilter(BaseInFilter, CharFilter):
pass


class LabelSelectFilter(Filter):
class LabelFilter(Filter):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you thinking about write a test for this filter? (or if makes any sense for testing it)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ggainey
ggainey previously requested changes Dec 6, 2022
Copy link
Contributor

@ggainey ggainey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify for me - we are implementing this at the individual-model-level "for now", for just the models that currently expose labels. "Later", we may rework this to be provided at either Base or Master level, to expand label-access more generally - but that's much more invasive, esp in the migrations, and we want to address the discovered performance-problem "Now". Is that understanding correct?

Other than that - I'll second decko's request for a test for label-filtering.

@dralley
Copy link
Contributor

dralley commented Dec 7, 2022

Just to clarify for me - we are implementing this at the individual-model-level "for now", for just the models that currently expose labels. "Later", we may rework this to be provided at either Base or Master level, to expand label-access more generally - but that's much more invasive, esp in the migrations, and we want to address the discovered performance-problem "Now". Is that understanding correct?

I'd probably go a bit stronger and say that once we pick an approach realistically we won't be able to back down from it. Django doesn't like moving fields from a child to a parent model table one bit. It can probably be done, but not without a flag day for all plugins, e.g. Pulp 4.

@mdellweg
Copy link
Member Author

mdellweg commented Dec 7, 2022

Just to clarify for me - we are implementing this at the individual-model-level "for now", for just the models that currently expose labels. "Later", we may rework this to be provided at either Base or Master level, to expand label-access more generally - but that's much more invasive, esp in the migrations, and we want to address the discovered performance-problem "Now". Is that understanding correct?

I'd probably go a bit stronger and say that once we pick an approach realistically we won't be able to back down from it. Django doesn't like moving fields from a child to a parent model table one bit. It can probably be done, but not without a flag day for all plugins, e.g. Pulp 4.

Well, Django allows us to move the field from the (non-abstract) master models up in the hierarchy to the abstract MasterModel. But this is already marginally dangerous, because we expose MasterModel in the plugin api making this a breaking change. Adding the field to the abstract BaseModel is definitely breaking existing plugins. This is almost the BaseDistribution dilemma reloaded. Bottom line: Do not change abstract base classes (removing the generic relation is not a database change...).
Upside of this approach: We do not add labels to Content (would not match the pulp philosophy) and other tables like Workers (that could be potentially interesting if we add atomic label change endpoints). We are intentional about the things we can label.

@mdellweg mdellweg requested review from ggainey and decko December 7, 2022 09:42
@daviddavis
Copy link
Contributor

This change looks great. 👍 We considered hstore in the original proposal but I assumed that when the postgresql docs mentioned "rarely examined" data, it probably wasn't indexed but it seems that it is indexed.

One thing I wanted to mention though is that the label filter should be able to handle filtering by 6 labels but I suspect there's some problem with the query. It might be worth using QuerySet.explain() to figure out the exact problem but I am pretty sure the problem is that GenericRelationModel isn't indexed AT ALL and it probably should be. From the django docs on GenericForeignKey:

Unlike for the ForeignKey, a database index is not automatically created on the GenericForeignKey, so it’s recommended that you use Meta.indexes to add your own multiple column index. This behavior may change in the future.

@daviddavis
Copy link
Contributor

Actually it looks like the Label model doesn't extend GenericRelationModel so I was wrong. It still is probably worth adding an index to GenericRelationModel though.

@ggainey ggainey dismissed their stale review December 8, 2022 18:55

Tests already exist that exercise the changing codepath

Copy link
Contributor

@ggainey ggainey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to get this performance-knee addressed, and this looks like it does the needful. Approved

@decko
Copy link
Member

decko commented Dec 12, 2022

Actually it looks like the Label model doesn't extend GenericRelationModel so I was wrong. It still is probably worth adding an index to GenericRelationModel though.

Maybe an explain could provide a better answer to it. ☝️

And, do the solution proposed by this PR addresses the performance issues of the Label feature?

@mdellweg
Copy link
Member Author

And, do the solution proposed by this PR addresses the performance issues of the Label feature?

Yes, it does:
#3400 (comment)

And we may eventually decide to even add an index to the hstore field if it's still not sufficiently performant.

@decko
Copy link
Member

decko commented Dec 13, 2022

And, do the solution proposed by this PR addresses the performance issues of the Label feature?

Yes, it does: #3400 (comment)

And we may eventually decide to even add an index to the hstore field if it's still not sufficiently performant.

Sorry @mdellweg. The question was not for you. I should have mention @daviddavis.

@mdellweg mdellweg merged commit 4e25949 into pulp:main Dec 13, 2022
@mdellweg mdellweg deleted the label_hstore branch December 13, 2022 14:15
ekohl added a commit to ekohl/puppet-pulpcore that referenced this pull request Feb 1, 2023
Pulpcore 3.22 will start using HStore[1], which is in
postgresql-contrib. This ensures it's installed.

[1]: pulp/pulpcore#3427
ekohl added a commit to ekohl/puppet-pulpcore that referenced this pull request Feb 1, 2023
Pulpcore 3.22 will start using HStore[1], which is in
postgresql-contrib. This ensures it's installed and enabled on the DB.
That needs to be done by a super user, so it can't be done in regular DB
migrations.

[1]: pulp/pulpcore#3427
evgeni pushed a commit to theforeman/puppet-pulpcore that referenced this pull request Feb 16, 2023
Pulpcore 3.22 will start using HStore[1], which is in
postgresql-contrib. This ensures it's installed and enabled on the DB.
That needs to be done by a super user, so it can't be done in regular DB
migrations.

[1]: pulp/pulpcore#3427
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

label-select option hang when using multiple labels
5 participants