Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic list subpartitioning (rough proposal) #244

Open
rgupta33 opened this issue Jun 15, 2024 · 0 comments
Open

Dynamic list subpartitioning (rough proposal) #244

rgupta33 opened this issue Jun 15, 2024 · 0 comments

Comments

@rgupta33
Copy link

rgupta33 commented Jun 15, 2024

django-postgres-extra makes it easy to partition by a single foreign key field. It would be nice to add better support for subpartitioning. To walk through an example, we can start with this simple model:

class MyModel(...):
    id = UUIDField(..., primary_key=True)
    amount = DecimalField(...)
    
    # fields we want to partition by
    account = ForeignKeyField(...)
    sub_account = ForeignKeyField(...)

Partitioning by a single field

Partitioning by the account field is straightforward.

class MyModel(...):
    class PartitioningMeta:
         method = PostgresPartitioningMethod.LIST
         key = ["account_id"]
    
    id = UUIDField(..., primary_key=True)
    amount = DecimalField(...)
    
   # fields to partition by
    account = ForeignKeyField(...)
    sub_account = ForeignKeyField(...)

Whenever a new account is created, we call add_list_partition.

def create_account(...):
    account = Account.objects.create_model(...)
    
    with transaction.atomic():
        connection.schema_editor().add_list_partition(
            model=MyModel,
            name=account.id,
            values=[account.id]
        )

Current issues with subpartitioning

Now, say we want to partition each account partition based on the sub_account field.

class PartitioningMeta:
         method = PostgresPartitioningMethod.LIST
         key = ["account_id", "sub_account_id"]

Running python manage.py pgmakemigrations will not cause any problems, but when running that migration with python manage.py migrate, Postgres gives back an error: cannot use "list" partition strategy with more than one column. Essentially, we can only partition by one field at a time.

Subpartitioning proposal

Add additional_pk_fields

Since Postgres requires that the field we are partitioning on is part of the table's primary key, we should be able to pass in additional PK fields for the initial creation of the table.

class PartitioningMeta:
         method = PostgresPartitioningMethod.LIST
         key = ["account_id"]
         additional_pk_fields = ["sub_account_id"]

Alternatively, an easier but hackier approach would be to use first element of the key list as the top-level partitioning key but to use all the elements of the key list for the PK.

class PartitioningMeta:
         method = PostgresPartitioningMethod.LIST
         key = ["account_id", "sub_account_id"]

Either way, the generated SQL should contain:

  • A composite PK on id, account_id, and sub_account_id
  • PARTITION BY LIST (account_id)

Modify create_partition_table_name to reference partitions

def create_partition_table_name(self, model: Type[Model], name: str, existing_partition_fields: list[str] | None = None) -> str:
        params = [model._meta.db_table.lower(), *(existing_partition_fields or []), name.lower()]
        return "_".join(["%s"] * len(params))

Example

# returns "mymodel_acct1_subacct1"
create_partition_table_name(model=MyModel, name="subacct1", existing_partition_fields=["acct1"]

Modify add_list_partition

def add_list_partition(
        self,
        model: Type[Model],
        name: str,
        values: List[Any],
        existing_partition_fields: list[str] | None = None,
        next_partition_field: str | None = None,
        comment: Optional[str] = None,
    ) -> None:
        meta = self._partitioning_properties_for_model(model)

        table_name = self.create_partition_table_name(model, name, existing_partition_fields)

        params = [
            self.quote_name(table_name),
            self.quote_name(model._meta.db_table) + (("_" + "_".join(existing_partition)) if existing_partition else ""),
            ",".join(["%s" for _ in range(len(values))]),
            meta.method.upper() if next_partition_field is not None else None,
            self.quote_name(next_partition_field) if next_partition_field is not None else None,
        ]
        params = tuple([param for param in params if param is not None])

        sql = (self.sql_add_list_partition + (self.sql_partition_by if next_partition_field is not None else "")) % params

        with transaction.atomic():
            self.execute(sql, values)

            if comment:
                self.set_comment_on_table(table_name, comment)

Call add_list_partition with the additional inputs

def create_account(...):
    account = Account.objects.create_model(...)
    
    with transaction.atomic():
        connection.schema_editor().add_list_partition(
            model=MyModel,
            name=account.id,
            values=[account.id],
            next_partition_field="sub_account_id"
        )
def create_sub_account(...):
    sub_account = SubAccount.objects.create_model(...)
    
    with transaction.atomic():
        connection.schema_editor().add_list_partition(
            model=MyModel,
            name=sub_account.id,
            values=[sub_account.id],
            existing_partition_fields=["account_id"],
            # could pass `next_partition_field` if further subpartitioning is needed
        )
@rgupta33 rgupta33 closed this as not planned Won't fix, can't repro, duplicate, stale Jun 15, 2024
@rgupta33 rgupta33 reopened this Jun 15, 2024
@rgupta33 rgupta33 changed the title Dynamic list subpartitioning Dynamic list subpartitioning (proposal outline) Jun 15, 2024
@rgupta33 rgupta33 changed the title Dynamic list subpartitioning (proposal outline) Dynamic list subpartitioning (rough proposal) Jun 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant