Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Table#getColumns() performance by reducing its runtime complexity #2724

Merged
merged 2 commits into from
Jun 1, 2017

Conversation

evgpisarchik
Copy link
Contributor

@evgpisarchik evgpisarchik commented May 14, 2017

Resolves #2722
Seems resonable proposal by @verheyenkoen
Could you review these changes please?

});

return $columns;
return array_replace(array_flip($colNames), $columns);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must say that I neither understand this nor the original code. What is the intent: sorting columns? Is there a corresponding test somewhere?

Copy link

@verheyenkoen verheyenkoen May 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reasons PK columns are put first, then FK columns, then the other ones. No idea why that was implemented like this. No idea if anyone relies on that...

Copy link
Member

@Ocramius Ocramius May 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is indeed the "standard approach" when writing DDL, as it otherwise is really hard to figure out what is going on. Still, is there no O(n * log(n)) or similar solution that doesn't make the code unreadable? Original code was heavy due to the repeated array_search() calls, which make this O(n * log(n) * n), so your approach is good, just not understandable at all.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean this may be better if the array_replace line is refactored in a separate private method with a meaningful name?

Copy link
Contributor Author

@evgpisarchik evgpisarchik May 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it more comprehensible?

  /**
     * Returns ordered list of columns (primary keys are first, then foreign keys, then the rest)
     * @return Column[]
     */
    public function getColumns()
    {
        return array_merge($this->getPrimaryKeyColumns(), $this->getForeignKeyColumns(), $this->_columns);
    }

    /**
     * Returns primary key columns
     * @return array
     */
    public function getPrimaryKeyColumns()
    {
       $primaryKeyColumnNames = $this->getPrimaryKeyColumnNames();
        return array_filter($this->_columns, function ($key) {
            return in_array($key, $primaryKeyColumnNames);
        }, ARRAY_FILTER_USE_KEY);
    }

    /**
     * Returns foreign key columns
     * @return array
     */
    public function getForeignKeyColumns()
    {
       $foreignKeyColumnNames = $this->getForeignKeyColumnNames();
        return array_filter($this->_columns, function ($key) {
            return in_array($key,  $foreignKeyColumnNames);
        }, ARRAY_FILTER_USE_KEY);
    }

    /**
     * Returns primary key column names
     * @return array
     */
    public function getPrimaryKeyColumnNames()
    {
        $primaryKeyColumnNames = [];
        if ($this->hasPrimaryKey()) {
            $primaryKeyColumnNames = $this->getPrimaryKey()->getColumns();
        }
        return $primaryKeyColumnNames;
    }

    /**
     * Returns foreign key column names
     * @return array
     */
    public function getForeignKeyColumnNames()
    {
        $foreignKeyColumnNames = [];
        foreach ($this->getForeignKeys() as $foreignKey) {
            /* @var $foreignKey ForeignKeyConstraint */
            $foreignKeyColumnNames = array_merge($foreignKeyColumnNames, $foreignKey->getColumns());
        }
        return $foreignKeyColumnNames;
    }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@evgpisarchik That seems not too efficient as calls to getPrimaryKeyColumnNames() and getForeignKeyColumnNames() are called multiple times. Am I right?

Copy link
Contributor Author

@evgpisarchik evgpisarchik May 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using php 5.6 syntax since composer.json has

"require": {
        "php": "^7.0",

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use PHP 7 syntax, as master requires PHP 7.x

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@verheyenkoen you're right, I've updated the code

@evgpisarchik
Copy link
Contributor Author

Updated previously pushed code, squashed both commits

Copy link
Member

@Ocramius Ocramius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

squashed both commits

Urgh, now I can't compare it with the previous solution. Please don't squash commits next time :S

As far as I can see, this patch added Table#getForeignKeyColumns(), Table#getPrimaryKeyColumnNames(), Table#getForeignKeyColumnNames(), which are all now to be tested to prevent BC regressions.

I appreciate your effort, I really do, but the newly exposed public API is a problem.

@@ -742,7 +742,7 @@ private function getPreAlterTableAlterIndexForeignKeySQL(TableDiff $diff)
foreach ($diff->changedIndexes as $changedIndex) {
// Changed primary key
if ($changedIndex->isPrimary() && $diff->fromTable instanceof Table) {
foreach ($diff->fromTable->getPrimaryKeyColumns() as $columnName) {
foreach ($diff->fromTable->getPrimaryKeyColumnNames() as $columnName) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not simply $diff->fromTable->getPrimaryKeyColumns() and then on line 746 $columnName->getName() ?

* @return Column[]
*/
public function getColumns()
{
$columns = $this->_columns;
return array_merge($this->getPrimaryKeyColumns(), $this->getForeignKeyColumns(), $this->_columns);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks much cleaner 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using $this->getPrimaryKeyColumns() in this method will fail if there are no primary keys, and the BC break in getPrimaryKeyColumns is reverted (exception throwing on empty PK)

{
$primaryKeyColumnNames = $this->getPrimaryKeyColumnNames();
return array_filter($this->_columns, function ($key) use ($primaryKeyColumnNames) {
return in_array($key, $primaryKeyColumnNames);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use strict comparisons at all times

public function getPrimaryKeyColumns()
{
if ( ! $this->hasPrimaryKey()) {
throw new DBALException("Table " . $this->getName() . " has no primary key.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This exception is gone - that is a BC break, as now an empty list of primary key columns is returned

* Returns foreign key column names
* @return array
*/
public function getForeignKeyColumnNames()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't need to be public

* Returns foreign key columns
* @return Column[]
*/
public function getForeignKeyColumns()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't need to be public

* Returns primary key column names
* @return array
*/
public function getPrimaryKeyColumnNames()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't need to be public

@evgpisarchik
Copy link
Contributor Author

evgpisarchik commented May 21, 2017

Reverted changes to public Api that were made before (now getPrimaryKeyColumns returns array of strings (not Column[]).

getColumns method is cleaner that it was before and has the same behavior as the original code (it doesn't throw exception if table doesn't have primary keys )

@evgpisarchik evgpisarchik changed the title uksort replaced by array_replace to improve performance Table->getColumns() performance May 21, 2017
@Ocramius Ocramius added this to the 2.6 milestone Jun 1, 2017
@Ocramius Ocramius self-assigned this Jun 1, 2017
@Ocramius
Copy link
Member

Ocramius commented Jun 1, 2017

@evgpisarchik perfect, thanks for all the effort! \o/

🚢

@Ocramius Ocramius merged commit 476c2f9 into doctrine:master Jun 1, 2017
@evgpisarchik
Copy link
Contributor Author

my pleasure!

@Ocramius Ocramius changed the title Table->getColumns() performance Improve Table#getColumns() performance Jul 22, 2017
@Ocramius Ocramius changed the title Improve Table#getColumns() performance Improve Table#getColumns() performance by reducing its runtime complexity Jul 22, 2017
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants