Revert "Removing uniqueness constraints on tables table" #6777

Merged
merged 3 commits into from
Jan 31, 2019

john-bodley
Member

@john-bodley john-bodley commented Jan 30, 2019

Reverting #6718 as the resulting uniqueness constraint key is too long in MySQL, i.e.,

sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (1071, 'Specified key was too long; max key length is 767 bytes') [SQL: 'ALTER TABLE 

Per #5449 the column sizes need to be reduced in order to prevent the issue; however, there are open concerns about reducing the size of the table name.

Note the title for #6718 is a little misleading as the uniqueness constraint wasn't removed but augmented.

to: @agrawaldevesh @michellethomas @mistercrunch

@agrawaldevesh

Hi John, thanks for jumping on this and doing the revert. Is there a way I can reproduce this, or could you share the full log message?

I am confused about why this PR would be to blame, since it only creates a constraint with a constant name (and thus a constant name length), "uq_table_in_db_schema". So I am not sure why it would cause this "long key" issue.

Perhaps we can work around this by always recreating the table like we do for SQLite (as in https://github.com/apache/incubator-superset/pull/6777/files#diff-b4aec6518fae5ca2dcce669810ba5695L49).

If I have the full log message or something, then I can understand this error and the cause of this revert better.

@john-bodley
Member Author

john-bodley commented Jan 30, 2019

@agrawaldevesh per the comment in my PR, when a uniqueness constraint is created on a tuple of columns, the size of the constraint key is defined by the combined maximum size of the underlying columns (per here).

Previously the constraint required: 4 (database_id) + 250 (table_name) * 3 = 754 bytes, whereas now it requires: 4 (database_id) + [255 (schema) + 250 (table_name)] * 3 = 1,519 bytes. In #5449 the column sizes were changed, giving: 4 (database_id) + [127 (schema) + 127 (table_name)] * 3 = 766 bytes.
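
The arithmetic above can be reproduced with a short sketch. It assumes, as the figures in this comment do, 3 bytes per character and 4 bytes for the integer column; the helper name `key_bytes` is hypothetical and is not part of Superset.

```python
# Worst-case index key size, using the figures quoted in this thread.
BYTES_PER_CHAR = 3   # assumed worst case per character (utf8 / utf8mb3)
INT_BYTES = 4        # database_id
MAX_KEY_BYTES = 767  # limit reported in the MySQL error message


def key_bytes(*varchar_lengths):
    """Worst-case bytes for a key on (database_id, *varchar columns)."""
    return INT_BYTES + sum(varchar_lengths) * BYTES_PER_CHAR


layouts = {
    "before #6718 (database_id, table_name)": key_bytes(250),
    "after #6718 (database_id, schema, table_name)": key_bytes(255, 250),
    "with #5449 column sizes": key_bytes(127, 127),
}

for label, size in layouts.items():
    verdict = "fits" if size <= MAX_KEY_BYTES else "too long"
    print(f"{label}: {size} bytes ({verdict})")
```

Only the three-column key with the original column sizes exceeds the 767-byte limit under these assumptions.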

@agrawaldevesh

Ah okay. So the right thing to do here is really to remove the unique constraints (and fix or ignore the resulting unit test failures).

Also, is my understanding correct that with utf8mb3 in MySQL a character can take up to 3 bytes? If so, the key shouldn't exceed the 767-byte limit all the time, perhaps only in certain cases. Do you have non-ASCII table names?

Thanks

@john-bodley
Member Author

@agrawaldevesh I think the right thing is to revert the PR and then remedy the issue regarding the column sizes. I'm not supportive of removing the uniqueness constraint as this can be problematic to fix in the future. Currently there are a number of instances in our deployment of duplicate entities due to missing/ill-defined uniqueness constraints.

Unique indexes must have a known maximum length (a requirement of MySQL due to its internal implementation), i.e., the size limit applies irrespective of what's currently stored in the database.
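
A minimal sketch of the distinction, assuming 3 bytes per character as in the figures quoted earlier in this thread; the table name and helper names are hypothetical.

```python
BYTES_PER_CHAR = 3  # assumed worst case per character (utf8 / utf8mb3)


def declared_max_bytes(varchar_length):
    """Bytes the index must reserve for a VARCHAR(varchar_length) column."""
    return varchar_length * BYTES_PER_CHAR


def stored_bytes(value):
    """Bytes a particular value actually occupies when UTF-8 encoded."""
    return len(value.encode("utf-8"))


name = "birth_names"  # hypothetical ASCII table name
print(stored_bytes(name), "bytes stored;", declared_max_bytes(250), "bytes declared")
```

The key length check uses the declared figure, so short ASCII names do not help.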

@agrawaldevesh

@john-bodley, thanks for explaining! So, as I understand it, the best course of action in your view is to revert my PR and then get #5449 (the one that reduces the column size for the table name) and #5445 (the wtf-forms one) in?

@mistercrunch
Member

I support the revert, LGTM, let's start over clean.

@agrawaldevesh

Btw, I am not sure whether the issue seen by John is setup-dependent or not.

We use MySQL too and it didn't fail for us. So I don't think my PR always makes MySQL fail (there are unit tests for that too!). Perhaps it only fails when the sizes of the actual table/schema/database names exceed a threshold.

Here are some details about my MySQL environment:

XXX@XXX superset> SHOW VARIABLES LIKE "%version%";
+-------------------------+--------------------------------------------------------+
| Variable_name           | Value                                                  |
+-------------------------+--------------------------------------------------------+
| innodb_version          | 5.7.24-26                                              |
| protocol_version        | 10                                                     |
| slave_type_conversions  |                                                        |
| tls_version             | TLSv1,TLSv1.1,TLSv1.2                                  |
| version                 | 5.7.24-26-log                                          |
| version_comment         | Percona Server (GPL), Release '26', Revision 'c8fe767' |
| version_compile_machine | x86_64                                                 |
| version_compile_os      | debian-linux-gnu                                       |
| version_suffix          | -log                                                   |
+-------------------------+--------------------------------------------------------+
9 rows in set (0.00 sec)

XXX@XXX superset> show create table tables;
| tables | CREATE TABLE tables (
  created_on datetime DEFAULT NULL,
  changed_on datetime DEFAULT NULL,
  id int(11) NOT NULL AUTO_INCREMENT,
  table_name varchar(250) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  main_dttm_col varchar(250) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  default_endpoint text COLLATE utf8mb4_unicode_ci,
  database_id int(11) NOT NULL,
  created_by_fk int(11) DEFAULT NULL,
  changed_by_fk int(11) DEFAULT NULL,
  offset int(11) DEFAULT NULL,
  description text COLLATE utf8mb4_unicode_ci,
  is_featured tinyint(1) DEFAULT NULL,
  cache_timeout int(11) DEFAULT NULL,
  schema varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  sql text COLLATE utf8mb4_unicode_ci,
  params text COLLATE utf8mb4_unicode_ci,
  perm varchar(1000) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  filter_select_enabled tinyint(1) DEFAULT NULL,
  fetch_values_predicate varchar(1000) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  is_sqllab_view tinyint(1) DEFAULT '0',
  template_params text COLLATE utf8mb4_unicode_ci,
  PRIMARY KEY (id),
  UNIQUE KEY uq_table_in_db_schema (database_id,schema,table_name),
  KEY created_by_fk (created_by_fk),
  KEY changed_by_fk (changed_by_fk),
  CONSTRAINT tables_ibfk_1 FOREIGN KEY (database_id) REFERENCES dbs (id),
  CONSTRAINT tables_ibfk_2 FOREIGN KEY (created_by_fk) REFERENCES ab_user (id),
  CONSTRAINT tables_ibfk_3 FOREIGN KEY (changed_by_fk) REFERENCES ab_user (id)
) ENGINE=InnoDB AUTO_INCREMENT=123 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci |
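
One possible reason the constraint exists in the environment above, offered as an assumption and not confirmed anywhere in this thread: InnoDB's index key limit is 767 bytes for the COMPACT and REDUNDANT row formats, but 3072 bytes for DYNAMIC and COMPRESSED when large index prefixes are enabled, which is the default in recent MySQL 5.7 releases. The sketch below applies both limits to the utf8mb4 table shown above (up to 4 bytes per character); the helper name is hypothetical.

```python
INT_BYTES = 4  # database_id

# InnoDB index key limits by row format (assumed to be the relevant settings).
LIMITS = {
    "COMPACT/REDUNDANT": 767,
    "DYNAMIC/COMPRESSED with large prefix": 3072,
}


def key_bytes(varchar_lengths, bytes_per_char):
    """Worst-case bytes for a key on (database_id, *varchar columns)."""
    return INT_BYTES + sum(varchar_lengths) * bytes_per_char


# schema varchar(255) and table_name varchar(250), utf8mb4 => 4 bytes per char
size = key_bytes([255, 250], bytes_per_char=4)

for row_format, limit in LIMITS.items():
    verdict = "fits" if size <= limit else "too long"
    print(f"{row_format}: {size} of {limit} bytes ({verdict})")
```

If this assumption holds, the result depends on server and table configuration, not on the names actually stored.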

@john-bodley john-bodley merged commit 2631558 into master Jan 31, 2019
@john-bodley john-bodley deleted the revert-6718-fix-uniqueness-table branch January 31, 2019 17:57
@agrawaldevesh

@john-bodley, can you comment on whether this is specific to your environment or something that was actually broken by this diff? I don't understand why it works with MySQL in my production environment but not in yours. Thanks.

@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.34.0 labels Feb 28, 2024
3 participants