Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TiFlash join may got wrong result if fine grained join is enabled, and multiple join key has different types #59877

Closed
windtalker opened this issue Mar 4, 2025 · 9 comments · Fixed by #59884
Assignees
Labels
affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. affects-8.5 This bug affects the 8.5.x(LTS) versions. severity/critical sig/execution SIG execution type/bug The issue is confirmed as a bug.

Comments

@windtalker
Copy link
Contributor

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

mysql> create table t1(id bigint, v1 int);
Query OK, 0 rows affected (0.04 sec)

mysql> create table t2(id bigint unsigned, v1 int);
Query OK, 0 rows affected (0.05 sec)

mysql> create table t3(id bigint, v1 int);
Query OK, 0 rows affected (0.05 sec)

mysql> insert into t1 values(10001, 1);
Query OK, 1 row affected (0.01 sec)

mysql> insert into t2 values(10001, 2);
Query OK, 1 row affected (0.00 sec)

mysql> insert into t3 values(10001, 3);
Query OK, 1 row affected (0.01 sec)

mysql> set tidb_broadcast_join_threshold_size=0;
Query OK, 0 rows affected (0.00 sec)

mysql> set tidb_broadcast_join_threshold_count=0;
Query OK, 0 rows affected (0.00 sec)

mysql> alter table t1 set tiflash replica 1;
Query OK, 0 rows affected (0.06 sec)

mysql> alter table t2 set tiflash replica 1;
Query OK, 0 rows affected (0.04 sec)

mysql> alter table t3 set tiflash replica 1;
Query OK, 0 rows affected (0.06 sec)

mysql> set tidb_enforce_mpp=1;
Query OK, 0 rows affected (0.01 sec)

mysql> set tiflash_fine_grained_shuffle_stream_count=0;
Query OK, 0 rows affected (0.00 sec)

mysql>  select /*+ hash_join_build(t3) */ count(*) from t1 straight_join t2 on t1.id = t2.id straight_join t3 on t1.id = t3.id;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.03 sec)

mysql> set tiflash_fine_grained_shuffle_stream_count=-1;
Query OK, 0 rows affected (0.00 sec)

mysql>  select /*+ hash_join_build(t3) */ count(*) from t1 straight_join t2 on t1.id = t2.id straight_join t3 on t1.id = t3.id;
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.07 sec)

2. What did you expect to see? (Required)

3. What did you see instead (Required)

4. What is your TiDB version? (Required)

@windtalker windtalker added the type/bug The issue is confirmed as a bug. label Mar 4, 2025
@yibin87
Copy link
Contributor

yibin87 commented Mar 4, 2025

/assign @yibin87

@yibin87
Copy link
Contributor

yibin87 commented Mar 4, 2025

/sig execution

@ti-chi-bot ti-chi-bot bot added the sig/execution SIG execution label Mar 4, 2025
@yibin87
Copy link
Contributor

yibin87 commented Mar 4, 2025

/severity critical

@yibin87
Copy link
Contributor

yibin87 commented Mar 4, 2025

/label affects-7.1

@ti-chi-bot ti-chi-bot bot added affects-7.1 This bug affects the 7.1.x(LTS) versions. and removed may-affects-7.1 labels Mar 4, 2025
@yibin87
Copy link
Contributor

yibin87 commented Mar 4, 2025

/label affects-7.5
/label affects-8.1
/label affects-8.5

@ti-chi-bot ti-chi-bot bot added affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. affects-8.5 This bug affects the 8.5.x(LTS) versions. and removed may-affects-7.5 may-affects-8.1 may-affects-8.5 labels Mar 4, 2025
@yibin87
Copy link
Contributor

yibin87 commented Mar 4, 2025

/label affects-6.5

@ti-chi-bot ti-chi-bot bot added affects-6.5 This bug affects the 6.5.x(LTS) versions. and removed may-affects-6.5 labels Mar 4, 2025
@yibin87
Copy link
Contributor

yibin87 commented Mar 4, 2025

Users can workaround the issue by set tiflash_fine_grained_shuffle_stream_count = -1. Set the severity to critical for two reasons:

  1. Incorrect query results
  2. Report by customers

The issue's root cause is the shuffle column type is not equal to the comparing join key type, and the join build key type "Jump back and forth":

  1. t1 join t2 on t1.id = t2.id, t1.id is bigint, while t2.id is bigint unsigned, thus choose the common type decimal(20,0) to shuffle. And in this join, the comparing type is also decimal(20,0), so this join result is correct
  2. then (t1 join t2) join t3 on t1.id = t3.id, although t1.id and t3.id are both bigint type, optimizer chooses to benefit from the existing shuffle result to reduce one shuffle: the result of t1 join t2 already obeys shuffling cast(t1.id, decimal(20,0)). So optimizer choose to add a shuffle for t3 and the final shuffle key is cast(t3.id, decimal(20,0))
  3. From v6.5.0, we introduced FineGrainedShuffle optimization to reuse the hash code generated in shuffle phase in join build phase. In this case, the reused hash code is generated for cast(t3.id, decimal(20,0)); However, the final join equal condition is t1.id = t3.id, both are bigint type, no any type convertion is needed. Thus in join probe phase, the hash code is generated for t1.id directly, which cause the miss match.

@yibin87
Copy link
Contributor

yibin87 commented Mar 4, 2025

/remove-label may-affects-6.1

@yibin87
Copy link
Contributor

yibin87 commented Mar 6, 2025

/remove-label affects-6.5

@ti-chi-bot ti-chi-bot bot removed the affects-6.5 This bug affects the 6.5.x(LTS) versions. label Mar 6, 2025
ti-chi-bot bot pushed a commit that referenced this issue Mar 6, 2025
…nsure shuffle keys are the same with actual join keys (#59884) (#59916)

close #59877
ti-chi-bot bot pushed a commit that referenced this issue Mar 6, 2025
…nsure shuffle keys are the same with actual join keys (#59884) (#59914)

close #59877
ti-chi-bot bot pushed a commit that referenced this issue Mar 7, 2025
…nsure shuffle keys are the same with actual join keys (#59884) (#59913)

close #59877
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. affects-8.5 This bug affects the 8.5.x(LTS) versions. severity/critical sig/execution SIG execution type/bug The issue is confirmed as a bug.
Projects
None yet
2 participants