跳转至

6.8.5 Hint

数据库 Hint 是一种数据库查询优化技术,用于指导数据库查询优化器如何执行特定的查询。通过提供 Hint ,用户可以对查询优化器的默认行为进行微调,以期望获得更好的性能或满足特定需求。

Hint 作用:

  • 性能优化:通过 Hint ,可以影响查询优化器的执行计划,从而提升查询性能。

  • 控制执行计划:可以指定使用连接方法、排序方法等。

  • 调试和测试:在调试和测试查询性能时, Hint 可以帮助确定问题的根源。

1 Hint 概述

在数据库中, Hint 是一种指令,用于指导查询优化器制定执行计划。通过在 SQL 语句中嵌入 Hint ,可以影响优化器的决策,从而选择期望的执行路径。

以下是一个使用 Hint 的背景示例:

假设有一个包含大量数据的表,而在某些特定情况下,你了解到在一个查询中,表的连接顺序可能会影响查询性能。此时, Leading Hint 允许你指定希望优化器遵循的表连接顺序。

以下面 SQL 查询为例:若执行效率不理想,我们希望调整 join 顺序,同时不改变原始 SQL ,以免影响用户原始场景,并达到调优目的。

SQL
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
mysql> explain shape plan select * from t1 join t2 on t1.c1 = c2;
+-------------------------------------------+
| Explain String                            |
+-------------------------------------------+
| PhysicalResultSink                        |
| --PhysicalDistribute                      |
| ----PhysicalProject                       |
| ------hashJoin[INNER_JOIN](t1.c1 = t2.c2) |
| --------PhysicalOlapScan[t2]              |
| --------PhysicalDistribute                |
| ----------PhysicalOlapScan[t1]            |
+-------------------------------------------+
7 rows in set (0.06 sec)

此时,我们可以使用 Leading Hint 来任意改变 t1t2Join 顺序。例如:

SQL
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
mysql> explain shape plan select /*+ leading(t2 t1)*/ * from t1 join t2 on c1 = c2;
+-----------------------------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                                                     |
+-----------------------------------------------------------------------------------------------------+
| PhysicalResultSink                                                                                  |
| --PhysicalDistribute                                                                                |
| ----PhysicalProject                                                                                 |
| ------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() build RFs:RF0 c1->[c2] |
| --------PhysicalOlapScan[t2] apply RFs: RF0                                                         |
| --------PhysicalDistribute                                                                          |
| ----------PhysicalOlapScan[t1]                                                                      |
|                                                                                                     |
| Hint log:                                                                                           |
| Used: leading(t2 t1)                                                                                |
| UnUsed:                                                                                             |
| SyntaxError:                                                                                        |
+-----------------------------------------------------------------------------------------------------+
12 rows in set (0.06 sec)

在此示例中,使用了 /*+ leading(t2 t1)*/ 这类 Hint 。这类 Hint 会告知优化器在执行计划中使用指定表( t2 )作为驱动表,并将其置于( t1 )之前。

目前, Doris 主要支持与 Join 相关的 Hint 来指定连接操作的顺序或方式,包括:

  • LeadingHint :主要用于控制连接操作的顺序。

  • OrderedHint :主要用于固定连接操作的顺序,可以直接生成基于 SQL 写法的文本序的连接操作、

  • DistributeHint :主要用于固定连接操作右表的分布属性。

  • SetVarHint :主要用于设置在单条 SQL 里面使用的 sessionVariables ,其作用仅在该条 SQL 的生命周期内生效。

接下来,将详细阐述如何在 Doris 中使用上述四类 JoinHint

2 LeadingHint 使用说明

Leading Hint 是一种强大的查询优化技术,允许用户指导 Doris 优化器确定查询计划中的表连接顺序。正确使用 Leading Hint 可以显著提高复杂查询的性能。本文将详细介绍如何在 Doris 中使用 Leading Hint 来控制 join 顺序。

2.1 基本语法

Leading Hint 允许你指定希望优化器遵循的表连接顺序。在 Doris 里面, Leading Hint 的基本语法如下:

SQL
1
SELECT /*+ LEADING(tablespec [tablespec]...)*/ ...

其中,需要注意的是:

  1. Leading Hint/*+*/ 包围,并置于 SQL 语句中 SELECT 的正后方。

  2. tablespec 是表名或表别名,至少需要指定两个表。

  3. 多个表之间用空格分隔。

  4. 可以使用大括号 {} 来显式地指定 Join Tree 的形状。

Warning

Leading Hint 后方的 /SELECT 列表需要隔开至少一个分隔符,例如空格。至少需要写两个以上的表才认为这个 Leading Hint 是合理的。且任意的 Join 里面可以用大括号括起来,来显式地指定 Join Tree 的形状。

举例说明:

SQL
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
mysql> explain shape plan select /*+ leading(t2 t1)*/ * from t1 join t2 on c1 = c2;
+------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                              |
+------------------------------------------------------------------------------+
| PhysicalResultSink                                                           |
| --PhysicalDistribute[DistributionSpecGather]                                 |
| ----PhysicalProject                                                          |
| ------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
| --------PhysicalOlapScan[t2]                                                 |
| --------PhysicalDistribute[DistributionSpecHash]                             |
| ----------PhysicalOlapScan[t1]                                               |
|                                                                              |
| Hint log:                                                                    |
| Used: leading(t2 t1)                                                         |
| UnUsed:                                                                      |
| SyntaxError:                                                                 |
+------------------------------------------------------------------------------+
12 rows in set (0.01 sec)
  1. Leading Hint 不生效的时候会走正常的流程生成计划, EXPLAIN 会显示使用的 Hint 是否生效,主要分三种来显示:

    • UsedLeading Hint 正常生效

    • Unused :这里不支持的情况包含 Leading Hint 指定的 join order 与原 SQL 不等价或本版本暂不支持特性

    • SyntaxError :指 Leading Hint 语法错误,如找不到对应的表等

  2. Leading Hint 语法默认构造出左深树:

    查询语句如下

    SQL
    1
    select /leading(t1 t2 t3)/ * from t1 join t2 on...
    

    查询结果如下:

    SQL
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
        join
        /    \
    join    t3
    /    \
    t1    t2
    
    mysql> explain shape plan select /*+ leading(t1 t2 t3)*/ * from t1 join t2 on c1 = c2 join t3 on c2=c3;
    +--------------------------------------------------------------------------------+
    | Explain String(Nereids Planner)                                                |
    +--------------------------------------------------------------------------------+
    | PhysicalResultSink                                                             |
    | --PhysicalDistribute[DistributionSpecGather]                                   |
    | ----PhysicalProject                                                            |
    | ------hashJoin[INNER_JOIN] hashCondition=((t2.c2 = t3.c3)) otherCondition=()   |
    | --------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
    | ----------PhysicalOlapScan[t1]                                                 |
    | ----------PhysicalDistribute[DistributionSpecHash]                             |
    | ------------PhysicalOlapScan[t2]                                               |
    | --------PhysicalDistribute[DistributionSpecHash]                               |
    | ----------PhysicalOlapScan[t3]                                                 |
    |                                                                                |
    | Hint log:                                                                      |
    | Used: leading(t1 t2 t3)                                                        |
    | UnUsed:                                                                        |
    | SyntaxError:                                                                   |
    +--------------------------------------------------------------------------------+
    15 rows in set (0.00 sec)
    
  3. 同时允许使用大括号指定 Join 树形状:

    SQL
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    mysql> explain shape plan select /*+ leading(t1 {t2 t3})*/ * from t1 join t2 on c1 = c2 join t3 on c2=c3;
    +----------------------------------------------------------------------------------+
    | Explain String(Nereids Planner)                                                  |
    +----------------------------------------------------------------------------------+
    | PhysicalResultSink                                                               |
    | --PhysicalDistribute[DistributionSpecGather]                                     |
    | ----PhysicalProject                                                              |
    | ------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=()     |
    | --------PhysicalOlapScan[t1]                                                     |
    | --------PhysicalDistribute[DistributionSpecHash]                                 |
    | ----------hashJoin[INNER_JOIN] hashCondition=((t2.c2 = t3.c3)) otherCondition=() |
    | ------------PhysicalOlapScan[t2]                                                 |
    | ------------PhysicalDistribute[DistributionSpecHash]                             |
    | --------------PhysicalOlapScan[t3]                                               |
    |                                                                                  |
    | Hint log:                                                                        |
    | Used: leading(t1 { t2 t3 })                                                      |
    | UnUsed:                                                                          |
    | SyntaxError:                                                                     |
    +----------------------------------------------------------------------------------+
    15 rows in set (0.02 sec)
    
  4. 当有 View 作为别名参与 JoinReorder 的时候可以指定对应的 View 作为 Leading Hint 的参数。例:

    SQL
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    mysql> explain shape plan select /*+ leading(alias t1) */ count(*) from t1 join (select c2 from t2 join t3 on t2.c2 = t3.c3) as alias on t1.c1 = alias.c2;
    +--------------------------------------------------------------------------------------+
    | Explain String(Nereids Planner)                                                      |
    +--------------------------------------------------------------------------------------+
    | PhysicalResultSink                                                                   |
    | --hashAgg[GLOBAL]                                                                    |
    | ----PhysicalDistribute[DistributionSpecGather]                                       |
    | ------hashAgg[LOCAL]                                                                 |
    | --------PhysicalProject                                                              |
    | ----------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = alias.c2)) otherCondition=()  |
    | ------------PhysicalProject                                                          |
    | --------------hashJoin[INNER_JOIN] hashCondition=((t2.c2 = t3.c3)) otherCondition=() |
    | ----------------PhysicalProject                                                      |
    | ------------------PhysicalOlapScan[t2]                                               |
    | ----------------PhysicalDistribute[DistributionSpecHash]                             |
    | ------------------PhysicalProject                                                    |
    | --------------------PhysicalOlapScan[t3]                                             |
    | ------------PhysicalDistribute[DistributionSpecHash]                                 |
    | --------------PhysicalProject                                                        |
    | ----------------PhysicalOlapScan[t1]                                                 |
    |                                                                                      |
    | Hint log:                                                                            |
    | Used: leading(alias t1)                                                              |
    | UnUsed:                                                                              |
    | SyntaxError:                                                                         |
    +--------------------------------------------------------------------------------------+
    21 rows in set (0.06 sec)
    

2.2 基本用例

Tip

在接下来的案例中,列命名和表命名相关,例如当只有 t1 中有 c1 字段,为了简化会将 t1.c1 直接写成 c1

建表语句如下:

SQL
1
2
3
4
5
6
7
CREATE DATABASE testleading;
USE testleading;

create table t1 (c1 int, c11 int) distributed by hash(c1) buckets 3 properties('replication_num' = '1');
create table t2 (c2 int, c22 int) distributed by hash(c2) buckets 3 properties('replication_num' = '1');
create table t3 (c3 int, c33 int) distributed by hash(c3) buckets 3 properties('replication_num' = '1');
create table t4 (c4 int, c44 int) distributed by hash(c4) buckets 3 properties('replication_num' = '1');

原始 plan

SQL
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
mysql> explain shape plan select * from t1 join t2 on t1.c1 = c2;
+-------------------------------------------+
| Explain String                            |
+-------------------------------------------+
| PhysicalResultSink                        |
| --PhysicalDistribute                      |
| ----PhysicalProject                       |
| ------hashJoin[INNER_JOIN](t1.c1 = t2.c2) |
| --------PhysicalOlapScan[t2]              |
| --------PhysicalDistribute                |
| ----------PhysicalOlapScan[t1]            |
+-------------------------------------------+
7 rows in set (0.06 sec)

当我们需要交换 t1t2join 顺序时,只需在前面加上 leading(t2 t1) 即可。在执行 explain 时,会显示是否使用了这个 hint 。如下 Leading plan:Used 表示 Hint 正常生效

SQL
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
mysql> explain shape plan select /*+ leading(t2 t1)*/ * from t1 join t2 on c1 = c2;
+------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                              |
+------------------------------------------------------------------------------+
| PhysicalResultSink                                                           |
| --PhysicalDistribute[DistributionSpecGather]                                 |
| ----PhysicalProject                                                          |
| ------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
| --------PhysicalOlapScan[t2]                                                 |
| --------PhysicalDistribute[DistributionSpecHash]                             |
| ----------PhysicalOlapScan[t1]                                               |
|                                                                              |
| Hint log:                                                                    |
| Used: leading(t2 t1)                                                         |
| UnUsed:                                                                      |
| SyntaxError:                                                                 |
+------------------------------------------------------------------------------+
12 rows in set (0.00 sec)

如果 Leading Hint 存在语法错误, explain 时会在 SyntaxError 里显示相应信息,但计划仍能照常生成,只是不会使用 Leading 而已。例如:

SQL
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
mysql> explain shape plan select /*+ leading(t2 t3)*/ * from t1 join t2 on t1.c1 = c2;
+--------------------------------------------------------+
| Explain String                                         |
+--------------------------------------------------------+
| PhysicalResultSink                                     |
| --PhysicalDistribute                                   |
| ----PhysicalProject                                    |
| ------hashJoin[INNER_JOIN](t1.c1 = t2.c2)              |
| --------PhysicalOlapScan[t1]                           |
| --------PhysicalDistribute                             |
| ----------PhysicalOlapScan[t2]                         |
|                                                        |
| Used:                                                  |
| UnUsed:                                                |
| SyntaxError: leading(t2 t3) Msg:can not find table: t3 |
+--------------------------------------------------------+
11 rows in set (0.01 sec)

2.3 扩展场景

  1. 左深树

    上文我们提及, Doris 在查询语句不使用任何括号的情况下, Leading 会默认生成左深树。

    SQL
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    mysql> explain shape plan select /*+ leading(t1 t2 t3)*/ * from t1 join t2 on t1.c1 = c2 join t3 on c2 = c3;
    +--------------------------------------------------------------------------------+
    | Explain String(Nereids Planner)                                                |
    +--------------------------------------------------------------------------------+
    | PhysicalResultSink                                                             |
    | --PhysicalDistribute[DistributionSpecGather]                                   |
    | ----PhysicalProject                                                            |
    | ------hashJoin[INNER_JOIN] hashCondition=((t2.c2 = t3.c3)) otherCondition=()   |
    | --------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
    | ----------PhysicalOlapScan[t1]                                                 |
    | ----------PhysicalDistribute[DistributionSpecHash]                             |
    | ------------PhysicalOlapScan[t2]                                               |
    | --------PhysicalDistribute[DistributionSpecHash]                               |
    | ----------PhysicalOlapScan[t3]                                                 |
    |                                                                                |
    | Hint log:                                                                      |
    | Used: leading(t1 t2 t3)                                                        |
    | UnUsed:                                                                        |
    | SyntaxError:                                                                   |
    +--------------------------------------------------------------------------------+
    15 rows in set (0.10 sec)
    
  2. 右深树

    当需要将计划的形状做成右深树、 Bushy 树或者 zig-zag 树时,只需加上大括号来限制 plan 的形状即可,无需像 Oracle 使用 swap 从左深树一步步调整。

    SQL
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    mysql> explain shape plan select /*+ leading(t1 {t2 t3})*/ * from t1 join t2 on t1.c1 = c2 join t3 on c2 = c3;
    +-----------------------------------------------+
    | Explain String                                |
    +-----------------------------------------------+
    | PhysicalResultSink                            |
    | --PhysicalDistribute                          |
    | ----PhysicalProject                           |
    | ------hashJoin[INNER_JOIN](t1.c1 = t2.c2)     |
    | --------PhysicalOlapScan[t1]                  |
    | --------PhysicalDistribute                    |
    | ----------hashJoin[INNER_JOIN](t2.c2 = t3.c3) |
    | ------------PhysicalOlapScan[t2]              |
    | ------------PhysicalDistribute                |
    | --------------PhysicalOlapScan[t3]            |
    |                                               |
    | Used: leading(t1 { t2 t3 })                   |
    | UnUsed:                                       |
    | SyntaxError:                                  |
    +-----------------------------------------------+
    14 rows in set (0.02 sec)
    
  3. Bushy

    SQL
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    mysql> explain shape plan select /*+ leading({t1 t2} {t3 t4})*/ * from t1 join t2 on t1.c1 = c2 join t3 on c2 = c3 join t4 on c3 = c4;
    +-----------------------------------------------+
    | Explain String                                |
    +-----------------------------------------------+
    | PhysicalResultSink                            |
    | --PhysicalDistribute                          |
    | ----PhysicalProject                           |
    | ------hashJoin[INNER_JOIN](t2.c2 = t3.c3)     |
    | --------hashJoin[INNER_JOIN](t1.c1 = t2.c2)   |
    | ----------PhysicalOlapScan[t1]                |
    | ----------PhysicalDistribute                  |
    | ------------PhysicalOlapScan[t2]              |
    | --------PhysicalDistribute                    |
    | ----------hashJoin[INNER_JOIN](t3.c3 = t4.c4) |
    | ------------PhysicalOlapScan[t3]              |
    | ------------PhysicalDistribute                |
    | --------------PhysicalOlapScan[t4]            |
    |                                               |
    | Used: leading({ t1 t2 } { t3 t4 })            |
    | UnUsed:                                       |
    | SyntaxError:                                  |
    +-----------------------------------------------+
    17 rows in set (0.02 sec)
    
  4. Zig-Zag

    SQL
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    mysql> explain shape plan select /*+ leading(t1 {t2 t3} t4)*/ * from t1 join t2 on t1.c1 = c2 join t3 on c2 = c3 join t4 on c3 = c4;
    +--------------------------------------------------------------------------------------+
    | Explain String(Nereids Planner)                                                      |
    +--------------------------------------------------------------------------------------+
    | PhysicalResultSink                                                                   |
    | --PhysicalDistribute[DistributionSpecGather]                                         |
    | ----PhysicalProject                                                                  |
    | ------hashJoin[INNER_JOIN] hashCondition=((t3.c3 = t4.c4)) otherCondition=()         |
    | --------PhysicalDistribute[DistributionSpecHash]                                     |
    | ----------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=()     |
    | ------------PhysicalOlapScan[t1]                                                     |
    | ------------PhysicalDistribute[DistributionSpecHash]                                 |
    | --------------hashJoin[INNER_JOIN] hashCondition=((t2.c2 = t3.c3)) otherCondition=() |
    | ----------------PhysicalOlapScan[t2]                                                 |
    | ----------------PhysicalDistribute[DistributionSpecHash]                             |
    | ------------------PhysicalOlapScan[t3]                                               |
    | --------PhysicalDistribute[DistributionSpecHash]                                     |
    | ----------PhysicalOlapScan[t4]                                                       |
    |                                                                                      |
    | Hint log:                                                                            |
    | Used: leading(t1 { t2 t3 } t4)                                                       |
    | UnUsed:                                                                              |
    | SyntaxError:                                                                         |
    +--------------------------------------------------------------------------------------+
    19 rows in set (0.02 sec)
    
  5. Non-inner Join

    当遇到非 inner-join (如 Outer JoinSemi/Anti Join )时, Leading Hint 会根据原始 SQL 语义自动推导各个 Join 的方式。若 Leading Hint 与原始 SQL 语义不同或无法生成,则会将其放入 UnUsed 中,但这并不影响计划正常流程的生成。

    以下是一个不能交换的例子:

    SQL
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    -------- test outer join which can not swap
    -- t1 leftjoin (t2 join t3 on (P23)) on (P12) != (t1 leftjoin t2 on (P12)) join t3 on (P23)
    mysql> explain shape plan select /*+ leading(t1 {t2 t3})*/ * from t1 left join t2 on c1 = c2 join t3 on c2 = c3;
    +--------------------------------------------------------------------------------+
    | Explain String(Nereids Planner)                                                |
    +--------------------------------------------------------------------------------+
    | PhysicalResultSink                                                             |
    | --PhysicalDistribute[DistributionSpecGather]                                   |
    | ----PhysicalProject                                                            |
    | ------hashJoin[INNER_JOIN] hashCondition=((t2.c2 = t3.c3)) otherCondition=()   |
    | --------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
    | ----------PhysicalOlapScan[t1]                                                 |
    | ----------PhysicalDistribute[DistributionSpecHash]                             |
    | ------------PhysicalOlapScan[t2]                                               |
    | --------PhysicalDistribute[DistributionSpecHash]                               |
    | ----------PhysicalOlapScan[t3]                                                 |
    |                                                                                |
    | Hint log:                                                                      |
    | Used:                                                                          |
    | UnUsed: leading(t1 { t2 t3 })                                                  |
    | SyntaxError:                                                                   |
    +--------------------------------------------------------------------------------+
    15 rows in set (0.01 sec)
    

    下面是一些可以交换的例子和不能交换的例子,读者可自行验证

    SQL
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    -------- test outer join which can swap
    -- (t1 leftjoin t2  on (P12)) innerjoin t3 on (P13) = (t1 innerjoin t3 on (P13)) leftjoin t2  on (P12)
    explain shape plan select *from t1 left join t2 on c1 = c2 join t3 on c1 = c3;
    explain shape plan select /*+ leading(t1 t3 t2) */* from t1 left join t2 on c1 = c2 join t3 on c1 = c3;
    
    -- (t1 leftjoin t2  on (P12)) leftjoin t3 on (P13) = (t1 leftjoin t3 on (P13)) leftjoin t2  on (P12)
    explain shape plan select *from t1 left join t2 on c1 = c2 left join t3 on c1 = c3;
    explain shape plan select /*+ leading(t1 t3 t2) */* from t1 left join t2 on c1 = c2 left join t3 on c1 = c3;
    
    -- (t1 leftjoin t2  on (P12)) leftjoin t3 on (P23) = t1 leftjoin (t2  leftjoin t3 on (P23)) on (P12)
    select /*+ leading(t2 t3 t1) SWAP_INPUT(t1) */ * from t1 left join t2 on c1 = c2 left join t3 on c2 = c3;
    explain shape plan select /*+ leading(t1 {t2 t3}) */ * from t1 left join t2 on c1 = c2 left join t3 on c2 = c3;
    explain shape plan select /*+ leading(t1 {t2 t3})*/ * from t1 left join t2 on c1 = c2 left join t3 on c2 = c3;
    
    -------- test outer join which can not swap
    --  t1 leftjoin (t2  join t3 on (P23)) on (P12) != (t1 leftjoin t2  on (P12)) join t3 on (P23)
    -- eliminated to inner join
    explain shape plan select /*+ leading(t1 {t2 t3}) */ * from t1 left join t2 on c1 = c2 join t3 on c2 = c3;
    explain graph select /*+ leading(t1 t2 t3)*/ *from t1 left join (select* from t2 join t3 on c2 = c3) on c1 = c2;
    
    -- test semi join
    explain shape plan select *from t1 where c1 in (select c2 from t2);
    explain shape plan select /*+ leading(t2 t1) */* from t1 where c1 in (select c2 from t2);
    
    -- test anti join
    explain shape plan select * from t1 where exists (select c2 from t2);
    
  6. View

    在涉及别名( Alias )的情况下,可以将别名作为一个完整独立的子树进行指定,并在这些子树内部根据文本序生成 Join 顺序

    SQL
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    mysql>  explain shape plan select /*+ leading(alias t1) */ count(*) from t1 join (select c2 from t2 join t3 on t2.c2 = t3.c3) as alias on t1.c1 = alias.c2;
    +--------------------------------------------------------------------------------------+
    | Explain String(Nereids Planner)                                                      |
    +--------------------------------------------------------------------------------------+
    | PhysicalResultSink                                                                   |
    | --hashAgg[GLOBAL]                                                                    |
    | ----PhysicalDistribute[DistributionSpecGather]                                       |
    | ------hashAgg[LOCAL]                                                                 |
    | --------PhysicalProject                                                              |
    | ----------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = alias.c2)) otherCondition=()  |
    | ------------PhysicalProject                                                          |
    | --------------hashJoin[INNER_JOIN] hashCondition=((t2.c2 = t3.c3)) otherCondition=() |
    | ----------------PhysicalProject                                                      |
    | ------------------PhysicalOlapScan[t2]                                               |
    | ----------------PhysicalDistribute[DistributionSpecHash]                             |
    | ------------------PhysicalProject                                                    |
    | --------------------PhysicalOlapScan[t3]                                             |
    | ------------PhysicalDistribute[DistributionSpecHash]                                 |
    | --------------PhysicalProject                                                        |
    | ----------------PhysicalOlapScan[t1]                                                 |
    |                                                                                      |
    | Hint log:                                                                            |
    | Used: leading(alias t1)                                                              |
    | UnUsed:                                                                              |
    | SyntaxError:                                                                         |
    +--------------------------------------------------------------------------------------+
    21 rows in set (0.02 sec)
    

3 OrderedHint 使用说明

OrderedHint 用于固定 Join Tree 的形状,使其按照查询中表的文本顺序进行显示和执行。这在需要精确控制查询计划时特别有用。

OrderedHint 的语法为 /*+ ORDERED*/ ,它应该被放置在 SELECT 语句中的 SELECT 关键字之后,紧接着查询的其余部分。

以下是一个使用 OrderedHint 的示例:

SQL
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
mysql> explain shape plan select /*+ ORDERED*/ t1.c1 from t2 join t1 on t1.c1 = t2.c2 join t3 on c2 = c3;
+--------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                                |
+--------------------------------------------------------------------------------+
| PhysicalResultSink                                                             |
| --PhysicalDistribute[DistributionSpecGather]                                   |
| ----PhysicalProject                                                            |
| ------hashJoin[INNER_JOIN] hashCondition=((t2.c2 = t3.c3)) otherCondition=()   |
| --------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
| ----------PhysicalProject                                                      |
| ------------PhysicalOlapScan[t2]                                               |
| ----------PhysicalDistribute[DistributionSpecHash]                             |
| ------------PhysicalProject                                                    |
| --------------PhysicalOlapScan[t1]                                             |
| --------PhysicalDistribute[DistributionSpecHash]                               |
| ----------PhysicalProject                                                      |
| ------------PhysicalOlapScan[t3]                                               |
|                                                                                |
| Hint log:                                                                      |
| Used: ORDERED                                                                  |
| UnUsed:                                                                        |
| SyntaxError:                                                                   |
+--------------------------------------------------------------------------------+
18 rows in set (0.02 sec)

3.1 与 LeadingHint 的关系

OrderedHintLeadingHint 同时使用时, OrderedHint 将优先于 LeadingHint 。这意味着,即使指定了 LeadingHint ,如果同时存在 OrderedHint ,查询计划将按照 OrderedHint 的规则来执行,而 LeadingHint 将被忽略。

以下是一个示例,展示了当两者同时使用时的情况:

SQL
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
mysql> explain shape plan select /*+ ORDERED LEADING(t1 t2 t3)*/ t1.c1 from t2 join t1 on t1.c1 = t2.c2 join t3 on c2 = c3;
  +--------------------------------------------------------------------------------+
  | Explain String(Nereids Planner)                                                |
  +--------------------------------------------------------------------------------+
  | PhysicalResultSink                                                             |
  | --PhysicalDistribute[DistributionSpecGather]                                   |
  | ----PhysicalProject                                                            |
  | ------hashJoin[INNER_JOIN] hashCondition=((t2.c2 = t3.c3)) otherCondition=()   |
  | --------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
  | ----------PhysicalProject                                                      |
  | ------------PhysicalOlapScan[t2]                                               |
  | ----------PhysicalDistribute[DistributionSpecHash]                             |
  | ------------PhysicalProject                                                    |
  | --------------PhysicalOlapScan[t1]                                             |
  | --------PhysicalDistribute[DistributionSpecHash]                               |
  | ----------PhysicalProject                                                      |
  | ------------PhysicalOlapScan[t3]                                               |
  |                                                                                |
  | Hint log:                                                                      |
  | Used: ORDERED                                                                  |
  | UnUsed: leading(t1 t2 t3)                                                      |
  | SyntaxError:                                                                   |
  +--------------------------------------------------------------------------------+
  18 rows in set (0.02 sec)

4 DistributeHint 使用说明

  • 目前只能指定右表的 Distribute Type ,而且只有 [shuffle][broadcast] 两种。需写在 Join 右表前面。

  • 目前能使用任意个 DistributeHint

  • 当遇到无法正确生成计划的 DistributeHint 时,系统不会显示错误,会按最大努力原则生效,最终以 EXPLAIN 显示的 Distribute 方式为准。

  • OrderedHint 混用

    利用文本序把 Join 顺序固定下来,然后再指定相应的 Join 里面我们预期使用的 Distribute 方式。例如:

    使用前:

    SQL
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    mysql> explain shape plan select count(*) from t1 join t2 on t1.c1 = t2.c2;
    +----------------------------------------------------------------------------------+
    | Explain String(Nereids Planner)                                                  |
    +----------------------------------------------------------------------------------+
    | PhysicalResultSink                                                               |
    | --hashAgg[GLOBAL]                                                                |
    | ----PhysicalDistribute[DistributionSpecGather]                                   |
    | ------hashAgg[LOCAL]                                                             |
    | --------PhysicalProject                                                          |
    | ----------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
    | ------------PhysicalProject                                                      |
    | --------------PhysicalOlapScan[t1]                                               |
    | ------------PhysicalDistribute[DistributionSpecHash]                             |
    | --------------PhysicalProject                                                    |
    | ----------------PhysicalOlapScan[t2]                                             |
    +----------------------------------------------------------------------------------+
    11 rows in set (0.01 sec)
    

    使用后:

    SQL
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    mysql> explain shape plan select /*+ ordered */ count(*) from t2 join[broadcast] t1 on t1.c1 = t2.c2;
    +----------------------------------------------------------------------------------+
    | Explain String(Nereids Planner)                                                  |
    +----------------------------------------------------------------------------------+
    | PhysicalResultSink                                                               |
    | --hashAgg[GLOBAL]                                                                |
    | ----PhysicalDistribute[DistributionSpecGather]                                   |
    | ------hashAgg[LOCAL]                                                             |
    | --------PhysicalProject                                                          |
    | ----------hashJoin[INNER_JOIN] hashCondition=((t1.c1 = t2.c2)) otherCondition=() |
    | ------------PhysicalProject                                                      |
    | --------------PhysicalOlapScan[t2]                                               |
    | ------------PhysicalDistribute[DistributionSpecReplicated]                       |
    | --------------PhysicalProject                                                    |
    | ----------------PhysicalOlapScan[t1]                                             |
    |                                                                                  |
    | Hint log:                                                                        |
    | Used: ORDERED                                                                    |
    | UnUsed:                                                                          |
    | SyntaxError:                                                                     |
    +----------------------------------------------------------------------------------+
    16 rows in set (0.01 sec)
    

    Explain Shape Plan 里面会显示 Distribute 算子相关的信息。其中:

    • DistributionSpecReplicated 表示该算子将对应的数据复制到所有 BE 节点;

    • DistributionSpecGather 表示将数据 GatherFE 节点;

    • DistributionSpecHash 表示将数据按照特定的 hashKey 以及算法打散到不同的 BE 节点。

  • LeadingHint 混用

    在编写 SQL 查询时,我们可以在使用 LEADING 提示的同时,为每个 JOIN 操作指定相应的 DISTRIBUTE 方式。以下是一个具体的例子,展示了如何在 SQL 查询中混合使用 DistributeHintLeadingHint

    SQL
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    explain shape plan
        select
            nation,
            o_year,
            sum(amount) as sum_profit
        from
            (
                select
                    /*+ leading(orders shuffle {lineitem shuffle part} shuffle {supplier broadcast nation} shuffle partsupp)*/
                    n_name as nation,
                    extract(year from o_orderdate) as o_year,
                    l_extendedprice *(1 - l_discount) - ps_supplycost* l_quantity as amount
                from
                    part,
                    supplier,
                    lineitem,
                    partsupp,
                    orders,
                    nation
                where
                    s_suppkey = l_suppkey
                    and ps_suppkey = l_suppkey
                    and ps_partkey = l_partkey
                    and p_partkey = l_partkey
                    and o_orderkey = l_orderkey
                    and s_nationkey = n_nationkey
                    and p_name like '%green%'
            ) as profit
        group by
            nation,
            o_year
        order by
            nation,
            o_year desc;
    

5 附录

5.1 Hint Log

Hint Log 主要用于在执行 EXPLAIN 时显示提示是否生效。其显示位置通常位于 EXPLAIN 输出的最下方。

Hint Log 分为三个状态:

SQL
1
2
3
4
5
6
+---------------------------------+
| Hint log:                       |
| Used:                           |
| UnUsed:                         |
| SyntaxError:                    |
+---------------------------------+
  • Used :表明该提示生效了。

  • UnUsedSyntaxError :都表明该提示未生效。但 SyntaxError 表示提示语法使用错误或该语法不支持,同时会附加不支持的原因信息。