为什么CBO消除在此查询中的“截然不同”,而不是“群体”? -- oracle 领域 和 oracle-11g-r2 领域 和 optimization 领域 和 distinct 领域 dba 相关 的问题

Why does the CBO eliminate `distinct` and not `group by` in this query?


3
vote

问题

中文

CBO选择消除 distinct 在"慢速" 查询中 - 据推测,由于外部 group by ,它可以说明不需要操作。

我的问题是:

1)为什么它选择这样做 - 在这种情况下,我看不到花费的原因和计划中的预测基数

2)如果选择消除 distinct ,为什么不适用相同的逻辑并消除 group by

测试用:

  create table t1 as select rownum product_id, mod(rownum,3)+1 company_id from dual connect by rownum<=500;  create table t2 as select t1.product_id from t1 t1 cross join t1 t12;  create table t3 as select distinct company_id from t1;  analyze table t1 compute statistics; analyze table t2 compute statistics; analyze table t3 compute statistics;   

快(55ms):

  select company_id from t1       join t2 using(product_id)       join ( select company_id            from (select company_id from t1 group by company_id)                  join t3 using(company_id) ) using(company_id) group by company_id;   

慢(5240ms):

  select company_id from t1       join t2 using(product_id)      join ( select company_id              from (select distinct company_id from t1)                   join t3 using(company_id) ) using(company_id) group by company_id;   

sqlfiddle这里。

英文原文

The CBO chooses to eliminate the distinct in the 'slow' query - presumably it can tell that the operation isn't needed because of the outer group by.

My questions are:

1) Why does it choose to do so in this case - I can't see a reason from the costings and predicted cardinalities in the plans

2) If it chooses to eliminate the distinct, why not apply the same logic and eliminate the group by?

testbed:

create table t1 as select rownum product_id, mod(rownum,3)+1 company_id from dual connect by rownum<=500;  create table t2 as select t1.product_id from t1 t1 cross join t1 t12;  create table t3 as select distinct company_id from t1;  analyze table t1 compute statistics; analyze table t2 compute statistics; analyze table t3 compute statistics; 

fast (55ms):

select company_id from t1       join t2 using(product_id)       join ( select company_id            from (select company_id from t1 group by company_id)                  join t3 using(company_id) ) using(company_id) group by company_id; 

slow (5240ms):

select company_id from t1       join t2 using(product_id)      join ( select company_id              from (select distinct company_id from t1)                   join t3 using(company_id) ) using(company_id) group by company_id; 

SQLFiddle here.

           
 
 

回答列表

3
 
vote

输出来自Oracle Linux上的11.2.0.4.6企业版数据库7.1 x86-64平台。

让我们从问题2开始和一个简单的例子。

不同的和群体的处理方式不同:优化器能够在某些情况下完全消除不同的区别,但它不能与组相同。这是一个例子:

  create table t4 as select rownum product_id from dual connect by rownum<=5;  exec dbms_stats.gather_table_stats(user, 'T4');  alter session set statistics_level=all;   

请注意,设置attsistics_level全部显着增加了问题中原始查询的执行时间。

2个查询将是:

  select distinct product_id from t4  Plan hash value: 641655586  ------------------------------------------------------------------------------------------------------------------- | Id  | Operation          | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   |  OMem |  1Mem | Used-Mem | ------------------------------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT   |      |      1 |        |     4 (100)|      5 |00:00:00.01 |       |       |          | |   1 |  HASH UNIQUE       |      |      1 |      5 |     4  (25)|      5 |00:00:00.01 |  2441K|  2441K| 1503K (0)| |   2 |   TABLE ACCESS FULL| T4   |      1 |      5 |     3   (0)|      5 |00:00:00.01 |       |       |          | -------------------------------------------------------------------------------------------------------------------  select product_id from t4 group by product_id  Plan hash value: 581042373  ------------------------------------------------------------------------------------------------------------------- | Id  | Operation          | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   |  OMem |  1Mem | Used-Mem | ------------------------------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT   |      |      1 |        |     4 (100)|      5 |00:00:00.01 |       |       |          | |   1 |  HASH GROUP BY     |      |      1 |      5 |     4  (25)|      5 |00:00:00.01 |  2441K|  2441K|  863K (0)| |   2 |   TABLE ACCESS FULL| T4   |      1 |      5 |     3   (0)|      5 |00:00:00.01 |       |       |          | -------------------------------------------------------------------------------------------------------------------   

这就是我们期望的。全表扫描和散列独特,散列集团的组。现在添加一个不是null约束和索引。

  create index t4_i1 on t4(product_id); alter table t4 modify (product_id not null);  select distinct product_id from t4  Plan hash value: 4231414870  ----------------------------------------------------------------------------------------- | Id  | Operation          | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | ----------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT   |       |      1 |        |     2 (100)|      5 |00:00:00.01 | |   1 |  SORT UNIQUE NOSORT|       |      1 |      5 |     2  (50)|      5 |00:00:00.01 | |   2 |   INDEX FULL SCAN  | T4_I1 |      1 |      5 |     1   (0)|      5 |00:00:00.01 | -----------------------------------------------------------------------------------------  select product_id from t4 group by product_id  Plan hash value: 1989519822  ------------------------------------------------------------------------------------------- | Id  | Operation            | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | ------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT     |       |      1 |        |     1 (100)|      5 |00:00:00.01 | |   1 |  SORT GROUP BY NOSORT|       |      1 |      5 |     1   (0)|      5 |00:00:00.01 | |   2 |   INDEX FULL SCAN    | T4_I1 |      1 |      5 |     1   (0)|      5 |00:00:00.01 | -------------------------------------------------------------------------------------------   

优化器注意到相关列上的索引,由于它能够使用它,避免对用于产生唯一值的数据进行排序,因为数据已在索引中排序。现在为此列添加一个唯一的约束:

  alter table t4 add constraint t4_u1 unique (product_id) using index t4_i1;  select distinct product_id from t4  Plan hash value: 3974767428  --------------------------------------------------------------------------------------- | Id  | Operation        | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | --------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT |       |      1 |        |     1 (100)|      5 |00:00:00.01 | |   1 |  INDEX FULL SCAN | T4_I1 |      1 |      5 |     1   (0)|      5 |00:00:00.01 | ---------------------------------------------------------------------------------------  select product_id from t4 group by product_id  Plan hash value: 1989519822  ------------------------------------------------------------------------------------------- | Id  | Operation            | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | ------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT     |       |      1 |        |     1 (100)|      5 |00:00:00.01 | |   1 |  SORT GROUP BY NOSORT|       |      1 |      5 |     1   (0)|      5 |00:00:00.01 | |   2 |   INDEX FULL SCAN    | T4_I1 |      1 |      5 |     1   (0)|      5 |00:00:00.01 | -------------------------------------------------------------------------------------------   

注意在使用不同时,Oracle如何完全跳过排序/哈希唯一,但是通过查询与该组没有任何更改。

在运行不同版本时启用追踪优化器:

  alter system flush shared_pool; alter session set events '10053 trace name context forever, level 1'; select distinct product_id from t4;   

在跟踪文件中,我们可以看到以下内容:

  ************************** Query transformations (QT) ************************** ... Eliminated SELECT DISTINCT from query block SEL$1 (#0) ... ******************************** COST-BASED QUERY TRANSFORMATIONS ******************************** ... Final query after transformations:******* UNPARSED QUERY IS ******* SELECT "T4"."PRODUCT_ID" "PRODUCT_ID" FROM "BP"."T4" "T4" ...   

请注意,这是一个查询转换,但不是基于成本的查询转换。您可以从"最终查询" 部分中看到,优化器从查询中删除了不同的。 但是对群体没有这种优化。 DISTINCT用于检索不同的值,但是使用组用于产生聚合,而不仅仅是不同的值。即使优化器可以跳过排序或散列数据,它也无法跳过计数,添加,计算平均等,这是重要的区别,因此截然不同,群体不会以相同的方式处理(即使聚集在一起未指定)。

另一个消除不同部分的情况,当显然不必要时,例如:

  alter session set "_complex_view_merging"=false;   

这被设置为防止复杂的视图合并变换,如不同的放置,组通过放置,子查询,不确定出现并使事物复杂化。

  select distinct product_id from (select distinct product_id from (select distinct product_id from (select distinct product_id from t4)));  Plan hash value: 3974767428  --------------------------------------------------------------------------------------- | Id  | Operation        | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | --------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT |       |      1 |        |     1 (100)|      5 |00:00:00.01 | |   1 |  INDEX FULL SCAN | T4_I1 |      1 |      5 |     1   (0)|      5 |00:00:00.01 | ---------------------------------------------------------------------------------------  select product_id from (select distinct product_id from (select product_id from (select distinct product_id from t4 ) group by product_id)) group by product_id;  Plan hash value: 4029011489  ------------------------------------------------------------------------------------------------------------------------ | Id  | Operation              | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   |  OMem |  1Mem | Used-Mem | ------------------------------------------------------------------------------------------------------------------------ |   0 | SELECT STATEMENT       |       |      1 |        |     1 (100)|      5 |00:00:00.01 |       |       |          | |   1 |  HASH GROUP BY         |       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |  2441K|  2441K|  862K (0)| |   2 |   VIEW                 |       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |       |       |          | |   3 |    SORT GROUP BY NOSORT|       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |       |       |          | |   4 |     INDEX FULL SCAN    | T4_I1 |      1 |      5 |     1   (0)|      5 |00:00:00.01 |       |       |          | ------------------------------------------------------------------------------------------------------------------------  select product_id from (select product_id from (select product_id from (select product_id from t4 group by product_id) group by product_id) group by product_id) group by product_id;  Plan hash value: 1970696362  ------------------------------------------------------------------------------------------------------------------------ | Id  | Operation              | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   |  OMem |  1Mem | Used-Mem | ------------------------------------------------------------------------------------------------------------------------ |   0 | SELECT STATEMENT       |       |      1 |        |     1 (100)|      5 |00:00:00.01 |       |       |          | |   1 |  HASH GROUP BY         |       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |  2441K|  2441K|  883K (0)| |   2 |   VIEW                 |       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |       |       |          | |   3 |    HASH GROUP BY       |       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |  2441K|  2441K|  848K (0)| |   4 |     VIEW               |       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |       |       |          | |   5 |      HASH GROUP BY     |       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |  2441K|  2441K|  840K (0)| |   6 |       VIEW             |       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |       |       |          | |   7 |        HASH GROUP BY   |       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |  2441K|  2441K|  848K (0)| |   8 |         INDEX FULL SCAN| T4_I1 |      1 |      5 |     1   (0)|      5 |00:00:00.01 |       |       |          | ------------------------------------------------------------------------------------------------------------------------   

注意,优化器如何完全消除了第一和第二个查询中的不同,但不是群组。

遗憾的是在这些情况下,由于子查询中发生了不同的消除("视图" ),因此该信息不存在于优化器轨迹中,就像问题中的原始查询一样。 所以现在我们知道不同的和群体的确实不同,让我们回到问题1。

在下一个帖子中继续...

(两个答案一起超过30000个字符的限制。)

 

Output is from a 11.2.0.4.6 Enterprise Edition database on Oracle Linux 7.1 x86-64 platform.

Lets start with question 2 and an easy example.

DISTINCT and GROUP BY are handled differently: the optimizer is able to completely eliminate a DISTINCT under certain circumstances, but it can not do the same with GROUP BY. Here is an example:

create table t4 as select rownum product_id from dual connect by rownum<=5;  exec dbms_stats.gather_table_stats(user, 'T4');  alter session set statistics_level=all; 

Note that setting statistics_level to ALL significantly increases the execution time for the original queries in the question.

The 2 queries will be:

select distinct product_id from t4  Plan hash value: 641655586  ------------------------------------------------------------------------------------------------------------------- | Id  | Operation          | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   |  OMem |  1Mem | Used-Mem | ------------------------------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT   |      |      1 |        |     4 (100)|      5 |00:00:00.01 |       |       |          | |   1 |  HASH UNIQUE       |      |      1 |      5 |     4  (25)|      5 |00:00:00.01 |  2441K|  2441K| 1503K (0)| |   2 |   TABLE ACCESS FULL| T4   |      1 |      5 |     3   (0)|      5 |00:00:00.01 |       |       |          | -------------------------------------------------------------------------------------------------------------------  select product_id from t4 group by product_id  Plan hash value: 581042373  ------------------------------------------------------------------------------------------------------------------- | Id  | Operation          | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   |  OMem |  1Mem | Used-Mem | ------------------------------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT   |      |      1 |        |     4 (100)|      5 |00:00:00.01 |       |       |          | |   1 |  HASH GROUP BY     |      |      1 |      5 |     4  (25)|      5 |00:00:00.01 |  2441K|  2441K|  863K (0)| |   2 |   TABLE ACCESS FULL| T4   |      1 |      5 |     3   (0)|      5 |00:00:00.01 |       |       |          | ------------------------------------------------------------------------------------------------------------------- 

This is what we expect. Full table scan and HASH UNIQUE for DISTINCT, HASH GROUP BY for GROUP BY. Now add a NOT NULL constraint and an index.

create index t4_i1 on t4(product_id); alter table t4 modify (product_id not null);  select distinct product_id from t4  Plan hash value: 4231414870  ----------------------------------------------------------------------------------------- | Id  | Operation          | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | ----------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT   |       |      1 |        |     2 (100)|      5 |00:00:00.01 | |   1 |  SORT UNIQUE NOSORT|       |      1 |      5 |     2  (50)|      5 |00:00:00.01 | |   2 |   INDEX FULL SCAN  | T4_I1 |      1 |      5 |     1   (0)|      5 |00:00:00.01 | -----------------------------------------------------------------------------------------  select product_id from t4 group by product_id  Plan hash value: 1989519822  ------------------------------------------------------------------------------------------- | Id  | Operation            | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | ------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT     |       |      1 |        |     1 (100)|      5 |00:00:00.01 | |   1 |  SORT GROUP BY NOSORT|       |      1 |      5 |     1   (0)|      5 |00:00:00.01 | |   2 |   INDEX FULL SCAN    | T4_I1 |      1 |      5 |     1   (0)|      5 |00:00:00.01 | ------------------------------------------------------------------------------------------- 

The optimizer noticed the index on the relevant column, and because of the NOT NULL constraint it is able to use it avoid sorting the data for producing the unique values, because the data is already sorted in the index. Now add a UNIQUE constraint to this column:

alter table t4 add constraint t4_u1 unique (product_id) using index t4_i1;  select distinct product_id from t4  Plan hash value: 3974767428  --------------------------------------------------------------------------------------- | Id  | Operation        | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | --------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT |       |      1 |        |     1 (100)|      5 |00:00:00.01 | |   1 |  INDEX FULL SCAN | T4_I1 |      1 |      5 |     1   (0)|      5 |00:00:00.01 | ---------------------------------------------------------------------------------------  select product_id from t4 group by product_id  Plan hash value: 1989519822  ------------------------------------------------------------------------------------------- | Id  | Operation            | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | ------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT     |       |      1 |        |     1 (100)|      5 |00:00:00.01 | |   1 |  SORT GROUP BY NOSORT|       |      1 |      5 |     1   (0)|      5 |00:00:00.01 | |   2 |   INDEX FULL SCAN    | T4_I1 |      1 |      5 |     1   (0)|      5 |00:00:00.01 | ------------------------------------------------------------------------------------------- 

Notice how Oracle completely skipped SORT/HASH UNIQUE when using DISTINCT, but nothing changed with the GROUP BY query.

Enable tracing the optimizer when running the DISTINCT version:

alter system flush shared_pool; alter session set events '10053 trace name context forever, level 1'; select distinct product_id from t4; 

In the trace file we can see the following:

************************** Query transformations (QT) ************************** ... Eliminated SELECT DISTINCT from query block SEL$1 (#0) ... ******************************** COST-BASED QUERY TRANSFORMATIONS ******************************** ... Final query after transformations:******* UNPARSED QUERY IS ******* SELECT "T4"."PRODUCT_ID" "PRODUCT_ID" FROM "BP"."T4" "T4" ... 

Note that this is a Query Transformation, but NOT a Cost-based Query Transformation. As you can see from the "Final query" part, the optimizer removed DISTINCT from the query. But there is no such optimization for GROUP BY. DISTINCT is used for retrieving distinct values, but GROUP BY is used for producing aggregates, not just distinct values. Even if the optimizer can skip sorting or hashing the data, it can not skip counting, adding, calculating the average, etc, and this is the important difference, so DISTINCT and GROUP BY are not handled in the same way (even if aggregates are not specified).

Another case of eliminating the DISTINCT part, when it is obviously unnecessary, for example:

alter session set "_complex_view_merging"=false; 

This was set to prevent Complex View Merging Transformations like DISTINCT Placement, GROUP BY Placement, Subquery Unnesting appearing and making things complicated.

select distinct product_id from (select distinct product_id from (select distinct product_id from (select distinct product_id from t4)));  Plan hash value: 3974767428  --------------------------------------------------------------------------------------- | Id  | Operation        | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | --------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT |       |      1 |        |     1 (100)|      5 |00:00:00.01 | |   1 |  INDEX FULL SCAN | T4_I1 |      1 |      5 |     1   (0)|      5 |00:00:00.01 | ---------------------------------------------------------------------------------------  select product_id from (select distinct product_id from (select product_id from (select distinct product_id from t4 ) group by product_id)) group by product_id;  Plan hash value: 4029011489  ------------------------------------------------------------------------------------------------------------------------ | Id  | Operation              | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   |  OMem |  1Mem | Used-Mem | ------------------------------------------------------------------------------------------------------------------------ |   0 | SELECT STATEMENT       |       |      1 |        |     1 (100)|      5 |00:00:00.01 |       |       |          | |   1 |  HASH GROUP BY         |       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |  2441K|  2441K|  862K (0)| |   2 |   VIEW                 |       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |       |       |          | |   3 |    SORT GROUP BY NOSORT|       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |       |       |          | |   4 |     INDEX FULL SCAN    | T4_I1 |      1 |      5 |     1   (0)|      5 |00:00:00.01 |       |       |          | ------------------------------------------------------------------------------------------------------------------------  select product_id from (select product_id from (select product_id from (select product_id from t4 group by product_id) group by product_id) group by product_id) group by product_id;  Plan hash value: 1970696362  ------------------------------------------------------------------------------------------------------------------------ | Id  | Operation              | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   |  OMem |  1Mem | Used-Mem | ------------------------------------------------------------------------------------------------------------------------ |   0 | SELECT STATEMENT       |       |      1 |        |     1 (100)|      5 |00:00:00.01 |       |       |          | |   1 |  HASH GROUP BY         |       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |  2441K|  2441K|  883K (0)| |   2 |   VIEW                 |       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |       |       |          | |   3 |    HASH GROUP BY       |       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |  2441K|  2441K|  848K (0)| |   4 |     VIEW               |       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |       |       |          | |   5 |      HASH GROUP BY     |       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |  2441K|  2441K|  840K (0)| |   6 |       VIEW             |       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |       |       |          | |   7 |        HASH GROUP BY   |       |      1 |      5 |     1   (0)|      5 |00:00:00.01 |  2441K|  2441K|  848K (0)| |   8 |         INDEX FULL SCAN| T4_I1 |      1 |      5 |     1   (0)|      5 |00:00:00.01 |       |       |          | ------------------------------------------------------------------------------------------------------------------------ 

Notice how the optimizer completetely eliminated the DISTINCTs in the first and second queries, but not the GROUP BYs.

Unfortunately in these cases, since the DISTINCT elimination happens in subqueries ("views"), this information is not present in the optimizer trace, just like for the original queries in the question.

So now we know that DISTINCT and GROUP BY are handled indeed differently, lets go back to question 1.

To be continued in the next post...

(Both answers together exceed the 30000 characters limit.)

 
 
1
 
vote

继续从之前的帖子。

所以现在我们知道不同的和群体的确实不同,让我们回到问题1。

知道上面,回到原来,较慢的查询,这个:

  select company_id from t1       join t2 using(product_id)      join ( select company_id              from (select distinct company_id from t1)                   join t3 using(company_id) ) using(company_id) group by company_id;   

相当于此(Distinct Deact):

  select company_id from t1       join t2 using(product_id)      join ( select company_id              from (select company_id from t1)                   join t3 using(company_id) ) using(company_id) group by company_id;   

您可以通过为两个查询启用优化程序跟踪来检查此项,这是2运行的本质:

  select company_id from t1       join t2 using(product_id)      join ( select company_id              from (select distinct company_id from t1)                   join t3 using(company_id) ) using(company_id) group by company_id;  Final query after transformations:******* UNPARSED QUERY IS ******* SELECT "T3"."COMPANY_ID" "COMPANY_ID" FROM "BP"."T1" "T1","BP"."T2" "T2","BP"."T1" "T1","BP"."T3" "T3" WHERE "T1"."COMPANY_ID"="T3"."COMPANY_ID" AND "T1"."COMPANY_ID"="T3"."COMPANY_ID" AND "T1"."PRODUCT_ID"="T2"."PRODUCT_ID" GROUP BY "T3"."COMPANY_ID"  Plan hash value: 1403596148  -------------------------------------------------------------------------------------------------------------------------------- | Id  | Operation             | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem | -------------------------------------------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT      |      |      1 |        |  1242 (100)|      3 |00:02:13.42 |     390 |       |       |          | |   1 |  HASH GROUP BY        |      |      1 |      3 |  1242  (91)|      3 |00:02:13.42 |     390 |  1610M|    50M| 1229K (0)| |*  2 |   HASH JOIN           |      |      1 |     41M|   210  (44)|     41M|00:01:07.27 |     390 |  2440K|  2440K|  956K (0)| |   3 |    TABLE ACCESS FULL  | T1   |      1 |    500 |     3   (0)|    500 |00:00:00.01 |       2 |       |       |          | |*  4 |    HASH JOIN          |      |      1 |    250K|   116   (1)|    250K|00:00:02.05 |     388 |  1969K|  1969K| 1581K (0)| |*  5 |     HASH JOIN         |      |      1 |    500 |     6   (0)|    500 |00:00:00.01 |       4 |  2440K|  2440K|  980K (0)| |   6 |      TABLE ACCESS FULL| T3   |      1 |      3 |     3   (0)|      3 |00:00:00.01 |       2 |       |       |          | |   7 |      TABLE ACCESS FULL| T1   |      1 |    500 |     3   (0)|    500 |00:00:00.01 |       2 |       |       |          | |   8 |     TABLE ACCESS FULL | T2   |      1 |    250K|   110   (1)|    250K|00:00:00.43 |     384 |       |       |          | --------------------------------------------------------------------------------------------------------------------------------  Predicate Information (identified by operation id): ---------------------------------------------------     2 - access("COMPANY_ID"="T3"."COMPANY_ID")    4 - access("T1"."PRODUCT_ID"="T2"."PRODUCT_ID")    5 - access("T1"."COMPANY_ID"="T3"."COMPANY_ID")  select company_id from t1       join t2 using(product_id)      join ( select company_id              from (select company_id from t1)                   join t3 using(company_id) ) using(company_id) group by company_id;  Final query after transformations:******* UNPARSED QUERY IS ******* SELECT "T3"."COMPANY_ID" "COMPANY_ID" FROM "BP"."T1" "T1","BP"."T2" "T2","BP"."T1" "T1","BP"."T3" "T3" WHERE "T1"."COMPANY_ID"="T3"."COMPANY_ID" AND "T1"."COMPANY_ID"="T3"."COMPANY_ID" AND "T1"."PRODUCT_ID"="T2"."PRODUCT_ID" GROUP BY "T3"."COMPANY_ID"  Plan hash value: 1403596148  -------------------------------------------------------------------------------------------------------------------------------- | Id  | Operation             | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem | -------------------------------------------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT      |      |      1 |        |  1242 (100)|      3 |00:02:12.94 |     390 |       |       |          | |   1 |  HASH GROUP BY        |      |      1 |      3 |  1242  (91)|      3 |00:02:12.94 |     390 |  1610M|    50M| 1254K (0)| |*  2 |   HASH JOIN           |      |      1 |     41M|   210  (44)|     41M|00:01:07.23 |     390 |  2440K|  2440K| 1025K (0)| |   3 |    TABLE ACCESS FULL  | T1   |      1 |    500 |     3   (0)|    500 |00:00:00.01 |       2 |       |       |          | |*  4 |    HASH JOIN          |      |      1 |    250K|   116   (1)|    250K|00:00:02.04 |     388 |  1969K|  1969K| 1581K (0)| |*  5 |     HASH JOIN         |      |      1 |    500 |     6   (0)|    500 |00:00:00.01 |       4 |  2440K|  2440K|  983K (0)| |   6 |      TABLE ACCESS FULL| T3   |      1 |      3 |     3   (0)|      3 |00:00:00.01 |       2 |       |       |          | |   7 |      TABLE ACCESS FULL| T1   |      1 |    500 |     3   (0)|    500 |00:00:00.01 |       2 |       |       |          | |   8 |     TABLE ACCESS FULL | T2   |      1 |    250K|   110   (1)|    250K|00:00:00.44 |     384 |       |       |          | --------------------------------------------------------------------------------------------------------------------------------  Predicate Information (identified by operation id): ---------------------------------------------------     2 - access("COMPANY_ID"="T3"."COMPANY_ID")    4 - access("T1"."PRODUCT_ID"="T2"."PRODUCT_ID")    5 - access("T1"."COMPANY_ID"="T3"."COMPANY_ID")   

正如您所看到的,2个查询的最终表单("最终查询" )是相同的,其执行计划也是一样的。查询运行2分钟,因为统计信息_LEVEL设置为所有才能收集计划统计信息。

所以知道可以从查询中消除不同,查看此查询的文本:
  select company_id from t1       join t2 using(product_id)      join ( select company_id              from (select company_id from t1)                   join t3 using(company_id) ) using(company_id) group by company_id;   

有一个子查询(从现在开始看"):

    ( select company_id      from (select company_id from t1)           join t3 using(company_id) ) using(company_id)   

最初包含不同的内视图,因为它,它有资格复杂的视图合并。但是,随着另一个查询转换之前删除的,此视图将成为一个简单的SPJ(Select-Project-Join)视图,现在有资格符合简单的视图合并。以下是一些有关查看合并的有趣信息:

https://docs.oracle.com/database/121/ tgsql / tgsql_transform.htm#tgsql209

本件在这种情况下尤为重要:

"对于某些简单的视图,其中合并总是导致更好 计划,优化器在不考虑的情况下自动合并视图 成本。"

所以如上所述,Plan的成本具有视图合并的是1242.让我们通过View Merging禁用禁用费用:

  SQL> set timing on SQL> alter session set statistics_level=typical;  select company_id from t1       join t2 using(product_id)      join ( select company_id              from (select /*+ no_merge */ distinct company_id from t1)                   join t3 using(company_id) ) using(company_id) group by company_id;  COMPANY_ID ----------          1          2          3  Elapsed: 00:00:00.03  Plan hash value: 3348277023  -------------------------------------------------------------------------------- | Id  | Operation               | Name | Rows  | Bytes | Cost (%CPU)| Time     | -------------------------------------------------------------------------------- |   0 | SELECT STATEMENT        |      |       |       |   126 (100)|          | |   1 |  HASH GROUP BY          |      |     3 |    51 |   126   (7)| 00:00:02 | |*  2 |   HASH JOIN             |      |   250K|  4150K|   120   (2)| 00:00:02 | |*  3 |    HASH JOIN            |      |   500 |  6500 |    10  (10)| 00:00:01 | |*  4 |     HASH JOIN           |      |     3 |    18 |     7  (15)| 00:00:01 | |   5 |      VIEW               |      |     3 |     9 |     4  (25)| 00:00:01 | |   6 |       HASH UNIQUE       |      |     3 |     9 |     4  (25)| 00:00:01 | |   7 |        TABLE ACCESS FULL| T1   |   500 |  1500 |     3   (0)| 00:00:01 | |   8 |      TABLE ACCESS FULL  | T3   |     3 |     9 |     3   (0)| 00:00:01 | |   9 |     TABLE ACCESS FULL   | T1   |   500 |  3500 |     3   (0)| 00:00:01 | |  10 |    TABLE ACCESS FULL    | T2   |   250K|   976K|   110   (1)| 00:00:02 | --------------------------------------------------------------------------------  Predicate Information (identified by operation id): ---------------------------------------------------     2 - access("T1"."PRODUCT_ID"="T2"."PRODUCT_ID")    3 - access("T1"."COMPANY_ID"="T3"."COMPANY_ID")    4 - access("from$_subquery$_005"."COMPANY_ID"="T3"."COMPANY_ID")   

而不合并视图,成本为126.所以Oracle本可以选择不转换查询,使用该计划具有126成本,而不是它,而不是检查成本的情况,并选择了一个计划成本高10倍= 1242.文件中的上述引用句子解释了这种行为。

如果检查原始查询的执行计划,则可以在谓词部分中查看执行计划中缺少查看合并 - 查看操作,并"从$ _subquery $ _005" 部分:

  Plan hash value: 54875574  ---------------------------------------------------------------------------------------------------------------------------------- | Id  | Operation               | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem | ---------------------------------------------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT        |      |      1 |        |   126 (100)|      3 |00:00:02.31 |     390 |       |       |          | |   1 |  HASH GROUP BY          |      |      1 |      3 |   126   (7)|      3 |00:00:02.31 |     390 |    11M|  3254K| 1246K (0)| |*  2 |   HASH JOIN             |      |      1 |    250K|   120   (2)|    250K|00:00:01.91 |     390 |  1969K|  1969K| 1606K (0)| |*  3 |    HASH JOIN            |      |      1 |    500 |    10  (10)|    500 |00:00:00.01 |       6 |  2440K|  2440K|  978K (0)| |*  4 |     HASH JOIN           |      |      1 |      3 |     7  (15)|      3 |00:00:00.01 |       4 |  2440K|  2440K|  980K (0)| |   5 |      VIEW               |      |      1 |      3 |     4  (25)|      3 |00:00:00.01 |       2 |       |       |          | |   6 |       HASH GROUP BY     |      |      1 |      3 |     4  (25)|      3 |00:00:00.01 |       2 |  2441K|  2441K|  717K (0)| |   7 |        TABLE ACCESS FULL| T1   |      1 |    500 |     3   (0)|    500 |00:00:00.01 |       2 |       |       |          | |   8 |      TABLE ACCESS FULL  | T3   |      1 |      3 |     3   (0)|      3 |00:00:00.01 |       2 |       |       |          | |   9 |     TABLE ACCESS FULL   | T1   |      1 |    500 |     3   (0)|    500 |00:00:00.01 |       2 |       |       |          | |  10 |    TABLE ACCESS FULL    | T2   |      1 |    250K|   110   (1)|    250K|00:00:00.38 |     384 |       |       |          | ----------------------------------------------------------------------------------------------------------------------------------  Predicate Information (identified by operation id): ---------------------------------------------------     2 - access("T1"."PRODUCT_ID"="T2"."PRODUCT_ID")    3 - access("T1"."COMPANY_ID"="T3"."COMPANY_ID")    4 - access("from$_subquery$_005"."COMPANY_ID"="T3"."COMPANY_ID")   

如果我们强制查看此查询的融合:

  explain plan for  select company_id from t1       join t2 using(product_id)       join ( select /*+ merge */ company_id            from (select /*+ merge */ company_id from t1 group by company_id)                  join t3 using(company_id) ) using(company_id) group by company_id;  select * from table(dbms_xplan.display);   PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Plan hash value: 1480121551  ------------------------------------------------------------------------------------- | Id  | Operation               | Name      | Rows  | Bytes | Cost (%CPU)| Time     | ------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT        |           |     3 |     9 |  1242  (91)| 00:00:15 | |   1 |  HASH GROUP BY          |           |     3 |     9 |  1242  (91)| 00:00:15 | |   2 |   VIEW                  | VM_NWVW_1 |     7 |    21 |  1242  (91)| 00:00:15 | |   3 |    HASH GROUP BY        |           |     7 |   119 |  1242  (91)| 00:00:15 | |*  4 |     HASH JOIN           |           |    41M|   675M|   210  (44)| 00:00:03 | |   5 |      TABLE ACCESS FULL  | T1        |   500 |  1500 |     3   (0)| 00:00:01 | |*  6 |      HASH JOIN          |           |   250K|  3417K|   116   (1)| 00:00:02 | |*  7 |       HASH JOIN         |           |   500 |  5000 |     6   (0)| 00:00:01 | |   8 |        TABLE ACCESS FULL| T3        |     3 |     9 |     3   (0)| 00:00:01 | |   9 |        TABLE ACCESS FULL| T1        |   500 |  3500 |     3   (0)| 00:00:01 | |  10 |       TABLE ACCESS FULL | T2        |   250K|   976K|   110   (1)| 00:00:02 | -------------------------------------------------------------------------------------  Predicate Information (identified by operation id): ---------------------------------------------------     4 - access("COMPANY_ID"="T3"."COMPANY_ID")    6 - access("T1"."PRODUCT_ID"="T2"."PRODUCT_ID")    7 - access("T1"."COMPANY_ID"="T3"."COMPANY_ID")   

我们可以看到成本为1242,而内部视图VM_NWVW_1由于基于成本的查询转换而出现。由于群组,这只能是复杂的视图合并和基于成本的查询转换,其中成本不忽视,这就是为什么Oracle没有选择这个计划。

所以知道,即使没有基于成本的查询转换,也可以更容易地消除不同的,并且我们不需要聚合结果,因此可以更好地使用双重不同版本:
  select distinct company_id from t1       join t2 using(product_id)       join ( select company_id            from (select distinct company_id from t1)                  join t3 using(company_id) ) using(company_id) ;  Plan hash value: 3914171991  ----------------------------------------------------------------------------------------------------------------------------------- | Id  | Operation               | Name            | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   |  OMem |  1Mem | Used-Mem | ----------------------------------------------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT        |                 |      1 |        |   126 (100)|      3 |00:00:02.34 |       |       |          | |   1 |  HASH UNIQUE            |                 |      1 |      3 |   126   (7)|      3 |00:00:02.34 |    11M|  3254K|  746K (0)| |*  2 |   HASH JOIN             |                 |      1 |    250K|   120   (2)|    250K|00:00:01.93 |  1969K|  1969K| 1602K (0)| |*  3 |    HASH JOIN            |                 |      1 |    500 |    10  (10)|    500 |00:00:00.01 |  2440K|  2440K|  978K (0)| |*  4 |     HASH JOIN           |                 |      1 |      3 |     7  (15)|      3 |00:00:00.01 |  2440K|  2440K|  986K (0)| |   5 |      VIEW               | VW_DTP_09DF12B3 |      1 |      3 |     4  (25)|      3 |00:00:00.01 |       |       |          | |   6 |       HASH UNIQUE       |                 |      1 |      3 |     4  (25)|      3 |00:00:00.01 |  2441K|  2441K|  716K (0)| |   7 |        TABLE ACCESS FULL| T1              |      1 |    500 |     3   (0)|    500 |00:00:00.01 |       |       |          | |   8 |      TABLE ACCESS FULL  | T3              |      1 |      3 |     3   (0)|      3 |00:00:00.01 |       |       |          | |   9 |     TABLE ACCESS FULL   | T1              |      1 |    500 |     3   (0)|    500 |00:00:00.01 |       |       |          | |  10 |    TABLE ACCESS FULL    | T2              |      1 |    250K|   110   (1)|    250K|00:00:00.38 |       |       |          | -----------------------------------------------------------------------------------------------------------------------------------  Predicate Information (identified by operation id): ---------------------------------------------------     2 - access("T1"."PRODUCT_ID"="T2"."PRODUCT_ID")    3 - access("T1"."COMPANY_ID"="T3"."COMPANY_ID")    4 - access("ITEM_1"="T3"."COMPANY_ID")   

注意:使用statistic_level =典型的运行时约为0,04 ms:

  SQL> alter session set statistics_level=typical;  Session altered.  Elapsed: 00:00:00.00  SQL> select distinct company_id from t1      join t2 using(product_id)      join ( select company_id            from (select distinct company_id from t1)                 join t3 using(company_id) ) using(company_id) ;  COMPANY_ID ----------          1          2          3  Elapsed: 00:00:00.04   

注意,这结果是由于基于成本的实质性转换(VW_DTP _...内部视图),另一个执行计划。

另一个解决方案是防止简单的视图与/ * + no_merge * /如前所述合并。

但是,在12C(12.1.0.2.3,Oracle Linux x86-64)上,查看合并工作非常好,因为数据库选择哈希连接半代替哈希连接,并且执行时间与其他变体一样快:
  select company_id from t1       join t2 using(product_id)      join ( select company_id              from (select company_id from t1)                   join t3 using(company_id) ) using(company_id) group by company_id; 0  
 

Continued from previous post.

So now we know that DISTINCT and GROUP BY are handled indeed differently, lets go back to question 1.

Knowing the above, going back to the original, slower query, this:

select company_id from t1       join t2 using(product_id)      join ( select company_id              from (select distinct company_id from t1)                   join t3 using(company_id) ) using(company_id) group by company_id; 

is equivalent to this (DISTINCT removed):

select company_id from t1       join t2 using(product_id)      join ( select company_id              from (select company_id from t1)                   join t3 using(company_id) ) using(company_id) group by company_id; 

You can check this by enabling the optimizer trace for both queries, here is the essence from the 2 runs:

select company_id from t1       join t2 using(product_id)      join ( select company_id              from (select distinct company_id from t1)                   join t3 using(company_id) ) using(company_id) group by company_id;  Final query after transformations:******* UNPARSED QUERY IS ******* SELECT "T3"."COMPANY_ID" "COMPANY_ID" FROM "BP"."T1" "T1","BP"."T2" "T2","BP"."T1" "T1","BP"."T3" "T3" WHERE "T1"."COMPANY_ID"="T3"."COMPANY_ID" AND "T1"."COMPANY_ID"="T3"."COMPANY_ID" AND "T1"."PRODUCT_ID"="T2"."PRODUCT_ID" GROUP BY "T3"."COMPANY_ID"  Plan hash value: 1403596148  -------------------------------------------------------------------------------------------------------------------------------- | Id  | Operation             | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem | -------------------------------------------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT      |      |      1 |        |  1242 (100)|      3 |00:02:13.42 |     390 |       |       |          | |   1 |  HASH GROUP BY        |      |      1 |      3 |  1242  (91)|      3 |00:02:13.42 |     390 |  1610M|    50M| 1229K (0)| |*  2 |   HASH JOIN           |      |      1 |     41M|   210  (44)|     41M|00:01:07.27 |     390 |  2440K|  2440K|  956K (0)| |   3 |    TABLE ACCESS FULL  | T1   |      1 |    500 |     3   (0)|    500 |00:00:00.01 |       2 |       |       |          | |*  4 |    HASH JOIN          |      |      1 |    250K|   116   (1)|    250K|00:00:02.05 |     388 |  1969K|  1969K| 1581K (0)| |*  5 |     HASH JOIN         |      |      1 |    500 |     6   (0)|    500 |00:00:00.01 |       4 |  2440K|  2440K|  980K (0)| |   6 |      TABLE ACCESS FULL| T3   |      1 |      3 |     3   (0)|      3 |00:00:00.01 |       2 |       |       |          | |   7 |      TABLE ACCESS FULL| T1   |      1 |    500 |     3   (0)|    500 |00:00:00.01 |       2 |       |       |          | |   8 |     TABLE ACCESS FULL | T2   |      1 |    250K|   110   (1)|    250K|00:00:00.43 |     384 |       |       |          | --------------------------------------------------------------------------------------------------------------------------------  Predicate Information (identified by operation id): ---------------------------------------------------     2 - access("COMPANY_ID"="T3"."COMPANY_ID")    4 - access("T1"."PRODUCT_ID"="T2"."PRODUCT_ID")    5 - access("T1"."COMPANY_ID"="T3"."COMPANY_ID")  select company_id from t1       join t2 using(product_id)      join ( select company_id              from (select company_id from t1)                   join t3 using(company_id) ) using(company_id) group by company_id;  Final query after transformations:******* UNPARSED QUERY IS ******* SELECT "T3"."COMPANY_ID" "COMPANY_ID" FROM "BP"."T1" "T1","BP"."T2" "T2","BP"."T1" "T1","BP"."T3" "T3" WHERE "T1"."COMPANY_ID"="T3"."COMPANY_ID" AND "T1"."COMPANY_ID"="T3"."COMPANY_ID" AND "T1"."PRODUCT_ID"="T2"."PRODUCT_ID" GROUP BY "T3"."COMPANY_ID"  Plan hash value: 1403596148  -------------------------------------------------------------------------------------------------------------------------------- | Id  | Operation             | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem | -------------------------------------------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT      |      |      1 |        |  1242 (100)|      3 |00:02:12.94 |     390 |       |       |          | |   1 |  HASH GROUP BY        |      |      1 |      3 |  1242  (91)|      3 |00:02:12.94 |     390 |  1610M|    50M| 1254K (0)| |*  2 |   HASH JOIN           |      |      1 |     41M|   210  (44)|     41M|00:01:07.23 |     390 |  2440K|  2440K| 1025K (0)| |   3 |    TABLE ACCESS FULL  | T1   |      1 |    500 |     3   (0)|    500 |00:00:00.01 |       2 |       |       |          | |*  4 |    HASH JOIN          |      |      1 |    250K|   116   (1)|    250K|00:00:02.04 |     388 |  1969K|  1969K| 1581K (0)| |*  5 |     HASH JOIN         |      |      1 |    500 |     6   (0)|    500 |00:00:00.01 |       4 |  2440K|  2440K|  983K (0)| |   6 |      TABLE ACCESS FULL| T3   |      1 |      3 |     3   (0)|      3 |00:00:00.01 |       2 |       |       |          | |   7 |      TABLE ACCESS FULL| T1   |      1 |    500 |     3   (0)|    500 |00:00:00.01 |       2 |       |       |          | |   8 |     TABLE ACCESS FULL | T2   |      1 |    250K|   110   (1)|    250K|00:00:00.44 |     384 |       |       |          | --------------------------------------------------------------------------------------------------------------------------------  Predicate Information (identified by operation id): ---------------------------------------------------     2 - access("COMPANY_ID"="T3"."COMPANY_ID")    4 - access("T1"."PRODUCT_ID"="T2"."PRODUCT_ID")    5 - access("T1"."COMPANY_ID"="T3"."COMPANY_ID") 

As you can see, the final form ("Final query") of 2 queries are the same, their execution plans are also the same. The queries were running for 2 minutes because statistics_level was set to ALL in order to gather plan statistics.

So knowing that the distinct can be eliminated from the query, look at the text of this query:

select company_id from t1       join t2 using(product_id)      join ( select company_id              from (select company_id from t1)                   join t3 using(company_id) ) using(company_id) group by company_id; 

There is a subquery ("view" from now on) in this:

  ( select company_id      from (select company_id from t1)           join t3 using(company_id) ) using(company_id) 

The inner view originally contained DISTINCT, and because of that, it was eligible for Complex View Merging. However, as the DISTINCT was eliminated earlier by another query transformation, this view became a simple SPJ (select-project-join) view, and now is eligible for Simple View Merging. Here are some interesting information about view merging:

https://docs.oracle.com/database/121/TGSQL/tgsql_transform.htm#TGSQL209

This part is especially important in this case:

"For certain simple views in which merging always leads to a better plan, the optimizer automatically merges the view without considering cost."

So the as you can see above, the cost of the plan with view merging is 1242. Lets see the cost with view merging disabled:

SQL> set timing on SQL> alter session set statistics_level=typical;  select company_id from t1       join t2 using(product_id)      join ( select company_id              from (select /*+ no_merge */ distinct company_id from t1)                   join t3 using(company_id) ) using(company_id) group by company_id;  COMPANY_ID ----------          1          2          3  Elapsed: 00:00:00.03  Plan hash value: 3348277023  -------------------------------------------------------------------------------- | Id  | Operation               | Name | Rows  | Bytes | Cost (%CPU)| Time     | -------------------------------------------------------------------------------- |   0 | SELECT STATEMENT        |      |       |       |   126 (100)|          | |   1 |  HASH GROUP BY          |      |     3 |    51 |   126   (7)| 00:00:02 | |*  2 |   HASH JOIN             |      |   250K|  4150K|   120   (2)| 00:00:02 | |*  3 |    HASH JOIN            |      |   500 |  6500 |    10  (10)| 00:00:01 | |*  4 |     HASH JOIN           |      |     3 |    18 |     7  (15)| 00:00:01 | |   5 |      VIEW               |      |     3 |     9 |     4  (25)| 00:00:01 | |   6 |       HASH UNIQUE       |      |     3 |     9 |     4  (25)| 00:00:01 | |   7 |        TABLE ACCESS FULL| T1   |   500 |  1500 |     3   (0)| 00:00:01 | |   8 |      TABLE ACCESS FULL  | T3   |     3 |     9 |     3   (0)| 00:00:01 | |   9 |     TABLE ACCESS FULL   | T1   |   500 |  3500 |     3   (0)| 00:00:01 | |  10 |    TABLE ACCESS FULL    | T2   |   250K|   976K|   110   (1)| 00:00:02 | --------------------------------------------------------------------------------  Predicate Information (identified by operation id): ---------------------------------------------------     2 - access("T1"."PRODUCT_ID"="T2"."PRODUCT_ID")    3 - access("T1"."COMPANY_ID"="T3"."COMPANY_ID")    4 - access("from$_subquery$_005"."COMPANY_ID"="T3"."COMPANY_ID") 

Without merging the view, the cost is 126. So Oracle could have chosen not to transform the query, use the plan with cost 126, but instead of it, it merged the view without examining the cost, and chosen a plan with a cost 10 times higher = 1242. The above quoted sentence from the documentation explains this behavior.

If you check the execution plan of the original query, you can see the lack of view merging - VIEW operation in the execution plan and "from$_subquery$_005" in the predicates part:

Plan hash value: 54875574  ---------------------------------------------------------------------------------------------------------------------------------- | Id  | Operation               | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem | ---------------------------------------------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT        |      |      1 |        |   126 (100)|      3 |00:00:02.31 |     390 |       |       |          | |   1 |  HASH GROUP BY          |      |      1 |      3 |   126   (7)|      3 |00:00:02.31 |     390 |    11M|  3254K| 1246K (0)| |*  2 |   HASH JOIN             |      |      1 |    250K|   120   (2)|    250K|00:00:01.91 |     390 |  1969K|  1969K| 1606K (0)| |*  3 |    HASH JOIN            |      |      1 |    500 |    10  (10)|    500 |00:00:00.01 |       6 |  2440K|  2440K|  978K (0)| |*  4 |     HASH JOIN           |      |      1 |      3 |     7  (15)|      3 |00:00:00.01 |       4 |  2440K|  2440K|  980K (0)| |   5 |      VIEW               |      |      1 |      3 |     4  (25)|      3 |00:00:00.01 |       2 |       |       |          | |   6 |       HASH GROUP BY     |      |      1 |      3 |     4  (25)|      3 |00:00:00.01 |       2 |  2441K|  2441K|  717K (0)| |   7 |        TABLE ACCESS FULL| T1   |      1 |    500 |     3   (0)|    500 |00:00:00.01 |       2 |       |       |          | |   8 |      TABLE ACCESS FULL  | T3   |      1 |      3 |     3   (0)|      3 |00:00:00.01 |       2 |       |       |          | |   9 |     TABLE ACCESS FULL   | T1   |      1 |    500 |     3   (0)|    500 |00:00:00.01 |       2 |       |       |          | |  10 |    TABLE ACCESS FULL    | T2   |      1 |    250K|   110   (1)|    250K|00:00:00.38 |     384 |       |       |          | ----------------------------------------------------------------------------------------------------------------------------------  Predicate Information (identified by operation id): ---------------------------------------------------     2 - access("T1"."PRODUCT_ID"="T2"."PRODUCT_ID")    3 - access("T1"."COMPANY_ID"="T3"."COMPANY_ID")    4 - access("from$_subquery$_005"."COMPANY_ID"="T3"."COMPANY_ID") 

If we force view merging for this query:

explain plan for  select company_id from t1       join t2 using(product_id)       join ( select /*+ merge */ company_id            from (select /*+ merge */ company_id from t1 group by company_id)                  join t3 using(company_id) ) using(company_id) group by company_id;  select * from table(dbms_xplan.display);   PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Plan hash value: 1480121551  ------------------------------------------------------------------------------------- | Id  | Operation               | Name      | Rows  | Bytes | Cost (%CPU)| Time     | ------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT        |           |     3 |     9 |  1242  (91)| 00:00:15 | |   1 |  HASH GROUP BY          |           |     3 |     9 |  1242  (91)| 00:00:15 | |   2 |   VIEW                  | VM_NWVW_1 |     7 |    21 |  1242  (91)| 00:00:15 | |   3 |    HASH GROUP BY        |           |     7 |   119 |  1242  (91)| 00:00:15 | |*  4 |     HASH JOIN           |           |    41M|   675M|   210  (44)| 00:00:03 | |   5 |      TABLE ACCESS FULL  | T1        |   500 |  1500 |     3   (0)| 00:00:01 | |*  6 |      HASH JOIN          |           |   250K|  3417K|   116   (1)| 00:00:02 | |*  7 |       HASH JOIN         |           |   500 |  5000 |     6   (0)| 00:00:01 | |   8 |        TABLE ACCESS FULL| T3        |     3 |     9 |     3   (0)| 00:00:01 | |   9 |        TABLE ACCESS FULL| T1        |   500 |  3500 |     3   (0)| 00:00:01 | |  10 |       TABLE ACCESS FULL | T2        |   250K|   976K|   110   (1)| 00:00:02 | -------------------------------------------------------------------------------------  Predicate Information (identified by operation id): ---------------------------------------------------     4 - access("COMPANY_ID"="T3"."COMPANY_ID")    6 - access("T1"."PRODUCT_ID"="T2"."PRODUCT_ID")    7 - access("T1"."COMPANY_ID"="T3"."COMPANY_ID") 

We can see the cost would be 1242, and the internal view VM_NWVW_1 appeared as a result of a Cost-based Query Transformation. Because of the GROUP BY, this can be only Complex View Merging and Cost-based Query Transformation, where the cost is not ignored, that is why Oracle did not choose this plan.

So knowing that DISTINCT can be eliminated more easily, even without Cost-Based Query Transformations, and that we do not need aggregate results in this query, it would be better to use the double DISTINCT version:

select distinct company_id from t1       join t2 using(product_id)       join ( select company_id            from (select distinct company_id from t1)                  join t3 using(company_id) ) using(company_id) ;  Plan hash value: 3914171991  ----------------------------------------------------------------------------------------------------------------------------------- | Id  | Operation               | Name            | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   |  OMem |  1Mem | Used-Mem | ----------------------------------------------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT        |                 |      1 |        |   126 (100)|      3 |00:00:02.34 |       |       |          | |   1 |  HASH UNIQUE            |                 |      1 |      3 |   126   (7)|      3 |00:00:02.34 |    11M|  3254K|  746K (0)| |*  2 |   HASH JOIN             |                 |      1 |    250K|   120   (2)|    250K|00:00:01.93 |  1969K|  1969K| 1602K (0)| |*  3 |    HASH JOIN            |                 |      1 |    500 |    10  (10)|    500 |00:00:00.01 |  2440K|  2440K|  978K (0)| |*  4 |     HASH JOIN           |                 |      1 |      3 |     7  (15)|      3 |00:00:00.01 |  2440K|  2440K|  986K (0)| |   5 |      VIEW               | VW_DTP_09DF12B3 |      1 |      3 |     4  (25)|      3 |00:00:00.01 |       |       |          | |   6 |       HASH UNIQUE       |                 |      1 |      3 |     4  (25)|      3 |00:00:00.01 |  2441K|  2441K|  716K (0)| |   7 |        TABLE ACCESS FULL| T1              |      1 |    500 |     3   (0)|    500 |00:00:00.01 |       |       |          | |   8 |      TABLE ACCESS FULL  | T3              |      1 |      3 |     3   (0)|      3 |00:00:00.01 |       |       |          | |   9 |     TABLE ACCESS FULL   | T1              |      1 |    500 |     3   (0)|    500 |00:00:00.01 |       |       |          | |  10 |    TABLE ACCESS FULL    | T2              |      1 |    250K|   110   (1)|    250K|00:00:00.38 |       |       |          | -----------------------------------------------------------------------------------------------------------------------------------  Predicate Information (identified by operation id): ---------------------------------------------------     2 - access("T1"."PRODUCT_ID"="T2"."PRODUCT_ID")    3 - access("T1"."COMPANY_ID"="T3"."COMPANY_ID")    4 - access("ITEM_1"="T3"."COMPANY_ID") 

Note: runtime with statistic_level=typical is around 0,04 ms:

SQL> alter session set statistics_level=typical;  Session altered.  Elapsed: 00:00:00.00  SQL> select distinct company_id from t1      join t2 using(product_id)      join ( select company_id            from (select distinct company_id from t1)                 join t3 using(company_id) ) using(company_id) ;  COMPANY_ID ----------          1          2          3  Elapsed: 00:00:00.04 

Note that this results another execution plan because of the Distinct Placement Cost-Based Transformation (VW_DTP_... internal view).

Another solution is to prevent Simple View Merging with the /*+ no_merge */ as shown earlier.

However, on 12c (12.1.0.2.3, Oracle Linux x86-64), view merging works quite well, because the database chooses HASH JOIN SEMI instead of HASH JOIN, and execution time is as fast as the other variants:

Plan hash value: 1674031258  ---------------------------------------------------------------------------------------------------------------------- | Id  | Operation             | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   |  OMem |  1Mem | Used-Mem | ---------------------------------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT      |      |      1 |        |   121 (100)|      3 |00:00:01.53 |       |       |          | |   1 |  HASH GROUP BY        |      |      1 |      3 |   121   (3)|      3 |00:00:01.53 |  2442K|  2442K| 1265K (0)| |*  2 |   HASH JOIN SEMI      |      |      1 |    500 |   120   (2)|    500 |00:00:01.53 |  1969K|  1969K| 1514K (0)| |*  3 |    HASH JOIN          |      |      1 |    500 |     9   (0)|    500 |00:00:00.01 |  2440K|  2440K|  989K (0)| |*  4 |     HASH JOIN SEMI    |      |      1 |      3 |     6   (0)|      3 |00:00:00.01 |  2440K|  2440K|  988K (0)| |   5 |      TABLE ACCESS FULL| T3   |      1 |      3 |     3   (0)|      3 |00:00:00.01 |       |       |          | |   6 |      TABLE ACCESS FULL| T1   |      1 |    500 |     3   (0)|      3 |00:00:00.01 |       |       |          | |   7 |     TABLE ACCESS FULL | T1   |      1 |    500 |     3   (0)|    500 |00:00:00.01 |       |       |          | |   8 |    TABLE ACCESS FULL  | T2   |      1 |    250K|   110   (1)|    249K|00:00:00.38 |       |       |          | ----------------------------------------------------------------------------------------------------------------------  Predicate Information (identified by operation id): ---------------------------------------------------     2 - access("T1"."PRODUCT_ID"="T2"."PRODUCT_ID")    3 - access("T1"."COMPANY_ID"="T3"."COMPANY_ID")    4 - access("COMPANY_ID"="T3"."COMPANY_ID")  SQL> alter session set statistics_level=typical;  Session altered.  SQL> set timing on  SQL> select company_id from t1      join t2 using(product_id)      join ( select company_id            from (select company_id from t1 group by company_id)                 join t3 using(company_id) ) using(company_id)   group by company_id;  COMPANY_ID ----------          1          2          3  Elapsed: 00:00:00.02 
 
 
0
 
vote

它不是由于Oracle对 DISTINCT GROUP BY 的差异。如果我们用 DISTINCT 替换在第二个查询中的 abcdefghijklmn2 中的 abcdefghijklmn2 。在选择中,我们将获得类似的性能/计划成本作为第一个

我想,如果我们在内部和外部查询中使用相同的子句,那么Oracle解析器就能识别它需要两次操作,并且能够将操作合并为单个操作。

因此,如果我们查看第一个查询的计划, GROUP BY 操作在完成表扫描完成后发生,导致成本和执行时间较小。

 

It is not due to any difference in Oracle's treatment of DISTINCT and GROUP BY. If we replace GROUP BY in the second query with a DISTINCT in SELECT we shall get similar performance/ plan cost as the first one

I think, if we are using same clause in both the inner and outer queries with a natural join then Oracle parser is able to recognize that it will need to do same operation twice and is able to merge the operation into a single operation.

Thus if we look at the plan of the first query the GROUP BY operation takes place once after the TABLE scans are completed, resulting in lesser cost and execution time.

 
 

相关问题

1  是否可以查询从系统目录中查询包本地类型的列表?  ( Is it possible to query list of package local types from system catalogue ) 
尝试识别如何从系统目录中提取的程序包本地类型列表(即包装内部定义的类型)。 例如,我已经定义了在包规范(现在省略了包机): # listener.ora Network Configuration File: C:oracleproduct10.2.0db_1 etworkadminlistener.ora # ...

1  Oracle:从CSV文件中选择数据  ( Oracle select on data from a csv file ) 
我有一个CSV文件,表单中有许多整数: id --- 1 2 3 我需要编写一个语句,该语句从CSV中列表中的表中选择ID。 我可以一次将数据1000元素转换为类似的SQL: SELECT * FROM TABLE WHERE ID IN (....) 但这需要数百个查询,有没有更好的方法我可以做...

1  当未知确切的变量数量时,如何在“动态查询”中使用“绑定变量”  ( How to use bind variables in a dynamic query when the exact number of varia ) 
我有一个方法,其中我正在使用 Dynamic SQL (变量 actual_query ),并且我基于输入参数 i_tables 哪个是对某些表的名称的串联。它具有以下形式之一: 所有表`test_table1,test_table2。 没有。所以 NULL 将传递给过程。 我已经阅读了关于 Bind v...

1  使用WM_CONCAT到连接行,但以另一列确定的顺序  ( Using wm concat to concatenate rows but in an order determined by another colum ) 
假设我有3列: pages ---------- id: Integer (primary key) url: String (unique) title: String text: String html: String last_visi...

0  商业数据库系统与开源数据库系统[已关闭]  ( Commercial database systems versus open source database systems ) 
正如目前所在的那样,这个问题并不适合我们的Q&amp;一种格式。我们预计事实,参考资料或专业知识的支持,但该问题可能会征求辩论,论点,投票或扩展讨论。如果您觉得可以改进此问题并可能重新开放,请访问帮助中心进行指导。 ...

0  如何在Oracle数据库12c中删除这些重复的记录?  ( How to remove these duplicate records in an oracle database 12c ) 
我有这两个表: messages(id primary key, message_date, created_at, ...) user_messages(user_id, message_id references messages(id)) 我在消息中有一些重复行: select user_id, ...

0  在Oracle中的小数点之前显示数字  ( Displaying number before decimal points in oracle ) 
我有一个逻辑,我想基于一些操作计算和显示数字。操作低于 UPDATE table1 SET value1='foo' WHERE value2='bar'; 0 , UPDATE table1 SET value1='foo' WHERE value2='bar'; 1 , 所以我想计算 UPDATE table...

0  SQL用于在列中插入系统日期和固定时间(不是系统时间)  ( Sql for inserting the system date and a fixed time not the system time in a co ) 
我有一个如此: Insert into UNITRIP (START_TIME, END_TIME) (TO_DATE('10/20/2015 06:05:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('10/20/2015 07:34:06', 'MM/DD/YYYY HH...

5  如何拍摄Oracle备份和恢复?  ( How to take oracle backup and restore ) 
我来自SQL Server背景,其中SQL Server Studio有一种非常简单的携带数据库备份的方法,然后进行还原。 我一直在使用Oracle 11g Express版本过去几个月,我不确定在Oracle中备份和还原。 我可以在Oracle服务器菜单中看到备份数据库和还原数据库的选项,但我不确定它是如何运行的,...

0  MySQL在MView刷新期间耗尽内存  ( Mysql run out of memory during mview refresh ) 
我想使用带有Linux 64位OS的ODBC驱动程序使用DBLink从MySQL创建MView。但由于大约2.5 CR记录的数量左右,它会引发错误 ERROR at line 1: ORA-28500: connection from ORACLE to a non-Oracle system returned ...

1  更改计算机域名后,Oracle 10g侦听器问题  ( Oracle 10g listener issues after it changed computer domain name ) 
我需要一些帮助。 我们在Windows 2008R2计算机上安装了Oracle 10.2.0.5.0。 问题是我们的IT部门更改了该机器的域。 我尝试编辑我们的tnsnames.ora和listner.ora文件并重新启动侦听器服务。 我可以使用Sqldeveloper远程连接到数据库,但如果我尝试通过使用BDE的...

1  SQLPLUS中的SPOOL UNICODE数据到ISO-Client而不具有字符集转换  ( Spool unicode data in sqlplus to iso client without character set conversion ) 
我目前面临以下问题: 我们有一个带有字符集al32utf8的Oracle 11g数据库,以及带有nls_lang = germen_germany.we8iso8859p1的sqlplus-client。 存储在数据库中是各种非ISO8859-1字符。 现在,当客户端连接到数据库并尝试卷起这些特殊字符时,根据客户端的...

0  如何在Oracle中获得2个不同架构的2个表之间的区别? [复制]  ( How to get the difference between 2 tables in 2 different schemas in oracle ) 
这个问题已经在这里有答案: 比较两个表之间的值 (3个答案) 关闭 23天...

1  ORA-28031试图重建或下降指数  ( Ora 28031 while attempting to rebuild or drop index ) 
我在使用数据泵将Oracle数据库的副本设置为Oracle Linux上的11.2.0.3.0到11.2.0.4.0。我有一些问题,我已经设法解决了无效的对象,但我也有许多索引与"失败" 的domidx_opstatus。 试图重建这些索引导致错误: ora-29858:执行ODCIIndexalter例程中的错误...

2  RMAN中备份集的行为如何?  ( How is the behaviour of backupset in rman ) 
我在web上发现了这个定义: 备份文件的逻辑分组 - 备份件 - 即 发出RMAN BACKUP命令时创建。备份集是RMAN的 名称有关与备份关联的文件集合。备份集 由一个或多个备用件组成。 但我不明白为什么当我制作一个整个数据库的一个备份时,RMAN创建了几个备份组 这是一个关于一个备份的输出,我制...




© 2021 it.wenda123.org All Rights Reserved. 问答之家 版权所有


Licensed under cc by-sa 3.0 with attribution required.