Optimize query or split it into two queries and process the results separately?

Tags: performance, mariadb, subquery, optimization, query-performance


Question (score: -1)

I have this DB schema:

[Image: DB_Schema]

As you may have noticed, there are no relationships defined between the tables, and there won't be any in the future. Running the following queries shows that each table holds a lot of rows:

    SELECT COUNT(*) FROM `territory`;       => 6112 rows
    SELECT COUNT(*) FROM `group`;           => 13669 rows
    SELECT COUNT(*) FROM `accountshare`;    => 412862 rows
    SELECT COUNT(*) FROM `account`;         => 206410 rows
    SELECT COUNT(*) FROM `address_vod__c`;  => 1198746 rows

I am trying to find every record modified after 2015-07-01, and I am using this query:

    SELECT
        a.id AS aid,
        t.id AS tid,
        t.name AS tname,
        a.lastmodifieddate AS almd,
        ad.lastmodifieddate AS adlmd,
        acs.lastmodifieddate AS acslmd
    FROM `territory` AS t
    JOIN `group` AS g ON (g.relatedid = t.id)
    JOIN `accountshare` AS acs ON (acs.userorgroupid = g.id)
    JOIN `account` AS a ON (a.id = acs.accountid)
    JOIN `address_vod__c` AS ad ON (ad.account_vod__c = a.id
        AND ad.primary_vod__c = 1)
    WHERE a.lastmodifieddate > '2015-07-01'
        OR ad.lastmodifieddate > '2015-07-01'
        OR acs.lastmodifieddate > '2015-07-01'
        OR g.lastmodifieddate > '2015-07-01'
    ORDER BY tid DESC;

But it takes an eternity because of the large size of the result sets. I've tried splitting it into two queries and then joining them, but that takes just as long:

    SELECT *
    FROM
        (SELECT
            a.id AS aid,
            ad.id AS adid,
            a.lastmodifieddate AS almd,
            ad.lastmodifieddate AS adlmd,
            ad.primary_vod__c AS adp
        FROM account AS a
        JOIN address_vod__c AS ad ON (ad.account_vod__c = a.id
            AND ad.primary_vod__c = 1)
        WHERE a.lastmodifieddate > '2015-07-01'
            OR ad.lastmodifieddate > '2015-07-01') AS q1
    JOIN
        (SELECT
            acs.accountid AS aid,
            t.id AS tid,
            t.name AS tname,
            acs.lastmodifieddate AS acslmd,
            g.lastmodifieddate AS glmd
        FROM `territory` AS t
        JOIN `group` AS g ON (g.relatedid = t.id)
        JOIN `accountshare` AS acs ON (acs.userorgroupid = g.id)
        WHERE acs.lastmodifieddate > '2015-07-01'
            OR g.lastmodifieddate > '2015-07-01') AS q2
    ON q1.aid = q2.aid;

In this case, the first subquery returns all the account and address records updated since a specific date (2015-07-01), and the second subquery returns all the accountshare changes since that same date. I need accountshare because that table, along with group, matches targets to territories.

              
         
         

Answers

Answer 1 (score: 0)

I suggest you try the APPLY join operator. It is useful when you have to join tables where one table has many more rows than the others.

"it's taking an eternity because the large size of the resultsets"

Which indexes are involved in the execution plan of the query? Can you post the execution plan used for the original query?
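One way to collect the information asked for here, as a rough sketch rather than a prescribed procedure: in MariaDB, prefixing the original query with EXPLAIN prints the plan the optimizer chose, and SHOW INDEX lists the indexes that exist on a table. The query is copied from the question; the output depends entirely on the indexes and server version in use.

```sql
-- Show the execution plan for the original query.
-- In the output, type = ALL means a full table scan, and the "key"
-- column names the index actually used for each table (NULL if none).
EXPLAIN
SELECT
    a.id AS aid,
    t.id AS tid,
    t.name AS tname,
    a.lastmodifieddate AS almd,
    ad.lastmodifieddate AS adlmd,
    acs.lastmodifieddate AS acslmd
FROM `territory` AS t
JOIN `group` AS g ON (g.relatedid = t.id)
JOIN `accountshare` AS acs ON (acs.userorgroupid = g.id)
JOIN `account` AS a ON (a.id = acs.accountid)
JOIN `address_vod__c` AS ad ON (ad.account_vod__c = a.id
    AND ad.primary_vod__c = 1)
WHERE a.lastmodifieddate > '2015-07-01'
    OR ad.lastmodifieddate > '2015-07-01'
    OR acs.lastmodifieddate > '2015-07-01'
    OR g.lastmodifieddate > '2015-07-01'
ORDER BY tid DESC;

-- List the indexes currently defined on the two largest tables.
SHOW INDEX FROM `accountshare`;
SHOW INDEX FROM `address_vod__c`;
```

Tables that show type = ALL together with a large "rows" estimate are the first candidates for a new index.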

Thanks

 
 
       
       
Answer 2 (score: 0)

You can split the query into one branch per <table>.lastmodifieddate > '2015-07-01' condition and then use UNION to run them all at once.

Regarding some of the comments about indexing: indexes are crucial to fast query execution. What matters is not the complexity of a table but its number of rows. A small table can easily be kept in memory, but a large one cannot, so an index helps to locate the needed rows quickly.

You should at least create an index on (lastmodifieddate) for each table where that column is checked. That, together with the UNION split, can make the query fast, unless lastmodifieddate > '2015-07-01' is TRUE for too many rows.
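A minimal sketch of the indexes suggested above, assuming none of them exist yet. The index names (idx_...) are made up for illustration. The second group, on the join columns, is an extra assumption on my part: since the question says the tables have no declared relationships, those columns may not be indexed either, so check with SHOW INDEX before adding them.

```sql
-- One index per table on the column used in the date filter.
-- All index names here are hypothetical.
CREATE INDEX idx_account_lmd      ON `account`        (lastmodifieddate);
CREATE INDEX idx_address_lmd      ON `address_vod__c` (lastmodifieddate);
CREATE INDEX idx_accountshare_lmd ON `accountshare`   (lastmodifieddate);
CREATE INDEX idx_group_lmd        ON `group`          (lastmodifieddate);

-- Assumption: the join columns are not indexed yet.
-- Skip any of these that already exist.
CREATE INDEX idx_group_relatedid        ON `group`          (relatedid);
CREATE INDEX idx_accountshare_usergroup ON `accountshare`   (userorgroupid);
CREATE INDEX idx_accountshare_accountid ON `accountshare`   (accountid);
CREATE INDEX idx_address_account        ON `address_vod__c` (account_vod__c, primary_vod__c);
```

Each extra index slows down writes a little and takes disk space, so it is worth re-running EXPLAIN afterwards and dropping any index the optimizer never picks.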

Read this SlideShare presentation to learn why indexes are needed and how to decide what to index.

Example rewrite for UNION:

    SELECT
        a.id AS aid,
        t.id AS tid,
        t.name AS tname,
        a.lastmodifieddate AS almd,
        ad.lastmodifieddate AS adlmd,
        acs.lastmodifieddate AS acslmd
    FROM `territory` AS t
    JOIN `group` AS g ON (g.relatedid = t.id)
    JOIN `accountshare` AS acs ON (acs.userorgroupid = g.id)
    JOIN `account` AS a ON (a.id = acs.accountid)
    JOIN `address_vod__c` AS ad ON (ad.account_vod__c = a.id
        AND ad.primary_vod__c = 1)
    WHERE a.lastmodifieddate > '2015-07-01'
    UNION
    SELECT
        a.id AS aid,
        t.id AS tid,
        t.name AS tname,
        a.lastmodifieddate AS almd,
        ad.lastmodifieddate AS adlmd,
        acs.lastmodifieddate AS acslmd
    FROM `territory` AS t
    JOIN `group` AS g ON (g.relatedid = t.id)
    JOIN `accountshare` AS acs ON (acs.userorgroupid = g.id)
    JOIN `account` AS a ON (a.id = acs.accountid)
    JOIN `address_vod__c` AS ad ON (ad.account_vod__c = a.id
        AND ad.primary_vod__c = 1)
    WHERE ad.lastmodifieddate > '2015-07-01'
    UNION
    SELECT
        a.id AS aid,
        t.id AS tid,
        t.name AS tname,
        a.lastmodifieddate AS almd,
        ad.lastmodifieddate AS adlmd,
        acs.lastmodifieddate AS acslmd
    FROM `territory` AS t
    JOIN `group` AS g ON (g.relatedid = t.id)
    JOIN `accountshare` AS acs ON (acs.userorgroupid = g.id)
    JOIN `account` AS a ON (a.id = acs.accountid)
    JOIN `address_vod__c` AS ad ON (ad.account_vod__c = a.id
        AND ad.primary_vod__c = 1)
    WHERE acs.lastmodifieddate > '2015-07-01'
    UNION
    SELECT
        a.id AS aid,
        t.id AS tid,
        t.name AS tname,
        a.lastmodifieddate AS almd,
        ad.lastmodifieddate AS adlmd,
        acs.lastmodifieddate AS acslmd
    FROM `territory` AS t
    JOIN `group` AS g ON (g.relatedid = t.id)
    JOIN `accountshare` AS acs ON (acs.userorgroupid = g.id)
    JOIN `account` AS a ON (a.id = acs.accountid)
    JOIN `address_vod__c` AS ad ON (ad.account_vod__c = a.id
        AND ad.primary_vod__c = 1)
    WHERE g.lastmodifieddate > '2015-07-01'
    ORDER BY tid DESC;
 
 
   
   





© 2021 it.wenda123.org (问答之家). All rights reserved.


Licensed under cc by-sa 3.0 with attribution required.