为什么这个简单的连接查询与子查询相比要快得多?

Why is this simple join query significantly quicker with a sub-query?(为什么这个简单的连接查询与子查询相比要快得多?)
本文介绍了为什么这个简单的连接查询与子查询相比要快得多?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有两张桌子.order_details 为 100,000 行,outbound 为 10,000 行.

I have two tables. order_details which is 100,000 rows, and outbound which is 10,000 rows.

我需要将它们加入一个名为 order_number 的列中,这两个列都是 VARCHAR(50).order_number 在出站表中不是唯一的.

I need to join them on a column called order_number, which is a VARCHAR(50) on both. order_number is not unique in the outbound table.

CREATE TABLE `outbound` (
    `outbound_id` int(12) NOT NULL,
    `order_number` varchar(50) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `order_details` (
    `order_details_id` int(12) NOT NULL,
    `order_number` varchar(50) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

这是我的初始查询,运行时间超过 60 秒:

This is my initial query, and it takes well over 60 seconds to run:

SELECT o.order_number
FROM outbound o
INNER JOIN order_details od
    ON o.order_number = od.order_number

此查询得到相同的结果,运行时间不到一秒:

This query gets the same results and takes less than a second to run:

SELECT o.order_number
FROM outbound o
INNER JOIN
(
    SELECT order_number
    FROM order_details
) od
ON (o.order_number = od.order_number)

这让我很惊讶,因为通常子查询要慢得多.

This is surprising to me because usually sub-queries are significantly slower.

运行 EXPLAIN (我仍在学习如何理解)表明子查询版本使用 derived2 表,它正在使用索引,并且索引是 auto_key0.我不够精明,不知道如何解释这一点以理解为什么这会产生重大影响.

Running EXPLAIN (which I'm still learning how to understand) shows that the sub query version uses a derived2 table, that it is using an index, and that index is auto_key0. I'm not savvy enough to know how to interpret this to understand why this makes a significant difference.

我正在通过命令行运行这些查询.

I am running these queries over command line.

我正在为 Linux (x86_64) CentOS 运行 MySQL Ver 14.14 Distrib 5.6.35.

I am running MySQL Ver 14.14 Distrib 5.6.35, for Linux (x86_64) CentOS.

总结:

为什么这个简单的连接查询使用子查询明显更快?

推荐答案

我对MySQL的了解非常有限.但这是我的想法:

My knowledge of MySQL is very limited. But these are my thoughts:

您的表没有索引.然后连接必须读取整个第二个表才能比较第一个表的每一行.

Your tables don't have indexes. Then the join has to read the entire second table in order to compare, for each row of the first table.

子查询读取第二张表一次并创建索引,则不需要为第一张表的每一行读取整个第二张表.它只需要检查索引,这要快得多.

The subquery reads the second table once and creates an index, then it doesn't need to read the entire second table for each row of the first table. It only has to check the index, which is much more faster.

要验证我是否正确,请尝试为两个表中的 order_number 列创建索引(CREATE INDEX ...),然后再次运行这两个查询.您的第一个查询应该只需要不到一秒钟而不是一分钟.

To verify if I'm ritght or not, try creating indexes for the column order_number in your two tables (CREATE INDEX ... ), and run again this two queries. Your first query should only take less than a second instead of a minute.

这篇关于为什么这个简单的连接查询与子查询相比要快得多?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!

本站部分内容来源互联网,如果有图片或者内容侵犯您的权益请联系我们删除!

相关文档推荐

Execute complex raw SQL query in EF6(在EF6中执行复杂的原始SQL查询)
Hibernate reactive No Vert.x context active in aws rds(AWS RDS中的休眠反应性非Vert.x上下文处于活动状态)
Bulk insert with mysql2 and NodeJs throws 500(使用mysql2和NodeJS的大容量插入抛出500)
Flask + PyMySQL giving error no attribute #39;settimeout#39;(FlASK+PyMySQL给出错误,没有属性#39;setTimeout#39;)
auto_increment column for a group of rows?(一组行的AUTO_INCREMENT列?)
Sort by ID DESC(按ID代码排序)