Tuesday, April 10, 2012

Optimizer: new EXPLAIN FORMAT=JSON

New feature: structured EXPLAIN


Probably, most of us tried to read the output of MySQL EXPLAIN command at least once.
And probably many of us have decided, that it isn't so readable and understandable as it has to be.
Some of us even tried to create or use external pretty-printing programs and other complicated converters to make EXPLAIN's output less cryptic, for example nice Percona's Visual Explain script.
But from now we have a native MySQL command that explains query execution plan in a better human-readable and machine-readable way: EXPLAIN FORMAT=JSON!

A trivial example, SELECT from a single table:

mysql> CREATE TABLE t1 (i INT, j INT);
Query OK, 0 rows affected (0.00 sec)

mysql> EXPLAIN SELECT * FROM t1;
+----+-------------+-------+--------+---------------+------+---------+------+------+---------------------+
| id | select_type | table | type   | possible_keys | key  | key_len | ref  | rows | Extra               |
+----+-------------+-------+--------+---------------+------+---------+------+------+---------------------+
|  1 | SIMPLE      | t1    | system | NULL          | NULL | NULL    | NULL |    0 | const row not found |
+----+-------------+-------+--------+---------------+------+---------+------+------+---------------------+
1 row in set (0.00 sec)

mysql> EXPLAIN FORMAT=JSON SELECT * FROM t1;
+------------------------------------------------- ...
| EXPLAIN                                          
+------------------------------------------------- ...
| {
  "query_block": {
    "select_id": 1,
    "table": {
      "table_name": "t1",
      "access_type": "system",
      "rows": 0,
      "filtered": 0,
      "const_row_not_found": true
    }
  }
} |
+------------------------------------------------- ...
1 row in set, 1 warning (0.00 sec)

mysql> SHOW WARNINGS;
+-------+------+----------------------------------------------------------------+
| Level | Code | Message                                                        |
+-------+------+----------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select NULL AS `i`,NULL AS `j` from `test`.`t1` |
+-------+------+----------------------------------------------------------------+
1 row in set (0.00 sec)
  

Moreover, new EXPLAIN is not just a 1-to-1 translator of the old EXPLAIN command output.
JSON output is much more informative:

  • it tries to display a real execution path of the query;
  • it describes pushed conditions;
  • temporary table and index creation is displayed in more precise

More advanced example, nested loop join and pushed conditions (see "attached_conditions"):

mysql> CREATE TABLE t2 (i INT KEY, j INT);
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO t1 VALUES (1, 1), (2, 2), (3, 3);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> INSERT INTO t2 SELECT * FROM t1;
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.i = t2.i WHERE t1.j > 1 AND t2.j < 3;
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref       | rows | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-------------+
|  1 | SIMPLE      | t1    | ALL    | NULL          | NULL    | NULL    | NULL      |    3 | Using where |
|  1 | SIMPLE      | t2    | eq_ref | PRIMARY       | PRIMARY | 4       | test.t1.i |    1 | Using where |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-------------+
2 rows in set (0.00 sec)

mysql> SET end_markers_in_json=true;
Query OK, 0 rows affected (0.00 sec)

mysql> EXPLAIN FORMAT=JSON SELECT * FROM t1 JOIN t2 ON t1.i = t2.i WHERE t1.j > 1 AND t2.j < 3;
+------------------------------------------------- ...
| EXPLAIN
+------------------------------------------------- ...
| {
  "query_block": {
    "select_id": 1,
    "nested_loop": [
      {
        "table": {
          "table_name": "t1",
          "access_type": "ALL",
          "rows": 3,
          "filtered": 100,
          "attached_condition": "((`test`.`t1`.`j` > 1) and (`test`.`t1`.`i` is not null))"
        } /* table */
      },
      {
        "table": {
          "table_name": "t2",
          "access_type": "eq_ref",
          "possible_keys": [
            "PRIMARY"
          ] /* possible_keys */,
          "key": "PRIMARY",
          "key_length": "4",
          "ref": [
            "test.t1.i"
          ] /* ref */,
          "rows": 1,
          "filtered": 100,
          "attached_condition": "(`test`.`t2`.`j` < 3)"
        } /* table */
      }
    ] /* nested_loop */
  } /* query_block */
} |
+------------------------------------------------- ...
1 row in set, 1 warning (0.00 sec)

mysql> SHOW WARNINGS;
+-------+------+---------------------------------- ...
| Level | Code | Message
+-------+------+---------------------------------- ...
| Note  | 1003 | /* select#1 */ select `test`.`t1`.`i` AS `i`,`test`.`t1`.`j` AS `j`,`test`.`t2`.`i` AS `i`,`test`.`t2`.`j` AS `j` from `test`.`t1` join `test`.`t2` where ((`test`.`t2`.`i` = `test`.`t1`.`i`) and (`test`.`t1`.`j` > 1) and (`test`.`t2`.`j` < 3)) |
+-------+------+---------------------------------- ...
1 row in set (0.00 sec)
  

Thus, instead of the "Using where" in the traditional EXPLAIN output we can see where each part of query condition was pushed down and what part was pushed.

Similar example: "Using index" goes "index_condition":

mysql> EXPLAIN SELECT * FROM t2 WHERE i > 1 AND j < 3;
+----+-------------+-------+-------+---------------+---------+---------+------+------+------------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra                              |
+----+-------------+-------+-------+---------------+---------+---------+------+------+------------------------------------+
|  1 | SIMPLE      | t2    | range | PRIMARY       | PRIMARY | 4       | NULL |    2 | Using index condition; Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+------+------------------------------------+
1 row in set (0.00 sec)

mysql> EXPLAIN FORMAT=JSON SELECT * FROM t2 WHERE i > 1 AND j < 3;
+------------------------------------------------- ...
| EXPLAIN
+------------------------------------------------- ...
| {
  "query_block": {
    "select_id": 1,
    "table": {
      "table_name": "t2",
      "access_type": "range",
      "possible_keys": [
        "PRIMARY"
      ] /* possible_keys */,
      "key": "PRIMARY",
      "key_length": "4",
      "rows": 2,
      "filtered": 100,
      "index_condition": "(`test`.`t2`.`i` > 1)",
      "attached_condition": "(`test`.`t2`.`j` < 3)"
    } /* table */
  } /* query_block */
} |
+------------------------------------------------- ...
1 row in set, 1 warning (0.00 sec)

mysql> SHOW WARNINGS;
+-------+------+---------------------------------- ...
| Level | Code | Message                                                                                                                                      |
+-------+------+---------------------------------- ...
| Note  | 1003 | /* select#1 */ select `test`.`t2`.`i` AS `i`,`test`.`t2`.`j` AS `j` from `test`.`t2` where ((`test`.`t2`.`i` > 1) and (`test`.`t2`.`j` < 3)) |
+-------+------+---------------------------------- ...
1 row in set (0.00 sec)

  

Also the new explain reflects the proper execution order of "grouping_operation" (GROUP BY clause), "ordering_operation" (ORDER BY) and "duplicates_removal" (DISTINCT etc.) operations.
Each of them may be an initiator of temporary table or temporary index (filesort) creation.
One of the most advanced features of new EXPLAIN is how it displays temporary tables and temporary indices.
Whereas the traditional EXPLAIN may "aggregate" two facts of temporary table creation into the single "Using temporary table" message (and similar for temporary index), the new JSON output tries to display both (if any) at the proper place: i.e. if the "ORDER BY" clause is the initiator of a temporary table creation, the new EXPLAIN shows "using_temporary_table": true key-value pair under the "ordering_operation" node, and if, for example, the "GROUP BY" causes a temporary index creation, the EXPLAIN outputs "using_filesort": true pair under the "grouping_operation" node.

Simple example with DISTINCT, GROUP BY and ORDER BY:

mysql> EXPLAIN SELECT DISTINCT * FROM t1 GROUP BY i ORDER BY j;
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                           |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
|  1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL |    3 | Using temporary; Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
1 row in set (0.00 sec)

mysql> EXPLAIN FORMAT=JSON SELECT DISTINCT * FROM t1 GROUP BY i ORDER BY j;
+------------------------------------------------- ...
| EXPLAIN
+------------------------------------------------- ...
| {
  "query_block": {
    "select_id": 1,
    "ordering_operation": {
      "using_filesort": true,
      "duplicates_removal": {
        "using_filesort": false,
        "grouping_operation": {
          "using_temporary_table": true,
          "using_filesort": false,
          "table": {
            "table_name": "t1",
            "access_type": "ALL",
            "rows": 3,
            "filtered": 100
          } /* table */
        } /* grouping_operation */
      } /* duplicates_removal */
    } /* ordering_operation */
  } /* query_block */
} |
+------------------------------------------------- ...
1 row in set, 1 warning (0.00 sec)

mysql> SHOW WARNINGS;
+-------+------+---------------------------------- ...
| Level | Code | Message                                                                                                                                         |
+-------+------+---------------------------------- ...
| Note  | 1003 | /* select#1 */ select distinct `test`.`t1`.`i` AS `i`,`test`.`t1`.`j` AS `j` from `test`.`t1` group by `test`.`t1`.`i` order by `test`.`t1`.`j` |
+-------+------+---------------------------------- ...
1 row in set (0.00 sec)
  

Another advanced feature is how the new EXPLAIN displays table materializations. It includes "derived tables", semi-join materialization etc.
Update: See Guilhem's post about subqueries with materialization for a complicated example.

Trivial subquery materialization example: "derived table"

mysql> EXPLAIN SELECT * FROM (SELECT * FROM t1) t;
+----+-------------+------------+------+---------------+------+---------+------+------+-------+
| id | select_type | table      | type | possible_keys | key  | key_len | ref  | rows | Extra |
+----+-------------+------------+------+---------------+------+---------+------+------+-------+
|  1 | PRIMARY     | <derived2> | ALL  | NULL          | NULL | NULL    | NULL |    3 | NULL  |
|  2 | DERIVED     | t1         | ALL  | NULL          | NULL | NULL    | NULL |    3 | NULL  |
+----+-------------+------------+------+---------------+------+---------+------+------+-------+
2 rows in set (0.00 sec)

mysql> EXPLAIN FORMAT=JSON SELECT * FROM (SELECT * FROM t1) t;
+------------------------------------------------- ...
| EXPLAIN
+------------------------------------------------- ...
| {
  "query_block": {
    "select_id": 1,
    "table": {
      "table_name": "t",
      "access_type": "ALL",
      "rows": 3,
      "filtered": 100,
      "materialized_from_subquery": {
        "using_temporary_table": true,
        "dependent": false,
        "cacheable": true,
        "query_block": {
          "select_id": 2,
          "table": {
            "table_name": "t1",
            "access_type": "ALL",
            "rows": 3,
            "filtered": 100
          } /* table */
        } /* query_block */
      } /* materialized_from_subquery */
    } /* table */
  } /* query_block */
} |
+------------------------------------------------- ...
1 row in set, 1 warning (0.00 sec)

mysql> SHOW WARNINGS;
+-------+------+---------------------------------- ...
| Level | Code | Message                                                                                                                                             |
+-------+------+---------------------------------- ...
| Note  | 1003 | /* select#1 */ select `t`.`i` AS `i`,`t`.`j` AS `j` from (/* select#2 */ select `test`.`t1`.`i` AS `i`,`test`.`t1`.`j` AS `j` from `test`.`t1`) `t` |
+-------+------+---------------------------------- ...
1 row in set (0.00 sec)
  

The new JSON EXPLAIN is feature rich, please try it and have fun!


And, at the end of this post, let me thank Evgeny Potemkin and Guilhem Bichot for their great help in design and implementation of structured EXPLAIN!


Tuesday, October 4, 2011

Optimizer: new EXPLAIN for data-modifiers


One of the new features of MySQL 5.6.3 is EXPLAIN command for data-modifying statements.
As it's obvious from its name, it outputs QEP (query execution plan) for INSERT, REPLACE, UPDATE and DELETE like old good one does for SELECT:

example:

mysql> EXPLAIN UPDATE t1 SET b = 'b' WHERE a > 2;
+----+-------------+-------+-------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                        |
+----+-------------+-------+-------+---------------+------+---------+------+------+------------------------------+
|  1 | SIMPLE      | t1    | range | a,a_2         | a    | 16      | NULL |    2 | Using where; Using temporary |
+----+-------------+-------+-------+---------------+------+---------+------+------+------------------------------+
1 row in set (0.00 sec)

All kinds of MySQL syntax are supported:

INSERT from SELECT:

mysql> EXPLAIN INSERT INTO t1 SELECT * FROM t2 WHERE a IN (1, 3, 5);
+----+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                 |
+----+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
|  1 | SIMPLE      | t2    | range | t2i1          | t2i1 | 4       | NULL |    3 | Using index condition |
+----+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
1 row in set (0.00 sec)

EXPLAIN for INSERT ... VALUES works as a special case of the INSERT ... SELECT VALUES FROM DUAL:

trivial EXPLAIN INSERT ... VALUES:

mysql> EXPLAIN INSERT INTO t1 VALUES (1, 2, 3);
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra          |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | No tables used |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
1 row in set (0.00 sec)

mysql> EXPLAIN INSERT INTO t1 SELECT 1, 2, 3 FROM DUAL;
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra          |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | No tables used |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
1 row in set (0.00 sec)

multi-table UPDATE:

mysql> EXPLAIN UPDATE t1, t2 SET t1.b = t2.b WHERE t1.a = t2.a;
+----+-------------+-------+------+---------------+------+---------+-----------+------+-------+
| id | select_type | table | type | possible_keys | key  | key_len | ref       | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+-----------+------+-------+
|  1 | SIMPLE      | t1    | ALL  | PRIMARY       | NULL | NULL    | NULL      | 3072 |       |
|  1 | SIMPLE      | t2    | ref  | t2i1          | t2i1 | 4       | test.t1.a |    1 |       |
+----+-------------+-------+------+---------------+------+---------+-----------+------+-------+
2 rows in set (0.00 sec)

multi-table DELETE:

mysql> EXPLAIN DELETE FROM t1 USING t1 JOIN t2 ON t1.a = t2.a;
+----+-------------+-------+------+---------------+------+---------+-----------+------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref       | rows | Extra       |
+----+-------------+-------+------+---------------+------+---------+-----------+------+-------------+
|  1 | SIMPLE      | t1    | ALL  | PRIMARY       | NULL | NULL    | NULL      | 3072 |             |
|  1 | SIMPLE      | t2    | ref  | t2i1          | t2i1 | 4       | test.t1.a |    1 | Using index |
+----+-------------+-------+------+---------------+------+---------+-----------+------+-------------+
2 rows in set (0.00 sec)

For single-table UPDATE and DELETE it may be interesting to explain subquery execution:

single-table UPDATE with a subquery:

mysql> EXPLAIN UPDATE t1 SET b = '12345' WHERE a IN (SELECT a FROM t2);
+----+--------------------+-------+----------------+---------------+------+---------+------+------+-------------+
| id | select_type        | table | type           | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+--------------------+-------+----------------+---------------+------+---------+------+------+-------------+
|  1 | PRIMARY            | t1    | ALL            | PRIMARY       | NULL | NULL    | NULL | 3072 | Using where |
|  2 | DEPENDENT SUBQUERY | t2    | index_subquery | t2i1          | t2i1 | 4       | func |    1 | Using index |
+----+--------------------+-------+----------------+---------------+------+---------+------+------+-------------+
2 rows in set (0.00 sec)

... or even:

single-table DELETE with subqueries:

mysql> EXPLAIN DELETE FROM t1 WHERE a IN (SELECT t2.a FROM t2 JOIN t3 WHERE t2.b = t3.b AND t3.a IN (SELECT a FROM t4));
+----+--------------------+-------+------+---------------+------+---------+------+------+-------------- - - -
| id | select_type        | table | type | possible_keys | key  | key_len | ref  | rows | Extra
+----+--------------------+-------+------+---------------+------+---------+------+------+-------------- - - -
|  1 | PRIMARY            | t1    | ALL  | PRIMARY       | NULL | NULL    | NULL | 3072 | Using where
|  2 | DEPENDENT SUBQUERY | t2    | ref  | t2i1          | t2i1 | 4       | func |    1 |
|  2 | DEPENDENT SUBQUERY | t3    | ALL  | NULL          | NULL | NULL    | NULL | 3072 | Using where;
|  3 | DEPENDENT SUBQUERY | t4    | ALL  | NULL          | NULL | NULL    | NULL | 3072 | Using where
+----+--------------------+-------+------+---------------+------+---------+------+------+-------------- - - -
4 rows in set (0.00 sec)

Manuals are available here and here.