Skip to content

Latest commit

 

History

History
350 lines (246 loc) · 20 KB

sql-prepared-plan-cache.md

File metadata and controls

350 lines (246 loc) · 20 KB
title summary aliases
SQL Prepared Execution Plan Cache
Learn about SQL Prepared Execution Plan Cache in TiDB.
/tidb/dev/sql-prepare-plan-cache

SQL Prepared Execution Plan Cache

TiDB supports execution plan caching for Prepare and Execute queries. This includes both forms of prepared statements:

  • Using the COM_STMT_PREPARE and COM_STMT_EXECUTE protocol features.
  • Using the SQL statements PREPARE and EXECUTE.

The TiDB optimizer handles these two types of queries in the same way: when preparing, the parameterized query is parsed into an AST (Abstract Syntax Tree) and cached; in later execution, the execution plan is generated based on the stored AST and specific parameter values.

When the execution plan cache is enabled, in the first execution every Prepare statement checks whether the current query can use the execution plan cache, and if the query can use it, then put the generated execution plan into a cache implemented by LRU (Least Recently Used) linked list. In the subsequent Execute queries, the execution plan is obtained from the cache and checked for availability. If the check succeeds, the step of generating an execution plan is skipped. Otherwise, the execution plan is regenerated and saved in the cache.

TiDB also supports execution plan caching for some non-PREPARE statements, similar to the Prepare/Execute statements. For more details, refer to Non-prepared plan cache.

In the current version of TiDB, if a Prepare statement meets any of the following conditions, the query or the plan is not cached:

  • The query contains SQL statements other than SELECT, UPDATE, INSERT, DELETE, Union, Intersect, and Except.
  • The query accesses partitioned tables or temporary tables, or a table that contains generated columns.
  • The query contains non-correlated sub-queries, such as SELECT * FROM t1 WHERE t1.a > (SELECT 1 FROM t2 WHERE t2.b < 1).
  • The query contains correlated sub-queries with PhysicalApply operators in the execution plan, such as SELECT * FROM t1 WHERE t1.a > (SELECT a FROM t2 WHERE t1.b > t2.b).
  • The query contains the ignore_plan_cache hint, such as select /*+ ignore_plan_cache() */ * from t.
  • The query contains variables other than ? (including system variables or user-defined variables), such as select * from t where a>? and b>@x.
  • The query contains the functions that cannot be cached: database(), current_user, current_role, user, connection_id, last_insert_id, row_count, version, and like.
  • The query uses a variable as the LIMIT parameter (such as LIMIT ? and LIMIT 10, ?) and the variable value is greater than 10000.
  • The query contains ? after Order By, such as Order By ?. Such queries sort data based on the column specified by ?. If the queries targeting different columns use the same execution plan, the results will be wrong. Therefore, such queries are not cached. However, if the query is a common one, such as Order By a+?, it is cached.
  • The query contains ? after Group By, such as Group By?. Such queries group data based on the column specified by ?. If the queries targeting different columns use the same execution plan, the results will be wrong. Therefore, such queries are not cached. However, if the query is a common one, such as Group By a+?, it is cached.
  • The query contains ? in the definition of the Window Frame window function, such as (partition by year order by sale rows ? preceding). If ? appears elsewhere in the window function, the query is cached.
  • The query contains parameters for comparing int and string, such as c_int >= ? or c_int in (?, ?), in which ? indicates the string type, such as set @x='123'. To ensure that the query result is compatible with MySQL, parameters need to be adjusted in each query, so such queries are not cached.
  • The plan attempts to access TiFlash.
  • In most cases, the plan that contains TableDual is not cached, unless the current Prepare statement does not have parameters.

The LRU linked list is designed as a session-level cache because Prepare/Execute cannot be executed across sessions. Each element of the LRU list is a key-value pair. The value is the execution plan, and the key is composed of the following parts:

  • The name of the database where Execute is executed
  • The identifier of the Prepare statement, that is, the name after the PREPARE keyword
  • The current schema version, which is updated after every successfully executed DDL statement
  • The SQL mode when executing Execute
  • The current time zone, which is the value of the time_zone system variable
  • The value of the sql_select_limit system variable

Any change in the preceding information (for example, switching databases, renaming Prepare statement, executing DDL statements, or modifying the value of SQL mode/time_zone), or the LRU cache elimination mechanism causes the execution plan cache miss when executing.

After the execution plan cache is obtained from the cache, TiDB first checks whether the execution plan is still valid. If the current Execute statement is executed in an explicit transaction, and the referenced table is modified in the transaction pre-order statement, the cached execution plan accessing this table does not contain the UnionScan operator, then it cannot be executed.

After the validation test is passed, the scan range of the execution plan is adjusted according to the current parameter values, and then used to perform data querying.

There are several points worth noting about execution plan caching and query performance:

  • No matter an execution plan is cached or not, it is affected by SQL bindings. For execution plans that have not been cached (the first Execute), these plans are affected by existing SQL bindings. For execution plans that have been cached, if new SQL Bindings are created, these plans become invalid.
  • Cached plans are not affected by changes in statistics, optimization rules, and blocklist pushdown by expressions.
  • Considering that the parameters of Execute are different, the execution plan cache prohibits some aggressive query optimization methods that are closely related to specific parameter values to ensure adaptability. This causes that the query plan may not be optimal for certain parameter values. For example, the filter condition of the query is where a > ? And a < ?, the parameters of the first Execute statement are 2 and 1 respectively. Considering that these two parameters maybe be 1 and 2 in the next execution time, the optimizer does not generate the optimal TableDual execution plan that is specific to current parameter values;
  • If cache invalidation and elimination are not considered, an execution plan cache is applied to various parameter values, which in theory also results in non-optimal execution plans for certain values. For example, if the filter condition is where a < ? and the parameter value used for the first execution is 1, then the optimizer generates the optimal IndexScan execution plan and puts it into the cache. In the subsequent executions, if the value becomes 10000, the TableScan plan might be the better one. But due to the execution plan cache, the previously generated IndexScan is used for execution. Therefore, the execution plan cache is more suitable for application scenarios where the query is simple (the ratio of compilation is high) and the execution plan is relatively fixed.

Starting from v6.1.0, the execution plan cache is enabled by default. You can control prepared plan cache via the system variable tidb_enable_prepared_plan_cache.

Note:

The execution plan cache feature applies only to Prepare/Execute queries and does not take effect for normal queries.

After the execution plan cache feature is enabled, you can use the session-level system variable last_plan_from_cache to see whether the previous Execute statement used the cached execution plan, for example:

{{< copyable "sql" >}}

MySQL [test]> create table t(a int);
Query OK, 0 rows affected (0.00 sec)
MySQL [test]> prepare stmt from 'select * from t where a = ?';
Query OK, 0 rows affected (0.00 sec)
MySQL [test]> set @a = 1;
Query OK, 0 rows affected (0.00 sec)

-- The first execution generates an execution plan and saves it in the cache.
MySQL [test]> execute stmt using @a;
Empty set (0.00 sec)
MySQL [test]> select @@last_plan_from_cache;
+------------------------+
| @@last_plan_from_cache |
+------------------------+
| 0                      |
+------------------------+
1 row in set (0.00 sec)

-- The second execution hits the cache.
MySQL [test]> execute stmt using @a;
Empty set (0.00 sec)
MySQL [test]> select @@last_plan_from_cache;
+------------------------+
| @@last_plan_from_cache |
+------------------------+
| 1                      |
+------------------------+
1 row in set (0.00 sec)

If you find that a certain set of Prepare/Execute has unexpected behavior due to the execution plan cache, you can use the ignore_plan_cache() SQL hint to skip using the execution plan cache for the current statement. Still, use the preceding statement as an example:

{{< copyable "sql" >}}

MySQL [test]> prepare stmt from 'select /*+ ignore_plan_cache() */ * from t where a = ?';
Query OK, 0 rows affected (0.00 sec)
MySQL [test]> set @a = 1;
Query OK, 0 rows affected (0.00 sec)
MySQL [test]> execute stmt using @a;
Empty set (0.00 sec)
MySQL [test]> select @@last_plan_from_cache;
+------------------------+
| @@last_plan_from_cache |
+------------------------+
| 0                      |
+------------------------+
1 row in set (0.00 sec)
MySQL [test]> execute stmt using @a;
Empty set (0.00 sec)
MySQL [test]> select @@last_plan_from_cache;
+------------------------+
| @@last_plan_from_cache |
+------------------------+
| 0                      |
+------------------------+
1 row in set (0.00 sec)

Diagnostics of Prepared Plan Cache

Some queries or plans cannot be cached. You can use the SHOW WARNINGS statement to check whether the query or plan is cached. If it is not cached, you can check the reason for the failure in the result. For example:

mysql> PREPARE st FROM 'SELECT * FROM t WHERE a > (SELECT MAX(a) FROM t)';  -- The query contains a subquery and cannot be cached.

Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> show warnings;  -- Checks the reason why the query plan cannot be cached.

+---------+------+-----------------------------------------------+
| Level   | Code | Message                                       |
+---------+------+-----------------------------------------------+
| Warning | 1105 | skip plan-cache: sub-queries are un-cacheable |
+---------+------+-----------------------------------------------+
1 row in set (0.00 sec)

mysql> prepare st from 'select * from t where a<?';

Query OK, 0 rows affected (0.00 sec)

mysql> set @a='1';

Query OK, 0 rows affected (0.00 sec)

mysql> execute st using @a;  -- The optimization converts a non-INT type to an INT type, and the execution plan might change with the change of the parameter, so TiDB does not cache the plan.

Empty set, 1 warning (0.01 sec)

mysql> show warnings;

+---------+------+----------------------------------------------+
| Level   | Code | Message                                      |
+---------+------+----------------------------------------------+
| Warning | 1105 | skip plan-cache: '1' may be converted to INT |
+---------+------+----------------------------------------------+
1 row in set (0.00 sec)

Memory management of Prepared Plan Cache

Using Prepared Plan Cache incurs memory overhead. To view the total memory consumption by the cached execution plans of all sessions in each TiDB instance, you can use the Plan Cache Memory Usage monitoring panel in Grafana.

Note:

Because of the memory reclaim mechanism of Golang and some uncounted memory structures, the memory displayed in Grafana is not equal to the actual heap memory usage. It is tested that there is a deviation of about ±20% between the memory displayed in Grafana and the actual heap memory usage.

To view the total number of execution plans cached in each TiDB instance, you can use the Plan Cache Plan Num panel in Grafana.

The following is an example of the Plan Cache Memory Usage and Plan Cache Plan Num panels in Grafana:

grafana_panels

Starting from v7.1.0, you can control the maximum number of plans that can be cached in each session by configuring the system variable tidb_session_plan_cache_size. For different environments, the recommended value is as follows and you can adjust it according to the monitoring panels:

Using Prepared Plan Cache has some memory overhead. In internal tests, each cached plan consumes an average of 100 KiB of memory. Because Plan Cache is currently at the SESSION level, the total memory consumption is approximately the number of sessions * the average number of cached plans in a session * 100 KiB.

For example, the current TiDB instance has 50 sessions in concurrency and each session has approximately 100 cached plans. The total memory consumption is approximately 50 * 100 * 100 KiB = 512 MB.

You can control the maximum number of plans that can be cached in each session by configuring the system variable tidb_session_plan_cache_size. For different environments, the recommended value is as follows:

  • When the memory threshold of the TiDB server instance is <= 64 GiB, set tidb_session_plan_cache_size to 50.
  • When the memory threshold of the TiDB server instance is > 64 GiB, set tidb_session_plan_cache_size to 100.

Starting from v7.1.0, you can control the maximum size of a plan that can be cached using the system variable tidb_plan_cache_max_plan_size. The default value is 2 MB. If the size of a plan exceeds this value, the plan will not be cached.

When the unused memory of the TiDB server is less than a certain threshold, the memory protection mechanism of plan cache is triggered, through which some cached plans will be evicted.

You can control the threshold by configuring the system variable tidb_prepared_plan_cache_memory_guard_ratio. The threshold is 0.1 by default, which means when the unused memory of the TiDB server is less than 10% of the total memory (90% of the memory is used), the memory protection mechanism is triggered.

Due to memory limit, plan cache might be missed sometimes. You can check the status by viewing the Plan Cache Miss OPS metric in the Grafana dashboard.

Due to memory limit, plan cache might be missed sometimes.

Clear execution plan cache

You can clear execution plan cache by executing the ADMIN FLUSH [SESSION | INSTANCE] PLAN_CACHE statement.

In this statement, [SESSION | INSTANCE] specifies whether the plan cache is cleared for the current session or the whole TiDB instance. If the scope is not specified, the preceding statement applies to the SESSION cache by default.

The following is an example of clearing the SESSION execution plan cache:

{{< copyable "sql" >}}

MySQL [test]> create table t (a int);
Query OK, 0 rows affected (0.00 sec)

MySQL [test]> prepare stmt from 'select * from t';
Query OK, 0 rows affected (0.00 sec)

MySQL [test]> execute stmt;
Empty set (0.00 sec)

MySQL [test]> execute stmt;
Empty set (0.00 sec)

MySQL [test]> select @@last_plan_from_cache; -- Select the cached plan
+------------------------+
| @@last_plan_from_cache |
+------------------------+
|                      1 |
+------------------------+
1 row in set (0.00 sec)

MySQL [test]> admin flush session plan_cache; -- Clear the cached plan of the current session
Query OK, 0 rows affected (0.00 sec)

MySQL [test]> execute stmt;
Empty set (0.00 sec)

MySQL [test]> select @@last_plan_from_cache; -- The cached plan cannot be selected again, because it has been cleared
+------------------------+
| @@last_plan_from_cache |
+------------------------+
|                      0 |
+------------------------+
1 row in set (0.00 sec)

Currently, TiDB does not support clearing GLOBAL execution plan cache. That means you cannot clear the cached plan of the whole TiDB cluster. The following error is reported if you try to clear the GLOBAL execution plan cache:

{{< copyable "sql" >}}

MySQL [test]> admin flush global plan_cache;
ERROR 1105 (HY000): Do not support the 'admin flush global scope.'

Ignore the COM_STMT_CLOSE command and the DEALLOCATE PREPARE statement

To reduce the syntax parsing cost of SQL statements, it is recommended that you run prepare stmt once, then execute stmt multiple times before running deallocate prepare:

{{< copyable "sql" >}}

MySQL [test]> prepare stmt from '...'; -- Prepare once
MySQL [test]> execute stmt using ...;  -- Execute once
MySQL [test]> ...
MySQL [test]> execute stmt using ...;  -- Execute multiple times
MySQL [test]> deallocate prepare stmt; -- Release the prepared statement

In real practice, you may be used to running deallocate prepare each time after running execute stmt, as shown below:

{{< copyable "sql" >}}

MySQL [test]> prepare stmt from '...'; -- Prepare once
MySQL [test]> execute stmt using ...;
MySQL [test]> deallocate prepare stmt; -- Release the prepared statement
MySQL [test]> prepare stmt from '...'; -- Prepare twice
MySQL [test]> execute stmt using ...;
MySQL [test]> deallocate prepare stmt; -- Release the prepared statement

In such practice, the plan obtained by the first executed statement cannot be reused by the second executed statement.

To address the problem, you can set the system variable tidb_ignore_prepared_cache_close_stmt to ON so TiDB ignores commands to close prepare stmt:

{{< copyable "sql" >}}

mysql> set @@tidb_ignore_prepared_cache_close_stmt=1;  -- Enable the variable
Query OK, 0 rows affected (0.00 sec)

mysql> prepare stmt from 'select * from t'; -- Prepare once
Query OK, 0 rows affected (0.00 sec)

mysql> execute stmt;                        -- Execute once
Empty set (0.00 sec)

mysql> deallocate prepare stmt;             -- Release after the first execute
Query OK, 0 rows affected (0.00 sec)

mysql> prepare stmt from 'select * from t'; -- Prepare twice
Query OK, 0 rows affected (0.00 sec)

mysql> execute stmt;                        -- Execute twice
Empty set (0.00 sec)

mysql> select @@last_plan_from_cache;       -- Reuse the last plan
+------------------------+
| @@last_plan_from_cache |
+------------------------+
|                      1 |
+------------------------+
1 row in set (0.00 sec)

Monitoring

In the Grafana dashboard on the TiDB page in the Executor section, there are the "Queries Using Plan Cache OPS" and "Plan Cache Miss OPS" graphs. These graphs can be used to check if both TiDB and the application are configured correctly to allow the SQL Plan Cache to work correctly. The Server section on the same page provides the "Prepared Statement Count" graph. This graph shows a non-zero value if the application uses prepared statements, which is required for the SQL Plan Cache to function correctly.

sql_plan_cache

On the Monitoring page of the TiDB Cloud console, you can check the Queries Using Plan Cache OPS metric to get the number of queries using or missing plan cache per second in all TiDB instances.