您好,登錄后才能下訂單哦!
前言
不管是Oracle還是MySQL,新版本推出的新特性,一方面給產品帶來功能、性能、用戶體驗等方面的提升,另一方面也可能會帶來一些問題,如代碼bug、客戶使用方法不正確引發問題等等。
案例分享
MySQL 5.7下的場景
(1)首先,創建兩張表,并插入數據
mysql> select version(); +------------+ | version() | +------------+ | 5.7.30-log | +------------+ 1 row in set (0.00 sec) mysql> show create table test\G *************************** 1. row *************************** Table: test Create Table: CREATE TABLE `test` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `k` int(10) unsigned NOT NULL DEFAULT '0', `c` char(120) NOT NULL DEFAULT '', `pad` char(60) NOT NULL DEFAULT '', PRIMARY KEY (`id`), KEY `k_1` (`k`) ) ENGINE=InnoDB AUTO_INCREMENT=101 DEFAULT CHARSET=utf8mb4 MAX_ROWS=1000000 1 row in set (0.00 sec) mysql> show create table sbtest1\G *************************** 1. row *************************** Table: sbtest1 Create Table: CREATE TABLE `sbtest1` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `k` int(10) unsigned NOT NULL DEFAULT '0', `c` char(120) NOT NULL DEFAULT '', `pad` char(60) NOT NULL DEFAULT '', PRIMARY KEY (`id`), KEY `k_1` (`k`) ) ENGINE=InnoDB AUTO_INCREMENT=1000001 DEFAULT CHARSET=utf8mb4 MAX_ROWS=1000000 1 row in set (0.00 sec) mysql> select count(*) from test; +----------+ | count(*) | +----------+ | 100 | +----------+ 1 row in set (0.00 sec) mysql> select count(*) from sbtest1; +----------+ | count(*) | +----------+ | 1000000 | +----------+ 1 row in set (0.14 sec)
(2)查看兩張表的統計信息,均比較準確
mysql> select table_schema,table_name,table_rows from tables where table_name='test'; +--------------+------------+------------+ | table_schema | table_name | table_rows | +--------------+------------+------------+ | test | test | 100 | +--------------+------------+------------+ 1 row in set (0.00 sec) mysql> select table_schema,table_name,table_rows from tables where table_name='sbtest1'; +--------------+------------+------------+ | table_schema | table_name | table_rows | +--------------+------------+------------+ | test | sbtest1 | 947263 | +--------------+------------+------------+ 1 row in set (0.00 sec)
(3)我們持續往test表插入1000w條記錄,并再次查看統計信息,還是相對準確的,因為在默認情況下,數據變化量超過10%,就會觸發統計信息更新
mysql> select count(*) from test; +----------+ | count(*) | +----------+ | 10000100 | +----------+ 1 row in set (1.50 sec) mysql> select table_schema,table_name,table_rows from tables where table_name='test'; +--------------+------------+------------+ | table_schema | table_name | table_rows | +--------------+------------+------------+ | test | test | 9749036 | +--------------+------------+------------+ 1 row in set (0.00 sec)
MySQL 8.0下的場景
(1)接下來我們看看8.0下的情況吧,同樣地,我們創建兩張表,并插入相同記錄
mysql> select version(); +-----------+ | version() | +-----------+ | 8.0.20 | +-----------+ 1 row in set (0.00 sec) mysql> show create table test\G *************************** 1. row *************************** Table: test Create Table: CREATE TABLE `test` ( `id` int unsigned NOT NULL AUTO_INCREMENT, `k` int unsigned NOT NULL DEFAULT '0', `c` char(120) NOT NULL DEFAULT '', `pad` char(60) NOT NULL DEFAULT '', PRIMARY KEY (`id`), KEY `k_1` (`k`) ) ENGINE=InnoDB AUTO_INCREMENT=101 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci MAX_ROWS=1000000 1 row in set (0.00 sec) mysql> show create table sbtest1\G *************************** 1. row *************************** Table: sbtest1 Create Table: CREATE TABLE `sbtest1` ( `id` int unsigned NOT NULL AUTO_INCREMENT, `k` int unsigned NOT NULL DEFAULT '0', `c` char(120) NOT NULL DEFAULT '', `pad` char(60) NOT NULL DEFAULT '', PRIMARY KEY (`id`), KEY `k_1` (`k`) ) ENGINE=InnoDB AUTO_INCREMENT=1000001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci MAX_ROWS=1000000 1 row in set (0.00 sec) mysql> select count(*) from test; +----------+ | count(*) | +----------+ | 100 | +----------+ 1 row in set (0.00 sec) mysql> select count(*) from sbtest1; +----------+ | count(*) | +----------+ | 1000000 | +----------+ 1 row in set (0.02 sec)
(2)查看兩張表的統計信息,均比較準確
mysql> select table_schema,table_name,table_rows from tables where table_name='test'; +--------------+------------+------------+ | TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS | +--------------+------------+------------+ | test | test | 100 | +--------------+------------+------------+ 1 row in set (0.00 sec) mysql> select table_schema,table_name,table_rows from tables where table_name='sbtest1'; +--------------+------------+------------+ | TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS | +--------------+------------+------------+ | test | sbtest1 | 947468 | +--------------+------------+------------+ 1 row in set (0.01 sec)
(3)同樣地,我們持續往test表插入1000w條記錄,并再次查看統計信息,發現table_rows顯示還是100條,出現了較大偏差
mysql> select count(*) from test; +----------+ | count(*) | +----------+ | 10000100 | +----------+ 1 row in set (0.33 sec) mysql> select table_schema,table_name,table_rows from tables where table_name='test'; +--------------+------------+------------+ | TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS | +--------------+------------+------------+ | test | test | 100 | +--------------+------------+------------+ 1 row in set (0.00 sec)
原因剖析
那么導致統計信息不準確的原因是什么呢?其實是MySQL 8.0為了提高information_schema的查詢效率,將視圖tables和statistics里面的統計信息緩存起來,緩存過期時間由參數information_schema_stats_expiry決定,默認為86400s;如果想獲取最新的統計信息,可以通過如下兩種方式:
(1)analyze table進行表分析
(2)設置information_schema_stats_expiry=0
繼續探索
那么統計信息不準確,會帶來哪些影響呢?是否會影響執行計劃呢?接下來我們再次進行測試
測試1:表test記錄數100,表sbtest1記錄數100w
執行如下SQL,查看執行計劃,走的是NLJ,小表test作為驅動表(全表掃描),大表sbtest1作為被驅動表(主鍵關聯),執行效率很快
mysql> select count(*) from test; +----------+ | count(*) | +----------+ | 100 | +----------+ 1 row in set (0.00 sec) mysql> select count(*) from sbtest1; +----------+ | count(*) | +----------+ | 1000000 | +----------+ 1 row in set (0.02 sec) mysql> select table_schema,table_name,table_rows from tables where table_name='test'; +--------------+------------+------------+ | TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS | +--------------+------------+------------+ | test | test | 100 | +--------------+------------+------------+ 1 row in set (0.00 sec) mysql> select table_schema,table_name,table_rows from tables where table_name='sbtest1'; +--------------+------------+------------+ | TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS | +--------------+------------+------------+ | test | sbtest1 | 947468 | +--------------+------------+------------+ 1 row in set (0.01 sec) mysql> select t.* from test t inner join sbtest1 t1 on t.id=t1.id where t.c='08566691963-88624912351-16662227201-46648573979-64646226163-77505759394-75470094713-41097360717-15161106334-50535565977' and t1.c='08566691963-88624912351-16662227201-46648573979-64646226163-77505759394-75470094713-41097360717-15161106334-50535565977'; +----+--------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+ | id | k | c | pad | +----+--------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+ | 1 | 501885 | 08566691963-88624912351-16662227201-46648573979-64646226163-77505759394-75470094713-41097360717-15161106334-50535565977 | 63188288836-92351140030-06390587585-66802097351-49282961843 | +----+--------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+ 1 row in set (0.00 sec) mysql> explain select t.* from test t inner join sbtest1 t1 on t.id=t1.id where t.c='08566691963-88624912351-16662227201-46648573979-64646226163-77505759394-75470094713-41097360717-15161106334-50535565977' and t1.c='08566691963-88624912351-16662227201-4664 +----+-------------+-------+------------+--------+---------------+---------+---------+-----------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+--------+---------------+---------+---------+-----------+------+----------+-------------+ | 1 | SIMPLE | t | NULL | ALL | PRIMARY | NULL | NULL | NULL | 100 | 10.00 | Using where | | 1 | SIMPLE | t1 | NULL | eq_ref | PRIMARY | PRIMARY | 4 | test.t.id | 1 | 10.00 | Using where | +----+-------------+-------+------------+--------+---------------+---------+---------+-----------+------+----------+-------------+ 2 rows in set, 1 warning (0.00 sec)
測試2:表test記錄數1000w左右,表sbtest1記錄數100w
再次執行SQL,查看執行計劃,走的也是NLJ,相對小表sbtest1作為驅動表,大表test作為被驅動表,也是正確的執行計劃
mysql> select count(*) from test; +----------+ | count(*) | +----------+ | 10000100 | +----------+ 1 row in set (0.33 sec) mysql> select count(*) from sbtest1; +----------+ | count(*) | +----------+ | 1000000 | +----------+ 1 row in set (0.02 sec) mysql> select table_schema,table_name,table_rows from tables where table_name='test'; +--------------+------------+------------+ | TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS | +--------------+------------+------------+ | test | test | 100 | +--------------+------------+------------+ 1 row in set (0.00 sec) mysql> select table_schema,table_name,table_rows from tables where table_name='sbtest1'; +--------------+------------+------------+ | TABLE_SCHEMA | TABLE_NAME | TABLE_ROWS | +--------------+------------+------------+ | test | sbtest1 | 947468 | +--------------+------------+------------+ 1 row in set (0.01 sec) mysql> select t.* from test t inner join sbtest1 t1 on t.id=t1.id where t.c='08566691963-88624912351-16662227201-46648573979-64646226163-77505759394-75470094713-41097360717-15161106334-50535565977' and t1.c='08566691963-88624912351-16662227201-46648573979-64646226163-77505759394-75470094713-41097360717-15161106334-50535565977'; +----+--------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+ | id | k | c | pad | +----+--------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+ | 1 | 501885 | 08566691963-88624912351-16662227201-46648573979-64646226163-77505759394-75470094713-41097360717-15161106334-50535565977 | 63188288836-92351140030-06390587585-66802097351-49282961843 | +----+--------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+ 1 row in set (0.37 sec) mysql> explain select t.* from test t inner join sbtest1 t1 on t.id=t1.id where t.c='08566691963-88624912351-16662227201-46648573979-64646226163-77505759394-75470094713-41097360717-15161106334-50535565977' and t1.c='08566691963-88624912351-16662227201-46648573979-64646226163-77505759394-75470094713-41097360717-15161106334-50535565977'; +----+-------------+-------+------------+--------+---------------+---------+---------+------------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+--------+---------------+---------+---------+------------+--------+----------+-------------+ | 1 | SIMPLE | t1 | NULL | ALL | PRIMARY | NULL | NULL | NULL | 947468 | 10.00 | Using where | | 1 | SIMPLE | t | NULL | eq_ref | PRIMARY | PRIMARY | 4 | test.t1.id | 1 | 10.00 | Using where | +----+-------------+-------+------------+--------+---------------+---------+---------+------------+--------+----------+-------------+ 2 rows in set, 1 warning (0.01 sec)
為什么優化器沒有選擇錯誤的執行計劃呢?之前文章也提過,MySQL 8.0是將元數據信息存放在mysql庫下的數據字典表里,information_schema庫只是提供相對方便的視圖供用戶查詢,所以優化器在選擇執行計劃時,會從數據字典表中獲取統計信息,生成正確的執行計劃。
總結
MySQL 8.0為了提高information_schema的查詢效率,會將視圖tables和statistics里面的統計信息緩存起來,緩存過期時間由參數information_schema_stats_expiry決定(建議設置該參數值為0);這可能會導致用戶查詢相應視圖時,無法獲取最新、準確的統計信息,但并不會影響執行計劃的選擇。
以上就是MySQL 8.0統計信息不準確的原因的詳細內容,更多關于MySQL 8.0統計信息不準確的資料請關注億速云其它相關文章!
免責聲明:本站發布的內容(圖片、視頻和文字)以原創、轉載和分享為主,文章觀點不代表本網站立場,如果涉及侵權請聯系站長郵箱:is@yisu.com進行舉報,并提供相關證據,一經查實,將立刻刪除涉嫌侵權內容。