從源碼解析ERROR 1129 (HY000):Host is blocked because of many connection errors
原創水平有限,有錯請指出
源碼版本5.7.14
今天群里一個朋友出現如下錯誤:
ERROR 1129 (HY000): Host '10.0.0.8' is blocked because of many connection errors; unblock with '
mysqladmin flush-hosts'
為了找到這個問題原因,首先在源碼中找到錯誤碼
-
{ "ER_HOST_IS_BLOCKED", 1129, "Host \'%-.64s\' is blocked because of many connection errors; unblock with \'mysqladmin flush-hosts\'" }
然后找到拋錯地點如下:
位于
sql_connect.cc下的check_connection()函數的
-
if (!(specialflag & SPECIAL_NO_RESOLVE))
-
{
-
int rc;
-
char *host;
-
LEX_CSTRING main_sctx_host;
-
-
rc= ip_to_hostname(&net->vio->remote,
-
main_sctx_ip.str,
-
&host, &connect_errors); //接受ip_to_hostname的返回值到rc
-
......
-
if (rc == RC_BLOCKED_HOST)//判斷rc是否為RC_BLOCKED_HOST 1
-
{
-
/* HOST_CACHE stats updated by ip_to_hostname(). */
-
my_error(ER_HOST_IS_BLOCKED, MYF(0),
-
thd->m_main_security_ctx.host_or_ip().str);
-
return 1;
-
}
-
}
這里如果如果rc == RC_BLOCKED_HOST RC_BLOCKED_HOST是一個宏定義為1
#define RC_BLOCKED_HOST 1
為真則拋錯,接下來我們需要看rc是函數ip_to_hostname的返回值
位于hostname.cc 的ip_to_hostname函數中
-
if (!(specialflag & SPECIAL_NO_HOST_CACHE))
-
{
-
mysql_mutex_lock(&hostname_cache->lock);//這里注意一下整個在cache中查找的過程是有MUTEX的
-
Host_entry *entry= hostname_cache_search(ip_key);//在cache中查找
-
-
if (entry)
-
{
-
entry->m_last_seen= now;
-
*connect_errors= entry->m_errors.m_connect;
-
if (entry->m_errors.m_connect >= max_connect_errors) //max_connect_errors就是參數我們設置的參數
-
{
-
entry->m_errors.m_host_blocked++;
-
entry->set_error_timestamps(now);
-
mysql_mutex_unlock(&hostname_cache->lock);//這里解鎖
-
DBUG_RETURN(RC_BLOCKED_HOST);
-
}
到這里我們找到了和這個報錯相關的一些事實:
1、max_connect_errors和這個報錯有關
2、SPECIAL_NO_RESOLVE和這個報錯有關
3、SPECIAL_NO_HOST_CACHE和這個報錯有關
max_connect_errors參數不用再解釋,如果!(specialflag & SPECIAL_NO_RESOLVE)返回為假則不會調用
ip_to_hostname做IP域名反解析,如果!(specialflag & SPECIAL_NO_HOST_CACHE)返回為假則不會調用if
以后報錯的內容。換句話說這里的位與后然后取反關系到了這個報錯的是否觸發,那么我們就用必要看看
specialflag和SPECIAL_NO_RESOLVE以及SPECIAL_NO_HOST_CACHE的關系了。
下面是源碼參數
MYSQLD.CC
-
case (int) OPT_SKIP_HOST_CACHE:
-
opt_specialflag|= SPECIAL_NO_HOST_CACHE;
-
break;
-
case (int) OPT_SKIP_RESOLVE:
-
opt_skip_name_resolve= 1;
-
opt_specialflag|=SPECIAL_NO_RESOLVE;
-
break;
其實這里是受到OPT_SKIP_HOST_CACHE、OPT_SKIP_RESOLVE的控制就是我們的參數
skip-host-cache
skip-name-resolve
那么我們起碼能夠通過skip-host-cache和skip-name-resolve來解決問題,但是這也帶來一個問題
建立用戶的時候將不能使用域名,如果有UNIX網絡編程基礎的朋友應該知道客戶端通過socket連接
服務端得到的是IP地址和客戶端的端口如果一旦關閉IP->
DNS的反解析,user@domain這種用戶將是
不能連接的,只能是user@ip,這點異常重要。本來是可以連接的:
create user testuuu@'test' identified by '123';
/etc/hosts設置為:
192.168.190.60 test
root@test1 ~]# mysql -utestuuu -p123 -h292.168.190.93 -P13001
Warning: Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 17
Server version: 5.7.14-7-debug-log Source distribution
Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> exit
但是設置skip-host-cache、skip-name-resolve后
[root@test1 ~]# mysql -utestuuu -p123 -h292.168.190.93 -P13001
Warning: Using a password on the command line interface can be insecure.
ERROR 1045 (28000): Access denied for user 'testuuu'@'192.168.190.60' (using password: YES)
明顯這里沒有通過192.168.190.60去做反解析.
其實關于很多解析的報錯都是在函數ip_to_hostname函數中,如果skip-name-resolve將不會發生這些
包含如下:
sql_print_warning("IP address '%s' could not be resolved: %s",ip_key,gai_strerror(err_code));
sql_print_warning("IP address '%s' has been resolved " "to the host name '%s', which resembles " "IPv4-address itself.",ip_key,hostname_buffer);
sql_print_warning("Host name '%s' could not be resolved: %s",hostname_buffer,gai_strerror(err_code));
sql_print_warning("Hostname '%s' does not resolve to '%s'.",hostname_buffer,ip_key);
如果遇到問題類似問題緊急情況下先設置skip-name-resolve再說。
注意:
1、
在整個解析期間可能還會出現下面的用戶
unauthenticated user
也就是TCP/IP 握手已經成功,數據已經開始交互,線程已經建立,但是還沒有通過MYSQL 權限認證的用戶(不知這句結論正確與否,自我理解)。
mysql> show processlist;
+----+----------------------+----------------------+------+---------+------+----------+------------------+-----------+---------------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined |
+----+----------------------+----------------------+------+---------+------+----------+------------------+-----------+---------------+
| 5 | unauthenticated user | 192.168.190.60:12770 | NULL | Connect | 35 | login | NULL | 0 | 0 |
| 6 | root | localhost | NULL | Query | 0 | starting | show processlist | 0 | 0 |
+----+----------------------+----------------------+------+---------+------+----------+------------------+-----------+---------------+
如果出現這種用戶優先考慮一下是否是DNS解析緩慢問題,如開啟了mysql 反解析,沒有設置合適的/etc/hosts
2、測試期間,未關閉skip-name-resolve首先使用了user@ip進行了連接,然后刪除用戶建立一個user@domain的用戶
并且設置了/etc/hosts,這個時候客戶端連接不上服務端,一直報密碼不對,這個問題flush hosts后解決,這是
因為最開始使用IP反解析的時候得到的域名和我設置的/etc/hosts的域名不一致,雖然得到了正確的IP,但是在
host cache中,得到的domain并不一致,也就是說本來是username@domain1被反解析為了username@domain2這種
情況當然在mysql.user中找不到相應的權限用戶記錄了,這個時候flush hosts后清空了host cache,得以重新
生成相應的host cache后正常,這點在前面的代碼
Host_entry *entry= hostname_cache_search(ip_key);//在cache中查找
也有體現。
3、select * from performance_schema.host_cache 這里記錄了IP->DOMAIN的類容
如下:
mysql> select * from performance_schema.host_cache \G;
*************************** 1. row ***************************
IP: 192.168.190.60
HOST: test
HOST_VALIDATED: YES
..................
FIRST_SEEN: 2017-05-31 17:17:40
LAST_SEEN: 2017-05-31 17:17:40
FIRST_ERROR_SEEN: NULL
LAST_ERROR_SEEN: NULL
4、反解析的作用
如前面所講述,反解析IP->DOMAIN的作用就在于,在建立如下用戶的時候
create user testuuu@'test' identified by '123';
的時候MYSQL能夠通過ip地址判斷出他的權限信息,因為在MYSQL.USER中存儲的是域名(DOMAIN),而為了加速反解析的速度,而有了HOST CACHE
那就是
第一次連接
1、拿到客戶端IP地址(socket連接客戶端IP地址信息)
2、進行DNS反解析 (如/etc/hosts)
3、存儲反解析信息到host cache
4、從host cache中拿到這個反解析出來的domain進行權限驗證
如果不是第一次連接跳過第二步,也正是因為跳過了第二步,產生了注意2中的問題
flush host就是來清理host cache從而解決這種問題,重新進行DNS反解析
其實整個定位問題的過程還是比較簡單,但是我至今沒有找到entry->m_errors.m_connect什么時候增加,而且源碼中
還有很多地方 沒有理解由于時間原因我沒有去仔細看(因為這要消耗很多很多的時間),這里只能拋磚了,同時也記錄
了我的分析過程。
下面是MYSQL官方手冊的解釋:
9.12.6.2 DNS Lookup Optimization and the Host Cache
The server handles entries in the host cache like this:
1. When the first TCP client connection reaches the server from a given IP address, a new entry is
created to record the client IP, host name, and client lookup validation flag. Initially, the host name is
set to NULLand the flag is false. This entry is also used for subsequent client connections from the
same originating IP.
2. If the validation flag for the client IP entry is false, the server attempts an IP-to-host name DNS
resolution. If that is successful, the host name is updated with the resolved host name and the
validation flag is set to true. If resolution is unsuccessful, the action taken depends on whether the error
is permanent or transient. For permanent failures, the host name remains NULLand the validation flag
is set to true. For transient failures, the host name and validation flag remain unchanged. (Another DNS
resolution attempt occurs the next time a client connects from this IP.)
3. If an error occurs while processing an incoming client connection from a given IP address, the server
updates the corresponding error counters in the entry for that IP. For a description of the errors
recorded, see Section 23.9.16.1, “The host_cache Table”.
The server uses the host cache for several purposes:
? By caching the results of IP-to-host name lookups, the server avoids doing a DNS lookup for each client
connection. Instead, for a given host, it needs to perform a lookup only for the first connection from that
host.
? The cache contains information about errors that occur during the connection process. Some errors are
considered “blocking.”If too many of these occur successively from a given host without a successful
connection, the server blocks further connections from that host. The max_connect_errorssystem
variable determines the number of permitted errors before blocking occurs. See Section B.5.2.6, “Host
'host_name' is blocked”.
To unblock blocked hosts, flush the host cache by issuing a FLUSH HOSTSstatement or executing a
mysqladmin flush-hostscommand.
作者微信: