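1. Problem: the cluster went red, and restarting every node did not bring it back. Looking at the data directory on each node, I found that the directory for the affected index contained no files.
Reference blog post: http://www.wklken.me/posts/2015/05/23/elasticsearch-issues.html
(Another suggestion is that adding one more node makes the shards get assigned automatically. I added a node and found that this did not work. After the steps below, the cluster status went green.)
After some troubleshooting, I found that 4 shards had not been assigned to any node and still could not be assigned after a restart. The head plugin shows which shards cannot be assigned, and the following command also reports 4 unassigned shards: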
curl http://192.168.224.188:9200/_cluster/health\?pretty
{
"cluster_name" : "gag-prod",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 233,
"active_shards" : 466,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 4, \\ this is the value to look at
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 99.14893617021276
}
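To check this value from a script rather than by eye, a small helper along the following lines may be enough. It is only a sketch: `health_field` is a hypothetical name, and it assumes the pretty-printed layout shown above (one field per line), so it is not a general JSON parser.

```shell
#!/bin/bash
# Print the value of one top-level field from pretty-printed
# `_cluster/health?pretty` output read on stdin.
# Assumes one "field" : value pair per line, as in the output above.
health_field() {
  local field="$1"
  sed -n 's/^[[:space:]]*"'"$field"'"[[:space:]]*:[[:space:]]*"\{0,1\}\([^",]*\)"\{0,1\},\{0,1\}[[:space:]]*$/\1/p'
}

# Example with a few lines taken from the health output:
sample='{
  "status" : "red",
  "unassigned_shards" : 4,
  "delayed_unassigned_shards" : 0
}'
printf '%s\n' "$sample" | health_field unassigned_shards   # prints 4
printf '%s\n' "$sample" | health_field status              # prints red

# Against the live cluster (host taken from the article):
# curl -s 'http://192.168.224.188:9200/_cluster/health?pretty' | health_field unassigned_shards
```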
Find the indices with UNASSIGNED shards in the output of the following command (they match the shards that the head plugin shows as unassigned):
curl http://192.168.224.188:9200/_cat/shards
items22 4 p STARTED 2273 571.1kb 192.168.224.187 gag-prod-node-187
items22 4 r STARTED 2273 571.1kb 192.168.224.188 gag-prod-node-188
items22 2 p UNASSIGNED
items22 2 r UNASSIGNED
items22 1 p STARTED 2284 555.2kb 192.168.224.187 gag-prod-node-187
items22 1 r STARTED 2284 555.2kb 192.168.224.188 gag-prod-node-188
items22 3 p STARTED 2276 641.5kb 192.168.224.187 gag-prod-node-187
items22 3 r STARTED 2276 641.5kb 192.168.224.188 gag-prod-node-188
items22 0 p UNASSIGNED
items22 0 r UNASSIGNED
shop_entity7 4 p STARTED 53 29.6kb 192.168.224.187 gag-prod-node-187
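On a cluster with hundreds of shards, filtering this listing is easier than reading it. The sketch below assumes the column order shown above (index, shard, primary/replica, state); `list_unassigned` is a hypothetical helper name.

```shell
#!/bin/bash
# Print "index shard prirep" for every UNASSIGNED line of
# `_cat/shards` output read on stdin.
list_unassigned() {
  awk '$4 == "UNASSIGNED" { print $1, $2, $3 }'
}

# Example with lines copied from the listing above:
list_unassigned <<'EOF'
items22 4 p STARTED 2273 571.1kb 192.168.224.187 gag-prod-node-187
items22 2 p UNASSIGNED
items22 2 r UNASSIGNED
items22 0 p UNASSIGNED
EOF

# Against the live cluster (host taken from the article):
# curl -s 'http://192.168.224.188:9200/_cat/shards' | list_unassigned
```

For the sample input this prints the three unassigned entries and skips the STARTED one.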
Look up the unique IDs of the nodes, including the master node:
curl 'http://192.168.224.188:9200/_nodes/process?pretty'
{
"cluster_name" : "gag-prod",
"nodes" : {
"tdp1G9DbRseQm8xS9v8jng" : { \\ unique ID of node 187
"name" : "gag-prod-node-187",
"transport_address" : "192.168.224.187:9300",
"host" : "192.168.224.187",
"ip" : "192.168.224.187",
"version" : "2.3.2",
"build" : "b9e4a6a",
"http_address" : "192.168.224.187:9200",
"attributes" : {
"master" : "true"
},
"process" : {
"refresh_interval_in_millis" : 1000,
"id" : 10009,
"mlockall" : false
}
},
"a6tktPPYSCOGv4uw8uRclg" : { \\ unique ID of node 186
"name" : "gag-prod-node-186",
"transport_address" : "192.168.224.186:9300",
"host" : "192.168.224.186",
"ip" : "192.168.224.186",
"version" : "2.3.2",
"build" : "b9e4a6a",
"http_address" : "192.168.224.186:9200",
"attributes" : {
"master" : "false"
},
"process" : {
"refresh_interval_in_millis" : 1000,
"id" : 24049,
"mlockall" : false
}
},
"d5DvDdr6SLak8YCC099jRg" : { \\ unique ID of node 188
"name" : "gag-prod-node-188",
"transport_address" : "192.168.224.188:9300",
"host" : "192.168.224.188",
"ip" : "192.168.224.188",
"version" : "2.3.2",
"build" : "b9e4a6a",
"http_address" : "192.168.224.188:9200",
"attributes" : {
"master" : "true"
},
"process" : {
"refresh_interval_in_millis" : 1000,
"id" : 13058,
"mlockall" : false
}
}
}
}
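If the node list is long, the name-to-ID mapping can be pulled out of this output with a few lines of awk. This is a rough sketch that relies on the pretty-printed layout above, where each node's "name" line directly follows the line with its ID. `node_ids` is a hypothetical helper name, and a real JSON tool would be more robust.

```shell
#!/bin/bash
# Print "node-name node-id" pairs from pretty-printed
# `_nodes/process?pretty` output read on stdin.
node_ids() {
  awk '
    # A line of the form  "key" : {  opens an object; remember its key.
    /^[ \t]*"[^"]+"[ \t]*:[ \t]*[{][ \t]*$/ {
      split($0, parts, "\""); key = parts[2]; next
    }
    # The "name" line directly follows the line holding the node ID.
    /^[ \t]*"name"[ \t]*:/ {
      split($0, parts, "\""); print parts[4], key
    }
  '
}

# Example with the first node from the output above (annotation removed):
node_ids <<'EOF'
{
  "cluster_name" : "gag-prod",
  "nodes" : {
    "tdp1G9DbRseQm8xS9v8jng" : {
      "name" : "gag-prod-node-187",
      "attributes" : {
        "master" : "true"
      }
    }
  }
}
EOF

# Against the live cluster (host taken from the article):
# curl -s 'http://192.168.224.188:9200/_nodes/process?pretty' | node_ids
```

For the sample input this prints `gag-prod-node-187 tdp1G9DbRseQm8xS9v8jng`.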
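With the steps above we have identified the problem shards and the unique node IDs, so we can now force-allocate the problem shards to one of the nodes. Below, we allocate them to gag-prod-node-187.
Write a script (if there are many unassigned shards, a loop is more practical):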
#!/bin/bash
# Force-allocate items22 shard 0 to gag-prod-node-187 (tdp1G9DbRseQm8xS9v8jng)
curl -XPOST '192.168.224.187:9200/_cluster/reroute' -d '{
"commands" : [ {
"allocate" : {
"index" : "items22",
"shard" : 0,
"node" : "tdp1G9DbRseQm8xS9v8jng",
"allow_primary" : true
}
}
]
}'
# Force-allocate items22 shard 2 to gag-prod-node-187 (tdp1G9DbRseQm8xS9v8jng)
curl -XPOST '192.168.224.187:9200/_cluster/reroute' -d '{
"commands" : [ {
"allocate" : {
"index" : "items22",
"shard" : 2,
"node" : "tdp1G9DbRseQm8xS9v8jng",
"allow_primary" : true
}
}
]
}'
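For the case with many unassigned shards, the loop could look roughly like the sketch below. It reuses the same `allocate` command and fields as the script above, which applies to the 2.3.2 cluster shown here (later Elasticsearch versions replaced `allocate` with separate commands). Only primaries are handled, on the assumption that replicas are then assigned automatically, as happened in this case. `ES_URL`, `NODE_ID`, `DRY_RUN`, `reroute_body` and `allocate_unassigned` are illustrative names, not part of the original script. Keep in mind that `allow_primary` on a shard with no data on disk yields an empty primary, so the dry run is the default.

```shell
#!/bin/bash
# Sketch: force-allocate every unassigned primary shard to one node.
# ES_URL, NODE_ID and DRY_RUN are illustrative settings (assumptions).
ES_URL="${ES_URL:-http://192.168.224.187:9200}"
NODE_ID="${NODE_ID:-tdp1G9DbRseQm8xS9v8jng}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the request bodies

# Build the reroute request body for one shard.
reroute_body() {
  local index="$1" shard="$2" node="$3"
  printf '{"commands":[{"allocate":{"index":"%s","shard":%s,"node":"%s","allow_primary":true}}]}' \
    "$index" "$shard" "$node"
}

# Read `_cat/shards` lines on stdin and handle each unassigned primary.
allocate_unassigned() {
  awk '$4 == "UNASSIGNED" && $3 == "p" { print $1, $2 }' |
  while read -r index shard; do
    body=$(reroute_body "$index" "$shard" "$NODE_ID")
    if [ "$DRY_RUN" = "1" ]; then
      echo "$body"
    else
      curl -s -XPOST "$ES_URL/_cluster/reroute" -d "$body"
      echo
    fi
  done
}

# Typical use, after reviewing the dry-run output:
# curl -s "$ES_URL/_cat/shards" | allocate_unassigned
```

With `DRY_RUN` left at 1 the function only prints one JSON body per unassigned primary; setting it to 0 before running sends the requests.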
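After running this script, I checked the cluster status again and it had recovered. Once the shards had been replicated automatically to another node, I stopped the gag-prod-node-187 node, and the shards were allocated to the remaining nodes automatically.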