This article looks at how the Kubernetes Node Controller is started. In kube-controller-manager, the NodeController is created and started as follows:
```go
if ctx.IsControllerEnabled(nodeControllerName) {
	// Parse the cluster CIDR: clusterCIDR is the CIDR range for Pods in the cluster.
	_, clusterCIDR, err := net.ParseCIDR(s.ClusterCIDR)
	// Parse the service CIDR: serviceCIDR is the CIDR range for Services in the cluster.
	_, serviceCIDR, err := net.ParseCIDR(s.ServiceCIDR)
	// Create the NodeController instance.
	nodeController, err := nodecontroller.NewNodeController(
		sharedInformers.Core().V1().Pods(),
		sharedInformers.Core().V1().Nodes(),
		sharedInformers.Extensions().V1beta1().DaemonSets(),
		cloud,
		clientBuilder.ClientOrDie("node-controller"),
		s.PodEvictionTimeout.Duration,
		s.NodeEvictionRate,
		s.SecondaryNodeEvictionRate,
		s.LargeClusterSizeThreshold,
		s.UnhealthyZoneThreshold,
		s.NodeMonitorGracePeriod.Duration,
		s.NodeStartupGracePeriod.Duration,
		s.NodeMonitorPeriod.Duration,
		clusterCIDR,
		serviceCIDR,
		int(s.NodeCIDRMaskSize),
		s.AllocateNodeCIDRs,
		s.EnableTaintManager,
		utilfeature.DefaultFeatureGate.Enabled(features.TaintBasedEvictions),
	)
	// Call Run to start the controller.
	nodeController.Run()
	// Sleep for a random duration of size
	// "ControllerStartInterval + rand.Float64()*1.0*float64(ControllerStartInterval)",
	// where ControllerStartInterval is set via the kube-controller-manager flag
	// "--controller-start-interval".
	time.Sleep(wait.Jitter(s.ControllerStartInterval.Duration, ControllerStartJitter))
}
```
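As an aside, the jitter in the final time.Sleep is easy to reproduce outside of Kubernetes. The following is a minimal, self-contained sketch of the same arithmetic; the jitter helper here is a local stand-in for the wait.Jitter function used above, and the 10s interval is just an illustrative value:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// jitter reproduces the arithmetic of wait.Jitter: it returns a duration in
// [d, d + maxFactor*d). With maxFactor = ControllerStartJitter = 1.0, the
// controller start delay lands between one and two intervals.
func jitter(d time.Duration, maxFactor float64) time.Duration {
	return d + time.Duration(rand.Float64()*maxFactor*float64(d))
}

func main() {
	interval := 10 * time.Second      // illustrative --controller-start-interval value
	fmt.Println(jitter(interval, 1.0)) // e.g. somewhere in [10s, 20s)
}
```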
Clearly, then, the key lies in two steps:

- nodeController, err := nodecontroller.NewNodeController: creates the NodeController instance.
- nodeController.Run(): executes the Run method to start the controller.
Before analyzing how NodeController works, it is worth looking at how it is defined. Its complete definition is as follows:
```go
type NodeController struct {
	allocateNodeCIDRs bool
	cloud             cloudprovider.Interface
	clusterCIDR       *net.IPNet
	serviceCIDR       *net.IPNet
	knownNodeSet      map[string]*v1.Node
	kubeClient        clientset.Interface
	// Method for easy mocking in unittest.
	lookupIP func(host string) ([]net.IP, error)
	// Value used if sync_nodes_status=False. NodeController will not proactively
	// sync node status in this case, but will monitor node status updated from kubelet. If
	// it doesn't receive update for this amount of time, it will start posting "NodeReady==
	// ConditionUnknown". The amount of time before which NodeController start evicting pods
	// is controlled via flag 'pod-eviction-timeout'.
	// Note: be cautious when changing the constant, it must work with nodeStatusUpdateFrequency
	// in kubelet. There are several constraints:
	// 1. nodeMonitorGracePeriod must be N times more than nodeStatusUpdateFrequency, where
	//    N means number of retries allowed for kubelet to post node status. It is pointless
	//    to make nodeMonitorGracePeriod be less than nodeStatusUpdateFrequency, since there
	//    will only be fresh values from Kubelet at an interval of nodeStatusUpdateFrequency.
	//    The constant must be less than podEvictionTimeout.
	// 2. nodeMonitorGracePeriod can't be too large for user experience - larger value takes
	//    longer for user to see up-to-date node status.
	nodeMonitorGracePeriod time.Duration
	// Value controlling NodeController monitoring period, i.e. how often does NodeController
	// check node status posted from kubelet. This value should be lower than nodeMonitorGracePeriod.
	// TODO: Change node status monitor to watch based.
	nodeMonitorPeriod time.Duration
	// Value used if sync_nodes_status=False, only for node startup. When node
	// is just created, e.g. cluster bootstrap or node creation, we give a longer grace period.
	nodeStartupGracePeriod time.Duration
	// per Node map storing last observed Status together with a local time when it was observed.
	// This timestamp is to be used instead of LastProbeTime stored in Condition. We do this
	// to avoid the problem with time skew across the cluster.
	nodeStatusMap map[string]nodeStatusData
	now           func() metav1.Time
	// Lock to access evictor workers
	evictorLock sync.Mutex
	// workers that evicts pods from unresponsive nodes.
	zonePodEvictor map[string]*RateLimitedTimedQueue
	// workers that are responsible for tainting nodes.
	zoneNotReadyOrUnreachableTainer map[string]*RateLimitedTimedQueue
	podEvictionTimeout              time.Duration
	// The maximum duration before a pod evicted from a node can be forcefully terminated.
	maximumGracePeriod time.Duration
	recorder           record.EventRecorder
	nodeLister         corelisters.NodeLister
	nodeInformerSynced cache.InformerSynced

	daemonSetStore          extensionslisters.DaemonSetLister
	daemonSetInformerSynced cache.InformerSynced

	podInformerSynced cache.InformerSynced

	// allocate/recycle CIDRs for node if allocateNodeCIDRs == true
	cidrAllocator CIDRAllocator
	// manages taints
	taintManager *NoExecuteTaintManager

	forcefullyDeletePod       func(*v1.Pod) error
	nodeExistsInCloudProvider func(types.NodeName) (bool, error)
	computeZoneStateFunc      func(nodeConditions []*v1.NodeCondition) (int, zoneState)
	enterPartialDisruptionFunc func(nodeNum int) float32
	enterFullDisruptionFunc    func(nodeNum int) float32

	zoneStates map[string]zoneState

	evictionLimiterQPS          float32
	secondaryEvictionLimiterQPS float32
	largeClusterThreshold       int32
	unhealthyZoneThreshold      float32

	// if set to true NodeController will start TaintManager that will evict Pods from
	// tainted nodes, if they're not tolerated.
	runTaintManager bool

	// if set to true NodeController will taint Nodes with 'TaintNodeNotReady' and 'TaintNodeUnreachable'
	// taints instead of evicting Pods itself.
	useTaintBasedEvictions bool
}
```
The NodeController struct is quite complex, with more than 30 fields. We will focus on the following:
- clusterCIDR: set via --cluster-cidr; the CIDR range for Pods in the cluster.
- serviceCIDR: set via --service-cluster-ip-range; the CIDR range for Services in the cluster.
- knownNodeSet: the set of nodes that NodeController has observed.
- nodeMonitorGracePeriod: set via --node-monitor-grace-period, default 40s; how long a node is allowed to be unresponsive before it is marked unhealthy.
- nodeMonitorPeriod: set via --node-monitor-period, default 5s; the period at which NodeController syncs NodeStatus.
- nodeStatusMap: records the most recently observed status of each node.
- zonePodEvictor: workers that evict pods from unresponsive nodes.
- zoneNotReadyOrUnreachableTainer: workers that are responsible for tainting nodes.
- podEvictionTimeout: set via --pod-eviction-timeout, default 5min; the grace period for deleting pods on failed nodes.
- maximumGracePeriod: the maximum duration before a pod evicted from a node can be forcefully terminated. Not configurable; hard-coded to 5min.
- nodeLister: the interface used to read Node data.
- daemonSetStore: the interface used to read DaemonSet data. When pods are deleted through eviction, all pods on the node that belong to a DaemonSet are skipped.
- taintManager: a NoExecuteTaintManager object. When runTaintManager is true (the default), the PodInformer and NodeInformer react to PodAdd/PodDelete/PodUpdate and NodeAdd/NodeDelete/NodeUpdate events by calling NoExecuteTaintManager.PodUpdated and NoExecuteTaintManager.NodeUpdated, which enqueue items onto the corresponding queues (podUpdateQueue and nodeUpdateQueue). The TaintManager consumes from these queues, dispatching to handlePodUpdate and handleNodeUpdate respectively (see the sketch after this entry). Its detailed logic will be analyzed separately later.
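To make the queue-based flow above concrete, here is a minimal sketch of the two-queue pattern. It is not the real NoExecuteTaintManager: the item types, the channel-based queues, and the bodies of handlePodUpdate/handleNodeUpdate are simplified placeholders.

```go
package main

import (
	"fmt"
	"time"
)

// Simplified stand-ins for the items the real TaintManager enqueues.
type podUpdateItem struct{ name string }
type nodeUpdateItem struct{ name string }

// taintManagerSketch models the structure described above: informer
// callbacks only enqueue; a separate consumer loop does the work.
type taintManagerSketch struct {
	podUpdateQueue  chan podUpdateItem
	nodeUpdateQueue chan nodeUpdateItem
}

// PodUpdated is what the Pod informer handlers would call; it just enqueues.
func (tc *taintManagerSketch) PodUpdated(name string) {
	tc.podUpdateQueue <- podUpdateItem{name: name}
}

// NodeUpdated is what the Node informer handlers would call.
func (tc *taintManagerSketch) NodeUpdated(name string) {
	tc.nodeUpdateQueue <- nodeUpdateItem{name: name}
}

// Run consumes both queues, mirroring the handlePodUpdate/handleNodeUpdate
// split in the real TaintManager.
func (tc *taintManagerSketch) Run(stop <-chan struct{}) {
	for {
		select {
		case p := <-tc.podUpdateQueue:
			tc.handlePodUpdate(p)
		case n := <-tc.nodeUpdateQueue:
			tc.handleNodeUpdate(n)
		case <-stop:
			return
		}
	}
}

func (tc *taintManagerSketch) handlePodUpdate(p podUpdateItem) {
	fmt.Println("pod update:", p.name) // placeholder for taint-toleration checks
}

func (tc *taintManagerSketch) handleNodeUpdate(n nodeUpdateItem) {
	fmt.Println("node update:", n.name) // placeholder for per-node eviction logic
}

func main() {
	tc := &taintManagerSketch{
		podUpdateQueue:  make(chan podUpdateItem, 16),
		nodeUpdateQueue: make(chan nodeUpdateItem, 16),
	}
	stop := make(chan struct{})
	go tc.Run(stop)
	tc.PodUpdated("nginx-0")
	tc.NodeUpdated("node-1")
	time.Sleep(50 * time.Millisecond) // let the worker drain the queues
	close(stop)
}
```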
- forcefullyDeletePod: the method NodeController uses to force-delete a pod through the apiserver. It is used to delete pods scheduled onto nodes whose kubelet version is older than v1.1.0, because kubelets before v1.1.0 do not support graceful termination.
- computeZoneStateFunc: returns the number of NotReady nodes in a zone together with the zone's state: if no node in the zone is Ready, the zone state is FullDisruption; if the proportion of unhealthy nodes is at least unhealthyZoneThreshold, the zone state is PartialDisruption; otherwise it is Normal (sketched below).
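A minimal sketch of this classification logic, assuming plain ready/notReady counts as input rather than the []*v1.NodeCondition slice the real computeZoneStateFunc receives:

```go
package main

import "fmt"

type zoneState string

const (
	stateNormal            zoneState = "Normal"
	stateFullDisruption    zoneState = "FullDisruption"
	statePartialDisruption zoneState = "PartialDisruption"
)

// computeZoneState is a simplified stand-in for computeZoneStateFunc: given
// ready/notReady counts for a zone, it returns the notReady count and the
// resulting zone state. The "at least 3 unhealthy nodes" floor matches the
// behavior described in the text.
func computeZoneState(readyNodes, notReadyNodes int, unhealthyZoneThreshold float32) (int, zoneState) {
	switch {
	case readyNodes == 0 && notReadyNodes > 0:
		// No Ready node at all: the whole zone is considered disrupted.
		return notReadyNodes, stateFullDisruption
	case notReadyNodes > 2 &&
		float32(notReadyNodes)/float32(notReadyNodes+readyNodes) >= unhealthyZoneThreshold:
		// At least 3 unhealthy nodes, and their share reaches the threshold.
		return notReadyNodes, statePartialDisruption
	default:
		return notReadyNodes, stateNormal
	}
}

func main() {
	for _, c := range []struct{ ready, notReady int }{{0, 5}, {4, 6}, {9, 1}} {
		n, s := computeZoneState(c.ready, c.notReady, 0.55)
		fmt.Printf("ready=%d notReady=%d -> notReady=%d state=%s\n", c.ready, c.notReady, n, s)
	}
}
```

With the default threshold of 0.55, a zone with 4 Ready and 6 NotReady nodes (6/10 = 0.6) is classified as PartialDisruption, while a zone with 9 Ready and 1 NotReady node stays Normal.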
- enterPartialDisruptionFunc: compares the current node count against largeClusterThreshold: if nodeNum > largeClusterThreshold, it returns secondaryEvictionLimiterQPS (default 0.01); otherwise it returns 0, which stops eviction.
- enterFullDisruptionFunc: a method that returns evictionLimiterQPS (default 0.1); see below for what evictionLimiterQPS means. Both functions are sketched after this entry.
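A sketch of both functions under the default flag values quoted below; the constant names are local stand-ins for the corresponding NodeController fields:

```go
package main

import "fmt"

// Default values of the corresponding kube-controller-manager flags,
// reproduced here only to make the sketch runnable.
const (
	evictionLimiterQPS          float32 = 0.1  // --node-eviction-rate
	secondaryEvictionLimiterQPS float32 = 0.01 // --secondary-node-eviction-rate
	largeClusterThreshold       int     = 50   // --large-cluster-size-threshold
)

// reducedQPS mirrors enterPartialDisruptionFunc: in a large enough zone the
// eviction rate drops to the secondary rate; in a small zone it stops (0).
func reducedQPS(nodeNum int) float32 {
	if nodeNum > largeClusterThreshold {
		return secondaryEvictionLimiterQPS
	}
	return 0
}

// healthyQPS mirrors enterFullDisruptionFunc: it always returns the normal
// eviction rate, regardless of zone size.
func healthyQPS(nodeNum int) float32 {
	return evictionLimiterQPS
}

func main() {
	fmt.Println(reducedQPS(100)) // 0.01 -> evict ~1 node per 100s
	fmt.Println(reducedQPS(30))  // 0    -> eviction stops
	fmt.Println(healthyQPS(30))  // 0.1  -> evict ~1 node per 10s
}
```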
- zoneStates: the state of each zone; possible values are Initial, Normal, FullDisruption, and PartialDisruption.
- evictionLimiterQPS: set via --node-eviction-rate, default 0.1; the number of nodes per second whose pods are evicted while the zone is healthy, i.e. one node every 10s.
- secondaryEvictionLimiterQPS: set via --secondary-node-eviction-rate, default 0.01; the number of nodes per second whose pods are evicted while the zone is unhealthy, i.e. one node every 100s.
- largeClusterThreshold: set via --large-cluster-size-threshold, default 50; when the cluster of healthy nodes is no larger than this size, secondary-node-eviction-rate is forced to 0.
- unhealthyZoneThreshold: set via --unhealthy-zone-threshold, default 0.55; a zone is considered unhealthy once the proportion of unhealthy nodes in it (with a minimum of 3 unhealthy nodes) reaches 0.55.
- runTaintManager: set via --enable-taint-manager, default true. If true, NodeController starts the TaintManager, which evicts pods from tainted nodes when the pods do not tolerate the taints.
- useTaintBasedEvictions: set via --feature-gates, default TaintBasedEvictions=false (still an Alpha feature). If true, pods are evicted by tainting nodes rather than by NodeController evicting them directly.