This article explains how the Node Controller is created inside the Kubernetes Controller Manager, by walking through the relevant source code.
When the Controller Manager starts, it launches a series of controllers. The Node Controller is one of the controllers started in the Controller Manager's StartControllers method; the corresponding creation code is shown below.
cmd/kube-controller-manager/app/controllermanager.go:455

```go
nodeController, err := nodecontroller.NewNodeController(
	sharedInformers.Core().V1().Pods(),
	sharedInformers.Core().V1().Nodes(),
	sharedInformers.Extensions().V1beta1().DaemonSets(),
	cloud,
	clientBuilder.ClientOrDie("node-controller"),
	s.PodEvictionTimeout.Duration,
	s.NodeEvictionRate,
	s.SecondaryNodeEvictionRate,
	s.LargeClusterSizeThreshold,
	s.UnhealthyZoneThreshold,
	s.NodeMonitorGracePeriod.Duration,
	s.NodeStartupGracePeriod.Duration,
	s.NodeMonitorPeriod.Duration,
	clusterCIDR,
	serviceCIDR,
	int(s.NodeCIDRMaskSize),
	s.AllocateNodeCIDRs,
	s.EnableTaintManager,
	utilfeature.DefaultFeatureGate.Enabled(features.TaintBasedEvictions),
)
```
As this shows, the Node Controller list-watches the following objects through sharedInformers:
Pods
Nodes
DaemonSets
Also note:

- `s.EnableTaintManager` defaults to true, meaning the Taint Manager is enabled by default; it can be changed via `--enable-taint-manager`.
- `DefaultFeatureGate.Enabled(features.TaintBasedEvictions)` defaults to false; it can be switched to true by adding `TaintBasedEvictions=true` to `--feature-gates`. When true, pod eviction from nodes is performed through the TaintManager.
As background, Kubernetes' default feature gates are defined in the following code:
pkg/features/kube_features.go:100

```go
var defaultKubernetesFeatureGates = map[utilfeature.Feature]utilfeature.FeatureSpec{
	ExternalTrafficLocalOnly:                    {Default: true, PreRelease: utilfeature.Beta},
	AppArmor:                                    {Default: true, PreRelease: utilfeature.Beta},
	DynamicKubeletConfig:                        {Default: false, PreRelease: utilfeature.Alpha},
	DynamicVolumeProvisioning:                   {Default: true, PreRelease: utilfeature.Alpha},
	ExperimentalHostUserNamespaceDefaultingGate: {Default: false, PreRelease: utilfeature.Beta},
	ExperimentalCriticalPodAnnotation:           {Default: false, PreRelease: utilfeature.Alpha},
	AffinityInAnnotations:                       {Default: false, PreRelease: utilfeature.Alpha},
	Accelerators:                                {Default: false, PreRelease: utilfeature.Alpha},
	TaintBasedEvictions:                         {Default: false, PreRelease: utilfeature.Alpha},

	// inherited features from generic apiserver, relisted here to get a conflict if it is changed
	// unintentionally on either side:
	StreamingProxyRedirects: {Default: true, PreRelease: utilfeature.Beta},
}
```
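To illustrate how a `--feature-gates` override interacts with these compiled-in defaults, here is a minimal, self-contained sketch. The `parseGates` and `enabled` helpers and the plain `map[string]bool` representation are simplifications for illustration, not the actual `utilfeature` API:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// defaults mirrors the Default field of the feature-gate table above
// for two representative gates.
var defaults = map[string]bool{
	"AppArmor":            true,
	"TaintBasedEvictions": false,
}

// parseGates parses a "--feature-gates" style value such as
// "TaintBasedEvictions=true,AppArmor=false" into a map of overrides.
func parseGates(s string) map[string]bool {
	out := map[string]bool{}
	for _, kv := range strings.Split(s, ",") {
		parts := strings.SplitN(kv, "=", 2)
		if len(parts) != 2 {
			continue
		}
		b, err := strconv.ParseBool(strings.TrimSpace(parts[1]))
		if err != nil {
			continue
		}
		out[strings.TrimSpace(parts[0])] = b
	}
	return out
}

// enabled returns the effective value of a gate: the command-line
// override if present, otherwise the compiled-in default.
func enabled(name string, overrides map[string]bool) bool {
	if v, ok := overrides[name]; ok {
		return v
	}
	return defaults[name]
}

func main() {
	overrides := parseGates("TaintBasedEvictions=true")
	fmt.Println(enabled("TaintBasedEvictions", overrides)) // true: overridden on the command line
	fmt.Println(enabled("AppArmor", overrides))            // true: compiled-in default
}
```

This is why the `useTaintBasedEvictions` argument to NewNodeController is false unless the gate is explicitly turned on.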
```go
func NewNodeController(
	podInformer coreinformers.PodInformer,
	nodeInformer coreinformers.NodeInformer,
	daemonSetInformer extensionsinformers.DaemonSetInformer,
	cloud cloudprovider.Interface,
	kubeClient clientset.Interface,
	podEvictionTimeout time.Duration,
	evictionLimiterQPS float32,
	secondaryEvictionLimiterQPS float32,
	largeClusterThreshold int32,
	unhealthyZoneThreshold float32,
	nodeMonitorGracePeriod time.Duration,
	nodeStartupGracePeriod time.Duration,
	nodeMonitorPeriod time.Duration,
	clusterCIDR *net.IPNet,
	serviceCIDR *net.IPNet,
	nodeCIDRMaskSize int,
	allocateNodeCIDRs bool,
	runTaintManager bool,
	useTaintBasedEvictions bool) (*NodeController, error) {
	...
	nc := &NodeController{
		cloud:              cloud,
		knownNodeSet:       make(map[string]*v1.Node),
		kubeClient:         kubeClient,
		recorder:           recorder,
		podEvictionTimeout: podEvictionTimeout,
		// Not configurable: "The maximum duration before a pod evicted from
		// a node can be forcefully terminated".
		maximumGracePeriod:              5 * time.Minute,
		zonePodEvictor:                  make(map[string]*RateLimitedTimedQueue),
		zoneNotReadyOrUnreachableTainer: make(map[string]*RateLimitedTimedQueue),
		nodeStatusMap:                   make(map[string]nodeStatusData),
		nodeMonitorGracePeriod:          nodeMonitorGracePeriod,
		nodeMonitorPeriod:               nodeMonitorPeriod,
		nodeStartupGracePeriod:          nodeStartupGracePeriod,
		lookupIP:                        net.LookupIP,
		now:                             metav1.Now,
		clusterCIDR:                     clusterCIDR,
		serviceCIDR:                     serviceCIDR,
		allocateNodeCIDRs:               allocateNodeCIDRs,
		forcefullyDeletePod:             func(p *v1.Pod) error { return forcefullyDeletePod(kubeClient, p) },
		nodeExistsInCloudProvider: func(nodeName types.NodeName) (bool, error) {
			return nodeExistsInCloudProvider(cloud, nodeName)
		},
		evictionLimiterQPS:          evictionLimiterQPS,
		secondaryEvictionLimiterQPS: secondaryEvictionLimiterQPS,
		largeClusterThreshold:       largeClusterThreshold,
		unhealthyZoneThreshold:      unhealthyZoneThreshold,
		zoneStates:                  make(map[string]zoneState),
		runTaintManager:             runTaintManager,
		useTaintBasedEvictions:      useTaintBasedEvictions && runTaintManager,
	}
	...

	// Register enterPartialDisruptionFunc as ReducedQPSFunc: when the zone
	// state is "PartialDisruption", ReducedQPSFunc is invoked to setLimiterInZone.
	nc.enterPartialDisruptionFunc = nc.ReducedQPSFunc
	// Register enterFullDisruptionFunc as HealthyQPSFunc: when the zone
	// state is "FullDisruption", HealthyQPSFunc is invoked to setLimiterInZone.
	nc.enterFullDisruptionFunc = nc.HealthyQPSFunc
	// Register computeZoneStateFunc as ComputeZoneState: during handleDisruption,
	// ComputeZoneState is invoked to compute the number of unhealthy nodes
	// and the zone state.
	nc.computeZoneStateFunc = nc.ComputeZoneState

	// Register the PodInformer's event handlers: Add, Update, Delete.
	// For Pod Add and Update events, the kubelet version on the pod's node is
	// checked; if it is older than 1.1.0, forcefullyDeletePod calls the
	// apiserver directly to delete the pod object from etcd.
	// For Pod Add, Update, and Delete events, if the TaintManager is running,
	// the old and new pods' Tolerations are compared; if they differ, the pod
	// update is added to the NoExecuteTaintManager's podUpdateQueue to be
	// handled by the Taint Controller. For Delete events, newPod is nil.
	podInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			nc.maybeDeleteTerminatingPod(obj)
			pod := obj.(*v1.Pod)
			if nc.taintManager != nil {
				nc.taintManager.PodUpdated(nil, pod)
			}
		},
		UpdateFunc: func(prev, obj interface{}) {
			nc.maybeDeleteTerminatingPod(obj)
			prevPod := prev.(*v1.Pod)
			newPod := obj.(*v1.Pod)
			if nc.taintManager != nil {
				nc.taintManager.PodUpdated(prevPod, newPod)
			}
		},
		DeleteFunc: func(obj interface{}) {
			pod, isPod := obj.(*v1.Pod)
			// We can get DeletedFinalStateUnknown instead of *v1.Pod here and
			// we need to handle that correctly. #34692
			if !isPod {
				deletedState, ok := obj.(cache.DeletedFinalStateUnknown)
				if !ok {
					glog.Errorf("Received unexpected object: %v", obj)
					return
				}
				pod, ok = deletedState.Obj.(*v1.Pod)
				if !ok {
					glog.Errorf("DeletedFinalStateUnknown contained non-Pod object: %v", deletedState.Obj)
					return
				}
			}
			if nc.taintManager != nil {
				nc.taintManager.PodUpdated(pod, nil)
			}
		},
	})
	// Returns true if the shared informer's pod store has synced.
	nc.podInformerSynced = podInformer.Informer().HasSynced

	// Register the NodeInformer's event handlers: Add, Update, Delete.
	nodeEventHandlerFuncs := cache.ResourceEventHandlerFuncs{}
	if nc.allocateNodeCIDRs {
		// --allocate-node-cidrs: should CIDRs for pods be allocated and set
		// on the cloud provider.
		...
	} else {
		// For Node Add, Update, and Delete events, if the TaintManager is
		// running, the old and new nodes' Taints are compared; if they differ,
		// the node update is added to the NoExecuteTaintManager's
		// nodeUpdateQueue to be handled by the Taint Controller. For Delete
		// events, newNode is nil.
		nodeEventHandlerFuncs = cache.ResourceEventHandlerFuncs{
			AddFunc: func(originalObj interface{}) {
				obj, err := api.Scheme.DeepCopy(originalObj)
				if err != nil {
					utilruntime.HandleError(err)
					return
				}
				node := obj.(*v1.Node)
				if nc.taintManager != nil {
					nc.taintManager.NodeUpdated(nil, node)
				}
			},
			UpdateFunc: func(oldNode, newNode interface{}) {
				node := newNode.(*v1.Node)
				prevNode := oldNode.(*v1.Node)
				if nc.taintManager != nil {
					nc.taintManager.NodeUpdated(prevNode, node)
				}
			},
			DeleteFunc: func(originalObj interface{}) {
				obj, err := api.Scheme.DeepCopy(originalObj)
				if err != nil {
					utilruntime.HandleError(err)
					return
				}
				node, isNode := obj.(*v1.Node)
				// We can get DeletedFinalStateUnknown instead of *v1.Node here
				// and we need to handle that correctly. #34692
				if !isNode {
					deletedState, ok := obj.(cache.DeletedFinalStateUnknown)
					if !ok {
						glog.Errorf("Received unexpected object: %v", obj)
						return
					}
					node, ok = deletedState.Obj.(*v1.Node)
					if !ok {
						glog.Errorf("DeletedFinalStateUnknown contained non-Node object: %v", deletedState.Obj)
						return
					}
				}
				if nc.taintManager != nil {
					nc.taintManager.NodeUpdated(node, nil)
				}
			},
		}
	}
	// Register NoExecuteTaintManager as the taintManager.
	if nc.runTaintManager {
		nc.taintManager = NewNoExecuteTaintManager(kubeClient)
	}
	nodeInformer.Informer().AddEventHandler(nodeEventHandlerFuncs)
	nc.nodeLister = nodeInformer.Lister()
	// Returns true if the shared informer's node store has synced.
	nc.nodeInformerSynced = nodeInformer.Informer().HasSynced

	nc.daemonSetStore = daemonSetInformer.Lister()
	// Returns true if the shared informer's daemonSet store has synced.
	nc.daemonSetInformerSynced = daemonSetInformer.Informer().HasSynced

	return nc, nil
}
```
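The "enqueue only when Tolerations change" behavior described in the handler comments above can be sketched in isolation. Note that `tolerationsChanged` and the simplified `Pod`/`Toleration` structs below are illustrative stand-ins, not the actual NoExecuteTaintManager code:

```go
package main

import (
	"fmt"
	"reflect"
)

// Toleration is a simplified stand-in for v1.Toleration.
type Toleration struct {
	Key      string
	Operator string
	Value    string
	Effect   string
}

// Pod is a simplified stand-in for v1.Pod.
type Pod struct {
	Name        string
	Tolerations []Toleration
}

// tolerationsChanged reports whether a pod event needs to be enqueued for
// the taint manager: Add (oldPod == nil) and Delete (newPod == nil) events
// always do; Update events only when the Tolerations actually differ.
func tolerationsChanged(oldPod, newPod *Pod) bool {
	if oldPod == nil || newPod == nil {
		return true
	}
	return !reflect.DeepEqual(oldPod.Tolerations, newPod.Tolerations)
}

func main() {
	tol := []Toleration{{Key: "node.alpha.kubernetes.io/notReady", Operator: "Exists", Effect: "NoExecute"}}
	a := &Pod{Name: "web", Tolerations: tol}
	b := &Pod{Name: "web", Tolerations: tol}
	fmt.Println(tolerationsChanged(a, b))   // false: Update with no toleration change
	fmt.Println(tolerationsChanged(nil, b)) // true: Add event
	fmt.Println(tolerationsChanged(a, nil)) // true: Delete event
}
```

Filtering out no-op updates this way keeps the taint manager's podUpdateQueue from being flooded by unrelated status changes.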
So, creating the NodeController instance mainly involves the following work:
- `maximumGracePeriod` - "The maximum duration before a pod evicted from a node can be forcefully terminated." Not configurable; hard-coded to 5 minutes.
- Register `enterPartialDisruptionFunc` as `ReducedQPSFunc`: when the zone state is "PartialDisruption", `ReducedQPSFunc` is invoked to `setLimiterInZone`.
- Register `enterFullDisruptionFunc` as `HealthyQPSFunc`: when the zone state is "FullDisruption", `HealthyQPSFunc` is invoked to `setLimiterInZone`.
- Register `computeZoneStateFunc` as `ComputeZoneState`: during `handleDisruption`, `ComputeZoneState` is invoked to compute the number of unhealthy nodes and the zone state.
- Register the **PodInformer**'s event handlers: Add, Update, Delete.
  - For Pod Add and Update events, the kubelet version on the pod's node is checked; if it is older than 1.1.0, `forcefullyDeletePod` calls the apiserver directly to delete the pod object from etcd.
  - For Pod Add, Update, and Delete events, if the TaintManager is running, the old and new pods' Tolerations are compared; if they differ, the pod update is added to the NoExecuteTaintManager's **podUpdateQueue** to be handled by the Taint Controller. For Delete events, newPod is nil.
- Register `podInformerSynced`, used to check whether the shared informer's pod store has synced.
- Register the **NodeInformer**'s event handlers: Add, Update, Delete.
  - For Node Add, Update, and Delete events, if the TaintManager is running, the old and new nodes' Taints are compared; if they differ, the node update is added to the NoExecuteTaintManager's `nodeUpdateQueue` to be handled by the Taint Controller. For Delete events, newNode is nil.
- Register `NoExecuteTaintManager` as the taintManager.
- Register `nodeInformerSynced`, used to check whether the shared informer's node store has synced.
- Register `daemonSetInformerSynced`, used to check whether the shared informer's daemonSet store has synced.
ZoneState was mentioned above; how it is derived is shown in the following code:
pkg/api/v1/types.go:3277

```go
const (
	// NodeReady means kubelet is healthy and ready to accept pods.
	NodeReady NodeConditionType = "Ready"
	// NodeOutOfDisk means the kubelet will not accept new pods due to
	// insufficient free disk space on the node.
	NodeOutOfDisk NodeConditionType = "OutOfDisk"
	// NodeMemoryPressure means the kubelet is under pressure due to insufficient available memory.
	NodeMemoryPressure NodeConditionType = "MemoryPressure"
	// NodeDiskPressure means the kubelet is under pressure due to insufficient available disk.
	NodeDiskPressure NodeConditionType = "DiskPressure"
	// NodeNetworkUnavailable means that network for the node is not correctly configured.
	NodeNetworkUnavailable NodeConditionType = "NetworkUnavailable"
	// NodeInodePressure means the kubelet is under pressure due to insufficient available inodes.
	NodeInodePressure NodeConditionType = "InodePressure"
)
```

pkg/controller/node/nodecontroller.go:1149

```go
// This function is expected to get a slice of NodeReadyConditions for all Nodes in a given zone.
// The zone is considered:
// - fullyDisrupted if there're no Ready Nodes,
// - partiallyDisrupted if at least nc.unhealthyZoneThreshold percent of Nodes are not Ready,
// - normal otherwise
func (nc *NodeController) ComputeZoneState(nodeReadyConditions []*v1.NodeCondition) (int, zoneState) {
	readyNodes := 0
	notReadyNodes := 0
	for i := range nodeReadyConditions {
		if nodeReadyConditions[i] != nil && nodeReadyConditions[i].Status == v1.ConditionTrue {
			readyNodes++
		} else {
			notReadyNodes++
		}
	}
	switch {
	case readyNodes == 0 && notReadyNodes > 0:
		return notReadyNodes, stateFullDisruption
	case notReadyNodes > 2 && float32(notReadyNodes)/float32(notReadyNodes+readyNodes) >= nc.unhealthyZoneThreshold:
		return notReadyNodes, statePartialDisruption
	default:
		return notReadyNodes, stateNormal
	}
}
```
The zone state falls into three types:
- **FullDisruption**: the number of Ready nodes is 0 and the number of NotReady nodes is greater than 0.
- **PartialDisruption**: the number of NotReady nodes is greater than 2 and `notReadyNodes/(notReadyNodes+readyNodes) >= nc.unhealthyZoneThreshold`, where `nc.unhealthyZoneThreshold` is set via `--unhealthy-zone-threshold` and defaults to 0.55.
- **Normal**: any zone state other than the two above.
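This classification can be exercised in isolation. The sketch below re-implements the ComputeZoneState logic over a plain `[]bool` of per-node readiness; `computeZoneState` and the `zoneState` string constants are simplified stand-ins for the controller's method, not the actual types:

```go
package main

import "fmt"

type zoneState string

const (
	stateNormal            zoneState = "Normal"
	stateFullDisruption    zoneState = "FullDisruption"
	statePartialDisruption zoneState = "PartialDisruption"
)

// computeZoneState mirrors NodeController.ComputeZoneState, taking each
// node's Ready condition as a bool instead of a *v1.NodeCondition.
// It returns the number of NotReady nodes and the resulting zone state.
func computeZoneState(ready []bool, unhealthyZoneThreshold float32) (int, zoneState) {
	readyNodes, notReadyNodes := 0, 0
	for _, r := range ready {
		if r {
			readyNodes++
		} else {
			notReadyNodes++
		}
	}
	switch {
	case readyNodes == 0 && notReadyNodes > 0:
		return notReadyNodes, stateFullDisruption
	case notReadyNodes > 2 && float32(notReadyNodes)/float32(notReadyNodes+readyNodes) >= unhealthyZoneThreshold:
		return notReadyNodes, statePartialDisruption
	default:
		return notReadyNodes, stateNormal
	}
}

func main() {
	// 0 of 3 nodes ready -> FullDisruption.
	fmt.Println(computeZoneState([]bool{false, false, false}, 0.55))
	// 4 of 6 nodes not ready (67% >= 55%) -> PartialDisruption.
	fmt.Println(computeZoneState([]bool{true, true, false, false, false, false}, 0.55))
	// 1 of 6 nodes not ready -> Normal.
	fmt.Println(computeZoneState([]bool{true, true, true, true, true, false}, 0.55))
}
```

Note the `notReadyNodes > 2` guard: in a zone with two or fewer NotReady nodes, the state stays Normal even if the NotReady ratio exceeds the threshold.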