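This article covers how to use Kubernetes Critical Pods: the rules that make a Pod critical, how the default scheduler treats such Pods, and how to deploy key services so that they can still be scheduled when cluster resources run short.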
Rule 1:
Enable the ExperimentalCriticalPodAnnotation feature gate;
The Pod must belong to the kube-system namespace;
The Pod must carry the annotation scheduler.alpha.kubernetes.io/critical-pod="".
Rule 2:
Enable the ExperimentalCriticalPodAnnotation and PodPriority feature gates;
The Pod's Priority is set and is no less than 2 * 10^9;
system-node-critical priority = 2 * 10^9 + 1000;
system-cluster-critical priority = 2 * 10^9.
A Pod that satisfies either Rule 1 or Rule 2 is treated as a Critical Pod.
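The two rules can be sketched as a single check. This is a minimal illustration rather than the Kubernetes implementation: the `pod` and `featureGates` types and the `isCriticalPod` function are hypothetical stand-ins, and the feature-gate conditions follow the rules as stated in this article.

```go
package main

import "fmt"

const (
	// Priority values used by the built-in system priority classes.
	highestUserDefinablePriority int32 = 1000000000
	systemCriticalPriority       int32 = 2 * highestUserDefinablePriority
	systemClusterCritical        int32 = systemCriticalPriority
	systemNodeCritical           int32 = systemCriticalPriority + 1000

	criticalPodAnnotation = "scheduler.alpha.kubernetes.io/critical-pod"
	systemNamespace       = "kube-system"
)

// pod is a hypothetical, simplified stand-in for v1.Pod.
type pod struct {
	Namespace   string
	Annotations map[string]string
	Priority    *int32
}

// featureGates is a hypothetical stand-in for the feature gate lookup.
type featureGates struct {
	CriticalPodAnnotation bool
	PodPriority           bool
}

// isCriticalPod reports whether p satisfies Rule 1 or Rule 2.
func isCriticalPod(p pod, fg featureGates) bool {
	// Rule 2: both gates enabled and the priority is at least 2 * 10^9.
	if fg.CriticalPodAnnotation && fg.PodPriority &&
		p.Priority != nil && *p.Priority >= systemCriticalPriority {
		return true
	}
	// Rule 1: annotation gate enabled, kube-system namespace, empty-valued annotation.
	if fg.CriticalPodAnnotation && p.Namespace == systemNamespace {
		if v, ok := p.Annotations[criticalPodAnnotation]; ok && v == "" {
			return true
		}
	}
	return false
}

func main() {
	prio := systemNodeCritical
	fg := featureGates{CriticalPodAnnotation: true, PodPriority: true}
	pods := []pod{
		{Namespace: "default", Priority: &prio},
		{Namespace: systemNamespace, Annotations: map[string]string{criticalPodAnnotation: ""}},
		{Namespace: "default"},
	}
	for _, p := range pods {
		fmt.Println(p.Namespace, isCriticalPod(p, fg))
	}
}
```

Note that a Pod in any namespace can qualify through Rule 2, while Rule 1 applies only inside kube-system.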
In the predicate phase of pod scheduling, the default scheduler registers GeneralPredicates as one of its default predicates. It does not check whether a Pod is critical and then run only EssentialPredicates for it. What does this mean?
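The relationship between GeneralPredicates and EssentialPredicates makes it clear. GeneralPredicates first calls noncriticalPredicates and then calls EssentialPredicates. So if you mark the Pods of a Deployment, StatefulSet, or similar workload (DaemonSet excepted) as Critical, the scheduler still runs the full GeneralPredicates flow, including noncriticalPredicates, although you would want it to go straight to EssentialPredicates.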
    // GeneralPredicates checks whether noncriticalPredicates and EssentialPredicates pass. noncriticalPredicates are the predicates
    // that only non-critical pods need and EssentialPredicates are the predicates that all pods, including critical pods, need
    func GeneralPredicates(pod *v1.Pod, meta algorithm.PredicateMetadata, nodeInfo *schedulercache.NodeInfo) (bool, []algorithm.PredicateFailureReason, error) {
        var predicateFails []algorithm.PredicateFailureReason
        fit, reasons, err := noncriticalPredicates(pod, meta, nodeInfo)
        if err != nil {
            return false, predicateFails, err
        }
        if !fit {
            predicateFails = append(predicateFails, reasons...)
        }

        fit, reasons, err = EssentialPredicates(pod, meta, nodeInfo)
        if err != nil {
            return false, predicateFails, err
        }
        if !fit {
            predicateFails = append(predicateFails, reasons...)
        }

        return len(predicateFails) == 0, predicateFails, nil
    }
noncriticalPredicates is meant to hold the extra predicate logic that applies only to non-critical pods, and that logic is the PodFitsResources check.
pkg/scheduler/algorithm/predicates/predicates.go:1076

    // noncriticalPredicates are the predicates that only non-critical pods need
    func noncriticalPredicates(pod *v1.Pod, meta algorithm.PredicateMetadata, nodeInfo *schedulercache.NodeInfo) (bool, []algorithm.PredicateFailureReason, error) {
        var predicateFails []algorithm.PredicateFailureReason
        fit, reasons, err := PodFitsResources(pod, meta, nodeInfo)
        if err != nil {
            return false, predicateFails, err
        }
        if !fit {
            predicateFails = append(predicateFails, reasons...)
        }

        return len(predicateFails) == 0, predicateFails, nil
    }
PodFitsResources checks whether the node can satisfy the Pod's requests for the following resources:
Allowed Pod Number;
CPU;
Memory;
EphemeralStorage;
Extended Resources.
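In other words, if you mark the Pods of a Deployment, StatefulSet, or similar workload (DaemonSet excepted) as Critical, scheduling still checks whether Allowed Pod Number, CPU, Memory, EphemeralStorage, and Extended Resources are sufficient. If they are not, the predicate phase fails. The Preempt phase then only applies the normal preemption logic based on the Pod's PriorityClass, with no special handling for Critical Pods. As a result, the scheduler may find no Node that meets the resource requirements, and the Critical Pod fails to schedule and stays in the Pending state.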
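Yet users mark a Pod as Critical precisely because they do not want scheduling to fail for lack of resources. What if you do want to deploy a key service through a Deployment, StatefulSet, or similar workload (DaemonSet excepted) and mark its Pods as Critical? There are two approaches:
Method 1: Following Rule 2 above, give the Pod the system-cluster-critical or system-node-critical Priority Class, so that it obtains resources through the scheduler's normal Preempt flow and gets scheduled.
Method 2: Following Rule 1 above, also modify GeneralPredicates as shown below so that it detects whether the Pod is a Critical Pod and, if so, skips noncriticalPredicates. The predicate phase then does not check Allowed Pod Number, CPU, Memory, EphemeralStorage, or Extended Resources.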
    // Requires the utilfeature, features and kubelettypes packages to be imported in predicates.go.
    func GeneralPredicates(pod *v1.Pod, meta algorithm.PredicateMetadata, nodeInfo *schedulercache.NodeInfo) (bool, []algorithm.PredicateFailureReason, error) {
        var predicateFails []algorithm.PredicateFailureReason

        // **Modify**: check whether the pod is a Critical Pod; skip noncriticalPredicates if it is.
        isCriticalPod := utilfeature.DefaultFeatureGate.Enabled(features.ExperimentalCriticalPodAnnotation) &&
            kubelettypes.IsCriticalPod(pod)
        if !isCriticalPod {
            fit, reasons, err := noncriticalPredicates(pod, meta, nodeInfo)
            if err != nil {
                return false, predicateFails, err
            }
            if !fit {
                predicateFails = append(predicateFails, reasons...)
            }
        }

        // EssentialPredicates still run for every pod, critical or not.
        fit, reasons, err := EssentialPredicates(pod, meta, nodeInfo)
        if err != nil {
            return false, predicateFails, err
        }
        if !fit {
            predicateFails = append(predicateFails, reasons...)
        }

        return len(predicateFails) == 0, predicateFails, nil
    }
As for Method 1, Kubernetes in fact already does it for you during the Priority admission check.
    // admitPod makes sure a new pod does not set spec.Priority field. It also makes sure that the PriorityClassName exists if it is provided and resolves the pod priority from the PriorityClassName.
    func (p *priorityPlugin) admitPod(a admission.Attributes) error {
        ...
        if utilfeature.DefaultFeatureGate.Enabled(features.PodPriority) {
            var priority int32
            if len(pod.Spec.PriorityClassName) == 0 &&
                utilfeature.DefaultFeatureGate.Enabled(features.ExperimentalCriticalPodAnnotation) &&
                kubelettypes.IsCritical(a.GetNamespace(), pod.Annotations) {
                pod.Spec.PriorityClassName = scheduling.SystemClusterCritical
            }
            ...
        }
    }
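During admission the Pod's Priority is checked. If all of the following hold:
The PodPriority feature gate is enabled;
The ExperimentalCriticalPodAnnotation feature gate is enabled;
The Pod carries the critical-pod annotation (scheduler.alpha.kubernetes.io/critical-pod);
The Pod is deployed in the kube-system namespace;
No custom PriorityClass has been set manually;
then the Priority admission stage automatically assigns the SystemClusterCritical (system-cluster-critical) PriorityClass to the Pod.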
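The defaulting decision described by these conditions can be sketched as a small pure function. This is an illustration under simplifying assumptions, not the admission plugin itself: the `podMeta` and `gates` types and the `defaultPriorityClassName` function are hypothetical names introduced here.

```go
package main

import "fmt"

const (
	criticalPodAnnotation = "scheduler.alpha.kubernetes.io/critical-pod"
	systemNamespace       = "kube-system"
	systemClusterCritical = "system-cluster-critical"
)

// podMeta is a hypothetical, simplified view of the fields admission looks at.
type podMeta struct {
	Namespace         string
	Annotations       map[string]string
	PriorityClassName string
}

// gates is a hypothetical stand-in for the feature gate lookup.
type gates struct {
	PodPriority           bool
	CriticalPodAnnotation bool
}

// hasCriticalAnnotation reports whether the pod is in kube-system and carries
// the critical-pod annotation with an empty value.
func hasCriticalAnnotation(p podMeta) bool {
	if p.Namespace != systemNamespace {
		return false
	}
	v, ok := p.Annotations[criticalPodAnnotation]
	return ok && v == ""
}

// defaultPriorityClassName returns the PriorityClassName the pod ends up with
// after the defaulting step described above.
func defaultPriorityClassName(p podMeta, g gates) string {
	if g.PodPriority && g.CriticalPodAnnotation &&
		p.PriorityClassName == "" && hasCriticalAnnotation(p) {
		return systemClusterCritical
	}
	// Otherwise the name the user supplied (possibly empty) is left unchanged.
	return p.PriorityClassName
}

func main() {
	g := gates{PodPriority: true, CriticalPodAnnotation: true}
	annotated := podMeta{
		Namespace:   systemNamespace,
		Annotations: map[string]string{criticalPodAnnotation: ""},
	}
	fmt.Println(defaultPriorityClassName(annotated, g))

	// A manually set class suppresses the automatic assignment.
	annotated.PriorityClassName = "my-custom-class"
	fmt.Println(defaultPriorityClassName(annotated, g))
}
```

The second call shows why a manually chosen PriorityClass matters: once a name is present, the automatic assignment no longer applies.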
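The analysis above leads to the following best practice. When you deploy a key service in a Kubernetes cluster through something other than a DaemonSet (for example a Deployment or ReplicaSet), make sure of the following so that preemptive scheduling still succeeds when cluster resources are insufficient:
Enable the PodPriority feature gate;
Enable the ExperimentalCriticalPodAnnotation feature gate;
Add the critical-pod annotation (scheduler.alpha.kubernetes.io/critical-pod) to the Pod;
Deploy the Pod in the kube-system namespace;
Never set a custom PriorityClass manually.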