1. Deploy Helm
Grafana and Prometheus will be deployed with Helm later, so Helm must be installed first and working properly (see the earlier reference on deploying Helm).
The Helm client install is shown below:
[root@k8s-node1 helm]# curl https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 > get_helm.sh
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 6617 100 6617 0 0 5189 0 0:00:01 0:00:01 --:--:-- 5193
[root@k8s-node1 helm]# ls
get_helm.sh
[root@k8s-node1 helm]# chmod 700 get_helm.sh
[root@k8s-node1 helm]# ./get_helm.sh
Downloading https://get.helm.sh/helm-v3.0.2-linux-amd64.tar.gz
Preparing to install helm into /usr/local/bin
helm installed into /usr/local/bin/helm
[root@k8s-node1 helm]# helm version
version.BuildInfo{Version:"v3.0.2", GitCommit:"19e47ee3283ae98139d98460de796c1be1e3975f", GitTreeState:"clean", GoVersion:"go1.13.5"}
Add the stable chart repository:
[root@k8s-node1 helm]# helm repo add stable https://kubernetes-charts.storage.googleapis.com/
"stable" has been added to your repositories
Search the repo to see what is available:
[root@k8s-node1 helm]# helm search repo stable |grep grafana
stable/grafana 4.2.2 6.5.2 The leading tool for querying and visualizing t...
[root@k8s-node1 helm]# helm search repo stable |grep prometheus
stable/helm-exporter 0.3.1 0.4.0 Exports helm release stats to prometheus
stable/prometheus 9.7.2 2.13.1 Prometheus is a monitoring system and time seri...
stable/prometheus-adapter 1.4.0 v0.5.0 A Helm chart for k8s prometheus adapter
stable/prometheus-blackbox-exporter 1.6.0 0.15.1 Prometheus Blackbox Exporter
stable/prometheus-cloudwatch-exporter 0.5.0 0.6.0 A Helm chart for prometheus cloudwatch-exporter
stable/prometheus-consul-exporter 0.1.4 0.4.0 A Helm chart for the Prometheus Consul Exporter
stable/prometheus-couchdb-exporter 0.1.1 1.0 A Helm chart to export the metrics from couchdb...
stable/prometheus-mongodb-exporter 2.4.0 v0.10.0 A Prometheus exporter for MongoDB metrics
stable/prometheus-mysql-exporter 0.5.2 v0.11.0 A Helm chart for prometheus mysql exporter with...
stable/prometheus-nats-exporter 2.3.0 0.6.0 A Helm chart for prometheus-nats-exporter
stable/prometheus-node-exporter 1.8.1 0.18.1 A Helm chart for prometheus node-exporter
stable/prometheus-operator 8.5.0 0.34.0 Provides easy monitoring definitions for Kubern...
stable/prometheus-postgres-exporter 1.1.1 0.5.1 A Helm chart for prometheus postgres-exporter
stable/prometheus-pushgateway 1.2.10 1.0.1 A Helm chart for prometheus pushgateway
stable/prometheus-rabbitmq-exporter 0.5.5 v0.29.0 Rabbitmq metrics exporter for prometheus
stable/prometheus-redis-exporter 3.2.0 1.0.4 Prometheus exporter for Redis metrics
stable/prometheus-snmp-exporter 0.0.4 0.14.0 Prometheus SNMP Exporter
stable/prometheus-to-sd 0.3.0 0.5.2 Scrape metrics stored in prometheus format and ...
Deploy an application as a test:
[root@k8s-node1 helm]# helm install stable/nginx-ingress --generate-name
NAME: nginx-ingress-1577092943
LAST DEPLOYED: Mon Dec 23 17:22:26 2019
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
The nginx-ingress controller has been installed.
It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status by running 'kubectl --namespace default get services -o wide -w nginx-ingress-1577092943-controller'
[root@k8s-node1 helm]# helm ls
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
nginx-ingress-1577092943 default 1 2019-12-23 17:22:26.230661264 +0800 CST deployed nginx-ingress-1.27.0 0.26.1
Everything is up, as shown below:
[root@k8s-node1 helm]# kubectl get all |grep nginx
pod/nginx-ingress-1577092943-controller-8468884448-9wszl 1/1 Running 0 4m49s
pod/nginx-ingress-1577092943-default-backend-74c4db5b5b-clc2s 1/1 Running 0 4m49s
service/nginx-ingress-1577092943-controller LoadBalancer 10.254.229.168 <pending> 80:8691/TCP,443:8569/TCP 4m49s
service/nginx-ingress-1577092943-default-backend ClusterIP 10.254.37.89 <none> 80/TCP 4m49s
deployment.apps/nginx-ingress-1577092943-controller 1/1 1 1 4m49s
deployment.apps/nginx-ingress-1577092943-default-backend 1/1 1 1 4m49s
replicaset.apps/nginx-ingress-1577092943-controller-8468884448 1 1 1 4m49s
replicaset.apps/nginx-ingress-1577092943-default-backend-74c4db5b5b 1 1 1 4m49s
Deployment works and the test passes; now remove the application that was just installed.
[root@k8s-node1 helm]# helm ls
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
nginx-ingress-1577092943 default 1 2019-12-23 17:22:26.230661264 +0800 CST deployed nginx-ingress-1.27.0 0.26.1
[root@k8s-node1 helm]# helm uninstall nginx-ingress-1577092943
release "nginx-ingress-1577092943" uninstalled
2. Deploy Prometheus with Helm
References: the Prometheus official site and the Prometheus learning documentation.
2.1. Start the deployment
[root@k8s-node1 ~]# helm install stable/prometheus --generate-name
NAME: prometheus-1577239571
LAST DEPLOYED: Wed Dec 25 10:06:14 2019
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
The Prometheus server can be accessed via port 80 on the following DNS name from within your cluster:
prometheus-1577239571-server.default.svc.cluster.local
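The server Service is ClusterIP-only, so before any Ingress exists, a quick way to check it from a node is a port-forward. A sketch only: the release name below comes from this particular install and will differ on every run.

```shell
# forward local port 9090 to the prometheus server Service
kubectl --namespace default port-forward svc/prometheus-1577239571-server 9090:80 &
PF_PID=$!
sleep 2
# probe the server's readiness endpoint through the forward
curl -s http://127.0.0.1:9090/-/ready
kill "$PF_PID"
```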
2.2. Problems encountered
Check the Services and Pods that were started:
[root@k8s-node1 ~]# kubectl get svc,pod -o wide|grep prometheus
service/prometheus-1577239571-alertmanager ClusterIP 10.254.251.30 <none> 80/TCP 2m26s app=prometheus,component=alertmanager,release=prometheus-1577239571
service/prometheus-1577239571-kube-state-metrics ClusterIP None <none> 80/TCP 2m26s app=prometheus,component=kube-state-metrics,release=prometheus-1577239571
service/prometheus-1577239571-node-exporter ClusterIP None <none> 9100/TCP 2m26s app=prometheus,component=node-exporter,release=prometheus-1577239571
service/prometheus-1577239571-pushgateway ClusterIP 10.254.188.166 <none> 9091/TCP 2m26s app=prometheus,component=pushgateway,release=prometheus-1577239571
service/prometheus-1577239571-server ClusterIP 10.254.128.74 <none> 80/TCP 2m26s app=prometheus,component=server,release=prometheus-1577239571
pod/prometheus-1577239571-alertmanager-67b967b8c7-lmjf7 0/2 Pending 0 2m25s <none> <none> <none> <none>
pod/prometheus-1577239571-kube-state-metrics-6d86bf588b-w7hrq 1/1 Running 0 2m25s 172.30.4.7 k8s-node1 <none> <none>
pod/prometheus-1577239571-node-exporter-k9bsf 1/1 Running 0 2m25s 192.168.174.130 k8s-node3 <none> <none>
pod/prometheus-1577239571-node-exporter-rv9k8 1/1 Running 0 2m25s 192.168.174.129 k8s-node2 <none> <none>
pod/prometheus-1577239571-node-exporter-xc8f2 1/1 Running 0 2m25s 192.168.174.128 k8s-node1 <none> <none>
pod/prometheus-1577239571-pushgateway-d9b4cb944-zppfm 1/1 Running 0 2m25s 172.30.26.7 k8s-node3 <none> <none>
pod/prometheus-1577239571-server-c5d4dffbf-gzk9n 0/2 Pending 0 2m25s <none> <none> <none> <none>
Two Pods stay in Pending. Investigating with kubectl describe shows the following error:
Warning FailedScheduling 25s (x5 over 4m27s) default-scheduler pod has unbound immediate PersistentVolumeClaims (repeated 3 times)
It is a PVC error, so check the PVCs:
[root@k8s-node1 templates]# kubectl get pvc |grep prometheus
prometheus-1577239571-alertmanager Pending 21m
prometheus-1577239571-server Pending 21m
kubectl describe pvc shows the detail: there is no PV and no storage backend attached, so the PVC cannot bind. The error is:
Normal FailedBinding 16s (x82 over 20m) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
What now? This cluster already binds PVCs dynamically to NFS storage, so can the chart be pointed at that NFS backend?
NFS-backed storage was set up in an earlier article; the StorageClass name is shown below:
[root@k8s-node1 templates]# kubectl get storageclass
NAME PROVISIONER AGE
managed-nfs-storage fuseim.pri/ifs 5d17h
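As an aside, another way to let class-less PVCs bind (not the route taken below) is to mark this StorageClass as the cluster default, so any PVC that omits storageClassName is provisioned from it automatically:

```shell
# mark managed-nfs-storage as the default StorageClass for the cluster
kubectl patch storageclass managed-nfs-storage \
  -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
```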
2.3. Attach the storage and fix the error
Inspect the variables of stable/prometheus and look for the PV settings behind the error. The reference command is:
helm show values stable/prometheus
Since the values need to be searched and then edited, pull the chart down and modify it locally.
[root@k8s-node1 prometheus-grafana]# helm pull stable/prometheus
[root@k8s-node1 prometheus-grafana]# ls
prometheus-9.7.2.tgz
[root@k8s-node1 prometheus-grafana]# tar zxvf prometheus-9.7.2.tgz --warning=no-timestamp
[root@k8s-node1 prometheus-grafana]# ls
prometheus prometheus-9.7.2.tgz
[root@k8s-node1 prometheus-grafana]# tree prometheus
prometheus
├── Chart.yaml
├── README.md
├── templates
│ ├── alertmanager-clusterrolebinding.yaml
│ ├── alertmanager-clusterrole.yaml
│ ├── alertmanager-configmap.yaml
│ ├── alertmanager-deployment.yaml
│ ├── alertmanager-ingress.yaml
│ ├── alertmanager-networkpolicy.yaml
│ ├── alertmanager-pdb.yaml
│ ├── alertmanager-podsecuritypolicy.yaml
│ ├── alertmanager-pvc.yaml
│ ├── alertmanager-serviceaccount.yaml
│ ├── alertmanager-service-headless.yaml
│ ├── alertmanager-service.yaml
│ ├── alertmanager-statefulset.yaml
│ ├── _helpers.tpl
│ ├── kube-state-metrics-clusterrolebinding.yaml
│ ├── kube-state-metrics-clusterrole.yaml
│ ├── kube-state-metrics-deployment.yaml
│ ├── kube-state-metrics-networkpolicy.yaml
│ ├── kube-state-metrics-pdb.yaml
│ ├── kube-state-metrics-podsecuritypolicy.yaml
│ ├── kube-state-metrics-serviceaccount.yaml
│ ├── kube-state-metrics-svc.yaml
│ ├── node-exporter-daemonset.yaml
│ ├── node-exporter-podsecuritypolicy.yaml
│ ├── node-exporter-rolebinding.yaml
│ ├── node-exporter-role.yaml
│ ├── node-exporter-serviceaccount.yaml
│ ├── node-exporter-service.yaml
│ ├── NOTES.txt
│ ├── pushgateway-clusterrolebinding.yaml
│ ├── pushgateway-clusterrole.yaml
│ ├── pushgateway-deployment.yaml
│ ├── pushgateway-ingress.yaml
│ ├── pushgateway-networkpolicy.yaml
│ ├── pushgateway-pdb.yaml
│ ├── pushgateway-podsecuritypolicy.yaml
│ ├── pushgateway-pvc.yaml
│ ├── pushgateway-serviceaccount.yaml
│ ├── pushgateway-service.yaml
│ ├── server-clusterrolebinding.yaml
│ ├── server-clusterrole.yaml
│ ├── server-configmap.yaml
│ ├── server-deployment.yaml
│ ├── server-ingress.yaml
│ ├── server-networkpolicy.yaml
│ ├── server-pdb.yaml
│ ├── server-podsecuritypolicy.yaml
│ ├── server-pvc.yaml
│ ├── server-serviceaccount.yaml
│ ├── server-service-headless.yaml
│ ├── server-service.yaml
│ ├── server-statefulset.yaml
│ └── server-vpa.yaml
└── values.yaml
1 directory, 56 files
All variables are defined in the values.yaml file, so inspect it.
It contains a great deal and has to be checked entry by entry; one of the PV-related definitions is:
persistentVolume:
  ## If true, alertmanager will create/use a Persistent Volume Claim
  ## If false, use emptyDir
  ##
  enabled: true

  ## alertmanager data Persistent Volume access modes
  ## Must match those of existing PV or dynamic provisioner
  ## Ref: http://kubernetes.io/docs/user-guide/persistent-volumes/
  ##
  accessModes:
    - ReadWriteOnce

  ## alertmanager data Persistent Volume Claim annotations
  ##
  annotations: {}

  ## alertmanager data Persistent Volume existing claim name
  ## Requires alertmanager.persistentVolume.enabled: true
  ## If defined, PVC must be created manually before volume will be bound
  existingClaim: ""

  ## alertmanager data Persistent Volume mount root path
  ##
  mountPath: /data

  ## alertmanager data Persistent Volume size
  ##
  size: 2Gi

  ## alertmanager data Persistent Volume Storage Class
  ## If defined, storageClassName: <storageClass>
  ## If set to "-", storageClassName: "", which disables dynamic provisioning
  ## If undefined (the default) or set to null, no storageClassName spec is
  ## set, choosing the default provisioner. (gp2 on AWS, standard on
  ## GKE, AWS & OpenStack)
  ##
  # storageClass: "-"

  ## alertmanager data Persistent Volume Binding Mode
  ## If defined, volumeBindingMode: <volumeBindingMode>
  ## If undefined (the default) or set to null, no volumeBindingMode spec is
  ## set, choosing the default mode.
  ##
From the comments above, the chart defines a 2Gi PVC, and the parameter that ties the PVC to dynamic storage is storageClass, which is disabled by default. Enable it and point it at the StorageClass.
Change # storageClass: "-" to storageClass: managed-nfs-storage (managed-nfs-storage is the name of the StorageClass configured in this cluster; three places need the change in total):
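All three edits can be made in one sed pass. The sketch below runs against a stand-in file rather than the real chart; in the real prometheus/values.yaml the same commented line appears once each in the alertmanager, server, and pushgateway sections.

```shell
# stand-in for prometheus/values.yaml with the commented line in three places
printf '%s\n' '  # storageClass: "-"' '  size: 2Gi' '  # storageClass: "-"' '  # storageClass: "-"' > values-sample.yaml

# uncomment every occurrence and point it at managed-nfs-storage
sed -i 's/# storageClass: "-"/storageClass: managed-nfs-storage/' values-sample.yaml

grep -c 'storageClass: managed-nfs-storage' values-sample.yaml   # prints 3
```

Against the real chart, the same sed expression pointed at prometheus/values.yaml makes all three changes, after which the edited chart can be reinstalled from the local directory (e.g. helm install ./prometheus --generate-name). Alternatively, the values can be overridden at install time with --set server.persistentVolume.storageClass=managed-nfs-storage (plus the alertmanager and pushgateway equivalents) without editing the file at all.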
[root@k8s-node1 prometheus-grafana]# cat prometheus/values.yaml |grep -B 8 managed
## alertmanager data Persistent Volume Storage Class
## If defined, storageClassName: <storageClass>
## If set to "-", storageClassName: "", which disables dynamic provisioning
## If undefined (the default) or set to null, no storageClassName spec is
## set, choosing the default provisioner. (gp2 on AWS, standard on
## GKE, AWS & OpenStack)
##
# storageClass: "-"
storageClass: managed-nfs-storage
--
## Prometheus server data Persistent Volume Storage Class
## If defined, storageClassName: <storageClass>
## If set to "-", storageClassName: "", which disables dynamic provisioning
## If undefined (the default) or set to null, no storageClassName spec is
## set, choosing the default provisioner. (gp2 on AWS, standard on
## GKE, AWS & OpenStack)
##
# storageClass: "-"
storageClass: managed-nfs-storage
--
## pushgateway data Persistent Volume Storage Class
## If defined, storageClassName: <storageClass>
## If set to "-", storageClassName: "", which disables dynamic provisioning
## If undefined (the default) or set to null, no storageClassName spec is
## set, choosing the default provisioner. (gp2 on AWS, standard on
## GKE, AWS & OpenStack)
##
# storageClass: "-"
storageClass: managed-nfs-storage
That did it: after changing the storage parameters, the installation succeeds, as shown below:
[root@k8s-node1 prometheus-grafana]# kubectl get svc,pod -o wide |grep prometheus
service/prometheus-1577263826-alertmanager ClusterIP 10.254.112.105 <none> 80/TCP 4m6s app=prometheus,component=alertmanager,release=prometheus-1577263826
service/prometheus-1577263826-kube-state-metrics ClusterIP None <none> 80/TCP 4m6s app=prometheus,component=kube-state-metrics,release=prometheus-1577263826
service/prometheus-1577263826-node-exporter ClusterIP None <none> 9100/TCP 4m6s app=prometheus,component=node-exporter,release=prometheus-1577263826
service/prometheus-1577263826-pushgateway ClusterIP 10.254.185.145 <none> 9091/TCP 4m6s app=prometheus,component=pushgateway,release=prometheus-1577263826
service/prometheus-1577263826-server ClusterIP 10.254.132.104 <none> 80/TCP 4m6s app=prometheus,component=server,release=prometheus-1577263826
pod/prometheus-1577263826-alertmanager-5cfccc55b7-6hdqn 2/2 Running 0 4m5s 172.30.26.8 k8s-node3 <none> <none>
pod/prometheus-1577263826-kube-state-metrics-697db589d4-d5rmm 1/1 Running 0 4m5s 172.30.26.7 k8s-node3 <none> <none>
pod/prometheus-1577263826-node-exporter-5gcc2 1/1 Running 0 4m5s 192.168.174.129 k8s-node2 <none> <none>
pod/prometheus-1577263826-node-exporter-b569p 1/1 Running 0 4m5s 192.168.174.130 k8s-node3 <none> <none>
pod/prometheus-1577263826-node-exporter-mft6l 1/1 Running 0 4m5s 192.168.174.128 k8s-node1 <none> <none>
pod/prometheus-1577263826-pushgateway-95c67bd5d-28p25 1/1 Running 0 4m5s 172.30.4.7 k8s-node1 <none> <none>
pod/prometheus-1577263826-server-88fbdfc47-p2bfm 2/2 Running 0 4m5s 172.30.4.8 k8s-node1 <none> <none>
2.4. Prometheus basic concepts
What each of these Prometheus components does (adapted from the referenced source):
prometheus server
Prometheus Server is the core of the Prometheus stack; it retrieves, stores, and queries the monitoring data.
It also ships with a built-in expression browser UI, in which PromQL queries can be run and the results visualized directly.
node-exporter
An exporter exposes a metrics-collection endpoint as an HTTP service; Prometheus Server scrapes that endpoint to collect the monitoring data it needs.
alertmanager
Prometheus Server supports alerting rules written in PromQL; when a rule's expression is satisfied, an alert fires, and everything after that point is handled by Alertmanager. Alertmanager integrates with built-in notification channels such as email and Slack, and custom alert handling can be wired up via webhooks. It is the alert-processing hub of the Prometheus ecosystem.
pushgateway
Prometheus collects data with a pull model, so the network must normally allow Prometheus Server to reach each exporter directly. When that is not possible, the Pushgateway acts as an intermediary: jobs on the internal network push their monitoring data to the gateway, and Prometheus Server pulls from the gateway in the usual way.
This environment has no need for it.
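Purely as an illustration (since this setup does not use the gateway): a batch job would build its metrics in the Prometheus text exposition format and POST them to the gateway's job-scoped endpoint. The service name in the commented command is taken from this install and would differ for any other release.

```shell
# build a metric in the Prometheus text exposition format
payload=$(printf '# TYPE backup_last_success_timestamp gauge\nbackup_last_success_timestamp %s\n' "$(date +%s)")
echo "$payload"

# pushing it is a POST to the gateway (commented; needs the cluster):
# echo "$payload" | curl --data-binary @- \
#   http://prometheus-1577263826-pushgateway.default.svc.cluster.local:9091/metrics/job/backup_job
```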
kube-state-metrics
The basic idea: kube-state-metrics polls the Kubernetes API and turns Kubernetes' structured object information into metrics, e.g. how many RCs are scheduled and how many are currently available, or how many Jobs are executing.
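The lines below are an illustrative sample of that output (standard kube-state-metrics series, not captured from this cluster); inside the cluster the same kind of lines come from the service's /metrics endpoint.

```shell
# illustrative sample of kube-state-metrics exposition output
cat <<'EOF' > ksm-sample.txt
kube_deployment_status_replicas_available{namespace="default",deployment="nginx"} 1
kube_pod_status_phase{namespace="default",pod="web-0",phase="Running"} 1
kube_job_status_active{namespace="default",job_name="backup"} 0
EOF
# live equivalent (release name from this install):
#   curl -s http://prometheus-1577263826-kube-state-metrics.default.svc.cluster.local/metrics
grep -c '^kube_' ksm-sample.txt   # prints 3
```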
2.5. Configure web access to prometheus server and kube-state-metrics
Traefik was deployed earlier in this environment, so adding Ingress objects is all that is needed:
prometheus server
[root@k8s-node1 prometheus-grafana]# cat prometheus-server-ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus-server
  namespace: default
spec:
  rules:
  - host: prometheus-server
    http:
      paths:
      - path: /
        backend:
          serviceName: prometheus-1577263826-server
          servicePort: 80
kube-state-metrics
[root@k8s-node1 prometheus-grafana]# cat kube-state-ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: kube-state
  namespace: default
spec:
  rules:
  - host: kube-state
    http:
      paths:
      - path: /
        backend:
          serviceName: prometheus-1577263826-kube-state-metrics
          servicePort: 80
Add host resolution for the two hostnames and they become reachable; note that both servers are served over HTTPS. For how to configure Traefik, refer to the earlier Traefik article.
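With the hostnames resolving to a Traefik node (the IP below is one of this cluster's nodes, used here as an assumption), a quick check could look like:

```shell
# both vhosts are served over HTTPS by traefik; -k skips cert verification
curl -sk -o /dev/null -w '%{http_code}\n' -H 'Host: prometheus-server' https://192.168.174.128/
curl -sk -o /dev/null -w '%{http_code}\n' -H 'Host: kube-state' https://192.168.174.128/metrics
```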