Kubernetes cluster deployment: diagnosing and fixing an etcd container that keeps restarting
Symptoms
While deploying Kubernetes 1.26, after the cluster had been initialized with kubeadm, every kubectl command failed with the following error:
The connection to the server localhost:8080 was refused - did you specify the right host or port?
Checking whether kubelet was healthy showed that it could not connect to the apiserver on port 6443.
Dec 21 09:36:03 k8s-master kubelet[7127]: E1221 09:36:03.015089 7127 kubelet_node_status.go:540] "Error updating node status, will retry" err="error getting node \"k8s-master\": Get \"https://192.168.2.200:6443/api/v1/nodes/k8s-master?timeout=10s\": dial tcp 192.168.2.200:6443: connect: connection refused"
Dec 21 09:36:03 k8s-master kubelet[7127]: E1221 09:36:03.015445 7127 kubelet_node_status.go:540] "Error updating node status, will retry" err="error getting node \"k8s-master\": Get \"https://192.168.2.200:6443/api/v1/nodes/k8s-master?timeout=10s\": dial tcp 192.168.2.200:6443: connect: connection refused"
Dec 21 09:36:03 k8s-master kubelet[7127]: E1221 09:36:03.015654 7127 kubelet_node_status.go:540] "Error updating node status, will retry" err="error getting node \"k8s-master\": Get \"https://192.168.2.200:6443/api/v1/nodes/k8s-master?timeout=10s\": dial tcp 192.168.2.200:6443: connect: connection refused"
Dec 21 09:36:03 k8s-master kubelet[7127]: E1221 09:36:03.015818 7127 kubelet_node_status.go:540] "Error updating node status, will retry" err="error getting node \"k8s-master\": Get \"https://192.168.2.200:6443/api/v1/nodes/k8s-master?timeout=10s\": dial tcp 192.168.2.200:6443: connect: connection refused"
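When the kubelet journal is long, it can help to reduce it to the distinct endpoints that refuse connections. The following is a minimal sketch, not part of the original write-up; the function name refused_endpoints is hypothetical, and it assumes the log lines keep the "dial tcp HOST:PORT: connect: connection refused" wording shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read log lines on stdin and print each distinct
# HOST:PORT that appears in a "connection refused" dial error.
refused_endpoints() {
  grep -o 'dial tcp [^ ]*: connect: connection refused' \
    | sed -e 's/^dial tcp //' -e 's/: connect: connection refused$//' \
    | sort -u
}

# Possible usage on the node (assumes kubelet runs as a systemd unit):
#   journalctl -u kubelet --no-pager -n 200 | refused_endpoints
```

Run against the journal excerpt above, this would list only 192.168.2.200:6443, which points at the apiserver rather than at kubelet itself.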
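The next step was to check the state of the apiserver container. Because this cluster uses containerd as its container runtime and kubectl was unusable at this point, crictl ps -a was used to list all containers.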
root@k8s-master:~/k8s/calico# crictl ps -a
CONTAINER       IMAGE           CREATED          STATE     NAME                      ATTEMPT   POD ID          POD
395b45b1cb733   a31e1d84401e6   50 seconds ago   Exited    kube-apiserver            28        e87800ae06ff5   kube-apiserver-k8s-master
b5c7e2a07bf1b   5d7c5dfd3ba18   3 minutes ago    Running   kube-controller-manager   32        6b7cc9dd07f1d   kube-controller-manager-k8s-master
944aa31862613   556768f31eb1d   4 minutes ago    Exited    kube-proxy                27        ccb6557c6f629   kube-proxy-ctjjq
c097332b6f416   fce326961ae2d   4 minutes ago    Exited    etcd                      30        079d491eb9925   etcd-k8s-master
b8103090322c4   dafd8ad70b156   6 minutes ago    Exited    kube-scheduler            32        48f9544c9798c   kube-scheduler-k8s-master
a14b969e8ad05   5d7c5dfd3ba18   12 minutes ago   Exited    kube-controller-manager   31        5576806b4e142   kube-controller-manager-k8s-master
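The high ATTEMPT counts in this listing are the real signal: nearly every control-plane container has been restarted around thirty times. As a rough sketch (the function name exited_containers is hypothetical, and the parsing assumes the column order shown above), the listing can be reduced to the container ID, name, and attempt count of each exited container:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `crictl ps -a` output on stdin and print
# "CONTAINER NAME ATTEMPT" for every container in the Exited state.
# The CREATED column has a variable number of words, so the STATE field
# is located by value instead of by a fixed column position.
exited_containers() {
  awk '{
    for (i = 1; i < NF; i++) {
      if ($i == "Exited") { print $1, $(i + 1), $(i + 2); break }
    }
  }'
}

# Possible usage on the node:
#   crictl ps -a | exited_containers
# The logs of one of the listed containers can then be read with:
#   crictl logs <CONTAINER>
```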
The kube-apiserver container had already exited, so its logs were checked for anomalies. They showed that kube-apiserver could not connect to etcd on port 2379, which suggested that the real problem lay with etcd.
W1221 07:00:20.392868 1 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to { "Addr": "127.0.0.1:2379", "ServerName": "127.0.0.1", "Attributes": null, "BalancerAttributes": null, "Type": 0, "Metadata": null }. Err: connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused"
W1221 07:00:21.391330 1 logging.go:59] [core] [Channel #4 SubChannel #6] grpc: addrConn.createTransport failed to connect to { "Addr": "127.0.0.1:2379", "ServerName": "127.0.0.1", "Attributes": null, "BalancerAttributes": null, "Type": 0, "Metadata": null }. Err: connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused"
The etcd container was also restarting continuously, yet its logs contained no error-level messages.
{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"d975d9ebc69964b3 is starting a new election at term 2"}
{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"d975d9ebc69964b3 became pre-candidate at term 2"}
{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"d975d9ebc69964b3 received MsgPreVoteResp from d975d9ebc69964b3 at term 2"}
{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"d975d9ebc69964b3 became candidate at term 3"}
{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"d975d9ebc69964b3 received MsgVoteResp from d975d9ebc69964b3 at term 3"}
{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"d975d9ebc69964b3 became leader at term 3"}
{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"raft.node: d975d9ebc69964b3 elected leader d975d9ebc69964b3 at term 3"}
{"level":"info","ts":"2022-12-21T10:29:00.742Z","caller":"etcdserver/server.go:2054","msg":"published local member to cluster through raft","local-member-id":"d975d9ebc69964b3","local-member-attributes":"{Name:k8s-master ClientURLs:[https://192.168.2.200:2379]}","request-path":"/0/members/d975d9ebc69964b3/attributes","cluster-id":"f88ac1c8c4bab6","publish-timeout":"7s"}
{"level":"info","ts":"2022-12-21T10:29:00.742Z","caller":"embed/serve.go:100","msg":"ready to serve client requests"}
{"level":"info","ts":"2022-12-21T10:29:00.742Z","caller":"embed/serve.go:100","msg":"ready to serve client requests"}
{"level":"info","ts":"2022-12-21T10:29:00.743Z","caller":"etcdmain/main.go:44","msg":"notifying init daemon"}
{"level":"info","ts":"2022-12-21T10:29:00.743Z","caller":"etcdmain/main.go:50","msg":"successfully notified init daemon"}
{"level":"info","ts":"2022-12-21T10:29:00.744Z","caller":"embed/serve.go:198","msg":"serving client traffic securely","address":"192.168.2.200:2379"}
{"level":"info","ts":"2022-12-21T10:29:00.745Z","caller":"embed/serve.go:198","msg":"serving client traffic securely","address":"127.0.0.1:2379"}
{"level":"info","ts":"2022-12-21T10:30:20.624Z","caller":"osutil/interrupt_unix.go:64","msg":"received signal; shutting down","signal":"terminated"}
{"level":"info","ts":"2022-12-21T10:30:20.624Z","caller":"embed/etcd.go:373","msg":"closing etcd server","name":"k8s-master","data-dir":"/var/lib/etcd","advertise-peer-urls":["https://192.168.2.200:2380"],"advertise-client-urls":["https://192.168.2.200:2379"]}
{"level":"info","ts":"2022-12-21T10:30:20.636Z","caller":"etcdserver/server.go:1465","msg":"skipped leadership transfer for single voting member cluster","local-member-id":"d975d9ebc69964b3","current-leader-member-id":"d975d9ebc69964b3"}
{"level":"info","ts":"2022-12-21T10:30:20.637Z","caller":"embed/etcd.go:568","msg":"stopping serving peer traffic","address":"192.168.2.200:2380"}
{"level":"info","ts":"2022-12-21T10:30:20.639Z","caller":"embed/etcd.go:573","msg":"stopped serving peer traffic","address":"192.168.2.200:2380"}
{"level":"info","ts":"2022-12-21T10:30:20.639Z","caller":"embed/etcd.go:375","msg":"closed etcd server","name":"k8s-master","data-dir":"/var/lib/etcd","advertise-peer-urls":["https://192.168.2.200:2380"],"advertise-client-urls":["https://192.168.2.200:2379"]}
One line, however, shows that etcd received a shutdown signal: it was stopped from outside rather than crashing on its own.
{"level":"info","ts":"2022-12-21T10:30:20.624Z","caller":"osutil/interrupt_unix.go:64","msg":"received signal; shutting down","signal":"terminated"}
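Since every entry is logged at info level, filtering by severity finds nothing; the shutdown has to be found by its message. A minimal sketch follows, assuming etcd's JSON log format as shown above (the function name shutdown_signals is hypothetical). It prints the timestamp and signal name of each such entry:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read etcd JSON log lines on stdin and print
# "TIMESTAMP SIGNAL" for each "received signal; shutting down" entry.
shutdown_signals() {
  grep '"msg":"received signal; shutting down"' \
    | sed -n 's/.*"ts":"\([^"]*\)".*"signal":"\([^"]*\)".*/\1 \2/p'
}

# Possible usage on the node, with the etcd container ID from `crictl ps -a`:
#   crictl logs <CONTAINER> 2>&1 | shutdown_signals
```

A "terminated" signal with no preceding error suggests that something outside etcd, such as the kubelet or the container runtime, is stopping the container, which shifts the search away from etcd's own configuration.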
Solution
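The problem was caused by an incorrect cgroups setting. Since v1.22, kubeadm configures the kubelet to use the systemd cgroup driver by default, whereas containerd's default configuration leaves SystemdCgroup set to false, so the two most likely disagreed. To fix it, set SystemdCgroup to true in the containerd configuration file /etc/containerd/config.toml.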
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  BinaryName = ""
  CriuImagePath = ""
  CriuPath = ""
  CriuWorkPath = ""
  IoGid = 0
  IoUid = 0
  NoNewKeyring = false
  NoPivotRoot = false
  Root = ""
  ShimCgroup = ""
  SystemdCgroup = true
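The same change can be scripted instead of edited by hand. This is only a sketch under stated assumptions: the function name enable_systemd_cgroup is hypothetical, the file is assumed to already contain a SystemdCgroup = false line, and a backup with a .bak suffix is written next to the original first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: flip "SystemdCgroup = false" to "SystemdCgroup = true"
# in the containerd config file given as $1, keeping a backup at "$1.bak".
enable_systemd_cgroup() {
  config="$1"
  cp "$config" "$config.bak" || return 1
  # Rewrite from the backup so the original file keeps its owner and mode.
  sed 's/^\([[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*\)false/\1true/' \
    "$config.bak" > "$config"
}

# Possible usage on the node (as root):
#   enable_systemd_cgroup /etc/containerd/config.toml
```

If /etc/containerd/config.toml does not exist yet, it can first be generated with containerd config default > /etc/containerd/config.toml, which writes the full default configuration including this key.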
Restart the containerd service:
systemctl restart containerd
The etcd container stopped restarting, the other containers returned to normal, and the problem was resolved.
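To guard against the same issue recurring, one might confirm that the kubelet and containerd agree on the cgroup driver. The sketch below is an illustration, not something the original troubleshooting ran. The function names are hypothetical, /var/lib/kubelet/config.yaml is assumed to be the kubelet configuration written by kubeadm, and the parsing handles only an unquoted, top-level cgroupDriver key.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the cgroupDriver value from a
# KubeletConfiguration YAML file ($1). Prints nothing if the key is absent.
kubelet_cgroup_driver() {
  sed -n 's/^cgroupDriver:[[:space:]]*\([A-Za-z]*\).*/\1/p' "$1" | head -n 1
}

# Hypothetical helper: print "systemd" if the containerd config ($1) sets
# SystemdCgroup = true, otherwise "cgroupfs".
containerd_cgroup_driver() {
  if grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$1"; then
    echo systemd
  else
    echo cgroupfs
  fi
}

# Succeeds only when both sides report the same driver.
cgroup_drivers_match() {
  [ "$(kubelet_cgroup_driver "$1")" = "$(containerd_cgroup_driver "$2")" ]
}

# Possible usage on the node:
#   cgroup_drivers_match /var/lib/kubelet/config.yaml /etc/containerd/config.toml \
#     && echo "cgroup drivers match" || echo "cgroup drivers differ"
```

After that, crictl ps -a should show the control-plane containers in the Running state with ATTEMPT counts that no longer increase.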