K8s Deployment --- Troubleshooting

Problem 1: When deploying a K8s cluster with kubeadm, a node hangs and then fails while joining the cluster?
[root@k8s-node01 ~]# kubeadm join 192.168.1.201:6443 --token 1qo7ms.7atall1jcecf10qz --discovery-token-ca-cert-hash
sha256:d1d102ceb6241a3617777f6156cd4e86dc9f9edd9e1d6d73266d6ca7f6280890
[preflight] Running pre-flight checks
Cause analysis: after initializing the master node, some components were found to be abnormal:
[root@k8s-master01 ~]# kubectl get pod -n kube-system && kubectl get svc
NAME READY STATUS RESTARTS AGE
coredns-54d67798b7-28w5q 0/1 Pending 0 3m39s
coredns-54d67798b7-sxqpm 0/1 Pending 0 3m39s
etcd-k8s-master01 1/1 Running 0 3m53s
kube-apiserver-k8s-master01 1/1 Running 0 3m53s
kube-controller-manager-k8s-master01 1/1 Running 0 3m53s
kube-proxy-rvj6w 0/1 CrashLoopBackOff 5 3m40s
kube-scheduler-k8s-master01 1/1 Running 0 3m53s
Solution: modify kubeadm-config.yaml, then reset and re-initialize the master node.
kubeadm reset -f; ipvsadm --clear; rm -rf ./.kube
kubeadm init --config=new-kubeadm-config.yaml --upload-certs | tee kubeadm-init.log
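Before re-initializing, it is worth confirming why kube-proxy is crash-looping. A minimal diagnostic sketch; the pod name is taken from the output above, and the ipvs kernel modules are an assumption about one common cause (kube-proxy configured for ipvs mode without the modules loaded):
# Inspect the crash-looping kube-proxy pod
kubectl -n kube-system logs kube-proxy-rvj6w
kubectl -n kube-system describe pod kube-proxy-rvj6w
# If kube-proxy runs in ipvs mode, the kernel modules must be loaded
# (on newer kernels nf_conntrack replaces nf_conntrack_ipv4)
for m in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack; do modprobe $m; done
lsmod | grep ip_vs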
Problem 2: When deploying a K8s cluster with kubeadm, a node fails to join the cluster?
[root@k8s-node01 ~]# kubeadm join 192.168.1.201:6443 --token 2g9k0a.tsm6xe31rdb7jbo8 --discovery-token-ca-cert-hash
sha256:d1d102ceb6241a3617777f6156cd4e86dc9f9edd9e1d6d73266d6ca7f6280890
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
error execution phase preflight: unable to fetch the kubeadm-config ConfigMap: failed to get config map: Unauthorized
To see the stack trace of this error execute with --v=5 or higher
Cause analysis: the token has expired.
Solution: generate a new token that never expires:
[root@k8s-master01 ~]# kubeadm token create --ttl 0 --print-join-command
W0819 12:00:27.541838 :202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
kubeadm join 192.168.1.201:6443 --token 6xyv8a.cueltqmpe9qa8nxu --discovery-token-ca-cert-hash
sha256:bd78dfd370e47dfca742b5f6934c21014792168fa4dc19c9fa63bfdd87270097
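To check whether an existing token has expired before regenerating one, kubeadm can list tokens and their expiration (standard subcommands):
# List bootstrap tokens with their TTL/expiration
kubeadm token list
# Then create a non-expiring token and print the full join command, as above
kubeadm token create --ttl 0 --print-join-command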
Problem 3: When deploying a K8s cluster with kubeadm, deploying the flannel component fails?
kube-flannel-ds-amd64-8cqqz            0/1    CrashLoopBackOff    3          84s  192.168.66.10  k8s-master01  <none>          <none>
Cause analysis: the logs show that registering the network failed, because the yaml file used when initializing the master node was faulty.
kubectl logs kube-flannel-ds-amd64-8cqqz -n kube-system
I0602 01:53:54.021093 :514] Determining IP address of default interface
I0602 01:53:54.022514 :527] Using interface with name ens33 and address 192.168.66.10
I0602 01:53:54.022619 :544] Defaulting external address to interface address (192.168.66.10)
I0602 01:53:54.030311 :126] Waiting 10m0s for node controller to sync
I0602 01:53:54.030555 :309] Starting kube subnet manager
I0602 01:53:55.118656 :133] Node controller sync successful
I0602 01:53:55.118754 :244] Created subnet manager: Kubernetes Subnet Manager - k8s-master01
I0602 01:53:55.118765 :247] Installing signal handlers
I0602 01:53:55.119057 :386] Found network config - Backend type: vxlan
I0602 01:53:55.119146 :120] VXLAN config: VNI=1 Port=0 GBP=false DirectRouting=false
E0602 01:53:55.119470 :289] Error registering network: failed to acquire lease: node "k8s-master01" pod cidr not assigned
I0602 01:53:55.119506 :366]
Solution: modify kubeadm-config.yaml, then re-initialize the master node.
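The "pod cidr not assigned" error usually means the cluster was initialized without a pod network CIDR. A minimal sketch of the relevant part of kubeadm-config.yaml, assuming flannel's default 10.244.0.0/16 network (version and subnet values are placeholders to adapt):
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.15.1
networking:
  podSubnet: 10.244.0.0/16      # must match the Network value in flannel's kube-flannel.yml
  serviceSubnet: 10.96.0.0/12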
Problem 4: Initializing the K8s cluster master node fails?
[init] Using Kubernetes version: v1.15.1
[preflight] Running pre-flight checks
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.6. Latest validated version: 18.09
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Port-6443]: Port 6443 is in use
[ERROR Port-10251]: Port 10251 is in use
[ERROR Port-10252]: Port 10252 is in use
[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
[ERROR Port-10250]: Port 10250 is in use
[ERROR Port-2379]: Port 2379 is in use
[ERROR Port-2380]: Port 2380 is in use
[ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
Cause analysis: a master node has already been initialized on this host.
Solution: reset K8s, then re-initialize.
kubeadm reset
kubeadm init --config=kubeadm-config.yaml --upload-certs | tee kubeadm-init.log
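If init still reports ports in use after a reset, standard socket tooling shows which processes hold them (a generic check, not specific to kubeadm):
# Show processes listening on the control-plane ports named in the errors
ss -lntp | grep -E ':6443|:10250|:10251|:10252|:2379|:2380'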
Problem 5: After a successful K8s reset, do related files need to be deleted?
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W0602 10:20:53.656954 :79] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] No etcd config found. Assuming external etcd
[reset] Please, manually reset etcd to prevent further issues
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/kubelet /etc/cni/net.d /var/lib/dockershim /var/run/kubernetes]
The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually.
For example:
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
If your cluster was set up to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.
The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
Cause analysis: none.
Solution: delete the files mentioned in the prompt, to avoid problems after the master node is initialized again.
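Putting the hints from the reset output together, a cleanup sequence along these lines (taken directly from the messages above) leaves no stale state behind:
# Flush iptables rules that kubeadm reset does not touch
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
# Clear IPVS tables if the cluster used IPVS
ipvsadm --clear
# Remove the old kubeconfig so the next init starts clean
rm -rf $HOME/.kube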
Problem 6: After the master node initializes successfully, querying node information fails?
Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
Cause analysis: the cache files were not deleted when kubeadm was reset.
Solution: delete the cache files, then initialize the master node again.
rm -rf $HOME/.kube/
kubeadm reset
>kubeadm-init.log
Problem 7: A worker node hangs while joining the master node, showing only a Docker version warning?
[preflight] Running pre-flight checks
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.6. Latest validated version: 18.09
error execution phase preflight: couldn't validate the identity of the API Server: abort connecting to API servers after timeout of 5m0s
Cause analysis: the master node's token has expired, and the Docker version is too new.
Solution: using Docker 18.06 eliminates the warning; regenerate the token on the master node, then run the join command on the worker node with the new token.
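Concretely, the same flow as in Problem 2 applies; a short sketch (the token and hash are whatever the master prints, the placeholders below are not real values):
# On the master: regenerate a token and print the matching join command
kubeadm token create --print-join-command
# On the worker: run the freshly printed command, e.g.
kubeadm join 192.168.1.201:6443 --token <new-token> --discovery-token-ca-cert-hash sha256:<hash>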
Problem 8: A master node fails to join the K8s cluster?
[root@k8s-master02 ~]# kubeadm join 192.168.1.201:6443 --token 6xyv8a.cueltqmpe9qa8nxu --discovery-token-ca-cert-hash
sha256:bd78dfd370e47dfca742b5f6934c21014792168fa4dc19c9fa63bfdd87270097 \
> --control-plane --certificate-key b464a8d23d3313c4c0bb5b65648b039cb9b1177dddefbf46e2e296899d0e4516
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
error execution phase preflight:
One or more conditions for hosting a new control plane instance is not satisfied.
unable to add a new control plane instance to a cluster that doesn't have a stable controlPlaneEndpoint address
Please ensure that:
* The cluster has a stable controlPlaneEndpoint address.
* The certificates that must be shared among control plane instances are provided.
To see the stack trace of this error execute with --v=5 or higher
Cause analysis: the certificates are not shared.
Solution: share the certificates:
### Run the following on the other master nodes ###
mkdir -p /etc/kubernetes/pki/etcd/
### Run the following on master01 ###
cd /etc/kubernetes/pki/
scp ca.* front-proxy-ca.* sa.* 192.168.1.202:/etc/kubernetes/pki/
scp ca.* front-proxy-ca.* sa.* 192.168.1.203:/etc/kubernetes/pki/
### Run the following on the other master nodes ###
kubeadm join 192.168.1.201:6443 --token 6xyv8a.cueltqmpe9qa8nxu --discovery-token-ca-cert-hash
sha256:bd78dfd370e47dfca742b5f6934c21014792168fa4dc19c9fa63bfdd87270097 --control-plane --certificate-key
b464a8d23d3313c4c0bb5b65648b039cb9b1177dddefbf46e2e296899d0e4516
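The other condition in the error, a stable controlPlaneEndpoint, is set when the first master is initialized. A minimal sketch of the field in kubeadm-config.yaml (kubeadm v1beta2 config API; in a real HA setup this should be a VIP or load-balancer address rather than a single master's IP):
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
controlPlaneEndpoint: "192.168.1.201:6443"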
Problem 9: Deploying Prometheus fails?
unable to recognize "0prometheus-operator-rviceMonitor.yaml": no matches for kind "ServiceMonitor" in version " m/v1"
unable to recognize "alertmanager-alertmanager.yaml": no matches for kind "Alertmanager" in version "/v1" unable to recognize "alertmanager-rviceMonitor.yaml": no matches for kind "ServiceMonitor" in version "/v1" unable to recognize "grafana-rviceMonitor.yaml": no matches for kind "ServiceMonitor" in version "/v1"
unable to recognize "kube-state-metrics-rviceMonitor.yaml": no matches for kind "ServiceMonitor" in version "/ v1" unable to recognize "node-exporter-rviceMonitor.yaml": no matches for kind "ServiceMonitor" in version "/v1" unable to recognize "prometheus-prometheus.yaml": no matches for kind "Prometheus" in version "/v1"
复古风英文
unable to recognize "prometheus-rules.yaml": no matches for kind "PrometheusRule" in version "/v1"
unable to recognize "prometheus-rviceMonitor.yaml": no matches for kind "ServiceMonitor" in vers
ion "/v1" unable to recognize "prometheus-rviceMonitorApirver.yaml": no matches for kind "ServiceMonitor" in version " /v1"
unable to recognize "prometheus-rviceMonitorCoreDNS.yaml": no matches for kind "ServiceMonitor" in version
"/v 1"
unable to recognize "prometheus-rviceMonitorKubeControllerManager.yaml": no matches for kind "ServiceMonitor" in version "s.com/v1"
unable to recognize "prometheus-rviceMonitorKubeScheduler.yaml": no matches for kind "ServiceMonitor" in version "s .com/v1"
unable to recognize "prometheus-rviceMonitorKubelet.yaml": no matches for kind "ServiceMonitor" in version "/v 1"sing现在分词
Cause analysis: not fully clear; most likely the Prometheus Operator's CustomResourceDefinitions (ServiceMonitor, Prometheus, Alertmanager, PrometheusRule) had not been registered yet when the manifests were applied.
Solution: re-run the deploy command once the operator and its CRDs are in place.
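A sketch of the usual two-step apply, assuming a kube-prometheus-style layout where the operator and CRDs live in a setup/ subdirectory (adjust the paths to the repository actually in use):
# Step 1: register the operator and its CRDs first
kubectl apply -f manifests/setup/
# Wait until the CRDs are established before using them
kubectl wait --for=condition=Established crd --all
# Step 2: apply the remaining manifests (ServiceMonitor, Prometheus, ...)
kubectl apply -f manifests/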
Problem 10: Unable to clone the K8s HA installation package with git, although the package can be downloaded directly through a browser?
Cloning into 'k8s-ha-install'...
error: RPC failed; result=35, HTTP code = 0
fatal: The remote end hung up unexpectedly
Cause analysis: the git buffer is too small.
Solution: enlarge the git buffer: git config --global http.postBuffer 100M
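If enlarging the buffer is not enough, a shallow clone also shrinks the transfer; a plain-git alternative (the repository URL is whichever repo the article clones):
# Fetch only the latest commit to keep the HTTP transfer small
git clone --depth 1 <repo-url>
# Deepen later if the full history is needed
git fetch --unshallow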
Problem 11: Unable to download the K8s HA installation package with git, although the package can be downloaded directly through a browser?
Cause analysis: unclear.
Solution: download the git package manually and upload it to the host.
Problem 12: Listing branches of the K8s repository fails?
[root@k8s-master01 k8s-ha-install-master]# git branch -a
fatal: Not a git repository (or any of the parent directories): .git
Cause analysis: the local .git repository metadata is missing.
Solution: initialize git:
[root@k8s-master01 k8s-ha-install-master]# git init
Initialized empty Git repository in /root/install-k8s-v1.17/k8s-ha-install-master/.git/
[root@k8s-master01 k8s-ha-install-master]# git branch -a
Problem 13: Switching branches of the K8s repository fails?
[root@k8s-master01 k8s-ha-install-master]# git checkout manual-installation-v1.20.x
error: pathspec 'manual-installation-v1.20.x' did not match any file(s) known to git.
Cause analysis: the branch is not found.
Solution: the repository must be obtained with git clone; a zip downloaded through a browser cannot be used, and neither can a git clone that was re-archived and extracted again.
[root@k8s-master01 k8s-ha-install]# git checkout manual-installation-v1.20.x
Branch manual-installation-v1.20.x set up to track remote branch manual-installation-v1.20.x from origin.
Switched to a new branch 'manual-installation-v1.20.x'
Problem 14: Generating the apiserver aggregation certificate produces a warning?
Cause analysis: the generation command lacked a hosts parameter, so the certificate is flagged as unsuitable for websites; this does not affect communication between the apiserver and the other components.
Solution: the warning can be ignored.
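For reference, this warning usually comes from cfssl when the CSR JSON has no hosts field. A purely illustrative sketch of where hosts would go if the certificate ever had to serve a hostname (file name and values are hypothetical):
# front-proxy-client-csr.json with an explicit hosts list
cat > front-proxy-client-csr.json <<EOF
{
  "CN": "front-proxy-client",
  "hosts": ["192.168.1.201"],
  "key": { "algo": "rsa", "size": 2048 }
}
EOF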
