Summary of exceptions encountered when adding a node to a k8s cluster

Environment: k8s cluster built with kubeadm

Problems:

1. Check the versions of kubeadm, kubelet and kubectl installed on the existing nodes

[root@k8s-3 ~]# yum list kubeadm kubelet kubectl 
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirrors.nju.edu.cn
 * epel: ftp.riken.jp
 * extras: mirrors.nju.edu.cn
 * updates: mirrors.nju.edu.cn
Installed Packages
kubeadm.x86_64                                                                1.17.4-0                                                                @kubernetes
kubectl.x86_64                                                                1.17.4-0                                                                @kubernetes
kubelet.x86_64                                                                1.17.4-0                                                                @kubernetes
Available Packages
kubeadm.x86_64                                                                1.18.0-0                                                                kubernetes 
kubectl.x86_64                                                                1.18.0-0                                                                kubernetes 
kubelet.x86_64                                                                1.18.0-0                                                                kubernetes

Install the same versions on the node being added:

yum install -y kubeadm-1.17.4-0 kubectl-1.17.4-0 kubelet-1.17.4-0
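To confirm the new node ended up on the same version as the rest of the cluster, the installed packages can be checked afterwards (a quick sanity check; output omitted):

rpm -q kubeadm kubelet kubectl
kubeadm version -o short
kubelet --version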

2. Error reported when adding the node

I did not keep a screenshot of the error, so here is the general idea: the output prompted me to check the kubelet status, and when I tried to start the kubelet service directly, the following error was reported:

[root@k8s3-1 kubernetes]# journalctl  -xeu kubelet 
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit kubelet.service has finished starting up.
-- 
-- The start-up result is done.
Apr 05 23:45:52 k8s3-1 kubelet[5469]: F0405 23:45:52.985015    5469 server.go:198] failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed 
Apr 05 23:45:52 k8s3-1 systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
Apr 05 23:45:52 k8s3-1 systemd[1]: Unit kubelet.service entered failed state.
Apr 05 23:45:52 k8s3-1 systemd[1]: kubelet.service failed.
Apr 05 23:46:04 k8s3-1 systemd[1]: kubelet.service holdoff time over, scheduling restart.
Apr 05 23:46:04 k8s3-1 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.

After checking, I found that /var/lib/kubelet/config.yaml did not exist on the new node. I tried copying the file over from another node and starting the service, but it still failed. Then I remembered that the kubelet is started automatically during the kubeadm join operation, and this config file is generated at that point. All I needed to do was enable the docker and kubelet services (without starting kubelet by hand; checking confirmed kubelet was not set to start), then run the join command. After that, the kubelet service started successfully.
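In other words, do not start kubelet manually on the new node. A minimal sketch of the intended order, assuming the join command is taken from the master (the token and CA hash below are placeholders; kubeadm token create --print-join-command on the master prints the real values):

systemctl enable --now docker
systemctl enable kubelet    # enable only; kubeadm join starts it and writes /var/lib/kubelet/config.yaml
kubeadm join 192.168.191.30:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>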

On the master, the node information can now be viewed:

[root@k8s-3 ~]# kubectl get nodes -n kube-system  -o wide
NAME     STATUS     ROLES    AGE   VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
k8s-3    Ready      master   15d   v1.17.4   192.168.191.30   <none>        CentOS Linux 7 (Core)   3.10.0-1062.18.1.el7.x86_64   docker://19.3.8
k8s-4    Ready      node     13d   v1.17.4   192.168.191.31   <none>        CentOS Linux 7 (Core)   3.10.0-1062.18.1.el7.x86_64   docker://19.3.8
k8s3-1   NotReady   <none>   10h   v1.17.4   192.168.191.22   <none>        CentOS Linux 7 (Core)   3.10.0-1062.18.1.el7.x86_64   docker://19.3.8
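To see why the node is reported NotReady, describing it usually shows the failing condition (in this scenario it should point at the network plugin not being ready):

kubectl describe node k8s3-1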

3. Startup problem with the network pod on the new node

The flannel pod on this node stays in the Init state, while kube-proxy is running normally. Both flannel and kube-proxy are DaemonSets.

[root@k8s-3 ~]# kubectl get pods  -n kube-system  -o wide
NAME                            READY   STATUS     RESTARTS   AGE     IP               NODE     NOMINATED NODE   READINESS GATES
coredns-7f9c544f75-hdfjm        1/1     Running    6          15d     10.244.0.15      k8s-3    <none>           <none>
coredns-7f9c544f75-w62rh        1/1     Running    6          15d     10.244.0.14      k8s-3    <none>           <none>
etcd-k8s-3                      1/1     Running    9          12d     192.168.191.30   k8s-3    <none>           <none>
kube-apiserver-k8s-3            1/1     Running    9          15d     192.168.191.30   k8s-3    <none>           <none>
kube-controller-manager-k8s-3   1/1     Running    76         15d     192.168.191.30   k8s-3    <none>           <none>
kube-flannel-ds-amd64-dqv9t     1/1     Running    0          15m     192.168.191.30   k8s-3    <none>           <none>
kube-flannel-ds-amd64-mw6bq     0/1     Init:0/1   0          15m     192.168.191.22   k8s3-1   <none>           <none>
kube-flannel-ds-amd64-rsl68     1/1     Running    0          15m     192.168.191.31   k8s-4    <none>           <none>
kube-proxy-54kv7                1/1     Running    1          10h     192.168.191.22   k8s3-1   <none>           <none>
kube-proxy-7jwmj                1/1     Running    17         15d     192.168.191.30   k8s-3    <none>           <none>
kube-proxy-psrgh                1/1     Running    4          6d22h   192.168.191.31   k8s-4    <none>           <none>
kube-scheduler-k8s-3            1/1     Running    74         15d     192.168.191.30   k8s-3    <none>           <none>
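To dig into why the flannel pod is stuck at Init:0/1, describing it shows the waiting init container and any image-related events (pod name taken from the listing above):

kubectl describe pod kube-flannel-ds-amd64-mw6bq -n kube-system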


Checking the kubelet log, it keeps complaining that the flannel CNI configuration file does not exist. I was tempted to copy one over by hand, but that did not seem right; the problem had to be somewhere else. Finally it occurred to me that the pod needs its image to start, and indeed the docker images on the node did not include the flannel image, so I imported one manually.

Apr  6 11:32:58 k8s3-1 kubelet: W0406 11:32:58.346394    2867 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
Apr  6 11:33:01 k8s3-1 kubelet: E0406 11:33:01.498881    2867 kubelet.go:2183] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Apr  6 11:33:03 k8s3-1 kubelet: W0406 11:33:03.347829    2867 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
Apr  6 11:33:06 k8s3-1 kubelet: E0406 11:33:06.530602    2867 kubelet.go:2183] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Apr  6 11:33:08 k8s3-1 kubelet: W0406 11:33:08.348352    2867 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
Apr  6 11:33:11 k8s3-1 kubelet: E0406 11:33:11.572273    2867 kubelet.go:2183] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Apr  6 11:33:13 k8s3-1 kubelet: W0406 11:33:13.350727    2867 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
Apr  6 11:33:16 k8s3-1 kubelet: E0406 11:33:16.599437    2867 kubelet.go:2183] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
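These messages go away once flannel is running, because its init container writes the CNI configuration into /etc/cni/net.d. A quick way to verify after the fix (the file name 10-flannel.conflist is what the flannel manifest writes by default and may differ in other setups):

ls /etc/cni/net.d/
cat /etc/cni/net.d/10-flannel.conflist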

Manually import the flannel image:

[root@k8s3-1 ~]# docker images
REPOSITORY                                                       TAG                 IMAGE ID            CREATED             SIZE
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy   v1.17.4             6dec7cfde1e5        3 weeks ago         116MB
registry.cn-hangzhou.aliyuncs.com/google_containers/pause        3.1                 da86e6ba6ca1        2 years ago         742kB
[root@k8s3-1 ~]# docker load --input flannel.tar 
256a7af3acb1: Loading layer [==================================================>]  5.844MB/5.844MB
d572e5d9d39b: Loading layer [==================================================>]  10.37MB/10.37MB
57c10be5852f: Loading layer [==================================================>]  2.249MB/2.249MB
7412f8eefb77: Loading layer [==================================================>]  35.26MB/35.26MB
05116c9ff7bf: Loading layer [==================================================>]   5.12kB/5.12kB
Loaded image: quay.io/coreos/flannel:v0.12.0-amd64
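For reference, a flannel.tar like the one loaded above can be produced on a node that already has the image; a minimal sketch, assuming root SSH access between the nodes (the target IP is the new node from the listings above):

# on a node that already has the image, e.g. k8s-4
docker save quay.io/coreos/flannel:v0.12.0-amd64 -o flannel.tar
scp flannel.tar root@192.168.191.22:/root/
# then on the new node
docker load --input flannel.tar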



View the pod and node information on the master again:

[root@k8s-3 ~]# kubectl get pods  -n kube-system  -o wide
NAME                            READY   STATUS    RESTARTS   AGE     IP               NODE     NOMINATED NODE   READINESS GATES
coredns-7f9c544f75-hdfjm        1/1     Running   6          15d     10.244.0.15      k8s-3    <none>           <none>
coredns-7f9c544f75-w62rh        1/1     Running   6          15d     10.244.0.14      k8s-3    <none>           <none>
etcd-k8s-3                      1/1     Running   9          12d     192.168.191.30   k8s-3    <none>           <none>
kube-apiserver-k8s-3            1/1     Running   9          15d     192.168.191.30   k8s-3    <none>           <none>
kube-controller-manager-k8s-3   1/1     Running   77         15d     192.168.191.30   k8s-3    <none>           <none>
kube-flannel-ds-amd64-dqv9t     1/1     Running   0          44m     192.168.191.30   k8s-3    <none>           <none>
kube-flannel-ds-amd64-mw6bq     1/1     Running   0          44m     192.168.191.22   k8s3-1   <none>           <none>
kube-flannel-ds-amd64-rsl68     1/1     Running   0          44m     192.168.191.31   k8s-4    <none>           <none>
kube-proxy-54kv7                1/1     Running   3          11h     192.168.191.22   k8s3-1   <none>           <none>
kube-proxy-7jwmj                1/1     Running   17         15d     192.168.191.30   k8s-3    <none>           <none>
kube-proxy-psrgh                1/1     Running   4          6d23h   192.168.191.31   k8s-4    <none>           <none>
kube-scheduler-k8s-3            1/1     Running   75         15d     192.168.191.30   k8s-3    <none>           <none>
[root@k8s-3 ~]# kubectl get nodes -n kube-system  -o wide
NAME     STATUS   ROLES    AGE   VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
k8s-3    Ready    master   15d   v1.17.4   192.168.191.30   <none>        CentOS Linux 7 (Core)   3.10.0-1062.18.1.el7.x86_64   docker://19.3.8
k8s-4    Ready    node     13d   v1.17.4   192.168.191.31   <none>        CentOS Linux 7 (Core)   3.10.0-1062.18.1.el7.x86_64   docker://19.3.8
k8s3-1   Ready    node     11h   v1.17.4   192.168.191.22   <none>        CentOS Linux 7 (Core)   3.10.0-1062.18.1.el7.x86_64   docker://19.3.8


In the end, the reason why docker did not pull the flannel image automatically is still unclear.
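One plausible explanation is that quay.io is slow or unreachable from this network, so the image pull on the new node times out. A manual pull on the node would confirm it (tag taken from the loaded image above):

docker pull quay.io/coreos/flannel:v0.12.0-amd64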


Configure kubelet to use a domestic mirror for the pause image:
cat >/etc/sysconfig/kubelet<<EOF
KUBELET_EXTRA_ARGS="--pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1"
EOF
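For the override to take effect, the kubelet service presumably needs to be restarted afterwards:

systemctl daemon-reload
systemctl restart kubelet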

