In one environment, an attempt to use kubeadm to create a high-availability cluster with 3 master nodes kept failing with the error below.
10.197.145.25: exec "kubeadm join 10.197.145.24:6443 \
--node-name=10.197.145.25 --token=vbgvbq.8bmv408a1uj0io2m \
--control-plane --certificate-key=c6840dd68d6dc52f4e74ed244d551337f2b3c9ac7dc766cd574090ae64feb39d \
--skip-phases=control-plane-join/mark-control-plane \
--discovery-token-unsafe-skip-ca-verification \
--ignore-preflight-errors=ImagePull \
--ignore-preflight-errors=Port-10250 \
--ignore-preflight-errors=FileContent--proc-sys-net-bridge-bridge-nf-call-iptables \
--ignore-preflight-errors=DirAvailable--etc-kubernetes-manifests" failed:exit 1:stderr
[WARNING DirAvailable--etc-kubernetes-manifests]: /etc/kubernetes/manifests is not empty
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[WARNING Port-10250]: Port 10250 is in use
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
[ERROR FileAvailable--etc-kubernetes-bootstrap-kubelet.conf]: /etc/kubernetes/bootstrap-kubelet.conf already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
:error %!s()
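The kubeadm output mixes warnings with the checks that are actually fatal. As a minimal sketch, assuming the output has been saved to a file, the fatal check names can be pulled out like this. The function name preflight_errors and the file name join.log are hypothetical and not part of kubeadm.

```shell
# List the names of fatal preflight checks found in kubeadm output on stdin.
# Each fatal check is printed by kubeadm as "[ERROR <CheckName>]".
preflight_errors() {
    grep -o '\[ERROR [^]]*\]' | sed -e 's/^\[ERROR //' -e 's/\]$//'
}
```

Usage: preflight_errors < join.log prints one check name per line, which makes it easier to see that only the two FileAvailable checks blocked the join here.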
#!/bin/bash
rm -rf /etc/kubernetes
systemctl stop kubelet 2>/dev/null
docker rm -f $(docker ps -aq) 2>/dev/null
systemctl stop docker 2>/dev/null
ip link del cni0 2>/dev/null
for port in 80 2379 6443 8086 {10249..10259} ; do
fuser -k -9 ${port}/tcp
done
rm -fv /root/.kube/config
rm -rfv /var/lib/kubelet
rm -rfv /var/lib/cni
rm -rfv /etc/cni
rm -rfv /var/lib/etcd
rm -rfv /var/lib/postgresql /etc/core/token /var/lib/redis /storage /chart_storage
systemctl start docker 2>/dev/null
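Since the preflight errors point at two files under /etc/kubernetes, one quick check before retrying the join is whether they are still present. The following is only a sketch. The function name leftover_kubelet_conf and its optional root-directory argument are hypothetical and exist to make the check easy to try against a scratch directory.

```shell
# Print any kubelet config files that still exist under the given root.
# With no argument the real paths under / are checked.
leftover_kubelet_conf() {
    root=${1:-}
    for f in /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf; do
        [ -e "$root$f" ] && printf '%s\n' "$root$f"
    done
    return 0
}
```

If leftover_kubelet_conf prints nothing, the two FileAvailable checks should no longer be the reason a join fails.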
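3. We then checked whether docker and kubelet were running on node 10.197.145.25.
systemctl status docker
The docker service on this node was active (running).
systemctl status kubelet
The kubelet service on this node was inactive.
4. We restarted kubelet with systemctl restart kubelet
and found that the process started but exited again after a short while.
5. We checked the messages log on the node and found the following kubelet error: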
kubelet: F0727 11:39:07.530805 23779 docker_service.go:414] Streaming server stopped unexpectedly: listen tcp 172.0.0.1:0: bind: cannot assign requested address
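kubelet logs in klog format, where the first letter of each entry is the severity (I, W, E or F for fatal), so the line that killed the process can be filtered out of a busy log. A minimal sketch, assuming a syslog-style file in which kubelet entries look like the line above. The function name kubelet_fatal_lines and the default path /var/log/messages are assumptions.

```shell
# Print kubelet entries logged at klog severity F (fatal) from a syslog-style file.
# Matches both "kubelet: F0727 ..." and "kubelet[1234]: F0727 ...".
kubelet_fatal_lines() {
    grep -E 'kubelet(\[[0-9]+\])?: F[0-9]{4} ' "${1:-/var/log/messages}"
}
```

Usage: kubelet_fatal_lines /var/log/messages | tail -n 5 shows the most recent fatal entries.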
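Checking /etc/hosts on the node confirmed that the localhost entries were gone. On asking, we learned that the customer, wanting to add a custom resolution for a domain name, had run echo "1.2.3.4 a.b.com" > /etc/hosts
Because > overwrites the file instead of appending to it (>> would have appended), this wiped out the localhost entries in /etc/hosts.
Following the fix described at https://coding-stream-of-consciousness.com/2020/01/15/minikube-start-failure-streaming-server-stopped-cannot-assign-requested-address/, we added the localhost entries back to /etc/hosts, after which kubelet started normally and the node was added successfully.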
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
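Two small helpers illustrate the point: one checks that a hosts file still maps localhost to 127.0.0.1, the other adds an entry by appending rather than overwriting. This is a sketch only. The function names hosts_has_localhost and hosts_add_entry and the optional file argument are hypothetical.

```shell
# Succeed if the hosts file maps the name "localhost" to 127.0.0.1.
hosts_has_localhost() {
    awk '
        { sub(/#.*/, "") }    # strip comments before looking at fields
        $1 == "127.0.0.1" {
            for (i = 2; i <= NF; i++) if ($i == "localhost") found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "${1:-/etc/hosts}"
}

# Add "<ip> <name>" with >> so that existing lines are preserved.
# Does nothing if exactly that line is already present.
hosts_add_entry() {
    ip=$1; name=$2; file=${3:-/etc/hosts}
    grep -qxF "$ip $name" "$file" 2>/dev/null || printf '%s %s\n' "$ip" "$name" >> "$file"
}
```

Usage: hosts_add_entry 1.2.3.4 a.b.com adds the custom entry without touching the rest of the file, and hosts_has_localhost can be run afterwards as a sanity check.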
The iptable_nat and br_netfilter kernel modules must be loaded, and the following kernel parameters set:
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-arptables = 1
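These requirements can be verified by reading /proc directly. The sketch below assumes a Linux host. The function names module_loaded and sysctl_is, and their optional path arguments, are hypothetical. Note that a module compiled into the kernel does not appear in /proc/modules, so a failed module_loaded is a hint and not proof.

```shell
# Succeed if the named module appears in a /proc/modules-style listing.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit (found ? 0 : 1) }' "${2:-/proc/modules}"
}

# Succeed if a sysctl key has the expected value.
# A key maps to a file under /proc/sys with its dots replaced by slashes.
sysctl_is() {
    path="${3:-/proc/sys}/$(printf '%s' "$1" | tr . /)"
    [ -r "$path" ] && [ "$(cat "$path")" = "$2" ]
}
```

Usage: sysctl_is net.ipv4.ip_forward 1 || echo "ip_forward is not enabled". The net.bridge keys only exist once br_netfilter is loaded, so check the module first.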
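According to the official documentation, https://kubernetes.io/docs/tasks/configure-pod-container/security-context/, newer versions of kubernetes can run with selinux enabled. However, selinux configuration is complex, and with selinux enabled docker may fail to create containers. Unless the security requirements are very strict, disabling it is generally recommended.
By default, kubelet fails to start when swap is enabled.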
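Whether swap is active can be read from /proc/swaps, whose first line is a header followed by one line per active swap area. A minimal sketch, assuming a Linux host. The function name swap_enabled and its optional file argument are hypothetical.

```shell
# Succeed if a /proc/swaps-style file lists at least one active swap area.
swap_enabled() {
    awk 'NR > 1 && NF > 0 { n++ } END { exit (n > 0 ? 0 : 1) }' "${1:-/proc/swaps}"
}
```

Usage: swap_enabled && swapoff -a turns swap off until the next reboot. To keep it off, the swap entry in /etc/fstab also has to be commented out.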
Setting the time zone to Asia/Shanghai is recommended.
The ports required by the kubernetes components must remain open, e.g. 6443, 2379 and 2380.
Every server must be able to resolve its hostname to an ip and resolve localhost to 127.0.0.1.
In addition, the hosts file must not contain duplicate hostnames.
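One way to spot duplicates is to look for names that are bound to more than one IPv4 address. This interpretation of "duplicate" is an assumption. IPv6 lines are skipped because localhost normally appears on both the 127.0.0.1 and ::1 lines. The function name hosts_conflicting_names is hypothetical.

```shell
# Print each name that a hosts file binds to more than one distinct IPv4 address.
hosts_conflicting_names() {
    awk '
        { sub(/#.*/, "") }         # strip comments
        NF >= 2 && $1 !~ /:/ {     # IPv4 lines only
            for (i = 2; i <= NF; i++) {
                if (!(($i, $1) in seen)) { seen[$i, $1] = 1; addrs[$i]++ }
            }
        }
        END { for (n in addrs) if (addrs[n] > 1) print n }
    ' "${1:-/etc/hosts}" | sort
}
```

Usage: hosts_conflicting_names prints nothing when /etc/hosts is clean. Any name it prints should be reduced to a single address.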
/tmp permissions
The permissions of the /tmp directory should be 777.
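The mode can be checked with stat. The sketch below assumes GNU coreutils (stat -c) on Linux and also accepts 1777, which is 777 plus the sticky bit and is the usual mode of /tmp. The function name dir_mode_is_777 is hypothetical.

```shell
# Succeed if the directory grants rwx to user, group and others (mode 777).
# A leading special-bits digit, as in 1777, is accepted as well.
dir_mode_is_777() {
    mode=$(stat -c '%a' "$1") || return 1
    case "$mode" in
        777|?777) return 0 ;;
        *) return 1 ;;
    esac
}
```

Usage: dir_mode_is_777 /tmp || echo "/tmp permissions need fixing".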
socat is needed not only by kubeadm when creating the cluster, but also when using helm.
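A quick check that required tools are on PATH can be sketched as follows. The function name missing_tools is hypothetical, and the list of tools to pass in is up to the reader.

```shell
# Print each named command that cannot be found on PATH.
missing_tools() {
    for t in "$@"; do
        command -v "$t" >/dev/null 2>&1 || printf '%s\n' "$t"
    done
}
```

Usage: missing_tools socat prints socat if it is not installed and nothing otherwise.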
The policy of the iptables FORWARD chain must be ACCEPT:
iptables -P FORWARD ACCEPT
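The current policy appears as the first line of iptables -S FORWARD, in the form "-P FORWARD ACCEPT". A minimal sketch that reads that output on stdin so the check can be tried without root. The function name forward_policy_is_accept is hypothetical.

```shell
# Read `iptables -S FORWARD` output on stdin and succeed if the policy is ACCEPT.
forward_policy_is_accept() {
    grep -qx -- '-P FORWARD ACCEPT'
}
```

Usage: iptables -S FORWARD | forward_policy_is_accept || iptables -P FORWARD ACCEPT only changes the policy when it is not already ACCEPT.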