Why Resource Reservation Is Necessary

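Take a typical k8s cluster installed with kubeadm. By default the kubelet has no kube-reserved or system-reserved resource reservation configured, so the pods running on a worker node can, in principle, use all of the CPU and memory on that server. Suppose a pod managed by a Deployment controller has a bug and fails to release memory at runtime. The kubelet process on that worker node will eventually be unable to obtain enough memory, it can no longer report its heartbeat to kube-apiserver, and the node is marked NotReady. The Deployment controller then creates a replacement pod on another worker node, where the same sequence repeats and overwhelms the second node. In the end the whole k8s cluster is at risk of an "avalanche" of cascading failures.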

Resource Categories

Node capacity: the total resources of the node
kube-reserved: resources reserved for k8s processes (e.g. kubelet, container runtime, node problem detector)
system-reserved: resources reserved for the operating system (e.g. sshd, udev)
eviction-threshold: the kubelet eviction threshold
allocatable: resources available to pods = Node capacity - kube-reserved - system-reserved - eviction-threshold
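The formula above can be worked through with a small shell function. This is only an illustrative sketch: the function name allocatable_mib is made up for this example, the 16 GiB node capacity and the 100 MiB eviction threshold are assumed values, and the 500 MiB and 2 GiB reservations are the memory values used in the configuration later in this article.

```shell
# allocatable_mib CAPACITY KUBE_RESERVED SYSTEM_RESERVED EVICTION_THRESHOLD
# All arguments are memory sizes in MiB (hypothetical helper, not a kubelet tool).
allocatable_mib() {
  capacity=$1
  kube_reserved=$2
  system_reserved=$3
  eviction_threshold=$4
  # allocatable = capacity - kube-reserved - system-reserved - eviction-threshold
  echo $(( capacity - kube_reserved - system_reserved - eviction_threshold ))
}

# Assumed node: 16 GiB (16384 MiB) of memory, 100 MiB eviction threshold.
# kube-reserved = 500 MiB, system-reserved = 2 GiB (2048 MiB).
allocatable_mib 16384 500 2048 100   # prints 13736
```

Under these assumptions about 13.4 GiB of the node's 16 GiB remains for pods. The same subtraction applies to CPU.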

Configuration

The configuration steps are as follows:

  vim /var/lib/kubelet/config.yaml
  # Add the following settings
  enforceNodeAllocatable:
  - pods
  - kube-reserved
  - system-reserved
  systemReserved:
    cpu: "1"
    memory: "2Gi"
  kubeReserved:
    cpu: "1"
    memory: "500Mi"
  systemReservedCgroup: /system.slice
  kubeReservedCgroup: /system.slice/kubelet.service
  vim /usr/lib/systemd/system/kubelet.service
  # Add the following lines
  # They create the kubeReservedCgroup path under each cgroup v1 controller before the kubelet starts
  [Service]
  ExecStartPre=/bin/mkdir -p /sys/fs/cgroup/cpuset/system.slice/kubelet.service
  ExecStartPre=/bin/mkdir -p /sys/fs/cgroup/hugetlb/system.slice/kubelet.service
  ExecStartPre=/bin/mkdir -p /sys/fs/cgroup/cpu/system.slice/kubelet.service
  ExecStartPre=/bin/mkdir -p /sys/fs/cgroup/cpuacct/system.slice/kubelet.service
  ExecStartPre=/bin/mkdir -p /sys/fs/cgroup/memory/system.slice/kubelet.service
  ExecStartPre=/bin/mkdir -p /sys/fs/cgroup/systemd/system.slice/kubelet.service
  ExecStartPre=/bin/mkdir -p /sys/fs/cgroup/pids/system.slice/kubelet.service
  vim /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
  # Add the following options under [Service]
  # Enable systemd CPU accounting
  CPUAccounting=true
  # Enable systemd memory accounting
  MemoryAccounting=true

After making the changes above, run the following combined commands:

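  # Stop docker and kubelet, keep them from starting at boot, then reboot
  systemctl daemon-reload && systemctl stop docker && systemctl stop kubelet && systemctl disable docker && systemctl disable kubelet && reboot
  # After the reboot, remove the leftover pod directories, then start and re-enable both services
  rm -rf /var/lib/kubelet/pods/* && systemctl start docker && systemctl start kubelet && systemctl enable docker && systemctl enable kubelet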
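Once the node is back up, one way to check whether the reservation took effect is to compare the capacity and allocatable values the node reports to the API server. The sketch below assumes kubectl is configured for the cluster. The function name show_reservation and the node name worker-1 are placeholders for this example.

```shell
# show_reservation NODE
# Prints the node's reported capacity and allocatable CPU and memory.
# Hypothetical helper; requires kubectl access to the cluster.
show_reservation() {
  node=$1
  echo "capacity:    $(kubectl get node "$node" -o jsonpath='{.status.capacity.cpu} CPU, {.status.capacity.memory} memory')"
  echo "allocatable: $(kubectl get node "$node" -o jsonpath='{.status.allocatable.cpu} CPU, {.status.allocatable.memory} memory')"
}

# Usage (node name is a placeholder):
#   show_reservation worker-1
```

If the settings were picked up, allocatable should be lower than capacity by roughly the sum of kube-reserved, system-reserved, and the eviction threshold.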

Problems Encountered

The cgroupfs cgroup driver must be used.
The kernel must be upgraded to 3.10.0-1127.19.1.el7.x86_64.

  # Upgrade the kernel
  yum update -y kernel kernel-tools && grub2-mkconfig -o /boot/grub2/grub.cfg && reboot
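For the cgroupfs requirement noted above, a quick check of which driver the kubelet is configured with might look like the sketch below. The function name kubelet_cgroup_driver is made up for this example, and it assumes the driver is set through the cgroupDriver field in the kubelet config file. If the field is absent the function prints nothing.

```shell
# kubelet_cgroup_driver [CONFIG_FILE]
# Prints the cgroupDriver value from a kubelet config file
# (defaults to /var/lib/kubelet/config.yaml). Hypothetical helper.
kubelet_cgroup_driver() {
  awk '$1 == "cgroupDriver:" { gsub(/"/, "", $2); print $2 }' \
    "${1:-/var/lib/kubelet/config.yaml}"
}

# Usage: compare the kubelet's driver with the one Docker reports.
#   kubelet_cgroup_driver
#   docker info --format '{{.CgroupDriver}}'
# Both should print "cgroupfs" for the setup described in this article.
```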
Document last updated: 2021-03-18 15:56   Author: 张尚