Kubernetes Privileged Pods

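A Kubernetes privileged Pod is a Pod that runs in a special mode:

  • A container running in privileged mode has full access to the node's kernel.

  • Privileges can be granted carefully: masking out specific capabilities limits what the container is allowed to do.

  • Security-related settings such as runAsUser and runAsNonRoot further constrain the container.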

Preparation

To run a privileged pod, set privileged: true in the securityContext section of the container configuration:

Creating a privileged pod

An example privileged pod
apiVersion: v1
kind: Pod
metadata:
  name: test-pod-1
  namespace: default
spec:
  containers:
  - name: centos
    image: centos
    command: ['sh', '-c', 'sleep 999']
    securityContext:
      privileged: true
  • Create the test pod:

Create the test privileged pod
kubectl create -f privileged-pod-1.yaml
  • Once the test pod is running normally, open a shell in it:

Enter the privileged pod
kubectl exec -it test-pod-1 -- bash
  • Then check the container's capabilities inside this privileged pod:

Check the container's capabilities from inside the container
capsh --print

The output shows:

Output of the capability check inside the privileged container
Current: = cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,cap_wake_alarm,cap_block_suspend,cap_audit_read,38,39,40+ep
Bounding set =cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,cap_wake_alarm,cap_block_suspend,cap_audit_read,38,39,40
Ambient set =
Securebits: 00/0x0/1'b0
 secure-noroot: no (unlocked)
 secure-no-suid-fixup: no (unlocked)
 secure-keep-caps: no (unlocked)
 secure-no-ambient-raise: no (unlocked)
uid=0(root)
gid=0(root)
groups=0(root)
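
The privileged setting can also be cross-checked from outside the container by reading the pod spec. The following is a minimal sketch, assuming kubectl is configured for the cluster; the function name pod_privileged_flag is hypothetical and only used for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the `privileged` flag of the first container
# in a pod. An empty result means the field is not set in the pod spec.
pod_privileged_flag() {
    pod="$1"
    namespace="${2:-default}"
    kubectl get pod "$pod" -n "$namespace" \
        -o jsonpath='{.spec.containers[0].securityContext.privileged}'
}

# Example (requires access to the cluster):
#   pod_privileged_flag test-pod-1
```

For the test-pod-1 manifest above, this should print true, which matches the full capability list reported by capsh.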

Creating a non-privileged pod

Note

Even with privileged: false and allowPrivilegeEscalation: false configured, the pod still retains some privileges.

  • Create the non-privileged pod:

An example non-privileged pod
apiVersion: v1
kind: Pod
metadata:
  name: test-pod-2
  namespace: default
spec:
  containers:
  - name: centos
    image: centos
    command: ['sh', '-c', 'sleep 999']
    securityContext:
      privileged: false
      allowPrivilegeEscalation: false
  • Run it:

Create the test non-privileged pod
kubectl create -f privileged-pod-2.yaml
  • Then check the container's capabilities inside this non-privileged pod:

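Check the container's capabilities from inside the container (non-privileged)
capsh --print

The output shows a markedly reduced set of capabilities:

Output of the capability check inside the non-privileged pod's container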
Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap+ep
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap
Ambient set =
Securebits: 00/0x0/1'b0
 secure-noroot: no (unlocked)
 secure-no-suid-fixup: no (unlocked)
 secure-keep-caps: no (unlocked)
 secure-no-ambient-raise: no (unlocked)
uid=0(root)
gid=0(root)
groups=0(root)
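
The kernel stores each capability set as a bitmask, visible as CapEff, CapPrm and CapBnd in /proc/<pid>/status, where bit N stands for capability number N. The sketch below decodes such a hex mask into names. It is an illustration only: the names for bits 38 to 40 (perfmon, bpf, checkpoint_restore) come from newer kernels and are printed as plain numbers by the older capsh used in the output above. Where capsh is available, capsh --decode=MASK does the same job.

```shell
#!/bin/sh
# Capability names ordered by bit position (bit 0 = chown ... bit 40 = checkpoint_restore).
CAP_NAMES="chown dac_override dac_read_search fowner fsetid kill setgid setuid setpcap linux_immutable net_bind_service net_broadcast net_admin net_raw ipc_lock ipc_owner sys_module sys_rawio sys_chroot sys_ptrace sys_pacct sys_admin sys_boot sys_nice sys_resource sys_time sys_tty_config mknod lease audit_write audit_control setfcap mac_override mac_admin syslog wake_alarm block_suspend audit_read perfmon bpf checkpoint_restore"

# decode_caps HEXMASK: print one capability name per line for every set bit.
decode_caps() {
    mask=$((0x$1))
    bit=0
    for name in $CAP_NAMES; do
        if [ $(( (mask >> bit) & 1 )) -eq 1 ]; then
            echo "cap_$name"
        fi
        bit=$((bit + 1))
    done
}

# 00000000a80425fb is the mask with exactly the 14 bits listed above set.
decode_caps 00000000a80425fb
```

Running it prints the same 14 capabilities, in the same order, as the Current: line of the non-privileged pod.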

Creating a pod with all capabilities dropped

The strictest option is a non-privileged pod with drop: ALL:

  • Create the drop: ALL non-privileged pod:

An example non-privileged pod with drop: ALL
apiVersion: v1
kind: Pod
metadata:
  name: test-pod-3
  namespace: default
spec:
  containers:
  - name: centos
    image: centos
    command: ['sh', '-c', 'sleep 999']
    securityContext:
      privileged: false
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL
  • Run it:

Create the test drop: ALL non-privileged pod
kubectl create -f privileged-pod-3.yaml

In this drop: ALL non-privileged pod, capsh --print shows that no capabilities remain:

Output of the capability check inside the drop: ALL non-privileged pod's container
Current: =
Bounding set =
Ambient set =
Securebits: 00/0x0/1'b0
 secure-noroot: no (unlocked)
 secure-no-suid-fixup: no (unlocked)
 secure-keep-caps: no (unlocked)
 secure-no-ambient-raise: no (unlocked)
uid=0(root)
gid=0(root)
groups=0(root)

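In this state, the container can no longer install RPM packages or delete files, since such operations typically depend on capabilities like CAP_CHOWN or CAP_DAC_OVERRIDE that have now been dropped.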
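
To compare the three pods side by side without opening a shell in each, the effective capability mask of PID 1 can be read from /proc/1/status. This is a minimal sketch, assuming the pods from this page are running and that grep exists in the image; show_capeff is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: for each pod name given, print the CapEff line
# (effective capability mask) of PID 1 inside that pod.
show_capeff() {
    for pod in "$@"; do
        printf '%s: ' "$pod"
        kubectl exec "$pod" -- grep CapEff /proc/1/status
    done
}

# Example (requires the pods to be running):
#   show_capeff test-pod-1 test-pod-2 test-pod-3
```

An all-zero mask for test-pod-3 would be consistent with the empty Current: set shown above.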

Creating a non-privileged pod that runs as a specific non-root user

  • Container configuration for running as UID 1000:

An example drop: ALL non-privileged pod running as a specific user
apiVersion: v1
kind: Pod
metadata:
  name: test-pod-4
  namespace: default
spec:
  containers:
  - name: centos
    image: centos
    command: ['sh', '-c', 'sleep 999']
    securityContext:
      privileged: false
      allowPrivilegeEscalation: false
      runAsUser: 1000
      capabilities:
        drop:
          - ALL
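
Once this pod is running, the effect of runAsUser can be confirmed by asking the container for its numeric user ID. A minimal sketch, assuming the id command exists in the image; pod_uid is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the numeric UID that commands run as
# inside the given pod.
pod_uid() {
    kubectl exec "$1" -- id -u
}

# Example (requires the pod to be running):
#   pod_uid test-pod-4
```

With runAsUser: 1000 in the manifest above, this should print 1000 rather than the 0 seen in the earlier pods.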

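Creating a non-privileged pod that runs as a specific non-root user with specific capabilities

Going further, you can grant the container specific capabilities, for example allowing it to adjust process nice values:

An example drop: ALL non-privileged pod running as a specific user with a specific capability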
apiVersion: v1
kind: Pod
metadata:
  name: test-pod-5
  namespace: default
spec:
  containers:
  - name: centos
    image: centos
    command: ['sh', '-c', 'sleep 999']
    securityContext:
      privileged: false
      allowPrivilegeEscalation: false
      runAsUser: 1000
      capabilities:
        drop:
          - ALL
        add:
          - SYS_NICE
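
Whether the added capability actually reached the container can be checked by testing bit 23 (CAP_SYS_NICE) in the capability masks of PID 1. The sketch below is an illustration with hypothetical helper names (has_cap, pod_cap_mask). One assumption worth verifying on your own cluster: for a process running as a non-root user, an added capability may show up only in the bounding set (CapBnd) and not in the effective set (CapEff).

```shell
#!/bin/sh
CAP_SYS_NICE=23   # bit position of CAP_SYS_NICE

# has_cap HEXMASK BIT: succeed if the given bit is set in the mask.
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# pod_cap_mask POD FIELD: print one mask field (CapEff, CapPrm, CapBnd, ...)
# of PID 1 in the pod. grep runs in the container, awk runs locally.
pod_cap_mask() {
    kubectl exec "$1" -- grep "^$2:" /proc/1/status | awk '{ print $2 }'
}

# Example (requires the pod to be running):
#   if has_cap "$(pod_cap_mask test-pod-5 CapBnd)" "$CAP_SYS_NICE"; then
#       echo "SYS_NICE is in the bounding set"
#   fi
```

Comparing the CapBnd and CapEff results shows whether the capability is merely permitted by the bounding set or is actually in effect for the process.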

Note

In short, these settings can be tuned in fine detail; for more, see the original article, Kubernetes Privileged Pod Practical Examples.

References