Cloud Native Security - Container Security

1. What is Cloud Native Security

Cloud Computing -> Cloud Native -> Cloud Native Security

Before understanding cloud native security, let's first understand cloud computing and cloud native.

Before cloud computing emerged, software had fewer users and less data, so it could be hosted directly in a company's own server room. In today's big data environment, that traditional architecture no longer scales, and restructuring software and migrating data are cumbersome, which is why cloud computing emerged.

Cloud computing allows users to obtain resources according to their needs.

Cloud native does not have a precise definition and is constantly evolving. The interpretation of cloud native does not belong to any individual or organization.

Cloud native is a method of building and running applications, which is a set of technical systems and methodologies.

Applications that conform to cloud native architecture should: be containerized using open source stacks (K8S+Docker), improve flexibility and maintainability based on microservices architecture, support continuous iteration and operation automation through agile methods and DevOps, achieve elastic scaling and dynamic scheduling through cloud platform infrastructure, and optimize resource utilization.

Four elements of cloud native:

Microservices
Containerization
DevOps
Continuous Delivery

Simply put, cloud native is a technical architecture for software application services to adapt to the cloud during the process of migrating to the cloud.

Cloud native environments still face traditional security issues, such as DDoS attacks and web intrusions.

In addition, cloud native also faces the following issues:

API security
Container security
Lack of centralized management
Difficult troubleshooting

2. What is Container Security

In the cloud native ecosystem, there are many different container technologies. Let's take Docker as an example for analysis.

2.1 Docker's own security vulnerabilities

The official CVE records list more than 20 vulnerabilities in historical Docker versions, including code execution, privilege escalation, and information leakage.
Docker source code security
Docker hub security

2.2 Docker architecture flaws

LAN attacks between containers
DDoS attacks depleting resources
Vulnerable system calls
Shared root user privileges

3. Container Security

3.1 Attacks during containerized development and testing

Background

docker cp command

The docker cp command copies files or directories between a container's file system and the host file system.

Symbolic links

Symbolic links (symlinks) are similar to shortcuts in Windows.

CVE-2018-15664 - Symbolic Link Replacement Vulnerability

Affected versions: Docker 17.06.0-ce to 17.12.1-ce:rc2, 18.01.0-ce to 18.06.1-ce:rc2

Vulnerability principle: CVE-2018-15664 is a TOCTOU (time-of-check to time-of-use) issue, which is a kind of race condition vulnerability.

In simple terms, there is a time gap between the moment a program performs a security check on an object and the moment it uses that object. For example, when a user runs docker cp, the Docker daemon first checks the specified copy path and only then uses it. An attacker can present an object that passes the check and then swap it for a malicious one inside that gap, so the program ends up using the malicious object. With docker cp, swapping in a symbolic link after the path check can lead to directory traversal onto the host file system.
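The check-then-use gap can be sketched in Go. This is a minimal illustration under stated assumptions, not Docker's actual code: the helper names insideRoot and simulateTOCTOU, the directory rootfs, and the file hostfile are all hypothetical, and the "attacker" swap is performed sequentially between the check and the use rather than as a real race.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// insideRoot reports whether path still lies under root after symlinks are resolved.
func insideRoot(root, path string) (bool, error) {
	resolved, err := filepath.EvalSymlinks(path)
	if err != nil {
		return false, err
	}
	rel, err := filepath.Rel(root, resolved)
	if err != nil {
		return false, err
	}
	escapes := rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator))
	return !escapes, nil
}

// simulateTOCTOU replays check, swap, and use in sequence. It returns the
// result of the check and the data that was finally read.
func simulateTOCTOU() (bool, string, error) {
	tmp, err := os.MkdirTemp("", "toctou-demo")
	if err != nil {
		return false, "", err
	}
	defer os.RemoveAll(tmp)
	base, err := filepath.EvalSymlinks(tmp) // the temp dir itself may sit behind a symlink
	if err != nil {
		return false, "", err
	}
	root := filepath.Join(base, "rootfs")       // hypothetical container root
	hostFile := filepath.Join(base, "hostfile") // hypothetical file outside the root
	inside := filepath.Join(root, "data")
	link := filepath.Join(root, "link")
	if err := os.Mkdir(root, 0o755); err != nil {
		return false, "", err
	}
	if err := os.WriteFile(inside, []byte("container data"), 0o644); err != nil {
		return false, "", err
	}
	if err := os.WriteFile(hostFile, []byte("host data"), 0o644); err != nil {
		return false, "", err
	}
	if err := os.Symlink(inside, link); err != nil {
		return false, "", err
	}

	// Time of check: the link resolves inside the root, so the check passes.
	ok, err := insideRoot(root, link)
	if err != nil {
		return false, "", err
	}

	// The gap: the attacker repoints the link to a target outside the root.
	if err := os.Remove(link); err != nil {
		return ok, "", err
	}
	if err := os.Symlink(hostFile, link); err != nil {
		return ok, "", err
	}

	// Time of use: the path that was checked now leads somewhere else.
	data, err := os.ReadFile(link)
	if err != nil {
		return ok, "", err
	}
	return ok, string(data), nil
}

func main() {
	ok, data, err := simulateTOCTOU()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("check passed: %v, data read: %q\n", ok, data)
}
```

The check succeeds, yet the read returns the content of the file outside the root. This is why a path check and the later use of that path need to operate on the same resolved object.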

Vulnerability reproduction: Use metarget to quickly set up the CVE-2018-15664 environment.

CVE-2019-14271 - Loading Untrusted Dynamic Link Libraries

Affected versions: Docker 19.03.x before 19.03.1

Vulnerability principle: The docker cp command relies on a helper process, docker-tar, which loads the nsswitch (libnss) dynamic link libraries from inside the container. An attacker who controls the container can replace those libraries with malicious ones and thereby execute code with root privileges on the host.

The vulnerability exploitation process is as follows:

Find out which dynamic link libraries docker-tar loads from inside the container.
Download the source code of that dynamic link library and add a run_at_link function marked with __attribute__((constructor)), so that the function runs as soon as the library is loaded.
Wait for the docker cp command to trigger the vulnerability.

3.2 Attacks on container software supply chain

With the popularity of container technology, container images have become a very important part of the software supply chain. We can obtain images from public repositories or private repositories.

Images obtained from public repositories carry two kinds of risk:

Security vulnerabilities in the software in the image
Malicious programs such as mining programs, backdoors, viruses, etc. in the image

Image vulnerability exploitation

Image vulnerability exploitation refers to the fact that if an image contains a vulnerability, containers created and run from that image usually have the same vulnerability.

For example, Alpine is a lightweight Linux distribution built on musl libc and busybox. Because it is so small, it is a very popular base image for building software. However, the Alpine image has had a vulnerability of its own: CVE-2019-5021. In Alpine images from version 3.3 to 3.9, the root user's password was empty, allowing an attacker who has compromised the container to elevate to root privileges inside it.
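An empty root password of this kind shows up as an empty second field in the image's /etc/shadow. The following Go sketch flags such accounts in shadow-format text. The function name emptyPasswordUsers and the sample lines are illustrative assumptions, not taken from any scanner or from an actual image.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// emptyPasswordUsers returns the account names in shadow-format text
// (name:password:...) whose password field is empty.
func emptyPasswordUsers(shadow string) []string {
	var users []string
	sc := bufio.NewScanner(strings.NewReader(shadow))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and comments
		}
		fields := strings.Split(line, ":")
		if len(fields) >= 2 && fields[1] == "" {
			users = append(users, fields[0])
		}
	}
	return users
}

func main() {
	// Hypothetical shadow content: root has an empty password field,
	// while the other accounts are locked with "!".
	sample := "root:::0:::::\nbin:!::0:::::\ndaemon:!::0:::::\n"
	fmt.Println(emptyPasswordUsers(sample))
}
```

In practice the input would be the /etc/shadow file extracted from the image under inspection.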

Image poisoning

Image poisoning is a broad topic. It refers to attackers deceiving or inducing victims into creating and running containers from a malicious image of the attacker's choosing, in order to intrude on the victim or to abuse the victim's host for malicious activities. Common methods include uploading malicious images to public repositories, uploading images to a victim's local repository after compromising the system, and naming images so that they impersonate legitimate ones.
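One way to picture name impersonation is to compare an image name against a list of names the user trusts and flag near misses. The Go sketch below does this with a plain edit distance. It is only an illustration: the trusted list, the threshold of 2, and the function names editDistance and lookalikeOf are all assumptions, and real registries may need stricter or different checks.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

// editDistance returns the byte-wise Levenshtein distance between a and b.
func editDistance(a, b string) int {
	prev := make([]int, len(b)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(a); i++ {
		cur := make([]int, len(b)+1)
		cur[0] = i
		for j := 1; j <= len(b); j++ {
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			cur[j] = min3(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev = cur
	}
	return prev[len(b)]
}

// lookalikeOf returns the first trusted name that name closely resembles
// (distance 1 or 2). An exact match with any trusted name is not a lookalike.
func lookalikeOf(name string, trusted []string) (string, bool) {
	match := ""
	for _, t := range trusted {
		d := editDistance(name, t)
		if d == 0 {
			return "", false
		}
		if d <= 2 && match == "" {
			match = t
		}
	}
	return match, match != ""
}

func main() {
	trusted := []string{"nginx", "redis"} // hypothetical allow-list
	for _, name := range []string{"nginx", "ngnix", "postgres"} {
		if t, ok := lookalikeOf(name, trusted); ok {
			fmt.Printf("%s resembles trusted image %s\n", name, t)
		}
	}
}
```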

There are three types of common image poisoning based on different purposes:

Distribution of malicious mining images
Distribution of malicious backdoor images
Distribution of malicious exploit images

3.3 Attacks on container runtime

Container escape due to insecure configuration

Over the years, the container community has been working to implement defense in depth, least privilege, and other concepts and principles.

Docker has moved from a blacklist mechanism for container capabilities to the current default, which denies all capabilities and then grants the container only the minimum it needs, in a whitelist manner. By default, Docker grants containers 14 of the nearly 40 Linux capabilities:

func DefaultCapabilities() []string {
	return []string{
		"CAP_CHOWN",
		"CAP_DAC_OVERRIDE",
		"CAP_FSETID",
		"CAP_FOWNER",
		"CAP_MKNOD",
		"CAP_NET_RAW",
		"CAP_SETGID",
		"CAP_SETUID",
		"CAP_SETFCAP",
		"CAP_SETPCAP",
		"CAP_NET_BIND_SERVICE",
		"CAP_SYS_CHROOT",
		"CAP_KILL",
		"CAP_AUDIT_WRITE",
	}
}

Whether it is fine-grained permission control or another security mechanism, users can tighten or loosen the constraints by modifying the container environment configuration or by passing parameters when running a container. If a user supplies certain dangerous configuration parameters to an untrusted container, the attacker gains a possible route to escape.
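Options such as docker run --cap-add or --privileged are typical examples of parameters that loosen these constraints. As a rough sketch, the Go snippet below compares a requested capability list against the 14 defaults listed above and reports anything extra. The function name extraCapabilities and the sample request are hypothetical, and this is not how Docker itself evaluates capabilities.

```go
package main

import (
	"fmt"
	"sort"
)

// defaultCaps mirrors the 14 default capabilities listed above.
var defaultCaps = map[string]bool{
	"CAP_CHOWN": true, "CAP_DAC_OVERRIDE": true, "CAP_FSETID": true,
	"CAP_FOWNER": true, "CAP_MKNOD": true, "CAP_NET_RAW": true,
	"CAP_SETGID": true, "CAP_SETUID": true, "CAP_SETFCAP": true,
	"CAP_SETPCAP": true, "CAP_NET_BIND_SERVICE": true,
	"CAP_SYS_CHROOT": true, "CAP_KILL": true, "CAP_AUDIT_WRITE": true,
}

// extraCapabilities returns, in sorted order, the requested capabilities
// that are not part of the default set.
func extraCapabilities(requested []string) []string {
	var extra []string
	for _, c := range requested {
		if !defaultCaps[c] {
			extra = append(extra, c)
		}
	}
	sort.Strings(extra)
	return extra
}

func main() {
	// Hypothetical capability set of a container started with extra --cap-add flags.
	requested := []string{"CAP_CHOWN", "CAP_SYS_ADMIN", "CAP_KILL", "CAP_SYS_PTRACE"}
	fmt.Println("beyond default:", extraCapabilities(requested))
}
```

Any capability reported here deserves a justification, since each one widens what a compromised container can do.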

Container escape due to dangerous mounts

To facilitate data exchange between the host and a virtual machine, virtualization solutions let users mount host directories into the virtual machine. Containers have the same feature. However, if sensitive host files or directories are mounted into a container and an attacker gains control of that container, the insecure mount can lead to serious problems, including container escape.
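A simple way to reason about dangerous mounts is to check each host-side mount source against a list of sensitive paths. The Go sketch below does so. The list of paths is an illustrative assumption rather than an exhaustive or authoritative one, and isRiskyMount is a hypothetical helper.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// sensitivePaths is an illustrative, non-exhaustive list of host paths
// that are risky to mount into a container.
var sensitivePaths = []string{"/", "/etc", "/proc", "/root", "/var/run/docker.sock"}

// isRiskyMount reports whether a host mount source equals a sensitive path
// or lies beneath a sensitive directory. The root directory only matches exactly.
func isRiskyMount(source string) bool {
	p := path.Clean(source)
	for _, s := range sensitivePaths {
		if p == s {
			return true
		}
		if s != "/" && strings.HasPrefix(p, s+"/") {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical host-side sources of a container's bind mounts.
	for _, src := range []string{"/srv/app/data", "/var/run/docker.sock", "/etc/../etc/shadow"} {
		fmt.Printf("%-24s risky=%v\n", src, isRiskyMount(src))
	}
}
```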

3.4 Container escape due to program vulnerabilities

CVE-2019-5736

Affected versions: Docker version < 18.09.2 & runc version <= 1.0-rc6

Vulnerability principle: CVE-2019-5736 is a container escape vulnerability that lets an attacker overwrite the host's runc binary. When a command such as docker exec is executed, the underlying container runtime actually performs the operation. With runc, for example, the runc exec command is executed. Its ultimate effect is to run the user-specified program inside the container. In other words, it starts a process within the container's namespaces, subject to the container's restrictions (such as cgroups). Apart from that, the operation is no different from executing a program on the host.

Attack steps:

1. Overwrite the /bin/sh program inside the container with #!/proc/self/exe.
2. Continuously traverse the /proc directory inside the container and read each /proc/[PID]/cmdline, string-matching on runc until the process ID of runc is found.
3. Open /proc/[runc-PID]/exe in read-only mode to obtain a file descriptor.
4. Repeatedly try to reopen the file descriptor obtained in step 3 in write mode (via /proc/self/fd/[fd]). The attempts fail at first, but once runc is no longer in use and releases the file, opening it for writing succeeds. Immediately write the attack payload through this file descriptor to the host's runc binary, /usr/bin/runc (or /usr/bin/docker-runc).
5. Finally, runc executes the /bin/sh that the user specified through docker exec. Because its content has been replaced with #!/proc/self/exe, it actually executes the host's runc, which was overwritten in step 4.
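The /proc scan that locates the runc process can be illustrated on its own, since it is ordinary process discovery. The Go sketch below covers only that matching logic and runs against a fake /proc-like tree in a temporary directory. The function names, the PIDs, and the command lines are all hypothetical, and none of the later steps are implemented.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// findPIDByCmdline scans procRoot/<pid>/cmdline and returns the first PID
// whose command line contains needle.
func findPIDByCmdline(procRoot, needle string) (int, bool) {
	entries, err := os.ReadDir(procRoot)
	if err != nil {
		return 0, false
	}
	for _, e := range entries {
		pid, err := strconv.Atoi(e.Name())
		if err != nil {
			continue // not a PID directory
		}
		data, err := os.ReadFile(filepath.Join(procRoot, e.Name(), "cmdline"))
		if err != nil {
			continue // the process may have exited in the meantime
		}
		if strings.Contains(string(data), needle) {
			return pid, true
		}
	}
	return 0, false
}

// demoScan builds a fake /proc-like tree with hypothetical PIDs and scans it.
func demoScan() (int, bool, error) {
	root, err := os.MkdirTemp("", "fakeproc")
	if err != nil {
		return 0, false, err
	}
	defer os.RemoveAll(root)
	// cmdline entries are NUL-separated argument lists, as in the real /proc.
	fake := map[string]string{
		"101": "sleep\x00" + "60\x00",
		"202": "runc\x00" + "init\x00",
	}
	for pid, cmdline := range fake {
		dir := filepath.Join(root, pid)
		if err := os.Mkdir(dir, 0o755); err != nil {
			return 0, false, err
		}
		if err := os.WriteFile(filepath.Join(dir, "cmdline"), []byte(cmdline), 0o644); err != nil {
			return 0, false, err
		}
	}
	pid, found := findPIDByCmdline(root, "runc")
	return pid, found, nil
}

func main() {
	pid, found, err := demoScan()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("found:", found, "pid:", pid)
}
```

The same matching works for defenders: a container process that repeatedly reads every /proc/[PID]/cmdline while looking for runc is a useful detection signal.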

3.5 Container escape due to kernel vulnerabilities

From the operating system's perspective, container processes are just ordinary processes constrained by various security mechanisms. From an attack and defense perspective, container escape therefore follows the traditional privilege escalation process. Attackers can use this to broaden their approach: whenever a new kernel vulnerability appears, they can consider whether it can be used for container escape. Defenders can use it as well, for example by patching the host kernel or by detecting the characteristic behavior of exploits for that kernel vulnerability.

CVE-2016-5195

Affected versions: Linux kernel >= 2.6.22 (released in 2007, fixed on October 18, 2016)

Vulnerability principle: The Linux kernel's memory subsystem has a race condition in how it handles copy-on-write (COW), which lets an unprivileged process gain write access to private read-only memory mappings.

A PoC can be used to complete the container escape. The core idea of the exploit is to write shellcode into the vDSO and hijack calls to normal functions.

4. References

"Cloud Native Security - Attack and Defense Practices and System Construction"
https://github.com/Metarget/cloud-native-security-book
https://www.cnblogs.com/buchiyexiao/p/14702051.html