How do Virtual Machines and Containers operate?
A problem caused by Unix’s shared global filesystem is the lack of configuration isolation. Multiple applications can have conflicting requirements for system-wide configuration settings, and they can also conflict over shared libraries when each application requires a different version of the same library. These problems led administrators and developers to install applications on separate servers, which wasted resources. To overcome this, virtualization was adopted, with a focus on resource allocation and isolation.
How a Virtual Machine operates:
Scheduling and allocation:
VMs have two levels of allocation and scheduling: one in the hypervisor and one in the guest OS.
Because a VM has a static number of virtual CPUs and a fixed amount of RAM, its resource consumption is naturally bounded. A vCPU cannot use more than one real CPU worth of cycles and each page of vRAM maps to at most one page of physical RAM.
Operating systems generally assume that CPUs are always running and that memory has a relatively fixed access time. Under a hypervisor, however, vCPUs can be descheduled without notification and virtual RAM can be swapped out, causing performance anomalies that are hard to debug. Many cloud providers eliminate these problems by pinning each vCPU to a physical CPU and locking all virtual RAM into real RAM.
This essentially eliminates scheduling at the hypervisor level.
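The pinning policy described above can be sketched as a simple placement plan. This is only an illustration, not how any particular hypervisor or cloud provider implements it: the `VM` class, the `pin_vms` function, and the CPU numbering are hypothetical names chosen for this example. It shows why a pinned host has nothing left to schedule: every vCPU owns one physical CPU, all RAM is reserved up front, and a VM that does not fit is rejected instead of being overcommitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    # Hypothetical description of a guest: a fixed vCPU count and RAM size.
    name: str
    vcpus: int
    ram_mib: int


def pin_vms(vms, host_cpus, host_ram_mib):
    """Give every vCPU its own physical CPU and reserve all RAM up front.

    Returns a dict mapping VM name -> list of physical CPU indices.
    Raises ValueError if the host would have to overcommit.
    """
    plan = {}
    next_cpu = 0
    ram_left = host_ram_mib
    for vm in vms:
        if next_cpu + vm.vcpus > host_cpus or vm.ram_mib > ram_left:
            raise ValueError(f"host cannot hold {vm.name} without overcommit")
        # One-to-one mapping: vCPU i of this VM runs only on one physical CPU.
        plan[vm.name] = list(range(next_cpu, next_cpu + vm.vcpus))
        next_cpu += vm.vcpus
        ram_left -= vm.ram_mib
    return plan


if __name__ == "__main__":
    guests = [VM("web", 2, 1024), VM("db", 1, 512)]
    print(pin_vms(guests, host_cpus=4, host_ram_mib=2048))
```

The trade-off is visible in the sketch: the fourth physical CPU stays idle even if "web" could use it, which is the price of removing hypervisor-level scheduling.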
Isolation:
VMs naturally provide a certain level of isolation and security because of their narrow interface; the only way a VM can communicate with the outside world is through a limited number of hypercalls, which are controlled by the hypervisor.
While VMs excel at isolation, they add overhead when sharing data between guests or between the guest and hypervisor. Usually, such sharing requires fairly expensive marshaling and hypercalls.
How a Container operates:
Scheduling and allocation:
The Linux cgroups subsystem is used to group processes and manage their aggregate resource consumption. It is commonly used to limit the memory and CPU consumption of containers. A container can be resized by simply changing the limits of its corresponding cgroups.
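As a rough sketch of what "changing the limits" looks like, the snippet below writes new values into a container's cgroup directory. It assumes the cgroup v2 layout, where `memory.max` holds a byte limit and `cpu.max` holds a quota and a period in microseconds. The function name `resize_container` and the directory path are hypothetical, and real container runtimes normally do this on your behalf.

```python
from pathlib import Path


def resize_container(cgroup_dir, memory_bytes=None, cpu_cores=None, period_us=100_000):
    """Update the memory and CPU limits of an existing cgroup (v2 layout assumed).

    cgroup_dir   -- directory of the container's cgroup (hypothetical path)
    memory_bytes -- new hard memory limit in bytes, or None to leave unchanged
    cpu_cores    -- new CPU limit as a number of cores (e.g. 1.5), or None
    """
    cgroup = Path(cgroup_dir)
    if memory_bytes is not None:
        (cgroup / "memory.max").write_text(f"{memory_bytes}\n")
    if cpu_cores is not None:
        # cpu.max is "<quota> <period>": the group may run for <quota>
        # microseconds of CPU time in every <period> microseconds.
        quota_us = int(cpu_cores * period_us)
        (cgroup / "cpu.max").write_text(f"{quota_us} {period_us}\n")


# Example (requires root and an existing cgroup; the path is an assumption):
# resize_container("/sys/fs/cgroup/mycontainer",
#                  memory_bytes=512 * 1024 * 1024, cpu_cores=1.5)
```

No restart is involved: the kernel applies the new limits to the processes already in the group.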
Because a containerized Linux system has only one kernel, and that kernel has full visibility into the containers, there is only one level of resource allocation and scheduling.
Isolation:
Linux containers are a concept built on the kernel's namespace feature. This feature, accessed through flags to the clone() system call, allows separate instances of each kind of namespace to be created. Linux implements filesystem (mount), PID, network, user, IPC, and hostname (UTS) namespaces. For example, each filesystem namespace has its own root directory and mount table.
A namespace can be used in many different ways, but the most common approach is to create an isolated container that has no visibility into objects outside the container.
If total isolation is not desired, it is easy to share some resources among containers. For example, bind mounts allow a directory to appear in multiple containers.
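One way to see namespaces at work, without creating any, is to inspect the links the kernel exposes under `/proc/<pid>/ns`. The sketch below only reads those links. The helper names `namespace_ids` and `shared_namespaces` are made up for this example, and it assumes a Linux host with `/proc` mounted. Two processes are in the same namespace of a given type when their links for that type have the same target.

```python
import os


def namespace_ids(pid="self", proc_root="/proc"):
    """Return {namespace type: link target} for one process.

    Each entry in <proc_root>/<pid>/ns is a symbolic link whose target
    names the namespace instance, e.g. 'pid:[4026531836]'.
    """
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    return {
        name: os.readlink(os.path.join(ns_dir, name))
        for name in sorted(os.listdir(ns_dir))
    }


def shared_namespaces(a, b):
    """List the namespace types that two processes have in common."""
    return sorted(name for name in a.keys() & b.keys() if a[name] == b[name])


if __name__ == "__main__":
    # Linux only: print this process's namespaces.
    for kind, ident in namespace_ids().items():
        print(kind, ident)
```

Comparing a process inside a container with one on the host typically shows different targets for most entries, which is exactly the isolation described above.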
Communication between containers or between a container and the host (which is really just a parent namespace) is as efficient as normal Linux IPC.
Securing containers tends to be simpler than managing Unix permissions because the container cannot access what it cannot see and thus the potential for accidentally overbroad permissions is greatly reduced.