There are many sources of overhead in a virtualized system; finding them requires analysis both from the top down and from the bottom up.
Sources of overhead in QNX hypervisor systems include guest exits, which occur, for example, when a guest accesses privileged or device registers (see Guest exits and Guest-triggered exits in this chapter).
A good general rule to follow when tuning your hypervisor system for performance is that, for a guest OS in a VM, the relative costs of accessing privileged or device registers versus memory differ from those for an OS running in a non-virtualized environment.
Usually, for a guest OS, the cost of accessing memory is comparable to the cost for an OS performing the same action in a non-virtualized environment. However, accessing privileged or device registers requires guest exits, and thus incurs significant additional overhead compared to the same action in a non-virtualized environment.
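You can observe this difference with a minimal sketch like the one below, which times ordinary memory reads against reads of a device register. This is not taken from the QNX documentation: the register address (REG_PADDR) is hypothetical and must be replaced with a register of a device that the hypervisor emulates for your guest. Run in a VM, each register read of an emulated device causes a guest exit, so the per-access cost of the second loop should be far higher than in a non-virtualized environment, while the memory loop should cost about the same in both.

/*
 * Sketch: compare the cost of reading ordinary memory with the cost of
 * reading a device register. REG_PADDR is a hypothetical address; use a
 * register of a device that the hypervisor emulates for your guest.
 */
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/neutrino.h>
#include <sys/syspage.h>
#include <sys/mman.h>
#include <hw/inout.h>

#define REG_PADDR   0xf0000000u   /* hypothetical device register address */
#define ITERATIONS  100000

int main(void)
{
    /* Request I/O privileges so we can map and access device registers. */
    if (ThreadCtl(_NTO_TCTL_IO, 0) == -1) {
        perror("ThreadCtl");
        return EXIT_FAILURE;
    }

    uintptr_t reg = mmap_device_io(4, REG_PADDR);
    if (reg == MAP_DEVICE_FAILED) {
        perror("mmap_device_io");
        return EXIT_FAILURE;
    }

    volatile uint32_t mem = 0;   /* ordinary guest RAM */
    uint64_t cps = SYSPAGE_ENTRY(qtime)->cycles_per_sec;

    uint64_t t0 = ClockCycles();
    for (int i = 0; i < ITERATIONS; i++) {
        (void)mem;               /* memory access: no guest exit */
    }
    uint64_t t1 = ClockCycles();
    for (int i = 0; i < ITERATIONS; i++) {
        (void)in32(reg);         /* emulated register access: guest exit */
    }
    uint64_t t2 = ClockCycles();

    printf("memory:   %f us per access\n",
           (double)(t1 - t0) * 1e6 / cps / ITERATIONS);
    printf("register: %f us per access\n",
           (double)(t2 - t1) * 1e6 / cps / ITERATIONS);
    return EXIT_SUCCESS;
}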
For instructions on how to get more information about what your hypervisor and its guests are doing, see the Monitoring and Troubleshooting chapter.
To find the sources of this overhead, first run your guest OS and its applications in a non-virtualized environment, and record the relevant benchmark information (N for native). Then run the same system in a VM, and record the same benchmark information (V for virtual).
Usually benchmark N will show better performance, though the opposite is possible if the VM is able to optimize an inefficient guest OS.
Assuming that benchmark N was better than benchmark V, adjust the virtual environment to isolate the source of the highest additional overhead. If benchmark V was better, you may want to examine your guest OS in a non-virtualized environment before proceeding.
When you have identified the sources of the most significant increases in overhead, you will know where your tuning efforts are likely to yield the greatest benefits.
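As a starting point for collecting N and V, you can build a simple timing harness once and run the same binary unchanged in both environments. The sketch below (not from the QNX documentation) uses ClockCycles() and the syspage cycles_per_sec value; workload() is a placeholder for whatever operation you want to benchmark.

/*
 * Minimal benchmarking sketch: run the same binary in the non-virtualized
 * environment (to collect N) and in a VM (to collect V), then compare.
 * workload() is a placeholder for the operation you want to measure.
 */
#include <stdio.h>
#include <stdint.h>
#include <sys/neutrino.h>
#include <sys/syspage.h>

#define ITERATIONS 1000

static void workload(void)
{
    /* Placeholder: the operation whose overhead you want to measure. */
}

int main(void)
{
    uint64_t cps = SYSPAGE_ENTRY(qtime)->cycles_per_sec;

    uint64_t start = ClockCycles();
    for (int i = 0; i < ITERATIONS; i++) {
        workload();
    }
    uint64_t end = ClockCycles();

    printf("%f us per iteration\n",
           (double)(end - start) * 1e6 / cps / ITERATIONS);
    return 0;
}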
The hypervisor's trace events include all guest exits, which are a significant source of overhead in a hypervisor system (see Guest exits and Guest-triggered exits in this chapter).
QNX hypervisors include a hypervisor-enabled microkernel. As with all QNX microkernel systems, the bootloader and startup code pre-configure the SoC, including the use of the physical CPUs (e.g., SMP configuration) and the memory cache configuration. The hypervisor doesn't modify this configuration; however, you can modify it yourself to obtain performance gains.
For more information about how to change the bootloader and startup code, see Building Embedded Systems in the QNX Neutrino documentation, and your Board Support Package (BSP) User's Guide.
The hypervisor supports adaptive partitioning (APS). You can use APS to prevent guests from starving other guests (or even the hypervisor) of essential processing capacity. APS ensures that processes aren't starved of CPU cycles while also ensuring that system resources aren't wasted: it guarantees a group of threads a minimum share of processor time, which is available to them whenever they need it.
Thus, you can use APS to ensure that the vCPU threads in a VM hosting a guest that's running critical applications are guaranteed the physical CPU resources they need, while allowing vCPU threads in other VMs to use these resources when critical applications don't need them.
A QNX Neutrino guest can also use APS. However, if you are using APS in a guest, remember that the partitioning applies to virtual CPUs (i.e., vCPU threads), not to physical CPUs. If you don't ensure that your guest gets the physical CPU resources it needs, nothing you do with the vCPUs will matter.
For more information, see the Adaptive Partitioning User's Guide.
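As an illustration, the following C sketch creates an adaptive partition with a guaranteed CPU budget in the hypervisor host. It assumes the SchedCtl() APS interface described in the Adaptive Partitioning User's Guide; the partition name and budget are illustrative only. In practice, you would then start (or move) the qvm process that hosts your critical guest in this partition, so that its vCPU threads are guaranteed the budget.

/*
 * Sketch only: create an adaptive partition that guarantees 30% of CPU
 * time for a critical guest's qvm process (vCPU threads and all).
 * Assumes the SchedCtl() APS interface (<sys/sched_aps.h>); the partition
 * name and budget are illustrative, not taken from the QNX documentation.
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/neutrino.h>
#include <sys/sched_aps.h>

int main(void)
{
    sched_aps_create_parms params;

    APS_INIT_DATA(&params);             /* initialize the parameter block */
    params.name = "critical_guest";     /* illustrative partition name */
    params.budget_percent = 30;         /* guaranteed share of CPU time */
    params.critical_budget_ms = 0;      /* no critical (out-of-budget) time */

    if (SchedCtl(SCHED_APS_CREATE_PARTITION, &params, sizeof(params)) == -1) {
        perror("SCHED_APS_CREATE_PARTITION");
        return EXIT_FAILURE;
    }

    /* On success, the parameter block is updated with the new partition's
     * ID (see the APS documentation). Start the qvm process for the
     * critical guest in this partition so that its vCPU threads inherit
     * the guaranteed budget. */
    printf("partition created\n");
    return EXIT_SUCCESS;
}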
The QNX hypervisor supports symmetric multiprocessing (SMP) as well as asymmetric multiprocessing (AMP) and bound multiprocessing (BMP). You can configure threads in the hypervisor host to run on specific physical CPUs. These threads include vCPU threads and other threads in your qvm processes. You can pin these threads to one or several physical CPUs by using the cpu runmask option when you assemble your VMs.
For more information about QNX Neutrino support for multiprocessing, see the Multicore Processing chapter in the QNX Neutrino Programmer's Guide.
You can combine the CPU runmasks for your vCPU threads with adaptive partitioning to exert very precise control over vCPU scheduling. Remember, though, that runmasks always take priority: if the adaptive partition allows the hypervisor to allocate additional CPU time to a vCPU thread, the hypervisor can allocate the time only if the runmask permits it. That is, the hypervisor can allocate the time if it is available on a CPU on which the vCPU thread is allowed to run, but can't allocate the time if it is on a CPU that has been masked for the vCPU thread.
For example, if the runmask for vCPU thread 1 allows that thread to run on CPUs 2 and 3, and CPU 2 is fully occupied but time is available on CPU 3, then the hypervisor can allow vCPU thread 1 to run on CPU 3. However, if CPUs 2 and 3 are fully loaded but time is available on CPU 4, the hypervisor won't allow vCPU thread 1 to run there, because the runmask for this thread forbids it.
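To make the runmask in this example concrete, the sketch below shows how "allowed on CPUs 2 and 3" is encoded as a bitmask (bit n corresponds to CPU n) and applied to the calling thread with QNX Neutrino's ThreadCtl(). For vCPU threads, you would normally express the same constraint with the cpu runmask option when you assemble the VM; this snippet is only meant to show what the mask means.

/*
 * Sketch: encode and apply a runmask that allows a thread to run only on
 * CPUs 2 and 3 (bit n of the mask corresponds to CPU n). For vCPU threads,
 * set this with the cpu runmask option when assembling the VM instead;
 * this snippet just illustrates the encoding.
 */
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/neutrino.h>

int main(void)
{
    unsigned runmask = (1u << 2) | (1u << 3);   /* CPUs 2 and 3 => 0x0c */

    if (ThreadCtl(_NTO_TCTL_RUNMASK, (void *)(uintptr_t)runmask) == -1) {
        perror("ThreadCtl(_NTO_TCTL_RUNMASK)");
        return EXIT_FAILURE;
    }

    printf("calling thread now restricted to CPUs 2 and 3 (mask 0x%x)\n",
           runmask);
    return EXIT_SUCCESS;
}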
See also Scheduling in the Understanding QNX Virtual Environments chapter.