Mixed-safety Systems Using Multicore SoCs With Hypervisors and Multicore Frameworks

Learn how heterogeneous multicore systems-on-chip (SoCs), hypervisors, and multicore frameworks can be useful in mixed-safety systems.

Industry Article October 26, 2022 by Jeff Hancock, Siemens Software

In response to the embedded software industry’s persistent demand for more processing power, semiconductor manufacturers have made available new heterogeneous multicore SoCs. This hardware innovation has, in turn, necessitated corresponding software advancements. Embedded developers now need suitable software tools and practices to use heterogeneous multicore SoCs effectively. Understanding the software design choices available will allow designers to achieve a better multicore design.

Effective management and seamless orchestration of different software components on a single SoC can be achieved in multiple ways. Hypervisors and multicore frameworks are two key options for software management, each with its respective pros and cons. Below, we explore where each one applies, including in mixed-safety-criticality systems.

This article aims to give software engineers enough knowledge of these options to consider them as complementary solutions for unlocking the power of a heterogeneous multicore SoC.

Basics Behind SoCs With Multiple Processors

These aptly named SoCs can house entire systems by accommodating multiple processor cores on a single piece of hardware. The true promise of this technological innovation lies in the fact that these multiple cores may differ from one another (heterogeneous). Heterogeneous multi-processors allow for the consolidation of systems that previously ran on separate devices. Further, the utility of SoCs extends beyond just consolidating systems.

This has brought the concept of mixed-safety-criticality to the forefront: a safe domain and a non-safe domain running together on a single SoC. It should be no surprise that multicore processing is becoming increasingly popular within the embedded systems space (Figure 1).

Figure 1. The trend toward heterogeneous multiprocessing on SoCs.

However, as with most things in life, with great (compute) power comes great (software and hardware design) responsibility. As powerful as multi-processing is, it comes with its own set of challenges. The aforementioned mixed-safety-criticality may entail a combination of mission-critical subsystems with real-time/determinism requirements, secure subsystems, and safety-certified subsystems.

Ensuring reliability and safety when integrating different functionalities onto a single SoC can be challenging. It requires isolating safe and non-safe domains and establishing reliable, safe/secure communications between the different domains.

Isolation can take many forms based on different safety-criticality applications:

Physical and temporal isolation requires dedicated independent cores
Spatial isolation requires hardware protection units
System-wide spatial isolation also requires control over direct memory access (DMA) and power

Additional heterogeneous multi-processing challenges include:

Utilization—How to effectively utilize the heterogeneous compute resources present in the system.
Separation—How to enable the required level of safety separation between different software functionalities.
Resource sharing—How to effectively share system resources across consolidated system functions.
Boot order—How to control the sequence in which the software on each core starts to avoid synchronization and security issues.

Given the complexity of these systems, there is a clear need for careful resource allocation, separation, and utilization. There are two main solutions to this conundrum: hypervisors and multicore frameworks. But what are these solutions, and is either one (or both) right for you? Let’s find out.

Hardware Aspects of Hypervisors

A hypervisor is a reasonably complex, versatile software component that provides a supervisory capability over several operating systems to manage access to processor cores and peripherals, inter-operating system (OS) communications, and security.

Hypervisors can also be used in embedded applications in asymmetric multi-processing (AMP) designs that need the supervision of inter-core communication and allocation of peripherals to specific cores (Figure 2). A hypervisor can also take care of the boot sequence and manage shared peripheral access. The main advantage of a hypervisor is that if an OS crashes, the crash will not affect the execution of workloads on other cores. The hypervisor can even reboot the OS without requiring a device reboot. It is advisable to use a hypervisor specifically designed for embedded applications to ensure better performance.

Figure 2. Supervisory capabilities of hypervisors.

Hypervisors are now designed to use the underlying hardware virtualization features found on most multicore processors. This allows an unmodified guest operating system to run under the hypervisor.

Like any other technology, hypervisors have merits and demerits (Table 1).

Table 1. Pros and cons of hypervisors.

Multicore Frameworks

A few embedded runtime vendors have developed an alternative to hypervisors specifically engineered to support an AMP multicore system: the multicore framework.

Multicore frameworks are designed specifically to support multicore and multi-OS systems by providing boot order control and inter-core communications. The framework can load a system with lower overhead than a hypervisor and can even run in a bare metal environment.
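As a rough illustration of boot order control, the sketch below derives a start sequence from declared dependencies between boot stages. It is a Python model, not any vendor's framework API, and the stage names and dependency table are hypothetical. It relies on the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical boot stages, each mapped to the stages it must wait for.
BOOT_DEPENDENCIES = {
    "safety_rtos": [],
    "shared_memory_setup": ["safety_rtos"],
    "linux": ["shared_memory_setup"],
    "bare_metal_app": ["shared_memory_setup"],
}


def boot_order(dependencies):
    """Return a start sequence in which every stage follows its prerequisites."""
    # static_order() raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(dependencies).static_order())
```

Starting the safe domain first and bringing up shared memory before any non-safe OS is one plausible policy; the point is that the sequence is computed and enforced in one place rather than left to each core.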

Multicore frameworks also have their merits and demerits (Table 2).

Table 2. Pros and cons of multicore frameworks.

Understanding a Mixed-safety-criticality System

Sometimes, an embedded device must manage/process both safety-critical and non-safety-critical functions. For example, consider a car whose backup camera and radio settings are running on the same SoC. One is critical for safety, and the other most likely isn’t. The challenge is ensuring that even if the radio breaks, the camera is still functional.

A mixed-safety-criticality system requires the execution of applications of different safety-integrity levels (SILs) or different criticalities (safety-critical and non-safety-critical) on a single SoC. Both a hypervisor and a multicore framework can support this type of configuration.

Different levels of safety require certification based on different industry standards. Virtual machines can then have different criticality levels running with a certified hypervisor. The certified hypervisor provides separation, typically using underlying hardware virtualization and separation features on the SoC. However, it comes with the additional costs of certifying the extra separation code.

A multicore framework leverages other hardware-assisted separation capabilities provided by some SoC architectures to obtain the required separation between the safe and non-safe domains (Figure 3). This includes separating processing blocks, memory blocks, peripherals, and system functions. The multicore framework provides enhanced bounds checking to ensure the integrity of shared memory data structures, plus interrupt throttling and a polling mode to prevent interrupt flooding.

Figure 3. Separation on Xilinx UltraScale MPSoC.
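The interrupt throttling and polling mode mentioned above can be modeled roughly as follows. This is a Python sketch under stated assumptions: a fixed budget of interrupts per time window, with the source treated as masked once the budget is exceeded. The class name, budget, and window length are hypothetical, and real firmware would typically unmask the source from a timer rather than on the next arrival.

```python
class InterruptThrottle:
    """Cap how many interrupts from the non-safe domain are serviced per time window."""

    def __init__(self, max_per_window, window_us):
        self.max_per_window = max_per_window
        self.window_us = window_us
        self.window_start_us = 0
        self.count = 0
        self.polling_mode = False

    def on_interrupt(self, now_us):
        """Return True if the interrupt should be serviced, False if it is dropped."""
        if now_us - self.window_start_us >= self.window_us:
            # A new window begins: reset the budget and leave polling mode.
            self.window_start_us = now_us
            self.count = 0
            self.polling_mode = False
        if self.polling_mode:
            return False
        self.count += 1
        if self.count > self.max_per_window:
            # Budget exhausted: treat the source as masked; the safe side
            # polls the channel at its own pace until the window ends.
            self.polling_mode = True
            return False
        return True
```

Because the safe side never services more than the budgeted number of interrupts per window, the worst-case load imposed by the non-safe side stays bounded.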

The common feature of such systems is that authorities require the safe subsystem/partition to be certified before the product can be marketed or deployed. The certification process may differ from industry to industry, and the procedure can be expensive and time-consuming. Anything that can reduce this cost and time is a boon.

The cost and time of certification are significantly affected by the code volume. So, minimizing code size is helpful. This also affects the choice of operating system. With source code readily available, a small real-time operating system (RTOS), such as Nucleus RTOS, is an attractive option. Typically, it is impossible to have an OS certified alone because the whole application must be subject to the process. However, Siemens Embedded Nucleus SafetyCert can provide a certified package with artifacts and test cases for the RTOS to ease the process.

Figure 4 is an example of a mixed-safety system using the Xilinx Zynq UltraScale+ MPSoC. The mixed-safety system uses the Xilinx memory protection unit (XMPU) and the Xilinx peripheral protection unit (XPPU) to create a separation between the certified and uncertified parts of the system.

Figure 4. Separation on Xilinx UltraScale MPSoC.
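Conceptually, a protection unit checks each access against a table of regions and the bus masters permitted to use them. The sketch below is a simplified Python model of that idea only; it does not reflect the actual XMPU or XPPU register interface. The master names, address ranges, and default-deny policy are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    start: int                  # first address in the region
    end: int                    # one past the last address (exclusive)
    allowed_masters: frozenset  # bus masters permitted to access the region
    writable: bool


# Hypothetical partitioning; addresses and master names are illustrative only.
REGIONS = (
    Region(0x0000_0000, 0x1000_0000, frozenset({"safe_core"}), True),
    Region(0x1000_0000, 0x4000_0000, frozenset({"nonsafe_core"}), True),
    Region(0x4000_0000, 0x4010_0000, frozenset({"safe_core", "nonsafe_core"}), True),
)


def access_allowed(master, address, write):
    """Default-deny check: permit only accesses that match a configured region."""
    for region in REGIONS:
        if region.start <= address < region.end:
            return master in region.allowed_masters and (region.writable or not write)
    return False  # assumption of this model: unmapped addresses are rejected
```

Here the non-safe core can reach its own memory and the shared window but not the safe domain's memory, which is the kind of separation between certified and uncertified parts described above.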

Similarly, the NXP i.MX 8 series of SoCs uses a resource domain controller (RDC) to separate the certified and uncertified parts of the system. Similar concepts apply to SoCs from other manufacturers.

In this way, you can even use a non-safety-certified hypervisor along with a mixed-safety-criticality-enabled multicore framework.

Safety and Security in a Multicore SoC

In the past, meeting functional safety requirements meant either building separate hardware systems or certifying the entire system, including the parts that did not impact safety functions. Now, the features of a heterogeneous multicore SoC can be employed to separate the safe domain from the non-safe domain and to establish communication between them through a certified framework. This results in lower hardware and certification costs.

Besides the hardware separation, there are some things to consider from the software perspective. The most important is ensuring safe/secure communications between the different domains, which calls for the following:

Buffer validation—Buffer parameters such as the address, size, and permissions must be validated before being used by the safe domain. This includes checking bounds on the buffer and discarding any buffer outside the valid range. Buffer validation must be paired with the proper error response to provide insight into system interactions for detecting malicious activity.
Mitigate interrupt flooding—The non-safe world has the potential to flood the communication channel with interrupts. This unanticipated load can violate the temporal isolation requirements of the system if no special handling is provided. Implementing a mechanism to throttle excessive interrupts from the non-safe side, or to support polling mode on the safe side, is often needed.
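The buffer validation step can be sketched as below. This is a minimal Python model, assuming a single shared-memory window and a two-bit permission mask. The window address, size, permission flags, and descriptor layout are all hypothetical. Each rejection carries a reason so that the error response can be logged and inspected for signs of malicious activity.

```python
from dataclasses import dataclass

# Hypothetical shared-memory window that both domains are allowed to use.
SHARED_BASE = 0x3ED0_0000
SHARED_SIZE = 0x0010_0000
PERM_READ, PERM_WRITE = 0x1, 0x2
ALLOWED_PERMS = PERM_READ | PERM_WRITE


@dataclass(frozen=True)
class BufferDescriptor:
    address: int
    size: int
    perms: int


def validate_buffer(desc):
    """Return (ok, reason); the safe domain uses the buffer only when ok is True."""
    if desc.size <= 0:
        return False, "empty or negative size"
    if desc.address < SHARED_BASE:
        return False, "starts below shared window"
    # Python integers do not overflow; a C implementation would also need to
    # guard against address + size wrapping around.
    if desc.address + desc.size > SHARED_BASE + SHARED_SIZE:
        return False, "extends past shared window"
    if desc.perms == 0 or desc.perms & ~ALLOWED_PERMS:
        return False, "invalid permission bits"
    return True, "ok"
```

A descriptor received from the non-safe domain would pass through `validate_buffer` before any pointer derived from it is dereferenced, and anything that fails is discarded.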

Choosing Between a Hypervisor and a Multicore Framework

A device or instrument is only as powerful as the skill of the person using it. Effectively wielding an SoC requires making strategic choices on both the hardware and the software front. The decision to realize a design using multiple processors may be influenced by several factors: technical goals, time to market, target design, and production costs.

Using a hypervisor, a multicore framework, or both to control and manage a multicore system is a critical architecture decision.

The choice will ultimately depend on the specific application requirements and the use case for the device. The options should be considered as complementary solutions to unlock the power of a multicore SoC. The availability and understanding of these choices allow the designer to achieve a more optimal multicore design.

For additional information on multicore and hypervisor solutions, visit this website.

All images used courtesy of Siemens