Demystifying the concept of static error recovery mechanism as specified in ISO 26262 Part-6.


A static recovery mechanism refers to actions that are decided before the software runs. For example, the reset that follows a watchdog fault (or a stack overflow fault, an illegal memory access fault, etc.) is a predefined action provided by the microcontroller. In a purely static recovery like this, the system loses all of its context and starts over from level 0 (a rebirth).

However, this is not always useful. Imagine that you were downloading a big file and the system crashed after 90% of the file had been downloaded, or that an ADAS function was in the middle of emergency braking and lost all of its context because of a transient fault. Many systems also have start-up delays, which can further affect control actions.

The main task in backward error recovery is to move the system from its current, faulty state back to a previous state that is known to be correct and safe. This can be achieved with a technique called checkpointing. Backward error recovery requires the system to routinely save its state to stable storage such as keep-alive RAM. For example, some microcontrollers provide an interrupt at around 75-80% of the watchdog expiry period, in which a snapshot of the system state (the local variables of all tasks, global variables and SFRs) can be stored in keep-alive memory, so that execution can resume from the point where it left off rather than start all over again. Please note that this is just one simple example of a backward error recovery technique. The topic itself is very complex, and several interesting publications are already available on how checkpointing can be done for hard real-time embedded systems.
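The checkpointing idea described above can be sketched in C. This is only a minimal illustration under stated assumptions, not a reference implementation: `app_state_t`, `checkpoint_save`, `checkpoint_restore` and `keep_alive_ram` are hypothetical names, the keep-alive RAM is modelled as an ordinary static variable, and a simple rotate-and-XOR checksum stands in for the CRC a production system would more likely use. Saving SFRs and task stacks is target-specific and is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical application state worth preserving across a reset. */
typedef struct {
    uint32_t bytes_downloaded;
    uint32_t next_block;
} app_state_t;

/* Snapshot record: a marker and a checksum let the restore path tell a
 * valid checkpoint from stale or corrupted memory. */
typedef struct {
    uint32_t magic;
    app_state_t state;
    uint32_t checksum;
} checkpoint_t;

#define CHECKPOINT_MAGIC 0xC0FFEE01u

/* Stand-in for keep-alive RAM (assumption: on a real target this object
 * would be placed in a retained, non-initialised section by the linker). */
static checkpoint_t keep_alive_ram;

/* Simple rotate-and-XOR checksum; a real system would likely use a CRC. */
static uint32_t checksum32(const void *data, size_t len)
{
    const uint8_t *p = (const uint8_t *)data;
    uint32_t sum = 0u;
    for (size_t i = 0; i < len; i++) {
        sum = ((sum << 1) | (sum >> 31)) ^ p[i];
    }
    return sum;
}

/* Save a checkpoint, e.g. from a watchdog pre-warning interrupt. */
void checkpoint_save(const app_state_t *state)
{
    assert(state != NULL);
    keep_alive_ram.magic = 0u;                /* invalidate while updating */
    keep_alive_ram.state = *state;
    keep_alive_ram.checksum = checksum32(state, sizeof *state);
    keep_alive_ram.magic = CHECKPOINT_MAGIC;  /* mark valid last */
}

/* Called early after reset. Returns true and fills *state if a valid
 * checkpoint exists; returns false so the caller can do a cold start. */
bool checkpoint_restore(app_state_t *state)
{
    assert(state != NULL);
    if (keep_alive_ram.magic != CHECKPOINT_MAGIC) {
        return false;
    }
    if (keep_alive_ram.checksum !=
        checksum32(&keep_alive_ram.state, sizeof keep_alive_ram.state)) {
        return false;
    }
    *state = keep_alive_ram.state;
    return true;
}
```

In this sketch the start-up code would call `checkpoint_restore` first and fall back to the normal level-0 initialisation whenever it returns false, so a missing or corrupted snapshot degrades to the static recovery described earlier.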

In forward error recovery, as the name suggests, the system is not returned to a previous, checkpointed state when it has entered an incorrect state. Instead, an effort is made to place the system in a correct new state from which it can continue to operate. The fundamental issue with forward error recovery techniques is that potential errors must be anticipated in advance. Only then is it feasible to correct those errors and move to a new state. Memories that support SECDED (single error correction, double error detection) are a good example of forward error recovery, and their use is highly advisable in ISO 26262 ASIL-D systems.
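To make the SECDED idea concrete, here is a small C sketch of an extended Hamming (8,4) code, which protects 4 data bits with 3 Hamming parity bits plus one overall parity bit. It is an illustration under assumptions: real ECC memories do this in hardware on much wider words, and the names `secded_encode`, `secded_decode` and `secded_status_t` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    SECDED_OK,            /* no error detected */
    SECDED_CORRECTED,     /* single-bit error found and repaired */
    SECDED_UNCORRECTABLE  /* double-bit error detected, data not usable */
} secded_status_t;

/* XOR of all 8 bits: 0 for even parity, 1 for odd parity. */
static uint8_t parity8(uint8_t x)
{
    x ^= (uint8_t)(x >> 4);
    x ^= (uint8_t)(x >> 2);
    x ^= (uint8_t)(x >> 1);
    return (uint8_t)(x & 1u);
}

/* Codeword layout: bit 0 = overall parity, bits 1, 2, 4 = Hamming
 * parity, bits 3, 5, 6, 7 = data bits d0..d3. */
uint8_t secded_encode(uint8_t nibble)
{
    uint8_t d0 = (uint8_t)(nibble & 1u);
    uint8_t d1 = (uint8_t)((nibble >> 1) & 1u);
    uint8_t d2 = (uint8_t)((nibble >> 2) & 1u);
    uint8_t d3 = (uint8_t)((nibble >> 3) & 1u);
    uint8_t p1 = (uint8_t)(d0 ^ d1 ^ d3);  /* covers positions 3, 5, 7 */
    uint8_t p2 = (uint8_t)(d0 ^ d2 ^ d3);  /* covers positions 3, 6, 7 */
    uint8_t p4 = (uint8_t)(d1 ^ d2 ^ d3);  /* covers positions 5, 6, 7 */
    uint8_t cw = (uint8_t)((p1 << 1) | (p2 << 2) | (d0 << 3) | (p4 << 4) |
                           (d1 << 5) | (d2 << 6) | (d3 << 7));
    cw |= parity8(cw);                     /* make overall parity even */
    return cw;
}

secded_status_t secded_decode(uint8_t cw, uint8_t *nibble_out)
{
    assert(nibble_out != NULL);
    secded_status_t status = SECDED_OK;
    uint8_t syndrome = 0u;

    /* XOR of the positions of all set bits is 0 for a valid codeword
     * and equals the error position after a single bit flip. */
    for (uint8_t pos = 1u; pos <= 7u; pos++) {
        if ((cw >> pos) & 1u) {
            syndrome ^= pos;
        }
    }
    uint8_t overall = parity8(cw);

    if (syndrome != 0u && overall == 0u) {
        return SECDED_UNCORRECTABLE;            /* two bits flipped */
    }
    if (syndrome != 0u) {
        cw = (uint8_t)(cw ^ (1u << syndrome));  /* repair the flipped bit */
        status = SECDED_CORRECTED;
    } else if (overall != 0u) {
        cw = (uint8_t)(cw ^ 1u);                /* overall parity bit flipped */
        status = SECDED_CORRECTED;
    }

    *nibble_out = (uint8_t)(((cw >> 3) & 1u) | (((cw >> 5) & 1u) << 1) |
                            (((cw >> 6) & 1u) << 2) | (((cw >> 7) & 1u) << 3));
    return status;
}
```

The decoder never rolls back to an earlier state. A single flipped bit is repaired in place and execution continues, while a double flip is only reported so that the system can react, which is the forward recovery behaviour described above.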


