PT 11: Functional safety in high demand: Categories 3 and 4 according to ISO 13849-1

Summary

This article is part of a series on Functional Safety of Machinery. We recently introduced one of the two standards used to design Safety Control Systems in High Demand: ISO 13849-1. As a reminder, ISO 13849-1 is based upon a complex Markov model from which five categories have been derived. Each category describes the required behaviour of a subsystem with respect to its resistance to faults, based upon the design considerations previously indicated (MTTF_D, DC_avg, etc.).

Last year we presented Categories B, 1 and 2. We now continue with Category 3 and Category 4. We present their key features and discuss in detail how the Diagnostic channel can be handled.

What is Functional Safety?

You enter the domain of Functional Safety every time you decide to use an Automation System to reduce the risk associated with a Machine or a Process. The risk is normally reduced by removing the energy sources: these can be electrical (a motor that drives a dangerous movement), pneumatic or hydraulic, but they can also come from process fluids, such as methane gas feeding a burner or a pump that increases the pressure in a tank. Whenever you decide that, to reduce the risk, a pressure sensor must cause a valve to close when a dangerously high value is reached, Functional Safety plays the key role.

An example, more tailored to Machinery, is the case of a Robot Cell; each time an operator enters the cell, an Interlock, placed on the entry gate, must trigger a safety function that stops all the dangerous movements of the robot that operates inside the cell.

The issue is that any element of this Safety Function can fail. The higher the risk a Safety Function reduces, the higher its reliability has to be.

The estimation of the reliability of Safety Functions, implemented using a Control System, is the domain of Functional Safety.

The parts of the machinery control system that provide safety functions are defined by ISO 13849-1 as Safety‐related Parts of Control System (SRP/CS), while IEC 62061 defines them as Safety-related Control System (SCS). These can consist of hardware and/or software and can either be separated from the machine control system or be an integral part of it.

As a reminder, both ISO 13849-1 and IEC 62061 decompose a Safety-related Part of the Control System (SRP/CS) into Subsystems, usually made of:

Input (Sensor)
Logic Solver
Output (Final Element)

Introduction to Category 3 of ISO 13849-1

In Category 3, both Basic and Well-tried safety principles must be used. Each Category 3 subsystem should be designed so that a single fault does not lead to the loss of the safety function.

Moreover, whenever reasonably practicable, a single fault shall be detected at or before the next demand upon the safety function.

Figure 1: Category 3 Architecture

Keys:

i_m represents the interconnecting means, typically electrical wires
I₁ and I₂ represent the Inputs (for example two interlocking devices)
L represents the Safety Logic; usually a Safety Module (non-programmable) or a Programmable Logic.
O represents the output; it can be a contactor or a solenoid valve, for example
c is the cross monitoring
m is the monitoring; dashed lines represent reasonably practicable fault detection

The Diagnostic Coverage (DC_avg) of each subsystem shall be at least low. The MTTF_D of each redundant channel shall be low‐to‐high, depending upon the required performance level (PLr).

Measures against CCF are applied.

In Category 3, the requirement of single‐fault detection does not mean that all faults will be detected. Consequently, the accumulation of undetected faults can lead to a hazardous situation at the machine.

The subsystem behaviour of this Category is therefore characterised by:

Continued performance of the safety function in the presence of a single fault.
Detection of some, but not all, faults.
Possible loss of the safety function, due to accumulation of undetected faults.
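The quantifiable part of the Category 3 requirements above can be sketched as a simple check. This is only an illustration: the function names and the CCF score of 70 are hypothetical, the range boundaries are the usual ISO 13849-1 ones (DC_avg: low from 60%, medium from 90%, high from 99%; MTTF_D: low from 3 years, medium from 10 years, high from 30 years), and the structural requirements (single-fault tolerance, basic and well-tried safety principles) still need a design review that no script can replace.

```python
def dc_level(dc_avg_percent: float) -> str:
    """Return the name of the DC_avg range for a value in percent."""
    if dc_avg_percent < 60:
        return "none"
    if dc_avg_percent < 90:
        return "low"
    if dc_avg_percent < 99:
        return "medium"
    return "high"


def mttfd_level(mttfd_years: float) -> str:
    """Return the name of the MTTF_D range for a channel, in years."""
    if mttfd_years < 3:
        return "below low"
    if mttfd_years < 10:
        return "low"
    if mttfd_years < 30:
        return "medium"
    return "high"


def category_3_checks(dc_avg_percent: float, mttfd_years: float, ccf_score: int) -> dict:
    """Check only the numeric Category 3 conditions (hypothetical helper)."""
    return {
        "dc_avg_at_least_low": dc_level(dc_avg_percent) != "none",
        "mttfd_at_least_low": mttfd_level(mttfd_years) != "below low",
        "ccf_measures_sufficient": ccf_score >= 65,
    }


# Example values: DC_avg = 98 %, MTTF_D = 100 years, assumed CCF score of 70
checks = category_3_checks(98, 100, 70)
print(dc_level(98), mttfd_level(100), all(checks.values()))
```

Returning the individual checks rather than a single boolean makes it easier to see which condition fails when a subsystem does not qualify.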

Diagnostic Coverage in Category 3

We see applications where the diagnostics are performed externally to the Safety Logic or, more precisely, externally to the Functional Channel. In that case it must be justified that the simplified approach (Annex K of ISO 13849-1) is still applicable. Determination of PFH_D based on other modelling techniques is also possible. Figure 2 shows what the Safety-related Block Diagram would look like.

Figure 2: Category 3 Architecture with External Test Equipment

The issue is that the PFH_D values of Table K.1 do not correctly represent the Reliability level of that SRP/CS, so an updated Markov model should be built. However, we estimate that the difference in PFH_D would not be significant, provided at least a Low MTTF can be claimed for the Test Equipment. The requirement for the MTTF of the Test Equipment is therefore lower than in Category 2, since in Categories 3 and 4 we have a redundant subsystem.

Another condition is that, when a fault is detected in one of the two channels (for example, the welding of the contacts in a contactor), the safe state (the other contactor is expected to open) is maintained until the fault is cleared. For that reason, although the contactor status can be monitored in an Automation PLC, a signal must be sent to the Safety Logic that blocks any reset until the fault is cleared (contactor K_R in figure 3). To state it another way: the diagnostics can go through a general-purpose PLC, but the reaction has to be safe. This also means that no single fault can lead to the loss of the safety function.

In the IFA report [6] there are a few examples where a general-purpose PLC is used for diagnostics in a Category 3 or 4 Subsystem and the simplified approach is kept.

Figure 3: Category 3 Architecture with External Test Equipment

Example of Category 3 for Input Subsystem: Interlocking Device

In this example, a Type 2 Interlocking Device (please refer to ISO 14119 for further details), mounted on a movable gate that gives access to a safeguarded area, is monitored by a Safety PLC. When the guard is opened, the Safety Logic must detect it and take appropriate action. We focus on the input subsystem.

For the electrical circuit shown in figure 4, the Safety-related Block Diagram input subsystem is shown in figure 5.

Safety data:

Interlocking device B₁ has B_10D = 1·10⁶ and a Mission Time of 20 years.

Usage frequency:

The interlocking device is assumed to be opened ten times per hour.

Avoidance of Systematic Failures

Since we claim Category 3 (dual channel with monitoring), basic and well-tried safety principles are applied. We also verified that sufficient measures have been applied to prevent common cause failures (CCF score > 65).

Fault Exclusion:

Considering the way the interlocking device was installed, we consider the probability of breakage of the actuator negligible. Therefore, we apply a fault exclusion to the mechanical part of the Interlocking Device: that is the meaning of the FE element in Figure 5. In Figure 6 we show another possible and equivalent representation.

Figure 4: Interlocking device input subsystem

Figure 5: Input subsystem represented as RBD

Figure 6: Another way to represent the Input subsystem as RBD

The first step is to determine the MTTF_D of each channel. Since the subsystem is in Category 3, the value used in the calculation must be limited to 100 years.
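As a rough illustration of this step, the sketch below derives MTTF_D from B_10D with the usual ISO 13849-1 relations n_op = d_op · h_op · 3600 / t_cycle and MTTF_D = B_10D / (0,1 · n_op). The article gives B_10D and the ten openings per hour (one cycle every 360 s), but not the operating time: the 16 hours per day and 365 days per year used here are assumptions, and the function names are hypothetical.

```python
def operations_per_year(days_per_year: float, hours_per_day: float, cycle_time_s: float) -> float:
    """Mean number of annual operations n_op."""
    return days_per_year * hours_per_day * 3600 / cycle_time_s


def mttfd_from_b10d(b10d: float, n_op: float) -> float:
    """MTTF_D in years for a component characterised by B_10D."""
    return b10d / (0.1 * n_op)


B10D = 1_000_000          # from the safety data of the interlocking device
CYCLE_TIME_S = 3600 / 10  # ten openings per hour -> one cycle every 360 s
DAYS_PER_YEAR = 365       # ASSUMPTION: not stated in the article
HOURS_PER_DAY = 16        # ASSUMPTION: not stated in the article
CATEGORY_3_CAP_YEARS = 100

n_op = operations_per_year(DAYS_PER_YEAR, HOURS_PER_DAY, CYCLE_TIME_S)
mttfd = mttfd_from_b10d(B10D, n_op)
mttfd_used = min(mttfd, CATEGORY_3_CAP_YEARS)

print(f"n_op = {n_op:.0f} cycles/year")
print(f"MTTF_D = {mttfd:.1f} years, value used = {mttfd_used:.0f} years")
```

Under these assumptions the calculated MTTF_D exceeds 100 years, which is why the capped value of 100 years is the one carried into Table K.1.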

As a second step, we estimate the Diagnostic Coverage. The cross monitoring of the inputs can only be performed when the guard is opened; therefore, it cannot be classified as “Cross monitoring of input signals with dynamic test”, since the test is not automatic and dynamic, even if the trigger is present. It is simply a “cross monitoring of input signals without dynamic test”. The corresponding DC can vary from 0% to 99%, “depending on how often a signal change is done by the application”. Since the gate is opened at least once per day, the maximum DC value that can be reached is 99%. Since we have the trigger, we assume the highest value within the medium range: DC = 98% (medium).

The last step is to refer to Table K.1 of ISO 13849-1 where, for Category 3 with medium DC and MTTF_D = 100 years, PFH_D = 4,29·10^-8, which corresponds to PL e. However, since we applied a fault exclusion to the mechanical part of the interlocking device (§ 4.11.2.3), the Performance Level is limited to PL d and the final result is:

PFH_D = 4,29·10^-8 ; PL d
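The reasoning behind this result can be sketched in a few lines: the PFH_D is first mapped to a Performance Level using the PFH_D bands of ISO 13849-1, and the result is then limited to PL d because of the fault exclusion. The function names are hypothetical, and treating any PFH_D below 10^-7 as PL e (including values below 10^-8) is a simplification made for this sketch.

```python
PL_ORDER = "abcde"  # from lowest to highest Performance Level


def pl_from_pfhd(pfhd: float) -> str:
    """Map a PFH_D value (1/h) to a Performance Level."""
    if pfhd <= 0:
        raise ValueError("PFH_D must be positive")
    if pfhd < 1e-7:
        return "e"  # simplification: values below 1e-8 are also reported as PL e
    if pfhd < 1e-6:
        return "d"
    if pfhd < 3e-6:
        return "c"
    if pfhd < 1e-5:
        return "b"
    if pfhd < 1e-4:
        return "a"
    raise ValueError("PFH_D too high for any Performance Level")


def limit_pl(pl: str, max_pl: str) -> str:
    """Return the lower of the calculated PL and an imposed upper limit."""
    return min(pl, max_pl, key=PL_ORDER.index)


pfhd = 4.29e-8                           # Table K.1 value quoted in the text
pl_calculated = pl_from_pfhd(pfhd)       # PL from the PFH_D alone
pl_final = limit_pl(pl_calculated, "d")  # limit due to the mechanical fault exclusion

print(pl_calculated, pl_final)
```

Note that the PFH_D itself is not changed by the limitation: only the claimed Performance Level is reduced.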

Useful Lifetime verification

The concept is present in both IEC 62061 and ISO 13849-1. We need to verify whether the interlocking device has to be replaced before its mission time expires. The calculation gives a useful lifetime of 17 years: the device has to be replaced after 17 years and cannot be used for the whole 20-year mission time.
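The useful lifetime check can be written out as follows. This is a sketch that assumes the device operates 16 hours per day, 365 days per year: those operating hours are not stated in the text and were chosen because they reproduce the 17-year figure. The symbols d_op, h_op and t_cycle follow the usual ISO 13849-1 notation for the mean number of annual operations n_op.

```latex
\begin{align}
n_{op} &= \frac{d_{op} \cdot h_{op} \cdot 3600\ \mathrm{s/h}}{t_{cycle}}
        = \frac{365 \cdot 16 \cdot 3600}{360}
        = 58\,400\ \text{cycles/year} \\
T_{10D} &= \frac{B_{10D}}{n_{op}}
         = \frac{1 \cdot 10^{6}}{58\,400}
         \approx 17.1\ \text{years} < 20\ \text{years (mission time)}
\end{align}
```

With a different operating schedule the result changes proportionally, so the replacement interval should always be recalculated for the actual application.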

Category 4

In Category 4, both Basic and Well-tried safety principles must be used. Each Category 4 subsystem should be designed so that a single fault does not lead to the loss of the safety function.

Moreover, the single fault must be detected at or before the next demand upon the safety function. When this detection is not possible, an accumulation of undetected faults should not lead to the loss of the safety function.

Figure 7: Category 4 Architecture

Solid lines for monitoring represent a Diagnostic Coverage that is higher than in Category 3

The Diagnostic Coverage (DC_avg) of each subsystem shall be high. The MTTF_D of each redundant channel shall be high and measures against CCF shall be applied.

The subsystem behaviour of this Category is therefore characterised by:

Continued performance of the safety function in the presence of a single fault.
Detection of faults in time to prevent the loss of the safety function.
The accumulation of undetected faults is considered.

As you can see, Category 4 is very similar to Category 3; that is the reason why IEC 62061 has one architecture that covers both Category 3 and Category 4: Architecture D (1oo2D).

When using Table K.1 to assess the reliability level of a Safety System, the Diagnostic Coverage of both channels must be at least 99% for the subsystem to be designated Category 4.

A Category 4 subsystem can reach Performance Level PL e, which is equivalent to SIL 3 and is the maximum level reachable using ISO 13849-1 or IEC 62061.