4.1. INTRODUCING A SAFETY MANAGEMENT SYSTEM
66. Many organizations will already have the components of an effective safety management system in place. However, in some cases these may not have been explicitly recognized and developed as part of a coherent safety management system with the general components identified in Figs 1 and 2.
67. In introducing a safety management system, or in reviewing or upgrading existing systems, the following guidance may provide a useful benchmark against which existing systems can be assessed:
—Existing processes and procedures affecting safety can be identified and assessed against the headings identified in this report (or some comparable alternative classification). This may permit deficiencies to be easily identified.
—In some cases, there may exist more than one process within the organization which seeks to achieve the same objective. This may present an opportunity to reduce duplication or overlap. It may also improve clarity with respect to organizational requirements and systems and encourage the adoption of unified best practices across the organization.
—The process of classifying and documenting existing systems may lead to the identification of areas for improvement in the system. For example, it may be that audit, review and feedback systems are predominantly reactive rather than proactive and the balance between these approaches might therefore be adjusted.
—Where analysis of the current safety management system identifies significant deficiencies in the existing system, it is important to introduce remedial measures on a planned and prioritized basis. A useful first step is to assess which deficiencies or shortfalls present the greatest potential threat to safety and seek to introduce or improve systems in these areas as the top priority, moving to lower priority areas at a later stage.
—The checklist given in the Appendix of this report may be of further use as a prompt in order to assess whether the safety management system contains all the desired components and whether these are effective.
—In documenting the organization’s system for safety, it is often helpful to clarify:
—who is responsible for a particular part of the system;
—what is the purpose of the process;
—how the process operates and fits into the overall system.
—The clarity and transparency introduced by a systematic review of the safety management system provides a starting point against which the system can be reviewed and audited in future. The existence of a documented system, with a clear, logical basis that has been benchmarked against best practices elsewhere should provide additional confidence and assurance to the regulatory body that there exists a satisfactory system for managing safety.
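As an illustration of the first, second and fourth points above, the following minimal Python sketch maps existing processes onto a set of component headings, lists headings for which no process exists (possible deficiencies), lists objectives served by more than one process (possible duplication) and ranks deficiencies by an assessed threat to safety. The heading names, process names, objectives and threat scores are hypothetical placeholders introduced for the example; an organization would substitute its own classification and its own judgement of safety significance.

```python
from collections import defaultdict

# Hypothetical component headings; substitute the classification actually used.
HEADINGS = [
    "safety requirements and organization",
    "planning, control and support",
    "implementation",
    "audit, review and feedback",
]

# Hypothetical inventory: (process name, heading, objective it serves).
processes = [
    ("corporate safety policy", "safety requirements and organization", "set safety goals"),
    ("work permit system", "planning, control and support", "control work on plant"),
    ("contractor work control", "planning, control and support", "control work on plant"),
    ("event investigation", "audit, review and feedback", "learn from events"),
]

def assess(processes, headings):
    """Return (headings with no process, objectives served by several processes)."""
    by_heading = defaultdict(list)
    by_objective = defaultdict(list)
    for name, heading, objective in processes:
        by_heading[heading].append(name)
        by_objective[objective].append(name)
    gaps = [h for h in headings if not by_heading[h]]
    overlaps = {o: names for o, names in by_objective.items() if len(names) > 1}
    return gaps, overlaps

def prioritize(threat_scores):
    """Order deficiencies from highest to lowest assessed threat to safety."""
    return sorted(threat_scores, key=threat_scores.get, reverse=True)

gaps, overlaps = assess(processes, HEADINGS)
# gaps     -> ["implementation"]
# overlaps -> {"control work on plant": ["work permit system", "contractor work control"]}
```

The threat scores passed to prioritize are a matter of judgement; the sketch only fixes the order in which remedial measures are addressed once those judgements have been made.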
68. It is often useful to ensure that there exists a hierarchy of documented requirements as part of an overall quality system. At the ‘highest’ level in the system there will generally be a statement of corporate safety policy. From this starting point, a logical progression of requirements can be developed. For example, the policy and goals of the organization can lead to a statement of the processes and responsibilities that exist to achieve the goals. Below this, standards can define management expectations for the safety of particular processes. In turn, these can lead on to instructions or procedures used in day to day operations. It is important that these be seen as useful and relevant by those who use them. Staff involvement in producing and reviewing such a hierarchy of requirements should not only improve understanding of safety, but also improve ‘ownership’, because the relevance of those parts of the safety management system affecting the day to day work of the individual will be seen in its overall context as part of a planned system to ensure and improve safety throughout the organization.
69. In principle, it should be possible for all staff to recognize the existence of an unbroken chain of requirements and organizational processes and responsibilities from the boardroom to the workplace, through a logical and consistent auditable trail. The production of an overview document explaining the overall system to all staff in the organization is often beneficial. This helps to ensure a clearer understanding in all parts of the organization of why various components of the safety management system exist and how they are interrelated.
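The idea of an unbroken, auditable chain can be sketched in a few lines of Python. The example assumes, purely for illustration, that each documented requirement records the single requirement immediately above it; the document names are hypothetical. The function follows these links upwards and reports any document whose chain does not reach the corporate safety policy.

```python
# Hypothetical document hierarchy: each document names the requirement above it.
parent = {
    "corporate safety policy": None,                      # top of the hierarchy
    "processes and responsibilities": "corporate safety policy",
    "maintenance standard": "processes and responsibilities",
    "pump overhaul procedure": "maintenance standard",
    "local lifting instruction": "lifting standard",      # parent is not documented
}

def chain_to_top(doc, parent, top="corporate safety policy"):
    """Return the chain from doc up to the policy, or None if it is broken."""
    chain = [doc]
    seen = {doc}
    while chain[-1] != top:
        up = parent.get(chain[-1])
        if up is None or up in seen:   # missing link or circular reference
            return None
        chain.append(up)
        seen.add(up)
    return chain

# Documents that cannot be traced back to the corporate safety policy.
broken = [d for d in parent if chain_to_top(d, parent) is None]
# broken -> ["local lifting instruction"]
```

A real hierarchy may allow a procedure to implement several standards; the single-parent structure here is a simplification chosen to keep the example short.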
4.2. MANAGEMENT OF SAFETY DURING ORGANIZATIONAL CHANGE
70. It is widely recognized that systems are required in all organizations which operate potentially hazardous plant to ensure that any engineered changes to the plant are properly considered in safety terms before being implemented. For those operational or engineered changes which have the highest potential for degrading safety if they do not meet intended standards or are not implemented satisfactorily, systems should be in place to ensure that proposed changes are closely and independently scrutinized before changes to the plant take place.
71. In recent years, the need to reduce costs and improve efficiency, combined with changes to the structure of electrical utilities and, in some cases, the change of ownership (e.g. privatization) of industries, has led many companies to consider how they might improve work processes and change organizational structures. This has often resulted in reductions in numbers of staff and changes in responsibilities, personnel and interfaces within the organization and greater use of contractors to carry out work. Such changes can lead to either improvements or reductions in safety, depending to a large degree on how they are planned and introduced.
72. For example, safety can potentially be improved by introducing shorter lines of communication, providing clearer accountabilities and simplifying and reducing organizational interfaces. As a specific example, improved planning and work control can increase the productivity of plant maintenance which, in turn, can lead to a reduced maintenance backlog. This is likely to decrease the number of equipment problems with a beneficial effect in reducing the number of plant events and challenges to safety systems. Better planning and work control also means that control room operations staff, maintenance technicians, system engineers, radiation protection personnel and planners are able better to co-ordinate their activities. This increased team working means that changes to the plant can be carried out more efficiently and effectively, with a potential safety benefit.
73. However, pressures arising from organizational change have the potential, if the changes are inadequately effected, to reduce safety. Three examples illustrate the potential dangers. First, pressure for short refuelling outages can lead to inadequate investigation of equipment condition. This, in turn, can lead to short term repairs which can subsequently result in unscheduled forced outages. Second, unless control systems are in place and care is taken to ensure that standards are maintained, a substantial increase in use of contractors can potentially compromise safety. A third example arises when, in attempting to work more effectively under economic and time pressures, workers fail to comply with safety rules or procedures in a misguided attempt to assist the organization to reduce costs. It is vital that management neither encourage such behaviour nor condone it, but make it clear to staff that this is neither intended nor acceptable.
74. Many of the potential adverse effects of organizational change on safety can be avoided if consideration is given to the effects of such change on the maintenance of acceptable levels of safety before changes are allowed to take place. By analogy with the processes in place to categorize the safety significance of proposed engineering changes, organizations should establish a system to assess in advance the impact of organizational change, to the extent warranted by its assessed potential safety significance.
75. It is important that, for significant changes, an implementation plan be drawn up which recognizes the need to scrutinize the effects on safety of the proposed changes as they proceed and which recognizes circumstances under which countermeasures might need to be applied should adverse effects on safety become apparent. For such changes, independent internal review may also be required. The regulatory body or bodies will also need to be fully informed about changes with potentially significant effects on safety so that it or they can independently assess the proposed changes, and can inspect and if necessary intervene if they conclude that safety is being jeopardized.
76. For changes where it is judged that potentially significant effects on safety could arise, assessments should ensure the following:
—The final organizational structure needs to be fully acceptable in safety terms. In particular, it is important to ensure that adequate provision has been made to maintain a suitable level of trained and competent staff in all areas critical to safety and that any new systems introduced have been documented with clear and well understood roles, responsibilities and interfaces. All necessary retraining requirements should have been identified by, for example, carrying out a training needs analysis of each of the new roles and planning for retraining of key staff where this has been identified as necessary. These issues are particularly important if personnel from outside the operating organization are to be used for work which has traditionally been carried out internally or if their role is to be otherwise substantially extended.
—The transitional arrangements need to be fully secure in terms of safety. For example, it is important that sufficient existing safety critical expertise be maintained until training programmes are complete and that organizational changes not be made in such a way as to lose clarity about roles, responsibilities and interfaces. Any significant departure from preplanned transitional arrangements should be subject to further review.
77. Organizational change can potentially have broader effects important to maintaining high levels of safety. For example, it is important that the overall strategy for introducing change should recognize the potential for adverse effects on morale and motivation. Changes that are not understood or accepted by the parts of the organization and individuals affected are likely to lead to reduced morale among staff. Good communication and involvement of staff in the change process can often reduce such undesirable consequences. Planning of change that involves staff and their representatives, together with briefing and joint review during the process, is therefore desirable. This may serve not only to improve commitment and ownership, but also to enable new issues to be identified as they arise.
4.3. MONITORING EFFECTIVENESS USING PERFORMANCE MEASURES
78. An important part of the process of audit, feedback and review shown in Fig. 2 is to allow the objective assessment of safety performance within the organization. Therefore, wherever possible and meaningful, measurable indicators of safety performance should be introduced. Monitoring of the measures of safety performance is a management responsibility. While staff can compile the data and develop the reports or summaries, the task of monitoring the results and determining which actions are called for is a vital line management function.
79. The introduction of performance measures enables an organization to set safety targets and to trend performance for the organization as a whole, for individual nuclear power plants and, where feasible, for organizational units within a plant. The inclusion of quantitative performance indicators that are defined nationally or internationally (e.g. those defined by WANO) also allows the organization and individual plants to benchmark their performance against national and international standards. To achieve this it is helpful to adopt a set of indicators; current approaches to these are discussed below.
80. There is general agreement that no one indicator has been developed that provides a measure of nuclear safety. A range of indicators needs to be considered in order to provide a general sense of the overall performance of a nuclear plant and its trend over time.
81. These can be measures of recent performance, achievement of actions to improve safety and measures of the attitudes and behaviour of staff. Most conventional quantitative indicators measure historical performance (they are often referred to as ‘output’ or ‘lagging’ indicators) and thus their predictive capacity arises from extrapolation of trends or comparisons with past performance. Forward looking indicators (sometimes referred to as ‘input’ or ‘proactive’ indicators) which measure positive efforts to improve safety are particularly valuable, although they are recognized as being more difficult to develop and measure objectively. Measures of personnel behaviour and attitudes, although more qualitative in nature, can provide a significant input to judgements about overall safety performance. Although results are usually more difficult to interpret, they have the advantage of providing direct feedback from operational staff and provide opportunities for incipient safety issues to be detected and early signs of deteriorating performance to be identified.
82. In the development of quantitative measures, it is important to recognize potential pitfalls in their interpretation and use:
—Improvement measures usually take a substantial time to be reflected in performance data, particularly when data are analysed on a rolling basis (e.g. monthly data analysed on a 12 month rolling average).
—Care needs to be taken in setting targets and analysing data when dealing with small numbers. Statistical fluctuations can easily mask trends.
—Whenever possible, quantitative measures should not relate solely to failures (e.g. number of events, number of accidents, etc.). Ideally, measures should also be designed to ensure progress on those activities which will improve safety. For example, the reporting of ‘near misses’, the number of safety inspections and the provision of safety training can all be used as input measures.
—In the development of reporting systems, account needs to be taken of local and cultural aspects that may inhibit reporting, e.g. the response of management to individuals associated with an event, local reward systems based on a reduction in accidents or the number of reported events and a culture which accepts injuries as a part of normal life.
—Numerical measures must always be subject to careful interpretation and be used as part of an overall judgement about safety performance. They should not be regarded as an end in themselves.
—Indicators should be periodically reviewed and their relative importance may change with time. The use of a fixed set of indicators that do not reflect the evolution of the organization and its requirements should be avoided.
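The first two pitfalls above lend themselves to a short numerical illustration. The Python sketch below uses invented monthly event counts to show how slowly a 12 month rolling average responds to a step improvement, and uses a Poisson model (an assumption made for this example, not a method prescribed here) to show how often a small annual count can rise by chance alone.

```python
import math

def rolling_mean(values, window=12):
    """Trailing mean over `window` points; None until the window is full."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out

# Invented data: 4 events per month for a year, then 2 per month
# once an improvement measure takes effect at month 13.
monthly = [4] * 12 + [2] * 12
smoothed = rolling_mean(monthly)
# smoothed[11] -> 4.0  (last month before the improvement)
# smoothed[17] -> 3.0  (six months later, only half the improvement is visible)
# smoothed[23] -> 2.0  (a full twelve months before the new level is shown)

def poisson_prob_at_least(k, mean):
    """P(X >= k) for a Poisson distributed count with the given mean."""
    return 1.0 - sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))

# With an unchanged underlying rate of 2 events per year, a year with
# 4 or more events still occurs with probability of roughly 0.14.
p_double = poisson_prob_at_least(4, 2.0)
```

Under these assumptions, a doubling of a small annual count from 2 to 4 would be expected about one year in seven without any real change in performance, which is why targets and trends based on small numbers need cautious interpretation.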
83. Many operators of nuclear power plants have developed their own output performance indicators; however, the following ‘top level’ performance indicators have been used by WANO:
—unit capability factor,
—unplanned capability loss factor,
—unplanned automatic scrams per 7000 hours critical,
—safety system performance,
—volume of low level solid radioactive waste produced,
—industrial accident rate.
The extent to which individual indicators in this list are a direct measure of safety varies considerably, although most of them provide at least an indirect measure. Furthermore, it should be recognized that some of these have greater significance for particular reactor types (e.g. the chemistry index) and thus when comparing performance, allowance must also be made for the characteristics of different designs.
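As a worked example of one indicator in the list, the scram rate can be normalized to 7000 hours critical, which is roughly 80% of the 8760 hours in a year. The Python sketch below follows only from the wording of the indicator; the function name is hypothetical and the precise rules on which scrams and which hours are counted should be taken from the indicator's own definition.

```python
def scrams_per_7000_hours(scrams, hours_critical):
    """Normalize a count of unplanned automatic scrams to 7000 hours critical."""
    if hours_critical <= 0:
        raise ValueError("hours_critical must be positive")
    return scrams * 7000.0 / hours_critical

# Illustrative figures: 2 scrams during 8000 hours critical.
rate = scrams_per_7000_hours(2, 8000)   # -> 1.75
```

Normalizing in this way allows units with different amounts of time critical to be compared on the same basis.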
84. Experience has shown that plants that have an overall poor record on a majority of these indicators typically have operational problems with a potential impact on safety. As a rule of thumb, when a few of these indicators show declining trends, this can be taken as a useful early warning signal to alert management and to prompt further analysis and investigation of the underlying issues.
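The rule of thumb in this paragraph can be sketched as a simple screening calculation. In the Python example below, a least squares slope is fitted to each indicator series and an early warning is raised when several indicators are moving in the wrong direction. The data are invented, and both the choice of a fitted slope and the threshold of three indicators are assumptions made for the example; what counts as ‘a few’ remains a management judgement.

```python
def slope(values):
    """Least squares slope of values against their index 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two data points are needed")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

def declining(indicators, higher_is_better):
    """Names of indicators whose fitted trend is moving the wrong way."""
    flagged = []
    for name, series in indicators.items():
        s = slope(series)
        worse = s < 0 if higher_is_better[name] else s > 0
        if worse:
            flagged.append(name)
    return flagged

def early_warning(indicators, higher_is_better, few=3):
    """True when at least `few` indicators show a declining trend."""
    return len(declining(indicators, higher_is_better)) >= few

# Invented annual values for four of the indicators listed in para. 83.
indicators = {
    "unit capability factor": [88, 87, 85, 84],
    "unplanned capability loss factor": [3, 4, 4, 6],
    "unplanned automatic scrams per 7000 hours critical": [1.0, 0.8, 0.9, 0.7],
    "industrial accident rate": [0.5, 0.6, 0.6, 0.8],
}
higher_is_better = {
    "unit capability factor": True,
    "unplanned capability loss factor": False,
    "unplanned automatic scrams per 7000 hours critical": False,
    "industrial accident rate": False,
}
warning = early_warning(indicators, higher_is_better)   # -> True (three of four declining)
```

A flag from a screen of this kind is only a prompt for further analysis and investigation of the underlying issues; it is not in itself a finding.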
85. These indicators are broad based and it is often helpful to monitor other specific or more detailed indicators. For example, analysis of plant events of various types can provide a useful further input to the assessment of safety performance. The following are among those which might be considered:
—significant events, measured by both number and consequence;
—repeat events that have taken place on the plant; these provide a measure of the failure to implement effective corrective actions;
—events that are similar to those identified at other nuclear plants; in this case, the organization may not have learned sufficiently from the experience of others;
—events arising from particular types of deficiency (e.g. failure to comply with technical specifications or near misses related to human factors) or from deficiencies in particular nuclear related systems (e.g. the amount of time a system is declared as not being available, even if within technical specification limits).
86. Where similar root causes recur, a plant probably has weaknesses in its overall performance or cultural deficiencies that are in need of attention. Event analysis has expanded at many plants to include analysis of events without significant consequence (sometimes called ‘near misses’). As it is generally agreed that both consequential and non-consequential events have similar causes, it follows that correcting the causes of non-consequential events should contribute to improvements in safety by helping to prevent future events.
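The value of including non-consequential events in the analysis can be shown with a small Python sketch. The event records and root cause labels below are invented for the example; the function simply counts how often each root cause appears and returns those that recur.

```python
from collections import Counter

# Invented event records; "consequential" is False for near misses.
events = [
    {"id": "E1", "consequential": True, "root_cause": "inadequate procedure"},
    {"id": "E2", "consequential": False, "root_cause": "inadequate procedure"},
    {"id": "E3", "consequential": False, "root_cause": "communication failure"},
    {"id": "E4", "consequential": False, "root_cause": "inadequate procedure"},
    {"id": "E5", "consequential": True, "root_cause": "training gap"},
]

def recurring_root_causes(events, min_count=2):
    """Root causes that appear at least `min_count` times, with their counts."""
    counts = Counter(e["root_cause"] for e in events)
    return {cause: n for cause, n in counts.items() if n >= min_count}

recurring = recurring_root_causes(events)
# recurring -> {"inadequate procedure": 3}

consequential_only = recurring_root_causes(
    [e for e in events if e["consequential"]]
)
# consequential_only -> {}
```

In this invented data set the recurring cause is visible only when near misses are included, which reflects the point made above about the shared causes of consequential and non-consequential events.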
87. It is also sometimes useful to develop detailed indicators for specific organizational units in a plant. For example, in the maintenance area, the following have proved useful for monitoring performance in some organizations:
—number of outstanding backlogs;
—a measure of non-proceduralized practices or ‘workarounds’ employed;
—number of control room instruments out of service;
—amount of maintenance rework;
—percentage of spare parts available, as expected, on demand;
—average life of corrective maintenance actions;
—a measure of the prevalence of human errors;
—the completion of training to agreed time-scales;
—numbers of minor injuries and near misses (an increasing trend in the reporting of these is to be encouraged, since they frequently represent precursors to more serious accidents);
—standards of housekeeping.
This approach allows, in principle, deteriorating performance in a specific functional area to be recognized at an early stage. Although some of the measures are difficult to define and monitor on a fully consistent basis, they can nonetheless provide an important input to the overall picture and can serve as an added impetus to improvement.
88. There are other more general measures of safety performance that, whilst providing more qualitative information, are an important adjunct to numerical indicators. For example, observations of the behaviour of plant personnel can give an indication of how safely they actually carry out work and comply with procedures and good practices. Observing plant personnel performing work in the field and their interactions with supervisors and managers can provide insight into the safety culture at a plant. Such measures can be supplemented by surveys and interviews into the attitudes of staff. Although these tend to reveal what people think rather than how they act, properly conducted surveys and interviews can provide an accurate impression of the level of safety culture at a plant.
4.4. IDENTIFYING DECLINING SAFETY PERFORMANCE
89. In order to avoid any decline in safety performance, nuclear power plant and utility management must remain vigilant and objectively self-critical. Early signs of declining performance are not readily visible and tend to be ambiguous or hard to interpret. In fact, by the time the signals are clear, it is often too late and serious performance problems already exist. A key to early detection is the establishment of an objective internal self-evaluation programme supported by periodic external reviews conducted by experienced industry peers using well established and proven processes. Such a combined programme reduces the dangers of complacency and acts as a counter to any tendency towards self-denial (e.g. ascribing any deteriorating performance to such factors as ‘a run of bad luck’). In addition to the early detection of any deterioration, such an approach can also be used to identify any enhancements of operational performance and safety and to learn from success.
90. Declining performance typically exhibits the following pattern:
Stage 1: Over-confidence. This is brought about as a result of good past performance, praise from independent evaluations, and unjustified self-satisfaction.
Stage 2: Complacency. In this phase, minor events begin to occur at the plant and insufficient self-assessments are performed to understand their significance singly or in totality. Oversight organizations begin to be weakened and self-satisfaction leads to delay or cancellation of some improvement programmes.
Stage 3: Denial. Denial is often visible when the number of minor events increases further and more significant events begin to occur. However, there is a prevailing belief that they are still isolated cases. Negative findings by internal audit organizations or self-assessments tend to be rejected as invalid and the programmes to evaluate root causes are not applied or are weakened. Corrective actions are not systematically carried out and improvement programmes are incomplete or are terminated early.
Stage 4: Danger. Danger sets in when a few potentially severe events occur but management and staff tend consistently to reject criticisms coming from internal audits, regulators or other external organizations. The belief develops that the results are biased and that there is unjust criticism of the plant. As a consequence, oversight organizations are often silent and afraid to be the bearers of bad news and/or to confront the management.
Stage 5: Collapse. Collapse can be recognized most easily. This is the phase where problems have become clear for all to see and the regulator and other external organizations need to make special diagnostic and augmented evaluations. Management is overwhelmed and usually needs to be replaced. A major and very costly improvement programme usually needs to be implemented.
It is important that declining performance be recognized after the first two stages and at the latest early in Stage 3.
91. The key to a successful internal self-evaluation programme is the establishment of a learning culture throughout the organization with staff at all levels seeking to review their work critically on a routine basis and to identify areas for improvement and means of achieving this. In its turn, management must be supportive, for example by seeking opportunities for both themselves and staff to visit other nuclear power plants to identify good practices that they might adopt. This can occur both on an individual plant to plant exchange basis and also as members of international teams undertaking external reviews at nuclear power plants in other Member States.
92. Specific studies and general experience have shown that frequently occurring underlying conditions at those plants which have had significant problems include:
—acceptance of low standards of plant condition/housekeeping;
—failure to recognize that performance is declining and to restore higher levels of performance in specific areas at an early enough stage;
—a lack of accountability among line management and workers;
—ineffective management monitoring and trending of performance;
—deficient performance in the control room;
—an increasing human error rate;
—inadequate and/or poorly used procedures;
—insufficient and/or ineffective training;
—insufficient use of operational experience feedback and root cause analysis programmes in the analysis of events and ‘near misses’;
—an inadequate control of design configuration;
—failure to benchmark against those with better safety performance;
—a lack of awareness among the top managers about the principal deficiencies and associated corrective actions often reinforced by a ‘good news’ culture;
—inadequate or insufficient self-assessments being carried out on issues relating to safety culture;
—inadequate capability for supervising and monitoring contractors.
93. While weakness in a few areas can exist at even top performing plants, experience has indicated as a rough ‘rule of thumb’ that when weaknesses are apparent in more than a few of these conditions, there is a danger that a significant decline in plant performance is occurring.
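A periodic review against the conditions listed in para. 92 can be recorded in a very simple form. The Python sketch below uses abbreviated labels for those conditions and flags the review when more than a few are observed; the abbreviations and the threshold of three are assumptions made for the example, since the text deliberately leaves ‘a few’ as a matter of judgement.

```python
# Abbreviated labels for the underlying conditions listed in para. 92.
CONDITIONS = [
    "low standards of plant condition or housekeeping",
    "declining performance not recognized or restored early",
    "lack of accountability",
    "ineffective monitoring and trending",
    "deficient control room performance",
    "increasing human error rate",
    "inadequate or poorly used procedures",
    "insufficient or ineffective training",
    "insufficient use of operational experience feedback",
    "inadequate control of design configuration",
    "failure to benchmark",
    "top managers unaware of principal deficiencies",
    "inadequate self-assessment of safety culture",
    "inadequate supervision of contractors",
]

def review(observed, few=3):
    """Return (number of conditions observed, True if more than `few`)."""
    unknown = set(observed) - set(CONDITIONS)
    if unknown:
        raise ValueError(f"not in the checklist: {sorted(unknown)}")
    count = len(set(observed))
    return count, count > few

count, at_risk = review([
    "lack of accountability",
    "increasing human error rate",
    "failure to benchmark",
    "inadequate supervision of contractors",
])
# count -> 4, at_risk -> True
```

A count of this kind gives equal weight to every condition; in practice some weaknesses will matter more than others at a given plant, so the result is a prompt for closer examination and not a measure of safety.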
94. The routine and objective review of the trends in a set of performance indicators such as those discussed in Section 4.3 is undertaken at most nuclear power plants. An early indication of concern might require the development and monitoring of additional lower level measures of performance to confirm (or otherwise) the existence of a deteriorating trend and to support the identification of the associated root causes. In seeking critically to assess performance, the management at a plant may wish to give particular attention to analysing performance in areas such as those identified in para. 92.
95. Self-assessment has significant advantages as a means to identify such precursors. If it is left to external reviews and audits, or worse still, for actual events to expose these weaknesses, the required corrective actions are often far more extensive and expensive to implement. Early identification and correction at the plant is thus the optimum solution. To achieve this, management must develop within the organization the ability to conduct thorough, critical self-assessments. Also, when areas for improvement have been identified, management needs to establish clearly prioritized action plans that address the root causes, gain ownership for these from staff and pursue them vigorously.
96. Even where self-evaluation programmes have been established, weaknesses can arise for a number of reasons. These include:
—failure to identify the real root causes;
—lack of actual or perceived management commitment in the resolution of the identified problems;
—insufficient attention to the content of remedial action plans and, in particular, a failure to prioritize actions;
—failure to gain the commitment of staff to the changes proposed;
—failure to commit adequate resources to complete the improvement programme satisfactorily;
—insufficient commitment to see the programme through to a stage where actions are complete and have achieved real and measurable improvement.