Operators can Prevent Down Time and Emergency Situations: True or False?
Words from a Production Manager: Situational awareness and operator response are key. We just had two straight months of production records come to a halt after a column carried paste over into a vent system. The operator did not make an error, but he was not aware of the specific scenario that created the event, so he did not take the action needed to return the system to normal. A simple reduction in feed to the column would have prevented 5+ days of downtime. The control system showed him an abnormal numerical value but did not alarm, because no alarm was set up on one of the parameters that would have informed him of the issue. The process upset cost approximately $2.5 million in lost revenue. The correct operator action and/or an engineered interlock would have avoided all of it.
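The gap the production manager describes, an abnormal value on a parameter with no alarm configured, is the kind of thing a simple audit can surface. The following is a minimal sketch, not a description of any particular control system: the tag names, readings, and expected limits are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Parameter:
    tag: str                     # hypothetical tag name
    high_alarm: Optional[float]  # None means no high alarm is configured
    value: float                 # current reading (illustrative)


def unalarmed_excursions(params, expected_max):
    """Return tags that read above their expected maximum but have no high alarm."""
    return [
        p.tag
        for p in params
        if p.high_alarm is None and p.value > expected_max[p.tag]
    ]


# Hypothetical data: one parameter is abnormal and would never alarm.
params = [
    Parameter("COLUMN_DP", high_alarm=None, value=48.0),
    Parameter("COLUMN_FEED", high_alarm=120.0, value=100.0),
]
expected_max = {"COLUMN_DP": 35.0, "COLUMN_FEED": 110.0}

silent = unalarmed_excursions(params, expected_max)
print(silent)
```

A check like this only finds missing alarms. It does not tell the operator what action to take, which is why the scenario review matters as well.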
During a PHA or HAZOP study, situations like a failed pump or pressure relief valve can be identified as hazards. Safeguards must be in place to prevent injury, death, or loss of production. One of those safeguards is the control room operator.
The control room operator will likely have an opportunity to prevent hazards by intervening before a failure trips the SIS. The problem is that most organizations assume the operators can handle failures effectively and that alarms are set up correctly. However, most organizations never put that assumption to the test until the plant is down.
When operators are expected to jump into high gear to troubleshoot a problem, they must rely on training, experience, communication, and operator support systems like alarms, displays, and procedures. Based on our experience, most of these systems have major flaws. The operator’s actions or lack of action can be costly. When operators run out of time or make the wrong move during an abnormal situation, that event can quickly turn into a major accident.
The operator is a critical component, a “Human Safeguard” for the operability of the process.
During a failure you may have a situation that poses a threat to the operability of the process. This is a critical time for the operator that can’t be ignored. We recommend you identify and review specific situations where the operator is considered a line of defense or an important layer of protection. We believe that the operator is a valuable asset that must be able to detect, diagnose, and prevent the consequence of a failure.
Plant surveys show that incidents are frequent, with typical costs ranging from $100,000 to well over $1 million per year. One plant surveyed had 240 shutdowns per year at a total cost of $8 million. Many of these shutdowns were preventable.
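To put the surveyed plant's figures in per-event terms, the arithmetic can be sketched as below. The 240 shutdowns and $8 million come from the survey figure above. The even spread of cost across shutdowns and the 25% prevented fraction are assumptions made only for illustration.

```python
# Figures quoted for the surveyed plant.
shutdowns_per_year = 240
total_annual_cost = 8_000_000  # USD

# Assumption: cost is spread evenly across shutdowns.
cost_per_shutdown = total_annual_cost / shutdowns_per_year


def annual_savings(prevented_fraction):
    """Savings if a given fraction of shutdowns were prevented (hypothetical)."""
    return total_annual_cost * prevented_fraction


print(f"Average cost per shutdown: ${cost_per_shutdown:,.0f}")
print(f"Savings if 25% were prevented: ${annual_savings(0.25):,.0f}")
```

On these assumptions the average shutdown costs roughly $33,000, so even a modest reduction in preventable shutdowns is worth a substantial sum each year.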
We found that refineries typically suffer a major incident once every three years, at an average cost of $80 million. One insurance company shows that industry claims typically exceed $2.2 billion due to equipment damage. It is likely that actual total losses to the companies are significantly higher than what is claimable. According to the Abnormal Situation Management (ASM) Consortium, companies achieving best practices in operations can improve productivity by 5-12%.
Between 2008 and 2010, Borregaard completely changed its control room system. Originally the site was managed from sixteen different control rooms, and these were consolidated into one common control room. The end goal for Borregaard AS was to achieve safer operation and increase productivity (volume, quality, staffing).
The original way of controlling the site led to a lack of communication and to misunderstandings that affected safe operation and reduced productivity. To change this working process, the following areas were addressed: staffing, control room design, HMI design, operator training development, field studies, operator gap analysis, management of organizational change, abnormal situation management, and alarm management.
During the project, User Centered Design Services and Borregaard worked together on new working processes to close the gap towards best practices. The cooperation continued long after the control room was built, until a new culture had been established at the site.
During this period, the TRI went from 6 down to 1, sick leave went from 8% down to 4%, volumes increased 5%, and the hit rate increased 10%.
The total results of the project exceeded the investment many times and have given Borregaard the opportunity to build in new productivity measures years after finalizing the project.
Ole Gunnar Jakobsen, Site Director
Operator Hazard Analysis (OHA)
To ensure that the operator can manage abnormal situations we use scenarios which could result in incidents with major hazard potential. There should be no fixed rule on the number of scenarios that should or must be analyzed. Each plant or unit is different. It is recommended that scenarios representing the following are analyzed:
A site will also need to consider whether it is necessary to assess the scenarios at different times such as during the day and at night, during the week and at weekends, if staffing arrangements vary over these times.
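One way to decide which scenario and time-period combinations need their own assessment is sketched below. It is an illustration only: the scenario names, the periods, and the headcounts are hypothetical, and treating periods with the same headcount as equivalent is a simplifying assumption, not a rule of the method.

```python
from itertools import product

# Hypothetical scenarios and control room headcount per period.
scenarios = ["failed pump", "column carryover to vent"]
staffing = {
    "weekday day": 4,
    "weekday night": 2,
    "weekend day": 2,
    "weekend night": 2,
}


def assessment_cases(scenarios, staffing):
    """Pair each scenario with one representative period per distinct headcount."""
    representative = {}
    for period, headcount in staffing.items():
        # Keep the first period seen for each staffing level.
        representative.setdefault(headcount, period)
    return list(product(scenarios, representative.values()))


cases = assessment_cases(scenarios, staffing)
for scenario, period in cases:
    print(f"Assess '{scenario}' during {period}")
```

With these inputs, each scenario is assessed twice, once at full daytime staffing and once at the reduced level. If staffing never varies, the list collapses to one case per scenario.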
It is necessary to define the circumstances of each scenario in sufficient detail. As a minimum:
The task of analyzing hazards in a workplace or system can be daunting. However, without an effective analysis, potential hazards may not be discovered before they result in injuries and loss.
If you are familiar with the HAZOP process, then you know there are many ways to assess a process or workplace for hazards. What usually gets overlooked is the assumption that an operator will be ready to respond and prevent failures during emergency situations. Just like the SIS is tested, we must test our operators and the systems they use, including the control room environment.
An Operator Hazard Analysis (OHA) uses several common methods of finding potential hazards.
We recently performed an OHA at a major U.S. refinery. The team assumed the operators were capable of managing abnormal situations but had never tested the theory.
If they had done so, they would have discovered that the existing training was inadequate. The only reason the operators were competent was that, after 20 years on the job, they had found their own ways to get the job done. Contrary to company policy, procedures weren’t documented, validated, or tested. On those rare occasions when the holes in the Swiss cheese lined up and an accident occurred, operators were condemned by managers who didn’t understand the process as well as the operators did.
During the Operator Hazard Analysis we look at several factors. There should be continuous supervision of the process by skilled operators, i.e., operators should be able to gather information and intervene when required. Distractions such as answering phones, talking to people in the control room, administrative tasks, and nuisance alarms should be minimized to reduce the possibility of missing alarms. Additional information required for diagnosis and recovery should be accessible, correct, and intelligible.
Communication links between the control room and field should be reliable. For example, back-up communication hardware that is not vulnerable to common cause failure should be provided where necessary. Preventive maintenance routines and regular operation of back-up equipment are examples of arrangements to assure reliability. Staff required to assist in diagnosis and recovery should be available and have sufficient time to attend when required.
Operating staff should be allowed to concentrate on recovering the plant to a safe state. Therefore, distractions should be avoided, and necessary but time-consuming tasks, such as summoning emergency services or communicating with site security, should be allocated to others.
Other Systems that must be analyzed for Gaps:
· Alarm Management
· HMI Operator Displays
· Console Workstations
· Control Room Environment
· Staffing / Workload
· Fatigue
· Shift Handover
· Procedures
· Communication
· Training
· Experience
· Real World Scenarios
As our client said at the beginning, situational awareness and operator response are key. It is our responsibility to find the gaps that limit the operator’s ability to perform at the highest level and to fill those gaps. Please visit www.mycontrolroom.com for information on Gap Studies and Operator Situational Awareness.
Stephen Maddox, quote: “Distractions such as answering phones, talking to people in the control room, administration tasks and nuisance alarms should be minimized to reduce the possibility of missing alarms.” Unquote. These are obvious distractions in the operator’s life. Could you use OHA findings to draw the line of “acceptable” practice? What are the key drivers for the change? OHA is new to me, and I am curious whether the same line of reasoning can be applied to situations for operator intervention, especially the 2nd and 3rd of these categories: known knowns, known unknowns, unknown unknowns. Best regards.