'Just a crazy day': More than 30 systems hit by major network crash at The Ottawa Hospital
Internal documents reveal 2022 ‘code grey’ was more severe than reported
More than 30 computer systems were affected during a 12-hour network failure at The Ottawa Hospital (TOH) last summer that halted surgeries and medical appointments, according to internal records obtained by CBC News.
On Sept. 2, 2022, front-line staff faced a host of challenges, from difficulties paging one another to accessing medical records and managing diagnoses.
It was "just a crazy day," a hospital leader wrote in a group chat.
At the height of the code grey — which refers to a critical infrastructure failure — 31 systems crashed across all three TOH campuses, far more than what was initially revealed.
CBC Ottawa obtained documents including emails, chats and notices through a Freedom of Information request, but many details were blacked out.
As staff came up with workarounds, such as labelling lab samples by hand, doctors became overwhelmed and confused at times, the documents show.
"The lab is receiving a flood of routine samples marked as STAT (immediate) and, given we are currently operating in a completely manual process, we are unable to process samples," one clinical biochemist wrote in an email.
"This is concerning as we need to have capacity to support urgent care areas … and we are unable to at this moment due to the flood of samples," he wrote. "We have received samples and do not know where they originated."
CBC is not naming the front-line health-care workers involved in the email chain because they were reporting problems internally.
The details of TOH's code grey come after CBC uncovered similar issues during several code greys at the Queensway Carleton Hospital (QCH).
The Ottawa Hospital attributed its code grey to a "rare type of hardware issue."
"The hardware issue was resolved within 12 hours, with many of our systems back up and running before then," a hospital spokesperson said in a statement. "TOH continues to review, refine and test emergency plans to ensure we will always be able to care for our patients."
There have been no code grey incidents since, TOH added.
Scope of breakdown
The Ottawa Hospital first discovered around a dozen software woes just before 4 a.m. on Sept. 2.
After a flurry of tests and emails, staff realized the problem was much more pervasive.
"Do we know what the issue is? Is it network? We are also completely down," wrote an administrator with the department of medical imaging.
Maintenance crews tried to reboot all the servers on site, but that failed.
Leaders convened for a call around 8 a.m. The hospital's director of information systems, Leanne Taylor, then approved an all-staff memo to announce the code grey.
At first, the memo listed 13 affected systems, including:
- EPIC (digital health records).
- PACS (the picture archiving and communication system).
- Cerner (the automated lab system).
- Rhapsody (an integration engine).
- SPOK Mobile Paging (a smartphone pager app).
- The hospital's corporate Wi-Fi.
Subsequent technical updates informed staff they should "anticipate problems will be harder to resolve" and could last "longer than just an hour or so."
Technicians also ruled out the possibility of a cyberattack.
"Definitely not cyber, definitely a piece of our hardware environment," a memo read.
As the morning wore on, Taylor expanded the list of affected systems to 31. The long list included programs designed to manage radiation consultations, ultrasounds, X-rays, mammograms, ob-gyn examinations, electrocardiograms, drug prescriptions, lab testing and other tasks.
At one point, leaders contemplated using old CD-ROMs to restore some computer capacity, but the workstations in question did not have CD drives.
No announcement or update for hours
TOH's communications department did not respond to CBC's requests for information at the time of the code grey, but multiple patients reported cancelled appointments, including surgeries.
The internal memos confirm some operations were pushed back.
"ORs (operating rooms) are going through elective volumes that have to be postponed until over the weekend and early next week," read one of the meeting summaries.
"Fingers crossed the worst is over," Taylor concluded in an email.
One longtime nurse with TOH recalled the chaos of the day in an interview with CBC.
"I've never seen a computer outage or a code grey like this before in my career," said Rachel Muir, who spoke as a representative of the Ontario Nurses' Association.
Muir said doctors and nurses resorted to using paper records during the outage.
The hospital did not make any announcements on social media or issue any media notices about the ordeal for nearly 12 hours, only sending an update at 5 p.m. when the crisis was over, prompting criticism over its lack of communication.
TOH did not answer CBC's latest round of questions about the impact on patient care or its communications strategy, either.
Instead, it issued a brief statement.
"The Ottawa Hospital quickly implemented downtime procedures and co-ordinated responses throughout the hospital to support patients and front-line staff," the statement read.
"Care teams worked quickly to reschedule any appointments that were impacted, and we made every effort to ensure that patients continued receiving the care they needed."
Same potential hardware failure as QCH
In her final technical update, Taylor wrote that her team was "100 per cent confident" the root cause lay in the hardware infrastructure, which is manufactured by global technology giant Cisco.
A week after TOH's code grey, Queensway Carleton experienced its own "catastrophic" IT failure. QCH has called at least five more code greys since Sept. 9.
Although QCH has not identified the exact cause of its original code grey, internal emails from that time also pointed to Cisco and aging hardware.
"The hardware required to allow us to migrate from the [old] to the new cores was ordered earlier this year, however Cisco supply chain backlogs have a December 2022 ETA for the Nexus devices," wrote Nathaniel Boisvenue, technology services manager at QCH, in mid-September.
"I have reached out to … Cisco to ask if they can expedite this for us."
QCH has since confirmed it has not yet received the new hardware from Cisco.
A Cisco spokesperson told CBC that it is currently experiencing "extended lead times" for several products, "from automotive to consumer electronics and beyond."
"Material shortages across the semiconductor industry continue to impact supply chains globally, slowing output across multiple industries," read a statement from the California-based conglomerate.
The Ottawa Hospital did not answer CBC's questions about its Cisco hardware, but some staff noted similarities between the two hospitals' experiences.
"We are in a similar situation after our unplanned downtime, debriefs, clinical impacts and lessons learned," said Tim Pemberton, a vice-president at QCH, in an email to Taylor on Sept. 12.
"My guess is you had a similar type of failure," he added.
In response, Taylor wrote she was happy to consider expanding communication between the two organizations.
She did not, however, address Pemberton's speculation.
Bioethicist likens chaos of outages to 'war zone'
The Ottawa Hospital is one of Canada's largest hospital networks, serving 1.2 million people across eastern Ontario at its various campuses — by its own count, more than any other academic health centre in the country. It also leads on the technological front.
Queensway Carleton, by contrast, is the only full-service hospital in west Ottawa, serving 500,000 patients in the area.
Its high-tech connection with five other hospitals within the Champlain Local Health Integration Network also makes it a core health-care provider in the region.
The integration of artificial intelligence, automation and other advances at both QCH and TOH makes technology a vital piece of their infrastructure, and that's why maintaining that infrastructure is "essential," said Bryn Williams-Jones, a professor of bioethics at the University of Montreal.
"It should be part of your regular practices, exactly like we maintain buildings or ensure that [a] ventilation system is working," Williams-Jones said.
When infrastructure fails at key times, like during a temporary code grey, it creates chaos for both health-care workers and patients, he said.
"We're not in a war zone, but if our hospitals look like they're war zones, it's a mega problem."
Williams-Jones said it also compounds the stress many doctors and nurses are already feeling.
"It's very damaging. It's the sort of thing that leads and contributes to burnout, to disengagement, to people leaving for a different professional practice."