The BioPhorum IT Cyber Security Education Series: 1
This is the first in an ongoing series of publications to be delivered by the BioPhorum IT Cyber Workstream as part of the new BioPhorum IT Cyber Security Education Series, a portfolio of articles and short papers addressing emerging issues in cyber security in the biopharmaceutical manufacturing environment. Focusing on key findings from extensive surveys, case studies and peer reviews, the series will provide valuable information about the cyber security issues and challenges experienced by BioPhorum members, and will compare and share how members are responding to these in today’s fast-moving environment. The intention is that the series will form the basis for a new set of BioPhorum guiding standards and best practice principles for cyber security in biopharmaceutical manufacturing.
Improving performance against boutique vulnerabilities
Manufacturing organizations today face an increasing number of new cyber security threats, and many require urgent action to shore up defenses, typically but not exclusively through patching. Several of these so-called ‘boutique vulnerabilities’ have appeared in recent years – WannaCry, BlueKeep, DejaBlue to name a few – and each has required unplanned, near-term action to address.
These efforts are disruptive, diverting IT (enterprise information technology) and OT (operational technology/process control) resources away from other work and resulting in ad hoc impacts to manufacturing, as equipment outages are required to complete patching and related work.
With each new challenge, organizations are working to improve their response and reduce the time and effort required. Companies are all investing in routine patching where possible, which reduces the amount of patching needed in urgent situations, and in other mitigation options such as isolation where appropriate: more options mean reduced impact. Across organizations, everyone, including quality, manufacturing, external application vendors, and senior leadership, is working to improve understanding and ensure support is in place in advance.
New vulnerabilities demand quick mobilization of resources
While new vulnerabilities are identified all the time, some rise to attention because of the potential severity of the consequences. These are sometimes called ‘boutique vulnerabilities’. Here are some well-known examples which have arisen since 2017:
- WannaCry appeared in May 2017 and targeted an early version of the Server Message Block (SMB) protocol used by Windows for mapping drives between machines. Microsoft took the unusual, and therefore noteworthy, approach of issuing a patch for no-longer-supported versions of Windows, including Windows 2003 and Windows XP. In addition to patching, this vulnerability could be mitigated by disabling the SMB v1 protocol using registry entries.
- BlueKeep appeared in May 2019 and targeted Remote Desktop Services (RDS) on versions of Windows Server including Windows 2008 and the no-longer-supported Windows 2003. Microsoft again provided patches for Windows 2003 servers. The vulnerability could be mitigated by patching or by disabling RDS. This vulnerability raised concerns because RDS is commonly used for support, and because no credentials were required to exploit it and potentially run attack code freely.
- DejaBlue appeared later in 2019. It was similar to BlueKeep but affected Windows 2008 and newer versions of Windows Server. Again, mitigation options included both patching and disabling of RDS.
In each case, there was a concern that an attacker would have little trouble leveraging the vulnerability once they could reach the machine over the network. The fact that drive mapping and RDS were involved, both using protocols that are often part of normal communications through firewalls, meant that doors were open that had to be closed quickly. In short, if network defenses failed or were overcome, the vulnerabilities were easy to exploit, and the consequences could be significant.
Member companies all felt compelled to mount an immediate effort to patch or otherwise protect as many systems as possible in their manufacturing operations.
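The registry-based SMB v1 mitigation mentioned for WannaCry can be illustrated with a minimal sketch. It assumes the server-side setting that Microsoft documents (an `SMB1` DWORD value of 0 under the `LanmanServer\Parameters` key) and only builds the text of a .reg file for review; the function name `build_smb1_disable_reg` is hypothetical, the client-side SMB v1 component is not covered, and any real change would still go through vendor and quality review and typically needs a restart to take effect.

```python
# Sketch only: produce the text of a .reg file that disables the SMB v1
# *server* component. Nothing is written to the registry here.
# Assumption: the documented LanmanServer\Parameters "SMB1" DWORD setting.

SMB_SERVER_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet"
    r"\Services\LanmanServer\Parameters"
)


def build_smb1_disable_reg() -> str:
    """Return .reg file text that sets SMB1 = 0 (disabled) for the server."""
    lines = [
        "Windows Registry Editor Version 5.00",  # standard .reg header
        "",
        f"[{SMB_SERVER_KEY}]",
        '"SMB1"=dword:00000000',  # 0 = SMB v1 server disabled
        "",
    ]
    # .reg files conventionally use Windows line endings
    return "\r\n".join(lines)


reg_text = build_smb1_disable_reg()
print(reg_text)
```

Generating the file rather than editing the registry directly keeps the change reviewable, which fits the change-control steps described in the next section.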
Efforts are time consuming and disruptive
Mobilizing IT/OT staff at manufacturing sites and in support roles across the company is time consuming. Manufacturing systems are also affected, so local quality groups, supporting software and automation vendors, as well as manufacturing managers are also involved. Although time consuming, it is essential for all involved to:
- Identify vulnerable assets
Sometimes this is just a question of the operating system version, but there could be other criteria like using a specific service, or you may exclude assets that are isolated or otherwise protected.
- Work with system vendors who are ideally vetting patches on your behalf
Many systems are supported by application or system vendors, and often these vendors provide testing of new patches with their software and also condition their ongoing support on the basis that you will not modify the system without coordinating with them.
- Work with internal quality teams to manage and document changes required to patch or otherwise update systems
Updates are considered a change, so they require risk assessments, tracking by quality teams, and verification of proper operation after they have been applied.
- Work with manufacturing teams to secure outages for applying patches
Production equipment cannot be rebooted without close coordination with manufacturing, typically requiring an outage between batches.
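The first step above, identifying vulnerable assets, often combines several criteria. The following is a minimal sketch of that scoping logic, assuming a simple in-memory inventory; the `Asset` fields, the asset names, and the `VULNERABLE_OS` set are hypothetical and would come from your own asset inventory and the specific advisory.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    os_version: str
    rds_enabled: bool   # is the affected service in use on this asset?
    isolated: bool      # already isolated or otherwise protected?


# Hypothetical scope for an RDS-type vulnerability
VULNERABLE_OS = {"Windows Server 2003", "Windows Server 2008"}


def is_in_scope(asset: Asset) -> bool:
    """Apply the criteria in order: OS version, service in use, not isolated."""
    if asset.os_version not in VULNERABLE_OS:
        return False
    if not asset.rds_enabled:
        return False
    if asset.isolated:
        return False
    return True


# Illustrative inventory, not real data
inventory = [
    Asset("hist-01", "Windows Server 2008", rds_enabled=True, isolated=False),
    Asset("hmi-07", "Windows Server 2003", rds_enabled=True, isolated=True),
    Asset("mes-02", "Windows Server 2016", rds_enabled=True, isolated=False),
    Asset("batch-03", "Windows Server 2003", rds_enabled=False, isolated=False),
]

in_scope = [a.name for a in inventory if is_in_scope(a)]
print(in_scope)
```

Whether isolated assets are excluded entirely or only given a lower priority is a policy decision; the sketch simply excludes them.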
These efforts are disruptive because they are urgent, frequently pushing other efforts aside, for example by:
- Pulling IT/OT staff off other work
Work that is potentially of higher strategic value is put on hold while mitigation efforts run their course.
- Disrupting manufacturing by requiring outage windows between batches
Even waiting until batches are complete before fitting a patch disrupts the planned rhythm of the operation.
- Making it difficult to schedule outages further in advance
Vendors provide valuable testing of patches, but delays and uncertainties make proactive scheduling of outages impractical and unlikely.
BioPhorum member companies have demonstrated numerous optimizations
A member company provided a short overview of metrics from their WannaCry, BlueKeep, and DejaBlue efforts. Here are some of the key points:
- This isn’t easy.
For WannaCry, the first effort, the member company managed to fully protect over 80% of vulnerable assets within roughly 40 days, and all vulnerable assets only after 13 months.
- Having options beyond just patching makes a measurable difference.
The BlueKeep effort benefited from lessons learned, and all vulnerable assets were protected within only 22 days. A key difference for BlueKeep was that assets which could not be patched, typically because of manufacturing scheduling, could still be protected because there was an option to just disable RDS until those systems could be patched later. Some assets that had been identified for upgrade or retirement were retired immediately, and others were immediately isolated.
- Constant improvement is sometimes limited by factors outside your control.
The later DejaBlue effort seemed easier because the assets ran newer versions of Windows. However, in reality it was harder because so many had not been patched and required lengthy periods to complete numerous accumulated patches to reach ‘current’. Vendors also identified some patch conflicts and dependencies. 80% of the assets were protected by day 40, although some remained unprotected beyond day 90.
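Figures such as "80% protected by day 40" can be derived from a simple record of when each asset was protected. The sketch below assumes one protection date per asset (or `None` if still unprotected); the function name `percent_protected` and the sample dates are hypothetical and are not the member company's data.

```python
from datetime import date, timedelta
from typing import List, Optional


def percent_protected(
    protected_on: List[Optional[date]], start: date, day: int
) -> float:
    """Share of assets protected on or before `day` days after `start`."""
    if not protected_on:
        return 0.0
    done = sum(
        1
        for d in protected_on
        if d is not None and (d - start).days <= day
    )
    return 100.0 * done / len(protected_on)


# Illustrative data: five assets, one still unprotected
start = date(2020, 1, 1)
records = [
    start + timedelta(days=10),
    start + timedelta(days=30),
    start + timedelta(days=40),
    start + timedelta(days=95),
    None,
]

by_day_40 = percent_protected(records, start, 40)
by_day_90 = percent_protected(records, start, 90)
print(by_day_40, by_day_90)
```

Treating "protected" as any accepted mitigation (patched, service disabled, isolated, or retired) rather than "patched" alone matches how the BlueKeep effort was measured.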
Discussion among member companies revealed similar findings and experiences across different companies. These are summarized below:
- Track and manage efforts using Excel, databases, CMDB, or a combination.
However, everyone wants to move away from spreadsheets and is investigating technology options for support.
- Working directly with manufacturing is key.
This is the ‘new normal’. It is important for all stakeholders to understand this and to have a sense of the criticality of getting this work done.
- Top-down leadership helps a lot.
Urgent, disruptive, labor-intensive efforts require broad cooperation among various groups and quick, temporary realignment of priorities, all of which require executive support during as well as after an effort.
- In addition to frequent patching, supporting software must be current too.
During the DejaBlue effort, older versions of McAfee Antivirus and Microsoft SCCM created challenges for everyone.
- Use a ‘patch-to-endpoint’ strategy during an urgent effort.
Some companies go ahead and install patches aggressively, and then work with manufacturing to plan the reboots that will actually apply the changes. Some found this does not work well for routine patching because the reboots are a long time coming and, if multiple patches are pending, things are less predictable from a time-required perspective. However, there was general agreement this is a good strategy during an urgent effort.
What are your peers doing?
The responses to these boutique vulnerabilities are still disruptive, ad hoc efforts, but everyone is learning and applying those lessons. Everyone has included the following activities in their toolkit, in some form:
- Improving communications to stakeholders.
We need to tell our stories very clearly. For example, it is reasonable for people outside the actual work to assume that each new effort will set a new ‘worst case’ in terms of how long it takes to address these things. In practice, however, this is not necessarily the case and there are always new unknowns and things outside of our control. Every set of new vulnerabilities is unique.
- Leveraging all options, not just patching, to protect individual assets.
There are often temporary mitigations like disabling a specific OS feature that can protect assets more quickly and with less impact than patching. This makes it possible to schedule subsequent patching to permanently close vulnerabilities when the business impact is smaller.
- Learning that everything is not Windows.
All of the examples here were Windows vulnerabilities, but there are other assets out there. Recently, the URGENT/11 vulnerability was identified, impacting the VxWorks operating system at the heart of many industrial control devices. We can’t limit our planning to Windows vulnerabilities.
- Adopting an approach of constant, incremental improvements.
Your peers are thinking about process improvements for the next effort while they are closing out the current effort. Post-effort reviews highlight potential improvements in tracking tools, asset inventory, and procedures. While these vulnerabilities appear without warning, the period immediately following an effort is a good time to make improvements.
- Pursuing increased routine patching as a means to reduce the impact of these ad hoc efforts.
If, for example, 40% of your systems will be patched within the next 60 days through routine patching, you can decide whether to move faster when a new critical vulnerability appears or just let the routine patching play out, focusing your ad hoc effort on the gaps.
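The routine-patching example above amounts to splitting assets into those already covered by a scheduled window and those that form the gap. A minimal sketch follows, assuming each asset has a known number of days until its next routine patch window (or `None` if it has none); the function name, asset names, and numbers are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def split_by_routine_coverage(
    next_window_days: Dict[str, Optional[int]], deadline_days: int
) -> Tuple[List[str], List[str]]:
    """Return (covered, gaps): assets whose routine window falls within
    the deadline, and assets that need ad hoc attention."""
    covered = sorted(
        name
        for name, days in next_window_days.items()
        if days is not None and days <= deadline_days
    )
    gaps = sorted(
        name
        for name, days in next_window_days.items()
        if days is None or days > deadline_days
    )
    return covered, gaps


# Illustrative schedule: days until each asset's next routine patch window
windows = {
    "hist-01": 14,
    "mes-02": 45,
    "hmi-07": 120,
    "batch-03": None,  # no routine window planned
    "lims-04": 75,
}

covered, gaps = split_by_routine_coverage(windows, deadline_days=60)
coverage_pct = 100.0 * len(covered) / len(windows)
print(covered, gaps, coverage_pct)
```

In this illustration two of five assets (40%) are covered within 60 days, so the ad hoc effort would concentrate on the remaining three.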
Why isn’t NotPetya, the 2017 attack that so seriously impacted Merck, Maersk, and other organizations around the world, included in this list? Companies that were vulnerable had either patched the specific targets or not, and either used the infected software or not. Though a patch was issued several months earlier, it was not a situation where a new and frightening vulnerability was announced to the world leading to a race between attackers and defenders.
These problems affected all members equally, based on the discussion. Vendors like Microsoft and Emerson were aggressive in identifying and addressing these kinds of problems, and almost everyone had to closely follow those developments as they occurred.