With digital transformation accelerating across the industry, pharma and biotech companies are increasingly aware that their most valuable asset is not necessarily a molecule or drug brand – it is most likely their petabytes of data.
These enormous proprietary information resources hold immense potential to drive business performance. But whatever its source – clinical datasets, patient data, customer complaints or manufacturing systems – data often faces the same critical challenge: it is produced, formatted and stored in ways that leave it ill-suited for access and reuse. The vast majority (80%) of data generated today is unstructured, and this high volume of data, especially manufacturing data, is very poorly utilized: only 1% of unstructured data is used or analyzed.
A key culprit behind these challenges is the lack of data governance policies and systems capable of supporting the significantly higher volumes of data now being generated. This problem will only worsen as the volume of data grows exponentially, with increased accessibility of information driven by Industry 4.0 connectivity between information technology (IT) and operational technology (OT) systems.
This position paper and its supporting personas address the question of where to start our industry’s digital transformation journey. Part of the answer we propose is data governance and, more specifically, the pressing need to collaborate and agree on a standard set of taxonomies and ontologies for the biopharmaceutical industry, recognizing that these standards will need sufficient flexibility to allow innovation. As a first step, we propose that the industry adopt the FAIR Guiding Principles for scientific data management and stewardship, which call for data to be findable, accessible, interoperable and reusable.