Big data's big issues

By Sobia Raza

23 November 2015


‘Big data’ – the buzzword excites some and is reviled by others. Either way, it is a term that is hard to escape. Not a week goes by without some conversation about how data, ‘big’ or otherwise, can transform healthcare services. Earlier this month Public Health England hosted PHE Data Week, a series of activities focussing on data and its importance to health protection, prevention and care. It was encouraging to learn of the work underway to ensure that data, which is now viewed as a public asset, can be harnessed to make better, more informed decisions about public health. Last week the House of Commons Science and Technology Committee continued its evidence-taking sessions for the ‘Big Data Dilemma’ inquiry into the opportunities and risks of big data.

Big data challenges cut across many of the PHG Foundation’s policy projects. Regardless of the healthcare area in question, in our experience there are critical and recurrent themes which need to be addressed if big data is to live up to the hype: 

Curation and integration

Arguably, data can only become ‘big’ if it can first be collated and integrated from disparate silos. Even within our unified healthcare system these silos exist, due in part to fragmented delivery systems and in part to non-interoperable technology platforms. As highlighted in our consultation response to the Big Data Dilemma inquiry, capital investment for research facilities is vitally important, but purpose-built data infrastructure for the healthcare domain, together with resources and support for clinical and health data curation, is also indispensable and should not be overlooked or conflated with support for research endeavours.

Apart from capital support, the aggregation of healthcare data requires strategy and leadership. Our work on bringing pathogen genomics into healthcare practice describes how the collation of pathogen genomic data and associated metadata is fundamental both to the delivery of frontline public health services and to the development of future approaches to managing infectious diseases. In developing their National Infection Service, PHE have a responsibility to ensure that the infrastructure and strategy are in place to collate the enormous amount of genomic and associated data that will arise from pathogen genomic services across the health system. We think a single, unified data management strategy should be developed that encompasses PHE and the other major organisations with a stake in infectious disease management.

Sharing and access

The benefits of generating and aggregating big data sets will only be achieved if the data can then be accessed by those who need it to deliver their (healthcare) services and / or use it to innovate. As one of the contributors to the PHE Data Week blog series noted, open standards and technical solutions are one way of promoting transparency and improving the sharing of data, code and innovations more widely. In our experience, technical and infrastructure challenges, although critical, are seldom the only impediment to data sharing. The legal and regulatory frameworks surrounding the sharing of patient data, especially genomic / genetic data, are complex. In developing approaches to data sharing which are both responsible and proportionate, there can be great value in assessing the legal, social and technical considerations side by side. For example, there may be technical methods for data anonymisation, access management and data analysis which have implications for the development or execution of regulation.

Engagement and trust 

Trust is key to improving data sharing, as emphasised by the National Data Guardian for health and social care at last month’s Parliamentary evidence session on the Big Data Dilemma inquiry.

Transparency, so the public can see how their data are used for care and more widely, is integral to building this trust. Failure to be transparent will almost inevitably lead to repetition of the implementation debacle. Public support should also be garnered by demonstrating the benefits and utility of data. The PHE Data Week series included a number of engaging graphics to communicate the results of PHE's data analysis – from the success of the 2014/15 ‘flu vaccine pilot, to interactive tools for viewing health outcomes data. It would be good to see these graphics, and the social media activity during PHE’s Data Week, forming part of an ongoing campaign to engage the public, as transient efforts will have transient results.

Earlier this year the PHG Foundation, in collaboration with the Association for Clinical Genetic Science (ACGS), hosted a meeting to discuss the most pressing challenges to sharing human clinical genetic and genomic data in the context of rare disease diagnoses. The meeting covered the technical, legal and practical considerations for sharing, and discussions extended to the principle of trustworthiness. Some of this trust can be built by establishing the elements essential to sharing data effectively and responsibly – such as clearly defined specifications, infrastructure, safeguards and security. In any case, a multidisciplinary and integrated approach to managing data is essential if big data is to deliver on its big promises.

The report from the joint workshop on data sharing to support clinical genetic and genomic services will be available soon.
