Saturday, 22 February 2014

My thoughts on Care.data

Well, I didn't expect to be reviving this personal blog, but I need to make it clear that this post is purely my personal opinion.


I've been involved in a Twitter discussion with @dbarthjones, and it's become apparent that I need more than 140 characters to express my views adequately on the NHS data sharing initiative known as care.data.


In my opinion, Ben Goldacre hits the nail on the head in his post at:


http://www.theguardian.com/society/2014/feb/21/nhs-plan-share-medical-data-save-lives


In particular he notes that the public have two main concerns with the proposals:


i) Privacy; and
ii) Commercial exploitation of their personal data.


Jumping straight to sharing personal data with the commercial sector was a clear presentational mistake, if nothing else. But I'm not blogging about (ii); I'm more concerned with (i).


I'm not overly concerned about the potential general re-identification of all individuals within the care.data dataset, at least not at present. There are few datasets out there of sufficient scale and cross-over (shared attributes) to make general re-identification trivial and easy to automate. Of course, this will change if the Open Data evangelists get their way and add education, tax and other HMG assets to the pot. In that scenario, general re-identification across these huge datasets (complete with shared unique sets of attributes) becomes much more likely, and much scarier.


No, what I am concerned about today is the targeted re-identification of individuals by those who can have a real impact on our day-to-day lives: think friends, family, employers etc. Such potential threat actors already know a great deal about us, information that we have chosen to share with them. That does not mean that they know everything about us; we are perfectly entitled to keep things to ourselves. If we fall ill, we are not obliged to tell our friends exactly what our condition is, or even to tell them that we are ill at all! And that is what worries me: if our data falls into the public domain then we lose the ability to decide with whom we share our most private information.


As far as I am aware there are no current plans to make individual-level care.data publicly available, only to release it to researchers. However, the more widely you distribute data, the less control you have over further distribution. It comes down to trust in those with whom you share, albeit trust usually bound up in some form of data sharing agreement.
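

For anyone who wants to see why targeted re-identification needs so little effort, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the records are invented, and the field names (birth_year, postcode_district, admission_month and so on) are hypothetical and say nothing about what the real care.data extract contains. All it shows is how a few facts that a colleague or relative could plausibly already know will shrink a "pseudonymised" dataset down to one row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One row of a hypothetical pseudonymised release (field names are assumptions)."""
    pseudonym: str
    birth_year: int
    sex: str
    postcode_district: str
    admission_month: str
    diagnosis: str


# Entirely invented records, for illustration only.
RELEASED = [
    Record("a91f", 1975, "F", "SW1", "2013-11", "asthma"),
    Record("c07b", 1975, "M", "SW1", "2013-11", "fracture"),
    Record("e552", 1975, "M", "SW1", "2013-06", "depression"),
    Record("f3d0", 1982, "M", "LS6", "2013-11", "diabetes"),
]


def candidates(records, **known):
    """Return the records consistent with every attribute the observer already knows."""
    return [
        r for r in records
        if all(getattr(r, field) == value for field, value in known.items())
    ]


if __name__ == "__main__":
    # An employer who knows only a rough age and where someone lives.
    broad = candidates(RELEASED, birth_year=1975, postcode_district="SW1")
    print(len(broad), "records match the broad description")

    # Add the sex and the month the person was off sick.
    narrow = candidates(
        RELEASED,
        birth_year=1975,
        sex="M",
        postcode_district="SW1",
        admission_month="2013-06",
    )
    if len(narrow) == 1:
        print("Unique match; the diagnosis is now exposed:", narrow[0].diagnosis)
```

No name or NHS number is needed at any point; the observer simply brings their own knowledge to the data, which is exactly why the people closest to us are the ones best placed to do this.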


Now, there are lots of bland statements made by proponents of data sharing that they have been sharing health data for years and that there have been no major incidents of data leakage.  My question is this:  how do they know?  What monitoring systems are in place?  You can be very good at not spotting security breaches if you don't look.  Having spent many years working across HMG I am more than aware of the amount of spare resource that the Civil Service currently has available to police enforcement of data sharing agreements.


So. That's the bad stuff. I do, however, fully buy in to the potential benefits of sharing health data. Conceptually, I know it makes sense and, to be honest, prior to the last couple of weeks I was not planning to opt out. My opinion has been hardening due to the utter failure of those pushing the scheme to acknowledge genuine concerns. See http://www.bbc.co.uk/news/health-26277866 for example. If they either don't understand, or choose not to address, concerns around re-identification and commercial exploitation, do I really want to trust them with my data? I have not yet opted out. I want to be convinced that the people running the scheme understand the issues and are willing to seek our consent for this sharing of OUR data. At the moment I fear it is still a case of riding roughshod over public opinion in favour of commercial interests. That would be a shame. There is a real opportunity to sell the benefits of Open Data, but this MUST be done in a balanced manner that gives data subjects genuine choice. It's a question of benefit and risk, but we should be able to make our own individual choices and not have the decisions made on our behalf by those who think they know best. They don't.