Wednesday 24 January 2018

Whose data is it anyway?


To the extent that economics is concerned with the study of how resources are allocated, a system of property rights affects the way these resources can be used. For example, if a person owns a piece of land they can choose (within limits) what to do with it, e.g. build a house or let it lie fallow. Other people have no right to determine how the land can be used. In a modern market economy, transactions between individuals involve the transfer of property rights and form the basis of the price determination process we see at work every day. These rights are backed up by a legal system designed to enforce the entitlement to a given bundle of goods (or services) and to record their transfer from one person to another.

However, in the digital age property rights have become much more blurred. I was reminded of this recently by an article in The Economist which quoted Nikhil Pahwa, an Indian digital-rights activist, as saying “When they say, ‘Big data is the new oil,’ I answer, ‘But my data is not your resource.’” The context of his quote is India’s biometric ID scheme, Aadhaar, whose database is apparently rather leaky with the result that many people’s personal details find their way into the public domain. But it could equally be applied to the likes of Facebook, which owns the world’s largest personal dataset. The issue is: whose data is it?

Technically, of course, it belongs to the individual who posted it. But Facebook’s terms of service state quite explicitly that “you grant us a non-exclusive, transferable, sub-licensable, royalty-free, worldwide license to use any IP content that you post on or in connection with Facebook.” In other words, although you own the content, Facebook has carte blanche to do what it wants with it. From the company’s perspective this is great because it has a huge database upon which it can let loose its AI algorithms to generate ever more sophisticated consumer profiles. One of the great concerns expressed by network campaigners is that such huge databases act as a barrier to entry to smaller companies attempting to break into a particular market, because the lack of access to data means that their consumer profiling will always be inferior.

And this takes us right back to Pahwa’s point: Is it right that the data which we own, and which we give away for free, should be used by a profit-maximising organisation to enrich shareholders? In their defence, big data companies argue that they do not charge for their services – Google clicks do not cost the user, so in that sense we are getting something for nothing. Except that is not quite true because we pay for it by giving up some data about ourselves, which may be trivial in isolation but, when combined with the billions of pieces from other users, goes to make up a huge mosaic which Google can use to target its adverts more effectively.

In an interesting paper by Imanol Arrieta and co-authors, the argument is made that data providers should be paid for the information they yield in order that they are compensated for their contribution to the world of AI – information which might in due course be used to displace workers in favour of machines. As data hoarding by Big Data companies increasingly raises public interest concerns, it is likely to provoke the interest of regulators keen to cut down the monopoly power of Google, Facebook et al. It would not be the first time that regulators have taken an interest in tech-related issues: Twenty years ago, the US government opened antitrust proceedings against Microsoft, accusing it of establishing a monopoly position and engaging in anti-competitive practices. And if data really is the new oil, as many commentators contend, recall how in the early twentieth century the US government forced the breakup of Standard Oil, accusing it of being an illegal monopoly.

Big Data companies are already potentially feeling the heat from the US Federal Communications Commission, which voted in December to dismantle its existing net neutrality rules. These rules prevent broadband suppliers from treating different groups of consumers differently, and the likes of Google, Facebook et al are concerned that changes to the rules could impact upon their business models if they are discriminated against by internet service providers (ISPs). As an aside, there are many who argue that net neutrality impinges on the property rights of ISPs, but that is a subject for another day. 

In order to alleviate regulators’ concerns, it might be prudent for the Big Data outfits to take some pre-emptive actions which show that they are taking mounting social concerns more seriously. For example, there is a case for suggesting that at least part of the data they collect could be shared across a range of platforms, thus creating an open-source database (after suitable efforts have been made to anonymise it). After all, it is a public resource – it is “our” information. Of course, this might mean an end to much of the apparently “free” content currently available online. However, both the tech industry and society as a whole are going to have to do some hard thinking about how to balance privacy issues against the cost of online services. If this does not happen, it is likely that government will take the decisions for us, which may not be to anyone’s liking.
