Data ownership is the problem, not the solution

Please see this discussion between Isabella DiMichelis and Evgeny Morozov, moderated by Maria Mazzucato.

I’m concerned less with the right to monetize than with the option of not monetizing. I miss the days when many Internet participants chose not to monetize. That pretty much got lost in the aftermath of the 2008 crash: the DIY hacker ethos died and the FYPM (fuck you, pay me) ethos took its place. Now everyone is a mercenary, very careful to avoid “working for free,” and all content is monetized. The monetization gimmicks are very cynical: they rely on some gotcha placement of a toll booth, and of course on Digital Restrictions Management, which means the end of general purpose computing.

Isabella DiMichelis points out that:

Can’t have innovation without data.
Can’t have a lot of data without consumers being engaged.
Can’t have a lot of data with too much privacy.

There are too few innovation opportunities because data flows from individuals to businesses and not the other way. This is by design. Websites and mobile apps are very intentionally designed to transmit signal (such as empirical and often actionable observations of individual behavior) in one direction and noise (basically bloat) in the other.

Consumers may or may not be engaged, but they are stuck in a passive rather than active role in the use of data, because the data they generate are not for their own use and are not available to them in a usable, machine-readable form.

Privacy is a lost cause anyway. The issue now is information asymmetry, namely those one-way informational valves that send signal one way and noise the other.

DiMichelis, again:

We can change the way users perceive their value
We are not calling for change in how data exchange is implemented
Individuals are suppliers, not just consumers
Individuals need to know the value of the data they supply, but this is not transparent
The intent is to have new services for the consumer, new wealth for society, new innovation, etc.
The means by which this is monetized is artificial scarcity (well, no shit, Sherlock)

This is where I very much disagree.

I am very much calling for change in how data exchange is implemented. The systems for this need to be replaced with nonproprietary systems. If the businesses will not go along with this, noncommercial alternatives must be created, or the interfaces with those businesses must be reverse engineered. Individuals need to bear in mind that they are suppliers of information, but also need to feel empowered to make creative use of the data they (and perhaps others) generate.

There seems to be an assumption that the innovation takes place in the businesses, that whatever innovations they create with the data are proprietary, and that the consumer public are passive consumers of these innovations and the products that result from them. In DiMichelis’ scheme, of course, consumers would also be compensated for their informational contribution, but I think DiMichelis is possibly underestimating the consumers’ potential intellectual contribution to this innovation. More than ownership of their data (which is an artificial construct, anyway), consumers should get access to their data. The transparency needed for individuals to know the value of their data does not exist because the way the data exchange is implemented is proprietary. If the data does have exchange value, as DiMichelis points out, that would be due to artificial scarcity, which is another house of cards for the legal system to prop up.

It’s certainly telling that DiMichelis suggests (time index 41:34) the copyright lobby as one of the agencies of collective bargaining in place in the European Union. She seems to think that, among Google, Uber, Amazon and Apple, a consumer can decide to make their data available to zero, one, two, three or all four. But if you made your data available to one of those entities, how would you know they didn’t share it with one of the others? How would the consumer (or the legal system) enforce this exclusivity?

Morozov, in contrast, wants institutions that communicate to consumers in a language other than that of the marketplace. That is more my speed. He believes we can design non-market forms of cooperation and collaboration.

Lest I take sides, DiMichelis, to her credit, sees user passivity as a problem and attributes it to too few interoperability standards (none, really, for mobile apps) and too much vendor lock-in.

Apparently, according to information in this video, a hospital sold its data to Google in exchange for digitization of the same data, basically for data entry. This is all kinds of problematic. Data entry labor has cheapened to Amazon Mechanical Turk levels, so the service offered by Google is of low value. One might argue that hospital data is of a very personal nature and therefore protection of personal privacy is of utmost importance, so getting data digitization on the cheap could be a grievous error.

This brings us to some of the problems in how the privacy issue is framed today. Nobody expects today’s privacy policies to protect confidentiality of personal data. That is correctly assumed by most to be a lost cause. It is assumed that the more reputable firms will safeguard against acquisition of personal data by cybercriminals, and successfully preventing data breaches could be considered a good outcome. People click “agree” resigned to the reality that the non-criminal uses to which the data will be put will aid the development of proprietary technologies, of asymmetric informational advantages in the marketplace (personalized prices calibrated to individuals’ “price points” or even “pain points”), or of manipulative simulacra such as filter bubbles.

Donating the data to the public domain, to a data commons, would almost certainly do more for “innovation,” if for no other reason than that innovators would be free to explore data without worrying whether they’re trespassing on someone’s intellectual property. The lowered costs due to free-as-in-beer data access couldn’t hurt, either, nor could the opportunities to collaborate, not just compete, in the pursuit of novel applications of data.

One thing that is urgently needed is a nonproprietary, noncommercial implementation of trust metrics. The fact that the first reasonably good solutions to this problem have been proprietary is perhaps the largest threat to the hacker ethos, but hopefully not the final nail in its coffin. For a collaborative data commons to be possible:

Consumers must be able to contribute data to the commons without exposing themselves to criminal behavior.
Consumers must also get something of value in return for actionable data.

I propose that the assurances for each of these two concerns come from the nonproprietary nature of the data commons itself. Making data available to the public at large means making it available to criminals too. There’s no way around that, but hopefully in a highly transparent informational matrix it will be very difficult, almost impossible, to commit crimes undetected. Surveillance camera footage might or might not be watched by the guard in the guard shack, while sousveillance camera footage (where the camera is basically an open-access webcam) might be watched by any number of people anywhere in the world. The idea here is to apply sousveillance to many different forms of data, not just video.

By actionable data above I mean, broadly speaking, business intelligence. Particularly, I mean capacity to solve valuation problems, or recognize merchandise as overpriced or underpriced. If you’re using an app to guide you to what are touted as the most competitive supermarket prices or gasoline prices or something, I’m pretty sure you’re giving more than you’re getting when it comes to actionable data. A healthy data commons offers a marketplace in which price arbitrage makes less of a difference, where price discovery happens a lot faster, and where the “law of one price” is a much more accurate model. All cards face up on the table means you don’t get to play your cards close to the vest, but neither do your competitors. I suppose that is a nightmare scenario to those who think of shopping as a competitive sport. I’m kind of hoping they’re in the minority.
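To make the valuation point a bit more concrete, here is a minimal sketch in Python of what a shopper could do with openly shared price observations. Everything in it is an assumption for illustration: the item names, stores, and prices are made up, the 15 percent tolerance is arbitrary, and comparing each price against the median for that item is just one simple way to approximate “overpriced” or “underpriced.” No real dataset or API is implied.

```python
from statistics import median

# Hypothetical shared price observations: (item, store, price in dollars).
OBSERVATIONS = [
    ("milk 1 gal", "Store A", 3.49),
    ("milk 1 gal", "Store B", 3.59),
    ("milk 1 gal", "Store C", 4.79),
    ("eggs dozen", "Store A", 2.99),
    ("eggs dozen", "Store B", 2.19),
    ("eggs dozen", "Store C", 3.09),
]


def flag_prices(observations, tolerance=0.15):
    """Label each observation relative to the median price for its item.

    `tolerance` is an arbitrary illustrative threshold: a price more than
    15% above the median is called overpriced, more than 15% below it
    underpriced, and anything in between typical.
    """
    # Group the observed prices by item.
    by_item = {}
    for item, _store, price in observations:
        by_item.setdefault(item, []).append(price)

    # The median stands in for the "one price" the market would converge on.
    medians = {item: median(prices) for item, prices in by_item.items()}

    flagged = []
    for item, store, price in observations:
        ratio = price / medians[item]
        if ratio > 1 + tolerance:
            label = "overpriced"
        elif ratio < 1 - tolerance:
            label = "underpriced"
        else:
            label = "typical"
        flagged.append((item, store, price, label))
    return flagged


if __name__ == "__main__":
    for item, store, price, label in flag_prices(OBSERVATIONS):
        print(f"{item:12} {store:8} ${price:.2f}  {label}")
```

The point of the sketch is only that the comparison is trivial once the observations are in everyone’s hands. The hard part is getting the data out of the one-way valve in the first place.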

The problem with both of these arguments is that to have the relevant assurances to offer, the data commons must already be in place, so it will be a challenge to recruit some number of people to contribute the “seed data” around which to build this public infrastructure. The number of early participants must be large enough to form a critical mass of sorts.

This might be where national intermediaries (42:59) come in. One beautiful example of a public dataset curated by the public sector is the USDA’s Food Composition Databases. Nation states and their political subdivisions would do well to expand such data projects and add new ones. This may come as a surprise to long-time readers of the present blog. Suffice it to say that my political attitudes have shifted somewhat over the last few years. For example, antinationalism has replaced antistatism as the central pillar of my worldview. I’ve also become more tolerant of the regulatory aspect of statism as capitalist practices have become more and more indefensible and (apropos the present post) informational leverage becomes more and more concentrated in the commercial sector. It’s become more important to me that the power of business get checked soon than that it get consigned to the ash heap eventually.
