Leaking Data: an Alternative Model for Data Sharing

Job Spierings and Max Kortlander
Wednesday 1st November 2017

Stories about personal data - many of which may feature your own personal data - are hitting the headlines daily, from the Equifax data breach and hacking of T-Mobile, to the hundreds of pages of data stored by Tinder about each of its users. Data breaches are now even receiving the listicle treatment.

The inevitable conclusion is that logging in and sharing data on the Internet is far too often unsafe, unreliable, costly, and manipulative.

All of your most personal information is spread across many thousands of databases on computers in data centres, backups, and "development environments" (often a euphemism for a software developer's laptop). And every start-up, new app, or digital service must build its own data silo, a complex task requiring the development of a secure technical environment, which is subject to increasingly heavy legislative requirements. Building such data silos is complicated, costly, and risky.

The solution for many of these webshops and apps is to rid themselves of this worry by using the single sign-on (SSO) services offered by Facebook, Google or Yahoo.

In exchange for providing this sign-in feature, these companies get access to a portion of your behaviour on the website or app. Each time you log in, log out or click a link, your action is registered and fed to ‘self-learning’ systems. Companies use the insights gained through this surveillance to target advertisements and content, based on an ever-expanding profile that they maintain of each of their users. This has led to the reshaping of the internet, such that every click or action by the user is surveilled, stored, and translated into a micro-financial transaction (to purchase ad space, for example) that is hidden from, yet directed back toward, the user.

To add to this imbalance, users of all types of services and stores that rely on these sign-in features are immediately subject to the compliance rules of these companies, many of which reserve the right to deny access at any time. Yahoo, for example, states the following in its Terms of Service:

Yahoo may, without telling you, immediately cancel or limit your access to your Yahoo accounts, certain Yahoo Services and any associated email addresses… (emphasis added) 

Unlike Yahoo’s, Facebook’s terms do state that they will inform users in advance about changes to their compliance rules, or about denying a user access. But if you disagree with such a decision, your recourse is to appeal to a court in California, irrespective of where you are on Earth. So while a form of recourse exists, it is hardly realistic or attainable for the vast majority of Facebook’s users to engage with California’s judicial system. All of this is to say that there is pervasive unevenness in a user’s relationship with large tech companies: they continue to accumulate leverage and knowledge about us, while we have neither control over, nor knowledge about, them.

Safe Sharing

The issues arising from this expanding, unequal relationship between large tech companies and users raise the need for an alternative model. You may want a system that lets you share your data with another party safely (that is, encrypted); that allows you to decide how long the other party can use that information and who they can share it with; and (when necessary) that lets the other party be sure that you are who or what you claim to be: that you're really older than 18, for example, without your having to scan your ID card.

And that's precisely what DECODE is researching, designing and developing: a set of open source software that allows you to encrypt, control and share your personal information. DECODE aims to become a decentralised platform that uses a distributed architecture to let people decide who they share their data with and under what conditions.

The DECODE software will provide the means for a user to make claims about themselves without disclosing any unnecessary data. For example, there may be situations where you want to purchase alcohol online. Thanks to the concept of zero-knowledge proof, a user can prove they are over 18 without revealing their exact age. Similarly, a voter may be able to prove they live within a certain jurisdiction without disclosing their exact address.
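
To make the age example more concrete, here is a minimal Python sketch of a hash-chain threshold proof. It is a deliberately simpler stand-in for a true zero-knowledge proof and is not taken from the DECODE software; every function name below is hypothetical. The assumption is that a trusted issuer (a municipality, say) gives the user a secret seed and publishes, in signed form, a commitment obtained by hashing that seed once for every year of the user's age. The signature itself is left out of the sketch.

```python
import hashlib
import hmac
import secrets


def hash_n(value: bytes, n: int) -> bytes:
    """Apply SHA-256 to value n times (n = 0 returns value unchanged)."""
    for _ in range(n):
        value = hashlib.sha256(value).digest()
    return value


def issue_commitment(seed: bytes, age: int) -> bytes:
    """Issuer side (hypothetical): hash the seed once per year of age."""
    return hash_n(seed, age)


def make_proof(seed: bytes, age: int, threshold: int) -> bytes:
    """User side: stop hashing 'threshold' steps short of the commitment."""
    if age < threshold:
        raise ValueError("age is below the threshold; no valid proof exists")
    return hash_n(seed, age - threshold)


def verify(proof: bytes, threshold: int, commitment: bytes) -> bool:
    """Verifier side: hashing the proof 'threshold' times must reach the commitment."""
    return hmac.compare_digest(hash_n(proof, threshold), commitment)


seed = secrets.token_bytes(32)           # known only to the issuer and the user
commitment = issue_commitment(seed, 34)  # assumed to be signed by the issuer
proof = make_proof(seed, 34, 18)         # sent to the shop together with the commitment
print(verify(proof, 18, commitment))     # the shop checks "18 or older", never sees 34
```

The verifier only learns that the proof sits at least 18 hash steps before the commitment. Someone younger than the threshold would have to invert SHA-256 to produce a valid proof. A production system would need much more than this, including protection against a proof being copied and replayed by someone else.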

Sometimes it is necessary for data to be shared and checked at a later time. In the case of an online petition, for example, votes ought to be cast in secret, but it should also be possible to check them for fraud. Distributed ledger technology (of which blockchain is the best-known example) provides an avenue for this sort of data transaction, by recording the transaction on a public ledger that is distributed over multiple computers. This ledger is then able to show how many people have voted as well as whether or not the voters were eligible to vote, all without particular personal attributes needing to be individually identified, and all without a central data storage location. In this type of distributed, decentralised architecture, information relating to a person or transaction is not stored in a single place. However, certain aspects of this data may be accessed when necessary.
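
The petition scenario can be illustrated with a toy, single-process ledger in Python. This is only a sketch under stated assumptions, not DECODE's design: a real distributed ledger replicates the entries over many computers and needs a consensus mechanism, both of which are omitted here. The names PetitionLedger and make_nullifier are hypothetical, and the eligible flag stands in for an eligibility proof that a real system would verify.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def make_nullifier(petition_id: str, voter_secret: str) -> str:
    """Pseudonymous token: the same voter and petition always give the same value."""
    return hashlib.sha256(f"{petition_id}:{voter_secret}".encode()).hexdigest()


def entry_hash(prev_hash: str, nullifier: str, eligible: bool) -> str:
    """Hash an entry together with the hash of the entry before it."""
    payload = json.dumps(
        {"prev": prev_hash, "nullifier": nullifier, "eligible": eligible},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()


class PetitionLedger:
    def __init__(self):
        self.entries = []

    def append(self, nullifier: str, eligible: bool) -> bool:
        """Record a signature; reject a nullifier that has already been used."""
        if any(e["nullifier"] == nullifier for e in self.entries):
            return False
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        self.entries.append({
            "prev": prev_hash,
            "nullifier": nullifier,
            "eligible": eligible,
            "hash": entry_hash(prev_hash, nullifier, eligible),
        })
        return True

    def is_intact(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev_hash = GENESIS_HASH
        for e in self.entries:
            expected = entry_hash(e["prev"], e["nullifier"], e["eligible"])
            if e["prev"] != prev_hash or e["hash"] != expected:
                return False
            prev_hash = e["hash"]
        return True

    def tally(self):
        """Return (total signatures, signatures marked eligible)."""
        return len(self.entries), sum(1 for e in self.entries if e["eligible"])


ledger = PetitionLedger()
ledger.append(make_nullifier("petition-1", "alice-secret"), eligible=True)
ledger.append(make_nullifier("petition-1", "bob-secret"), eligible=False)
print(ledger.tally(), ledger.is_intact())
```

Anyone holding a copy of the entries can recount the signatures and recheck the chain, yet the entries contain no names or addresses, only pseudonymous tokens. The token also lets the ledger refuse a second signature from the same person on the same petition.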

So far, the DECODE project has selected communities to work with. Together with these communities, we held inception workshops to co-create the first set of requirements for DECODE as well as a list of the project’s priorities. Our next steps are to work with these groups to finalise the DECODE architecture, the exact user scenarios and the planning of the pilots. At that time, the communities will also be publicised in more detail.

We hope to offer users and developers an alternative to the existing model of data collection, one that returns control and ownership of data to users. Critical readers will probably note that DECODE’s ambitions are high and difficult to realise completely. Even so, we think it is vital to move the conversation forward, and we aim to do so by demonstrating a practical way forward and offering convincing alternative visions.