Cryptographic data integrity in a multiplatform environment

Andrea D'Intino
Wednesday 12th June 2019


About five months before the end of the project, we at DECODE have delivered a considerable volume of code and have already deployed much of it, in particular:

  • The first Barcelona pilot, "DDDC": a blockchain-powered module of the Decidim platform, aimed at sharing a petition, signing it with a mobile app and counting the signatures in a cryptographically secure and completely anonymous way. This pilot involves several microservices, connected to an admin panel, communicating with a mobile app and storing data on a database and a blockchain.
  • The second Barcelona pilot, "IoT": a data-collecting platform that allows citizens to anonymously monitor temperature, noise and pollution on their homes' balconies, and lets the data be collected, analyzed and shown on the BCNNow admin panel developed by DECODE.
  • The Amsterdam pilot, "18+": a platform to cryptographically store users' credentials (age in this case), allowing citizens to prove their age without needing to show their ID card, thus protecting their privacy from surveillance cameras (which may be recording pictures of their ID cards) and from undesired attention. This pilot includes passport scanners, as well as a network of microservices delivering and showing the credentials via mobile browsers.

 

Cryptographic data integrity, what is that? 

All these pilots have in common an elaborate architecture and a modern, microservice-rich design, as is expected from complex projects involving many partners with different skills and visions.

One of the biggest challenges in managing distributed platforms, built on diverse services and apps that are written in different languages and run on different devices (clients, servers, web browsers and mobile), is guaranteeing data integrity and the consistency of cryptographic transformations.

All data that undergoes a cryptographic transformation will later need to be transformed again, transformed back or matched against other data. In a properly designed cryptographic flow, even for a simple encryption/decryption, each transformation needs to occur "end to end" (where the ends can be very different devices) while being deterministic and leading to the same results; otherwise the flow will break.
Despite a good level of homogeneity among cryptographic algorithms and implementations, determinism can hardly be taken for granted in this domain. The severity of the issue grows with the number of different software platforms and components present in the cryptographic flow: the implementation of algorithms can differ between applications and libraries, and even executing the same library function on a different OS can produce a different output, making the data unusable.

Examples include the usage of MD5 hashing across platforms, the fact that different commands handle strings differently, and even the case of the SHA1 hashing function producing a different result in Python 2 and Python 3, all of which create the need for language-agnostic implementations.
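To make the string-handling point concrete, here is a minimal Python 3 sketch. It is an illustration under our own assumptions, not the code used in the pilots: the e-mail address and the helper name sha1_hex are hypothetical, and the three byte representations are simply plausible ways in which different platforms might serialize the same visible text before hashing.

```python
import hashlib
import unicodedata


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of raw bytes as lowercase hex."""
    return hashlib.sha1(data).hexdigest()


# Hypothetical e-mail address, used only for illustration.
email = "jos\u00e9@example.org"

# The same visible text, serialized under three assumed conventions.
variants = {
    "utf-8, NFC": unicodedata.normalize("NFC", email).encode("utf-8"),
    "utf-8, NFD": unicodedata.normalize("NFD", email).encode("utf-8"),
    "latin-1": email.encode("latin-1"),
}

# Hash each byte representation with the very same function.
digests = {label: sha1_hex(raw) for label, raw in variants.items()}

for label, digest in digests.items():
    print(f"{label:12} {digest}")
```

The hash function is identical in all three cases; only the bytes fed into it differ, and that is enough to make the digests impossible to match against each other.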

How a virtual machine can solve this problem

The DECODE development team ran into the Python 2 vs Python 3 hashing issue while working on the DDDC pilot. There, a list of emails was first hashed on a server running Python 2, while individual emails were hashed by a different service running Python 3 (both on Linux), as well as by a mobile app written in React Native and running on Android and iOS. To our surprise, calling the same function with identical parameters produced different hashes of the emails.

This led us to switch from using different functions to using Zenroom even for simple hashing, and to develop a very simple smart contract for it.

A few days of testing ensued, which allowed us to assess the high level of determinism that Zenroom offers across hardware platforms and operating systems.
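A determinism check of this kind can be sketched with published test vectors. The snippet below is only an assumption about what such a test might look like, written with Python's standard hashlib rather than Zenroom; the names KNOWN_VECTORS and check_vectors are hypothetical. The idea is to run the same script on every target platform and verify that each one reproduces the well-known digests of the ASCII message "abc".

```python
import hashlib
import platform

# Fixed input and its widely published SHA-1 and SHA-256 digests.
MESSAGE = b"abc"
KNOWN_VECTORS = {
    "sha1": "a9993e364706816aba3e25717850c26c9cd0d89d",
    "sha256": "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad",
}


def check_vectors() -> dict:
    """Hash MESSAGE with each algorithm and compare with the expected digest."""
    results = {}
    for name, expected in KNOWN_VECTORS.items():
        actual = hashlib.new(name, MESSAGE).hexdigest()
        results[name] = actual == expected
    return results


report = check_vectors()

# Print the platform alongside the outcome so runs can be compared.
print(platform.system(), platform.python_version(), report)
```

Collecting this one-line report from each server, browser runtime or phone gives a quick first signal of whether all ends of the flow agree before more complex transformations are tested.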

What is a virtual machine anyway?

The term "virtual machine" often refers to "system virtual machines", linked to virtualization platforms that allow you to run several OS instances on the same physical server. Zenroom is not a "VM" in that sense; it is in fact a "process virtual machine".
It aims to provide a platform-independent programming environment that abstracts away details of the underlying hardware or operating system and allows a program to execute in the same way on any platform.

Zenroom is in fact fully independent of its underlying OS, providing a high degree of determinism for complex cryptographic flows happening in multi-platform environments.

As a result, developers in DECODE saved much of the time otherwise needed to align different teams from multiple organizations that adopt different architectures, libraries and languages. Using a VM like Zenroom has benefited our workflow by lowering friction and helping us reach consistent results, even across very complex cryptographic data transformations.
