Cryptographic data integrity in a multiplatform environment
With about five months left until the end of the project, we at DECODE have delivered a considerable volume of code and have already deployed much of it successfully, in particular:
- The first Barcelona pilot "DDDC": a blockchain-powered module of the Decidim platform, designed for sharing a petition, signing it with a mobile app and counting the signatures in a cryptographically secure and completely anonymous way. This pilot involves several microservices, connected to an admin panel, communicating with a mobile app and storing data on a database and a blockchain.
- The second Barcelona pilot, "IoT": a data collection platform that allows citizens to anonymously monitor temperature, noise and pollution on their homes' balconies, and lets the data be collected, analyzed and shown on the BCNNow admin panel developed by DECODE.
- The Amsterdam pilot "18+": a platform to cryptographically store users' credentials (age in this case), allowing citizens to prove their age without needing to show their ID card, thus protecting their privacy from surveillance cameras (which may be recording pictures of their ID cards) and undesired attention. This pilot includes passport scanners, as well as a network of microservices delivering and showing the credentials via mobile browsers.
Cryptographic data integrity, what is that?
All these pilots have in common an elaborate architecture and a modern, microservice-rich design, as is expected from complex projects involving many partners with different skills and visions.
One of the challenges in managing a platform built on services and apps, running on different hardware and software environments, is the integrity of the cryptographic data.
Any data that undergoes a cryptographic transformation will later need to be transformed again, transformed back or matched against other data. In any cryptographic flow, such as a simple encryption/decryption, each transformation needs to be deterministic, otherwise the flow will break.
Despite a good level of homogeneity among cryptographic algorithms and implementations, determinism can hardly be taken for granted in this domain. The severity of the issue grows with the number of different software platforms and components present in the cryptographic flow: implementations of cryptographic algorithms can differ between applications and libraries, and even executing the same library function on a different OS can produce a different output, making the data unusable.
Examples include MD5 hashing giving different results across platforms, different commands handling strings differently, and even the SHA1 hashing function producing a different result in Python 2 and Python 3.
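Mismatches of this kind often arise because hash functions operate on bytes, not on text: the same visible string can map to different byte sequences depending on encoding, Unicode normalization or a trailing newline. The following Python 3 sketch illustrates the general mechanism. The sample email address and the helper names `sha1_hex` and `canonical_sha1` are made up for illustration, and this is not necessarily the cause of the specific issues mentioned above.

```python
import hashlib
import unicodedata


def sha1_hex(data: bytes) -> str:
    """Return the hex SHA1 digest of an exact byte sequence."""
    return hashlib.sha1(data).hexdigest()


# Hypothetical sample input; the accented letter is a single code point here.
email = "jos\u00e9@example.org"

# The same text, turned into bytes in four different ways.
digests = {
    "utf-8": sha1_hex(email.encode("utf-8")),
    "latin-1": sha1_hex(email.encode("latin-1")),
    # A trailing newline, e.g. `echo` versus `echo -n` in a shell.
    "utf-8 + newline": sha1_hex((email + "\n").encode("utf-8")),
    # Decomposed form: plain "e" followed by a combining accent.
    "utf-8, NFD": sha1_hex(unicodedata.normalize("NFD", email).encode("utf-8")),
}

for label, digest in digests.items():
    print(f"{label:16} {digest}")


def canonical_sha1(text: str) -> str:
    """Hash text after fixing one byte representation: trimmed, NFC, UTF-8."""
    normalized = unicodedata.normalize("NFC", text.strip())
    return sha1_hex(normalized.encode("utf-8"))
```

All four digests differ even though the text looks identical on screen. Agreeing on a single canonical byte representation before hashing, as `canonical_sha1` does, is one way to make every component compute the same value.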
How a virtual machine can solve this problem
The DECODE development team ran into the Python 2 vs Python 3 SHA1 issue while working on the DDDC pilot. There, a list of emails was first hashed on a server running Python 2, and individual emails were hashed by a different service running Python 3 (both on Linux), as well as by a mobile app written in React Native and running on Android and iOS. To our surprise, calling the same function with identical parameters produced different hashes of the emails.
This led us to switch from the Python function to Zenroom even for simple hashing, developing the very simple smart contract 50.
A few days of testing ensued, which allowed us to assess the high level of determinism that Zenroom offers across hardware platforms and operating systems.
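One simple way to check this kind of determinism is a known-answer test: every platform hashes the same fixed input and compares the result against a published digest. The sketch below uses Python's `hashlib` and the standard "abc" test vectors for SHA1 and SHA-256. It is a generic illustration, not the test procedure we ran against Zenroom, and the function name `check_known_answers` is hypothetical.

```python
import hashlib

# Standard test vectors for the three-byte ASCII input "abc".
KNOWN_ANSWERS = {
    "sha1": "a9993e364706816aba3e25717850c26c9cd0d89d",
    "sha256": "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad",
}


def check_known_answers(message: bytes = b"abc") -> dict:
    """Map each algorithm name to True if the local digest matches the vector."""
    results = {}
    for name, expected in KNOWN_ANSWERS.items():
        actual = hashlib.new(name, message).hexdigest()
        results[name] = actual == expected
    return results


if __name__ == "__main__":
    for name, ok in check_known_answers().items():
        print(name, "OK" if ok else "MISMATCH")
```

Running the same script on each server, OS and runtime in the flow gives a quick signal of whether they all agree on the basic primitives before any real data passes through them.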
What is a virtual machine anyway?
The term "virtual machine" often refers to "system virtual machines", which are linked to virtualization platforms that allow you to run several OS instances on the same physical server. Zenroom is not a "VM" in that sense; it is in fact a "process virtual machine".
It aims to provide a platform-independent programming environment that abstracts away details of the underlying hardware or operating system and allows a program to execute in the same way on any platform.
Zenroom is in fact fully independent of its underlying OS, because it uses its own built-in pseudo-random number generation function and manages its memory allocation directly: this combination provides a high degree of determinism for complex cryptographic flows happening in multiplatform environments.
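To see why a built-in generator helps, consider a byte generator that depends only on its seed and its own code rather than on an OS facility: given the same seed, it yields the same stream everywhere. The Python sketch below is a toy illustration of that idea using SHA-256 over a seed and a counter. The class `CounterHashGenerator` is hypothetical, it is not Zenroom's generator, and it has not been vetted for production use.

```python
import hashlib


class CounterHashGenerator:
    """Toy deterministic byte generator: SHA-256(seed || counter).

    Illustration only: not Zenroom's generator, not vetted for real use.
    """

    def __init__(self, seed: bytes):
        self._seed = seed
        self._counter = 0

    def random_bytes(self, n: int) -> bytes:
        """Return n bytes derived only from the seed and the internal counter."""
        out = b""
        while len(out) < n:
            counter_bytes = self._counter.to_bytes(8, "big")
            out += hashlib.sha256(self._seed + counter_bytes).digest()
            self._counter += 1
        return out[:n]


# Two independent instances with the same seed produce the same stream.
first = CounterHashGenerator(b"shared-seed").random_bytes(16)
second = CounterHashGenerator(b"shared-seed").random_bytes(16)
print(first.hex())
print(second.hex())
```

Because nothing in the generator reads from the operating system, two components on different platforms that start from the same seed can reproduce each other's values exactly.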