Blockchains for Big Data
From Data Audit Trails to a Universal Data Exchange
Big Data is Big Business
Big data arose in the early to mid-2000s to meet internet-scale computation needs: ZooKeeper at Yahoo, BigTable and MapReduce at Google, Cassandra at Facebook, and so on. Then came open source projects like the Hadoop Distributed File System (HDFS), Hadoop MapReduce, Cassandra, and more.
Now, big data technology is quietly transforming every enterprise backend on the planet. For example, in many places “data warehouses” of relational databases are getting replaced by “data lakes” running big data software. More than $100B annually is going towards big iron compute clusters, the software on top, and the services to keep it all running smoothly.
Big Data Challenges
But big data has its challenges, which include control, data authenticity and monetization.
First, who controls the infrastructure when there are multiple actors involved? For example:
If you’re a multinational enterprise, how do you share data around the planet? If you have multiple copies, how do you know which one is the most up-to-date? How do you reconcile a different system administrator role at each regional office?
If you’re an industry consortium, how do you share control of the ecosystem infrastructure among the companies in your consortium? This is especially hard if those companies are competitors!
Why can’t there be data just “out there” as a single shared source of truth that no one on the planet owns or controls, per se? Rather, data would be a public utility like electricity or the internet itself.
Second, how well can you trust the data? For example:
If you generate the data yourself, how do you prove you were the originator? If you get data from others, how do you know it truly came from them?
What about crashes and malicious behavior? Machines crash, glitches happen, bits flip. Zombie IoT toasters might be inputting garbage. So after all your fancy Spark calculations, is it still just garbage out?
Finally, how do you monetize the data? For example:
How do you transfer the rights of the data, or buy rights from others?
There’s a long-standing dream of a universal data marketplace; how do you build one?
# # # #
Trent McConaghy was raised on a pig farm in Canada, “hacking away on cold winter nights. 3D CAD tool, wordprocessor, dozens of games.” He holds a PhD in EE from KU Leuven, Belgium. His thesis was awarded #1 worldwide in the field.