There’s a dangerous idea that massive use of anonymity is a noble antidote to the prying state. This is like pumping up the level of heavy metals in your body to make it stronger. Rather, privacy can be gained only by trust, and trust requires persistent identity. In the end, the more trust the better, and the more responsibility the better. Like all trace elements, anonymity should never be eliminated completely, but it should be kept as close to zero as possible.
• • •
Everything else in the realm of data is headed to infinity. Or at least astronomical quantities. The average bit effectively becomes anonymous, almost undetectable, when measured against the scale of planetary data. In fact, we are running out of prefixes to indicate how big this new realm is. Gigabytes are on your phone. Terabytes were once unimaginably enormous, yet today I have three terabytes sitting on my desk. The next level up is peta. Petabytes are the new normal for companies. Exabytes are the current planetary scale. We’ll probably reach zetta in a few years. Yotta is the last scientific term for which we have an official measure of magnitude. Bigger than yotta is blank. Until now, any more than a yotta was a fantasy not deserving an official name. But we’ll be flinging around yottabytes in two decades or so. For anything beyond yotta, I propose we use the single term “zillion”—a flexible notation to cover any and all new magnitudes at this scale.
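The ladder of prefixes above is nothing more exotic than repeated factors of 1,000. A minimal sketch (the prefix names and powers of ten are the standard decimal SI values, not anything from this passage):

```python
# The SI prefix ladder the passage climbs, from gigabytes to yottabytes.
# Each named step up the ladder is a factor of 1,000.
PREFIXES = {
    "giga":  10**9,   # gigabytes: on your phone
    "tera":  10**12,  # terabytes: a desktop drive
    "peta":  10**15,  # petabytes: the new normal for companies
    "exa":   10**18,  # exabytes: current planetary scale
    "zetta": 10**21,  # probably reached in a few years
    "yotta": 10**24,  # the last officially named magnitude
}

# Verify that every rung is exactly 1,000x the one below it.
steps = list(PREFIXES.values())
assert all(bigger // smaller == 1000
           for smaller, bigger in zip(steps, steps[1:]))
```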
Large quantities of something can transform the nature of those somethings. More is different. Computer scientist J. Storrs Hall writes: “If there is enough of something, it is possible, indeed not unusual, for it to have properties not exhibited at all in small, isolated examples. There is no case in our experience where a difference of a factor of a trillion doesn’t make a qualitative, as opposed to merely a quantitative, difference. A trillion is essentially the difference in weight between a dust mite, too small to see and too light to feel, and an elephant. It’s the difference between $50 and a year’s economic output for the entire human race. It’s the difference between the thickness of a business card and the distance from here to the moon.”
Call this difference zillionics.
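Hall's factor-of-a-trillion comparisons can be checked with back-of-the-envelope arithmetic. The physical estimates below (mite mass, card thickness, and so on) are rough illustrative assumptions of mine, chosen only to show that each ratio really does land near 10^12:

```python
# Sanity-check Hall's three trillion-to-one comparisons.
# All physical values are rough order-of-magnitude assumptions.
TRILLION = 10**12

dust_mite_kg = 5e-9            # ~ a few micrograms, too light to feel
elephant_kg = 5e3              # ~ 5 tonnes
mass_ratio = elephant_kg / dust_mite_kg        # ~ 1e12

dollars = 50
world_output = 50 * TRILLION   # gross world product, roughly $50 trillion
money_ratio = world_output / dollars           # 1e12

card_thickness_m = 3e-4        # ~ 0.3 mm business card
earth_moon_m = 3.844e8         # ~ 384,400 km to the moon
length_ratio = earth_moon_m / card_thickness_m  # ~ 1.3e12

# Each comparison falls within an order of magnitude of a trillion.
for ratio in (mass_ratio, money_ratio, length_ratio):
    assert 1e11 < ratio < 1e13
```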
A zillion neurons give you a smartness a million won’t. A zillion data points will give you insight that a mere hundred thousand don’t. A zillion chips connected to the internet create a pulsating, vibrating unity that 10 million chips can’t. A zillion hyperlinks will give you information and behavior you could never expect from a hundred thousand links. The social web runs in the land of zillionics. Artificial intelligence, robotics, and virtual realities all require mastery of zillionics. But the skills needed to manage zillionics are daunting.
The usual tools for managing big data don’t work very well in this territory. A statistical prediction technique such as maximum likelihood estimation (MLE) breaks down because in the realm of zillionics the maximum likelihood estimate becomes improbable. Navigating zillions of bits, in real time, will require entirely new fields of mathematics, completely new categories of software algorithms, and radically innovative hardware. What wide-open opportunities!
The coming new arrangement of data at the magnitude of zillionics promises a new machine at the scale of the planet. The atoms of this vast machine are bits. Bits can be arranged into complicated structures just as atoms are arranged into molecules. By raising the level of complexity, we elevate bits from data to information to knowledge. The full power of data lies in the many ways it can be reordered, restructured, reused, reimagined, remixed. Bits want to be linked; the more relationships a bit of data can join, the more powerful it gets.
The challenge is that the bulk of usable information today has been arranged in forms that only humans understand. Inside a snapshot taken on your phone is a long string of 50 million bits that are arranged in a way that makes sense to a human eye. This book you are reading is about 700,000 bits ordered into the structure of English grammar. But we are at our limits. Humans can no longer touch, let alone process, zillions of bits. To exploit the full potential of the zillionbytes of data that we are harvesting and creating, we need to be able to arrange bits in ways that machines and artificial intelligences can understand. When self-tracking data can be cognified by machines, it will yield novel and improved ways of seeing ourselves. In a few years, when AIs can understand movies, we’ll be able to repurpose the zillionbytes of that visual information in entirely new ways. AI will parse images as we parse an article, and so it will easily reorder image elements the way we reorder words and phrases when we write.
Entirely new industries have sprung up in the last two decades based on the idea of unbundling. The music industry was overturned by technological startups that enabled melodies to be unbundled from songs and songs unbundled from albums. Revolutionary iTunes sold single songs, not albums. Once distilled and extracted from their former mixture, musical elements could be reordered into new compounds, such as shareable playlists. Big general-interest newspapers were unbundled into classifieds (Craigslist), stock quotes (Yahoo!), gossip (BuzzFeed), restaurant reviews (Yelp), and stories (the web) that stood and grew on their own. These new elements can be rearranged—remixed—into new text compounds, such as news updates tweeted by your friend. The next step is to unbundle classifieds, stories, and updates into even more elemental particles that can be rearranged in unexpected and unimaginable ways. Sort of like smashing information into ever smaller subparticles that can be recombined into a new chemistry. Over the next 30 years, the great work will be parsing all the information we track and create—all the information of business, education, entertainment, science, sport, and social relations—into their most primeval elements. The scale of this undertaking requires massive cycles of cognition. Data scientists call this stage “machine readable” information, because it is AIs and not humans who will do this work in the zillions. When you hear a term like “big data,” this is what it is about.
Out of this new chemistry of information will arise thousands of new compounds and informational building materials. Ceaseless tracking is inevitable, but it is only the start.
We are on our way to manufacturing 54 billion sensors every year by 2020. Spread around the globe, embedded in our cars, draped over our bodies, and watching us at home and on public streets, this web of sensors will generate another 300 zillionbytes of data in the next decade. Each of those bits will in turn generate twice as many metabits. Tracked, parsed, and cognified by utilitarian AIs, this vast ocean of informational atoms can be molded into hundreds of new forms, novel products, and innovative services. We will be astounded at what becomes possible with this new level of self-tracking.
11 QUESTIONING
Much of what I believed about human nature, and the nature of knowledge, was upended by Wikipedia. Wikipedia is now famous, but when it began I and many others considered it impossible. It’s an online reference organized like an encyclopedia that unexpectedly allows anyone in the world to add to it, or change it, at any time, no permission needed. A 12-year-old in Jakarta could edit the entry for George Washington if she wanted to. I knew that the human propensity for mischief among the young and bored—many of whom lived online—would make an encyclopedia editable by anyone an impossibility. I also knew that even among the responsible contributors, the temptation to exaggerate and misremember was inescapable, adding to the impossibility of a reliable text. I knew from my own 20-year experience online that you could not rely on what you read by a random stranger, and I believed that an aggregation of random contributions would be a total mess. Even unedited web pages created by experts failed to impress me, so an entire encyclopedia written by unedited amateurs, not to mention ignoramuses, seemed destined to be junk.