How big is the digital universe? The ‘digital universe’ comprises all the information created, captured, or replicated in digital form. Scientists have been arguing over whether the digital universe will ever be as large as Avogadro’s number. The 19th-century Italian scientist Amedeo Avogadro is the person behind what is known as the Avogadro number, also called the Avogadro constant: the number of atoms, molecules, or other particles in one mole of a substance (for example, the number of atoms in 12 grams of carbon-12). The Avogadro number is 602,200,000,000,000,000,000,000, or 6.022 x 10^23. A new report by International Data Corporation (IDC), sponsored by EMC, addresses this issue.
The new report is an update of a 2007 IDC white paper about the size and projected growth of the digital universe. The paper argued that the digital universe was 281 exabytes in size; an exabyte is a quintillion, or 10 to the power of 18, bytes. That works out to about 45 gigabytes for each person on earth. The new report claims that by 2011 the yearly amount of digital information will reach almost 1,800 exabytes, or 6.4 times the 2007 amount. The main problem with this gigantic amount of data is that not all of it will find storage space: the IDC report estimates that by 2011 more than half of the data in the digital universe will not find storage space and will be lost.
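The arithmetic behind these figures is easy to check. The following is a minimal sketch, not anything taken from the IDC report: the function names are made up for illustration, and the world population of 6.6 billion is an assumption chosen to match the period the report covers.

```python
EXABYTE = 10**18   # bytes in an exabyte (decimal definition used in the text)
GIGABYTE = 10**9   # bytes in a gigabyte

def per_person_gb(total_exabytes, population):
    """Average share of the digital universe per person, in gigabytes."""
    return total_exabytes * EXABYTE / population / GIGABYTE

def growth_factor(later_exabytes, earlier_exabytes):
    """How many times larger the later figure is than the earlier one."""
    return later_exabytes / earlier_exabytes

# Assumption: roughly 6.6 billion people; this figure is not from the report.
ASSUMED_POPULATION = 6_600_000_000

share = per_person_gb(281, ASSUMED_POPULATION)
factor = growth_factor(1800, 281)
print(f"{share:.1f} GB per person, growth factor {factor:.1f}")
```

With that assumed population the share comes to about 43 gigabytes per person, in the same range as the 45 gigabytes quoted, and 1,800 exabytes is indeed about 6.4 times 281 exabytes.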
According to the report, “Fast-growing corners of the digital universe include those related to digital TV, surveillance cameras, Internet access in emerging countries, sensor-based applications, datacenters supporting ‘cloud computing,’ and social networks.” The item that caught my interest is surveillance cameras: it appears that, worldwide, an increasing number of people are being monitored by other people, for a variety of reasons. According to the 2016 Human Rights Watch World Report, “[t]hese days, the person standing on the soapbox in Hyde Park is likely to be preserved by the government on film; London has one of the highest densities of public surveillance cameras in the world.”
An interesting aspect of the report is the concept of “shadow data” or “ambient content”: data generated as a by-product of digital activities. Cameras generate surveillance photos, search engines churn out Web search histories, banking transactions spit out miles of financial journals, and we all have endless mailing lists, addresses, and old mail records. Add to that old medical records, flight and hotel bookings, and records of transactional interactions (such as invoices from our shopping at Amazon, bidding at eBay, and paying our monthly telephone bills online). A fascinating example in the report, entitled “a day in the life of an email”, argues that a short email with a 1MB attachment has a ‘footprint’ (related information and backup copies) that is 8 times larger than the missive itself (p.6). Shadow data, says the report, constitutes more than half of the digital universe. Put simply, we have more garbage than substance.
Will the digital universe ever be as large as Avogadro’s number? The IDC report argues that “the number of digital ‘atoms’ in the digital universe is already bigger than the number of stars in the universe. And, because the digital universe is expanding by a factor of 10 every five years, in 15 years it will surpass Avogadro’s number.”
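The 15-year claim can be sanity-checked with a few lines of arithmetic. This is only a sketch under stated assumptions: it treats a digital “atom” as a single bit (the passage quoted here does not define the term), it starts from the 281-exabyte figure cited earlier, and it assumes growth is a smooth tenfold increase every five years. The function name is hypothetical.

```python
import math

AVOGADRO = 6.022e23   # Avogadro's number
EXABYTE = 10**18      # bytes in an exabyte
BITS_PER_BYTE = 8

def years_to_reach(target, start, factor=10.0, period_years=5.0):
    """Years until start * factor**(t / period_years) reaches target."""
    if start >= target:
        return 0.0
    return period_years * math.log(target / start, factor)

# Assumption: one digital "atom" is one bit, starting from 281 exabytes.
start_bits = 281 * EXABYTE * BITS_PER_BYTE
years = years_to_reach(AVOGADRO, start_bits)
print(f"{start_bits:.3e} bits at the start; about {years:.1f} years to reach Avogadro's number")
```

Under these assumptions the crossover comes after roughly 12 years: after two five-year steps the count is still below Avogadro’s number, and after three it is above, which is consistent with the report’s “in 15 years”.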
While irritation and frustration are natural outcomes of the need to manage and store huge amounts of primary data and shadow data, security is a major headache, too. Identity theft, according to the US Federal Trade Commission (FTC), “occurs when someone uses your personally identifying information, like your name, Social Security number, or credit card number, without your permission, to commit fraud or other crimes.” The FTC estimates that 9 million Americans are victims of identity theft each year. Proliferation of data makes identity theft much easier. In 2007, retail giant TJX disclosed that it had been the victim of a protracted, highly skillful set of attacks by hackers who stole customers’ personal information (including credit-card information). TJX estimated that “Information from at least 45.6 million credit cards had been stolen by unknown attackers who had breached the company’s computer transaction processing systems between July 2005 and mid-January 2007.” One can only guess how much money was siphoned off this staggering number of victims before TJX plugged the security leak.
The most baffling fact arising from the TJX disaster is that the company was unable to ascertain what information was actually stolen. “Deletions in the ordinary course of business prior to discovery of the Computer Intrusion and the technology used by the Intruder have, to date, made it impossible for us to determine much of the information we believe was stolen, and we believe that we may never be able to identify much of that information.” Gone behind digital shadows, the information is more than likely to surface sometime in the future and come back to haunt TJX and its customers. With each and every one of us carrying an average of 45 gigabytes of data, it will become increasingly difficult to keep an eye on our hard-earned money.
The IDC report offers a sober reflection on our digital future:
“The digital universe will be 10 times bigger in five years. What are we going to do about this? As a society, our experience with the digital universe will unfold somewhat like a science-fiction novel. Within five years, there will be 2 billion people on the Internet and 3 billion mobile phone users. All will be interconnected; all will be creating and consuming content at an alarming rate.”