Sunday, January 26, 2014

It all starts with data.

Let us dig a bit deeper into the first step of big data: the creation of bytes by most of us as we go about our day. This is by no means comprehensive, but you should get an idea of the deluge we are responsible for.

The Eyes

The biggest source of bytes is the image sensor (a CCD or CMOS chip), a small component that captures images inside the device, or feature, we call a camera.

Why do we use this device so much? Well, someone spent a lot of money to find out in 2005, which makes for interesting reading. These instruments are everywhere and very easy to use, and we might regret not using them later, so we use them. I have so many great memories that I wish I had on film or in digital form, but that is a different story.

The point is that we make lots of memories in digital form, and as it gets easier to make crisper ones, we go right ahead in burst mode. Burst mode means taking 100 or more pictures in the hope that one of them will be perfect: emotion, angle, light and no nosy noise in the background. Ahem, we then conveniently forget to delete the nosy ones and now have 99 nosies along with one perfect shot.

Add an ever-increasing megapixel count to these habits, and we have managed to create a monster album.

The Multipliers

Now take these huge piles of pictures and videos and sync them with iCloud, Box, SkyDrive, Google Drive, Instagram, Facebook and the many other sync-and-share services popping up all over the place, and we have just multiplied the data.

Imagine you are an Apple fan with an iPad and an iPhone (die-hard fanboys have Minis, Retinas, iMacs, etc.). You take one picture of a soaring eagle on a crisp sunny day, and the picture is replicated across all your iCloud-connected devices. You can conveniently show off your great luck in witnessing the glories of nature with whichever device is closest to hand.

The price of convenience? A whole bunch of hard disks that keep multiple copies of your pictures on iCloud or any other sharing service you subscribe to. I have heard they keep anywhere from 2 to 6 copies. Folks who know for sure, ping me on this.

So every picture taken × its size × (the number of connected devices + the copies the storage service keeps) = bytes now in this world.
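A rough sketch of that estimate, counting one copy per synced device plus however many copies the service keeps. The 3 MB picture size and the function name are assumptions made up for illustration; the 2 to 6 copies are the hearsay range mentioned above, not a confirmed figure.

```python
def stored_bytes(pictures, bytes_per_picture, devices, service_copies):
    """Estimate total bytes held for a set of pictures.

    Assumes one copy on each synced device plus `service_copies`
    copies kept by the storage service.
    """
    copies = devices + service_copies
    return pictures * bytes_per_picture * copies


PICTURE_BYTES = 3_000_000  # assumed average size of one picture (3 MB)
BURST = 100                # one burst: 99 nosies and one perfect shot
DEVICES = 2                # an iPad and an iPhone

# Hearsay range from the post: the service keeps 2 to 6 copies.
low = stored_bytes(BURST, PICTURE_BYTES, DEVICES, service_copies=2)
high = stored_bytes(BURST, PICTURE_BYTES, DEVICES, service_copies=6)

print(f"One burst occupies {low / 1e9:.1f} GB to {high / 1e9:.1f} GB")
```

Under these assumptions a single undeleted burst occupies somewhere between 1.2 GB and 2.4 GB once it has synced everywhere.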

The Act

With the internet everywhere, we act a lot: we visit websites and click on a bunch of interesting items just because we are browsing or surfing. In its simplest form, a web page has some text, a logo and some pictures, and is hosted on a complex system of network equipment, computers, databases and software that makes sure you get the page you asked for 90% of the time or more. One page view can generate anywhere from 4 to 40 requests to fetch text, stylesheets, templates, logos, pictures and so on. Each request touches at least three electronic devices, and each device makes one line of note saying that you asked for something and either got it or did not.

The Multipliers

In its simplest form, one page view generates 4 requests × 3 devices, that is, at least 12 lines of notes (also known as logs) on a simple but professional system. If you add analytics, caching, reporting and tracking, each added feature multiplies that 4, and you can see this adding up rather quickly.
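The arithmetic can be sketched in a few lines. The request range (4 to 40) and the three-device minimum come from the text above; the function name is made up for illustration, and one log line per device per request is the simplifying assumption.

```python
def log_lines_per_page_view(requests, devices, lines_per_device=1):
    """Log lines written for one page view.

    Assumes every request touches every device and each device
    writes `lines_per_device` lines per request.
    """
    return requests * devices * lines_per_device


DEVICES = 3  # minimum number of devices a request touches, per the text

# Simplest page (4 requests) versus a heavier one (40 requests).
for requests in (4, 40):
    lines = log_lines_per_page_view(requests, DEVICES)
    print(f"{requests} requests -> {lines} log lines")
```

So a single page view leaves between 12 and 120 log lines behind before any extra features are counted.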

Consider that Google alone handles over 100 billion searches per month, and that each search carries at least 27 requests (on Chrome, per the network monitor on my computer). That is 2.7 trillion requests, and at the minimum of three log lines per request, over eight trillion (8,100 billion) lines of logs per month. And that assumes no other shady tracking.
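Since the total depends heavily on how many devices log each request, here is a small sketch that works it out for one, two and three devices. The search and request counts are the ones quoted above; the function name is hypothetical, and one log line per device per request is assumed.

```python
SEARCHES_PER_MONTH = 100_000_000_000  # "over 100 billion", per the post
REQUESTS_PER_SEARCH = 27              # as observed by the author on Chrome


def monthly_log_lines(devices_per_request):
    """Log lines per month if each device writes one line per request."""
    return SEARCHES_PER_MONTH * REQUESTS_PER_SEARCH * devices_per_request


monthly_requests = SEARCHES_PER_MONTH * REQUESTS_PER_SEARCH
print(f"Requests per month: {monthly_requests / 1e12:.1f} trillion")

for devices in (1, 2, 3):
    lines = monthly_log_lines(devices)
    print(f"{devices} device(s): {lines / 1e12:.1f} trillion log lines")
```

This comes to 2.7 trillion requests a month, and 2.7, 5.4 or 8.1 trillion log lines depending on whether one, two or three devices record each request.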

The Shadows

Many internet sites install trackers on your computer that watch where you go and what you do, and build profiles of where you live, how much you earn and where you spend money. This adds requests to every click and scroll of your browsing.

The Multipliers

I don't want to spook you, and that is a separate discussion anyway.

Bottom line

Big data is created intentionally or systemically. Call it machine data, human-generated data or behavioral data; it is the digital footprint we leave just by being online. I wonder what the carbon footprint of all this is, as the bytes are stored for all eternity.
