Some notes on Big Data

I'm fascinated with this concept, but it feels like it's so far from my experience that it will be difficult to really understand and use.

However, I am going to try to make sense of it.

I just watched a video on YouTube, http://www.npr.org/templates/reg/login.php?returnUrl=http://www.npr.org/blogs/13.7/2013/03/10/173960533/explaining-big-data
Fascinating 8-minute video: most companies create tons of data that they can't use but should be using (or someone should be). He called it "data exhaust". Most data captured by loyalty cards has not been utilized in any way.

Hadoop is the leading big data technology (what the heck is it!). Open source software for reliable, scalable, distributed computing (from the Hadoop website). It's a platform for big data analytics. What's data analytics? As far as I can tell it's the collection and analysis of data (usually enormous quantities of it) to make predictions. Hadoop consists of two components: the Hadoop Distributed File System (HDFS), which allows for huge storage, and MapReduce, a data processing framework. MapReduce maps large data sets across multiple servers, each of which creates a summary of its share of the data. All the summary info is then aggregated in a reduce stage.
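
To make the map and reduce idea concrete for myself, here is a small sketch in plain Python. It is not Hadoop and does not use Hadoop's API. It runs in a single process and only imitates the flow described above: each chunk of text stands in for the share of data one server would get, the map step produces a summary per chunk, and the reduce step aggregates the summaries. The function names (map_phase, shuffle, reduce_phase, word_count) and the sample text are my own, made up for illustration.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map step: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group all values that share a key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(chunks):
    """Run map on every chunk, group by word, then reduce each group."""
    mapped = []
    for chunk in chunks:  # on a real cluster, each chunk would go to a different server
        mapped.extend(map_phase(chunk))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample data: two "chunks" standing in for data spread over two servers.
chunks = ["big data big plans", "data exhaust is data"]
counts = word_count(chunks)
print(counts)
```

Counting words is the usual beginner example for this pattern. The point is only that the map step can run on each chunk independently, which is what lets the work be spread across many machines.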

An article in a German newspaper (the article itself is in English) with a caution about big data: http://www.faz.net/aktuell/feuilleton/modelle-die-sich-schlecht-benehmen/f-a-z-column-by-emanuel-derman-little-big-data-12103958.html
