In a recent blog post (Big Data…Marketing Magic or the Real Deal) we asked whether the ‘Big Data’ movement actually meant anything new for data analytics. Our conclusion was that there is a real change going on in the tools and techniques that we use to access and manipulate really large datasets. I left open, however, the question of whether, as data miners, we need to develop new functions and algorithms to analyse all of this fabulous data, or whether we can simply apply our tried and tested techniques, such as Decision Trees, Neural Networks, Clustering, Association Analysis and Regression, to these data sources. This remains an open question, but there have been some really interesting developments in an area known as deep learning.
Deep learning refers to a relatively recently developed set of generative machine learning techniques that autonomously generate high-level representations from raw data sources and, using these representations, can perform typical machine learning tasks such as classification, regression and clustering. Many of the most important deep learning techniques are extensions of neural network methods, and a simple way to understand them is to think of multiple layers of neural networks linked together. Raw data is fed into the first layer, whose output is a set of high-level features. These are passed to a further layer, which in turn generates a set of higher-level features. This continues for a number of layers until eventually an output (for example, a prediction) is produced.
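To make the layered idea a little more concrete, here is a minimal sketch in Python using NumPy. It is an illustration only, not any particular deep learning method: the layer sizes, the sigmoid activation and the random (untrained) weights are all assumptions made for the example, and the helper names `init_layers` and `forward` are hypothetical. It simply shows raw input flowing through a stack of layers, with each layer's output becoming the next layer's input.

```python
import numpy as np


def sigmoid(x):
    """Squash values into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def init_layers(sizes, seed=0):
    """Create one (weights, bias) pair per layer.

    The weights are random here (an assumption for illustration);
    a real deep learning method would learn them from data.
    """
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(x, layers):
    """Pass x through each layer in turn, keeping every intermediate output."""
    activations = [x]
    for weights, bias in layers:
        x = sigmoid(x @ weights + bias)  # this layer's features feed the next layer
        activations.append(x)
    return activations


# Hypothetical sizes: 64 raw inputs (say, an 8x8 image), two hidden layers, 3 outputs.
sizes = [64, 32, 16, 3]
layers = init_layers(sizes)
raw_pixels = np.random.default_rng(1).random(64)  # stand-in for raw pixel values

activations = forward(raw_pixels, layers)
for depth, values in enumerate(activations):
    print(f"layer {depth}: {values.shape[0]} values")
```

With untrained weights the intermediate values carry no meaning. In a trained network they would be the progressively higher-level features described above.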
The image below shows a simplified illustration of this, where a stack of neural networks is used to classify images. While the data presented to the network would be raw pixel values, internally the network would generate much higher-level features. For example, there might be a node in the network that responds to the presence of diagonal lines in an image or, at an even higher level, to the presence of faces (this is reigniting very interesting old discussions about the idea of the grandmother cell!).