Great news: I was accepted to participate in Google Summer of Code 2017 under the mentorship of the CERN-HSF umbrella organization. I will be working on the project "Convolutional Deep Neural Networks on GPUs for Particle Physics Applications". More about this project can be found here.
Convolutional Neural Networks (CNNs) are a special type of deep learning neural networks with enormous discriminative power for image classification. In fact, they significantly outperform the standard computer vision approach of manually extracting image features and then building classifiers on top of them. Although they are an established model in the machine learning community, their potential in high energy physics is still being explored.
Since ROOT is the state-of-the-art data analysis tool developed by CERN and extensively used in high energy physics, it is of paramount importance to integrate such a solution into its submodule TMVA (Toolkit for Multivariate Analysis).
Convolutional Neural Networks
Although there is a vast amount of resources for learning about CNNs, I will give a brief overview in this post. While learning about them myself, I found Stanford's CS231n course "Convolutional Neural Networks for Visual Recognition" very useful, and I highly recommend it.
Convolutional Neural Networks have an architecture of predefined layers. Each layer has its own characteristics and provides interfaces for interaction with the other layers. A CNN typically consists of the following layers: the Convolutional layer (CONV), the Pooling layer (POOL) and the Fully-Connected layer (FC).
We can think of the convolutional and pooling layers as cubes of neurons with a specified width, height and depth. If the network is used for classification, its outputs are the class assignment probabilities of the input. An example of a CNN is given in the figure below.
The CONV layer can be seen as a cube of neurons, where the cube consists of a predefined number of identical rectangular slices with predefined spatial dimensions: width and height. The number of slices defines the depth of the layer. The output value of each neuron is calculated as the convolution of a small region of the input space with a predefined filter. This small region is called the receptive field of the neuron. All neurons with the same spatial coordinates, extending along the depth dimension, see the same region of the input space: they share the same receptive field, but perceive it through different filters. The values inside these filters are the parameters to be learned.
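To make the convolution operation concrete, here is a minimal sketch in plain Python: a single filter slid over one 2D input slice with stride 1 and no padding. The input and filter values are made up for illustration; a real CONV layer would apply a whole bank of learned filters across the full depth of the input.

```python
def conv2d(image, kernel):
    """Valid 2D convolution: each output value is the dot product
    of the kernel with one receptive field of the input."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            # The receptive field is the kh x kw region anchored at (i, j).
            val = sum(image[i + m][j + n] * kernel[m][n]
                      for m in range(kh) for n in range(kw))
            row.append(val)
        out.append(row)
    return out

image = [[1, 0, 2, 1],
         [0, 1, 0, 2],
         [2, 1, 1, 0],
         [1, 0, 2, 1]]
kernel = [[1, 0],
          [0, 1]]  # the filter values are the learnable parameters

print(conv2d(image, kernel))  # 3x3 output: (4 - 2 + 1) per dimension
```

Note how the output shrinks from 4x4 to 3x3: with filter size F, stride 1 and no padding, a dimension of size W produces W - F + 1 outputs.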
Similarly, the POOL layer can also be seen as a cube of neurons, derived from the previous layer but spatially downsampled. In fact, that is the only purpose of this layer: to perform spatial downsampling of the previous layer, which means it contains no parameters to be learned.
The FC layer is a standard layer in which each neuron is connected to every neuron in the previous layer.
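The forward pass of such a layer can be sketched as a matrix-vector product plus a bias, where every output neuron sees every input. The weights and inputs below are invented for illustration; a real FC layer would also apply a nonlinearity to each output.

```python
def fully_connected(inputs, weights, biases):
    """One output per row of `weights`; each output neuron is
    connected to every input neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

inputs = [1.0, 0.5, -1.0]      # flattened activations of the previous layer
weights = [[0.2, -0.4, 0.1],   # 2 output neurons x 3 inputs
           [0.5, 0.3, -0.2]]
biases = [0.1, -0.1]

print(fully_connected(inputs, weights, biases))
```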
The training of a CNN is also carried out with the backpropagation algorithm, although it takes a slightly different form, since the gradients must be propagated through the convolution and downsampling operations.
I hope this summer will be interesting, challenging and productive, and that all Google Summer of Code participants will do a good job and make the open source community stronger and richer.