Jul 29, 2020
Qualcomm products mentioned within this post are offered by Qualcomm Technologies, Inc. and/or its subsidiaries.
Neural networks are fast becoming an essential tool for today’s machine learning (ML) practitioners. In some of our previous blogs on ML, we discussed how a simple, feed-forward neural network works, how to run a model on Qualcomm Snapdragon for inference, and how the Snapdragon mobile platforms, along with our Qualcomm Neural Processing SDK for artificial intelligence (AI), are well positioned for MLOps pipelines.
In this blog post, we’ll dive a bit deeper into neural networks by looking at some specific topologies and layers. Many articles about neural networks, such as this one from Towards Data Science, show just how many variations are out there, and there are sure to be more as neural networks continue to evolve. To gain perspective on how these variations solve different problems, we’ll look at three specific types of neural networks: Convolutional Neural Networks, Generative Adversarial Networks, and Recurrent Neural Networks. We’ll then discuss the types of neural networks supported by the popular ML framework TensorFlow, and how ML models generated with TensorFlow can be imported with the Qualcomm Neural Processing SDK and run on Snapdragon platforms for inference on the edge.
Convolutional neural networks
Neural networks, as their name implies, loosely model how human neurons allow us to think and perceive. So in the quest to make computers “see” more like humans, Convolutional Neural Networks (CNNs) were born, based on the idea that individual neurons in our visual cortex respond to specific regions of the visual field. Thus CNNs are commonly used for image recognition and classification.
A CNN uses a number of different types of neural network layers to detect features in images. As depicted in the figure below, an input image is fed through a series of convolution layers. Each convolution layer uses a filter called a kernel to reduce the dimensionality while extracting image features (e.g., edges):
The filter is merely a set of weights that are multiplied element by element against a small patch of the input and then summed, producing a single value that indicates how strongly that patch of pixels matches a feature. This operation is repeated from left to right and top to bottom over the image, and the resulting grid of values is known as the convolved feature or feature map, as seen in the figure below:
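To make the sliding-filter idea concrete, here is a minimal sketch in plain Python. The function name `convolve2d`, the 4x4 sample image, and the 2x2 vertical-edge kernel are all assumptions chosen for illustration; the sketch uses no padding and a stride of 1, and (like most ML frameworks) it does not flip the kernel.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for r in range(out_rows):              # top to bottom
        row = []
        for c in range(out_cols):          # left to right
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    # Multiply each weight by the pixel under it, then sum.
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical kernel that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, vertical_edge)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map is smaller than the input (3x3 instead of 4x4), and its only non-zero column sits exactly where the edge is.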
A pooling layer is then used to further reduce dimensionality by downsampling the feature map, that is, by reducing its size. Subsequent convolution and pooling layers are often employed to capture increasingly higher-level features and develop the necessary understanding of what the image contains. The final feature map is then flattened and fed into a group of fully connected layers to derive the final classification.
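The downsampling and flattening steps can be sketched in the same way. Max pooling with a 2x2 window is assumed here because it is one common choice; the text above does not say which pooling method is used, and the names `max_pool2d` and `flatten` and the sample values are hypothetical.

```python
def max_pool2d(feature_map, size=2):
    """Downsample by keeping the largest value in each non-overlapping size x size window."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        row = []
        for c in range(0, cols - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


def flatten(feature_map):
    """Turn a 2-D feature map into the 1-D vector a fully connected layer expects."""
    return [value for row in feature_map for value in row]


# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
pooled = max_pool2d(feature_map)
flat = flatten(pooled)
print(pooled)  # [[4, 2], [2, 8]]
print(flat)    # [4, 2, 2, 8]
```

Each 2x2 window collapses to one value, so the 4x4 map shrinks to 2x2 before being flattened into a four-element vector.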
Generative adversarial networks
If one neural network can be effective, then why not add a second? This was the thinking behind the next neural network architecture in our discussion, the Generative Adversarial Network (GAN). A GAN consists of two neural networks that feed into each other during training, often in the form of a zero-sum game. A GAN builds both the capability to generate realistic data and the ability to classify whether such data is real or generated, as shown in the figure below:
The first neural network in a GAN, typically a deconvolutional neural network, is called the generative or generator network. It learns how to build increasingly realistic data, starting from just random noise. The other neural network in the GAN, typically a CNN, is known as the discriminative or discriminator network. It continually improves its ability to classify its input as being real or fake (i.e., whether the data was created by the generative network or is in fact real data).
The discriminator trains using both the generated data and real-world sample data to make these classifications. The classification from each iteration is fed back into the generative network, and that network uses it to build its next, and hopefully improved, output. The GAN’s training continues until the discriminator misclassifies images as real or fake around 50% of the time, at which point it is doing no better than guessing. Some ML practitioners may train to other levels of accuracy depending on their needs.
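The feedback between the two networks is usually expressed through loss functions. The sketch below assumes the binary cross-entropy losses commonly used for GANs, which this post does not specify, and it works on made-up discriminator scores (probabilities that a sample is real) rather than on actual networks. All function names and values are hypothetical.

```python
import math


def binary_cross_entropy(prediction, label):
    """Loss for one prediction in (0, 1) against a label of 0 (fake) or 1 (real)."""
    eps = 1e-12  # keep log() away from 0
    p = min(max(prediction, eps), 1 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))


def discriminator_loss(real_scores, fake_scores):
    """The discriminator wants real samples scored near 1 and generated ones near 0."""
    real_term = sum(binary_cross_entropy(s, 1) for s in real_scores) / len(real_scores)
    fake_term = sum(binary_cross_entropy(s, 0) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term


def generator_loss(fake_scores):
    """The generator wants its samples to be scored as real (near 1)."""
    return sum(binary_cross_entropy(s, 1) for s in fake_scores) / len(fake_scores)


def discriminator_accuracy(real_scores, fake_scores, threshold=0.5):
    """Fraction of samples the discriminator labels correctly."""
    correct = sum(s > threshold for s in real_scores)
    correct += sum(s <= threshold for s in fake_scores)
    return correct / (len(real_scores) + len(fake_scores))


# Hypothetical scores early in training: generated data is easy to spot.
early_real, early_fake = [0.9, 0.8], [0.1, 0.2]
# Hypothetical scores late in training: real and generated data look alike.
late_real, late_fake = [0.6, 0.4], [0.6, 0.4]

early_accuracy = discriminator_accuracy(early_real, early_fake)
late_accuracy = discriminator_accuracy(late_real, late_fake)
print(early_accuracy, late_accuracy)  # 1.0 0.5
print(generator_loss(early_fake) > generator_loss(late_fake))  # True
```

In this toy data the generator's loss falls as the discriminator's accuracy drops toward 50%, which mirrors the stopping condition described above.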
The final trained model consists of a generator that can generate ultra-realistic data, and a discriminator trained to distinguish between real-world and artificially generated data. Depending on what functionality is required, one or both of these fully trained neural networks from the GAN may be retained and used for inference. Given this functionality, GANs are used for a wide range of applications including generative art, upscaling images, and various scientific visualizations.
Recurrent neural networks
Recurrent Neural Networks (RNNs) are another interesting type of neural network, because they attempt to mimic how our thoughts and memory give us a way to persist information and build upon it to learn new information. They do this by including looping functionality, hence the name recurrent, in conjunction with the storage of state information.
RNNs operate on the assumption that the elements of the input have some relationship to each other. For example, an RNN can be used to try to predict the next word a user might type in a sentence.
The RNN repeatedly applies transformations to a series of inputs to produce a series of outputs. The RNN also produces a hidden state vector, which serves as a sort of memory containing data/state information to use in a subsequent calculation on the input series. In other words, the state vector contains information about things learnt from a previous iteration. This process is illustrated in the figure below:
Here, the RNN trains on a series of input vector(s) x_n. In addition to the input vector(s), the neural network uses a hidden state vector h_n with information from the previous iteration on the input x_(n-1). The neural network then produces output y_n and creates a new hidden state vector h_(n+1) that is carried forward to the next iteration. Through the iterations, weights are applied to the input, state vector, and output as W_x, W_h, and W_y respectively.
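The iteration described above can be sketched as a single step function that is applied repeatedly. The sketch assumes a tanh activation, omits bias terms, and uses small hand-picked weight matrices; none of these details come from the figure, and the names `rnn_step` and `run_rnn` are hypothetical.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rnn_step(x_n, h_n, W_x, W_h, W_y):
    """One iteration: combine input x_n with state h_n, return (y_n, next state)."""
    pre_activation = [a + b for a, b in zip(mat_vec(W_x, x_n), mat_vec(W_h, h_n))]
    h_next = [math.tanh(p) for p in pre_activation]  # assumed activation
    y_n = mat_vec(W_y, h_next)
    return y_n, h_next


def run_rnn(inputs, h_0, W_x, W_h, W_y):
    """Apply rnn_step across the whole series, carrying the state forward."""
    h = h_0
    outputs = []
    for x_n in inputs:
        y_n, h = rnn_step(x_n, h, W_x, W_h, W_y)
        outputs.append(y_n)
    return outputs, h


# Hypothetical weights: 2 inputs, 2 hidden units, 1 output.
W_x = [[0.5, -0.3], [0.1, 0.8]]
W_h = [[0.2, 0.0], [0.0, 0.2]]
W_y = [[1.0, -1.0]]
inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

outputs, final_state = run_rnn(inputs, [0.0, 0.0], W_x, W_h, W_y)
# The same input seen without any history gives a different output,
# which shows the hidden state acting as memory.
no_history_output, _ = rnn_step(inputs[2], [0.0, 0.0], W_x, W_h, W_y)
print(outputs[2], no_history_output)
```

The same three weight matrices are reused at every step; only the hidden state changes from one iteration to the next.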
The RNN topology illustrated above is just a basic example, and there are many variations of RNNs, such as Long Short-Term Memory (LSTM) networks, that can be used to solve a variety of problems. For a list of RNN variations, see the Architectures section of the Wikipedia RNN topic.
Qualcomm Neural Processing SDK facilities for TensorFlow models
So far we’ve looked at just a few types of neural network topologies, but there are literally dozens of topologies and layer types in use these days. And when it comes to using them in practice, a rich ML framework like TensorFlow allows developers to start building them quickly and efficiently. Thus, part of TensorFlow’s popularity has no doubt come from its support for an ever-increasing array of neural networks. Developers will therefore be pleased to know the Qualcomm Neural Processing SDK allows TensorFlow model exports to run hardware-accelerated inference on the edge with devices based on Snapdragon platforms.
The Qualcomm Neural Processing SDK allows developers to convert TensorFlow models into the Deep Learning Container (.dlc) format for execution on the Snapdragon mobile platform’s Qualcomm Kryo CPU, Qualcomm Adreno GPU, and Qualcomm Hexagon DSP. Our Neural Processing SDK supports a rich set of TensorFlow neural networks and layers, as listed on the SDK’s Supported Network Layers page. The page also lists the Snapdragon hardware resource(s) (CPU, GPU, or DSP) that provide hardware-accelerated support for running the model.
For TensorFlow operations not directly supported by the SDK, the SDK allows developers to create User-Defined Operations (UDOs). A UDO is a plugin package consisting of a dynamic library that registers itself and contains a compiled model that can run on the Snapdragon CPU, GPU, or DSP.
Constructing a UDO involves the definition of a UDO Config File that specifies the package’s input, output, and target hardware cores. This config file is used both to create the UDO package itself, and to identify the operations in the model that need to be expressed as UDOs when converting a TensorFlow model to .dlc:
Using the same configuration file for both steps ensures that the UDO package and the converted model are maintained in lockstep. At the end of the pipeline, both the UDO package and the model containing UDO operations are loaded onto the Snapdragon-based device to run hardware-accelerated inference on the edge.
There are many types of neural network topologies and layers in use today, and frameworks like TensorFlow are making them easier to implement. Furthermore, our Snapdragon mobile platforms, in conjunction with our Qualcomm Neural Processing SDK, provide the tools and processing resources for hardware-accelerated inference of TensorFlow neural networks and models on the edge.
If you’re interested in diving deeper into the SDK, be sure to visit our Neural Processing SDK Learning Resources. And to see the SDK in action, be sure to check out our projects page where you can select Neural Processing SDK for AI under the Software Tools filter.