MobileNets on Intel® Movidius™ Neural Compute Stick and Raspberry Pi 3

By Ramana Rachakonda, November 15, 2017

Introduction

Deep learning at the edge gives developers across the globe the opportunity to create architectures and devices that solve problems and deliver innovative solutions, such as Google’s Clips camera with Intel’s Movidius VPU inside. An edge device should typically be portable and use little power while providing a scalable architecture for the deep learning neural network. This article showcases one such deep learning edge solution, which pairs the popular Raspberry Pi 3 single-board computer with the Intel® Movidius™ Neural Compute Stick.

A community-contributed Chinese translation of this blog is also available.

With several classification networks available, a scalable family of networks offers out-of-the-box customization for a user’s power, performance, and accuracy requirements. One such family is Google’s MobileNets, which exposes two variables that can be changed to create a custom network suited to the problem of using low power while maintaining high performance and accuracy on these types of devices.

Google's MobileNets Tradeoffs

* Data from Google’s Blog https://research.googleblog.com/2017/06/mobilenets-open-source-models-for.html

The first variable is the size of the input image. As shown in the Google Research Blog article on MobileNets, both complexity and accuracy vary with input size. In the same article, Google also provides pre-trained ImageNet classification checkpoints for various input sizes.
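To get a feel for how input size drives complexity, the following minimal sketch estimates convolution cost relative to a 224x224 input. It assumes that cost scales with the number of input pixels, which is an approximation and not a measurement on the Neural Compute Stick. The list of sizes reflects the resolutions of Google's published checkpoints, and the function name is illustrative only.

```python
# Input resolutions of Google's published MobileNet checkpoints.
INPUT_SIZES = [224, 192, 160, 128]


def relative_conv_cost(size, reference=224):
    """Approximate convolution cost relative to the reference input size.

    Assumption: cost is proportional to the pixel count (size * size).
    """
    return (size / reference) ** 2


for size in INPUT_SIZES:
    cost = relative_conv_cost(size)
    print("{0}x{0}: about {1:.2f}x the cost of 224x224".format(size, cost))
```

Under this approximation, halving the side length of the input cuts the convolution work to roughly a quarter.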

The second variable is the depth multiplier. The structure of the network remains the same, but changing the depth multiplier changes the number of channels in each layer. This changes the complexity of the network, which in turn affects its accuracy and frame rate. In general, the higher the frame rate of a network, the lower its accuracy.
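The effect of the depth multiplier on channel counts can be sketched as follows. The baseline channel list is that of MobileNet v1, and the rounding rule (truncate, with a floor of 8 channels) is an assumption modeled on the TensorFlow-Slim reference implementation; treat the helper as illustrative, not as the code the NCSDK runs.

```python
# Output channels per layer of the baseline MobileNet v1 (depth multiplier 1.0):
# one standard convolution followed by 13 depthwise-separable blocks.
BASE_CHANNELS = [32, 64, 128, 128, 256, 256,
                 512, 512, 512, 512, 512, 512, 1024, 1024]


def scaled_channels(depth_multiplier, min_depth=8):
    """Scale every layer's channel count by the depth multiplier.

    Assumption: counts are truncated to an integer and never drop
    below min_depth.
    """
    return [max(int(c * depth_multiplier), min_depth) for c in BASE_CHANNELS]


for multiplier in (1.0, 0.75, 0.5, 0.25):
    channels = scaled_channels(multiplier)
    print("multiplier {}: first layer {}, last layer {}".format(
        multiplier, channels[0], channels[-1]))
```

The layer count and connectivity stay fixed; only the width of each layer changes, which is why the same benchmarking procedure applies to every variant.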

The steps below walk you through setting up and running the NCSDK, downloading NCAppZoo, and running MobileNet variants on the Intel Movidius Neural Compute Stick. Finally, we demonstrate the benchmarkncs app from the NCAppZoo, which lets you measure the performance of one or more Intel Movidius Neural Compute Sticks attached to an application processor such as the Raspberry Pi 3.

Items Required

Raspberry Pi 3 with power supply and storage (a case is recommended)

Raspberry Pi 3 case
Raspberry Pi 3 Model B
SD card
HDMI monitor or TV for display
Keyboard and mouse
Intel Movidius Neural Compute Stick

Procedure

Step 1: Install the latest Raspberry Pi 3 Raspbian OS

Step 2: Connect the Intel Movidius Neural Compute Stick to Raspberry Pi 3

Step 3: Install Intel Movidius Neural Compute SDK (NCSDK):

Use the following to download and install the NCSDK:

git clone https://github.com/movidius/ncsdk
cd ncsdk
make install
cd ..

Step 4: Clone the NCAppZoo repository

git clone https://github.com/movidius/ncappzoo
cd ncappzoo

Step 5: Use benchmarkncs.py to collect performance data for MobileNets

cd apps/benchmarkncs
./mobilenets_benchmark.sh | grep FPS
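If you would rather capture the results from Python, for example to save them to a file, the pipeline above can be sketched as below. The only assumption about the script's output is the one the `grep` already makes: result lines contain the text "FPS". The function names are hypothetical and are not part of the NCAppZoo.

```python
import subprocess


def filter_fps_lines(output_text):
    """Return the lines that contain 'FPS', mirroring `grep FPS`."""
    return [line for line in output_text.splitlines() if "FPS" in line]


def run_benchmark(script="./mobilenets_benchmark.sh"):
    """Run the benchmark script and return only its FPS lines.

    Call this from the apps/benchmarkncs directory so the relative
    script path resolves.
    """
    result = subprocess.run([script], stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return filter_fps_lines(result.stdout)
```

Calling `run_benchmark()` from apps/benchmarkncs should return the same lines that the shell pipeline prints.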

Results

Given this scalable family of networks, one can choose the network that best meets the required accuracy and performance. The following graph (data from Google’s blog) shows the tradeoffs of accuracy vs. performance for ImageNet classification. The unofficial performance (FPS) on the Intel Movidius Neural Compute Stick is also shown on the same graph.

* Network Accuracy Data from Google’s Blog https://research.googleblog.com/2017/06/mobilenets-open-source-models-for.html

As you can see from the above graph, the higher-end MobileNet with DepthMultiplier=1.0 and an input image size of 224x224, which has a Top-5 accuracy of 89.5%, runs at 9x the speed (FPS) when an Intel Movidius Neural Compute Stick is attached to the Raspberry Pi 3, compared to running it natively on the Raspberry Pi 3 CPU.

Raspberry Pi 3 Perf improvement with NCS

Raspberry Pi has been very successful in bringing a capable platform to the developer community. While it is possible to run inference at a reasonable frame rate on the Raspberry Pi 3, the NCS brings an order of magnitude more performance and makes the platform better suited to running CNN-based neural networks. As we can see from the table, using the Intel Movidius Neural Compute Stick with the Raspberry Pi 3 increases Raspberry Pi 3 inference performance with MobileNets by 718% to 1254%.
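When comparing your own measurements against figures like these, it helps to keep the speedup factor ("9x") and the percentage increase apart. The sketch below shows both conversions; the FPS values are hypothetical placeholders, not results from this article, and the article does not state which convention its table uses.

```python
def speedup(ncs_fps, cpu_fps):
    """Speedup factor: how many times faster the NCS run is."""
    return ncs_fps / cpu_fps


def percent_increase(ncs_fps, cpu_fps):
    """Performance increase over the CPU baseline, in percent."""
    return (ncs_fps / cpu_fps - 1.0) * 100.0


# Hypothetical measurements, for illustration only.
cpu_fps = 2.0
ncs_fps = 18.0

print("speedup: {:.1f}x".format(speedup(ncs_fps, cpu_fps)))
print("increase: {:.0f}%".format(percent_increase(ncs_fps, cpu_fps)))
```

With these placeholder values, the same pair of measurements reads as a 9.0x speedup or an 800% increase, so it is worth stating which of the two you are reporting.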