OnQ Blog

Build a seeing, hearing robotic arm with the Qualcomm Robotics RB3 platform [video]

Developers going beyond robotics

Dec 13, 2019

Qualcomm products mentioned within this post are offered by Qualcomm Technologies, Inc. and/or its subsidiaries.

Remember when I posted that our Qualcomm Robotics RB3 Platform wasn’t very pretty, but it was very smart? Sahaj Sarup, application engineer at 96Boards, proved that by building a seeing, hearing robotic arm around the RB3. This post from Sahaj recaps noteworthy steps on his software and hardware path to getting a 6-degrees-of-freedom (6DOF) arm to work with voice and computer vision (CV). Let’s hear from Sahaj...

What’s the best part of building a robotic arm? Going beyond robotics to make an arm that can also see and hear.

My goal was to build a demonstrative use case for the Qualcomm Robotics RB3 Development Kit. I chose to build a robotic arm, then I added OpenCV so that it could recognize objects and speech detection so that it could process voice instructions. To get 6DOF, I connected the six servomotors in a LewanSoul Robotic Arm Kit first to an Arduino board, then later to an I2C-based servo controller.

Here’s an overview of how you can do it.

Vision and machine learning for a robot

First, I needed the arm to recognize shapes, detect colors and determine its own position relative to an object, then pick up the object. That meant computer vision and machine learning. Also, I wanted to use a multi-threading library to spread those workloads across multiple CPU cores.

The Qualcomm Robotics RB3 kit is built around the DragonBoard™ 845c development board, based on the Qualcomm SDA845 SoC and compliant with the 96Boards open hardware specification. I looked at what it would take to implement OpenCV on the Qualcomm Robotics RB3 kit. As an open-source computer vision software library that is often used to provide visual inference for machine learning applications, OpenCV makes it easy to modify the code so the RB3 can see and infer. Fortunately, between Debian Buster running on the board’s close-to-mainline Linux kernel and the straightforward OpenCV installation, there’s almost nothing you need to modify.

But I found that OpenMP, my preferred library for multi-threading, wouldn’t work with the combination of OpenCV 4, Python 2 and 3, and Arm64. So I chose OpenCV version 3.2 instead, and managed to get all but one of the CPU cores running at 50 percent utilization or lower.

Detecting shapes and colors

Want to see my OpenCV code for tracking objects and detecting shapes and colors? A detect_shape function maps out edges and estimates the number of vertices; based on that, the shape is either a triangle, square/rectangle, pentagon or circle. A detect_hsv function allows the RB3 to detect color by separating the HSV color space. (It also gives the x- and y-coordinates for each object detected, which will help guide the robotic arm.) And an overlay function positions the data returned by detect_hsv over the frame as text, as shown in the image.
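To make the vertex-counting and hue-separation ideas concrete, here is a minimal sketch in plain Python. It is not the project's detect_shape or detect_hsv code: the function names, the hue bands, the saturation and brightness floors, and the rule that anything with more than five vertices counts as a circle are all assumptions for illustration. It also assumes the vertex list has already been produced by a contour-approximation step in OpenCV.

```python
import colorsys

# Hypothetical hue bands in degrees (0-360). Real thresholds need tuning
# for the camera and lighting; these values are illustrative only.
HUE_BANDS = {
    "red": [(0, 15), (345, 360)],
    "yellow": [(45, 75)],
    "green": [(90, 150)],
    "blue": [(200, 260)],
}


def classify_shape(vertex_count):
    """Name a shape from the vertex count of its approximated outline."""
    if vertex_count == 3:
        return "triangle"
    if vertex_count == 4:
        return "rectangle"  # squares and rectangles are treated alike
    if vertex_count == 5:
        return "pentagon"
    if vertex_count > 5:
        return "circle"  # assumption: many short edges approximate a curve
    return "unknown"


def classify_color(r, g, b, min_saturation=0.4, min_value=0.3):
    """Name a color from an 8-bit RGB sample by looking at its hue."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    if s < min_saturation or v < min_value:
        return "unknown"  # too gray or too dark to trust the hue
    hue_degrees = h * 360.0
    for name, bands in HUE_BANDS.items():
        for low, high in bands:
            if low <= hue_degrees <= high:
                return name
    return "unknown"


def vertex_average(points):
    """Average the outline vertices to get a rough x, y position."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return sum(xs) / len(xs), sum(ys) / len(ys)


outline = [(0, 0), (4, 0), (4, 2), (0, 2)]
print(classify_shape(len(outline)), classify_color(0, 0, 255), vertex_average(outline))
```

One detail to watch when moving this idea into OpenCV: for 8-bit images OpenCV stores hue in the range 0 to 179, so degree-based bands like the ones above would need to be halved.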

(Like it or not, OpenCV defaults to 640 x 480 pixels on most webcams. If you’re a glutton for punishment, you’re welcome to try implementing 1080p for corner-case advantages provided by high-resolution frames. I still like to use such high-resolution cameras because most of them come with useful features such as fast auto-focus, automatic white balance and color correction. I’ve posted notes and workable code blocks for a 1920 x 1080-pixel video stream at 30 frames per second, along with options you can use in imutils and GStreamer. Knock yourself out.)
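For readers who want a feel for what a 1080p capture setup can look like, the sketch below only builds a GStreamer pipeline description as a string. It is not taken from the posted notes: the helper name, the /dev/video0 device path and the choice of a raw (uncompressed) stream are assumptions, and many webcams deliver 1080p at 30 frames per second only as MJPEG, which would need a different pipeline.

```python
def gst_capture_pipeline(device="/dev/video0", width=1920, height=1080, fps=30):
    """Build a GStreamer pipeline description for a V4L2 camera.

    Hypothetical helper: the device path and raw-video caps are
    assumptions and may not match a given camera.
    """
    return (
        f"v4l2src device={device} ! "
        f"video/x-raw,width={width},height={height},framerate={fps}/1 ! "
        "videoconvert ! video/x-raw,format=BGR ! appsink"
    )


print(gst_capture_pipeline())
```

A string like this can be handed to cv2.VideoCapture together with the cv2.CAP_GSTREAMER backend flag, but only when the OpenCV build includes GStreamer support.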

Speech recognition meets computer vision on the RB3

Finally, I wanted to activate the Qualcomm Robotics RB3 kit with an oral command like “Hey, July” (or “Hey, Dum-E,” if you’re into that whole Tony Stark thing). Next, I would give it another oral command like “Pick up the blue rectangle.” It would then run speech recognition on the desired action (pick up), along with the color (blue) and shape (rectangle) of the intended object.

I chose a simple language processor to diff voice input against stored lists. Also, I wanted a speech detector script that uses Google’s Web Speech API to identify words spoken by the user.

It turns out that you need to import a few libraries to make this all work:

  • JSON: Lists have to be serialized to JSON strings before they can be shared over memcached, because memcached can handle string values only.
  • pymemcached: A Python front end for caching and sharing data through memcached
  • speech_recognition: A collection of speech recognition libraries
  • difflib: Mostly for diffing strings but used here for basic language processing

The Qualcomm Robotics RB3 kit runs the main Python script, the script for shape detection, and memcached, all separately from one another. That allows the speech detector script to see the x- and y-coordinates of all the objects detected by the OpenCV script. And JSON is there to convert lists to strings for memcached and back to lists on the other side.
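The hand-off between the two scripts can be sketched as follows. To keep the example runnable without a memcached server, it uses a small in-memory class as a stand-in for a real client; that class, the "objects" key, the dictionary layout of each detected object and the function names are all assumptions for illustration, not the project's actual data format.

```python
import json


class FakeCache:
    """In-memory stand-in for a memcached client with set() and get()."""

    def __init__(self):
        self._store = {}

    def set(self, key, value):
        self._store[key] = value

    def get(self, key):
        return self._store.get(key)


def publish_objects(cache, objects, key="objects"):
    """Vision side: serialize the detected objects and store the string."""
    cache.set(key, json.dumps(objects))


def read_objects(cache, key="objects"):
    """Speech side: fetch the string and turn it back into a list."""
    raw = cache.get(key)
    if raw is None:
        return []
    if isinstance(raw, bytes):  # some clients hand back bytes, not str
        raw = raw.decode("utf-8")
    return json.loads(raw)


def find_target(objects, color, shape):
    """Return the x, y position of the first object matching the request."""
    for obj in objects:
        if obj["color"] == color and obj["shape"] == shape:
            return obj["x"], obj["y"]
    return None


cache = FakeCache()
publish_objects(cache, [
    {"color": "red", "shape": "triangle", "x": 120, "y": 88},
    {"color": "blue", "shape": "rectangle", "x": 412, "y": 230},
])
print(find_target(read_objects(cache), "blue", "rectangle"))
```

Because the two sides only agree on a key and a JSON layout, either script can be restarted or replaced without touching the other, which is the main attraction of sharing state through a cache.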

Okay, but what does it do?

As shown in the video, the camera detects and classifies objects placed on a table. Next, I issue a voice command to pick up one of the objects. Then, the robotic arm works with the camera to track and pick up the object.

Sure, any two-year-old child could do that. But no two-year-old child could pull together such a neat hack. The project is a good starting point for additional functions, an ample source of prototype code and a good initiation to the RB3 and the 96Boards ecosystem.

Your turn

Ready to build your own robotic arm with voice and vision? I’ve published an overview post describing the hardware and software (and wetware) that went into the robotic arm, including the bill of materials and project objectives. You’ll also find a series of detailed posts with code you can examine and explanations of the design choices I made.

Want to one-up me? Try porting Robot Operating System (ROS) to the Qualcomm Robotics RB3 kit or switching from OpenCV on CPU to TensorFlow on the Qualcomm® Hexagon™ DSP.

Send me questions and let me know how your project is going!

Qualcomm Hexagon, Qualcomm Robotics, and Qualcomm SDA845 are products of Qualcomm Technologies, Inc. and/or its subsidiaries.


Opinions expressed in the content posted here are the personal opinions of the original authors, and do not necessarily reflect those of Qualcomm Incorporated or its subsidiaries ("Qualcomm"). The content is provided for informational purposes only and is not meant to be an endorsement or representation by Qualcomm or any other party. This site may also provide links or references to non-Qualcomm sites and resources. Qualcomm makes no representations, warranties, or other commitments whatsoever about any non-Qualcomm sites or third-party resources that may be referenced, accessible from, or linked to this site.

Dev Singh

Senior Director of Business Development, Qualcomm Technologies
