Creating a microservice using the Qualcomm AI Inference Suite

Extending the idea of using AI for customer sentiment analysis from our last blog post, we show how to create a microservice that does the same task. For this example, we use the Go (Golang) programming language because it compiles to a very small executable that can easily be containerized and scaled across any number of AI inference backends.


Setup

When it comes time to use AI inference in production, a common software architecture pattern is to use a microservice that is purpose built to do one thing and do it well. In this case, we want to create a service that does a specific AI inference task. 

By encapsulating functionality this way, we can more easily scale the microservice to meet demand and abstract the backend doing the actual inference. It also allows us to use any number of backends to do the same job.  We could task the microservice with logging requests, providing observability data, or other typical MLOps work.

Using the Qualcomm AI Inference Suite running on Cirrascale infrastructure (powered by Qualcomm Cloud AI accelerators), I’ll demonstrate how to build a microservice using the Go programming language.

Sample Scenario

Imagine you have been tasked with looking for trends in how customers react to the various product or service offerings of your company.  Company leadership wants to know which products customers are happy with and which might need more work to improve.  The data you have is a set of customer reviews that buyers have written after purchase.

In the past, you might have purchased a customer sentiment analysis package that ingests these feedback records and then presents a series of dashboards for analysis.  AI can do this work without the need for separate software licenses.  The approach we suggest here is to use AI to evaluate customer feedback, and then store the result tied back to the products in question.  Now you can use standard reporting to see where customers were happy, neutral, or negative relative to your products. 

Taking it a step further, you could also use AI to extract the reason(s) why customers rated your products the way they did. Of course, this depends on the review. A review saying, “I loved this product,” doesn’t tell you what the customer liked about it, while a review saying, “The power button is hard to reach,” provides actionable feedback.

What I’ll show here is a simple microservice that takes the text of a customer review as input and returns a sentiment as output.

Steps

  1. Create a simple web service that accepts the text of customer feedback
  2. Parse the request to ensure it is valid and in the correct format
  3. Build our prompt to have AI evaluate it
  4. Call our AI endpoint
  5. Return the result
  6. Test the service by calling it with curl

In this sample, we used only the Go standard libraries and one convenience library (godotenv) to simplify storing our endpoint and API key.  There are other third-party libraries that make creation of microservices easier and more convenient, but for the sake of simplicity we have stuck with the bare minimum.

Sample Run

Let’s first set up our server. Using the code from the GitHub repository, you can run this sample from a terminal window on your PC (assuming Go is already installed) by first adding the godotenv library, tidying the module’s dependencies, and then running the program.

>go get github.com/joho/godotenv
>go mod tidy
>go run .\micro.go

Once it is running, the service listens on port 8080, and we can call it from a different terminal window (the client) using the curl utility. Curl is a command-line tool for making HTTP requests, with options to set headers and a request body. We use curl here to act as the ‘client’ of our microservice running as a ‘server.’

>curl -X POST -H "Content-Type: application/json" -d '{"Feedback":"These running shoes are fantastic."}' http://localhost:8080/

If everything is working correctly, you’ll see this output on the original (server) window.

>go run .\micro.go
These running shoes are fantastic.
positive

As you can see, the service echoes back the feedback text and, on the next line, a one-word evaluation: positive, neutral, or negative. You can try out different feedback by modifying the curl command to see what happens.

Code Notes

This sample consists of just three functions: main, microHandler, and evalFeedback. The main program simply loads environment variables (endpoint and API key) and starts a server listening on port 8080 using Go’s standard net/http library.  
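As a rough illustration of that startup step, here is a minimal sketch using only the standard library. The environment variable names INFERENCE_ENDPOINT and INFERENCE_API_KEY, the config type, and loadConfig are hypothetical and not taken from the sample; the sample itself uses godotenv to populate the environment from a .env file before values like these are read.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"net/http"
	"os"
)

// config holds the two settings the service needs. The field and
// variable names are illustrative, not taken from the sample repo.
type config struct {
	Endpoint string
	APIKey   string
}

// loadConfig reads settings through getenv so it can be tested without
// touching the real process environment.
func loadConfig(getenv func(string) string) (config, error) {
	cfg := config{
		Endpoint: getenv("INFERENCE_ENDPOINT"),
		APIKey:   getenv("INFERENCE_API_KEY"),
	}
	if cfg.Endpoint == "" || cfg.APIKey == "" {
		return config{}, errors.New("INFERENCE_ENDPOINT and INFERENCE_API_KEY must both be set")
	}
	return cfg, nil
}

func main() {
	// With godotenv, a call to godotenv.Load() would come first so that
	// values from a .env file are visible to os.Getenv.
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		return
	}

	// Placeholder handler; the real service would register microHandler.
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	fmt.Println("listening on :8080, forwarding to", cfg.Endpoint)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

Failing fast when a setting is missing keeps a misconfigured container from starting and then rejecting every request.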

microHandler does some rudimentary error checking and then reads in the JSON format of a request and maps it into a Go data structure.  It then extracts the text of customer feedback and calls the third function, evalFeedback.
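A handler along these lines might look like the sketch below. It is not the sample's microHandler: the names newMicroHandler, feedbackRequest, and post, the status codes, and the 1 MB body limit are assumptions made for illustration, and the sentiment is written to the HTTP response rather than to the server's console. The inference call is passed in as a function so the handler can be exercised without a backend.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// feedbackRequest mirrors the JSON body sent by the curl example.
type feedbackRequest struct {
	Feedback string `json:"Feedback"`
}

// newMicroHandler validates the request and hands the feedback text to
// eval, a stand-in for the function that calls the AI endpoint.
func newMicroHandler(eval func(string) (string, error)) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "use POST", http.StatusMethodNotAllowed)
			return
		}
		// Cap the body size so a client cannot send an unbounded payload.
		body := http.MaxBytesReader(w, r.Body, 1<<20)
		var req feedbackRequest
		if err := json.NewDecoder(body).Decode(&req); err != nil {
			http.Error(w, "invalid JSON", http.StatusBadRequest)
			return
		}
		if strings.TrimSpace(req.Feedback) == "" {
			http.Error(w, "Feedback is required", http.StatusBadRequest)
			return
		}
		sentiment, err := eval(req.Feedback)
		if err != nil {
			http.Error(w, "inference failed", http.StatusBadGateway)
			return
		}
		fmt.Fprintln(w, sentiment)
	}
}

// post sends body to h in-process and returns the status and trimmed reply.
func post(h http.Handler, body string) (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/", strings.NewReader(body))
	h.ServeHTTP(rec, req)
	return rec.Code, strings.TrimSpace(rec.Body.String())
}

func main() {
	h := newMicroHandler(func(string) (string, error) { return "positive", nil })
	fmt.Println(post(h, `{"Feedback":"These running shoes are fantastic."}`)) // prints "200 positive"
	fmt.Println(post(h, `not json`))                                          // prints "400 invalid JSON"
}
```

Returning distinct status codes for bad input and backend failures lets clients tell their own mistakes apart from inference problems.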

evalFeedback is where the magic happens.  It creates a JSON payload by building up the correct format to submit to the Qualcomm AI Inference Suite API and then makes the call. The result is extracted and returned to the microHandler function, which then outputs the result.
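To make the request side concrete, here is a hedged sketch of building a payload and posting it. The request shape (a model name plus a list of role/content messages), the Bearer authorization header, the prompt wording, and the model name are all assumptions based on the chat-style response shown later in this post; check the API documentation for the real format. A local httptest server stands in for the inference endpoint so the example runs offline.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// chatMessage and chatRequest model an assumed chat-style request body.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// buildPayload wraps the feedback in a hypothetical classification prompt.
func buildPayload(model, feedback string) ([]byte, error) {
	prompt := "Classify the sentiment of this customer feedback as positive, " +
		"neutral, or negative. Reply with one word.\n\nFeedback: " + feedback
	return json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
}

// callEndpoint posts the payload and returns the raw response body.
func callEndpoint(client *http.Client, url, apiKey string, payload []byte) ([]byte, error) {
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+apiKey) // assumed auth scheme
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("endpoint returned %s", resp.Status)
	}
	return body, nil
}

// runDemo exercises the two functions against a fake local endpoint.
func runDemo() (string, error) {
	fake := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `{"choices":[{"message":{"content":"positive"}}]}`)
	}))
	defer fake.Close()

	payload, err := buildPayload("example-model", "These running shoes are fantastic.")
	if err != nil {
		return "", err
	}
	body, err := callEndpoint(fake.Client(), fake.URL, "placeholder-key", payload)
	return string(body), err
}

func main() {
	fmt.Println(runDemo())
}
```

Keeping payload construction separate from the HTTP call makes it easier to change the prompt or swap the backend without touching the transport code.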

If you aren’t familiar with the Go programming language, the code here may seem a bit convoluted, because you have to define data structures that mirror the expected JSON before you can access the values inside. Use the API documentation to see what JSON is expected and returned for each API endpoint. Here is the code that takes the data returned from the API and extracts the result of our AI call:

    // Define a struct matching the relevant part of the JSON response
    var result struct {
        Choices []struct {
            Message struct {
                Content string `json:"content"`
            } `json:"message"`
        } `json:"choices"`
    }

    // Unmarshal JSON into the struct
    if err := json.Unmarshal([]byte(string(body)), &result); err != nil {
        panic(err)
    }

    // Guard against an empty choices array before indexing into it
    if len(result.Choices) == 0 {
        panic("no choices in API response")
    }

    // Extract the content from the first choice's message
    content := result.Choices[0].Message.Content

    return content
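The fragment above panics on a malformed response, which is fine for a demo but would take down a long-running service. One possible variation, sketched here with the hypothetical names chatResponse and extractSentiment, returns an error instead and normalizes the model's reply to one of the three expected labels. It assumes the response body is available as a byte slice.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// chatResponse matches the part of the response used in this post.
type chatResponse struct {
	Choices []struct {
		Message struct {
			Content string `json:"content"`
		} `json:"message"`
	} `json:"choices"`
}

// extractSentiment decodes the response and returns a normalized label,
// or an error if the response is malformed or the label is unexpected.
func extractSentiment(body []byte) (string, error) {
	var result chatResponse
	if err := json.Unmarshal(body, &result); err != nil {
		return "", fmt.Errorf("decode response: %w", err)
	}
	if len(result.Choices) == 0 {
		return "", errors.New("response contained no choices")
	}
	// Models may add capitalization, whitespace, or punctuation.
	raw := result.Choices[0].Message.Content
	label := strings.ToLower(strings.Trim(raw, " \t\r\n.!\"'"))
	switch label {
	case "positive", "neutral", "negative":
		return label, nil
	}
	return "", fmt.Errorf("unexpected sentiment %q", label)
}

func main() {
	body := []byte(`{"choices":[{"message":{"content":" Positive.\n"}}]}`)
	fmt.Println(extractSentiment(body)) // prints "positive <nil>"
}
```

Rejecting anything outside the three labels keeps free-form model output from leaking into downstream reports.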

Try it out yourself

Using a pattern where the microservice abstracts the call to AI means that you can achieve added flexibility as you build production systems. Benefits include:

  • Changing out your AI model at will without breaking your API contract with clients
  • Adding load balancing across more than one AI backend
  • Observability – logging call, result, and metric data for analysis
  • Billing – tracking client use in a production system
  • Caching – return from the cache instead of incurring more AI inference cost
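As one illustration of the caching benefit, the sketch below wraps an evaluation function with a small in-memory cache keyed on the feedback text. The names evalFunc, cachedEvaluator, and newCachedEvaluator are hypothetical. This is a minimal sketch: the cache is unbounded, has no expiry, and two concurrent requests for the same new text may both reach the backend.

```go
package main

import (
	"fmt"
	"sync"
)

// evalFunc is the shape of a function that asks the AI backend for a sentiment.
type evalFunc func(feedback string) (string, error)

// cachedEvaluator remembers results so repeated feedback text does not
// trigger another inference call.
type cachedEvaluator struct {
	mu    sync.Mutex
	cache map[string]string
	eval  evalFunc
}

func newCachedEvaluator(eval evalFunc) *cachedEvaluator {
	return &cachedEvaluator{cache: make(map[string]string), eval: eval}
}

// Eval returns a cached sentiment when available and calls the backend otherwise.
func (c *cachedEvaluator) Eval(feedback string) (string, error) {
	c.mu.Lock()
	if s, ok := c.cache[feedback]; ok {
		c.mu.Unlock()
		return s, nil
	}
	c.mu.Unlock()

	// The lock is released during the slow backend call.
	s, err := c.eval(feedback)
	if err != nil {
		return "", err // errors are not cached, so a retry can succeed
	}

	c.mu.Lock()
	c.cache[feedback] = s
	c.mu.Unlock()
	return s, nil
}

func main() {
	calls := 0
	c := newCachedEvaluator(func(string) (string, error) {
		calls++
		return "positive", nil
	})
	c.Eval("These running shoes are fantastic.")
	c.Eval("These running shoes are fantastic.")
	fmt.Println("backend calls:", calls) // prints "backend calls: 1"
}
```

Because clients only see the microservice, a cache like this can be added or removed without changing the API contract.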

Creating a microservice to perform a single AI task is a common way to simplify the dependencies of code so that you have flexibility in replacing your model and can scale without affecting client code.  Try out this sample and let us know over on the Qualcomm Cloud AI Discord channel what you’ve created for your own scenarios.

The process of using inference on a scalable platform like the Qualcomm AI Inference Suite is as straightforward as using any other simple API. 

Star this repo to follow updates in the future as we create more code samples.

Join our experts and fellow developers for real-time conversations and support at Qualcomm Developer Discord

Opinions expressed in the content posted here are the personal opinions of the original authors, and do not necessarily reflect those of Qualcomm Incorporated or its subsidiaries ("Qualcomm"). The content is provided for informational purposes only and is not meant to be an endorsement or representation by Qualcomm or any other party. This site may also provide links or references to non-Qualcomm sites and resources. Qualcomm makes no representations, warranties, or other commitments whatsoever about any non-Qualcomm sites or third-party resources that may be referenced, accessible from, or linked to this site.

Qualcomm-branded products are products of Qualcomm Technologies, Inc. and/or its subsidiaries.

About the Author
Ray Stephenson, Developer Relations Lead, Cloud
