
Explore GenAI applications on TensorOpera AI Platform powered by Qualcomm Cloud AI 100 Accelerators

Introduction

TensorOpera and Qualcomm Technologies announce the next step in their strategic collaboration: a public SDXL (Stable Diffusion XL) endpoint served on the TensorOpera AI Platform and powered by Qualcomm Cloud AI 100. The endpoint lets AI developers build, deploy, and scale generative AI applications with improved performance and cost efficiency. For enterprises, the collaboration eases the challenges of developing their own generative AI applications. Together, we provide a comprehensive platform that simplifies generative AI development and access to advanced AI hardware.

How to try the public SDXL endpoint on Qualcomm Cloud AI 100?

Video: TensorOpera (Aug 28, 2024, 2:35)

  1. Sign up on TensorOpera: https://tensoropera.ai/home
  2. Launch the TensorOpera AI Platform and navigate to the Model Marketplace
  3. Pick Qualcomm-SDXL in the Model Marketplace
  4. Try the public endpoint in the Playground
  5. Integrate the endpoint into your application using the OpenAI standard format shown under API
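
As a rough illustration of step 5, the sketch below builds an OpenAI-style image generation request using only the Python standard library. The base URL, model name, and the `/images/generations` path are assumptions borrowed from the OpenAI Images API, not values confirmed by this post; copy the real URL, model identifier, and API key from the API section of your endpoint.

```python
import json
import urllib.request

# Hypothetical placeholders: replace with the values shown under API for your endpoint.
BASE_URL = "https://example.invalid/v1"
MODEL = "qualcomm-sdxl"


def build_image_request(prompt, api_key, base_url=BASE_URL, model=MODEL,
                        n=1, size="1024x1024"):
    """Build (but do not send) an OpenAI-style image generation request."""
    body = json.dumps(
        {"model": model, "prompt": prompt, "n": n, "size": size}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/images/generations",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate_image(prompt, api_key):
    """Send the request and return the decoded JSON response."""
    request = build_image_request(prompt, api_key)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (needs a real endpoint URL and key):
# result = generate_image("a lighthouse at dusk, oil painting", "YOUR_API_KEY")
```

Keeping request construction separate from sending makes it easy to inspect the payload before pointing the code at a live endpoint.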

The public endpoint on Qualcomm Cloud AI 100 is priced at $0.00005 per step, a 50% reduction in the price-to-performance ratio compared with SDXL on NVIDIA A100.
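
To make the per-step price concrete, here is a small arithmetic sketch. It assumes a "step" is one inference (denoising) step of an image generation request; the step counts used are illustrative choices, not figures from this post.

```python
from decimal import Decimal

PRICE_PER_STEP = Decimal("0.00005")  # USD per step, as quoted for the public endpoint


def image_cost(steps, images=1):
    """Return the cost in USD of generating `images` images at `steps` steps each."""
    return PRICE_PER_STEP * steps * images


# Illustrative: 30 steps per image is an assumed setting.
print(image_cost(30))         # cost of one image
print(image_cost(30, 1000))   # cost of a thousand images
```

`Decimal` is used so that very small per-step prices do not pick up floating-point rounding noise.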

How to get a scalable dedicated Qualcomm Cloud AI 100 endpoint on TensorOpera?

Video: Qualcomm Cloud AI 100 (Aug 28, 2024, 0:36)

A scalable dedicated endpoint allocates computing resources exclusively to you and can adjust that allocation dynamically based on demand, ensuring consistent performance, control, and cost efficiency.

After creating a TensorOpera account, go to the Model Marketplace, choose Qualcomm-SDXL, and move to the Playground.

  1. Hit Deploy and choose your type of compute:
    1. Dedicated Qualcomm Cloud AI 100
      1. Contact TensorOpera’s team with your use case specifications and requirements to allocate compute nodes to your account. Dedicated pricing for the Qualcomm Cloud AI 100 is $0.20 per accelerator-hour.
    2. Serverless Qualcomm Cloud AI 100
      1. Request early access to this feature by sending a request via this form: https://tensoropera.ai/qualcomm-cloud-ai-100.
  2. Choose the number of replicas. We recommend a minimum of 2 Qualcomm Cloud AI 100 cards per replica.
  3. Decide if you want to auto-scale according to your user demand and pick the criteria and decision window to scale up or down.
  4. Hit Deploy!
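
The replica and auto-scaling choices in steps 2 and 3 can be pictured with a short sketch. This is not TensorOpera's auto-scaler: the function names, the requests-per-second criterion, and the replica limits are all hypothetical. Only the $0.20 per accelerator-hour price and the 2-cards-per-replica recommendation come from the steps above.

```python
import math
from decimal import Decimal

PRICE_PER_ACCELERATOR_HOUR = Decimal("0.20")  # dedicated price quoted above
CARDS_PER_REPLICA = 2                          # recommended minimum quoted above


def desired_replicas(window_qps, target_qps_per_replica,
                     min_replicas=1, max_replicas=8):
    """Average the load seen over the decision window and size replicas to it.

    window_qps: request-rate samples collected during the decision window.
    target_qps_per_replica: assumed load one replica can handle comfortably.
    """
    if not window_qps:
        return min_replicas
    average_qps = sum(window_qps) / len(window_qps)
    needed = math.ceil(average_qps / target_qps_per_replica)
    # Clamp to the configured bounds so scaling never runs away.
    return max(min_replicas, min(max_replicas, needed))


def hourly_cost(replicas, cards_per_replica=CARDS_PER_REPLICA):
    """Hourly cost in USD of a dedicated deployment of the given size."""
    return PRICE_PER_ACCELERATOR_HOUR * replicas * cards_per_replica


replicas = desired_replicas([0.5, 1.5, 2.5], target_qps_per_replica=0.5)
print(replicas, hourly_cost(replicas))
```

Averaging over a window rather than reacting to a single sample is what keeps a short traffic spike from triggering a scale-up; a longer window trades responsiveness for stability.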

Also, if you want to deploy your own customized model, simply follow the documentation to create your own Model Card: https://docs.tensoropera.ai/deploy/create_model.

Why use Qualcomm Cloud AI 100 vs NVIDIA A100?

Price vs. Latency Comparison

| Instance | Latency (seconds) | Price (per card per hour) |
|---|---|---|
| Qualcomm Cloud AI 100 Pro (2 cards) | 4.46 | $0.20 |
| NVIDIA A100 (1 card) | 2.89 | $1.30 |

Qualcomm Cloud AI 100 is a market-leading AI inferencing solution that offers exceptional performance efficiency, density, and cost-effectiveness. In the comparison above, the two-card Qualcomm Cloud AI 100 Pro instance has higher latency than a single NVIDIA A100 (4.46 s versus 2.89 s) but costs far less per hour, which works out to roughly half the price for the same work.
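
The "half the price" figure can be checked against the table with a back-of-the-envelope calculation. The sketch assumes the listed price applies per card, that latency is the time to serve one request, and that each instance serves requests back to back; none of these assumptions is stated explicitly in the post.

```python
def cost_per_request(latency_s, price_per_card_hour, cards):
    """Cost in USD of one request that occupies `cards` cards for `latency_s` seconds."""
    return latency_s / 3600 * price_per_card_hour * cards


# Figures taken from the table above.
qualcomm = cost_per_request(latency_s=4.46, price_per_card_hour=0.20, cards=2)
a100 = cost_per_request(latency_s=2.89, price_per_card_hour=1.30, cards=1)
ratio = qualcomm / a100

print(f"Qualcomm Cloud AI 100 Pro (2 cards): ${qualcomm:.6f} per request")
print(f"NVIDIA A100 (1 card):                ${a100:.6f} per request")
print(f"Cost ratio: {ratio:.2f}")
```

Under these assumptions the ratio comes out slightly below 0.5, which is consistent with the roughly 50% reduction claimed.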

How does the TensorOpera and Qualcomm Technologies Collaboration work?

Explore TensorOpera AI powered by Qualcomm

TensorOpera supports native deployment by monitoring Qualcomm Cloud AI 100 NPU Utils. When deploying models like SDXL and Llama, users select the specific model through the Nexus UI and choose serverless deployment with Qualcomm Cloud AI 100 NPUs. The control plane then monitors the available NPUs and assigns the required number of NPUs to the job. The TensorOpera library then builds the Docker image with the Qualcomm Cloud AI SDK on the NPUs and serves the model with the TensorOpera inference runner. This native integration provides a seamless deployment experience: users can deploy and monitor model endpoints with zero-code effort.
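
The assignment step described above (track which NPUs are free, then hand the required number to a job) can be pictured with a toy model. This is only an illustration of the idea: the `NpuPool` class and its methods are hypothetical and do not reflect TensorOpera's actual control plane or any Qualcomm Cloud AI SDK interface.

```python
class NpuPool:
    """Toy model of a control plane tracking free and assigned NPUs."""

    def __init__(self, device_ids):
        self.free = set(device_ids)
        self.assigned = {}  # job id -> list of device ids

    def allocate(self, job_id, count):
        """Assign `count` free NPUs to the job, or return None if too few are free."""
        if count > len(self.free):
            return None
        picked = sorted(self.free)[:count]
        self.free.difference_update(picked)
        self.assigned[job_id] = picked
        return picked

    def release(self, job_id):
        """Return a finished job's NPUs to the free pool."""
        self.free.update(self.assigned.pop(job_id, []))


pool = NpuPool(range(4))
first = pool.allocate("sdxl-endpoint", 2)    # gets two of the four devices
second = pool.allocate("llama-endpoint", 3)  # not enough free devices yet
pool.release("sdxl-endpoint")
third = pool.allocate("llama-endpoint", 3)   # succeeds after the release
print(first, second, third)
```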

Next Steps

So now it's your turn to access the public SDXL endpoints and deploy your own SDXL endpoint.

From today, you can experiment in the Playground and integrate the API into your GenAI application by following the steps above. Don’t have access to Qualcomm Cloud AI 100? Make sure to fill out the request form.

If you would like to learn more about TensorOpera, check out their website and make sure to follow TensorOpera on LinkedIn or X.


About the Author
Parmeet Kohli, Product Manager, Staff
