Explore GenAI applications on TensorOpera AI Platform powered by Qualcomm Cloud AI 100 Accelerators
Introduction
TensorOpera and Qualcomm Technologies announce the next step in their strategic collaboration by showcasing their capabilities through a public SDXL endpoint served on TensorOpera AI Platform powered by Qualcomm Cloud AI 100. This enables AI developers to build, deploy, and scale generative AI applications with improved performance and cost efficiency. For enterprises, this collaboration eases the challenges of developing their own generative AI applications. Together, we provide a comprehensive platform that simplifies the complexities of generative AI development and access to advanced AI hardware.
How to try the public SDXL endpoint on Qualcomm Cloud AI 100?
TensorOpera
Aug 28, 2024 | 2:35

- Sign up on TensorOpera: https://tensoropera.ai/home
- Launch the TensorOpera AI Platform and navigate to the Model Marketplace
- Pick Qualcomm-SDXL in the Model Marketplace
- Try the public endpoint in the Playground
- Integrate the endpoint into your application with OpenAI Standard Format under API
The public endpoint on Qualcomm Cloud AI 100 is priced at $0.00005 / Step, amounting to a 50% reduction in the price-to-performance ratio in comparison to SDXL on Nvidia A100.
How to get a scalable dedicated Qualcomm Cloud AI 100 endpoint on TensorOpera?
Qualcomm Cloud AI 100
Aug 28, 2024 | 0:36

A scalable dedicated endpoint exclusively allocates computing resources with the ability to dynamically adjust resource allocation based on demand, ensuring consistent performance, control, and cost efficiency.
After creating a TensorOpera account, go to the Model Marketplace, choose Qualcomm-SDXL, and move to the Playground.
- Hit Deploy and choose your type of compute:
- Dedicated Qualcomm Cloud AI 100
- Contact TensorOpera’s team with your use case specifications and requirements to allocate compute nodes to your account. Dedicated pricing for the Qualcomm Cloud AI 100 is at $0.20 / accelerator h.
- Serverless Qualcomm Cloud AI 100
- Request early access to this feature by sending a request via this form: https://tensoropera.ai/qualcomm-cloud-ai-100.
- Dedicated Qualcomm Cloud AI 100
- Choose the number of replicas - we recommend a minimum of 2 Qualcomm Cloud AI 100 cards per replica.
- Decide if you want to auto-scale according to your user demand and pick the criteria and decision window to scale up or down.
- Hit Deploy!
Also, if you want to deploy your own customized model, simply follow the documentation to create your own Model Card: https://docs.tensoropera.ai/deploy/create_model.
Why use Qualcomm Cloud AI 100 vs NVIDIA A100?
|
Price <> Latency Comparison |
INSTANCE |
LATENCY (seconds) |
PRICE (card per hour) |
|
Qualcomm Cloud AI 100 Pro (2 cards) |
4.46s |
$0.2/hr |
|
|
NVIDIA A100 (1 card) |
2.89s |
$1.30/hr |
Qualcomm Cloud AI 100 is a market-leading AI inferencing solution that offers exceptional performance efficiency, density, and cost-effectiveness. With its industry-leading AI cores, Qualcomm Cloud AI 100 offers the same performance as competing offerings at half the price, illustrated in the table above.
How does the TensorOpera and Qualcomm Technologies Collaboration work?
TensorOpera is supporting native deployment by monitoring Qualcomm Cloud AI 100 NPU Utils. When deploying models like SDXL and Llama, users select the specific model through Nexus UI and choose serverless deployment with Qualcomm Cloud AI 100 NPUs. The control plane then monitors the available NPUs and assigns the required number of NPUs to the job. TensorOpera library then builds the docker with Qualcomm Cloud AI SDK on NPUs and serves the model by TensorOpera inference runner. Such a native integration provides a seamless deployment experience. Users can then easily deploy and monitor model endpoints with zero-code effort.
Next Steps
So now it's your turn to access the public SDXL endpoints and deploy your own SDXL endpoint.
From today, you can play around in the Playground and integrate the API to your GenAI application by following the step-by-step approach mentioned in the blog. Don’t have access to Qualcomm Cloud AI 100? Make sure to fill out the request form.
If you would like to learn more about TensorOpera, check out their website and make sure to follow TensorOpera on LinkedIn or X.

