Porting apps to Windows on Snapdragon, using the sse2neon header file (Part 2 of 2)
In my last post, I described the differences between x86 Intel Streaming SIMD Extensions (Intel SSE) intrinsics and NEON intrinsics from the standpoint of porting apps to Windows on Snapdragon. In this post, we'll go through porting an app from Intel SSE to Windows on Snapdragon, using the sse2neon header file.
Implementing Intel SSE with NEON-based counterparts
The header file sse2neon.h maps Intel SSE intrinsics to NEON, the implementation of the Advanced SIMD (single instruction, multiple data) architecture used on the Windows on Snapdragon platform.
sse2neon.h defines the same macros and functions that the Intel headers provide and implements them with equivalent NEON intrinsics. It supports conversion of SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2 and the AES extension.
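To make the idea concrete, here is a simplified sketch of how an SSE-style wrapper can sit on top of NEON. The `sketch_` names and `add4` are hypothetical and chosen for illustration. This is not the contents of sse2neon.h, which defines the real `__m128` type and `_mm_*` names. The scalar branch is an assumption added only so the sketch also compiles on non-Arm hosts.

```cpp
#include <cassert>

#if defined(__aarch64__) || defined(_M_ARM64)
#include <arm_neon.h>

// On Arm64, a 128-bit vector of four floats maps naturally to float32x4_t.
typedef float32x4_t sketch_m128;

// Unaligned load, in the spirit of _mm_loadu_ps.
inline sketch_m128 sketch_loadu_ps(const float* p) { return vld1q_f32(p); }
// Unaligned store, in the spirit of _mm_storeu_ps.
inline void sketch_storeu_ps(float* p, sketch_m128 v) { vst1q_f32(p, v); }
// Lane-wise addition, in the spirit of _mm_add_ps.
inline sketch_m128 sketch_add_ps(sketch_m128 a, sketch_m128 b) { return vaddq_f32(a, b); }

#else

// Scalar stand-in (assumption: present only so the sketch builds off-Arm).
struct sketch_m128 { float lane[4]; };

inline sketch_m128 sketch_loadu_ps(const float* p) {
    sketch_m128 v;
    for (int i = 0; i < 4; ++i) v.lane[i] = p[i];
    return v;
}
inline void sketch_storeu_ps(float* p, sketch_m128 v) {
    for (int i = 0; i < 4; ++i) p[i] = v.lane[i];
}
inline sketch_m128 sketch_add_ps(sketch_m128 a, sketch_m128 b) {
    sketch_m128 r;
    for (int i = 0; i < 4; ++i) r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

#endif

// Adds four floats from a and b lane by lane and writes the sums to out.
inline void add4(const float* a, const float* b, float* out) {
    assert(a && b && out);
    sketch_storeu_ps(out, sketch_add_ps(sketch_loadu_ps(a), sketch_loadu_ps(b)));
}
```

The calling code only sees the SSE-shaped names, which is why an application written against SSE intrinsics can often be rebuilt for Arm64 without touching its call sites.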
First, you’ll see how to set up a development environment to compile an app for Windows on Snapdragon. Then, you’ll compile a sample app.
Preparing your development environment
Follow these steps to prepare your Windows development environment:
- Download the header file sse2neon.h to your application source directory.
- Replace all occurrences of the names of SSE header files such as <xmmintrin.h>, <emmintrin.h>, <pmmintrin.h>, <tmmintrin.h>, <smmintrin.h> and <nmmintrin.h> with sse2neon.h in your application source.
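If the same source tree still has to build for x86, one common alternative to replacing the includes outright is to select the header at compile time. The sketch below assumes sse2neon.h sits next to the source file and checks the predefined macros `_M_ARM64` (MSVC) and `__aarch64__` (GCC and Clang). `SIMD_BACKEND` and `scale4` are hypothetical names used only for illustration.

```cpp
#include <cassert>

#if defined(_M_ARM64) || defined(__aarch64__)
#include "sse2neon.h"      // Arm64: SSE intrinsics implemented with NEON
#define SIMD_BACKEND "sse2neon"
#else
#include <xmmintrin.h>     // x86/x64: native SSE intrinsics
#define SIMD_BACKEND "native SSE"
#endif

// Multiplies four floats by a factor using only SSE intrinsics, so the
// same function body compiles against either header.
inline void scale4(const float* in, float factor, float* out) {
    assert(in && out);
    __m128 v = _mm_loadu_ps(in);        // load four unaligned floats
    __m128 f = _mm_set1_ps(factor);     // broadcast factor to all lanes
    _mm_storeu_ps(out, _mm_mul_ps(v, f));
}
```

Keeping the selection in one shared header (rather than in every source file) means only that header changes if the include logic needs adjusting later.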
Port a sample application to Windows on Snapdragon using sse2neon
Create a sample application, sampleNeonApp.cpp, and compile it to run on Windows on Snapdragon:
- Copy the following source code into your editor:
#include <iostream>
#include "sse2neon.h"

int main(int argc, char **argv)
{
    // _mm_set_ps takes its arguments from the highest lane to the lowest,
    // so a is stored as {5, 4, 3, 2} and b as {1, 2, 6, 7}.
    __m128 a = _mm_set_ps(2.0, 3.0, 4.0, 5.0);
    __m128 b = _mm_set_ps(7.0, 6.0, 2.0, 1.0);

    __m128 c = _mm_add_ps(a, b);   // lane-wise a + b
    __m128 d = _mm_mul_ps(a, b);   // lane-wise a * b
    __m128 e = _mm_sqrt_ps(a);     // lane-wise square root of a

    float f[4];

    _mm_storeu_ps(f, c);
    std::cout << "result of addition " << f[0] << "," << f[1]
              << "," << f[2] << "," << f[3] << std::endl;

    _mm_storeu_ps(f, d);
    std::cout << "result of multiplication " << f[0] << "," << f[1]
              << "," << f[2] << "," << f[3] << std::endl;

    _mm_storeu_ps(f, e);
    std::cout << "result of square root " << f[0] << "," << f[1]
              << "," << f[2] << "," << f[3] << std::endl;

    return 0;
}
- Download sse2neon.h and store it in your source directory.
- Copy the following contents to a text file and name the file CMakeLists.txt:
cmake_minimum_required(VERSION 3.10)
project(neonSample)
add_compile_options(/Zc:preprocessor)
add_executable(sampleNeonApp sampleNeonApp.cpp)
- Create a build folder, then switch to it:
mkdir build
cd build
- Run the following command to cross-compile the code to run on a Windows on Snapdragon PC. Replace <Specify_64bit_architecture> with the 64-bit Arm platform name, for example ARM64 with the Visual Studio generator:
cmake ../ -G "Visual Studio 17 2022" -A <Specify_64bit_architecture>
- Build the application:
cmake --build ./ --config Release
- A successful build creates a Release folder in <Path_to_SampleApp>\build\. The Release folder contains an executable, sampleNeonApp.exe.
- When you run the executable on a Windows on Snapdragon device, you should see three lines of output: the results of the addition, the multiplication and the square root.
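As a way to sanity-check what the sample prints, the sketch below recomputes the three results with plain scalar C++. It relies on one fact about `_mm_set_ps`: the first argument goes to the highest lane, so the values are stored in reverse order and `_mm_storeu_ps` writes them back lowest lane first. The names `Vec4`, `set_ps_scalar`, `SampleResults`, `expected_sample_results` and `matches` are hypothetical, and the tolerance is an assumption, since the last digits of the square roots may depend on how the sse2neon version in use implements `_mm_sqrt_ps`.

```cpp
#include <array>
#include <cmath>

using Vec4 = std::array<float, 4>;

// Mimics _mm_set_ps(e3, e2, e1, e0): index 0 holds the last argument.
inline Vec4 set_ps_scalar(float e3, float e2, float e1, float e0) {
    return Vec4{e0, e1, e2, e3};
}

struct SampleResults { Vec4 sum, product, root; };

// Recomputes the sample's addition, multiplication and square root lane by lane.
inline SampleResults expected_sample_results() {
    const Vec4 a = set_ps_scalar(2.0f, 3.0f, 4.0f, 5.0f);
    const Vec4 b = set_ps_scalar(7.0f, 6.0f, 2.0f, 1.0f);
    SampleResults r{};
    for (int i = 0; i < 4; ++i) {
        r.sum[i] = a[i] + b[i];
        r.product[i] = a[i] * b[i];
        r.root[i] = std::sqrt(a[i]);
    }
    return r;
}

// Returns true when the four stored floats match the expected lanes within tol.
inline bool matches(const float* f, const Vec4& expected, float tol = 1e-5f) {
    for (int i = 0; i < 4; ++i) {
        if (std::fabs(f[i] - expected[i]) > tol) return false;
    }
    return true;
}
```

With these inputs the sums are 6, 6, 9, 9 and the products are 5, 8, 18, 14, while the square roots are roughly 2.236, 2, 1.732 and 1.414. In the sample, a call such as `matches(f, expected_sample_results().sum)` right after the first `_mm_storeu_ps` would confirm the lane order.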
If you followed the steps above without including sse2neon.h in the code, you would see many “undeclared identifier” compiler errors, because nothing would declare the SSE names the sample uses, such as __m128 and _mm_add_ps.
Next steps
Several apps that have been ported to Windows on Snapdragon platforms now use sse2neon.h. You can find a comprehensive list of those apps at https://linaro.atlassian.net/wiki/spaces/WOAR/overview.
Helping you port your apps to Windows on Snapdragon is a big part of our developer-first focus. Visit the Windows on Snapdragon developer portal and take a look at the tools and resources we’ve made available.
Is something missing? Ask your questions, bring your suggestions and get prompt support from our technical team on Developer Discord.
