Getting Started with SNPE (3): API Walkthrough

Published on 2022-03-03 18:24:42

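In the previous post, "Running inceptionV3 inference on the 888 HTP", we tested inceptionv3 inference on the CPU, GPU, and HTP of a phone with the 888 chip. The results showed HTP inference running more than 7 times faster than the GPU. Besides speed, the HTP has another major advantage: low power consumption, which matters just as much for mobile apps.

Below I briefly describe how to use the HTP from an app through the SNPE API, speeding up the model while reducing power consumption.

This post covers:

An introduction to the SNPE Native C++ API
Running the sample code on the HTP of an 888 phone
An introduction to the debugging API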

Introduction to the SNPE API

<path to snpe sdk>/doc/html/cplus_plus_tutorial.html
<path to snpe sdk>\examples\NativeCpp\SampleCode\jni\main.cpp

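In short, SNPE is called in the following steps:

Check the available runtimes (optional);
Load the network DLC model;
Configure the SNPE options and create an SNPE instance;
Load the model's input data;
Run network inference and get the output data;
Unload SNPE.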

SNPE API Basic Call Flow

The corresponding code in the sample code is:

// <path to snpe sdk>\examples\NativeCpp\SampleCode\jni\main.cpp
static zdl::DlSystem::Runtime_t runtime = checkRuntime();
std::unique_ptr<zdl::DlContainer::IDlContainer> container = loadContainerFromFile(dlc);
std::unique_ptr<zdl::SNPE::SNPE> snpe = setBuilderOptions(container, runtime, useUserSuppliedBuffers);
std::unique_ptr<zdl::DlSystem::ITensor> inputTensor = loadInputTensor(snpe, fileLine); // ITensor
snpe->execute(inputTensor.get(), outputTensorMap);// ITensor
snpe.reset();

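1. Check the available runtimes (optional)

A runtime here means one of CPU, GPU, DSP, HTA, or HTP.

Different phones provide different sets of runtimes.

[Note] Developers can only use an unsigned PD on the DSP/HTP (see the unsigned PD documentation in the Hexagon DSP SDK), so the runtime check for the HTP must pass the UNSIGNEDPD_CHECK option.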

// Check whether the DSP/HTP is available.
zdl::DlSystem::Runtime_t checkRuntime()
{
    static zdl::DlSystem::Version_t Version = zdl::SNPE::SNPEFactory::getLibraryVersion();
    static zdl::DlSystem::Runtime_t Runtime;
    std::cout << "SNPE Version: " << Version.asString().c_str() << std::endl; //Print Version number
    // zdl::DlSystem::RuntimeCheckOption_t::UNSIGNEDPD_CHECK is passed here so that the check uses an unsigned PD
    // <path to snpe sdk>/doc/html/group__c__plus__plus__apis.html#ga960452d40eef91090973a17a438eaabd
    if (zdl::SNPE::SNPEFactory::isRuntimeAvailable(zdl::DlSystem::Runtime_t::DSP, zdl::DlSystem::RuntimeCheckOption_t::UNSIGNEDPD_CHECK)) {
        Runtime = zdl::DlSystem::Runtime_t::DSP; // select the runtime that was just checked
    } else {
        Runtime = zdl::DlSystem::Runtime_t::CPU;
    }
    return Runtime;
}
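
The fallback in checkRuntime() generalizes to an ordered preference list. The sketch below is only an illustration and compiles without the SNPE SDK: the Runtime enum and pickRuntime() are hypothetical names that are not part of SNPE, and the isAvailable callback is an assumed stand-in for a call such as zdl::SNPE::SNPEFactory::isRuntimeAvailable(runtime, UNSIGNEDPD_CHECK).

```cpp
#include <functional>
#include <vector>

// Hypothetical stand-in for zdl::DlSystem::Runtime_t, so the sketch builds without the SDK.
enum class Runtime { CPU, GPU, DSP };

// Returns the first runtime in `preferred` that `isAvailable` accepts.
// Falls back to CPU when none of the preferred runtimes is available.
Runtime pickRuntime(const std::vector<Runtime>& preferred,
                    const std::function<bool(Runtime)>& isAvailable)
{
    for (Runtime candidate : preferred) {
        if (isAvailable(candidate)) {
            return candidate;
        }
    }
    return Runtime::CPU;
}
```

In a real app the callback would presumably wrap the SNPE availability check, and the preference list would be something like {Runtime::DSP, Runtime::GPU}.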

2. Load the model

Load the model file in DLC format; it is used to create the SNPE instance.

// containerPath is the path to the DLC file
std::unique_ptr<zdl::DlContainer::IDlContainer> loadContainerFromFile(std::string containerPath)
{
    std::unique_ptr<zdl::DlContainer::IDlContainer> container;
    container = zdl::DlContainer::IDlContainer::open(containerPath);
    return container;
}

3. Create the SNPE instance

(1) Set platformConfig (optional)

[Note] If you use the DSP/HTP, you must select unsignedPD:ON here.

zdl::DlSystem::PlatformConfig platformConfig;
std::string PlatformOptions = "unsignedPD:ON";
// check platform options
if (PlatformOptions.length() > 0) {
    bool setSuccess = platformConfig.setPlatformOptions(PlatformOptions);
    bool isValid = platformConfig.isOptionsValid();
    std::cout << "PlatformOptions (" << PlatformOptions << ") set " << (setSuccess ? "successful" : "failed")
                        << " config option is " << (isValid ? "valid" : "invalid") << std::endl;
    if (!setSuccess || !isValid) {
            return EXIT_FAILURE;
    }
}

(2) Configure the options and create the SNPE instance

std::unique_ptr<zdl::SNPE::SNPE> setBuilderOptions(std::unique_ptr<zdl::DlContainer::IDlContainer> & container,
                                                   zdl::DlSystem::Runtime_t runtime,
                                                   zdl::DlSystem::RuntimeList runtimeList,
                                                   bool useUserSuppliedBuffers,
                                                   zdl::DlSystem::PlatformConfig platformConfig,
                                                   bool useCaching)
{
    std::unique_ptr<zdl::SNPE::SNPE> snpe;
    zdl::SNPE::SNPEBuilder snpeBuilder(container.get());
    if(runtimeList.empty())
    {
        runtimeList.add(runtime);
    }
    snpe = snpeBuilder.setOutputLayers({})
       .setRuntimeProcessorOrder(runtimeList)
       .setUseUserSuppliedBuffers(useUserSuppliedBuffers)
       .setPlatformConfig(platformConfig) // the options configured in (1)
       .setInitCacheMode(useCaching) // this option speeds up initialization
       .build();
    return snpe;
}

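4. Load the model's input data

There are two ways to load input data: ITensors and User Buffers.

The advantage of User Buffers is that SNPE maps directly onto a data buffer created by the user, which avoids copying the data into an ITensor and so removes the time cost of that copy.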
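
Whichever mode is used, the input first has to be read from disk. The sketch below assumes that each .raw input file is a headerless dump of float32 values in the layout the network expects; that is an assumption about the input format, not something stated above. loadRawFloats() is a hypothetical helper, not an SNPE function.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <stdexcept>
#include <string>
#include <vector>

// Reads a whole file as a flat array of float32 values (assumed format: raw, no header).
std::vector<float> loadRawFloats(const std::string& path)
{
    // Open at the end so tellg() reports the file size in bytes.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    const std::streamsize bytes = in.tellg();
    if (bytes < 0 || bytes % static_cast<std::streamsize>(sizeof(float)) != 0) {
        throw std::runtime_error("size of " + path + " is not a multiple of sizeof(float)");
    }
    std::vector<float> data(static_cast<std::size_t>(bytes) / sizeof(float));
    in.seekg(0, std::ios::beg);
    if (!in.read(reinterpret_cast<char*>(data.data()), bytes)) {
        throw std::runtime_error("short read on " + path);
    }
    return data;
}
```

With ITensors the resulting vector would then be copied into the tensor; with User Buffers the vector's storage could itself serve as the buffer, as long as it outlives the inference call.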

(1) For ITensor usage, see:

<path to snpe sdk>\examples\NativeCpp\SampleCode\jni\LoadInputTensor.cpp
<path to snpe sdk>\examples\NativeCpp\SampleCode\jni\SaveOutputTensor.cpp

(2) For User Buffers usage, see:

<path to snpe sdk>\examples\NativeCpp\SampleCode\jni\CreateUserBuffer.cpp
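
A user-created buffer has to be described by its size in bytes and by the byte stride of each dimension. The sketch below shows that arithmetic for a densely packed, row-major tensor. calcStrides() and calcBufferSize() are hypothetical helper names chosen for this illustration, and the 1x299x299x3 float32 input shape in the note that follows is an assumption about inceptionV3, so check it against your own DLC.

```cpp
#include <cstddef>
#include <vector>

// Byte stride of each dimension for a densely packed, row-major tensor.
// The last dimension advances by one element; each earlier one by the size of everything after it.
std::vector<std::size_t> calcStrides(const std::vector<std::size_t>& dims, std::size_t elementSize)
{
    std::vector<std::size_t> strides(dims.size());
    std::size_t stride = elementSize;
    for (std::size_t i = dims.size(); i-- > 0;) {
        strides[i] = stride;
        stride *= dims[i];
    }
    return strides;
}

// Total number of bytes needed to hold the tensor.
std::size_t calcBufferSize(const std::vector<std::size_t>& dims, std::size_t elementSize)
{
    std::size_t size = elementSize;
    for (std::size_t d : dims) {
        size *= d;
    }
    return size;
}
```

For dims {1, 299, 299, 3} and 4-byte elements this gives strides {1072812, 3588, 12, 4} and a buffer size of 1072812 bytes.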

5. Run SNPE inference

// ITensor mode
snpe->execute(inputTensor.get(), outputTensorMap);
// User Buffers mode
snpe->execute(inputMap, outputMap);

6. Unload SNPE

snpe.reset();

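Running the sample code

Continuing with the docker container environment from the previous post, here we modify and build SNPE's Native C++ Sample Code and run inceptionV3 inference on the HTP.

1. Install the NDK and set the build environment variables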

root@c633e07fbd33:/opt# wget https://dl.google.com/android/repository/android-ndk-r19b-linux-x86_64.zip
root@c633e07fbd33:/opt# unzip -q  android-ndk-r19b-linux-x86_64.zip
root@c633e07fbd33:/opt# rm android-ndk-r19b-linux-x86_64.zip


root@c633e07fbd33:/workspace/tutor/inceptionv3# export ANDROID_NDK_ROOT=/opt/android-ndk-r19b/
root@c633e07fbd33:/workspace/tutor/inceptionv3# cp $SNPE_ROOT/examples/NativeCpp/SampleCode . -r


2. Build the Sample Code

(1) Build snpe-sample with ndk-build

// Copy SampleCode from the SNPE SDK into the working directory
root@c633e07fbd33:/workspace/tutor/inceptionv3# cp $SNPE_ROOT/examples/NativeCpp/SampleCode . -r
root@c633e07fbd33:/workspace/tutor/inceptionv3# cd SampleCode/jni/
root@c633e07fbd33:/workspace/tutor/inceptionv3/SampleCode/jni# export PATH=$PATH:$ANDROID_NDK_ROOT
root@c633e07fbd33:/workspace/tutor/inceptionv3/SampleCode/jni# ndk-build

(2) Run snpe-sample on the phone

Here we reuse the library files and model files that were pushed to the phone in the previous post.

root@c633e07fbd33:/workspace/tutor/inceptionv3/SampleCode/jni# adb push ../obj/local/arm64-v8a/snpe-sample /data/local/tmp/incpv3/arm64/bin
root@c633e07fbd33:/workspace/tutor/inceptionv3/SampleCode/jni# adb shell
haydn:/ $ cd  /data/local/tmp/incpv3/
haydn:/data/local/tmp/incpv3 $ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/data/local/tmp/incpv3/arm64/lib
haydn:/data/local/tmp/incpv3 $ export PATH=$PATH:/data/local/tmp/incpv3/arm64/bin
haydn:/data/local/tmp/incpv3 $ export ADSP_LIBRARY_PATH="/data/local/tmp/incpv3/dsp/lib;/system/lib/rfsa/adsp;/system/vendor/lib/rfsa/adsp;/dsp"
// Test snpe-sample
haydn:/data/local/tmp/incpv3 $ snpe-sample -h

DESCRIPTION:
------------
Example application demonstrating how to load and execute a neural network
using the SNPE C++ API.


REQUIRED ARGUMENTS:
-------------------
  -d  <FILE>   Path to the DL container containing the network.
  -i  <FILE>   Path to a file listing the inputs for the network.
  -o  <PATH>   Path to directory to store output results.

OPTIONAL ARGUMENTS:
-------------------
  -b  <TYPE>   Type of buffers to use [USERBUFFER_FLOAT, USERBUFFER_TF8, ITENSOR, USERBUFFER_TF16] (ITENSOR is default).
  -r  <RUNTIME> The runtime to be used [gpu, dsp, aip, cpu] (cpu is default).
  -u  <VAL,VAL> Path to UDO package with registration library for UDOs.
                Optionally, user can provide multiple packages as a comma-separated list.
  -z  <NUMBER>  The maximum number that resizable dimensions can grow into.
                Used as a hint to create UserBuffers for models with dynamic sized outputs. Should be a positive integer and is not applicable when using ITensor.
  -s  <TYPE>   Source of user buffers to use [GLBUFFER, CPUBUFFER] (CPUBUFFER is default).
  -c           Enable init caching to accelerate the initialization process of SNPE. Defaults to disable.
  -l  <VAL,VAL,VAL> Specifies the order of precedence for runtime e.g  cpu_float32, dsp_fixed8_tf etc. Valid values are:-
                    cpu_float32 (Snapdragon CPU)       = Data & Math: float 32bit
                    gpu_float32_16_hybrid (Adreno GPU) = Data: float 16bit Math: float 32bit
                    dsp_fixed8_tf (Hexagon DSP)        = Data & Math: 8bit fixed point Tensorflow style format
                    gpu_float16 (Adreno GPU)           = Data: float 16bit Math: float 16bit
                    cpu (Snapdragon CPU)               = Same as cpu_float32
                    gpu (Adreno GPU)                   = Same as gpu_float32_16_hybrid
                    dsp (Hexagon DSP)                  = Same as dsp_fixed8_tf
// Run snpe-sample
haydn:/data/local/tmp/incpv3 $ snpe-sample -d inception_v3_htp.dlc -i target_raw_list.txt -r dsp
SNPE Version: 1.52.0.2724
Selected runtime not present. Falling back to CPU.

The output shows "Selected runtime not present. Falling back to CPU." This happens because the default SampleCode does not enable unsignedPD.

(3) Enable unsigned PD

Code change

diff --git a/examples/NativeCpp/SampleCode/jni/CheckRuntime.cpp b/examples/NativeCpp/SampleCode/jni/CheckRuntime.cpp
index 82c9c10..8ba67ce 100755
--- a/examples/NativeCpp/SampleCode/jni/CheckRuntime.cpp
+++ b/examples/NativeCpp/SampleCode/jni/CheckRuntime.cpp
@@ -23,7 +23,7 @@ zdl::DlSystem::Runtime_t checkRuntime(zdl::DlSystem::Runtime_t runtime)
 
     std::cout << "SNPE Version: " << Version.asString().c_str() << std::endl; //Print Version number
 
-    if (!zdl::SNPE::SNPEFactory::isRuntimeAvailable(runtime))
+    if (!zdl::SNPE::SNPEFactory::isRuntimeAvailable(runtime, zdl::DlSystem::RuntimeCheckOption_t::UNSIGNEDPD_CHECK))
     {
         std::cerr << "Selected runtime not present. Falling back to CPU." << std::endl;
         runtime = zdl::DlSystem::Runtime_t::CPU;
diff --git a/examples/NativeCpp/SampleCode/jni/main.cpp b/examples/NativeCpp/SampleCode/jni/main.cpp
index 6ec2f95..8ad06bc 100755
--- a/examples/NativeCpp/SampleCode/jni/main.cpp
+++ b/examples/NativeCpp/SampleCode/jni/main.cpp
@@ -324,6 +324,18 @@ int main(int argc, char** argv)
         return EXIT_FAILURE;
     }
 
+    std::string PlatformOptions = "unsignedPD:ON";
+
+	// check platform options
+	if (PlatformOptions.length() > 0) {
+	  bool setSuccess = platformConfig.setPlatformOptions(PlatformOptions);
+	  bool isValid = platformConfig.isOptionsValid();
+	  std::cout << "PlatformOptions (" << PlatformOptions << ") set " << (setSuccess ? "successful" : "failed")
+				<< " config option is " << (isValid ? "valid" : "invalid") << std::endl;
+	  if (!setSuccess || !isValid) {
+		 return EXIT_FAILURE;
+	  }
+	}
     snpe = setBuilderOptions(container, runtime, runtimeList, useUserSuppliedBuffers, platformConfig, usingInitCaching);
     if (snpe == nullptr)
     {

Build and run snpe-sample again. This time it runs on the DSP/HTP.

root@c633e07fbd33:/workspace/tutor/inceptionv3/SampleCode/jni# ndk-build
root@c633e07fbd33:/workspace/tutor/inceptionv3/SampleCode/jni# adb push ../obj/local/arm64-v8a/snpe-sample /data/local/tmp/incpv3/arm64/bin

root@c633e07fbd33:/workspace/tutor/inceptionv3/SampleCode/jni# adb shell
haydn:/ $ cd  /data/local/tmp/incpv3/
haydn:/data/local/tmp/incpv3 $ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/data/local/tmp/incpv3/arm64/lib
haydn:/data/local/tmp/incpv3 $ export PATH=$PATH:/data/local/tmp/incpv3/arm64/bin
haydn:/data/local/tmp/incpv3 $ export ADSP_LIBRARY_PATH="/data/local/tmp/incpv3/dsp/lib;/system/lib/rfsa/adsp;/system/vendor/lib/rfsa/adsp;/dsp"
haydn:/data/local/tmp/incpv3 $ snpe-sample -d inception_v3_htp.dlc -i target_raw_list.txt -r dsp
SNPE Version: 1.52.0.2724
PlatformOptions (unsignedPD:ON) set successful config option is valid
Batch size for the container is 1
Processing DNN Input: cropped/notice_sign.raw
Processing DNN Input: cropped/trash_bin.raw
Processing DNN Input: cropped/plastic_cup.raw
Processing DNN Input: cropped/chairs.raw