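The TorchScript approach described in the previous article gave much worse detection results on an actual phone, so I tried two other approaches. The second still has problems, so I will leave it out for now. This article covers the third. zldrobit created a TFLite branch at https://github.com/zldrobit/yolov5.git, and the instructions there already describe the steps in reasonable detail.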
1. Clone the relevant branch:
    git clone https://github.com/zldrobit/yolov5.git
    cd yolov5
    git checkout tf-android
2. Install the required environment:
    pip install -r requirements.txt
    pip install tensorflow==2.4.1
3. Convert the weights file:
    # Convert weights to TensorFlow SavedModel, GraphDef and fp16 TFLite model, and verify them:
    PYTHONPATH=. python models/tf.py --weights weights/yolov5s.pt --cfg models/yolov5s.yaml --img 320
    python3 detect.py --weights weights/yolov5s.pb --img 320
    python3 detect.py --weights weights/yolov5s_saved_model/ --img 320

    # Convert weights to int8 TFLite model, and verify it
    # (post-training quantization needs train or val images from the COCO 2017 dataset):
    PYTHONPATH=. python3 models/tf.py --weights weights/yolov5s.pt --cfg models/yolov5s.yaml --img 320 --tfl-int8 --source /data/dataset/coco/coco2017/train2017 --ncalib 100
    python3 detect.py --weights weights/yolov5s-int8.tflite --img 320 --tfl-int8

    # Convert weights to TensorFlow SavedModel and GraphDef integrated with NMS, and verify them:
    PYTHONPATH=. python3 models/tf.py --img 320 --weights weights/yolov5s.pt --cfg models/yolov5s.yaml --tf-nms
    python3 detect.py --img 320 --weights weights/yolov5s.pb --no-tf-nms
    python3 detect.py --img 320 --weights weights/yolov5s_saved_model --no-tf-nms
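I used the following conversion command:
    python models/tf.py --weights best.pt --cfg models/yolov5s.yaml --img 640
Note: the latest version of yolov5 already ships its own tf.py, but its arguments have changed and I could not get the conversion to work with it. I have not yet compared the code in detail.
Output: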
    (E:\anaconda_dirs\venvs\yolov5_latest) C:\Users\obaby>cd /d F:\Pycharm_Projects\yolov5_zldrobit
    (E:\anaconda_dirs\venvs\yolov5_latest) F:\Pycharm_Projects\yolov5_zldrobit>tf_640_fp16.bat
    (E:\anaconda_dirs\venvs\yolov5_latest) F:\Pycharm_Projects\yolov5_zldrobit>python models/tf.py --weights best.pt --cfg models/yolov5s.yaml --img 640
    2021-09-29 20:53:02.769490: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
    Namespace(batch_size=1, cfg='models/yolov5s.yaml', dynamic_batch_size=False, img_size=[640, 640], iou_thres=0.5, ncalib=100, score_thres=0.4, source='../data/coco128.yaml', tf_nms=False, tf_raw_resize=False, tfl_int8=False, topk_all=100, topk_per_class=100, weights='best.pt')
    E:\anaconda_dirs\venvs\yolov5_latest\lib\site-packages\torch\nn\functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at ..\c10/core/TensorImpl.h:1156.)
      return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
    Starting TensorFlow saved_model export with TensorFlow 2.4.1...
    Overriding models/yolov5s.yaml nc=80 with nc=1
    2021-09-29 20:53:22.340330: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
    2021-09-29 20:53:22.341670: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library nvcuda.dll
    2021-09-29 20:53:22.362073: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
    pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3080 computeCapability: 8.6
    coreClock: 1.815GHz coreCount: 68 deviceMemorySize: 10.00GiB deviceMemoryBandwidth: 707.88GiB/s
    2021-09-29 20:53:22.362352: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
    2021-09-29 20:53:22.494935: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
    2021-09-29 20:53:22.495010: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
    2021-09-29 20:53:22.544367: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
    2021-09-29 20:53:22.574051: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
    2021-09-29 20:53:22.575034: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cusolver64_10.dll'; dlerror: cusolver64_10.dll not found
    2021-09-29 20:53:22.624204: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
    2021-09-29 20:53:22.625221: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found
    2021-09-29 20:53:22.625267: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1757] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
    Skipping registering GPU devices...
    2021-09-29 20:53:22.626462: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2
    To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
    2021-09-29 20:53:22.627722: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
    2021-09-29 20:53:22.627762: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]
    2021-09-29 20:53:22.627960: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
    Model: "model"
    __________________________________________________________________________________________________
    Layer (type)                    Output Shape         Param #     Connected to
    ==================================================================================================
    input_1 (InputLayer)            [(1, 640, 640, 3)]   0
    __________________________________________________________________________________________________
    tf__focus (tf_Focus)            (1, 320, 320, 32)    3584        input_1[0][0]
    __________________________________________________________________________________________________
    tf__conv_1 (tf_Conv)            (1, 160, 160, 64)    18688       tf__focus[0][0]
    __________________________________________________________________________________________________
    tf__c3 (tf_C3)                  (1, 160, 160, 64)    19200       tf__conv_1[0][0]
    __________________________________________________________________________________________________
    tf__conv_7 (tf_Conv)            (1, 80, 80, 128)     74240       tf__c3[0][0]
    __________________________________________________________________________________________________
    tf__c3_1 (tf_C3)                (1, 80, 80, 128)     158208      tf__conv_7[0][0]
    __________________________________________________________________________________________________
    tf__conv_17 (tf_Conv)           (1, 40, 40, 256)     295936      tf__c3_1[0][0]
    __________________________________________________________________________________________________
    tf__c3_2 (tf_C3)                (1, 40, 40, 256)     627712      tf__conv_17[0][0]
    __________________________________________________________________________________________________
    tf__conv_27 (tf_Conv)           (1, 20, 20, 512)     1181696     tf__c3_2[0][0]
    __________________________________________________________________________________________________
    tf_spp (tf_SPP)                 (1, 20, 20, 512)     658432      tf__conv_27[0][0]
    __________________________________________________________________________________________________
    tf__c3_3 (tf_C3)                (1, 20, 20, 512)     1185792     tf_spp[0][0]
    __________________________________________________________________________________________________
    tf__conv_35 (tf_Conv)           (1, 20, 20, 256)     132096      tf__c3_3[0][0]
    __________________________________________________________________________________________________
    tf__upsample (tf_Upsample)      (1, 40, 40, 256)     0           tf__conv_35[0][0]
    __________________________________________________________________________________________________
    tf__concat (tf_Concat)          (1, 40, 40, 512)     0           tf__upsample[0][0]
                                                                     tf__c3_2[0][0]
    __________________________________________________________________________________________________
    tf__c3_4 (tf_C3)                (1, 40, 40, 256)     363520      tf__concat[0][0]
    __________________________________________________________________________________________________
    tf__conv_41 (tf_Conv)           (1, 40, 40, 128)     33280       tf__c3_4[0][0]
    __________________________________________________________________________________________________
    tf__upsample_1 (tf_Upsample)    (1, 80, 80, 128)     0           tf__conv_41[0][0]
    __________________________________________________________________________________________________
    tf__concat_1 (tf_Concat)        (1, 80, 80, 256)     0           tf__upsample_1[0][0]
                                                                     tf__c3_1[0][0]
    __________________________________________________________________________________________________
    tf__c3_5 (tf_C3)                (1, 80, 80, 128)     91648       tf__concat_1[0][0]
    __________________________________________________________________________________________________
    tf__conv_47 (tf_Conv)           (1, 40, 40, 128)     147968      tf__c3_5[0][0]
    __________________________________________________________________________________________________
    tf__concat_2 (tf_Concat)        (1, 40, 40, 256)     0           tf__conv_47[0][0]
                                                                     tf__conv_41[0][0]
    __________________________________________________________________________________________________
    tf__c3_6 (tf_C3)                (1, 40, 40, 256)     297984      tf__concat_2[0][0]
    __________________________________________________________________________________________________
    tf__conv_53 (tf_Conv)           (1, 20, 20, 256)     590848      tf__c3_6[0][0]
    __________________________________________________________________________________________________
    tf__concat_3 (tf_Concat)        (1, 20, 20, 512)     0           tf__conv_53[0][0]
                                                                     tf__conv_35[0][0]
    __________________________________________________________________________________________________
    tf__c3_7 (tf_C3)                (1, 20, 20, 512)     1185792     tf__concat_3[0][0]
    __________________________________________________________________________________________________
    tf__detect (tf_Detect)          ((1, 25200, 6), [(1, 16182       tf__c3_5[0][0]
                                                                     tf__c3_6[0][0]
                                                                     tf__c3_7[0][0]
    ==================================================================================================
    Total params: 7,082,806
    Trainable params: 7,063,542
    Non-trainable params: 19,264
    __________________________________________________________________________________________________
    2021-09-29 20:53:30.545120: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
    WARNING:absl:Found untraced functions such as tf__conv_layer_call_fn, tf__conv_layer_call_and_return_conditional_losses, tf_bn_1_layer_call_fn, tf_bn_1_layer_call_and_return_conditional_losses, tf__conv_2_layer_call_fn while saving (showing 5 of 845). These functions will not be directly callable after loading.
    WARNING:absl:Found untraced functions such as tf__conv_layer_call_fn, tf__conv_layer_call_and_return_conditional_losses, tf_bn_1_layer_call_fn, tf_bn_1_layer_call_and_return_conditional_losses, tf__conv_2_layer_call_fn while saving (showing 5 of 845). These functions will not be directly callable after loading.
    TensorFlow saved_model export success, saved as best_saved_model
    Starting TensorFlow GraphDef export with TensorFlow 2.4.1...
    2021-09-29 20:53:59.141105: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
    2021-09-29 20:53:59.141295: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
    2021-09-29 20:53:59.144017: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
    pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3080 computeCapability: 8.6
    coreClock: 1.815GHz coreCount: 68 deviceMemorySize: 10.00GiB deviceMemoryBandwidth: 707.88GiB/s
    2021-09-29 20:53:59.144075: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
    2021-09-29 20:53:59.144245: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
    2021-09-29 20:53:59.144426: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
    2021-09-29 20:53:59.144600: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
    2021-09-29 20:53:59.144768: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
    2021-09-29 20:53:59.145910: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cusolver64_10.dll'; dlerror: cusolver64_10.dll not found
    2021-09-29 20:53:59.145948: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
    2021-09-29 20:53:59.146945: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found
    2021-09-29 20:53:59.146982: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1757] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
    Skipping registering GPU devices...
    2021-09-29 20:53:59.229585: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
    2021-09-29 20:53:59.229661: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]      0
    2021-09-29 20:53:59.230189: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0:   N
    2021-09-29 20:53:59.230371: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
    2021-09-29 20:53:59.253957: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:928] Optimization results for grappler item: graph_to_optimize
      function_optimizer: function_optimizer did nothing. time = 0.179ms.
      function_optimizer: function_optimizer did nothing. time = 0ms.
    TensorFlow GraphDef export success, saved as best.pb
    Starting TFLite export with TensorFlow 2.4.1...
    WARNING:absl:Found untraced functions such as tf__conv_layer_call_fn, tf__conv_layer_call_and_return_conditional_losses, tf_bn_1_layer_call_fn, tf_bn_1_layer_call_and_return_conditional_losses, tf__conv_2_layer_call_fn while saving (showing 5 of 845). These functions will not be directly callable after loading.
    WARNING:absl:Found untraced functions such as tf__conv_layer_call_fn, tf__conv_layer_call_and_return_conditional_losses, tf_bn_1_layer_call_fn, tf_bn_1_layer_call_and_return_conditional_losses, tf__conv_2_layer_call_fn while saving (showing 5 of 845). These functions will not be directly callable after loading.
    2021-09-29 20:54:34.699132: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
    2021-09-29 20:54:34.699328: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
    2021-09-29 20:54:34.700680: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
    pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3080 computeCapability: 8.6
    coreClock: 1.815GHz coreCount: 68 deviceMemorySize: 10.00GiB deviceMemoryBandwidth: 707.88GiB/s
    2021-09-29 20:54:34.700738: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
    2021-09-29 20:54:34.700915: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
    2021-09-29 20:54:34.701091: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
    2021-09-29 20:54:34.701269: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
    2021-09-29 20:54:34.701452: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
    2021-09-29 20:54:34.702670: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cusolver64_10.dll'; dlerror: cusolver64_10.dll not found
    2021-09-29 20:54:34.702710: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
    2021-09-29 20:54:34.703696: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found
    2021-09-29 20:54:34.703745: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1757] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
    Skipping registering GPU devices...
    2021-09-29 20:54:34.703958: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
    2021-09-29 20:54:34.704120: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]      0
    2021-09-29 20:54:34.704284: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0:   N
    2021-09-29 20:54:34.704456: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
    2021-09-29 20:54:34.723002: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:928] Optimization results for grappler item: graph_to_optimize
      function_optimizer: function_optimizer did nothing. time = 0.001ms.
      function_optimizer: function_optimizer did nothing. time = 0ms.
    2021-09-29 20:54:36.226574: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:316] Ignored output_format.
    2021-09-29 20:54:36.226705: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:319] Ignored drop_control_dependency.
    2021-09-29 20:54:36.334634: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
    pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3080 computeCapability: 8.6
    coreClock: 1.815GHz coreCount: 68 deviceMemorySize: 10.00GiB deviceMemoryBandwidth: 707.88GiB/s
    2021-09-29 20:54:36.334766: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
    2021-09-29 20:54:36.335168: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
    2021-09-29 20:54:36.335458: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
    2021-09-29 20:54:36.335723: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
    2021-09-29 20:54:36.336020: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
    2021-09-29 20:54:36.337043: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cusolver64_10.dll'; dlerror: cusolver64_10.dll not found
    2021-09-29 20:54:36.337085: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
    2021-09-29 20:54:36.337790: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found
    2021-09-29 20:54:36.337822: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1757] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
    Skipping registering GPU devices...
    2021-09-29 20:54:36.337873: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
    2021-09-29 20:54:36.337897: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]      0
    2021-09-29 20:54:36.338060: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0:   N
    2021-09-29 20:54:36.338224: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
    TFLite export success, saved as best-fp16.tflite
The end result is that the weights file is converted into best-fp16.tflite.
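As a quick sanity check on the export, the detection output shape reported in the log above, (1, 25200, 6), can be reproduced by hand. The sketch below assumes the usual YOLOv5 layout of three detection scales with strides 8, 16 and 32 and three anchors per scale; the function name is made up for illustration and is not part of the repository.

```python
def expected_output_shape(img_size, num_classes, strides=(8, 16, 32), anchors_per_scale=3):
    """Return (batch, num_predictions, values_per_prediction) for a square input.

    Assumes one grid per stride and a fixed number of anchors per grid cell.
    """
    cells = sum((img_size // s) ** 2 for s in strides)
    num_predictions = cells * anchors_per_scale
    values = 5 + num_classes  # x, y, w, h, objectness, then one score per class
    return (1, num_predictions, values)


print(expected_output_shape(640, 1))   # (1, 25200, 6)
print(expected_output_shape(320, 80))  # (1, 6300, 85)
```

With a 640 input and the single class reported in the log (nc=1), this gives 3 * (80*80 + 40*40 + 20*20) = 25200 predictions of 6 values each, which matches the tf__detect row of the model summary.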
4. Put the tflite file in the project's assets directory:
Edit coco.txt so that it lists the classes of your own model.
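A small sketch of generating that label file. It assumes the app reads one class name per line, in the same order as the class indices used during training; the class name `target` and the helper `write_labels` are placeholders for illustration, not taken from the project.

```python
import tempfile
from pathlib import Path


def write_labels(path, class_names):
    """Write one class name per line, in class-index order; return the count."""
    names = [name.strip() for name in class_names]
    if any(not name for name in names):
        raise ValueError("class names must be non-empty")
    Path(path).write_text("\n".join(names) + "\n", encoding="utf-8")
    return len(names)


# Demo in a temporary directory; in practice the target would be the
# coco.txt file inside the Android project's assets directory.
with tempfile.TemporaryDirectory() as tmp:
    label_file = Path(tmp) / "coco.txt"
    count = write_labels(label_file, ["target"])  # hypothetical single class
    print(count, repr(label_file.read_text(encoding="utf-8")))
```

The number of lines should equal the number of classes the model was trained with (1 in the export log above).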
DetectorFactory.java sets inputSize (and the related parameters) according to the model file name:
    if (modelFilename.equals("yolov5s.tflite")) {
        labelFilename = "file:///android_asset/coco.txt";
        isQuantized = false;
        inputSize = 640;
        output_width = new int[]{80, 40, 20};
        masks = new int[][]{{0, 1, 2}, {3, 4, 5}, {6, 7, 8}};
        anchors = new int[]{
            10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
        };
    } else if (modelFilename.equals("yolov5s-fp16.tflite")) {
        labelFilename = "file:///android_asset/coco.txt";
        isQuantized = false;
        inputSize = 640; // originally 320; I exported the model at 640, so I changed it to 640
        output_width = new int[]{40, 20, 10};
        masks = new int[][]{{0, 1, 2}, {3, 4, 5}, {6, 7, 8}};
        anchors = new int[]{
            10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
        };
    } else if (modelFilename.equals("yolov5s-int8.tflite")) {
        labelFilename = "file:///android_asset/coco.txt";
        isQuantized = true;
        inputSize = 320;
        output_width = new int[]{40, 20, 10};
        masks = new int[][]{{0, 1, 2}, {3, 4, 5}, {6, 7, 8}};
        anchors = new int[]{
            10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
        };
    }
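The output_width values in this snippet appear to follow from inputSize: with the usual YOLOv5 strides of 8, 16 and 32, a 320 input gives 40, 20, 10 and a 640 input gives 80, 40, 20. The sketch below (Python, with a hypothetical helper name) computes them so that the two fields can be checked against each other after changing inputSize. Whether the app actually reads output_width at inference time is not verified here, so treat this as a consistency check rather than a required fix.

```python
def grid_widths(input_size, strides=(8, 16, 32)):
    """Grid width of each detection scale for a square input of input_size pixels."""
    if any(input_size % stride for stride in strides):
        raise ValueError(f"input size {input_size} is not a multiple of every stride")
    return [input_size // stride for stride in strides]


for size in (320, 640):
    print(size, grid_widths(size))
# 320 [40, 20, 10]
# 640 [80, 40, 20]
```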
After that, the project can be built and run.