Notes on things I'm likely to forget.

Trying object detection with M5StickV, part 3 (augmenting a custom dataset and retraining the model)

The object detection (B-Daman detection) model I built the other day was trained on photos all taken from the same spot, so once I moved the M5StickV beyond a certain distance from the B-Daman, it could no longer detect it. shintarof.hatenablog.com

To make detection work at longer distances, I enlarged the training set and retrained. Reshooting photos at various distances would have been tedious, so instead I varied the scale of the images I used last time. Generating extra training data from existing data like this is apparently called data augmentation.

A quick search turned up Augmentor, a tool that is easy to drive from Python, so I gave it a try.
github.com

The environment is the same as last time: macOS Catalina 10.15.3.

Installation

$pip install Augmentor

I wrote a script that takes the input directory, the output directory, and the number of images to generate, and produces scaled-down copies:

import Augmentor
import sys

# Usage: python data_augmentation.py <source_dir> <output_dir> <num_samples>
source = sys.argv[1]
output = sys.argv[2]
num    = int(sys.argv[3])

# A relative output_directory is created under the source directory
p = Augmentor.Pipeline(source_directory=source, output_directory=output)
# Zoom every image by a random factor between 0.3 and 1.0
p.zoom(probability=1.0, min_factor=0.3, max_factor=1.0)
p.sample(num)

I saved this script as data_augmentation.py and ran it like so:

$python data_augmentation.py test output_image 5

The output directory gets created directly under the input directory, and the 5 generated images were written there. [screenshot of the generated files]
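As a quick sanity check, you can count the generated files from Python. This is a stdlib-only sketch; the `output_image` default matches the example command above, and the assumption that a relative output directory lands under the source directory is based on the behavior observed here.

```python
import glob
import os

def count_outputs(source_dir, output_dir="output_image"):
    """Count the files Augmentor wrote. Assumption: a relative
    output_directory is created under the source directory, as seen
    in the run above."""
    return len(glob.glob(os.path.join(source_dir, output_dir, "*")))
```

After the run above, `count_outputs("test")` should report 5.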

Using this, I expanded the previous dataset as follows:

Item               Previous   This time
Training images    51         510
Validation images  11         110

I annotated this dataset with labelImg as before and trained a model.
(Training takes quite a while, so I set the number of epochs to 10.)

The model can now detect the B-Daman even when it is farther away from the M5StickV.
Building on this, it should be possible to make a fully automatic B-Daman mode (detect target → attitude control → fire marble).

  • train.py output
Using TensorFlow backend.
/usr/local/lib/python3.6/dist-packages/sklearn/utils/linear_assignment_.py:22: FutureWarning: The linear_assignment_ module is deprecated in 0.21 and will be removed from 0.23. Use scipy.optimize.linear_sum_assignment instead.
  FutureWarning)
Project folder projects/b-daman is created.
Downloading K210 Converter
Downloading data from https://github.com/kendryte/nncase/releases/download/v0.2.0-beta2/ncc_linux_x86_64.tar.xz
7397376/7392744 [==============================] - 1s 0us/step
['b-daman']
Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.6/mobilenet_7_5_224_tf_no_top.h5
10633216/10626956 [==============================] - 1s 0us/step
Successfully loaded imagenet backend weights
Failed to load pre-trained weights for the whole model. It might be because you didn't specify any or the weight file cannot be found
Current training session folder is projects/b-daman/2020-05-27_00-46-17


Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
conv1_pad (ZeroPadding2D)    (None, 226, 226, 3)       0         
_________________________________________________________________
conv1 (Conv2D)               (None, 112, 112, 24)      648       
_________________________________________________________________
conv1_bn (BatchNormalization (None, 112, 112, 24)      96        
_________________________________________________________________
conv1_relu (ReLU)            (None, 112, 112, 24)      0         
_________________________________________________________________
conv_dw_1 (DepthwiseConv2D)  (None, 112, 112, 24)      216       
_________________________________________________________________
conv_dw_1_bn (BatchNormaliza (None, 112, 112, 24)      96        
_________________________________________________________________
conv_dw_1_relu (ReLU)        (None, 112, 112, 24)      0         
_________________________________________________________________
conv_pw_1 (Conv2D)           (None, 112, 112, 48)      1152      
_________________________________________________________________
conv_pw_1_bn (BatchNormaliza (None, 112, 112, 48)      192       
_________________________________________________________________
conv_pw_1_relu (ReLU)        (None, 112, 112, 48)      0         
_________________________________________________________________
conv_pad_2 (ZeroPadding2D)   (None, 114, 114, 48)      0         
_________________________________________________________________
conv_dw_2 (DepthwiseConv2D)  (None, 56, 56, 48)        432       
_________________________________________________________________
conv_dw_2_bn (BatchNormaliza (None, 56, 56, 48)        192       
_________________________________________________________________
conv_dw_2_relu (ReLU)        (None, 56, 56, 48)        0         
_________________________________________________________________
conv_pw_2 (Conv2D)           (None, 56, 56, 96)        4608      
_________________________________________________________________
conv_pw_2_bn (BatchNormaliza (None, 56, 56, 96)        384       
_________________________________________________________________
conv_pw_2_relu (ReLU)        (None, 56, 56, 96)        0         
_________________________________________________________________
conv_dw_3 (DepthwiseConv2D)  (None, 56, 56, 96)        864       
_________________________________________________________________
conv_dw_3_bn (BatchNormaliza (None, 56, 56, 96)        384       
_________________________________________________________________
conv_dw_3_relu (ReLU)        (None, 56, 56, 96)        0         
_________________________________________________________________
conv_pw_3 (Conv2D)           (None, 56, 56, 96)        9216      
_________________________________________________________________
conv_pw_3_bn (BatchNormaliza (None, 56, 56, 96)        384       
_________________________________________________________________
conv_pw_3_relu (ReLU)        (None, 56, 56, 96)        0         
_________________________________________________________________
conv_pad_4 (ZeroPadding2D)   (None, 58, 58, 96)        0         
_________________________________________________________________
conv_dw_4 (DepthwiseConv2D)  (None, 28, 28, 96)        864       
_________________________________________________________________
conv_dw_4_bn (BatchNormaliza (None, 28, 28, 96)        384       
_________________________________________________________________
conv_dw_4_relu (ReLU)        (None, 28, 28, 96)        0         
_________________________________________________________________
conv_pw_4 (Conv2D)           (None, 28, 28, 192)       18432     
_________________________________________________________________
conv_pw_4_bn (BatchNormaliza (None, 28, 28, 192)       768       
_________________________________________________________________
conv_pw_4_relu (ReLU)        (None, 28, 28, 192)       0         
_________________________________________________________________
conv_dw_5 (DepthwiseConv2D)  (None, 28, 28, 192)       1728      
_________________________________________________________________
conv_dw_5_bn (BatchNormaliza (None, 28, 28, 192)       768       
_________________________________________________________________
conv_dw_5_relu (ReLU)        (None, 28, 28, 192)       0         
_________________________________________________________________
conv_pw_5 (Conv2D)           (None, 28, 28, 192)       36864     
_________________________________________________________________
conv_pw_5_bn (BatchNormaliza (None, 28, 28, 192)       768       
_________________________________________________________________
conv_pw_5_relu (ReLU)        (None, 28, 28, 192)       0         
_________________________________________________________________
conv_pad_6 (ZeroPadding2D)   (None, 30, 30, 192)       0         
_________________________________________________________________
conv_dw_6 (DepthwiseConv2D)  (None, 14, 14, 192)       1728      
_________________________________________________________________
conv_dw_6_bn (BatchNormaliza (None, 14, 14, 192)       768       
_________________________________________________________________
conv_dw_6_relu (ReLU)        (None, 14, 14, 192)       0         
_________________________________________________________________
conv_pw_6 (Conv2D)           (None, 14, 14, 384)       73728     
_________________________________________________________________
conv_pw_6_bn (BatchNormaliza (None, 14, 14, 384)       1536      
_________________________________________________________________
conv_pw_6_relu (ReLU)        (None, 14, 14, 384)       0         
_________________________________________________________________
conv_dw_7 (DepthwiseConv2D)  (None, 14, 14, 384)       3456      
_________________________________________________________________
conv_dw_7_bn (BatchNormaliza (None, 14, 14, 384)       1536      
_________________________________________________________________
conv_dw_7_relu (ReLU)        (None, 14, 14, 384)       0         
_________________________________________________________________
conv_pw_7 (Conv2D)           (None, 14, 14, 384)       147456    
_________________________________________________________________
conv_pw_7_bn (BatchNormaliza (None, 14, 14, 384)       1536      
_________________________________________________________________
conv_pw_7_relu (ReLU)        (None, 14, 14, 384)       0         
_________________________________________________________________
conv_dw_8 (DepthwiseConv2D)  (None, 14, 14, 384)       3456      
_________________________________________________________________
conv_dw_8_bn (BatchNormaliza (None, 14, 14, 384)       1536      
_________________________________________________________________
conv_dw_8_relu (ReLU)        (None, 14, 14, 384)       0         
_________________________________________________________________
conv_pw_8 (Conv2D)           (None, 14, 14, 384)       147456    
_________________________________________________________________
conv_pw_8_bn (BatchNormaliza (None, 14, 14, 384)       1536      
_________________________________________________________________
conv_pw_8_relu (ReLU)        (None, 14, 14, 384)       0         
_________________________________________________________________
conv_dw_9 (DepthwiseConv2D)  (None, 14, 14, 384)       3456      
_________________________________________________________________
conv_dw_9_bn (BatchNormaliza (None, 14, 14, 384)       1536      
_________________________________________________________________
conv_dw_9_relu (ReLU)        (None, 14, 14, 384)       0         
_________________________________________________________________
conv_pw_9 (Conv2D)           (None, 14, 14, 384)       147456    
_________________________________________________________________
conv_pw_9_bn (BatchNormaliza (None, 14, 14, 384)       1536      
_________________________________________________________________
conv_pw_9_relu (ReLU)        (None, 14, 14, 384)       0         
_________________________________________________________________
conv_dw_10 (DepthwiseConv2D) (None, 14, 14, 384)       3456      
_________________________________________________________________
conv_dw_10_bn (BatchNormaliz (None, 14, 14, 384)       1536      
_________________________________________________________________
conv_dw_10_relu (ReLU)       (None, 14, 14, 384)       0         
_________________________________________________________________
conv_pw_10 (Conv2D)          (None, 14, 14, 384)       147456    
_________________________________________________________________
conv_pw_10_bn (BatchNormaliz (None, 14, 14, 384)       1536      
_________________________________________________________________
conv_pw_10_relu (ReLU)       (None, 14, 14, 384)       0         
_________________________________________________________________
conv_dw_11 (DepthwiseConv2D) (None, 14, 14, 384)       3456      
_________________________________________________________________
conv_dw_11_bn (BatchNormaliz (None, 14, 14, 384)       1536      
_________________________________________________________________
conv_dw_11_relu (ReLU)       (None, 14, 14, 384)       0         
_________________________________________________________________
conv_pw_11 (Conv2D)          (None, 14, 14, 384)       147456    
_________________________________________________________________
conv_pw_11_bn (BatchNormaliz (None, 14, 14, 384)       1536      
_________________________________________________________________
conv_pw_11_relu (ReLU)       (None, 14, 14, 384)       0         
_________________________________________________________________
conv_pad_12 (ZeroPadding2D)  (None, 16, 16, 384)       0         
_________________________________________________________________
conv_dw_12 (DepthwiseConv2D) (None, 7, 7, 384)         3456      
_________________________________________________________________
conv_dw_12_bn (BatchNormaliz (None, 7, 7, 384)         1536      
_________________________________________________________________
conv_dw_12_relu (ReLU)       (None, 7, 7, 384)         0         
_________________________________________________________________
conv_pw_12 (Conv2D)          (None, 7, 7, 768)         294912    
_________________________________________________________________
conv_pw_12_bn (BatchNormaliz (None, 7, 7, 768)         3072      
_________________________________________________________________
conv_pw_12_relu (ReLU)       (None, 7, 7, 768)         0         
_________________________________________________________________
conv_dw_13 (DepthwiseConv2D) (None, 7, 7, 768)         6912      
_________________________________________________________________
conv_dw_13_bn (BatchNormaliz (None, 7, 7, 768)         3072      
_________________________________________________________________
conv_dw_13_relu (ReLU)       (None, 7, 7, 768)         0         
_________________________________________________________________
conv_pw_13 (Conv2D)          (None, 7, 7, 768)         589824    
_________________________________________________________________
conv_pw_13_bn (BatchNormaliz (None, 7, 7, 768)         3072      
_________________________________________________________________
conv_pw_13_relu (ReLU)       (None, 7, 7, 768)         0         
_________________________________________________________________
detection_layer_30 (Conv2D)  (None, 7, 7, 30)          23070     
_________________________________________________________________
reshape_1 (Reshape)          (None, 7, 7, 5, 6)        0         
=================================================================
Total params: 1,856,046
Trainable params: 1,839,630
Non-trainable params: 16,416
_________________________________________________________________
Epoch 1/10
255/255 [==============================] - 339s 1s/step - loss: 0.2535 - val_loss: 0.1262


b-daman 0.0046
mAP: 0.0046
Saving model on first epoch irrespective of mAP
Epoch 2/10
255/255 [==============================] - 325s 1s/step - loss: 0.1458 - val_loss: 0.0973


b-daman 0.2748
mAP: 0.2748
mAP improved from 0 to 0.2747710177530696, saving model to projects/b-daman/2020-05-27_00-46-17/YOLO_best_mAP.h5.
Epoch 3/10
255/255 [==============================] - 319s 1s/step - loss: 0.1048 - val_loss: 0.1987


b-daman 0.3000
mAP: 0.3000
mAP improved from 0.2747710177530696 to 0.3, saving model to projects/b-daman/2020-05-27_00-46-17/YOLO_best_mAP.h5.
Epoch 4/10
255/255 [==============================] - 320s 1s/step - loss: 0.0890 - val_loss: 0.0502


b-daman 0.5916
mAP: 0.5916
mAP improved from 0.3 to 0.5916088277371909, saving model to projects/b-daman/2020-05-27_00-46-17/YOLO_best_mAP.h5.
Epoch 5/10
255/255 [==============================] - 320s 1s/step - loss: 0.0739 - val_loss: 0.1179


b-daman 0.5767
mAP: 0.5767
mAP did not improve from 0.5916088277371909.
Epoch 6/10
255/255 [==============================] - 319s 1s/step - loss: 0.0710 - val_loss: 0.0617


b-daman 0.5309
mAP: 0.5309
mAP did not improve from 0.5916088277371909.
Epoch 7/10
255/255 [==============================] - 322s 1s/step - loss: 0.0620 - val_loss: 0.0894


b-daman 0.7276
mAP: 0.7276
mAP improved from 0.5916088277371909 to 0.7276139996424675, saving model to projects/b-daman/2020-05-27_00-46-17/YOLO_best_mAP.h5.
Epoch 8/10
255/255 [==============================] - 319s 1s/step - loss: 0.0624 - val_loss: 0.0737


b-daman 0.7894
mAP: 0.7894
mAP improved from 0.7276139996424675 to 0.7894231672196914, saving model to projects/b-daman/2020-05-27_00-46-17/YOLO_best_mAP.h5.
Epoch 9/10
255/255 [==============================] - 319s 1s/step - loss: 0.0611 - val_loss: 0.1168


b-daman 0.6378
mAP: 0.6378
mAP did not improve from 0.7894231672196914.

Epoch 00009: ReduceLROnPlateau reducing learning rate to 1.9999999494757503e-05.
Epoch 10/10
255/255 [==============================] - 317s 1s/step - loss: 0.0486 - val_loss: 0.0559


b-daman 0.7532
mAP: 0.7532
mAP did not improve from 0.7894231672196914.
62-mins to train
Converting to tflite without Reshape layer for K210 Yolo
projects/b-daman/2020-05-27_00-46-17/YOLO_best_mAP.kmodel
1. Import graph...
2. Optimize Pass 1...
3. Optimize Pass 2...
4. Quantize...
  4.1. Add quantization checkpoints...
  4.2. Get activation ranges...
  Plan buffers...
  Run calibration...
  [==================================================] 100% 96.818s
  4.5. Quantize graph...
5. Lowering...
6. Generate code...
  Plan buffers...
  Emit code...
Main memory usage: 7352 B

SUMMARY
INPUTS
0   Input_0 1x3x224x224
OUTPUTS
0   detection_layer_30/BiasAdd  1x30x7x7
0

Incidentally, running train.py while a hidden file (.DS_Store) was still sitting in the dataset produced the following error:

Traceback (most recent call last):
  File "./axelerate/train.py", line 168, in <module>
    setup_training(args.conf)
  File "./axelerate/train.py", line 163, in setup_training
    return(train_from_config(config, dirname))
  File "./axelerate/train.py", line 141, in train_from_config
    config['train']['valid_metric'])
  File "/usr/local/lib/python3.6/dist-packages/axelerate-0.5.8-py3.6.egg/axelerate/networks/yolo/frontend.py", line 126, in train
    is_only_detect=False)
  File "/usr/local/lib/python3.6/dist-packages/axelerate-0.5.8-py3.6.egg/axelerate/networks/yolo/backend/utils/annotation.py", line 40, in get_train_annotations
    is_only_detect)
  File "/usr/local/lib/python3.6/dist-packages/axelerate-0.5.8-py3.6.egg/axelerate/networks/yolo/backend/utils/annotation.py", line 174, in parse_annotation
    fname = parser.get_fname(annotation_file)
  File "/usr/local/lib/python3.6/dist-packages/axelerate-0.5.8-py3.6.egg/axelerate/networks/yolo/backend/utils/annotation.py", line 75, in get_fname
    root = self._root_tag(annotation_file)
  File "/usr/local/lib/python3.6/dist-packages/axelerate-0.5.8-py3.6.egg/axelerate/networks/yolo/backend/utils/annotation.py", line 148, in _root_tag
    tree = parse(fname)
  File "/usr/lib/python3.6/xml/etree/ElementTree.py", line 1196, in parse
    tree.parse(source, parser)
  File "/usr/lib/python3.6/xml/etree/ElementTree.py", line 597, in parse
    self._root = parser._parse_whole(source)
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 0
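The parser chokes because it tries to read .DS_Store as a PascalVOC XML file. One way to avoid this is to purge dotfiles from the dataset directory before training; here is a stdlib-only sketch (the function name and return value are my own, not part of aXeleRate):

```python
import os

def remove_hidden_files(root):
    """Delete dotfiles (e.g. .DS_Store) under root so the XML
    annotation parser doesn't trip over them. Returns the paths
    that were removed."""
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith("."):
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return removed
```

Running `remove_hidden_files("dataset")` once before train.py should be enough; `find dataset -name '.DS_Store' -delete` does the same from the shell.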
Addendum

Increasing the epoch count from 10 to 50 improved the accuracy slightly (mAP 0.7894 → 0.851).
The accuracy appears to be computed in _calc_avg_precisions in ./axelerate/networks/yolo/backend/utils/map_evaluation.py.
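For reference, PASCAL-VOC-style average precision over a precision/recall curve can be sketched like this. This is a generic stdlib-only reimplementation for illustration, not the aXeleRate code itself:

```python
def average_precision(recall, precision):
    """Area under the interpolated precision/recall curve
    (all-points PASCAL VOC style). recall must be sorted ascending."""
    # Add sentinel values at both ends of the curve
    mrec = [0.0] + list(recall) + [1.0]
    mpre = [0.0] + list(precision) + [0.0]
    # Interpolate: make precision monotonically non-increasing
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Sum rectangle areas wherever recall increases
    ap = 0.0
    for i in range(1, len(mrec)):
        ap += (mrec[i] - mrec[i - 1]) * mpre[i]
    return ap
```

For example, `average_precision([0.5, 1.0], [1.0, 0.5])` gives 0.75.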