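鈍足ランナーのIT日記 (A Slow Runner's IT Diary)

The blog of a fifth-rate IT engineer who likes running.

A crappy blogger with a wide range of hobbies, all of them half-baked. I want to build a fun web app that makes people go "wow". Server side: Perl (Mojolicious). Client side: Vue.js. I plan to branch out into Arduino as well.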

Killed in the middle of machine learning training on Google Colab

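I shrank the images (each one under 600 KB) and retried the training on Google Colab.
After a while the process was killed. Even if it had not been killed, the estimate was around 19 hours, so it looked like it would run over the time limit and fail anyway.

The log did contain a few messages that served as hints, though.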

performance bottleneck on CPU or Disk HDD/SSD
OpenCV isn't used - data augmentation will be slow 
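
To get a feel for how often the first warning shows up, a saved log can be scanned for the "Loaded:" lines. The following is only a minimal sketch: the function name `summarize_loading` and the sample text are made up for illustration, and the regular expression assumes the line layout seen in this post's log.

```python
import re

# Matches darknet's data-loading lines as they appear in this post, e.g.
# "Loaded: 1.367416 seconds - performance bottleneck on CPU or Disk HDD/SSD"
LOADED_RE = re.compile(r"^\s*Loaded:\s+([0-9.]+) seconds(.*)$")


def summarize_loading(log_text):
    """Count 'Loaded:' lines and how many of them carry the bottleneck warning."""
    times = []
    flagged = 0
    for line in log_text.splitlines():
        m = LOADED_RE.match(line)
        if not m:
            continue
        times.append(float(m.group(1)))
        if "performance bottleneck" in m.group(2):
            flagged += 1
    return {
        "lines": len(times),
        "flagged": flagged,
        "max_seconds": max(times) if times else 0.0,
    }


# Hypothetical three-line excerpt in the same format as the real log.
sample = """\
Loaded: 0.000032 seconds
Loaded: 1.367416 seconds - performance bottleneck on CPU or Disk HDD/SSD
Loaded: 0.366178 seconds - performance bottleneck on CPU or Disk HDD/SSD
"""
print(summarize_loading(sample))
```

A high share of flagged lines suggests the GPU is waiting on image loading and augmentation, which fits the second hint about OpenCV not being used.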


When I edited the Makefile for darknet (https://github.com/AlexeyAB/darknet), I had only changed GPU=0 to GPU=1, so I added the last four of the lines below.

sed -i 's/GPU=0/GPU=1/g' Makefile
sed -i 's/CUDNN=0/CUDNN=1/g' Makefile
sed -i 's/CUDNN_HALF=0/CUDNN_HALF=1/g' Makefile
sed -i 's/OPENCV=0/OPENCV=1/g' Makefile
sed -i 's/LIBSO=0/LIBSO=1/g' Makefile
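
Since the first attempt failed because some flags were silently left at 0, it may help to check the result after editing. Below is a rough Python equivalent of the sed lines that also reports any flag that did not end up enabled. It is a sketch under assumptions: `enable_flags` is a made-up helper, the sample text is an invented Makefile excerpt, and unlike the sed commands it only matches a flag at the start of a line.

```python
import re

# Flags switched on in this post.
FLAG_NAMES = ["GPU", "CUDNN", "CUDNN_HALF", "OPENCV", "LIBSO"]


def enable_flags(makefile_text, names=FLAG_NAMES):
    """Turn 'NAME=0' into 'NAME=1' and list the names that are still not '=1'."""
    missing = []
    for name in names:
        off = re.compile(rf"^{re.escape(name)}=0\b", re.MULTILINE)
        makefile_text = off.sub(f"{name}=1", makefile_text)
        on = re.compile(rf"^{re.escape(name)}=1\b", re.MULTILINE)
        if not on.search(makefile_text):
            missing.append(name)
    return makefile_text, missing


# Invented excerpt; the real Makefile contains many more lines.
sample = "GPU=0\nCUDNN=0\nCUDNN_HALF=0\nOPENCV=0\nAVX=0\nLIBSO=0\n"
new_text, missing = enable_flags(sample)
print(new_text)
print("not enabled:", missing)
```

In a notebook, the returned text would be written back to the Makefile before running make, and a non-empty `missing` list would be a reason to stop and look at the file.
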

Log from the run before this fix (the one that ended up being killed):

yolov3-tiny-train
net.optimized_memory = 0 
mini_batch = 32, batch = 64, time_steps = 1, train = 1 
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
 Detection layer: 16 - type = 27 
 Detection layer: 23 - type = 27 
Resizing, random_coef = 1.40 

 608 x 608 
 try to allocate additional workspace_size = 53.23 MB 
 CUDA allocate done! 
Loaded: 0.000044 seconds

 1: 1015.314697, 1015.314697 avg loss, 0.000000 rate, 17.601098 seconds, 64 images, -1.000000 hours left
Loaded: 0.000066 seconds

 2: 996.487366, 1013.431946 avg loss, 0.000000 rate, 17.119605 seconds, 128 images, 19.551960 hours left
Loaded: 0.000041 seconds

 3: 1018.983337, 1013.987061 avg loss, 0.000000 rate, 17.377515 seconds, 192 images, 19.546564 hours left
Loaded: 0.000039 seconds

 4: 1031.275024, 1015.715881 avg loss, 0.000000 rate, 15.738978 seconds, 256 images, 19.544037 hours left
Loaded: 0.000062 seconds

 5: 1017.455811, 1015.889893 avg loss, 0.000000 rate, 15.749631 seconds, 320 images, 19.523300 hours left
Loaded: 0.000044 seconds

 6: 1023.503540, 1016.651245 avg loss, 0.000000 rate, 15.517623 seconds, 384 images, 19.502846 hours left
Loaded: 0.000047 seconds

 7: 996.175171, 1014.603638 avg loss, 0.000000 rate, 15.704325 seconds, 448 images, 19.479977 hours left
Loaded: 0.000066 seconds

 8: 1015.984924, 1014.741760 avg loss, 0.000000 rate, 15.649489 seconds, 512 images, 19.459365 hours left
Loaded: 0.000042 seconds

 9: 1020.709961, 1015.338562 avg loss, 0.000000 rate, 15.375564 seconds, 576 images, 19.438308 hours left
Loaded: 0.000041 seconds

 10: 1012.111450, 1015.015869 avg loss, 0.000000 rate, 16.237127 seconds, 640 images, 19.414381 hours left
Resizing, random_coef = 1.40 

 416 x 416 
 try to allocate additional workspace_size = 24.92 MB 
 CUDA allocate done! 
Loaded: 0.000032 seconds

 11: 666.620544, 980.176331 avg loss, 0.000000 rate, 16.988516 seconds, 704 images, 19.400199 hours left
Loaded: 1.367416 seconds - performance bottleneck on CPU or Disk HDD/SSD

 12: 652.894043, 947.448120 avg loss, 0.000000 rate, 14.934125 seconds, 768 images, 19.394440 hours left
Loaded: 0.366178 seconds - performance bottleneck on CPU or Disk HDD/SSD

 13: 625.014893, 915.204773 avg loss, 0.000000 rate, 14.912996 seconds, 832 images, 19.381081 hours left
Loaded: 0.659968 seconds - performance bottleneck on CPU or Disk HDD/SSD

 14: 606.954285, 884.379700 avg loss, 0.000000 rate, 14.189478 seconds, 896 images, 19.356487 hours left
Loaded: 0.269283 seconds - performance bottleneck on CPU or Disk HDD/SSD

 15: 636.519409, 859.593689 avg loss, 0.000000 rate, 13.916811 seconds, 960 images, 19.327339 hours left
Loaded: 0.271091 seconds - performance bottleneck on CPU or Disk HDD/SSD

 16: 646.099121, 838.244263 avg loss, 0.000000 rate, 14.381692 seconds, 1024 images, 19.291099 hours left
Loaded: 0.534902 seconds - performance bottleneck on CPU or Disk HDD/SSD

 17: 625.409668, 816.960815 avg loss, 0.000000 rate, 13.566022 seconds, 1088 images, 19.260345 hours left
Loaded: 0.289113 seconds - performance bottleneck on CPU or Disk HDD/SSD

 18: 637.292603, 798.994019 avg loss, 0.000000 rate, 14.223544 seconds, 1152 images, 19.223754 hours left
Loaded: 0.500223 seconds - performance bottleneck on CPU or Disk HDD/SSD

 19: 632.865723, 782.381165 avg loss, 0.000000 rate, 14.835529 seconds, 1216 images, 19.192043 hours left
Loaded: 0.185020 seconds - performance bottleneck on CPU or Disk HDD/SSD

 20: 651.311035, 769.274170 avg loss, 0.000000 rate, 13.805287 seconds, 1280 images, 19.169731 hours left
Resizing, random_coef = 1.40 

 448 x 448 
 try to allocate additional workspace_size = 28.90 MB 
 CUDA allocate done! 
Loaded: 0.000030 seconds

 21: 634.585815, 755.805359 avg loss, 0.000000 rate, 14.155189 seconds, 1344 images, 19.132705 hours left
Loaded: 0.255651 seconds - performance bottleneck on CPU or Disk HDD/SSD

 22: 683.384277, 748.563232 avg loss, 0.000000 rate, 13.963725 seconds, 1408 images, 19.097834 hours left
Loaded: 0.141749 seconds - performance bottleneck on CPU or Disk HDD/SSD

 23: 672.303467, 740.937256 avg loss, 0.000000 rate, 14.090841 seconds, 1472 images, 19.063980 hours left
Loaded: 0.060910 seconds

 24: 686.782715, 735.521790 avg loss, 0.000000 rate, 14.442522 seconds, 1536 images, 19.030571 hours left
Loaded: 0.197697 seconds - performance bottleneck on CPU or Disk HDD/SSD

 25: 667.680542, 728.737671 avg loss, 0.000000 rate, 14.439458 seconds, 1600 images, 19.000448 hours left
Loaded: 0.000051 seconds

 26: 668.134277, 722.677307 avg loss, 0.000000 rate, 13.779445 seconds, 1664 images, 18.972062 hours left
Loaded: 0.263503 seconds - performance bottleneck on CPU or Disk HDD/SSD

 27: 678.552185, 718.264771 avg loss, 0.000000 rate, 14.383759 seconds, 1728 images, 18.934453 hours left
Loaded: 0.000031 seconds

 28: 687.842529, 715.222534 avg loss, 0.000000 rate, 14.826161 seconds, 1792 images, 18.906758 hours left
Loaded: 0.264664 seconds - performance bottleneck on CPU or Disk HDD/SSD

 29: 690.554443, 712.755737 avg loss, 0.000000 rate, 14.376801 seconds, 1856 images, 18.881274 hours left
Loaded: 0.000069 seconds

 30: 677.958801, 709.276062 avg loss, 0.000000 rate, 13.518326 seconds, 1920 images, 18.853965 hours left
Resizing, random_coef = 1.40 

 608 x 608 
 try to allocate additional workspace_size = 53.23 MB 
 CUDA allocate done! 
Loaded: 0.000039 seconds

 31: 998.757446, 738.224182 avg loss, 0.000000 rate, 17.068204 seconds, 1984 images, 18.814503 hours left
Loaded: 0.000038 seconds

 32: 1000.287231, 764.430481 avg loss, 0.000000 rate, 16.633735 seconds, 2048 images, 18.814536 hours left
Loaded: 0.000033 seconds

 33: 1006.843140, 788.671753 avg loss, 0.000000 rate, 16.525531 seconds, 2112 images, 18.809732 hours left
Loaded: 0.000038 seconds

 34: 1016.232910, 811.427856 avg loss, 0.000000 rate, 15.934735 seconds, 2176 images, 18.803737 hours left
Loaded: 0.000048 seconds

 35: 1008.093018, 831.094360 avg loss, 0.000000 rate, 16.868893 seconds, 2240 images, 18.791248 hours left
Loaded: 0.000040 seconds

 36: 1040.754639, 852.060364 avg loss, 0.000000 rate, 16.590641 seconds, 2304 images, 18.789129 hours left
Loaded: 0.000037 seconds

 37: 1031.819824, 870.036316 avg loss, 0.000000 rate, 15.147514 seconds, 2368 images, 18.783919 hours left
Loaded: 0.000039 seconds

 38: 992.555298, 882.288208 avg loss, 0.000000 rate, 15.923571 seconds, 2432 images, 18.762830 hours left
Loaded: 0.000047 seconds

 39: 1006.625488, 894.721924 avg loss, 0.000000 rate, 16.032506 seconds, 2496 images, 18.750450 hours left
Loaded: 0.000136 seconds

 40: 993.472900, 904.597046 avg loss, 0.000000 rate, 16.614816 seconds, 2560 images, 18.739348 hours left
Resizing, random_coef = 1.40 

 576 x 576 
 try to allocate additional workspace_size = 47.78 MB 
 CUDA allocate done! 
Loaded: 4.384098 seconds - performance bottleneck on CPU or Disk HDD/SSD

 41: 927.673401, 906.904663 avg loss, 0.000000 rate, 15.434270 seconds, 2624 images, 18.734719 hours left
Loaded: 0.000047 seconds

 42: 931.725708, 909.386780 avg loss, 0.000000 rate, 16.278345 seconds, 2688 images, 18.765319 hours left
Loaded: 0.000038 seconds

 43: 935.342346, 911.982361 avg loss, 0.000000 rate, 15.067830 seconds, 2752 images, 18.756638 hours left
Loaded: 0.000047 seconds

 44: 930.110596, 913.795166 avg loss, 0.000000 rate, 15.380332 seconds, 2816 images, 18.734693 hours left
Loaded: 0.000044 seconds

 45: 922.697815, 914.685425 avg loss, 0.000000 rate, 15.468515 seconds, 2880 images, 18.716360 hours left
Loaded: 0.000084 seconds

 46: 931.045776, 916.321472 avg loss, 0.000000 rate, 15.533188 seconds, 2944 images, 18.699136 hours left
Loaded: 0.000041 seconds

 47: 934.911499, 918.180481 avg loss, 0.000000 rate, 15.126537 seconds, 3008 images, 18.682752 hours left
Loaded: 0.000040 seconds

 48: 927.196045, 919.082031 avg loss, 0.000000 rate, 15.187381 seconds, 3072 images, 18.662023 hours left
Loaded: 0.000033 seconds

 49: 921.857483, 919.359558 avg loss, 0.000000 rate, 16.672647 seconds, 3136 images, 18.642127 hours left
Loaded: 0.000071 seconds

 50: 936.776489, 921.101257 avg loss, 0.000000 rate, 15.611139 seconds, 3200 images, 18.638688 hours left
Resizing, random_coef = 1.40 

 384 x 384 
 try to allocate additional workspace_size = 21.23 MB 
 CUDA allocate done! 
Loaded: 0.000036 seconds

 51: 607.817871, 889.772949 avg loss, 0.000000 rate, 14.597170 seconds, 3264 images, 18.623591 hours left
Loaded: 0.931616 seconds - performance bottleneck on CPU or Disk HDD/SSD

 52: 617.431152, 862.538757 avg loss, 0.000000 rate, 14.122304 seconds, 3328 images, 18.597479 hours left
Loaded: 1.025194 seconds - performance bottleneck on CPU or Disk HDD/SSD

 53: 592.081299, 835.493042 avg loss, 0.000000 rate, 14.512741 seconds, 3392 images, 18.576596 hours left
Loaded: 1.427989 seconds - performance bottleneck on CPU or Disk HDD/SSD

 54: 579.720459, 809.915771 avg loss, 0.000000 rate, 14.641062 seconds, 3456 images, 18.561187 hours left
Loaded: 0.373208 seconds - performance bottleneck on CPU or Disk HDD/SSD

 55: 621.096680, 791.033875 avg loss, 0.000000 rate, 12.981670 seconds, 3520 images, 18.551711 hours left
Loaded: 1.148363 seconds - performance bottleneck on CPU or Disk HDD/SSD

 56: 575.216675, 769.452148 avg loss, 0.000000 rate, 13.222831 seconds, 3584 images, 18.512542 hours left
Loaded: 0.311316 seconds - performance bottleneck on CPU or Disk HDD/SSD

 57: 547.103333, 747.217285 avg loss, 0.000000 rate, 13.031938 seconds, 3648 images, 18.484861 hours left
Loaded: 2.189777 seconds - performance bottleneck on CPU or Disk HDD/SSD

 58: 624.601318, 734.955688 avg loss, 0.000000 rate, 13.075155 seconds, 3712 images, 18.446159 hours left
Loaded: 0.287388 seconds - performance bottleneck on CPU or Disk HDD/SSD

 59: 546.540039, 716.114136 avg loss, 0.000000 rate, 11.938977 seconds, 3776 images, 18.428855 hours left
Loaded: 1.713282 seconds - performance bottleneck on CPU or Disk HDD/SSD

 60: 583.485107, 702.851257 avg loss, 0.000000 rate, 12.941078 seconds, 3840 images, 18.378412 hours left
Resizing, random_coef = 1.40 

 384 x 384 
 try to allocate additional workspace_size = 21.23 MB 
 CUDA allocate done! 
Loaded: 11.507755 seconds - performance bottleneck on CPU or Disk HDD/SSD

 61: 584.497192, 691.015869 avg loss, 0.000000 rate, 13.147980 seconds, 3904 images, 18.355012 hours left
Loaded: 1.111983 seconds - performance bottleneck on CPU or Disk HDD/SSD

 62: 598.689941, 681.783264 avg loss, 0.000000 rate, 12.985760 seconds, 3968 images, 18.441303 hours left
Loaded: 0.427884 seconds - performance bottleneck on CPU or Disk HDD/SSD

 63: 590.420654, 672.646973 avg loss, 0.000000 rate, 13.096348 seconds, 4032 images, 18.411105 hours left
Loaded: 1.297984 seconds - performance bottleneck on CPU or Disk HDD/SSD

 64: 603.117249, 665.694031 avg loss, 0.000000 rate, 13.861853 seconds, 4096 images, 18.374973 hours left
Loaded: 0.364039 seconds - performance bottleneck on CPU or Disk HDD/SSD

 65: 569.382751, 656.062927 avg loss, 0.000000 rate, 12.567974 seconds, 4160 images, 18.357050 hours left
Loaded: 1.370452 seconds - performance bottleneck on CPU or Disk HDD/SSD

 66: 578.979614, 648.354614 avg loss, 0.000000 rate, 13.073842 seconds, 4224 images, 18.314834 hours left
Loaded: 0.597948 seconds - performance bottleneck on CPU or Disk HDD/SSD

 67: 559.860962, 639.505249 avg loss, 0.000000 rate, 12.574212 seconds, 4288 images, 18.289531 hours left
Loaded: 1.418515 seconds - performance bottleneck on CPU or Disk HDD/SSD

 68: 591.623291, 634.717041 avg loss, 0.000000 rate, 13.348177 seconds, 4352 images, 18.250589 hours left
Loaded: 0.115704 seconds - performance bottleneck on CPU or Disk HDD/SSD

 69: 564.797302, 627.725098 avg loss, 0.000000 rate, 13.155384 seconds, 4416 images, 18.229368 hours left
Loaded: 0.864500 seconds - performance bottleneck on CPU or Disk HDD/SSD

 70: 558.637329, 620.816345 avg loss, 0.000000 rate, 13.119964 seconds, 4480 images, 18.191988 hours left
Resizing, random_coef = 1.40 

 352 x 352 
 try to allocate additional workspace_size = 17.84 MB 
 CUDA allocate done! 
Loaded: 12.187497 seconds - performance bottleneck on CPU or Disk HDD/SSD

 71: 546.981567, 613.432861 avg loss, 0.000000 rate, 11.937049 seconds, 4544 images, 18.162733 hours left
Loaded: 1.856610 seconds - performance bottleneck on CPU or Disk HDD/SSD

 72: 521.423401, 604.231934 avg loss, 0.000000 rate, 12.443593 seconds, 4608 images, 18.244398 hours left
Loaded: 1.578271 seconds - performance bottleneck on CPU or Disk HDD/SSD

 73: 522.601562, 596.068909 avg loss, 0.000000 rate, 12.076850 seconds, 4672 images, 18.217987 hours left
Loaded: 2.223196 seconds - performance bottleneck on CPU or Disk HDD/SSD

 74: 536.234497, 590.085449 avg loss, 0.000000 rate, 12.571398 seconds, 4736 images, 18.184762 hours left
Loaded: 2.031430 seconds - performance bottleneck on CPU or Disk HDD/SSD

 75: 535.130920, 584.589966 avg loss, 0.000000 rate, 12.038666 seconds, 4800 images, 18.164258 hours left
Loaded: 1.794378 seconds - performance bottleneck on CPU or Disk HDD/SSD

 76: 514.853882, 577.616333 avg loss, 0.000000 rate, 11.974046 seconds, 4864 images, 18.136020 hours left
Loaded: 1.697089 seconds - performance bottleneck on CPU or Disk HDD/SSD

 77: 526.297974, 572.484497 avg loss, 0.000000 rate, 11.480068 seconds, 4928 images, 18.104754 hours left
Loaded: 1.697949 seconds - performance bottleneck on CPU or Disk HDD/SSD

 78: 490.700012, 564.306030 avg loss, 0.000000 rate, 11.335089 seconds, 4992 images, 18.067302 hours left
Loaded: 2.953281 seconds - performance bottleneck on CPU or Disk HDD/SSD

 79: 501.574188, 558.032837 avg loss, 0.000000 rate, 12.094875 seconds, 5056 images, 18.028618 hours left
Loaded: 1.988960 seconds - performance bottleneck on CPU or Disk HDD/SSD

 80: 497.890320, 552.018555 avg loss, 0.000000 rate, 12.156406 seconds, 5120 images, 18.012232 hours left
Resizing, random_coef = 1.40 

 512 x 512 
 try to allocate additional workspace_size = 37.75 MB 
 CUDA allocate done! 
Loaded: 0.000036 seconds

 81: 722.876831, 569.104370 avg loss, 0.000000 rate, 15.557252 seconds, 5184 images, 17.986137 hours left
Loaded: 0.031648 seconds

 82: 716.952393, 583.889160 avg loss, 0.000000 rate, 16.634426 seconds, 5248 images, 17.975635 hours left
Loaded: 0.125553 seconds - performance bottleneck on CPU or Disk HDD/SSD

 83: 714.407166, 596.940979 avg loss, 0.000000 rate, 16.004780 seconds, 5312 images, 17.977262 hours left
Loaded: 0.000031 seconds

 84: 707.743164, 608.021179 avg loss, 0.000000 rate, 14.256392 seconds, 5376 images, 17.972997 hours left
Loaded: 0.000066 seconds

 85: 693.040283, 616.523071 avg loss, 0.000000 rate, 14.995873 seconds, 5440 images, 17.948345 hours left
Loaded: 0.000038 seconds

 86: 692.327393, 624.103516 avg loss, 0.000000 rate, 14.517589 seconds, 5504 images, 17.931943 hours left
Loaded: 0.000044 seconds

 87: 698.254028, 631.518555 avg loss, 0.000000 rate, 15.020326 seconds, 5568 images, 17.910462 hours left
Loaded: 0.000039 seconds

 88: 689.561523, 637.322876 avg loss, 0.000000 rate, 15.756309 seconds, 5632 images, 17.894621 hours left
Loaded: 0.000062 seconds

 89: 692.673706, 642.857971 avg loss, 0.000000 rate, 15.747179 seconds, 5696 images, 17.886894 hours left
Loaded: 0.000037 seconds

 90: 673.029907, 645.875183 avg loss, 0.000000 rate, 15.025270 seconds, 5760 images, 17.879102 hours left
Resizing, random_coef = 1.40 

 352 x 352 
 try to allocate additional workspace_size = 17.84 MB 
 CUDA allocate done! 
Loaded: 0.000035 seconds

 91: 466.763275, 627.963989 avg loss, 0.000000 rate, 13.640402 seconds, 5824 images, 17.863502 hours left
Loaded: 1.511812 seconds - performance bottleneck on CPU or Disk HDD/SSD

 92: 452.196533, 610.387268 avg loss, 0.000000 rate, 11.548206 seconds, 5888 images, 17.832981 hours left
Loaded: 1.991185 seconds - performance bottleneck on CPU or Disk HDD/SSD

 93: 444.995026, 593.848022 avg loss, 0.000000 rate, 12.678122 seconds, 5952 images, 17.796425 hours left
Loaded: 1.718803 seconds - performance bottleneck on CPU or Disk HDD/SSD

 94: 448.383820, 579.301575 avg loss, 0.000000 rate, 12.231447 seconds, 6016 images, 17.777664 hours left
Loaded: 1.949340 seconds - performance bottleneck on CPU or Disk HDD/SSD

 95: 443.578156, 565.729248 avg loss, 0.000000 rate, 11.777046 seconds, 6080 images, 17.751782 hours left
Loaded: 2.131954 seconds - performance bottleneck on CPU or Disk HDD/SSD

 96: 442.834869, 553.439819 avg loss, 0.000000 rate, 12.551112 seconds, 6144 images, 17.723158 hours left
Loaded: 1.515629 seconds - performance bottleneck on CPU or Disk HDD/SSD

 97: 427.801392, 540.875977 avg loss, 0.000000 rate, 12.274213 seconds, 6208 images, 17.705157 hours left
Loaded: 1.609368 seconds - performance bottleneck on CPU or Disk HDD/SSD

 98: 418.018494, 528.590210 avg loss, 0.000000 rate, 12.149747 seconds, 6272 images, 17.677610 hours left
Loaded: 1.603454 seconds - performance bottleneck on CPU or Disk HDD/SSD

 99: 422.383850, 517.969604 avg loss, 0.000000 rate, 12.306350 seconds, 6336 images, 17.649968 hours left
Loaded: 1.322300 seconds - performance bottleneck on CPU or Disk HDD/SSD

 100: 413.198303, 507.492462 avg loss, 0.000000 rate, 11.522801 seconds, 6400 images, 17.624197 hours left
Resizing, random_coef = 1.40 

 352 x 352 
 try to allocate additional workspace_size = 17.84 MB 
 CUDA allocate done! 
Loaded: 12.574291 seconds - performance bottleneck on CPU or Disk HDD/SSD

 101: 422.686554, 499.011871 avg loss, 0.000000 rate, 12.340167 seconds, 6464 images, 17.587112 hours left
Loaded: 2.421066 seconds - performance bottleneck on CPU or Disk HDD/SSD

 102: 420.139923, 491.124664 avg loss, 0.000000 rate, 12.455464 seconds, 6528 images, 17.681079 hours left
Loaded: 1.857835 seconds - performance bottleneck on CPU or Disk HDD/SSD

 103: 408.980499, 482.910248 avg loss, 0.000000 rate, 11.930477 seconds, 6592 images, 17.665896 hours left
Loaded: 1.917278 seconds - performance bottleneck on CPU or Disk HDD/SSD

 104: 398.709473, 474.490173 avg loss, 0.000000 rate, 12.613243 seconds, 6656 images, 17.638864 hours left
Loaded: 1.417423 seconds - performance bottleneck on CPU or Disk HDD/SSD

 105: 404.265930, 467.467743 avg loss, 0.000000 rate, 12.917417 seconds, 6720 images, 17.620154 hours left
Loaded: 3.098972 seconds - performance bottleneck on CPU or Disk HDD/SSD

 106: 425.404480, 463.261414 avg loss, 0.000000 rate, 12.962929 seconds, 6784 images, 17.599048 hours left
Loaded: 1.273338 seconds - performance bottleneck on CPU or Disk HDD/SSD

 107: 395.276794, 456.462952 avg loss, 0.000000 rate, 12.095340 seconds, 6848 images, 17.596795 hours left
Loaded: 1.492958 seconds - performance bottleneck on CPU or Disk HDD/SSD

 108: 386.038574, 449.420502 avg loss, 0.000000 rate, 12.558816 seconds, 6912 images, 17.565396 hours left
Loaded: 2.132636 seconds - performance bottleneck on CPU or Disk HDD/SSD

 109: 381.326233, 442.611084 avg loss, 0.000000 rate, 11.875530 seconds, 6976 images, 17.541657 hours left
Loaded: 2.678359 seconds - performance bottleneck on CPU or Disk HDD/SSD

 110: 391.584076, 437.508392 avg loss, 0.000000 rate, 12.308271 seconds, 7040 images, 17.517647 hours left
Resizing, random_coef = 1.40 

 512 x 512 
 try to allocate additional workspace_size = 37.75 MB 
 CUDA allocate done! 
Loaded: 0.000042 seconds

 111: 591.617981, 452.919342 avg loss, 0.000000 rate, 15.626101 seconds, 7104 images, 17.504409 hours left
Loaded: 0.185489 seconds - performance bottleneck on CPU or Disk HDD/SSD

 112: 580.289307, 465.656342 avg loss, 0.000000 rate, 15.716105 seconds, 7168 images, 17.498171 hours left
Loaded: 0.000031 seconds

 113: 577.830322, 476.873749 avg loss, 0.000000 rate, 16.487335 seconds, 7232 images, 17.494927 hours left
Loaded: 0.162565 seconds - performance bottleneck on CPU or Disk HDD/SSD

 114: 556.829712, 484.869354 avg loss, 0.000000 rate, 15.415988 seconds, 7296 images, 17.497996 hours left
Loaded: 0.000040 seconds

 115: 560.975586, 492.479980 avg loss, 0.000000 rate, 15.624894 seconds, 7360 images, 17.491178 hours left
Loaded: 0.000038 seconds

 116: 555.132751, 498.745270 avg loss, 0.000000 rate, 15.274993 seconds, 7424 images, 17.484886 hours left
Loaded: 0.000130 seconds

 117: 547.374756, 503.608215 avg loss, 0.000000 rate, 15.199473 seconds, 7488 images, 17.474838 hours left
Loaded: 0.000051 seconds

 118: 541.382385, 507.385620 avg loss, 0.000000 rate, 15.070405 seconds, 7552 images, 17.464034 hours left
Loaded: 0.000031 seconds

 119: 529.627075, 509.609772 avg loss, 0.000000 rate, 15.591688 seconds, 7616 images, 17.451904 hours left
Loaded: 0.000062 seconds

 120: 514.651978, 510.113983 avg loss, 0.000000 rate, 14.994060 seconds, 7680 images, 17.445473 hours left
Resizing, random_coef = 1.40 

 352 x 352 
 try to allocate additional workspace_size = 17.84 MB 
 CUDA allocate done! 
Loaded: 0.000036 seconds

 121: 346.383240, 493.740906 avg loss, 0.000000 rate, 13.098542 seconds, 7744 images, 17.432621 hours left
Loaded: 3.479395 seconds - performance bottleneck on CPU or Disk HDD/SSD

 122: 339.327820, 478.299591 avg loss, 0.000000 rate, 13.120226 seconds, 7808 images, 17.399434 hours left
Loaded: 1.691906 seconds - performance bottleneck on CPU or Disk HDD/SSD

 123: 317.450928, 462.214722 avg loss, 0.000000 rate, 12.203648 seconds, 7872 images, 17.404255 hours left
Loaded: 2.306263 seconds - performance bottleneck on CPU or Disk HDD/SSD

 124: 319.163086, 447.909546 avg loss, 0.000000 rate, 11.503239 seconds, 7936 images, 17.379861 hours left
Loaded: 2.789009 seconds - performance bottleneck on CPU or Disk HDD/SSD

 125: 315.997620, 434.718353 avg loss, 0.000000 rate, 11.744388 seconds, 8000 images, 17.354745 hours left
Loaded: 3.247220 seconds - performance bottleneck on CPU or Disk HDD/SSD

 126: 318.260498, 423.072571 avg loss, 0.000000 rate, 12.337262 seconds, 8064 images, 17.337635 hours left
Loaded: 2.695369 seconds - performance bottleneck on CPU or Disk HDD/SSD

 127: 316.035950, 412.368896 avg loss, 0.000000 rate, 12.173175 seconds, 8128 images, 17.331965 hours left
Loaded: 1.661110 seconds - performance bottleneck on CPU or Disk HDD/SSD

 128: 299.784821, 401.110474 avg loss, 0.000000 rate, 12.262478 seconds, 8192 images, 17.318608 hours left
Loaded: 1.906057 seconds - performance bottleneck on CPU or Disk HDD/SSD

 129: 302.260925, 391.225525 avg loss, 0.000000 rate, 12.391744 seconds, 8256 images, 17.295662 hours left
Loaded: 1.984501 seconds - performance bottleneck on CPU or Disk HDD/SSD

 130: 292.830200, 381.385986 avg loss, 0.000000 rate, 11.435034 seconds, 8320 images, 17.276447 hours left
Resizing, random_coef = 1.40 

 576 x 576 
 try to allocate additional workspace_size = 47.78 MB 
 CUDA allocate done! 
Loaded: 0.000046 seconds

 131: 536.737061, 396.921082 avg loss, 0.000000 rate, 17.287810 seconds, 8384 images, 17.247944 hours left
Loaded: 0.000039 seconds

 132: 517.896423, 409.018616 avg loss, 0.000000 rate, 17.411747 seconds, 8448 images, 17.261261 hours left
Loaded: 0.000063 seconds

 133: 517.750610, 419.891815 avg loss, 0.000000 rate, 16.406959 seconds, 8512 images, 17.275729 hours left
Loaded: 0.000029 seconds

 134: 505.877380, 428.490356 avg loss, 0.000000 rate, 16.472390 seconds, 8576 images, 17.279210 hours left
Loaded: 0.000068 seconds

 135: 503.899658, 436.031281 avg loss, 0.000000 rate, 15.432223 seconds, 8640 images, 17.283314 hours left
Loaded: 0.000042 seconds

 136: 468.731018, 439.301270 avg loss, 0.000000 rate, 15.642111 seconds, 8704 images, 17.276164 hours left
Loaded: 0.000050 seconds

 137: 475.806854, 442.951843 avg loss, 0.000000 rate, 15.924224 seconds, 8768 images, 17.271295 hours left
Loaded: 0.000038 seconds

 138: 435.104828, 442.167145 avg loss, 0.000000 rate, 15.203090 seconds, 8832 images, 17.269458 hours left
Loaded: 0.000051 seconds

 139: 435.030212, 441.453461 avg loss, 0.000000 rate, 14.737855 seconds, 8896 images, 17.259860 hours left
Loaded: 0.000035 seconds

 140: 430.544922, 440.362610 avg loss, 0.000000 rate, 15.555245 seconds, 8960 images, 17.245325 hours left
Resizing, random_coef = 1.40 

 384 x 384 
 try to allocate additional workspace_size = 21.23 MB 
 CUDA allocate done! 
Loaded: 0.000033 seconds

 141: 259.435455, 422.269897 avg loss, 0.000000 rate, 13.830828 seconds, 9024 images, 17.239660 hours left
Loaded: 3.217208 seconds - performance bottleneck on CPU or Disk HDD/SSD

 142: 257.376404, 405.780548 avg loss, 0.000000 rate, 13.733285 seconds, 9088 images, 17.215522 hours left
Loaded: 0.823767 seconds - performance bottleneck on CPU or Disk HDD/SSD

 143: 250.167648, 390.219269 avg loss, 0.000000 rate, 12.717095 seconds, 9152 images, 17.225020 hours left
Loaded: 1.308737 seconds - performance bottleneck on CPU or Disk HDD/SSD

 144: 257.587219, 376.956055 avg loss, 0.000000 rate, 14.274990 seconds, 9216 images, 17.197847 hours left
Loaded: 1.301508 seconds - performance bottleneck on CPU or Disk HDD/SSD

 145: 247.826492, 364.043091 avg loss, 0.000000 rate, 13.668226 seconds, 9280 images, 17.192788 hours left
Loaded: 0.878806 seconds - performance bottleneck on CPU or Disk HDD/SSD

 146: 250.368637, 352.675659 avg loss, 0.000000 rate, 13.430761 seconds, 9344 images, 17.181161 hours left
Loaded: 0.475596 seconds - performance bottleneck on CPU or Disk HDD/SSD

 147: 228.536316, 340.261719 avg loss, 0.000000 rate, 13.001029 seconds, 9408 images, 17.162542 hours left
Loaded: 1.078036 seconds - performance bottleneck on CPU or Disk HDD/SSD

 148: 223.889694, 328.624512 avg loss, 0.000000 rate, 12.987791 seconds, 9472 images, 17.135154 hours left
Loaded: 0.590001 seconds - performance bottleneck on CPU or Disk HDD/SSD

 149: 218.829437, 317.645020 avg loss, 0.000000 rate, 13.177049 seconds, 9536 images, 17.114307 hours left
Loaded: 0.928036 seconds - performance bottleneck on CPU or Disk HDD/SSD

 150: 222.473373, 308.127869 avg loss, 0.000001 rate, 12.604869 seconds, 9600 images, 17.090433 hours left
Resizing, random_coef = 1.40 

 512 x 512 
 try to allocate additional workspace_size = 37.75 MB 
 CUDA allocate done! 
Loaded: 0.000040 seconds

 151: 278.791138, 305.194183 avg loss, 0.000001 rate, 14.935591 seconds, 9664 images, 17.064256 hours left
Loaded: 0.058206 seconds

 152: 270.419800, 301.716736 avg loss, 0.000001 rate, 15.117169 seconds, 9728 images, 17.053301 hours left
Loaded: 0.000165 seconds

 153: 271.630493, 298.708099 avg loss, 0.000001 rate, 15.113211 seconds, 9792 images, 17.044976 hours left
Loaded: 0.000071 seconds

 154: 262.927490, 295.130035 avg loss, 0.000001 rate, 14.934310 seconds, 9856 images, 17.036029 hours left
Loaded: 0.000057 seconds

 155: 256.062805, 291.223328 avg loss, 0.000001 rate, 15.350188 seconds, 9920 images, 17.025218 hours left
Loaded: 0.000134 seconds

 156: 242.387054, 286.339691 avg loss, 0.000001 rate, 15.218376 seconds, 9984 images, 17.018915 hours left
Loaded: 0.000036 seconds

 157: 244.575897, 282.163300 avg loss, 0.000001 rate, 14.524953 seconds, 10048 images, 17.011226 hours left
Loaded: 0.102205 seconds - performance bottleneck on CPU or Disk HDD/SSD

 158: 229.659668, 276.912933 avg loss, 0.000001 rate, 14.724908 seconds, 10112 images, 16.996169 hours left
Loaded: 0.000037 seconds

 159: 231.606628, 272.382294 avg loss, 0.000001 rate, 14.991273 seconds, 10176 images, 16.984445 hours left
Loaded: 0.000035 seconds

 160: 221.785172, 267.322571 avg loss, 0.000001 rate, 14.600964 seconds, 10240 images, 16.974550 hours left
Resizing, random_coef = 1.40 

 544 x 544 
 try to allocate additional workspace_size = 42.61 MB 
 CUDA allocate done! 
Loaded: 0.000038 seconds

 161: 232.056107, 263.795929 avg loss, 0.000001 rate, 17.695390 seconds, 10304 images, 16.960549 hours left
Loaded: 0.000039 seconds

 162: 226.635498, 260.079895 avg loss, 0.000001 rate, 17.313188 seconds, 10368 images, 16.979646 hours left
Loaded: 0.000060 seconds

 163: 219.116028, 255.983505 avg loss, 0.000001 rate, 16.956160 seconds, 10432 images, 16.994428 hours left
Loaded: 0.000188 seconds

 164: 217.835037, 252.168655 avg loss, 0.000001 rate, 18.294337 seconds, 10496 images, 17.005209 hours left
Loaded: 0.000163 seconds

 165: 213.502472, 248.302032 avg loss, 0.000001 rate, 15.455855 seconds, 10560 images, 17.030095 hours left
Loaded: 0.000047 seconds

 166: 218.979858, 245.369812 avg loss, 0.000001 rate, 17.328986 seconds, 10624 images, 17.024444 hours left
Loaded: 0.075878 seconds

 167: 204.260437, 241.258881 avg loss, 0.000001 rate, 16.347495 seconds, 10688 images, 17.038754 hours left
Loaded: 0.000038 seconds

 168: 196.338226, 236.766815 avg loss, 0.000001 rate, 18.085547 seconds, 10752 images, 17.043230 hours left
Loaded: 0.055135 seconds

 169: 193.910706, 232.481201 avg loss, 0.000001 rate, 17.676848 seconds, 10816 images, 17.065309 hours left
 CUDA-version: 10010 (10010), GPU count: 1  
 OpenCV isn't used - data augmentation will be slow 
 0 : compute_capability = 600, cudnn_half = 0, GPU: Tesla P100-PCIE-16GB 
   layer   filters  size/strd(dil)      input                output
   0 conv     16       3 x 3/ 1    416 x 416 x   3 ->  416 x 416 x  16 0.150 BF
   1 max                2x 2/ 2    416 x 416 x  16 ->  208 x 208 x  16 0.003 BF
   2 conv     32       3 x 3/ 1    208 x 208 x  16 ->  208 x 208 x  32 0.399 BF
   3 max                2x 2/ 2    208 x 208 x  32 ->  104 x 104 x  32 0.001 BF
   4 conv     64       3 x 3/ 1    104 x 104 x  32 ->  104 x 104 x  64 0.399 BF
   5 max                2x 2/ 2    104 x 104 x  64 ->   52 x  52 x  64 0.001 BF
   6 conv    128       3 x 3/ 1     52 x  52 x  64 ->   52 x  52 x 128 0.399 BF
   7 max                2x 2/ 2     52 x  52 x 128 ->   26 x  26 x 128 0.000 BF
   8 conv    256       3 x 3/ 1     26 x  26 x 128 ->   26 x  26 x 256 0.399 BF
   9 max                2x 2/ 2     26 x  26 x 256 ->   13 x  13 x 256 0.000 BF
  10 conv    512       3 x 3/ 1     13 x  13 x 256 ->   13 x  13 x 512 0.399 BF
  11 max                2x 2/ 1     13 x  13 x 512 ->   13 x  13 x 512 0.000 BF
  12 conv   1024       3 x 3/ 1     13 x  13 x 512 ->   13 x  13 x1024 1.595 BF
  13 conv    256       1 x 1/ 1     13 x  13 x1024 ->   13 x  13 x 256 0.089 BF
  14 conv    512       3 x 3/ 1     13 x  13 x 256 ->   13 x  13 x 512 0.399 BF
  15 conv     24       1 x 1/ 1     13 x  13 x 512 ->   13 x  13 x  24 0.004 BF
  16 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
  17 route  13 		                           ->   13 x  13 x 256 
  18 conv    128       1 x 1/ 1     13 x  13 x 256 ->   13 x  13 x 128 0.011 BF
  19 upsample                 2x    13 x  13 x 128 ->   26 x  26 x 128
  20 route  19 8 	                           ->   26 x  26 x 384 
  21 conv    256       3 x 3/ 1     26 x  26 x 384 ->   26 x  26 x 256 1.196 BF
  22 conv     24       1 x 1/ 1     26 x  26 x 256 ->   26 x  26 x  24 0.008 BF
  23 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
Total BFLOPS 5.451 
avg_outputs = 325268 
 Allocate additional workspace_size = 24.92 MB 
 Create 64 permanent cpu-threads 
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.519873, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 308.169128, iou_loss = 0.000000, total_loss = 308.169128 

・・・

bash: line 2:  1088 Killed                  ./darknet detector train kome/data.txt kome/yolov3-tiny-train.cfg
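
The numbered progress lines in a log like this have a fixed layout, so they can be pulled apart to follow the average loss and the remaining-time estimate. This is a minimal sketch: `parse_progress` is a made-up name, the field labels are my reading of the line (iteration, loss, average loss, learning rate, seconds per iteration, images seen, hours left), and the pattern assumes the exact format shown in this post.

```python
import re

# Example line from the log:
# " 2: 996.487366, 1013.431946 avg loss, 0.000000 rate, 17.119605 seconds, 128 images, 19.551960 hours left"
PROGRESS_RE = re.compile(
    r"^\s*(\d+): ([0-9.]+), ([0-9.]+) avg loss, ([0-9.]+) rate, "
    r"([0-9.]+) seconds, (\d+) images, (-?[0-9.]+) hours left"
)


def parse_progress(log_text):
    """Return one dict per progress line found in log_text."""
    rows = []
    for line in log_text.splitlines():
        m = PROGRESS_RE.match(line)
        if m:
            rows.append({
                "iteration": int(m.group(1)),
                "loss": float(m.group(2)),
                "avg_loss": float(m.group(3)),
                "seconds": float(m.group(5)),
                "images": int(m.group(6)),
                "hours_left": float(m.group(7)),
            })
    return rows


# Two lines copied from the log in this post.
sample = """\
 1: 1015.314697, 1015.314697 avg loss, 0.000000 rate, 17.601098 seconds, 64 images, -1.000000 hours left
 2: 996.487366, 1013.431946 avg loss, 0.000000 rate, 17.119605 seconds, 128 images, 19.551960 hours left
"""
for row in parse_progress(sample):
    print(row["iteration"], row["avg_loss"], row["hours_left"])
```

The first iteration reports -1 hours left, so any summary of the estimate should skip negative values.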

Here is the log from the run after the fix. The error has changed.

 CUDNN_HALF=1 
yolov3-tiny-train
net.optimized_memory = 0 
mini_batch = 32, batch = 64, time_steps = 1, train = 1 
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
 Detection layer: 16 - type = 27 
 Detection layer: 23 - type = 27 
 If error occurs - run training with flag: -dont_show 
 CUDA-version: 10010 (10010), cuDNN: 7.6.5, CUDNN_HALF=1, GPU count: 1  
 OpenCV version: 3.2.0
 0 : compute_capability = 600, cudnn_half = 0, GPU: Tesla P100-PCIE-16GB 
   layer   filters  size/strd(dil)      input                output
   0 conv     16       3 x 3/ 1    416 x 416 x   3 ->  416 x 416 x  16 0.150 BF
   1 max                2x 2/ 2    416 x 416 x  16 ->  208 x 208 x  16 0.003 BF
   2 conv     32       3 x 3/ 1    208 x 208 x  16 ->  208 x 208 x  32 0.399 BF
   3 max                2x 2/ 2    208 x 208 x  32 ->  104 x 104 x  32 0.001 BF
   4 conv     64       3 x 3/ 1    104 x 104 x  32 ->  104 x 104 x  64 0.399 BF
   5 max                2x 2/ 2    104 x 104 x  64 ->   52 x  52 x  64 0.001 BF
   6 conv    128       3 x 3/ 1     52 x  52 x  64 ->   52 x  52 x 128 0.399 BF
   7 max                2x 2/ 2     52 x  52 x 128 ->   26 x  26 x 128 0.000 BF
   8 conv    256       3 x 3/ 1     26 x  26 x 128 ->   26 x  26 x 256 0.399 BF
   9 max                2x 2/ 2     26 x  26 x 256 ->   13 x  13 x 256 0.000 BF
  10 conv    512       3 x 3/ 1     13 x  13 x 256 ->   13 x  13 x 512 0.399 BF
  11 max                2x 2/ 1     13 x  13 x 512 ->   13 x  13 x 512 0.000 BF
  12 conv   1024       3 x 3/ 1     13 x  13 x 512 ->   13 x  13 x1024 1.595 BF
  13 conv    256       1 x 1/ 1     13 x  13 x1024 ->   13 x  13 x 256 0.089 BF
  14 conv    512       3 x 3/ 1     13 x  13 x 256 ->   13 x  13 x 512 0.399 BF
  15 conv     24       1 x 1/ 1     13 x  13 x 512 ->   13 x  13 x  24 0.004 BF
  16 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
  17 route  13 		                           ->   13 x  13 x 256 
  18 conv    128       1 x 1/ 1     13 x  13 x 256 ->   13 x  13 x 128 0.011 BF
  19 upsample                 2x    13 x  13 x 128 ->   26 x  26 x 128
  20 route  19 8 	                           ->   26 x  26 x 384 
  21 conv    256       3 x 3/ 1     26 x  26 x 384 ->   26 x  26 x 256 1.196 BF
  22 conv     24       1 x 1/ 1     26 x  26 x 256 ->   26 x  26 x  24 0.008 BF
  23 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
Total BFLOPS 5.451 
avg_outputs = 325268 
 Allocate additional workspace_size = 1264.58 MB 
Unable to init server: Could not connect: Connection refused

(chart_yolov3-tiny-train.png:8192): Gtk-WARNING **: 03:42:20.404: cannot open display: 

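Apparently setting the "-dont_show" flag takes care of this. While I was at it, I redirected the console log to Google Drive, which is handy because it lets me watch the progress along the way.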

./darknet detector train kome/data.txt kome/yolov3-tiny-train.cfg -dont_show > /content/Drive/console.log
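
The same idea can be written as a notebook cell in Python. This is a sketch rather than the way the post actually ran it: `darknet_train_cmd` and `run_logged` are made-up helper names, and only stdout goes to the file, matching what `>` does in the shell command above.

```python
import subprocess
import sys


def darknet_train_cmd(data, cfg, show_chart=False):
    """Build the training command; add -dont_show unless a display is available."""
    cmd = ["./darknet", "detector", "train", data, cfg]
    if not show_chart:
        cmd.append("-dont_show")
    return cmd


def run_logged(cmd, log_path):
    """Run cmd with stdout written to log_path and return the exit code."""
    with open(log_path, "w") as log:
        return subprocess.run(cmd, stdout=log, check=False).returncode


# Only prints the command here; nothing is executed.
print(darknet_train_cmd("kome/data.txt", "kome/yolov3-tiny-train.cfg"))
```

Calling `run_logged(darknet_train_cmd(...), "/content/Drive/console.log")` would then start training with the log written to the path used above. The exit code is worth checking afterwards, since a killed process returns a non-zero value.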

With this, training on 3 classes (2,400 samples in total) finished in 4 hours.