Multi-Resolution Localization of Individual Logs in Wooden Piles Utilizing YOLO with Tiling on Client/Server Architectures

  • Christoph Praschl,
  • Philipp Auersperg-Castell,
  • Brigitte Forster-Heinlein,
  • Gerald Adam Zwettler

  • Research Group Advanced Information Systems and Technology, Research and Development Department, University of Applied Sciences Upper Austria, Softwarepark 11, 4232 Hagenberg, Austria
  • Department of Software Engineering, School of Informatics, Communications and Media, University of Applied Sciences Upper Austria, Softwarepark 11, 4232 Hagenberg, Austria
  • Bluedynamics Auersperg-Castell KG, Kritzing 31, 4785 Freinberg, Austria
  • Faculty of Computer Science and Mathematics, University of Passau, Innstraße 43, 94032 Passau, Germany
Cite as
Praschl C., Auersperg-Castell P., Forster-Heinlein B., Zwettler G.A. (2021). Multi-Resolution Localization of Individual Logs in Wooden Piles Utilizing YOLO with Tiling on Client/Server Architectures. Proceedings of the 33rd European Modeling & Simulation Symposium (EMSS 2021), pp. 307-314. DOI: https://doi.org/10.46354/i3m.2021.emss.042

Abstract

In industrial domains with time- and cost-intensive manual or semi-automated inspection, the demand for automation is high. By utilizing state-of-the-art deep learning models for localization in vision-based domains such as wood log analysis, precision can be increased while the demand for manual inspection is reduced. In this paper, a YOLO network is trained on wood log images to allow for the detection of individual logs in images containing hundreds to thousands of instances. Due to the high variability in scale and the large number of wood logs within the images, common YOLO architectures are not directly applicable. Thus, tiling is required, which implicitly forms a multi-resolution image pyramid. Due to a lack of training data, modelling of different seasonal and weather conditions is applied in addition to common data augmentation. The wood log detection process can be run on a client/server architecture to allow for both preview and refined results. Evaluation on real-world data sets shows a log detection accuracy of 82.9% utilizing a tiny YOLO model and 94.1% with a fully connected YOLO model, respectively.
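The tiling step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `tile_boxes` is a hypothetical helper that covers an image with overlapping fixed-size tiles (here 416 px with 104 px overlap, both assumed values), each of which would then be passed to the YOLO detector.

```python
def tile_boxes(width, height, tile=416, overlap=104):
    """Return (x0, y0, x1, y1) boxes covering the image with overlapping tiles.

    Assumes the image is at least as large as one tile in each dimension.
    """
    stride = tile - overlap

    def positions(extent):
        # Regular grid positions, plus one extra tile flush with the far
        # edge so the whole image is covered.
        pos = list(range(0, max(extent - tile, 0) + 1, stride))
        if pos[-1] + tile < extent:
            pos.append(extent - tile)
        return pos

    return [(x, y, x + tile, y + tile)
            for y in positions(height)
            for x in positions(width)]

# A 1000 x 1000 px image yields a 3 x 3 grid of overlapping tiles.
boxes = tile_boxes(1000, 1000)
```

The overlap ensures that logs lying on a tile border appear fully inside at least one tile; duplicate detections in the overlapping regions would then need to be merged afterwards, e.g. by non-maximum suppression.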
