The proposed approach to pose estimation builds a Convolutional Neural Network with an encoding-decoding structure, a spatial pyramid based on the WASP structure in its bottleneck, and a Discrete Wavelet Transform (DWT) encoder. These techniques have already shown their ability to address one of the main problems in the state of the art: the different fields of view (FoV) required to analyze the different possible sizes of a given subject. We aim to remedy the faulty structure of the encoding part of modern CNN-based networks by using a DWT encoder and WASP. This work also aims to demonstrate, from a more general point of view, the possible advantages of a DWT encoder in any CNN-based approach to pose estimation and object detection in any form, such as several subjects in the same image or in video, given the almost redundant use of the best-known CNN encoding structures such as ResNet-101, U-Net, or VGG16-19. We run our tests on a U-Net-based CNN in order to evaluate the importance of the DWT encoder's outputs in the decoding part as well, by cropping them at the last layers of the network. This is necessary because border pixels that could be useful for evaluating the result are lost during encoding.
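As a rough illustration of why a DWT can serve as a downsampling step in an encoder, the following minimal sketch implements a single-level 2D Haar transform in NumPy. The choice of the Haar wavelet, the function names `haar_dwt2` and `haar_idwt2`, and the sub-band labels are assumptions made for this example; the abstract does not state which wavelet or implementation the authors use. The sketch shows that the four half-resolution sub-bands together retain all the information of the input, whereas 2x2 max pooling keeps only one of every four values.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar DWT of a 2D array with even height and width.

    Returns four sub-bands, each at half the input resolution.
    Sub-band naming (LL, LH, HL, HH) follows one common convention;
    other libraries may order or label them differently.
    """
    a = x[0::2, 0::2]  # top-left sample of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-pass approximation
    lh = (a - b + c - d) / 2.0  # differences between columns
    hl = (a + b - c - d) / 2.0  # differences between rows
    hh = (a - b - c + d) / 2.0  # diagonal differences
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the full-resolution array."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x


if __name__ == "__main__":
    # Hypothetical 8x8 single-channel feature map.
    rng = np.random.default_rng(0)
    feature_map = rng.standard_normal((8, 8))

    bands = haar_dwt2(feature_map)
    restored = haar_idwt2(*bands)

    print("sub-band shape:", bands[0].shape)
    print("lossless round trip:", np.allclose(restored, feature_map))
```

In a CNN encoder the four sub-bands would typically be stacked along the channel axis before the next convolution, so spatial resolution is halved without discarding detail. Whether the paper's encoder does exactly this is not stated in the abstract.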
2022, SYSTEM 2022: 8th Scholar’s Yearly Symposium of Technology, Engineering and Mathematics, Brunek, July 23, 2022, Pages -
A Novel DWT-based Encoder for Human Pose Estimation (04b Conference paper in volume)
DE MAGISTRIS Giorgio, Romano Matteo, Starczewski Janusz, Napoli Christian
Research group: Artificial Intelligence and Robotics