TARGET TRACKING IN COMPLEX SCENES BASED ON COMPUTER VISION

ABSTRACT

Objective:

To use a deep learning network model to identify key content in videos.

Methodology:

After a review of the computer vision literature, features were extracted from the target video with a deep learning network combined with a time-series data augmentation method. The preprocessing method for data augmentation and the spatio-temporal feature extraction from video based on the LI3D network are explained. Accuracy, precision, and recall were used as evaluation indices.
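
The abstract does not specify how the time-series data augmentation is carried out. As a rough illustration only, the sketch below assumes two common video augmentations, a random temporal crop and a horizontal flip; the function names, the clip length, and the 0.5 flip probability are hypothetical and are not taken from the paper.

```python
import random


def temporal_crop(frames, clip_len, rng=None):
    """Return clip_len consecutive frames starting at a random offset."""
    rng = rng or random.Random()
    if len(frames) < clip_len:
        raise ValueError("video is shorter than the requested clip length")
    start = rng.randint(0, len(frames) - clip_len)  # inclusive bounds
    return frames[start:start + clip_len]


def horizontal_flip(frames):
    """Mirror every frame left to right; a frame is a list of pixel rows."""
    return [[row[::-1] for row in frame] for frame in frames]


def augment(frames, clip_len, rng=None):
    """Assumed augmentation: random temporal crop, then a flip half the time."""
    rng = rng or random.Random()
    clip = temporal_crop(frames, clip_len, rng)
    if rng.random() < 0.5:  # hypothetical flip probability
        clip = horizontal_flip(clip)
    return clip


# Toy video: 10 frames, each a single row of two pixel values.
video = [[[t, t + 1]] for t in range(10)]
clip = augment(video, clip_len=4, rng=random.Random(0))
```

Cropping along the time axis keeps the frames consecutive, so the motion information that a 3D convolutional network relies on is preserved in each augmented clip.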

Results:

The three indicators increased from 0.85, 0.88, and 0.84 to 0.89, 0.90, and 0.88, respectively. This shows that, after data augmentation, the LI3D network model maintains a high recall rate together with high accuracy. The accuracy and loss function curves from the training phase show that the network's accuracy is substantially higher than that of I3D.
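
For reference, accuracy, precision, and recall can be computed from true and predicted labels as in the minimal sketch below. It assumes a binary labelling (key content versus everything else) for simplicity; the study's actual class setup is not given in the abstract, and the labels in the example are invented for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for a binary labelling."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    accuracy = correct / len(pairs)
    # Guard against empty denominators when no positives are predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Invented labels: 1 marks a clip with key content, 0 marks any other clip.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
accuracy, precision, recall = classification_metrics(y_true, y_pred)
```

Precision penalizes false alarms and recall penalizes missed key content, which is why the two are reported alongside accuracy.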

Conclusion:

The experiments show that the LI3D model is more stable and converges faster. A comparison of the accuracy and loss function curves during the training of LI3D, LI3D-LSTM, and LI3D-BiLSTM shows that the LI3D-BiLSTM model converges fastest. Level of evidence II; Therapeutic studies - investigation of treatment results.

Keywords:
Computers; Computer Vision Systems; Public Health

Sociedade Brasileira de Medicina do Exercício e do Esporte Av. Brigadeiro Luís Antônio, 278, 6º and., 01318-901 São Paulo SP, Tel.: +55 11 3106-7544, Fax: +55 11 3106-8611 - São Paulo - SP - Brazil
E-mail: atharbme@uol.com.br