Abstract. This article reviews the literature and research of recent years on the topic and describes the tasks associated with recognizing and predicting human movements.
Article 3
Model parameters. As an example, consider the action “grabbing an object,” in which the person in the video interacts with some object. 71 transitions into the action were filmed, from which 142 samples of the beginning and end of the action were formed. Since the actions were filmed from two angles, the total number of transitions was 284. Each transition is taken together with its temporal neighborhood, and the result is called a slice. The length of the slice is the radius of the neighborhood around the moment of transition. All relevant data from all videos are combined into one dataset. A training example is a window X frames long. Depending on the length of the slice and the length of the window, it may contain one or two transitions or none at all.

To achieve greater accuracy, a series of training runs with different parameters was performed. When the data are divided into slices of 15 frames (half a second before and after the transition), the resulting dataset has 76 columns: 75 columns of OpenPose data and 1 label column for the predicted action “grabbing an object.” With a window length of 10 frames the results are unsatisfactory: the accuracy in determining the start frame of an action is 0.571. In Fig. 7, green dots indicate the input data and red crosses indicate the model's predictions. A change in the graph means a transition from zero to one or vice versa, that is, the beginning or end of an action. The model predicts a transition where there is none. The results for window sizes of 12 (accuracy 0.5781) and 15 (accuracy 0.6115) are nearly the same. It is therefore clear that the example slice should be larger. [9]
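The slicing and windowing described above can be illustrated with a minimal sketch. It is not the authors' code. The function names, the toy label sequence, and the stride of 1 are assumptions made for illustration, and the slice is assumed to span the radius on each side of the transition (30 frames for a radius of 15). The 75 feature columns match what OpenPose would produce for 25 keypoints with x, y and confidence values, but the source does not state which OpenPose model was used.

```python
from typing import List, Tuple

N_FEATURES = 75  # OpenPose columns per frame, as stated in the text

Frame = List[float]


def find_transitions(labels: List[int]) -> List[int]:
    """Return every index i where labels[i] differs from labels[i - 1]."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


def cut_slice(frames: List[Frame], labels: List[int],
              transition: int, radius: int) -> Tuple[List[Frame], List[int]]:
    """Take `radius` frames before and after a transition, clipped to the video."""
    start = max(0, transition - radius)
    stop = min(len(frames), transition + radius)
    return frames[start:stop], labels[start:stop]


def make_windows(frames: List[Frame], labels: List[int],
                 window: int) -> List[Tuple[List[Frame], List[int]]]:
    """Slide a window of `window` frames over a slice (stride 1 is an assumption)."""
    return [(frames[s:s + window], labels[s:s + window])
            for s in range(len(frames) - window + 1)]


# Dataset size from the text: 71 actions, a start and an end each, two camera angles.
total_transitions = 71 * 2 * 2

# Toy video (hypothetical): 60 frames, action active on frames 20..39.
labels = [0] * 20 + [1] * 20 + [0] * 20
frames = [[0.0] * N_FEATURES for _ in labels]

transitions = find_transitions(labels)
sl_frames, sl_labels = cut_slice(frames, labels, transitions[0], radius=15)
windows = make_windows(sl_frames, sl_labels, window=10)

# Count the windows that actually contain a 0 -> 1 or 1 -> 0 change.
with_transition = sum(1 for _, w in windows if find_transitions(w))
print(len(windows), with_transition)
```

In this toy setup fewer than half of the 10-frame windows contain the transition, which shows how the ratio of window length to slice length controls how many training examples carry a transition at all.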
Conclusion. The preprint reviews the literature and research of recent years on the research topic, describes current problems in recognizing and predicting human movements, and details the steps taken to prepare the dataset and train the model. A method for predicting human physical activity using a neural network was designed and implemented. The input to the neural network was annotated video data recording the possible actions of a worker at an enterprise. The result is a convolutional neural network that, with experimentally selected parameters, predicts actions and detects transitions between them with an accuracy of up to 87%. The final assessment of the quality of the network is presented in Table 1. In the future, the research results may be used for more advanced interaction with assistive devices in enterprises.