We are pleased to announce that our paper “Phase randomization: A data augmentation for domain adaptation in human action recognition” has been published in Pattern Recognition.
Yu Mitsuzumi, Go Irie, Akisato Kimura, Atsushi Nakazawa, “Phase randomization: A data augmentation for domain adaptation in human action recognition,” Pattern Recognition, November 2023.
Human action recognition is gaining increasing attention as it has many potential applications, including video surveillance and human-computer interaction.
A key problem in human action recognition is that an action is dependent on its performer; two motion sequences performed by two different subjects will be significantly different from each other, even if they are within the same category of actions. This subject dependency of motion data often degrades action classification performance.
A natural remedy is to learn an action classifier that is invariant to subject individuality. Unfortunately, this is difficult to achieve when only a limited amount of ground-truth annotation is available: in such situations, it is unrealistic to assume that class labels are given for all possible combinations of classes and subjects.
The above figure briefly illustrates an example of this issue. Here, each subject in the class-labeled data performs a motion of a different class, e.g., subject 1 performs the motion of drinking water while subject 4 performs the motion of putting on glasses. Now suppose the action classifier trained on these labeled data is applied to unlabeled data of the put on glasses motion performed by subject 1. As can be seen in the figure, these data are similar to the labeled data of the drink water motion, which shares the same subject individuality, even though the recorded actions are different. As a result, the trained classifier mistakenly recognizes the input as the drink water motion.
In this paper, we propose a data-efficient domain adaptation approach to learning a subject-agnostic action recognition classifier. The core component of our approach is a novel data augmentation called Phase Randomization.
On the basis of the observation that individual body size is highly correlated with the amplitude component of the motion sequence, we disentangle the individuality and action features by using contrastive self-supervised learning with data augmentation that randomizes only the phase component of the motion sequence. This enables us to estimate the subject label of each motion sequence and to train a subject-agnostic action recognition classifier by performing adversarial learning with the estimated subject labels.
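To make the augmentation concrete, here is a minimal sketch of FFT-based phase randomization in numpy. The function name and the per-channel treatment are our own illustrative assumptions, not code from the paper; the key property, shared with the paper's augmentation, is that the amplitude spectrum of each channel is left untouched while the phases are drawn at random.

```python
import numpy as np

def phase_randomize(x, rng=None):
    """Randomize only the phase spectrum of a motion sequence.

    x: array of shape (T, D) -- T time steps, D coordinate channels
       (e.g. flattened skeleton joints or sensor axes); this layout
       is an illustrative assumption.
    Returns a sequence with the same amplitude spectrum per channel
    (which, per the observation above, carries the individuality)
    but uniformly random phases.
    """
    rng = np.random.default_rng() if rng is None else rng
    T = x.shape[0]
    X = np.fft.rfft(x, axis=0)                    # one-sided spectrum per channel
    amplitude = np.abs(X)                         # preserved component
    phase = rng.uniform(-np.pi, np.pi, size=X.shape)
    phase[0] = 0.0                                # keep the DC bin real
    if T % 2 == 0:
        phase[-1] = 0.0                           # keep the Nyquist bin real for even T
    return np.fft.irfft(amplitude * np.exp(1j * phase), n=T, axis=0)
```

Under this sketch, two phase-randomized views of the same sequence share their amplitude spectra, so treating them as a positive pair in contrastive self-supervised learning encourages the learned representation to capture individuality rather than the action itself.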
We empirically demonstrate the superiority of our method on two different action recognition tasks, namely skeleton-based action recognition and sensor-based activity recognition.
The published paper can be found at https://doi.org/10.1016/j.patcog.2023.110051.