Two papers accepted to ICIP2024

Created: August 14, 2024
Tags: Paper, Computer Vision, Machine Learning, Cross-modal
Updated: August 14, 2024

We are pleased to announce that the following two papers have been accepted to the IEEE International Conference on Image Processing (ICIP2024).

Junpei Homma, Akisato Kimura, Go Irie, “Estimating indoor scene depth maps from ultrasonic echoes,” IEEE International Conference on Image Processing (ICIP), 2024.

Measuring 3D geometric structures of indoor scenes requires dedicated depth sensors, which are not always available. Echo-based depth estimation has recently been studied as a promising alternative solution.

All previous studies have assumed echoes in the audible range. However, audible echoes cannot be used in quiet spaces or other situations where producing audible sounds is prohibited.

In this paper, we consider echo-based depth estimation using inaudible ultrasonic echoes. While ultrasonic waves offer high measurement accuracy in theory, they are sensitive to noise and susceptible to attenuation, so the depth estimation accuracy actually achievable with ultrasonic echoes has remained unclear.

We first investigate the depth estimation accuracy when the frequency of the sound source is restricted to the high-frequency band, and find that the accuracy decreases when the frequency is limited to the ultrasonic range.

Based on this observation, we propose a novel deep learning method to improve the accuracy of ultrasonic echo-based depth estimation by using audible echoes as auxiliary data only during training. Experimental results with a public dataset demonstrate that our method improves the estimation accuracy.
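
For illustration, below is a minimal PyTorch-style sketch of one plausible way to use audible echoes as training-only auxiliary data: a deployed network sees only ultrasonic echo spectrograms, while an auxiliary network fed audible echoes provides a feature-alignment signal during training. All module names, architectures, losses, and weights here are our own assumptions, not the paper's actual implementation.

```python
# Hedged sketch: one plausible form of training-only audible-echo supervision.
# The paper's actual architecture and losses may differ.
import torch
import torch.nn as nn

class EchoDepthNet(nn.Module):
    """Encodes a 2-channel echo spectrogram (e.g., binaural) and regresses depth."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(        # spectrogram -> feature map
            nn.Conv2d(2, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(        # feature map -> depth map
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, spec):
        feat = self.encoder(spec)
        return self.decoder(feat), feat

ultra_net = EchoDepthNet()   # deployed model (ultrasonic input only)
audio_net = EchoDepthNet()   # auxiliary model (audible input, training only)
opt = torch.optim.Adam(
    list(ultra_net.parameters()) + list(audio_net.parameters()), lr=1e-4)

def training_step(ultra_spec, audio_spec, gt_depth):
    pred_u, feat_u = ultra_net(ultra_spec)
    pred_a, feat_a = audio_net(audio_spec)
    depth_loss = (nn.functional.l1_loss(pred_u, gt_depth)
                  + nn.functional.l1_loss(pred_a, gt_depth))
    # Transfer cue: pull ultrasonic features toward the audible ones.
    align_loss = nn.functional.mse_loss(feat_u, feat_a.detach())
    loss = depth_loss + 0.1 * align_loss    # 0.1 is an arbitrary weight
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

At inference time only `ultra_net` and the ultrasonic input are needed, which is the point of the setup: the audible echoes act purely as privileged training information.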

Yu Mitsuzumi, Akisato Kimura, Go Irie, Atsushi Nakazawa, “Cross-action cross-subject skeleton action recognition via simultaneous action-subject learning with two-step feature removal,” IEEE International Conference on Image Processing (ICIP), 2024. (oral)

In this paper, we tackle a novel skeleton-based action recognition problem named Cross-Action Cross-Subject (CACS) skeleton action recognition, in which only a subset of the target action classes is available for each training subject.

Existing skeleton-based action recognition methods struggle with this problem because there are few clues for disentangling action and subject information, and a trained model ends up confusing the two kinds of features.

To solve this challenging problem, we propose a method that combines simultaneous action-subject learning with a two-step feature removal (illustrative sketches of both steps follow the list below). In our method,

  1. we first apply two data augmentation techniques, Bone Randomization and Phase Randomization, to roughly remove the features that are unnecessary for the respective recognition tasks, and then
  2. we introduce a debiased learning approach that removes the remaining confusing features by minimizing mutual information with an action-subject-shared discriminator network.
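
As a concrete illustration of the first step, the following NumPy sketch shows plausible forms of the two augmentations. The paper's exact formulations are not reproduced here; in particular, the parent-joint convention and the scale range below are our assumptions.

```python
# Hedged sketches of the two augmentations; the paper's definitions may differ.
import numpy as np

def bone_randomization(seq, parents, scale_range=(0.8, 1.2)):
    """Randomly rescale each bone of a skeleton sequence.

    seq:     (T, J, 3) joint coordinates over T frames
    parents: length-J array; parents[j] < j is joint j's parent (-1 = root)
    Perturbing bone lengths blurs body-size (subject) cues while keeping
    the motion pattern (action) largely intact.
    """
    out = seq.copy()
    scales = np.random.uniform(*scale_range, size=len(parents))
    for j in range(len(parents)):             # parents are visited first
        p = parents[j]
        if p < 0:
            continue
        bone = seq[:, j] - seq[:, p]          # original bone vector
        out[:, j] = out[:, p] + scales[j] * bone
    return out

def phase_randomization(seq):
    """Randomly roll the sequence along time.

    Shifting the motion phase disturbs action-specific timing cues while
    leaving static body-shape (subject) cues untouched.
    """
    return np.roll(seq, np.random.randint(len(seq)), axis=0)
```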
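
For the second step, one plausible instantiation of the mutual-information minimization is a MINE-style estimator: a discriminator shared across the action and subject branches scores feature pairs, and the encoders are trained to push the resulting MI estimate down while the discriminator tries to push it up. Again, this is a hedged sketch rather than the paper's actual objective.

```python
# Hedged sketch of MI minimization with a shared discriminator (MINE-style).
import math
import torch
import torch.nn as nn

class SharedDiscriminator(nn.Module):
    """Scores (action-feature, subject-feature) pairs; shared by both branches."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, f_act, f_sub):
        return self.net(torch.cat([f_act, f_sub], dim=1))

def mi_lower_bound(disc, f_act, f_sub):
    """Donsker-Varadhan lower bound on I(f_act; f_sub) for one mini-batch."""
    joint = disc(f_act, f_sub).mean()                 # correctly paired samples
    shuffled = f_sub[torch.randperm(f_sub.size(0))]   # break the pairing
    marginal = (torch.logsumexp(disc(f_act, shuffled), dim=0)
                - math.log(f_act.size(0)))
    return joint - marginal

# Per mini-batch, training alternates two updates:
#   1) the discriminator is updated to MAXIMIZE mi_lower_bound (tighter bound);
#   2) the feature encoders are updated to MINIMIZE it, together with their
#      action / subject classification losses, so the two feature spaces end
#      up sharing as little information as possible.
```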

Extensive experiments on three datasets demonstrate that our method is consistently effective across several CACS settings.