Two papers accepted to ICIP2024

Created: August 14, 2024
Tags: Paper, Computer Vision, Machine Learning, Cross-modal
Updated: August 14, 2024

We are pleased to announce that the following two papers have been accepted to the IEEE International Conference on Image Processing (ICIP2024).

Junpei Homma, Akisato Kimura, Go Irie, “Estimating indoor scene depth maps from ultrasonic echoes,” IEEE International Conference on Image Processing (ICIP), 2024.

Measuring 3D geometric structures of indoor scenes requires dedicated depth sensors, which are not always available. Echo-based depth estimation has recently been studied as a promising alternative solution.

All previous studies have assumed echoes in the audible range. However, audible echoes cannot be used in quiet spaces or other situations where producing audible sounds is prohibited.

In this paper, we consider echo-based depth estimation using inaudible ultrasonic echoes. While ultrasonic waves offer high measurement accuracy in theory, they are sensitive to noise and susceptible to attenuation, so the depth estimation accuracy actually achievable with ultrasonic echoes has remained unclear.

We first investigate the depth estimation accuracy when the frequency of the sound source is restricted to the high-frequency band, and find that the accuracy decreases when the frequency is limited to the ultrasonic range.

Based on this observation, we propose a novel deep learning method to improve the accuracy of ultrasonic echo-based depth estimation by using audible echoes as auxiliary data only during training. Experimental results with a public dataset demonstrate that our method improves the estimation accuracy.
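
For illustration, below is a minimal PyTorch-style sketch of one plausible way to use audible echoes as training-only auxiliary data: a deployed network sees only ultrasonic echo spectrograms, while an auxiliary network fed audible echoes provides a feature-alignment signal during training. All module names, architectures, losses, and weights here are our own assumptions, not the paper's actual implementation.

```python
# Hedged sketch: one plausible form of training-only audible-echo supervision.
# The paper's actual architecture and losses may differ.
import torch
import torch.nn as nn

class EchoDepthNet(nn.Module):
    """Encodes a 2-channel echo spectrogram (e.g., binaural) and regresses depth."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(        # spectrogram -> feature map
            nn.Conv2d(2, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(        # feature map -> depth map
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, spec):
        feat = self.encoder(spec)
        return self.decoder(feat), feat

ultra_net = EchoDepthNet()   # deployed model (ultrasonic input only)
audio_net = EchoDepthNet()   # auxiliary model (audible input, training only)
opt = torch.optim.Adam(
    list(ultra_net.parameters()) + list(audio_net.parameters()), lr=1e-4)

def training_step(ultra_spec, audio_spec, gt_depth):
    pred_u, feat_u = ultra_net(ultra_spec)
    pred_a, feat_a = audio_net(audio_spec)
    depth_loss = (nn.functional.l1_loss(pred_u, gt_depth)
                  + nn.functional.l1_loss(pred_a, gt_depth))
    # Transfer cue: pull ultrasonic features toward the audible ones.
    align_loss = nn.functional.mse_loss(feat_u, feat_a.detach())
    loss = depth_loss + 0.1 * align_loss    # 0.1 is an arbitrary weight
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

At inference time only `ultra_net` and the ultrasonic input are needed, which is the point of the setup: the audible echoes act purely as privileged training information.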

Yu Mitsuzumi, Akisato Kimura, Go Irie, Atsushi Nakazawa, “Cross-action cross-subject skeleton action recognition via simultaneous action-subject learning with two-step feature removal,” IEEE International Conference on Image Processing (ICIP), 2024. (oral)

In this paper, we tackle a novel skeleton-based action recognition problem named Cross-Action Cross-Subject (CACS) skeleton action recognition, in which only a subset of the target action classes is available for each training subject.

Existing skeleton-based action recognition methods struggle with this problem because there are few clues for disentangling action and subject information, and a trained model ends up confusing the two kinds of features.

To solve this challenging problem, we propose a method that combines simultaneous action-subject learning with a two-step feature removal (illustrative sketches of both steps follow the list below). In our method,

  1. we first apply two data augmentation techniques, Bone Randomization and Phase Randomization, to roughly remove the features that are unnecessary for the respective recognition tasks, and then
  2. we introduce a debiased learning approach that removes the remaining confusing features by minimizing mutual information with an action-subject-shared discriminator network.
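
As a concrete illustration of the first step, the following NumPy sketch shows plausible forms of the two augmentations. The paper's exact formulations are not reproduced here; in particular, the parent-joint convention and the scale range below are our assumptions.

```python
# Hedged sketches of the two augmentations; the paper's definitions may differ.
import numpy as np

def bone_randomization(seq, parents, scale_range=(0.8, 1.2)):
    """Randomly rescale each bone of a skeleton sequence.

    seq:     (T, J, 3) joint coordinates over T frames
    parents: length-J array; parents[j] < j is joint j's parent (-1 = root)
    Perturbing bone lengths blurs body-size (subject) cues while keeping
    the motion pattern (action) largely intact.
    """
    out = seq.copy()
    scales = np.random.uniform(*scale_range, size=len(parents))
    for j in range(len(parents)):             # parents are visited first
        p = parents[j]
        if p < 0:
            continue
        bone = seq[:, j] - seq[:, p]          # original bone vector
        out[:, j] = out[:, p] + scales[j] * bone
    return out

def phase_randomization(seq):
    """Randomly roll the sequence along time.

    Shifting the motion phase disturbs action-specific timing cues while
    leaving static body-shape (subject) cues untouched.
    """
    return np.roll(seq, np.random.randint(len(seq)), axis=0)
```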
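
For the second step, one plausible instantiation of the mutual-information minimization is a MINE-style estimator: a discriminator shared across the action and subject branches scores feature pairs, and the encoders are trained to push the resulting MI estimate down while the discriminator tries to push it up. Again, this is a hedged sketch rather than the paper's actual objective.

```python
# Hedged sketch of MI minimization with a shared discriminator (MINE-style).
import math
import torch
import torch.nn as nn

class SharedDiscriminator(nn.Module):
    """Scores (action-feature, subject-feature) pairs; shared by both branches."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, f_act, f_sub):
        return self.net(torch.cat([f_act, f_sub], dim=1))

def mi_lower_bound(disc, f_act, f_sub):
    """Donsker-Varadhan lower bound on I(f_act; f_sub) for one mini-batch."""
    joint = disc(f_act, f_sub).mean()                 # correctly paired samples
    shuffled = f_sub[torch.randperm(f_sub.size(0))]   # break the pairing
    marginal = (torch.logsumexp(disc(f_act, shuffled), dim=0)
                - math.log(f_act.size(0)))
    return joint - marginal

# Per mini-batch, training alternates two updates:
#   1) the discriminator is updated to MAXIMIZE mi_lower_bound (tighter bound);
#   2) the feature encoders are updated to MINIMIZE it, together with their
#      action / subject classification losses, so the two feature spaces end
#      up sharing as little information as possible.
```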

Extensive experiments on three datasets demonstrate that our method is consistently effective across several CACS settings.