Liu C., ATR Intelligent Robotics and Communication Labs. | Ishi C.T., ATR Intelligent Robotics and Communication Labs. | Ishiguro H., ATR Hiroshi Ishiguro Labs.
ACM/IEEE International Conference on Human-Robot Interaction | Year: 2015

In a tele-operated robot system, the reproduction of auditory scenes, conveying the 3D spatial information of sound sources in the remote robot environment, is important for transmitting remote presence to the tele-operator. We proposed a tele-presence system that is able to reproduce and manipulate the auditory scenes of a remote robot environment, based on the spatial information of human voices around the robot, matched with the operator's head orientation. On the robot side, voice sources are localized and separated using multiple microphone arrays and human tracking technologies, while on the operator side, the operator's head movement is tracked and used to relocate the spatial positions of the separated sources. Interaction experiments with humans in the robot environment indicated that the proposed system achieved significantly higher accuracy rates for the perceived direction of sounds, and higher subjective scores for sense of presence and listenability, compared to a baseline system using stereo binaural sounds obtained from two microphones located at the humanoid robot's ears. We also proposed three different user interfaces for augmented auditory scene control. Evaluation results indicated higher subjective scores for sense of presence and usability for two of the interfaces (control of voice amplitudes based on virtual robot positioning, and amplification of voices in the frontal direction). © 2015 ACM.
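
As a rough illustration of the operator-side relocation step, the sketch below wraps a localized source's azimuth into the operator's head frame and renders it with constant-power stereo panning. This is a simplified stand-in for the binaural rendering a real tele-presence system would use; the function names and parameter choices are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def relative_azimuth(source_az_deg: float, head_yaw_deg: float) -> float:
    """Azimuth of a localized source relative to the operator's head yaw,
    wrapped to (-180, 180] degrees."""
    return (source_az_deg - head_yaw_deg + 180.0) % 360.0 - 180.0

def pan_stereo(mono: np.ndarray, az_deg: float) -> np.ndarray:
    """Constant-power stereo panning of one separated mono voice source."""
    az = np.clip(az_deg, -90.0, 90.0)          # fold sources into the frontal plane
    p = (az + 90.0) / 180.0                    # 0 = hard left, 1 = hard right
    left, right = np.cos(p * np.pi / 2.0), np.sin(p * np.pi / 2.0)
    return np.stack([mono * left, mono * right], axis=1)

# A source localized at 40 deg while the operator's head is turned 25 deg right:
fs = 16000
voice = np.sin(2 * np.pi * 220 * np.arange(fs) / fs)  # 1 s stand-in for a separated voice
stereo = pan_stereo(voice, relative_azimuth(40.0, 25.0))
```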


Ishi C.T., ATR Intelligent Robotics and Communication Labs. | Liu C., Osaka University | Ishiguro H., Osaka University | Hagita N., ATR Intelligent Robotics and Communication Labs.
IEEE International Conference on Intelligent Robots and Systems | Year: 2012

Generating natural motion in robots is important for improving human-robot interaction. We developed a tele-operation system in which the lip motion of a remote humanoid robot is automatically controlled from the operator's voice. In the present work, we introduce an improved version of our speech-driven lip motion generation method, in which the degrees of lip height and width are estimated from vowel formant information. The method requires the calibration of only one parameter for speaker normalization. Lip height control is evaluated on two types of humanoid robots (Telenoid-R2 and Geminoid-F). Subjective evaluation indicated that the proposed audio-based method can generate lip motion with naturalness superior to vision-based and motion capture-based approaches. Partial lip width control was shown to improve lip motion naturalness in Geminoid-F, which also has an actuator for stretching the lip corners. Issues regarding online real-time processing are also discussed. © 2012 IEEE.
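
To make the formant-based estimation concrete, the sketch below maps F1 (which rises as the mouth opens) and F2 (which rises as the lips spread) to normalized lip height and width, with a single gain standing in for the one speaker-normalization parameter the abstract mentions. The formant ranges and the linear mapping are illustrative guesses, not the authors' formula.

```python
def lip_shape_from_formants(f1_hz: float, f2_hz: float, speaker_gain: float = 1.0):
    """Illustrative mapping from vowel formants to lip opening commands.

    F1 correlates with mouth opening (lip height), F2 with lip spreading
    (lip width). `speaker_gain` plays the role of the single calibration
    parameter for speaker normalization; the ranges below are rough adult
    averages, not values from the paper.
    """
    height = min(max((f1_hz - 250.0) / (850.0 - 250.0), 0.0), 1.0)
    width = min(max((f2_hz - 800.0) / (2400.0 - 800.0), 0.0), 1.0)
    return speaker_gain * height, width

# /a/ has a high F1 (open mouth); /i/ has a low F1 and high F2 (spread lips).
print(lip_shape_from_formants(800.0, 1200.0))  # -> large height, moderate width
print(lip_shape_from_formants(300.0, 2300.0))  # -> small height, large width
```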


Liu C., ATR Intelligent Robotics and Communication Labs. | Ishi C.T., ATR Intelligent Robotics and Communication Labs. | Ishiguro H., ATR Hiroshi Ishiguro Labs. | Hagita N., ATR Intelligent Robotics and Communication Labs.
HRI'12 - Proceedings of the 7th Annual ACM/IEEE International Conference on Human-Robot Interaction | Year: 2012

Head motion occurs naturally and in synchrony with speech during human dialogue, and may carry paralinguistic information such as intentions, attitudes and emotions. Natural-looking head motion by a robot is therefore important for smooth human-robot interaction. Based on rules inferred from analyses of the relationship between head motion and dialogue acts, this paper proposes a model for generating head tilting and nodding, and evaluates the model using three types of humanoid robot (a very human-like android, "Geminoid F"; a typical humanoid robot with fewer facial degrees of freedom, "Robovie R2"; and a robot with a 3-axis rotatable neck and movable lips, "Telenoid R2"). Analysis of subjective scores shows that the proposed model, including both head tilting and nodding, can generate head motion with greater naturalness than nodding only or directly mapping people's original motions without gaze information. We also find that an upward motion of a robot's face can be used by robots that do not have a mouth to give the appearance that an utterance is taking place. Finally, we conduct an experiment in which participants act as visitors to an information desk attended by robots, and verify that our generation model performs as well as directly mapping people's original motions with gaze information in terms of perceived naturalness. © 2012 ACM.
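
A minimal sketch of what a rule-based generator of this kind might look like: each utterance's dialogue act triggers a nod or a tilt with some probability. The act labels, probabilities, and overall structure here are hypothetical; the paper's actual rules were inferred from analyses of recorded dialogues.

```python
import random
from typing import Optional

# Hypothetical dialogue-act -> (gesture, probability) rules, for illustration only.
HEAD_MOTION_RULES = {
    "affirmation": [("nod", 0.9)],
    "backchannel": [("nod", 0.7)],
    "question":    [("tilt", 0.5)],
    "thinking":    [("tilt", 0.6)],
    "statement":   [("nod", 0.2), ("tilt", 0.1)],
}

def generate_head_motion(dialogue_act: str) -> Optional[str]:
    """Stochastically pick a head gesture for an utterance's dialogue act."""
    for gesture, prob in HEAD_MOTION_RULES.get(dialogue_act, []):
        if random.random() < prob:
            return gesture
    return None  # keep the head still

print(generate_head_motion("question"))  # 'tilt' about half the time
```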


Liu C., ATR Intelligent Robotics and Communication Labs. | Ishi C.T., ATR Intelligent Robotics and Communication Labs. | Ishiguro H., ATR Hiroshi Ishiguro Labs. | Hagita N., ATR Intelligent Robotics and Communication Labs.
International Journal of Humanoid Robotics | Year: 2013

Head motion occurs naturally and in synchrony with speech during human dialogue, and may carry paralinguistic information such as intentions, attitudes and emotions. Natural-looking head motion by a robot is therefore important for smooth human-robot interaction. Based on rules inferred from analyses of the relationship between head motion and dialogue acts, this paper proposes a model for generating head tilting and nodding, and evaluates the model using three types of humanoid robot (a very human-like android, "Geminoid F"; a typical humanoid robot with fewer facial degrees of freedom, "Robovie R2"; and a robot with a 3-axis rotatable neck and movable lips, "Telenoid R2"). Analysis of subjective scores shows that the proposed model, including both head tilting and nodding, can generate head motion with greater naturalness than nodding only or directly mapping people's original motions without gaze information. We also find that an upward motion of a robot's face can be used by robots that do not have a mouth to give the appearance that an utterance is taking place. Finally, we conduct an experiment in which participants act as visitors to an information desk attended by robots, and verify that our generation model performs as well as directly mapping people's original motions with gaze information in terms of perceived naturalness. © 2013 World Scientific Publishing Company.
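
One detail from this journal version worth illustrating separately is the upward face motion that lets a mouthless robot such as Telenoid R2 appear to be the one speaking. A minimal sketch, assuming the commanded neck pitch simply follows a smoothed speech amplitude envelope; the 5-degree range and the smoothing window are illustrative choices, not the paper's tuned values.

```python
import numpy as np

def face_pitch_from_speech(envelope: np.ndarray, max_up_deg: float = 5.0) -> np.ndarray:
    """Drive a small upward face motion from the speech amplitude envelope,
    so a robot without a movable mouth appears to be producing the utterance."""
    env = envelope / (envelope.max() + 1e-9)                    # normalize to [0, 1]
    smoothed = np.convolve(env, np.ones(5) / 5.0, mode="same")  # suppress jitter
    return max_up_deg * smoothed                                # neck pitch in degrees

loudness = np.abs(np.sin(np.linspace(0, 3 * np.pi, 30)))  # fake per-frame loudness
print(face_pitch_from_speech(loudness).round(1))
```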


Ishi C.T., ATR Intelligent Robotics and Communication Labs. | Even J., ATR Intelligent Robotics and Communication Labs. | Hagita N., ATR Intelligent Robotics and Communication Labs.
IEEE International Conference on Intelligent Robots and Systems | Year: 2015

We developed a system for detecting the speech activity intervals of multiple speakers by combining multiple microphone arrays with human tracking technologies. We also proposed a method for estimating the face orientation of the detected speakers. The developed system was evaluated in two steps: individual utterances at different positions and orientations, and simultaneous dialogues by multiple speakers. Evaluation results revealed that the proposed system could detect speech activity intervals with more than 90% accuracy, and estimate face orientations with standard deviations within 30 degrees, excluding the cases where all arrays were in the direction opposite to the speaker's face orientation. © 2015 IEEE.
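
A simplified sketch of the fusion idea: each microphone array reports direction-of-arrival (DOA) estimates, and a tracked person is marked as speaking whenever some array's DOA points at that person's position. The 2D geometry check and the 10-degree tolerance are assumptions for illustration, not the system's actual detector.

```python
import numpy as np

def doa_points_at(array_pos, person_pos, doa_deg, tol_deg=10.0) -> bool:
    """Does a DOA estimate from one array point toward a tracked person (2D)?"""
    dx, dy = person_pos[0] - array_pos[0], person_pos[1] - array_pos[1]
    expected = np.degrees(np.arctan2(dy, dx))                # bearing array -> person
    diff = (doa_deg - expected + 180.0) % 360.0 - 180.0      # wrap to (-180, 180]
    return abs(diff) <= tol_deg

def speech_active(array_positions, doas_per_array, person_pos) -> bool:
    """Fuse detections across arrays: the person is speaking in this frame
    if at least one array hears sound from their tracked direction."""
    return any(
        doa_points_at(arr, person_pos, doa)
        for arr, doas in zip(array_positions, doas_per_array)
        for doa in doas
    )

arrays = [(0.0, 0.0), (5.0, 0.0)]   # two ceiling-mounted microphone arrays
doas = [[44.0], []]                 # per-array DOA estimates for one frame
print(speech_active(arrays, doas, (3.0, 3.0)))  # -> True (44 deg ~ bearing 45 deg)
```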
