NTT Media Intelligence Laboratories

United States

Niwa K.,NTT Media Intelligence Laboratories | Ochi D.,NTT Media Intelligence Laboratories | Kameda A.,NTT Media Intelligence Laboratories | Kamamoto Y.,NTT Communication Science Laboratories | Moriya T.,NTT Communication Science Laboratories
141st Audio Engineering Society International Convention 2016, AES 2016 | Year: 2016

In virtual reality (VR), 360° video services delivered through head-mounted displays (HMDs) or smartphones are widely used. In many of these services, however, the sound played through the headphones stays fixed even though the image follows the user's head motion and the viewpoint changes seamlessly. We have been studying acoustic immersion technology that addresses this by, for example, generating binaural sound corresponding to the user's head motion. Our method consists of angular region-wise source enhancement using array observation signals, multichannel audio encoding based on MPEG-4 Audio Lossless Coding (ALS), and binaural synthesis of the enhanced signals using head-related transfer functions (HRTFs). In this paper, we constructed a smartphone-based real-time system for streaming and viewing 360° video with acoustic immersion and evaluated it through subjective tests.
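The binaural synthesis step described above can be illustrated with a minimal sketch. It assumes one already-enhanced signal per angular region and a toy table of head-related impulse responses (the time-domain form of HRTFs); the four-region layout, the azimuth convention (degrees clockwise, 90 = right), the filter values, and all function names are hypothetical and are not taken from the paper.

```python
def convolve(signal, kernel):
    """Direct-form FIR convolution; returns len(signal) + len(kernel) - 1 samples."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def relative_azimuth(source_deg, head_yaw_deg):
    """Source direction as seen from the rotated head, wrapped to [0, 360)."""
    return (source_deg - head_yaw_deg) % 360


def nearest_region(azimuth_deg, region_centers):
    """Pick the region center closest to azimuth_deg on the circle."""
    def circ_dist(a, b):
        d = abs(a - b) % 360
        return min(d, 360 - d)
    return min(region_centers, key=lambda c: circ_dist(azimuth_deg, c))


def binaural_mix(region_signals, hrirs, head_yaw_deg):
    """Filter each region signal with the impulse-response pair that matches
    its direction relative to the head, then sum into left/right channels.

    region_signals: {region_center_deg: [samples]}
    hrirs:          {region_center_deg: (left_ir, right_ir)}
    """
    centers = list(hrirs)
    longest_sig = max(len(s) for s in region_signals.values())
    longest_ir = max(max(len(l), len(r)) for l, r in hrirs.values())
    n = longest_sig + longest_ir - 1
    left, right = [0.0] * n, [0.0] * n
    for center, sig in region_signals.items():
        key = nearest_region(relative_azimuth(center, head_yaw_deg), centers)
        left_ir, right_ir = hrirs[key]
        for i, v in enumerate(convolve(sig, left_ir)):
            left[i] += v
        for i, v in enumerate(convolve(sig, right_ir)):
            right[i] += v
    return left, right


# Toy impulse responses (assumed values): the ear facing the source gets the
# direct, louder path; the far ear gets a delayed, attenuated copy.
TOY_HRIRS = {
    0: ([0.7], [0.7]),
    90: ([0.0, 0.3], [1.0, 0.0]),
    180: ([0.5], [0.5]),
    270: ([1.0, 0.0], [0.0, 0.3]),
}

if __name__ == "__main__":
    source = {90: [1.0, 0.5]}  # one enhanced source in the region to the right
    print(binaural_mix(source, TOY_HRIRS, head_yaw_deg=0))
    print(binaural_mix(source, TOY_HRIRS, head_yaw_deg=180))
```

With the head turned by 180 degrees, the same source is rendered through the 270-degree filter pair, so it moves from the right ear to the left ear. A real system would interpolate measured HRTFs and filter in blocks rather than pick the nearest entry.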


Niigaki H.,NTT Media Intelligence Laboratories | Shimamura J.,NTT Media Intelligence Laboratories | Kojima A.,NTT Media Intelligence Laboratories
Proceedings - 2015 International Conference on 3D Vision, 3DV 2015 | Year: 2015

We present a new unsupervised technique to segment 3D Lidar points in outdoor environments. The main idea of this work is to identify artificial objects according to the existence of extruded shapes. Many artificial objects are composed of extruded shapes such as cylinders, planes, cubes, and lines. Therefore, we detect these arbitrarily extruded shapes on the basis of an indicator for repetitive cross-section shapes, and connect the components according to the strength of the overlap between the extruded surfaces. Conventional segmentation methods that use local geometry information may produce erroneous results in scenes where many objects are very near to, or partially in contact with, each other. In contrast, our method is more robust in such complex scenes because it uses large-scale surface overlapping strength. Experiments show it provides good results in urban environments and expressway scenes. © 2015 IEEE.


Onishi T.,NTT Media Intelligence Laboratories | Sano T.,NTT Media Intelligence Laboratories | Yokohari K.,NTT Media Intelligence Laboratories | Su J.,NTT Media Intelligence Laboratories | And 4 more authors.
NTT Technical Review | Year: 2014

For 4K and 8K video services that are capable of providing high definition pictures with an abundant sense of presence, the latest H.265/MPEG-H video coding standard, also known as HEVC (High Efficiency Video Coding), should allow the encoding of huge quantities of video data to be performed effectively and with high compression efficiency. This article introduces HEVC hardware encoder technology that is capable of encoding 4K video in real time.


Tanaka Y.,NTT Media Intelligence Laboratories | Ochi D.,NTT Media Intelligence Laboratories
NTT Technical Review | Year: 2014

Video technology is becoming more sophisticated, and the amount of available content is increasing substantially. Accordingly, individual preferences for video content are becoming more diverse. We have perceived a need for a means of personalized viewing that enables viewers to select their preferred subjects and scenes within the content rather than just passively selecting video content created by third parties, as in the conventional style. This article introduces a system that partitions 4K video into tiles, compresses it, and enables only parts selected by the viewer within the video to be distributed live at the desired size and with high quality.
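The tile-based delivery idea can be sketched as follows: the frame is divided into a fixed grid, and only the tiles overlapping the viewer's selected region need to be sent. The 8 x 8 grid, the frame size, the region format (x, y, width, height in pixels), and the function names are assumptions for illustration; the article does not specify them.

```python
def tile_size(width, height, cols, rows):
    """Tile dimensions for a uniform grid (assumes the frame divides evenly)."""
    if width % cols or height % rows:
        raise ValueError("frame size must be divisible by the grid")
    return width // cols, height // rows


def tiles_for_roi(width, height, cols, rows, roi):
    """Return (col, row) indices of every tile that overlaps the region.

    roi is (x, y, w, h) in pixels, with the origin at the top-left corner.
    """
    x, y, w, h = roi
    tw, th = tile_size(width, height, cols, rows)
    # First and last tile index touched along each axis, clamped to the grid.
    c0 = max(0, x // tw)
    c1 = min(cols - 1, (x + w - 1) // tw)
    r0 = max(0, y // th)
    r1 = min(rows - 1, (y + h - 1) // th)
    return [(c, r) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]


if __name__ == "__main__":
    # 4K UHD frame split into a hypothetical 8 x 8 grid of 480 x 270 tiles.
    selected = tiles_for_roi(3840, 2160, 8, 8, roi=(1000, 500, 960, 540))
    print(len(selected), "of 64 tiles needed:", selected)
```

In this example a 960 x 540 region needs 9 of the 64 tiles, which is the kind of saving that makes sending only the selected part attractive.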


Tsutsuguchi K.,NTT Media Intelligence Laboratories | Ando S.,NTT Media Intelligence Laboratories | Katayama A.,NTT Media Intelligence Laboratories | Tanaka H.,NTT Media Intelligence Laboratories | And 2 more authors.
NTT Technical Review | Year: 2014

Mobile video watermarking technology is a digital watermarking technology that can detect invisible information embedded in external videos with both high speed and accuracy, simply by directing the camera in the mobile device towards the video. In this article, we give an overview of the technology and describe two example use case applications in existing broadcast TV services and a new opportunity using video synchronized augmented reality.


Sugaya Y.,NTT Media Intelligence Laboratories | Fujii H.,NTT Media Intelligence Laboratories | Sato A.,NTT Media Intelligence Laboratories | Matsuda H.,NTT Media Intelligence Laboratories | And 2 more authors.
NTT Technical Review | Year: 2014

Efforts toward achieving 4K/8K broadcast services have been ramping up in countries throughout the world, and in Japan, broadcasters, telecommunications carriers, and equipment manufacturers have teamed up to establish the Next Generation Television & Broadcasting Promotion Forum (NexTV-F) to make 4K/8K communications and broadcasting a reality. As a founding company of NexTV-F, NTT is becoming a world pioneer in the development and promotion of 4K/8K services through a variety of research projects and technical developments. This article introduces these technical developments and the ultra-high-presence services targeted by NTT.


Shimauchi S.,NTT Media Intelligence Laboratories | Kobayashi K.,NTT Media Intelligence Laboratories | Fukui M.,NTT Advanced Technology Corporation | Kurihara S.,NTT Media Intelligence Laboratories | Ohmuro H.,NTT Media Intelligence Laboratories
NTT Technical Review | Year: 2014

Automatically calibrating echo canceller software has been developed for voice over Internet protocol (VoIP) applications on smartphones. Because the audio properties of smartphones typically depend on the model, the speech quality of a VoIP application may sometimes degrade, especially during hands-free conversations. We extended the calibration ability of our software in order to handle the variations in smartphone audio properties. As a result, our software exhibited better performance than most conventional software.


Imoto K.,NTT Media Intelligence Laboratories | Ohishi Y.,NTT Communication Science Laboratories | Uematsu H.,NTT Media Intelligence Laboratories | Ohmuro H.,NTT Media Intelligence Laboratories
IEEE International Workshop on Machine Learning for Signal Processing, MLSP | Year: 2013

We propose the latent acoustic topic and event allocation (LATEA) model for analyzing acoustic scenes from long-term (more than several seconds) acoustic signals. It is a probabilistic generative model of an acoustic feature sequence associated with acoustic scenes (e.g., 'cooking') and acoustic events (e.g., 'cutting with a knife,' 'heating a skillet,' or 'running water'). The proposed model allows the analysis of a wide variety of sounds and the capture of abstract acoustic scenes by representing acoustic events and scenes as latent variables, and can also describe the acoustic similarity and variance between acoustic events by representing acoustic features as a mixture of Gaussian components. Experiments with real-life sounds indicated that the proposed model exhibited lower perplexity than conventional models and improved the stability of acoustic scene estimation. The experimental results also suggested that the proposed model can better describe the acoustic similarity and variance between acoustic events than conventional models. © 2013 IEEE.
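To make the perplexity comparison concrete, here is a minimal sketch using the common definition, the exponential of the negative mean log-likelihood, evaluated under a one-dimensional Gaussian mixture. This is not the LATEA model itself: the scalar features, the mixture parameters, and the function names are assumptions, and the paper's exact perplexity definition may differ.

```python
import math


def gmm_density(x, components):
    """Density of a 1-D Gaussian mixture; components are (weight, mean, variance)."""
    return sum(
        w * math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
        for w, mu, var in components
    )


def perplexity(samples, components):
    """exp of the negative mean log-likelihood; lower means a better fit."""
    mean_log_lik = sum(math.log(gmm_density(x, components)) for x in samples) / len(samples)
    return math.exp(-mean_log_lik)


if __name__ == "__main__":
    # Toy features clustered around -2 and +2 (made-up values).
    features = [-2.1, -1.9, 2.0, 2.2]
    single = [(1.0, 0.0, 4.0)]                       # one broad Gaussian
    mixture = [(0.5, -2.0, 0.25), (0.5, 2.0, 0.25)]  # two narrow Gaussians
    print("single Gaussian:", perplexity(features, single))
    print("two-component mixture:", perplexity(features, mixture))
```

The mixture that matches the two clusters yields the lower perplexity, which is the sense in which a model with lower perplexity describes the data better.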
