Vetro A.,Mitsubishi Electric |
Tourapis A.M.,Dolby Laboratories Inc. |
Muller K.,Fraunhofer Institute for Telecommunications, Heinrich-Hertz-Institut
IEEE Transactions on Broadcasting | Year: 2011
There exist a variety of ways to represent 3D content, including stereo and multiview video, as well as frame-compatible and depth-based video formats. There are also a number of compression architectures and techniques that have been introduced in recent years. This paper provides an overview of relevant 3D representation and compression formats. It also analyzes some of the merits and drawbacks of these formats considering the application requirements and constraints imposed by different storage and transmission systems. © 2011 IEEE.
Breebaart J.,Dolby Laboratories Inc.
AES: Journal of the Audio Engineering Society | Year: 2013
This study compares interaural intensity differences (IIDs) of a real source and those resulting from a phantom source created by pair-wise amplitude panning in an anechoic environment with a listener situated in the sweet spot. Factors under investigation are (1) the source frequency, (2) the source direction angle, (3) the loudspeaker angular aperture, (4) the influence of head-related transfer functions (HRTFs) across subjects, and (5) differences between panning laws. The between-subject differences in HRTFs occurred mainly above 1 kHz and were found to be a highly significant factor. For the commonly used loudspeaker angular aperture of 60°, this source of error accounted for 79% of the overall variance. The results also indicated that the most critical direction angle for the evaluation of panning laws equals approximately half that of the loudspeaker angle. Phantom sources tend to have larger IIDs than real sources for the commonly used loudspeaker angular aperture of 60°, and the magnitude of this offset was found to be frequency dependent. For wider apertures (110°), both larger and smaller phantom-source IIDs were observed, depending on the employed panning law and the frequency of the source signal. Furthermore, substantial errors in IIDs were observed for frequencies at which phase cancellation occurs due to the contribution of two loudspeakers at each ear with a relative time delay. These findings, in relation to the observed between-subject variance, suggest that the accuracy of panning laws can mainly be improved by incorporating frequency- and loudspeaker-angle-dependent panning functions.
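As a rough illustration of what "differences between panning laws" means in practice, the sketch below computes pair-wise amplitude-panning gains under the two textbook laws, the tangent law and the sine law. The abstract does not say which laws were evaluated, so the choice of these two, the function name `pan_gains`, the sign convention (positive angles toward the left loudspeaker), and the constant-power normalization are all assumptions made for this example. The output is an inter-channel level difference at the loudspeakers, not the IID at the listener's ears, which additionally depends on the HRTFs.

```python
import math

def pan_gains(source_deg, aperture_deg=60.0, law="tangent"):
    """Return (g_left, g_right) for a phantom source at source_deg.

    Assumed conventions (not from the paper): 0 degrees is straight ahead,
    positive angles point toward the left loudspeaker, and the gains are
    normalized so that g_left**2 + g_right**2 == 1.
    """
    if law not in ("tangent", "sine"):
        raise ValueError("law must be 'tangent' or 'sine'")
    half = math.radians(aperture_deg / 2.0)
    phi = math.radians(source_deg)
    if abs(phi) > half:
        raise ValueError("source must lie between the two loudspeakers")
    f = math.tan if law == "tangent" else math.sin
    # Both laws fix the ratio (g_left - g_right) / (g_left + g_right).
    ratio = f(phi) / f(half)
    g_left, g_right = 1.0 + ratio, 1.0 - ratio
    norm = math.hypot(g_left, g_right)
    return g_left / norm, g_right / norm

def level_difference_db(gains):
    """Inter-channel level difference in dB (left relative to right)."""
    g_left, g_right = gains
    return 20.0 * math.log10(g_left / g_right)

if __name__ == "__main__":
    # Half the loudspeaker angle for a 60-degree aperture is 15 degrees,
    # the direction the abstract singles out as the most critical.
    for law in ("tangent", "sine"):
        gains = pan_gains(15.0, 60.0, law)
        print(law, round(level_difference_db(gains), 2), "dB")
```

Both laws agree at the center and at the loudspeaker positions, so any difference between them shows up only at intermediate angles such as the one used in the example.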
Han J.,University of California at Santa Barbara |
Saxena A.,Samsung |
Melkote V.,Dolby Laboratories Inc. |
Rose K.,University of California at Santa Barbara
IEEE Transactions on Image Processing | Year: 2012
This paper proposes a novel approach to jointly optimize spatial prediction and the choice of the subsequent transform in video and image compression. Under the assumption of a separable first-order Gauss-Markov model for the image signal, it is shown that the optimal Karhunen-Loeve Transform, given available partial boundary information, is well approximated by a close relative of the discrete sine transform (DST), with basis vectors that tend to vanish at the known boundary and maximize energy at the unknown boundary. The overall intraframe coding scheme thus switches between this variant of the DST, named asymmetric DST (ADST), and the traditional discrete cosine transform (DCT), depending on prediction direction and boundary information. The ADST is first compared with the DCT in terms of coding gain under ideal model conditions and is demonstrated to provide significantly improved compression efficiency. The proposed adaptive prediction and transform scheme is then implemented within the H.264/AVC intra-mode framework and is experimentally shown to significantly outperform the standard intra coding mode. As an added benefit, it achieves a substantial reduction in blocking artifacts because the transform now adapts to the statistics of block edges. An integer version of this ADST is also proposed. © 2011 IEEE.
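The following sketch makes the "vanish at the known boundary, maximize energy at the unknown boundary" property concrete. The abstract gives no formula, so the basis used here is an assumption: it is the form usually written for the DST-VII, sin((2k+1)(i+1)π/(2N+1)) scaled by 2/√(2N+1), which is taken to correspond to the ADST. The helper names and the ramp-shaped residual are hypothetical and chosen only for illustration.

```python
import math

def adst_matrix(n):
    """Rows are basis vectors of the assumed ADST (DST-VII form).

    Sample index i = 0 sits next to the known (predicted-from) boundary;
    every basis vector would be exactly zero at the virtual index i = -1.
    """
    scale = 2.0 / math.sqrt(2 * n + 1)
    return [[scale * math.sin((2 * k + 1) * (i + 1) * math.pi / (2 * n + 1))
             for i in range(n)] for k in range(n)]

def dct_matrix(n):
    """Rows are basis vectors of the orthonormal DCT-II, for comparison."""
    rows = []
    for k in range(n):
        c = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([c * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows

def transform(matrix, x):
    """Forward transform: one coefficient per basis vector."""
    return [sum(b * v for b, v in zip(row, x)) for row in matrix]

def is_orthonormal(matrix, tol=1e-9):
    """Check that the rows form an orthonormal set."""
    n = len(matrix)
    for a in range(n):
        for b in range(n):
            dot = sum(p * q for p, q in zip(matrix[a], matrix[b]))
            if abs(dot - (1.0 if a == b else 0.0)) > tol:
                return False
    return True

if __name__ == "__main__":
    n = 4
    # Hypothetical prediction residual that grows away from the boundary.
    residual = [1.0, 2.0, 3.0, 4.0]
    for name, m in (("ADST", adst_matrix(n)), ("DCT", dct_matrix(n))):
        coeffs = transform(m, residual)
        share = coeffs[0] ** 2 / sum(c * c for c in coeffs)
        print(name, "share of energy in first coefficient:", round(share, 3))
```

For a residual that is small next to the predicted boundary and larger far from it, the lowest-frequency vector of the assumed ADST matches the signal shape better than the flat first DCT vector, which is the intuition behind switching transforms by prediction direction.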
Daly S.J.,Dolby Laboratories Inc.
IEEE Transactions on Broadcasting | Year: 2011
Perceiving three-dimensional video imagery appropriately in a display requires matching parameters throughout the imaging pathway, such as inter-aperture distance at the stereoscopic camera side with parallax shifting at the display side. In addition, many tradeoffs and compromises are often made at different points in the imaging pathway, leading to common perceptual distortions. Some of these may be simple two-dimensional image distortions such as display surface noise, while others are three-dimensional distortions, such as global geometric scene distortions and localized depth errors around edges. There is an increasing use of various forms of signal processing to modify the images, either for compensation of distortions due to system limitations, display constraints, formatting and compression for efficient transmission, or making depth range adjustments dependent on the display viewing conditions. Perceptual issues are critical to the design of the entire imaging pathway and this paper will highlight some of those due to stereoscopic signal processing. © 2011 IEEE.
Dolby Laboratories Inc. | Date: 2014-04-14
Disclosed are examples of systems, apparatus, methods and computer-readable storage media for dynamically adjusting thresholds of a compressor. An input audio signal having a number of frequency band components is processed. Time-varying thresholds can be determined. A compressor performs, on each frequency band component, a compression operation having a corresponding time-varying threshold to produce gains. Each gain is applied to a delayed corresponding frequency band component to produce processed band components, which are summed to produce an output signal. In some implementations, a time-varying estimate of a perceived spectrum of the output signal and a time-varying estimate of a distortion spectrum induced by the perceived spectrum estimate are determined, for example, using a distortion audibility model. An audibility measure of the distortion spectrum estimate in the presence of the perceived spectrum estimate can be predicted and used to adjust the time-varying thresholds.
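The signal flow described above (per-band compression against a time-varying threshold, gains applied to delayed band components, then summation) can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the static gain curve with a fixed ratio, the per-sample level estimate, the one-sample default delay, and all names are assumptions. The thresholds are simply passed in, since the distortion audibility model that would adjust them is not specified in the abstract.

```python
import math

def gain_db(level_db, threshold_db, ratio=4.0):
    """Assumed hard-knee compressor curve: attenuate only above threshold."""
    over = level_db - threshold_db
    return 0.0 if over <= 0.0 else -over * (1.0 - 1.0 / ratio)

def process(bands, thresholds_db, delay=1, ratio=4.0):
    """Compress each band against its own time-varying threshold and sum.

    bands[b][n]         sample n of frequency band b
    thresholds_db[b][n] threshold for band b at sample n (supplied by caller)
    The gain computed from sample n is applied to the band signal delayed
    by `delay` samples, i.e. to bands[b][n - delay].
    """
    n_samples = len(bands[0])
    out = [0.0] * n_samples
    for band, thr in zip(bands, thresholds_db):
        for n in range(n_samples):
            level = 20.0 * math.log10(max(abs(band[n]), 1e-12))
            g = 10.0 ** (gain_db(level, thr[n], ratio) / 20.0)
            delayed = band[n - delay] if n >= delay else 0.0
            out[n] += g * delayed
    return out

if __name__ == "__main__":
    bands = [[0.5, 0.5, 0.5], [0.25, 0.25, 0.25]]
    # Lowering the first band's threshold over time increases its attenuation.
    thresholds = [[0.0, -12.0, -24.0], [0.0, 0.0, 0.0]]
    print(process(bands, thresholds))
```

In a fuller system the `thresholds_db` values would be updated from the audibility measure mentioned in the abstract; here they are fixed inputs so that the gain and summation path can be followed in isolation.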