
Introduction to MPEG Surround

Introduction to MPEG Surround. 韓志岡 2/9/2005. Outline: Background, Motivation, Perception of sound in space, Principle of MPEG Surround, Downmixing to one channel, Estimation of spatial cues, Synthesis of spatial cues, Conclusions & Reference.



  1. Introduction to MPEG Surround 韓志岡 2/9/2005

  2. Outline • Background • Motivation • Perception of sound in space • Principle of MPEG Surround • Downmixing to one channel • Estimation of spatial cues • Synthesis of spatial cues • Conclusions & Reference

  3. Motivation • The vast majority of audio playback equipment uses traditional two-channel (stereo) presentation • Playback with more reproduction channels (“multi-channel audio” or “surround sound”) is now quite visible in the marketplace • A non-disruptive transition from stereo to multi-channel audio requires media formats that can serve both those using conventional stereo equipment and those using next-generation multi-channel equipment.

  4. Perception of sound in space • The HRTF (head-related transfer function) models the path of sound from a source to the left and right ear entrances.

  5. Perception of sound in space (cont.) • Three parameters (cues) describe how humans localize sound in the horizontal plane: • Interaural level difference (ILD) • Interaural time difference (ITD) • Interaural coherence (IC)

  6. ITD (Interaural time difference) & ILD (Interaural level difference)

  7. ITD (Interaural time difference) & ILD (Interaural level difference) (cont.) • The ITD and ILD between a pair of headphone signals determine the location of the auditory event, which appears in the frontal section of the upper head.

  8. IC (Interaural coherence) • The spatial impression of the auditory event is related to the IC.

  9. Two sound sources: Summing localization • Inter-channel time difference (ICTD) • Inter-channel level difference (ICLD) • Inter-channel coherence (ICC)

  10. Two sound sources: Summing localization (cont.)

  11. MPEG Surround • MPEG Surround exploits inter-channel differences in level, phase and coherence, equivalent to the ILD, ITD and IC cues, to capture the spatial image of a multi-channel audio signal • It downmixes the signal and encodes these cues in a very compact form, such that the cues and the transmitted downmix can be decoded to synthesize a high-quality multi-channel representation • It provides backward compatibility with stereo/mono audio systems.

  12. Coding Scheme

  13. Downmixing to one channel (1/2) • The sum signal is generated by adding the input channels in a subband domain • The sum is then multiplied by a factor chosen to preserve signal power

  14. Downmixing to one channel (2/2)
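
The downmix described on slides 13 and 14 can be sketched for a single subband as follows. This is a minimal illustration, not the encoder from the standard: the function name `downmix_subband`, the use of plain Python lists for subband samples, and the choice to equalize power over the whole frame (rather than with a short-time estimate) are all assumptions made for the example.

```python
import math

def downmix_subband(channels):
    """Sum C subband signals and rescale the sum to keep the total input power.

    channels: list of equal-length lists, one per input channel (hypothetical layout).
    """
    n = len(channels[0])
    # Plain sum of all input channels, sample by sample.
    total = [sum(ch[k] for ch in channels) for k in range(n)]
    # Power of all inputs together, and power of the raw sum.
    power_in = sum(x * x for ch in channels for x in ch)
    power_sum = sum(s * s for s in total)
    if power_sum == 0.0:
        return total  # silent or fully cancelling input: nothing to scale
    # Factor that makes the power of the sum equal the total input power.
    gain = math.sqrt(power_in / power_sum)
    return [gain * s for s in total]
```

Without the factor, two identical channels would add to a sum with twice the total input power, while two uncorrelated channels would not; the scaling removes that dependence on inter-channel correlation.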

  15. Estimation of spatial cues (1/4) • The spatial cues (ICTD, ICLD, and ICC) are estimated in a subband domain. The estimation is applied independently to each subband

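  16. Estimation of spatial cues (2/4) • ICTD (samples): $\tau_{12}(k) = \arg\max_d \{\Phi_{12}(d,k)\}$, with a short-time estimate of the normalized cross-correlation function $\Phi_{12}(d,k) = \frac{p_{\tilde{x}_1 \tilde{x}_2}(d,k)}{\sqrt{p_{\tilde{x}_1}(k-d_1)\, p_{\tilde{x}_2}(k-d_2)}}$, where $d_1 = \max\{-d, 0\}$, $d_2 = \max\{d, 0\}$, and $p_{\tilde{x}_1 \tilde{x}_2}(d,k)$ is a short-time estimate of the mean of $\tilde{x}_1(k-d_1)\, \tilde{x}_2(k-d_2)$

  17. Estimation of spatial cues (3/4) • ICLD (dB): $\Delta L_{12}(k) = 10 \log_{10}\left(\frac{p_{\tilde{x}_2}(k)}{p_{\tilde{x}_1}(k)}\right)$, where $p_{\tilde{x}_i}(k)$ is a short-time estimate of the power of $\tilde{x}_i(k)$ • ICC: $c_{12}(k) = \max_d |\Phi_{12}(d,k)|$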

  18. Estimation of spatial cues (4/4) • For multi-channel audio signals, ICTD and ICLD are defined between the reference channel and each of the other C-1 channels
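
A rough sketch of the cue estimation on slides 15 to 18, for one subband, might look like the code below. It is illustrative only. The names `estimate_cues` and `estimate_multichannel` are hypothetical, the "short-time estimates" are replaced by sums over one whole frame, and the normalization uses the full-frame powers instead of lag-shifted ones; these are simplifying assumptions, not the method of the standard.

```python
import math

def estimate_cues(x1, x2, max_lag):
    """Return (ictd_samples, icld_db, icc) for one subband of a channel pair.

    A lag d > 0 pairs x1[k] with x2[k - d], so a positive ICTD here means
    x1 is a delayed copy of x2 (sign convention assumed for this sketch).
    """
    n = len(x1)
    p1 = sum(v * v for v in x1)  # frame power of channel 1
    p2 = sum(v * v for v in x2)  # frame power of channel 2
    if p1 == 0.0 or p2 == 0.0:
        raise ValueError("cues are undefined for a silent channel")
    best_lag, best_phi, icc = 0, -math.inf, 0.0
    for d in range(-max_lag, max_lag + 1):
        d1, d2 = max(-d, 0), max(d, 0)
        # Cross term x1(k - d1) * x2(k - d2), summed over the valid samples.
        p12 = sum(x1[k - d1] * x2[k - d2] for k in range(max(d1, d2), n))
        phi = p12 / math.sqrt(p1 * p2)  # normalized cross-correlation
        if phi > best_phi:
            best_lag, best_phi = d, phi
        icc = max(icc, abs(phi))
    icld_db = 10.0 * math.log10(p2 / p1)
    return best_lag, icld_db, icc

def estimate_multichannel(channels, max_lag):
    """Cues between reference channel 0 and each of the other C-1 channels."""
    ref = channels[0]
    return [estimate_cues(ref, ch, max_lag) for ch in channels[1:]]
```

For example, calling `estimate_cues` on a signal and a copy of it delayed by two samples yields a lag of magnitude 2, an ICLD near 0 dB, and an ICC near 1.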

  19. Synthesis of spatial cues (1/3) • ICTDs are synthesized by imposing delays, ICLDs by scaling, and ICCs by applying de-correlation filters.

  20. Synthesis of spatial cues (2/3) • The delays are determined by the ICTDs

  21. Synthesis of spatial cues (3/3) • The scale factors $a_c$ are determined by the ICLDs, satisfying: $a_c / a_1 = 10^{\Delta L_{1c}(k)/20}$ • After delays and scaling, we need to reduce the correlation between the channels' subband signals. This is achieved by designing the filters $h_c$, which are controlled as a function of the ICC.
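
The delay and scaling steps of slides 19 to 21 can be sketched as below for one subband. This is a hedged illustration, not the decoder from the standard. The names `scale_factors` and `synthesize_subband` are hypothetical; the normalization that makes the squared gains sum to 1 (so the output channels together carry the power of the sum signal) is an assumption of this sketch; the delays are taken as ready-made non-negative integers already derived from the ICTDs; and the ICC de-correlation filters are left out entirely.

```python
import math

def scale_factors(icld_db):
    """Gains a_1..a_C from ICLDs (dB) of channels 2..C relative to channel 1.

    Enforces a_c / a_1 = 10 ** (icld / 20); the additional constraint
    sum(a_c ** 2) == 1 is an assumption made for this sketch.
    """
    ratios = [1.0] + [10.0 ** (dl / 20.0) for dl in icld_db]
    norm = math.sqrt(sum(r * r for r in ratios))
    return [r / norm for r in ratios]

def synthesize_subband(sum_signal, icld_db, delays):
    """Delay and scale the sum signal to produce C output subband signals.

    delays: one non-negative integer delay (samples) per output channel,
    assumed to have been derived from the ICTDs beforehand.
    """
    gains = scale_factors(icld_db)
    n = len(sum_signal)
    outputs = []
    for a, d in zip(gains, delays):
        # Shift right by d samples, padding with zeros and truncating to n.
        delayed = [0.0] * min(d, n) + list(sum_signal[: max(n - d, 0)])
        outputs.append([a * v for v in delayed])
    return outputs
```

With a single ICLD of 0 dB, both gains come out as 1/sqrt(2), so two output channels of equal level share the power of the sum signal.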

  22. Conclusions (1/2) • Well-known perceptual audio coders, such as MP3, primarily exploit a single channel’s ability to mask its own quantization noise. • In contrast, spatial perception is primarily attributed to three parameters: ILD, ITD, and IC.

  23. Conclusions (2/2) • MPEG Surround provides an extremely efficient method for coding multi-channel sound: it transmits a compressed stereo (or even mono) audio program plus a low-rate side-information channel. • MPEG Surround is the latest technology for bitrate-efficient and backward-compatible presentation of multi-channel audio.

  24. Reference • ISO/IEC JTC1/SC29/WG11 (MPEG), Document N7390, “Tutorial on MPEG Surround Audio Coding”, July 2005, Poznan, Poland • C. Faller, “Parametric coding of spatial audio,” in Proc. DAFx (Digital Audio Effects), October 2004.
