Talk:Ambisonic decoding

From Wikipedia, the free encyclopedia

Draft for new article

Ok, since this is probably waaay over my head, I'm using this talk page to flesh out a new structure for this article, rather than zapping the old one outright. Comments welcome, Nettings (talk) 17:39, 2 January 2014 (UTC)

Ambisonic Decoding

Before Ambisonic surround sound material can be listened to, it must be decoded. Using knowledge about the number and location of the available speakers, optimized speaker feeds are generated from the B-format signal. This is a special feature of Ambisonics which helps to decouple the mixing intent (in terms of the desired directions of sounds) from the speaker system used for reproduction, and distinguishes it from other approaches such as 5.1 surround sound, which deliver speaker signals to the consumer that mandate a pre-defined speaker layout.

Ambisonic decoder design has undergone dramatic development since the method was originally developed in the 1970s, and the quality of the decoders available today varies accordingly. This page tries to summarize the fundamental prerequisites for correct Ambisonic decoding as understood today.

Fundamentals

Matrix inversion by Moore-Penrose pseudoinverse, very well described in BLaH3

Goals:

  • Uniform energy over all directions (no loudness jumps as a source is panned)
  • congruence of rV and rE
  • rV=1 below 700 Hz
  • max rE between 700 Hz and 4 kHz, as per the VIENNA paper
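
To make the pseudoinverse note and the goals above a bit more concrete, here is a minimal sketch (not taken from BLaH3 or any other cited paper). It assumes a first-order encoding of [W, X, Y, Z] = [1, x, y, z], i.e. without the 1/sqrt(2) scaling that traditional B-format applies to W, and a hypothetical cube layout. The names encode, pinv_decoder and rv_re are made up for illustration.

```python
import numpy as np

def encode(directions):
    """Encode direction vectors as first-order columns [W, X, Y, Z] = [1, x, y, z].

    Assumed convention: W has unit gain (no 1/sqrt(2) factor).
    """
    d = np.atleast_2d(np.asarray(directions, dtype=float))
    d = d / np.linalg.norm(d, axis=1, keepdims=True)    # normalise to unit vectors
    return np.vstack([np.ones(len(d)), d.T])            # shape (4, n)

def pinv_decoder(speaker_dirs):
    """Decoder matrix (speakers x 4): Moore-Penrose pseudoinverse of the re-encoding matrix."""
    return np.linalg.pinv(encode(speaker_dirs))

def rv_re(decoder, speaker_dirs, source_dir):
    """Return (rV, rE, total energy) for a single plane-wave source."""
    u = encode(speaker_dirs)[1:].T                      # unit speaker vectors, shape (n, 3)
    g = decoder @ encode(source_dir)[:, 0]              # one gain per speaker
    r_v = (g @ u) / g.sum()                             # velocity vector (amplitude-weighted)
    energy = np.sum(g ** 2)
    r_e = ((g ** 2) @ u) / energy                       # energy vector (power-weighted)
    return r_v, r_e, energy

# Hypothetical layout: eight speakers at the corners of a cube.
cube = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)])
D = pinv_decoder(cube)
r_v, r_e, energy = rv_re(D, cube, [1.0, 0.3, 0.2])
print(np.linalg.norm(r_v), np.linalg.norm(r_e), energy)
```

On this regular layout the plain pseudoinverse decode should give |rV| = 1, |rE| = 0.5 and the same total energy for every source direction, so it meets the low-frequency goal and the uniform-energy goal but not the max rE goal; irregular layouts generally will not behave this nicely.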

Decoding approaches

  • in-phase decoder for large auditoria (no out-of-phase content from the opposite, aka "overlap", cf. Gerzon Tetrahedral experiment)
  • max rV decoder to satisfy LF ITD localisation (Makita, Gerzon Gen. Meta)
  • max rE decoder to satisfy HF ILD localisation (Gerzon Gen. Meta)
  • explain "mode matching" decoder (Zotter et al, Technicolor, Hannemann) (good explanation in Zotter and Frank, "All-Round Ambisonic Panning and Decoding" JAES Vol. 60, No. 10, 2012 October)
  • "energy-preserving decode", Zotter/Frank
  • browse literature for other jargon terms and explain.
  • explain shelf filters as per Lee and equivalence to dual-band decoders
  • explain phase matching between LF and HF (again, BLaH3)
  • near-field compensation and distance coding (Daniel 2009)
  • hemispherical decodes, avoidance of "pull-up"
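
For a regular layout at first order, the in-phase, max rV and max rE entries above differ only in the weight applied to the first-order components relative to W. The following sketch assumes the commonly quoted first-order 3D weights (1 for max rV, 1/sqrt(3) for max rE, 1/3 for in-phase), the same [1, x, y, z] encoding as a working assumption, and a hypothetical cube layout. The function names are invented for illustration.

```python
import numpy as np

def unit(v):
    """Normalise the rows of v (or a single vector) to unit length."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def first_order_gains(speaker_dirs, source_dir, a):
    """Speaker gains on a regular 3D layout; `a` scales the first-order components.

    Closed form assumed here: g_i = (1 + 3 * a * u_i . s) / L,
    valid only when the speaker vectors sum to zero and are evenly spread.
    """
    u, s = unit(speaker_dirs), unit(source_dir)
    return (1.0 + 3.0 * a * (u @ s)) / len(u)

def vector_lengths(speaker_dirs, gains):
    """Return (|rV|, |rE|) for a set of speaker gains."""
    u = unit(speaker_dirs)
    r_v = np.linalg.norm((gains @ u) / gains.sum())
    r_e = np.linalg.norm(((gains ** 2) @ u) / np.sum(gains ** 2))
    return r_v, r_e

# Assumed first-order 3D weights for the three decoder flavours.
WEIGHTS = {"max rV": 1.0, "max rE": 1.0 / np.sqrt(3.0), "in-phase": 1.0 / 3.0}

cube = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)])
source = [1.0, 1.0, 1.0]                      # aimed straight at one speaker
for name, a in WEIGHTS.items():
    g = first_order_gains(cube, source, a)
    r_v, r_e = vector_lengths(cube, g)
    print(f"{name:9s} |rV|={r_v:.3f} |rE|={r_e:.3f} min gain={g.min():+.3f}")
```

With the source aimed at one speaker, the max rV weighting drives the opposite speaker in anti-phase, whereas the in-phase weighting keeps every gain non-negative at the cost of a much shorter rV. A dual-band decoder (or the equivalent shelf filter) would switch between the first two weightings around the crossover frequency; that part is not sketched here.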

Parametric decoders

The idea behind parametric decoding is to treat the sound's direction of incidence as a parameter that can be estimated through time–frequency analysis. A large body of research into human spatial hearing[1][2] suggests that our auditory cortex applies similar techniques in its auditory scene analysis, which may help explain why these methods work.

The major benefits of parametric decoding are a greatly increased angular resolution and the separation of analysis and synthesis into distinct processing steps. This separation allows B-format recordings to be rendered using any panning technique, including delay panning, VBAP[3] and HRTF-based synthesis.

Parametric decoding was pioneered by Lake DSP[4] in the late 1990s and independently suggested by Farina and Ugolotti in 1999.[5] Later work in this domain includes the DirAC method[6] and the Harpex method.[7] FIXME: cite harpex paper rather than website
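
As a rough illustration of the analysis step (estimating a direction of incidence per frequency bin), here is a minimal single-frame sketch based on an intensity-like vector. It is loosely in the spirit of such analysis stages and is not an implementation of DirAC, Harpex or the Lake DSP method. It assumes first-order signals encoded as [W, X, Y, Z] = s * [1, x, y, z]; the function name estimate_directions is hypothetical.

```python
import numpy as np

def estimate_directions(w, x, y, z, frame=1024):
    """Estimate one direction of incidence per frequency bin for a single frame.

    Assumes each input holds at least `frame` samples and that a plane wave
    from unit direction (dx, dy, dz) is encoded as W = s, X = dx*s, Y = dy*s, Z = dz*s.
    Returns (directions, power): unit vectors of shape (bins, 3) and |W|^2 per bin.
    """
    win = np.hanning(frame)
    W, X, Y, Z = (np.fft.rfft(win * np.asarray(c, dtype=float)[:frame])
                  for c in (w, x, y, z))
    # Real part of conj(W) times the directional channels; with the encoding
    # assumed above this vector points towards the source.
    intensity = np.real(np.conj(W) * np.stack([X, Y, Z]))      # shape (3, bins)
    power = np.abs(W) ** 2
    length = np.linalg.norm(intensity, axis=0)
    directions = intensity / np.where(length > 0, length, 1.0)  # avoid division by zero
    return directions.T, power

# Synthetic test signal: a 1 kHz tone arriving from a known direction.
fs = 48000
t = np.arange(1024) / fs
s = np.sin(2 * np.pi * 1000.0 * t)
true_dir = np.array([0.6, 0.0, 0.8])
dirs, power = estimate_directions(s, true_dir[0] * s, true_dir[1] * s, true_dir[2] * s)
peak = int(np.argmax(power))
print(peak, dirs[peak])
```

The estimated directions (and, in a fuller system, a diffuseness estimate) could then be handed to any panner for the synthesis step, which is the separation of analysis and synthesis described above.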

Mathematical challenges

  • closed-form solutions for regular polyhedra
  • numerical solutions for irregular layouts, BLaH4+6
  • hemispherical decoding, Musil (assume non-existent loudspeakers, distribute their energy over existing ones), Zotter et al./Keiler & Batke: T-Designs with "virtual" VBAP
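
The first two points can be illustrated at first order: on a regular polyhedron the pseudoinverse collapses to a simple closed form, while on an irregular or hemispherical layout it has to be computed numerically. The sketch below again assumes the [1, x, y, z] encoding and uses a hypothetical octahedron, with the bottom speaker removed to stand in for a dome. It is not the method of any of the cited papers, and it does not address the energy problems that motivate the virtual-loudspeaker approaches.

```python
import numpy as np

def reencoding_matrix(speaker_dirs):
    """First-order re-encoding matrix, one column [1, x, y, z] per speaker."""
    u = np.asarray(speaker_dirs, dtype=float)
    u = u / np.linalg.norm(u, axis=1, keepdims=True)
    return np.vstack([np.ones(len(u)), u.T])             # shape (4, L)

def closed_form_decoder(speaker_dirs):
    """Closed form for regular 3D layouts only: rows (1/L) * [1, 3x, 3y, 3z]."""
    C = reencoding_matrix(speaker_dirs)
    scale = np.array([[1.0], [3.0], [3.0], [3.0]])
    return (C * scale).T / C.shape[1]                     # shape (L, 4)

octahedron = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                       [0, -1, 0], [0, 0, 1], [0, 0, -1]])
dome = octahedron[:5]                                     # bottom speaker removed

results = {}
for name, layout in (("octahedron", octahedron), ("dome", dome)):
    C = reencoding_matrix(layout)
    D = np.linalg.pinv(C)                                 # numerical solution
    matches_closed_form = bool(np.allclose(D, closed_form_decoder(layout)))
    reencodes_exactly = bool(np.allclose(C @ D, np.eye(4)))
    results[name] = (matches_closed_form, reencodes_exactly)
    print(name, matches_closed_form, reencodes_exactly)
```

Both layouts re-encode exactly, but only the regular one agrees with the closed form; exact re-encoding on the dome says nothing about how evenly energy is distributed over directions, which is where the approaches listed above come in.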

References

  1. ^ Blauert, Jens (1997). Spatial Hearing: The Psychophysics of Human Sound Localization (Revised ed.). Cambridge, MA: MIT Press. ISBN 978-0-262-02413-6. Retrieved 6 January 2011.
  2. ^ Bregman, Albert S. (29 September 1994). Auditory Scene Analysis: The Perceptual Organization of Sound. Bradford Books. Cambridge, MA: MIT Press. ISBN 978-0-262-52195-6. Retrieved 12 May 2012.
  3. ^ "Vector base amplitude panning". Research / Spatial sound. Otakaari, Finland: TKK Acoustics. 18 January 2006. Retrieved 12 May 2012.
  4. ^ US patent 6628787, McGrath, David Stanley & McKeag, Adam Richard, "Wavelet conversion of 3-D audio signals", issued 2003-09-30 
  5. ^ Farina, Angelo; Ugolotti, Emanuele (1999). "Subjective Comparison Between Stereo Dipole and 3D Ambisonic Surround Systems for Automotive Applications" (PDF). Proceedings of the AES 16th International Conference. AES 16th International conference on Spatial Sound Reproduction. Rovaniemi, Finland: AES. s78357. Retrieved 12 May 2012.
  6. ^ "Directional Audio Coding". Research / Spatial sound. Otakaari, Finland: TKK Acoustics. 23 May 2011. Retrieved 12 May 2012.
  7. ^ "Harpex". Oslo, Norway: Harpex Limited. 2011. Retrieved 12 May 2012.