Talk:Modified discrete cosine transform

From Wikipedia, the free encyclopedia

External articles helpful for understanding MDCT

This blog post by Nick Appleton was very helpful in understanding the MDCT, especially when read alongside the article's TDAC section (as it stood in 2023).

I'm linking it on the talk page for future editors of the article to use, and for future readers.

https://www.appletonaudio.com/blog/2013/understanding-the-modified-discrete-cosine-transform-mdct/

Wayback Machine link: https://web.archive.org/web/20230227100616/https://www.appletonaudio.com/blog/2013/understanding-the-modified-discrete-cosine-transform-mdct/ — Preceding unsigned comment added by 119.161.98.68 (talk) 10:08, 27 February 2023 (UTC)


Helpfulness and comments on lapped transform uses and characteristics

This entry was very helpful to me and I really appreciate all of the people who put time into writing it.

I am not an expert on the math behind this, but I have just finished implementing it for a signal processing application here at work, so if anyone who is more math-savvy has anything to correct or add, you are not going to hurt my feelings. :-)

It seems like there are two sides to processing a continuous signal using a DCT. You have the choices that you make in how you decompose the signal into the frequency domain and you have the choices you make in how you reconstruct the signal into the time domain.

It seems like you could use any DCT to decompose the signal. The only real advantage that I can see to using the MDCT is that it allows you to convert a continuous signal of n samples into a series of overlapping DCTs with a total of n outputs. If you chose some DCT other than the MDCT, the main consequence would be that, with the same 50% overlap, you would end up with 2n outputs. This feature is probably very important to someone who is trying to compress a signal, but it wasn't at all important in the denoising application that I was working on.

The other set of choices relates to how you put the signal back together. It took me a little while to figure this one out, but the big issue for me was merging the overlapped sections back together in such a way that I didn't get major artifacts at the block edges. The window function is critical to putting the blocks back together. Once you understand what it is doing, you can see that the window used for reconstruction is really just a sort of weighted average that weights points in the middle of a given overlapping block more heavily than points at its edges. Basically, it looks like a triangle window would work just as well.

— Preceding unsigned comment added by Gingda (talk • contribs) 16:38, 23 December 2005 (UTC)
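
Note on the triangle-window remark: a minimal Python sketch, under stated assumptions, of the two overlap conditions that seem relevant here. If the overlapped blocks are simply cross-faded (a plain weighted average at 50% overlap), the two window halves only need to sum to 1, which a triangle does. If the same window is applied both before the forward MDCT and after the inverse, as in the article's TDAC setup, the squared halves need to sum to 1 (the Princen-Bradley condition), which the sine window satisfies and a plain triangle does not. The helper names and the half-sample window sampling below are illustrative choices, not taken from the article.

```python
import math

def triangle_window(N):
    """Length-2N triangle sampled at half-integer points (hypothetical helper)."""
    return [1 - abs(n + 0.5 - N) / N for n in range(2 * N)]

def sine_window(N):
    """Length-2N sine window, w[n] = sin(pi * (n + 1/2) / (2N))."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def overlap_sum(w, power):
    """Return w[n]**power + w[n + N]**power for n = 0 .. N-1."""
    N = len(w) // 2
    return [w[n] ** power + w[n + N] ** power for n in range(N)]

N = 8  # half the block length; an arbitrary choice for the demo
tri, sine = triangle_window(N), sine_window(N)

tri_plain = overlap_sum(tri, 1)  # plain cross-fade condition for the triangle
tri_sq = overlap_sum(tri, 2)     # Princen-Bradley condition for the triangle
sine_sq = overlap_sum(sine, 2)   # Princen-Bradley condition for the sine window

print("triangle, w + w:     ", [round(v, 3) for v in tri_plain])
print("triangle, w^2 + w^2: ", [round(v, 3) for v in tri_sq])
print("sine,     w^2 + w^2: ", [round(v, 3) for v in sine_sq])
```

So the triangle looks fine for a cross-fade style reconstruction like the denoising case described above, while the square root of the triangle (or the sine window) would be the analogous choice when the window is applied on both the analysis and the synthesis side.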

Re: 04:13, 13 March 2006 Stevenj (→Relationship to DCT-IV and Origin of TDAC - greatly simplified derivation of inverse property)

This "simplified" version is not particularly helpful to those of us trying to explain to ourselves that it does, in fact, work as prescribed. The reason I put up such a long and exacting derivation was to make it very clear to those who had to deal with the TDAC but had little expertise in general signal processing that it provably works.

I know at least two scientists who have benefited directly from the "unsimplified" derivation on this page, and I see no reason to cut it down, since doing so also reduces the usefulness of the page in general.

Gav. — Preceding unsigned comment added by Gav (talk • contribs) 10:46, 24 March 2006 (UTC)

Since the revised derivation contains at least as much rigor in a much shorter and more to-the-point presentation, what's not to like? If you can point out some specific step that was left out or is unclear, that would be more constructive. —Steven G. Johnson 22:14, 17 April 2006 (UTC)

Remark

Lapped orthogonal transforms (LOTs) whose basis elements are cosine-modulated versions of a prototype window are sometimes called Malvar-Wilson bases or Malvar-Wilson wavelets. Some generalisations of these LOTs can be found in the book by Stephane Jaffard, Robert D. Ryan and Yves Meyer, and in the references therein. (6 April 2006, 13.35) — Preceding unsigned comment added by 134.106.241.84 (talk) 11:38, 6 April 2006 (UTC)

Added 2007-07-27

In "Definition", 'k' is not defined (X_k = ...).

Whoever made the TeX rendering of the formula, please add: k = 0, ..., N - 1

— Preceding unsigned comment added by 85.226.104.55 (talk) 21:38, 27 July 2006 (UTC)

The output index range is already given on the previous line. —Steven G. Johnson 22:34, 27 July 2006 (UTC)

Anti-symmetric Property of MDCT spectra

Regarding the sentence:

"the MDCT is a bit unusual compared to other Fourier-related transforms in that it has half as many outputs as inputs (instead of the same number)."

There is nothing unusual about this, because the output having only half the length of the input is a convention related to its use in compression. The true output, if making parallels to the DFT (FFT), has the same length as the input when considering both positive and negative frequencies. Independently calculating the second half of the output is unnecessary since it is a reversed copy of the first half of the output, multiplied by -1. For example, if the output of the provided definition for the MDCT is (1, 2, 3, 4), then the actual full definition of the spectrum is (1, 2, 3, 4, -4, -3, -2, -1). —Preceding unsigned comment added by 171.67.229.73 (talk) 01:56, 3 March 2010 (UTC)
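
For anyone who wants to check the mirrored-and-negated claim numerically, here is a small Python sketch. It assumes the article's convention of 2N inputs and N outputs, and simply evaluates the same cosine sum for k = 0, ..., 2N - 1 instead of stopping at N - 1. The function name and the test vector are made up for illustration. Note that the sketch uses an even N; the factor of -1 comes from cos(π(N + 1) - φ) = -cos φ, which needs N to be even.

```python
import math

def mdct_extended(x, num_k):
    """Evaluate the MDCT cosine sum of 2N inputs for k = 0 .. num_k - 1."""
    N = len(x) // 2
    return [
        sum(x[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
            for n in range(2 * N))
        for k in range(num_k)
    ]

x = [0.3, -1.2, 2.0, 0.7, -0.5, 1.1, 0.4, -2.3]  # arbitrary input, 2N = 8
N = len(x) // 2

full = mdct_extended(x, 2 * N)       # "full" spectrum, k = 0 .. 2N-1
first, second = full[:N], full[N:]   # usual MDCT outputs, then the extension
mirrored = [-v for v in reversed(first)]

print("second half:        ", [round(v, 6) for v in second])
print("-(reversed first):  ", [round(v, 6) for v in mirrored])
```

The two printed rows should agree to rounding error, which matches the (1, 2, 3, 4, -4, -3, -2, -1) pattern described above and also shows that the extra N values carry no new information.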

Of course you can pad the output by the boundary conditions of the DCT-IV to a length 2N, in which case the second half is redundant. But the transform matrix in that case is singular. If you want to be technical about it, a DFT of N inputs has matrix rank N (regardless of how many periodic copies you pad the output with ... note that the N outputs of the DFT already contain both positive and negative frequencies, when you consider aliasing), whereas an MDCT of N inputs has rank N/2 (regardless of whether you pad the output with redundant copies via the boundary conditions). Because it is rank N/2, the most succinct way to represent the MDCT as a matrix is non-square.
As a result, the MDCT is not invertible except when applied in sequence to overlapping chunks as described in the article, whereas all other common Fourier-related transforms (DFT, DCT, DST, DHT, etc.) are invertible by themselves. — Steven G. Johnson (talk) 23:21, 28 March 2010 (UTC)
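
A short Python sketch of both halves of that statement, for readers who want to see it numerically. It uses the article's convention (2N inputs, N outputs, inverse scaled by 1/N) rather than the "N inputs, rank N/2" counting used in the reply, and it uses no window (equivalent to a rectangular window with this scaling). The function names and test data are illustrative assumptions. The first part feeds in a nonzero block whose MDCT is zero, so a single block cannot be inverted; the second part shows that adding the overlapping halves of two consecutive inverse transforms recovers the shared samples.

```python
import math

def mdct(x):
    """MDCT of 2N inputs to N outputs (article's definition, no window)."""
    N = len(x) // 2
    return [sum(x[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N)) for k in range(N)]

def imdct(X):
    """Inverse MDCT of N inputs to 2N outputs, with the 1/N scaling."""
    N = len(X)
    return [sum(X[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for k in range(N)) / N for n in range(2 * N)]

# 1) A nonzero block in the null space: first half mirror-symmetric,
#    second half mirror-antisymmetric. Its MDCT is zero up to rounding.
null_input = [1, 2, 2, 1, 3, 4, -4, -3]
null_output = mdct(null_input)

# 2) Time-domain aliasing cancellation on two blocks overlapping by N.
N = 4
signal = [0.3, -1.2, 2.0, 0.7, -0.5, 1.1, 0.4, -2.3, 1.6, 0.9, -0.8, 0.2]
y1 = imdct(mdct(signal[0:2 * N]))    # block covering samples 0 .. 2N-1
y2 = imdct(mdct(signal[N:3 * N]))    # block covering samples N .. 3N-1
middle = [y1[N + n] + y2[n] for n in range(N)]  # overlap-add of shared part

# A single block on its own comes back with aliasing, not the input.
single_block_error = max(abs(y1[n] - signal[n]) for n in range(2 * N))

print("MDCT of null input:", [round(v, 9) for v in null_output])
print("recovered middle:  ", [round(v, 6) for v in middle])
print("original middle:   ", signal[N:2 * N])
print("max single-block error:", round(single_block_error, 3))
```

Only the samples covered by both blocks are recovered; the first and last N samples of the signal would need further neighbouring blocks to have their aliasing cancelled.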