You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hello there, I want to clarify whether the need for square matrices is strictly enforced. From the paper, I note that
"We turn to an expressive class of sub-quadratic structured matrices called Monarch matrices [12] (Figure 1 left) to propose Monarch Mixer (M2). Monarch matrices are a family of structured matrices that generalize the fast Fourier transform (FFT) and have been shown to capture a wide class of linear transforms including Hadamard transforms, Toeplitz matrices [30], AFDF matrices [55], and convolutions. They are parameterized as the products of block-diagonal matrices, called monarch factors, interleaved with permutation. Their compute scales sub-quadratically: setting the number of factors to p results in computational complexity of $O(pN^{(p+1)/p})$ in input length $N$ , allowing the complexity to interpolate between $O(N \log N )$ at $p = \log N$ and $O(N 3/2)$ at $p = 2$.", as well as:
"The convolution case with Monarch matrices fixed to DFT and inverse DFT matrices also admits implementations based on FFT algorithms [9]."
Furthermore, Proposition 3.2 in the original Monarch paper asserts that $\mathcal{MM}^*$ which can represent a convolution.
I thus want to find out whether the Monarch mixer operation enforces the requirements for having block-diagonal matrices, since (residual gated) convolution(s) intuitively does not usually output a block-diagonal matrix.
Thank You!
The text was updated successfully, but these errors were encountered:
Hello there, I want to clarify whether the need for square matrices is strictly enforced. From the paper, I note that
"We turn to an expressive class of sub-quadratic structured matrices called Monarch matrices [12] (Figure 1 left) to propose Monarch Mixer (M2). Monarch matrices are a family of structured matrices that generalize the fast Fourier transform (FFT) and have been shown to capture a wide class of linear transforms including Hadamard transforms, Toeplitz matrices [30], AFDF matrices [55], and convolutions. They are parameterized as the products of block-diagonal matrices, called monarch factors, interleaved with permutation. Their compute scales sub-quadratically: setting the number of factors to p results in computational complexity of$O(pN^{(p+1)/p})$ in input length $N$ , allowing the complexity to interpolate between $O(N \log N )$ at $p = \log N$ and $O(N 3/2)$ at $p = 2$ .", as well as:
"The convolution case with Monarch matrices fixed to DFT and inverse DFT matrices also admits implementations based on FFT algorithms [9]."
Furthermore, Proposition 3.2 in the original Monarch paper asserts that$\mathcal{MM}^*$ which can represent a convolution.
I thus want to find out whether the Monarch mixer operation enforces the requirements for having block-diagonal matrices, since (residual gated) convolution(s) intuitively does not usually output a block-diagonal matrix.
Thank You!
The text was updated successfully, but these errors were encountered: