Chris just sent me the following:
In earlier communications you've been interested in the intersections of compressed sensing and neural computation. We've just released a preprint that adds one more to the list: low-rank factorization. The preprint below extends our earlier work on determining the memory capacity of recurrent neural networks (RNNs). RNNs are used as layers in deep networks when some memory is needed to handle time-varying inputs.
Using random matrix techniques, we establish memory capacity bounds for one type of RNN (linear echo state networks) with multidimensional inputs. Those inputs can be sparse in a basis, or have dependencies that result in a low-rank matrix (with no sparsity). In either case, we show that the network size need only scale linearly with the information rate in the data, resulting in networks that can be much smaller than the dimension of the input being remembered.
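To see why recovery of a long input history from a small network is even plausible, note that a linear ESN's state after T steps is a linear measurement of the whole input sequence, so remembering the past is a linear inverse problem. The sketch below (a toy illustration, not the paper's construction: the weight scaling, the orthogonal recurrent matrix, and the oracle-support recovery step are all my assumptions) simulates a linear ESN with N = 40 nodes driven by a 3-sparse scalar input history of length T = 200, verifies the measurement-matrix identity x(T) = A s, and recovers the history exactly from the final state alone:

```python
import numpy as np

rng = np.random.default_rng(0)

N, T, k = 40, 200, 3  # nodes, input history length, sparsity

# Random orthogonal recurrent weights, scaled below 1 for the echo state property
Q, _ = np.linalg.qr(rng.standard_normal((N, N)))
W = 0.95 * Q
z = rng.standard_normal(N)  # input feed-forward weights

# A k-sparse input history s(0), ..., s(T-1)
s = np.zeros(T)
support = rng.choice(T, size=k, replace=False)
s[support] = rng.standard_normal(k)

# Run the linear ESN: x(t+1) = W x(t) + z s(t)
x = np.zeros(N)
for t in range(T):
    x = W @ x + z * s[t]

# The final state is a linear measurement of the full history:
# x(T) = A s with columns A[:, t] = W^(T-1-t) z
A = np.column_stack(
    [np.linalg.matrix_power(W, T - 1 - t) @ z for t in range(T)]
)
assert np.allclose(x, A @ s)

# Oracle recovery (least squares on the true support): the 40-dimensional
# state pins down the 3-sparse, 200-sample history exactly. A sparse solver
# (e.g. OMP or l1 minimization) would find the support without the oracle.
s_hat = np.zeros(T)
s_hat[support] = np.linalg.lstsq(A[:, support], x, rcond=None)[0]
print(np.allclose(s_hat, s, atol=1e-6))  # recovered history matches
```

The point of the toy: N = 40 states suffice to remember a T = 200 sample history because the history is sparse, matching the paper's theme that network size tracks information rate rather than ambient input dimension.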
Thanks Chris! And all this with phase transition diagrams, woohoo!
Distributed Sequence Memory of Multidimensional Inputs in Recurrent Networks by Adam Charles, Dong Yin, Christopher Rozell
Recurrent neural networks (RNNs) have drawn interest from machine learning researchers because of their effectiveness at preserving past inputs for time-varying data processing tasks. To understand the success and limitations of RNNs, it is critical that we advance our analysis of their fundamental memory properties. We focus on echo state networks (ESNs), which are RNNs with simple memoryless nodes and random connectivity. In most existing analyses, the short-term memory (STM) capacity results conclude that the ESN network size must scale linearly with the input size for unstructured inputs. The main contribution of this paper is to provide general results characterizing the STM capacity for linear ESNs with multidimensional input streams when the inputs have common low-dimensional structure: sparsity in a basis or significant statistical dependence between inputs. In both cases, we show that the number of nodes in the network must scale linearly with the information rate and poly-logarithmically with the ambient input dimension. The analysis relies on advanced applications of random matrix theory and results in explicit non-asymptotic bounds on the recovery error. Taken together, this analysis provides a significant step forward in our understanding of the STM properties in RNNs.
Because somehow these phase diagrams are the great equalizers.