A Random Matrix Approach to Neural Networks by Cosme Louart, Zhenyu Liao, Romain Couillet
This article studies the Gram random matrix model $G = \frac{1}{T}\Sigma^{\mathsf{T}}\Sigma$, $\Sigma = \sigma(WX)$, classically found in random neural networks, where $X = [x_1, \ldots, x_T] \in \mathbb{R}^{p \times T}$ is a (data) matrix of bounded norm, $W \in \mathbb{R}^{n \times p}$ is a matrix of independent zero-mean unit-variance entries, and $\sigma : \mathbb{R} \to \mathbb{R}$ is a Lipschitz continuous (activation) function, $\sigma(WX)$ being understood entry-wise. We prove that, as $n, p, T$ grow large at the same rate, the resolvent $Q = (G + \gamma I_T)^{-1}$, for $\gamma > 0$, behaves similarly to the resolvents met in sample covariance matrix models, involving notably the moment $\Phi = \frac{T}{n}E[G]$, which in passing provides a deterministic equivalent for the empirical spectral measure of $G$. This result, established by means of concentration-of-measure arguments, enables the estimation of the asymptotic performance of single-layer random neural networks. This in turn provides practical insights into the underlying mechanisms at play in random neural networks, entailing several unexpected consequences, as well as a fast practical means to tune the network hyperparameters.
Reproducibility: the Python 3 code used to produce the results of Section 4 is available at https://github.com/Zhenyu-LIAO/RMT4ELM
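For readers who want to experiment before digging into the repository, the following is a minimal NumPy sketch of the model defined in the abstract; the dimensions, the Gaussian data matrix, the ReLU activation, and the value of $\gamma$ are illustrative choices only, not taken from the authors' code.

```python
import numpy as np

# Illustrative dimensions and regularization (placeholder values).
n, p, T, gamma = 512, 256, 1024, 1.0

X = np.random.randn(p, T) / np.sqrt(p)    # data matrix with bounded operator norm
W = np.random.randn(n, p)                 # i.i.d. zero-mean, unit-variance entries
Sigma = np.maximum(W @ X, 0)              # entry-wise Lipschitz activation (here ReLU)

G = (Sigma.T @ Sigma) / T                 # Gram matrix G = (1/T) Sigma^T Sigma
Q = np.linalg.inv(G + gamma * np.eye(T))  # resolvent Q = (G + gamma I_T)^{-1}

# Empirical spectral measure of G, whose deterministic equivalent the paper
# characterizes via the moment Phi = (T/n) E[G].
eigvals = np.linalg.eigvalsh(G)
```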