A Random Matrix Approach to Neural Networks by Cosme Louart, Zhenyu Liao, Romain Couillet
This article studies the Gram random matrix model $G=\frac{1}{T}\Sigma^{\sf T}\Sigma$, $\Sigma=\sigma(WX)$, classically found in random neural networks, where $X=[x_1,\ldots,x_T]\in\mathbb{R}^{p\times T}$ is a (data) matrix of bounded norm, $W\in\mathbb{R}^{n\times p}$ is a matrix of independent zero-mean unit-variance entries, and $\sigma:\mathbb{R}\to\mathbb{R}$ is a Lipschitz continuous (activation) function, with $\sigma(WX)$ understood entry-wise. We prove that, as $n,p,T$ grow large at the same rate, the resolvent $Q=(G+\gamma I_T)^{-1}$, for $\gamma>0$, behaves similarly to the resolvents met in sample covariance matrix models, involving notably the moment $\Phi=\frac{T}{n}E[G]$, which provides in passing a deterministic equivalent for the empirical spectral measure of $G$. This result, established by means of concentration of measure arguments, enables the estimation of the asymptotic performance of single-layer random neural networks. This in turn provides practical insights into the underlying mechanisms at play in random neural networks, entailing several unexpected consequences, as well as a fast practical means to tune the network hyperparameters.
Reproducibility: the Python 3 code used to produce the results of Section 4 is available at https://github.com/Zhenyu-LIAO/RMT4ELM
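To make the model concrete, here is a minimal numerical sketch, independent of the repository above: it builds $\Sigma=\sigma(WX)$, the Gram matrix $G$, the resolvent $Q$, and a crude Monte-Carlo proxy for $\Phi=\frac{T}{n}E[G]$. The dimensions, the Gaussian draws for $X$ and $W$, and the ReLU choice for $\sigma$ are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

# Illustrative sketch of the model from the abstract (not the authors' code).
n, p, T = 512, 256, 1024       # network size, data dimension, sample count
gamma = 1.0                    # regularization parameter gamma > 0
rng = np.random.default_rng(0)

X = rng.standard_normal((p, T)) / np.sqrt(p)  # data matrix, kept of bounded norm
W = rng.standard_normal((n, p))               # i.i.d. zero-mean unit-variance entries
sigma = lambda t: np.maximum(t, 0.0)          # a Lipschitz activation (ReLU), entry-wise

Sigma = sigma(W @ X)                          # Sigma = sigma(WX), of size n x T
G = Sigma.T @ Sigma / T                       # Gram matrix G = (1/T) Sigma^T Sigma
Q = np.linalg.inv(G + gamma * np.eye(T))      # resolvent Q = (G + gamma I_T)^{-1}

# Crude Monte-Carlo estimate of Phi = (T/n) E[G] over fresh draws of W,
# the moment entering the deterministic equivalent for the spectrum of G.
Phi = np.zeros((T, T))
for _ in range(20):
    S = sigma(rng.standard_normal((n, p)) @ X)
    Phi += (T / n) * (S.T @ S / T) / 20

eigs = np.linalg.eigvalsh(G)   # empirical spectral measure of G (its eigenvalues)
print(eigs[:5], eigs[-5:])
```

With $n,p,T$ of the same order as here, the theory predicts that the histogram of the eigenvalues in `eigs` should already be well approximated by its deterministic equivalent built from $\Phi$.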