The aha moment probably occurred at the First French-German Mathematical Image Analysis Conference at IHP, as I was watching Rene Vidal explain his subspace clustering approach [6]. The idea is to describe the data by connecting each data point to nearby points while excluding the point itself. The decomposition goes as:
X = XZ
with the constraint diag(Z) = 0 in order to enforce the latter exclusion.
(Figure from [6])
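For the curious, here is a minimal sketch of what this self-expressive decomposition can look like in practice, in the spirit of the least-squares (LSR-type) formulations listed in [6]; the toy data matrix X, the regularization weight lam, and the column-by-column ridge solve are illustrative assumptions on my part, not code from the authors:

import numpy as np

def self_expressive_coeffs(X, lam=1e-2):
    # X: (d, n) data matrix, columns are data points.
    # For each point i, solve min_z ||x_i - X_{-i} z||^2 + lam ||z||^2
    # and write the coefficients into column i of Z, so diag(Z) = 0 holds exactly.
    d, n = X.shape
    Z = np.zeros((n, n))
    for i in range(n):
        idx = [j for j in range(n) if j != i]
        Xi = X[:, idx]                        # all points except x_i
        A = Xi.T @ Xi + lam * np.eye(n - 1)   # ridge-regularized Gram matrix
        b = Xi.T @ X[:, i]
        Z[idx, i] = np.linalg.solve(A, b)
    return Z

# toy usage: points drawn from two 1-D subspaces in R^3
rng = np.random.default_rng(0)
U1, U2 = rng.normal(size=(3, 1)), rng.normal(size=(3, 1))
X = np.hstack([U1 @ rng.normal(size=(1, 5)), U2 @ rng.normal(size=(1, 5))])
Z = self_expressive_coeffs(X)
print(np.round(np.abs(Z), 2))  # large entries mostly connect points from the same subspace

In the actual algorithms of [6], the matrix |Z| + |Z|^T is then used as an affinity matrix for spectral clustering.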
As explained two months ago in a paper by Julie Josse and Stefan Wager (Stable Autoencoding: A Flexible Framework for Regularized Low-Rank Matrix Estimation), one could also seek a low-rank Z instead of just diag(Z) = 0.
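A crude way to illustrate that low-rank idea is to project whatever self-expression matrix Z one has onto the set of rank-r matrices via a truncated SVD; this is only a sketch of the intuition, not the stable autoencoding estimator of Josse and Wager:

import numpy as np

def low_rank_projection(Z, r):
    # Best rank-r approximation of Z (in Frobenius norm) via truncated SVD.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]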
A relaxation of the diag(Z) = 0 constraint is Tr(Z) = 0, which, as we all recall from our Fluid Mechanics 101 courses, is a volume-preserving property (traceless matrices form the Lie algebra of the volume-preserving linear group). To say it another way, one could probably decompose Z as Z1 + Z2 such that Tr(Z1) = 0, with Z2 carrying the trace. There, component Z1 quantifies the deformation of the dataset while Z2 quantifies the volume change (a small volume change if Z2 is a low-rank matrix, as in the paper featured above). That last component could serve as a basis for autoencoders and might even shed additional light on how to devise nonlinear autoencoders [1,2,3,4,5]. It could probably serve as a proxy for exploration in the 'exploitation vs. exploration' conundrum.
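To make the deformation/volume analogy concrete, here is one possible reading of that split, with Z2 taken as the isotropic part that carries all of the trace; this particular choice is my assumption, not something pinned down above:

import numpy as np

def trace_split(Z):
    # One possible reading of the Z = Z1 + Z2 split discussed above:
    # Z2 is the isotropic (volume-change) part, Z1 the traceless (deformation) part.
    n = Z.shape[0]
    Z2 = (np.trace(Z) / n) * np.eye(n)   # carries all of the trace
    Z1 = Z - Z2                          # Tr(Z1) = 0 by construction
    return Z1, Z2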
At the very least, the decomposition ought to provide some insight into why data matrices are part of the Data Tsunami.
[1] Provable Bounds for Learning Some Deep Representations
[2] Autoencoders, Unsupervised Learning, and Deep Architectures by Pierre Baldi
[3] Sunday Morning Insight: The Great Convergence?
[4] Sunday Morning Insight: The Regularization Architecture
[5] In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning
[6] Recent subspace clustering algorithm implementations:
- LSR : Robust and Efficient Subspace Segmentation via Least Squares Regression by Canyi Lu, Hai Min, Zhong-Qiu Zhao, Lin Zhu, De-Shuang Huang, and Shuicheng Yan
- LRRSC : Subspace Clustering by Exploiting a Low-Rank Representation with a Symmetric Constraint by Jie Chen, Zhang Yi
- SSC : Sparse Subspace Clustering: Algorithm, Theory, and Applications by Ehsan Elhamifar, Rene Vidal.
- SMCE : Sparse Manifold Clustering and Embedding by Ehsan Elhamifar, Rene Vidal.
- Local Linear Embedding (LLE)