Sunday, February 08, 2015

Sunday Morning Insight: Of Fluid Autoencoders and Data Tsunamis

The aha moment probably occurred at the First French-German Mathematical Image Analysis Conference at IHP, as I was watching Rene Vidal perfectly explain his subspace clustering approach [6]. The idea is to describe the data by expressing each datapoint as a combination of the datapoints close to it, to the exclusion of itself. The decomposition goes as:

X = XZ

with the constraint diag(Z) = 0 in order to enforce that exclusion.
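As a minimal sketch of that self-expressive property, here is a toy construction (hypothetical data, not the actual solver of [6]) that builds a Z with X = XZ and diag(Z) = 0 by solving a least-squares problem per column, each time excluding the column itself:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical toy data: 20 points drawn from a 3-dimensional subspace of R^10,
# stored as the columns of X, so each column can be rebuilt from the others.
B = rng.standard_normal((10, 3))
X = B @ rng.standard_normal((3, 20))

n = X.shape[1]
Z = np.zeros((n, n))
for j in range(n):
    # Express column x_j using all the OTHER columns, which enforces Z[j, j] = 0.
    others = [k for k in range(n) if k != j]
    z, *_ = np.linalg.lstsq(X[:, others], X[:, j], rcond=None)
    Z[others, j] = z

# Self-expression X = XZ holds, with the diagonal exclusion in place.
assert np.allclose(X @ Z, X)
assert np.allclose(np.diag(Z), 0.0)
```

The actual subspace clustering formulation adds a sparsity or low-rank penalty on Z so that each point is expressed only by points from its own subspace; the sketch above only illustrates the constraint set.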

Figure from [6].
As explained two months ago in a paper by Julie Josse and Stefan Wager (Stable Autoencoding: A Flexible Framework for Regularized Low-Rank Matrix Estimation), one could also seek a low-rank Z instead of just imposing diag(Z) = 0.
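One simple way to picture "seeking a low-rank Z" is to replace a given Z by its best rank-r approximation via truncated SVD. This is only an illustrative sketch on a random matrix, not the regularized estimator of Josse and Wager:

```python
import numpy as np

rng = np.random.default_rng(1)
Z = rng.standard_normal((20, 20))  # stand-in for a self-expressive matrix

# Best rank-r approximation of Z in Frobenius norm (Eckart-Young) via SVD.
r = 3
U, s, Vt = np.linalg.svd(Z)
Z_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

assert np.linalg.matrix_rank(Z_r) == r
```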

A relaxation of the diag(Z) = 0 constraint is Tr(Z) = 0, which, as we all know from our Fluid Mechanics 101 courses, is a volume-preserving property (see the trace-free matrices of the Lie algebra sl(n)). To say it another way, one could probably decompose Z as Z1 + Z2 such that Tr(Z1) = 0, with Z2 carrying the trace of Z. There, component Z1 quantifies the deformation of the dataset while Z2 quantifies volume change (low volume change if Z2 is a low-rank matrix, as in the paper featured above). That last component could serve as a basis for autoencoders and might even shed additional light on how to devise nonlinear autoencoders [1,2,3,4,5]. It could probably serve as a proxy for exploration in the 'exploitation vs exploration' conundrum.
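A minimal sketch of one natural reading of that splitting (my hypothetical choice, mirroring the fluid-mechanics decomposition into a deviatoric part and an isotropic part): take Z2 = (Tr(Z)/n) I as the volume-change component and Z1 = Z - Z2 as the trace-free deformation component.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20
Z = rng.standard_normal((n, n))

# Isotropic part carries all of the trace ("volume change")...
Z2 = (np.trace(Z) / n) * np.eye(n)
# ...so the remainder is trace-free ("deformation").
Z1 = Z - Z2

assert np.isclose(np.trace(Z1), 0.0)
assert np.allclose(Z1 + Z2, Z)
```

With this choice Z2 is rank n but is fully described by a single scalar; other splittings that concentrate the trace in a genuinely low-rank Z2 are of course possible.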

At the very least, the decomposition ought to provide some insight into why data matrices are part of the Data Tsunami.
