Nuit Blanche: Analysis Dictionary Learning: A New Matrix Factorization ?

Tuesday, January 17, 2012

Analysis Dictionary Learning: A New Matrix Factorization ?

I eventually was able to watch Miki Elad's presentation at MIA2012. While that presentation is not on his website yet, the closest I could find is: K-SVD Dictionary-Learning for Analysis Sparse Models. I had read these papers before on analysis versus synthesis and while I am still not quite clear on what is being really said (it needs to sink in first), there is I think a very interesting matrix decomposition I had not seen before:

Namely, \Omega and A are unknowns, X is known and A is made of sparse vector columns. While talking to Miki, we were wondering if this type of matrix decomposition existed or was needed in other fields of investigation (besides analysis dictionary learning for sparse models) ? Recall that the current and now mainstream dictionary learning solvers (featured in the Matrix Factorization Jungle Page) solve the following problem:

(Synthesis) Dictionary Learning: A = DX with unknown D and X, solve for sparse X

which is to be contrasted with the new decomposition:

(Analysis) Dictionary Learning A = \Omega X with unknown \Omega and A, solve for sparse A

Of the results that surprised me, the first one was pretty telling:

In short, the operator \Omega found by this dictionary learning decomposition seems to find back a TV like operator! (The Xs were made of patches).

Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

6 comments:

Matthieu Puigt said...: Igor,

I wonder how it could be related to Sparse Component Analysis (SCA): in SCA, we have an observed matrix X that we decompose as a product A . S where S is sparse (this decomposition is done in two stages but that is the spirit).

Here it looks as if you were estimating a pseudo inverse of A:
Omega . X = Omega . A . S = S with sparse S.; Thursday, January 19, 2012 at 3:34:00 AM CST
Igor said...: Matthieu

Are you sure ?

"...Here it looks as if you were estimating a pseudo inverse of A:
Omega . X = Omega . A . S = S with sparse S..."

here we have :

A = \Omega X

both \Omega and A are unknown, while we have a known X. I am not sure what you mean by estimating a pseudo inverse for A.

Igor.

I; Thursday, January 19, 2012 at 5:39:00 AM CST
Matthieu Puigt said...: Igor,

I used the classical source separation notations for X, A, and S (where X is the observed matrix, A the unknown "mixing" matrix, and S the unknown source matrix) but I incorrectly used Omega. In source separation, we use to denote by W the (pseudo-)inverse of A and I will keep this notation hereafter.

Most of the SCA approaches work in two stages: we first estimate A and obtain a "classical" inverse problem that we solve by assuming that S is sparse.

Alternatively, Independent Component Analysis (ICA) methods directly estimate W and S, such that W . X = S, but they usually only process square matrices (and don't need the source matrix S to be sparse).

Here, it looks as if you were mixing both worlds: you estimate (still with the classical source separation notations) non-square matrices W and S, such that W . X = S, with sparse S.

W may be seen as a "demixing" matrix while A is "mixing". Indeed, the above equations yield to
W . X = W . A . S = S
The product W . A is thus an identity matrix (actually, up to permutations and scale factors - classical source separation indeterminacies), hence the use of "pseudo-inverse" in my previous comment.

Is it clearer?

Matthieu; Thursday, January 19, 2012 at 9:33:00 AM CST
Igor said...: Ok, so you are saying the following:

we want to solve for ( a sparse) S and W given X in:

W. X = S

if we were to decompose X as:

X = A.S (dictionary learning)

then by defining W such that W.A = I, we have:

W. (A.S) = (W.A) . S = S

therefore for that to occur:

W = (A^t*A)^(-1)*A^t.

Is that what you meant ?

Igor.; Friday, January 20, 2012 at 6:00:00 AM CST
Matthieu Puigt said...: Igor,

yes, that's what I meant.

Actually, I just had a quick read of the pdf presentation linked in your post. It highlights the previous paper about the "cosparse analysis model", by Nam et al., ICASSP'11 (Elad was a co-author). In this paper, they state that their model may be considered in many missing data problems as e.g. source separation.

Matthieu; Friday, January 20, 2012 at 8:01:00 AM CST
Igor said...: thanks Matthieu.; Friday, January 20, 2012 at 8:47:00 AM CST