Nuit Blanche: A Small Q&A with Phil Schniter on TurboGAMP.

Wednesday, September 21, 2011

A Small Q&A with Phil Schniter on TurboGAMP.

Following up on Monday's "TurboGAMP: Joint channel-estimation/equalization/decoding of BICM-OFDM signals transmitted over channels with clustered-sparse impulse responses." that featured Phil Schniter's work on some blind deconvolution and some intriguing change in the Donoho-Tanner phase trnasition thanks to structured sparsity, I went ahead and asked Phil some dumb questions:

I need some insight ... From what I read in your paper and attendant presentations, It looks like you are solving a blind deconvolution problem i.e y = A x and you don't know both A and x but have the ability to test it through several x's. Am I right ? It looks the reason you can actually make that fast of an evaluation of both A and x (once A is known) is that you are using several signals x that are sparse in some structured way. Did I understand correctly your thesis ? In the affirmative, I could not locate A in your paper. If not, please show me the light.

Cheers,

Igor.

Phil kindly answered with:

hi igor,
...

the OFDM-based blind convolution problem that we're solving can be stated as y = Diag(s) A x + n. here, Diag(s) is the diagonal matrix created from the vector s of symbols from a finite alphabet, with some elements known as "pilot" symbols, and the rest (the "data" symbols) are unknown. A is a tall matrix constructed from the left columns of the DFT matrix, x is sparse or compressible, and n is additive white Gaussian noise.

in traditional compressed sensing, which performs only channel estimation (not data detection), the pilot measurements are extracted to get y_p = Diag(s_p) A_p x + n_p, where now s_p are completely known and A_p is a wide matrix. this yields a traditional sparse reconstruction problem that can be attacked by any of the standard solvers. but the performance is limited by the fact that many measurements (i.e., those corresponding to data symbols) have been ignored.

in our approach, we keep the original model y = Diag(s) A x + n, which contains more measurements, but whose data-symbol measurements are each complicated by a finite-alphabet multiplicative uncertainty. fortunately, there are a couple of algorithms that can be modified to handle this kind of uncertainty: GAMP and RBP.

furthermore, the unknown data symbols within s are structured, since the bits underlying those symbols have been generated using some error-correction coding scheme. moreover, the channel coefficients x are not only sparse, but structured-sparse, since the large coefficients come in clusters. we exploit the structures in both s and x using the "turbo" approach, by mating GAMP (or RBP) with apropriate soft-input soft-output processing blocks. these additional structures help significantly with the joint estimation of s and x.

i hope that answers your questions. thanks again for your interest.

thanks,
phil

But then I was too dim, so I had to ask:

Phil,

I am sorry if this sounds like an ignorant question but can you be more specific when you say " Diag(s) is the diagonal matrix created from the vector s of symbols from a finite alphabet, with some elements known as "pilot" symbols, and the rest (the "data" symbols) are unknown. ", I don't understand that sentence.

Cheers,

Igor.

Phil then patiently went back to communication 101 and kindly replied:

Sorry, I was assuming too much background knowledge. Let me try to give you a brief primer on digital communications which leads up to the model.

In digital communications, the goal is to communicate a sequence of information bits from the transmitter to the receiver. This is done as follows. The information bits are first processed by an "error control coder" that structures them in a way that makes the coded bit sequence very robust to errors. These coded bits are then grouped to form a sequence of "symbols", which are scalar numbers from a (usually non-binary) finite set or "alphabet" (e.g., {-3, -1, 1, 3}). Among these data symbols, the transmitter interleaves reference ("pilot") symbols that are known apriori by the receiver, and used by the receiver to aid in data-symbol recovery. The complete symbol sequence (with data and pilots) is then converted into a continuous waveform by modulating a sequence of smooth pulses according to the symbol values. In the "OFDM" system that I am considering, this latter procedure is best described in the frequency domain. The symbol sequence [-1, 3, -3, -1...], for example, would yield a waveform taking the value -1 at frequency zero, smoothly transitioning to the value 3 at frequency one, smoothly transitioning to the value -3 at frequency two, and so on. This continuous waveform then propagates through a "channel" that is assumed to be well modeled as a linear time-invariant system, corrupted by additive white Gaussian noise.

At the receiver, the noisy channel output waveform is sampled to give a sequence of measurements that are later processed in order to extract the symbol values. In the OFDM system that I am considering, the sampling is done in the frequency domain, yielding

y = Diag(s) h + n

where 's' denotes the vector of (data and pilot) symbols, whose elements live in some finite set (e.g., {-3, -1, 1, 3}); where 'h' is a vector of frequency-domain channel gains; and where 'n' is a vector of frequency-domain noise samples. Notice that 'Diag(s) h' implements a point-wise multiplication between the vectors s and h. So far, however, there is nothing "sparse" about the model. To expose the sparsity, we transform the frequency-domain channel gains into the time domain via 'h = A x', where A is a DFT matrix, and x is the channel impulse response. Finally, due to the physics of signal propagation, we know that x is sparse (and, in fact, clustered-sparse).

After all of this, we have the model

y = Diag(s) A x + n

where the frequency-domain received samples y are known, the DFT matrix A is known, and certain elements in the vector s (i.e., the pilots) are known. The other elements in the vector s (the information-bearing data symbols), the sparse channel impulse response x, and the Gaussian noise n, are all unknown.

Let me know if that makes sense now.

It does now. Thanks Phil for being a good sports and patient a the same time. I note that this is not a full blind deconvolution where the measurment matrix is totally unknown. I also note the need to have part of the message known in advance in order to perform this blind deconvolution, I wonder how this could be applied to imaging systems....