The estimation of a sparse vector in the linear model is a fundamental problem in signal processing, statistics, and compressive sensing. This paper establishes a lower bound on the mean-squared error, which holds regardless of the sensing/design matrix being used and regardless of the estimation procedure. This lower bound very nearly matches the known upper bound one gets by taking a random projection of the sparse vector followed by an l_1 estimation procedure such as the Dantzig selector. In this sense, compressive sensing techniques cannot essentially be improved.
On the LinkedIn Compressive Sensing forum there was a question about trying to avoid the vectorization of a 2-D image to solve a CS problem. Among the answers, was that of Zhilin Zhang, here it is:
Your problem is the MMV model  if the support of each column in X is the same. There are many algorithms for the MMV model. Basically, these existing algorithms largely ignore the correlation among each column of X (inter-vector correlation), and thus have poor performance when such correlation is high. Recently, I have developed several algorithms in the framework of Sparse Bayesian Learning (SBL) , which effectively exploit such correlation information and achieve excellent performance. I compared almost all the MMV algorithms in terms of recovery performance, but no one was better than mine (including the compressive MUSIC suggested by Igor). You can download my manuscript here:http://arxiv.org/abs/1102.3949.
But admittedly, Bayesian algorithms are much slower than non-Bayesian algorithms. If you seek fast speed, you may consider non-Bayesian algorithms, such as the compressed MUSIC.
Note that algorithms for the basic MMV model are basically using the summarized row norms of X as penalty (except to mine). This is due to the common sparsity assumption . However, if the support of each column in X is slowly changing, you may want to consider the dynamic compressed sensing model . There are some algorithms. However, even for this model, you can apply the MMV algorithms, as shown in my blog:
If the support of each column in X is largely different (the overlapping of the support of neighboring columns is very small), while X is really sparse (very few nonzero elements), the above two models may not be useful. You may look at other extended MMV models, such as using the trace of X, hierarchy of norms, or something else (depends on the structured sparsity in X) as penalties.
 SF Cotter, BD Rao, Sparse solutions to linear inverse problems with multiple measurement vectors, IEEE Trans. on Signal Processing, 2005
 Z.Zhang, B.D.Rao, Sparse Signal Recovery with Temporally Correlated Source Vectors Using Sparse Bayesian Learning Submitted to IEEE Journal of Selected Topics in Signal Processing, [arXiv:1102.3949]
 Namrata Vaswani, "LS-CS-residual (LS-CS): Compressive Sensing on the Least Squares Residual", IEEE Trans. Signal Processing, August 2010