Friday, June 26, 2015

Random Projections as Regularizers, Compressive Linear Least Squares Regression and more

We have seen a few papers from these authors recently; here are a few more from the same groups:


Random Projections as Regularizers: Learning a Linear Discriminant from Fewer Observations than Dimensions by Robert Durrant, Ata Kabán 
We prove theoretical guarantees for an averaging-ensemble of randomly projected Fisher Linear Discriminant classifiers, focusing on the case when there are fewer training observations than data dimensions. The specific form and simplicity of this ensemble permits a direct and much more detailed analysis than existing generic tools in previous works. In particular, we are able to derive the exact form of the generalization error of our ensemble, conditional on the training set, and based on this we give theoretical guarantees which directly link the performance of the ensemble to that of the corresponding linear discriminant learned in the full data space. To the best of our knowledge these are the first theoretical results to prove such an explicit link for any classifier and classifier ensemble pair. Furthermore we show that the randomly projected ensemble is equivalent to implementing a sophisticated regularization scheme to the linear discriminant learned in the original data space and this prevents overfitting in conditions of small sample size where pseudo-inverse FLD learned in the data space is provably poor. Our ensemble is learned from a set of randomly projected representations of the original high dimensional data and therefore for this approach data can be collected, stored and processed in such a compressed form. We confirm our theoretical findings with experiments, and demonstrate the utility of our approach on several datasets from the bioinformatics domain and one very high dimensional dataset from the drug discovery domain, both settings in which fewer observations than dimensions are the norm.
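To get a feel for the construction, here is a minimal sketch of an averaging ensemble of randomly projected FLD classifiers for binary data. All names and the Gaussian choice of projection matrix are illustrative assumptions, not the authors' code:

```python
import numpy as np

def rp_fld_ensemble_predict(X_train, y_train, x_test, k=5, n_members=20, seed=0):
    """Averaging ensemble of Fisher Linear Discriminants, each learned in a
    random k-dimensional projection of the d-dimensional data (classes 0/1)."""
    rng = np.random.default_rng(seed)
    d = X_train.shape[1]
    scores = []
    for _ in range(n_members):
        R = rng.standard_normal((k, d)) / np.sqrt(k)   # Gaussian RP matrix
        Z = X_train @ R.T                               # compressed training data
        mu0 = Z[y_train == 0].mean(axis=0)
        mu1 = Z[y_train == 1].mean(axis=0)
        # pooled within-class covariance in the projected space: a k x k matrix,
        # invertible even when the full d x d covariance is singular (n < d)
        Sw = np.cov(np.vstack([Z[y_train == 0] - mu0, Z[y_train == 1] - mu1]).T)
        w = np.linalg.solve(Sw, mu1 - mu0)              # FLD direction
        z = x_test @ R.T
        scores.append(z @ w - 0.5 * (mu0 + mu1) @ w)    # signed discriminant score
    return int(np.mean(scores) > 0)                     # average, then threshold
```

Note that each member only ever sees the k-dimensional projections, which is why data can be collected and stored in compressed form.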
New Bounds on Compressive Linear Least Squares Regression by Ata Kabán

In this paper we provide a new analysis of compressive least squares regression that removes a spurious log N factor from previous bounds, where N is the number of training points. Our new bound has a clear interpretation and reveals meaningful structural properties of the linear regression problem that make it solvable effectively in a small dimensional random subspace. In addition, the main part of our analysis does not require the compressive matrix to have the Johnson-Lindenstrauss property or the RIP property. Instead, we only require its entries to be drawn i.i.d. from a 0-mean symmetric distribution with finite first four moments.
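The compressive regression scheme being analyzed is simple to state: compress the d features with a random matrix, solve ordinary least squares in the k-dimensional subspace, and lift the solution back. A minimal sketch (Gaussian entries used for simplicity; any i.i.d. zero-mean symmetric distribution with finite first four moments would do, per the abstract; names are illustrative):

```python
import numpy as np

def compressive_ols(X, y, k, seed=0):
    """Compressive least squares: solve OLS on randomly compressed features."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    R = rng.standard_normal((d, k)) / np.sqrt(k)   # random compression matrix
    Z = X @ R                                      # n x k compressed design
    w_k, *_ = np.linalg.lstsq(Z, y, rcond=None)    # OLS in the k-dim subspace
    return R @ w_k                                 # map back to a d-dim predictor
```

With k much smaller than d, the normal equations shrink from d x d to k x k, which is where the computational saving comes from; the paper's bounds quantify the excess risk this compression incurs.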



In this paper, we present a new variant of EDA for high dimensional continuous optimisation, which extends a recently proposed random projections (RP) ensemble based approach by employing heavy tailed random matrices. In particular, we use random matrices with i.i.d. t-distributed entries. The use of t-distributions may look surprising in the context of random projections, however we show that the resulting ensemble covariance is enlarged when the degree of freedom parameter is lowered. Based on this observation, we develop an adaptive scheme to adjust this parameter during evolution, and this results in a flexible means of balancing exploration and exploitation of the search process. A comprehensive set of experiments on high dimensional benchmark functions demonstrate the usefulness of our approach.
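Generating a projection matrix with i.i.d. Student-t entries is straightforward via the scale-mixture representation t = normal / sqrt(chi-square(dof)/dof); a quick sketch (function name is illustrative, not from the paper):

```python
import numpy as np

def t_rp_matrix(k, d, dof, rng):
    """k x d random projection matrix with i.i.d. Student-t entries, built
    via the normal/chi-square scale-mixture representation of the t."""
    g = rng.standard_normal((k, d))
    s = np.sqrt(rng.chisquare(dof, size=(k, d)) / dof)
    return g / s  # lower dof => heavier tails, larger typical entries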



Multivariate Cauchy EDA Optimisation by Momodou L. Sanyang, Ata Kabán




We consider Black-Box continuous optimization by Estimation of Distribution Algorithms (EDA). In continuous EDA, the multivariate Gaussian distribution is widely used as a search operator, and it has the well-known advantage of modelling the correlation structure of the search variables, which univariate EDA lacks. However, the Gaussian distribution as a search operator is prone to premature convergence when the population is far from the optimum. Recent work suggests that replacing the univariate Gaussian with a univariate Cauchy distribution in EDA holds promise in alleviating this problem because it is able to make larger jumps in the search space due to the Cauchy distribution's heavy tails. In this paper, we propose the use of a multivariate Cauchy distribution to blend together the advantages of multivariate modelling with the ability of escaping early convergence to efficiently explore the search space. Experiments on 16 benchmark functions demonstrate the superiority of multivariate Cauchy EDA against univariate Cauchy EDA, and its advantages against multivariate Gaussian EDA when the population lies far from the optimum.
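Sampling the multivariate Cauchy search operator is easy because the multivariate Cauchy is a multivariate Student-t with one degree of freedom, i.e. a correlated Gaussian divided by an independent chi scale. A minimal sketch of such a sampler (illustrative names, not the authors' code):

```python
import numpy as np

def sample_mv_cauchy(mean, cov, n, rng):
    """Draw n samples from a multivariate Cauchy (multivariate t with 1 d.o.f.)
    via its scale-mixture representation: x = mean + g / s, with g a correlated
    Gaussian draw and s = sqrt(chi-square(1)), one shared scale per sample."""
    d = len(mean)
    L = np.linalg.cholesky(cov)
    g = rng.standard_normal((n, d)) @ L.T        # correlated Gaussian draws
    s = np.sqrt(rng.chisquare(1, size=(n, 1)))   # shared heavy-tail scale
    return mean + g / s                          # heavy-tailed, correlated jumps
```

Because the single scale s multiplies the whole correlated vector, the occasional tiny s produces a long jump along the modelled correlation structure: exactly the blend of multivariate modelling and exploration the abstract describes.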

Two approaches of Using Heavy Tails in High dimensional EDA by Momodou L. Sanyang, Hanno Muehlbrandt, Ata Kabán

We consider the problem of high dimensional black-box optimisation via Estimation of Distribution Algorithms (EDA). The Gaussian distribution is commonly used as a search operator in most of the EDA methods. However there are indications in the literature that heavy tailed distributions may perform better due to their higher exploration capabilities. Univariate heavy tailed distributions were already proposed for high dimensional problems. In 2D problems it has been reported that a multivariate heavy tailed (such as Cauchy) search distribution is able to blend together the strengths of multivariate modelling with a high exploration power. In this paper, we study whether a similar scheme would work well in high dimensional search problems. To get around the difficulty of multivariate model building in high dimensions we employ a recently proposed random projections (RP) ensemble based approach which we modify to get samples from a multivariate Cauchy using the scale-mixture representation of the Cauchy distribution. Our experiments show that the resulting RP-based multivariate Cauchy EDA consistently improves on the performance of the univariate Cauchy search distribution. However, intriguingly, the RP-based multivariate Gaussian EDA has the best performance among these methods. It appears that the highly explorative nature of the multivariate Cauchy sampling is exacerbated in high dimensional search spaces and the population based search loses its focus and effectiveness as a result. Finally, we present an idea to increase exploration while maintaining exploitation and focus by using the RP-based multivariate Gaussian EDA in which the RP matrices are drawn with i.i.d. heavy tailed entries. This achieves improved performance and is competitive with the state of the art.
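The RP-ensemble device that makes multivariate model building tractable in high dimensions can be sketched roughly as follows: fit a cheap full-covariance Gaussian in each of several random low-dimensional projections of the selected population, sample there, lift the samples back, and combine. Everything below is a plausible sketch under stated assumptions; in particular, the d/(k * n_proj) rescaling is a heuristic stand-in for the paper's exact correction, and all names are made up for illustration:

```python
import numpy as np

def rp_ensemble_gaussian_sample(population, n_offspring, k, n_proj, rng):
    """One sampling step of an RP-ensemble Gaussian EDA (illustrative sketch):
    fit a k x k Gaussian model in each random projection, sample, lift back."""
    n, d = population.shape
    mu = population.mean(axis=0)
    Xc = population - mu
    out = np.zeros((n_offspring, d))
    for _ in range(n_proj):
        R = rng.standard_normal((k, d)) / np.sqrt(k)   # random k-dim subspace
        Z = Xc @ R.T                                   # project the population
        C = np.cov(Z.T) + 1e-8 * np.eye(k)             # k x k model: cheap to fit
        L = np.linalg.cholesky(C)
        samples_k = rng.standard_normal((n_offspring, k)) @ L.T
        out += samples_k @ R * (d / (k * n_proj))      # lift back and combine
    return mu + out
```

The key point is that only k x k covariances are ever estimated and factorized, so the cost of full multivariate modelling in d dimensions is avoided; swapping the Gaussian draws for scale-mixture Cauchy draws yields the multivariate Cauchy variant studied here.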

How effective is Cauchy-EDA in high dimensions? by Momodou L. Sanyang and Ata Kabán

We consider the problem of high dimensional black-box optimisation via Estimation of Distribution Algorithms (EDA). The Gaussian distribution is commonly used as a search operator in most of the EDA methods. However there are indications in the literature that heavy tailed distributions may perform better due to their higher exploration capabilities. Univariate heavy tailed distributions were already proposed for high dimensional problems. In 2D problems it has been reported that a multivariate heavy tailed (such as Cauchy) search distribution is able to blend together the strengths of multivariate modelling with a high exploration power. In this paper, we study whether a similar scheme would work well in high dimensional search problems. To get around the difficulty of multivariate model building in high dimensions we employ a recently proposed random projections (RP) ensemble based approach which we modify to get samples from a multivariate Cauchy using the scale-mixture representation of the Cauchy distribution. Our experiments show that the resulting RP-based multivariate Cauchy EDA consistently improves on the performance of the univariate Cauchy search distribution. However, intriguingly, the RP-based multivariate Gaussian EDA has the best performance among these methods. It appears that the highly explorative nature of the multivariate Cauchy sampling is exacerbated in high dimensional search spaces and the population based search loses its focus and effectiveness as a result. Finally, we present an idea to increase exploration while maintaining exploitation and focus by using the RP-based multivariate Gaussian EDA in which the RP matrices are drawn with i.i.d. heavy tailed entries. This achieves improved performance and is competitive with the state of the art.
 
 
Join the CompressiveSensing subreddit or the Google+ Community and post there!
Liked this entry? Subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.
