Friday, November 14, 2008

CS: CS in Finance and in the Oil Business."You’ve got Terry Tao talking to geoscientists, what do you want?", Toad Classification and Some Seminars

I had seen some empirical use of the L_1 minimization technique used in finance before. However, the talk by Ingrid Daubechies on portfolio management ( the video is here in the flv format) and the l_1 (and l_p) regularization brings some sound arguments and theoretical discussion to the subject. The paper is entitled: Sparse and stable Markowitz portfolios, by Joshua Brodie, Ingrid Daubechies, Christine De Mol, Domenico Giannone and Ignace Loris, The abstract reads:

We consider the problem of portfolio selection within the classical Markowitz meanvariance framework, reformulated as a constrained least-squares regression problem. We propose to add to the objective function a penalty proportional to the sum of the absolute values of the portfolio weights. This penalty regularizes (stabilizes) the optimization problem, encourages sparse portfolios (i.e. portfolios with only few active positions), and allows to account for transaction costs. Our approach recovers as special cases the noshort-positions portfolios, but does allow for short positions in limited number. We implement this methodology on two benchmark data sets constructed by Fama and French. Using only a modest amount of training data, we construct portfolios whose out-of-sample performance, as measured by Sharpe ratio, is consistently and significantly better than that of the naive evenly-weighted portfolio which constitutes, as shown in recent literature, a very tough benchmark.

Because one of the author works at the European Central Bank, this working paper is hosted on  their site. One would hope that cross disciplinary investigations like this one are given some impetus in other directions such as enforcement and market understanding. Trust can only be enabled when the Feds have a real ability to understand and act on crises such as the one we are witnessing (see the previous entry Trust but Verify). While we are on the subject of interdisciplinary research I noticed the following in a recent IPAM brochure about Mark Green:

"He [Mark Green ] allows himself “one small success story”: The 2004 program, Multiscale Geometry and Analysis in High Dimensions, brought together some of the greatest minds in pure mathematics: UCLA math professor Terry Tao, his Caltech collaborator Emmanuel Candes and Justin Romberg (Georgia Tech). During the program, they produced breakthrough results in compressed sensing, an area that has concrete and emerging applications in imaging, astronomy, MRI, signal processing, and linear coding, among others. This research is now the subject of a Defense Advanced Research Projects Agency (DARPA) program with $25 million in funding, which Mark points out is more money that IPAM has spent in its entire existence. His favorite moment of the program came when the NSF panel – which was simultaneously conducting its first site visit – asked: Is this interdisciplinary work? Participant David Donoho (statistics, Stanford), another major contributor to the genesis of compressed sensing, reportedly exclaimed, “You’ve got Terry Tao talking to geoscientists, what do you want?

In the meantime, one can always apply the approach given in the portofio paper to the Million dollar portfolio competition organized by CNBC. If one were to have a matrix R made up of columns indexed with a discrete time scale and rows indexed on the choice of particular forex exchange pair, the R matrix would, unlike the case of this paper, be underdetermined and would resemble a measurement matrix found in compressed sensing. The solution to this problem would be a sparse vector representing the time when the investor has taken position on the Forex market. Since it would take forever to find out if that measurement matrix follows the restricted isometry property it is likely that CS will not give an edge to whoever is using it for the competition.

Here is a new thesis from MIT entitled Estimation of channelized features in geological media using sparsity constraint by Behnam Jafarpour. The abstract of the thesis reads:
In this thesis, a new approach is studied for inverse modeling of ill-posed problems with spatially continuous parameters that exhibit sparseness in an incoherent basis (e.g. a Fourier basis). The solution is constrained to be sparse in the transform domain and the dimension of the search space is effectively reduced to a low frequency subspace to improve estimation efficiency. The solution subspace is spanned by a subset of a discrete cosine transform (DCT) basis containing low-frequency elements. The methodology is related to compressive sensing, which is a recently introduced paradigm for estimation and perfect reconstruction of sparse signals from partial linear observations in an incoherent basis. The sparsity constraint is applied in the DCT domain and reconstruction of unknown DCT coefficients is carried out through incorporation of point measurements and prior knowledge in the spatial domain. The approach appears to be generally applicable for estimating spatially distributed parameters that are approximately sparse in a transformed domain such as DCT. The suitability of the proposed inversion framework is demonstrated through synthetic examples in characterization of hydrocarbon reservoirs.

Behnam Jafarpour is now a professor of petroleum engineering at Texas A&M and lists " Basis Pursuit and Sparsifying Transforms for Reservoir Parameterization" as one of his research interest.

I also found the following paper entitled:
Lightweight Acoustic Classification for Cane-Toad Monitoring by Thanh Dang and Nirupama Bulusu and Wen Hu. The abstract reads:

We propose a light weight algorithm to classify animals (eg. cane-toads) based on their vocalizations using sharply resource-constrained acoustic sensors. The goal is to enable fast in-network animal classification at the resource constrained sensors so as to minimize energy consumption. Each sensor randomly and independently samples a signal at a sub-Nyquist rate. The vocalization envelopes are extracted and matched with the original signal envelopes to find the best match. The computational complexity of the algorithm is O(n). It also requires less than 2KB of data memory. Our experiments on frog vocalizations show that our approach performs well, providing an accuracy of 90% and a miss rate of less than 5%.

Finally, here a list of upcoming Seminars found on the interwebs:
They are all listed on the calendar.

Credit Images:IMA video screenshot,  Wikipedia section on cane-toad,


Anonymous said...

wow l1 constraints for portfolio optimization, brilliant! we actually managing money have never thought of that (and seen that it doesn't mean shit in practice). our feeble little minds just haven't been able to make the visionary leap required to go from the l2 to the l1!

Igor said...

"Managing money", mmuhh, I would not be so sure as to call it that: