Saturday, January 02, 2010

CS: Finding the Restricted Isometry Constant, Random SVD, ART in CTs ?, Patents, Cheap CS Sensing ?

In the previous post, Zainul Charbiwala asks the following question:
Thanks for listing my paper. It's our tiny attempt at contributing to the CS field.

I'd like to point out something about this paper. I did the completely unthinkable and *computed* the RIP constant explicitly through a Monte Carlo type simulation. Now, I know that this is technically incorrect, because the RIP is defined over *all* possible sub-matrices, but I found this technique to provide fairly consistent results that match our intuition about the "goodness" of sensing matrices, especially when we're trying to compare one type of sensing system with another.

I confess that I could have taken an analytical route to illustrate my point, but could you (or other readers) comment on their experience with computing the RIP explicitly ?



Thanks Zainul for asking the question. In the following presentation, Tractable Upper Bounds on the Restricted Isometry Constant by Alexandre d’Aspremont, Francis Bach and Laurent El Ghaoui, there is an attempt at computing bounds on the Restricted Isometry constant. The Sparse PCA code can be found here. Since the RIP is only a sufficient condition, there is also the necessary and sufficient condition called the Null Space Property, for which bounds can also be computed.
To see the differences and connections between these two properties, you really want to check out the description given by Jared Tanner and Remi Gribonval in the entry CS: RIP vs NSP, Clarifications by Jared Tanner and Remi Gribonval. If anybody else has a better answer, please feel free to add to the discussion.
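
For readers who want to try the same experiment, below is a minimal sketch of the kind of Monte Carlo estimate Zainul describes, assuming a Gaussian sensing matrix with normalized columns. The function and parameter names are illustrative, and since only a random subset of the supports is visited, the result is a lower bound on the true constant rather than the constant itself:

```python
# A minimal sketch of a Monte Carlo style RIP estimate.
# Only a random subset of the C(n, k) supports is sampled, so this
# gives a lower bound on the true restricted isometry constant delta_k.
import numpy as np

def monte_carlo_rip(A, k, trials=1000, rng=None):
    """Estimate delta_k of A by sampling random supports of size k and
    recording the worst deviation of the extreme singular values from 1."""
    rng = np.random.default_rng() if rng is None else rng
    m, n = A.shape
    delta = 0.0
    for _ in range(trials):
        support = rng.choice(n, size=k, replace=False)
        s = np.linalg.svd(A[:, support], compute_uv=False)
        delta = max(delta, abs(s[0]**2 - 1.0), abs(s[-1]**2 - 1.0))
    return delta

# Example: a 64 x 256 Gaussian matrix scaled so columns have unit
# expected norm, probed at sparsity k = 8.
m, n, k = 64, 256, 8
A = np.random.default_rng(0).normal(size=(m, n)) / np.sqrt(m)
print(monte_carlo_rip(A, k, trials=500))
```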



While reading Hal Daume III's excellent blog on NIPS, I noticed a description by one of his grad students, Piyush Rai, of Gunnar Martinsson's NIPS tutorial entitled: Randomization: Making Very Large Scale Linear Algebraic Computations Possible. The presentation hints at the possibility of performing faster diffusion maps when one deals with very large problems. Diffusion maps are one of the few techniques in Machine Learning that enable the parametrization of nonlinear manifolds. What is interesting here is how this SVD algorithm uses random projections (see slides 132-134 for a connection to CS, concentration of measure and structured random projections). The reference paper related to the tutorial is: Finding structure with randomness: Stochastic algorithms for constructing approximate matrix decompositions by Nathan Halko, Per-Gunnar Martinsson and Joel A. Tropp. The abstract
Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys recent research which demonstrates that *randomization* offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. In particular, these techniques offer a route toward principal component analysis (PCA) for petascale data.
This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed - either explicitly or implicitly - to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, speed, and robustness. These claims are supported by extensive numerical experiments and a detailed error analysis.
Mark Tygert's page has an implementation of that algorithm here. So at some point the practicing Machine Learning engineer either uses random projections and diffusion maps to evaluate the nonlinear manifold, or she uses the SVD method above directly to produce diffusion maps from the high-dimensional data. While interesting, the second approach still expects the kernel matrix to be computed, and sometimes this is just not possible in a reasonable amount of time (that's why I would stick with the first option).
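
To make the randomized approach concrete, here is a minimal sketch of the range-finder idea surveyed in the paper above, written with NumPy. The oversampling and power-iteration parameters are illustrative choices, not the paper's prescriptions:

```python
# A minimal sketch of a randomized SVD: sample the range of A with a
# Gaussian test matrix, compress A to that subspace, then take an exact
# SVD of the small compressed matrix.
import numpy as np

def randomized_svd(A, rank, oversample=10, n_iter=2, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    m, n = A.shape
    omega = rng.normal(size=(n, rank + oversample))  # random test matrix
    Y = A @ omega                                    # sample the range of A
    for _ in range(n_iter):                          # power iterations help when the spectrum decays slowly
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)                           # orthonormal basis for the sampled range
    B = Q.T @ A                                      # small (rank+oversample) x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub
    return U[:, :rank], s[:rank], Vt[:rank, :]

# Example: recover a rank-10 approximation of a 2000 x 500 matrix.
rng = np.random.default_rng(1)
A = rng.normal(size=(2000, 10)) @ rng.normal(size=(10, 500))
U, s, Vt = randomized_svd(A, rank=10)
print(np.linalg.norm(A - (U * s) @ Vt))              # should be close to zero
```

The expensive steps are a handful of passes over A (matrix-matrix products), which is what makes this kind of scheme attractive when A is too large for a classical dense SVD.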



...3.2 Commercial patents

There is hardly better evidence for the value of projection methods than the many patents for commercial purposes that include them. Projection methods are used in commercial devices in many areas. Unfortunately, if a device is truly commercial, then the algorithm that is actually used in it is proprietary and usually not published. Many commercial emission tomography scanners now use some sort of iterative algorithm. A prime example is provided by the commercially-successful Philips Allegro scanners (see http://www.healthcare.philips.com/main/products/ and [29]). In x-ray computerized tomography (CT), there are reports emanating from companies that sell such scanners indicating that variants of ART are used in heart imaging; an example is presented in [54]. The first EMI (Electric & Musical Industries Ltd., London, England, UK) CT scanner, invented by G.N. Hounsfield [53], used a variant of ART. For this pioneering invention, Hounsfield shared the Nobel Prize with A.M. Cormack in 1979. Thirty years later (on September 29, 2009), a patent was issued to Philips (Koninklijke Philips Electronics N.V., Eindhoven, The Netherlands) for a “Method and device for the iterative reconstruction of cardiac images” [77]. The role of projection methods is demonstrated by the following quote from the “Summary of the Invention” included in the Patent Description:

“The iterative reconstruction applied here may particularly be based on an Algebraic Reconstruction Technique (ART) (cf. R. Gordon, R. Bender, and G.T. Herman: “Algebraic reconstruction techniques (ART) for three-dimensional electron microscopy and x-ray photography”, J. Theor. Biol., 29:471–481, 1970) or on a Maximum Likelihood (ML) algorithm (K. Lange and J.A. Fessler: “Globally convergent algorithms for maximum a posteriori transmission tomography”, IEEE Transactions on Image Processing, 4(10):1430–1450, 1995), wherein each image update step uses the projections of a selected subset, i.e., projections corresponding to a similar movement phase.”
The knowledge that ART seems to be employed in current CT scanners is at odds with the review entitled: Why do commercial CT scanners still employ traditional, filtered back-projection for image reconstruction? by Xiaochuan Pan, Emil Sidky and Michael Vannier. Also of interest, Richard Gordon, a co-inventor of ART for CT scanners (with Gabor T. Herman, an author of the paper above), responded to that review in this entry: MIA'09 and Richard Gordon's ART CT algorithm.
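
For readers who have not encountered ART before, here is a minimal sketch of the row-action (Kaczmarz-type) update it is built on, assuming a plain dense linear system; actual CT reconstructions add ordered subsets of projections, relaxation schedules and constraints that are omitted here:

```python
# A minimal sketch of the classical ART / Kaczmarz iteration for A x = b:
# cycle through the rows, projecting the current estimate onto the
# hyperplane defined by each measurement a_i . x = b_i.
import numpy as np

def art(A, b, n_sweeps=50, relax=1.0, x0=None):
    m, n = A.shape
    x = np.zeros(n) if x0 is None else x0.copy()
    row_norms = np.sum(A * A, axis=1)           # squared norms of the rows
    for _ in range(n_sweeps):
        for i in range(m):
            residual = b[i] - A[i] @ x
            x += relax * residual / row_norms[i] * A[i]
    return x

# Example: a small consistent overdetermined system.
rng = np.random.default_rng(2)
A = rng.normal(size=(100, 30))
x_true = rng.normal(size=30)
b = A @ x_true
print(np.linalg.norm(art(A, b) - x_true))       # converges toward x_true
```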

And if you think some of these patenting issues are germane, you might want to take a look at the Bilski case currently in front of the U.S. Supreme Court. From the Wikipedia entry, there is a mention of CT algorithms:

....Some Federal Circuit decisions, however, had held some transformations of signals and data patent-eligible. For example, the Abele decision approved a dependent claim to a method transforming X-ray attenuation data produced in a X-Y field by an X-ray tomographic scanner to an image of body organs and bones — while at the same time the Abele court rejected a more generic and abstract independent claim to a process of graphically displaying variances from their average values of unspecified data obtained in an unspecified manner.
From the Abele decision, there was the following determination:

....In Johnson, supra, the interrelationship of the algorithm to the remaining limitations of a claim was held to be determinative of whether the claim defined statutory subject matter. Relying on the same reasoning, in In re Walter, 618 F.2d 758, 205 USPQ 397 (CCPA 1980), the second part of the two-step analysis was defined as follows:

If it appears that the mathematical algorithm is implemented in a specific manner to define structural relationships between the physical elements of the claim (in apparatus claims) or to refine or limit claim steps (in process claims), the claim being otherwise statutory, the claim passes muster under §101. If, however, the mathematical algorithm is merely presented and solved by the claimed invention, as was the case in Benson and Flook, and is not applied in any manner to physical elements or process steps, no amount of post-solution activity will render the claim statutory; nor is it saved by a preamble merely reciting the field of use of the mathematical algorithm....

Is compressive sensing going to change patents ?

Finally, two projects that are dirt cheap and resonate with me for some reason:

  • The EyeWriter


  • and Structured Light 3D scanning

    Point Clouds with Depth of Field from Kyle McDonald on Vimeo.


    It might be a good little project to try Compressive Structured Light.
2 comments:

    Zainul said...

    Hello Igor,

    Thanks for your elaborate answers. I will surely check those links and code out. It will be an interesting comparison with what I currently have.

    Kudos for maintaining this blog. You've managed to make it quite the center of the CS universe :-)

    Zainul.

    Igor said...

    Zainul,

    Check tomorrow's entry.

    Cheers,

    Igor.
