Saturday, May 28, 2011

Ensemble, We Can Be the Dumbest: The Google Exacycle Program.

In French, ensemble means together, so the title of this entry is really meant as a semi-translation of yesterday's "Together We Can Be the Dumbest" entry. Bob points out that his ensemble method and the DUMBEST algorithm don't seem to match, whereas I point out that they seem to be the same thing. In short, once you have a certain number of measurements, all of us have noticed empirically that the solutions found by two solvers differ, and that only knowledge of the initial solution lets us say which algorithm is better than the other. The problem with this approach is that if you were to use a random lens imager, you probably couldn't make out the real solution. A system that uses as much as possible of what you have already gathered is an important improvement over what I should call oracle-based systems, or over spending resources on an additional measurement. Hence the need to use what you already have efficiently, by running subsets of the measurements through several solvers.
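To make the "subsets of measurements" idea concrete, here is a minimal sketch in Python. It is neither Bob's ensemble method nor the DUMBEST algorithm as such, just one way to read the paragraph above. I am assuming a plain orthogonal matching pursuit as the solver, a known sparsity level k, and a scoring rule of my own choosing: each candidate is fitted on a random subset of the rows and judged on the rows it did not see, so no oracle knowledge of the real solution is needed. The names omp and ensemble_over_subsets are made up for this illustration.

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal matching pursuit: greedy k-sparse fit of y ~ A x."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        correlations = np.abs(A.T @ residual)
        if support:
            correlations[support] = 0.0  # never pick a column twice
        support.append(int(np.argmax(correlations)))
        # Least-squares refit on the columns selected so far.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

def ensemble_over_subsets(A, y, k, n_trials=20, keep=0.8, seed=0):
    """Fit on random row subsets, keep the candidate that best
    predicts the measurements that were held out (assumed rule)."""
    rng = np.random.default_rng(seed)
    m = A.shape[0]
    n_keep = int(keep * m)
    best_x, best_err = None, np.inf
    for _ in range(n_trials):
        rows = rng.choice(m, size=n_keep, replace=False)
        held_out = np.setdiff1d(np.arange(m), rows)
        x = omp(A[rows], y[rows], k)
        err = np.linalg.norm(y[held_out] - A[held_out] @ x)
        if err < best_err:
            best_x, best_err = x, err
    return best_x, best_err

# Toy usage: 60 Gaussian measurements of a 3-sparse signal of length 100.
rng = np.random.default_rng(1)
A = rng.standard_normal((60, 100))
x_true = np.zeros(100)
x_true[[5, 40, 77]] = [1.5, -2.0, 1.0]
x_hat, err = ensemble_over_subsets(A, A @ x_true, k=3)
print("held-out residual of the selected candidate:", err)
```

Swapping omp for a list of different solvers inside the loop would give the "several solvers" version; the held-out score stays the same.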

I note that this idea is so important that folks like Yilun Wang and Wotao Yin have recognized it and made a solver out of it (ISD). They even made clear on their website when they came up with the idea, something I don't often see. Similarly, the use of all the measurements you already have also seems to be there in the Kalman filter and beyond work of Namrata Vaswani et al.

Figure is from Recovery of Compressively Sensed Sparse Signals using Cyclic OLS[1]

Anyway, in about the same way an ensemble solver was first on the leaderboard of the Netflix prize (but did not win), we ought to collectively investigate how the flurry of solvers can help us do better reconstruction and map better DT phase transitions. This should not work only for sparse signal reconstruction, as we ought to also consider the same approach for the current crop of structured sparsity, MMV, robust PCA and Matrix Completion solvers. The real question is: Is the reason an acronym like DUMBEST won't get funded the same reason why somebody investigating Ensemble behavior won't garner much support? If it is, then all y'all ought to consider the Google Exacycle program. Imagine that even if you are in love with one algorithm (yours :-)), an ensemble approach could be to merge the results of that same solver run with different parameters (and we know how important some of these parameters sometimes are), in an approach parallel to epsilon photography. The possibilities are so endless that the 100 million CPU hours now don't look as large as one would think. The other part of it is that I don't know of one problem in engineering and applied math that

  • has so many repercussions on so many problems, while at the same time
  • is embarrassingly parallel.
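
As a small illustration of the "same solver, different parameters" idea and of why it parallelizes so easily, here is a hedged Python sketch. I am assuming ISTA (iterative soft thresholding) as the solver and its l1 weight lam as the parameter being swept; the helper names ista and sweep, the holdout size and the "keep the run with the smallest holdout residual" merge rule are my own choices for the example, not anything prescribed by a particular paper. Each run is independent of the others, which is the embarrassingly parallel part.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def ista(A, y, lam, n_iter=500):
    """Iterative soft thresholding for min 0.5*||Ax - y||^2 + lam*||x||_1."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - step * (A.T @ (A @ x - y))                         # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # shrinkage
    return x

def sweep(A, y, lams, n_val=10, workers=4):
    """Run the same solver once per lam, in parallel, and score each
    run on the last n_val measurements, which are kept out of the fit."""
    A_fit, y_fit = A[:-n_val], y[:-n_val]
    A_val, y_val = A[-n_val:], y[-n_val:]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        xs = list(pool.map(lambda lam: ista(A_fit, y_fit, lam), lams))
    errs = [float(np.linalg.norm(y_val - A_val @ x)) for x in xs]
    best = int(np.argmin(errs))
    return lams[best], xs[best], errs

# Toy usage: four values of lam on one small problem.
rng = np.random.default_rng(0)
A = rng.standard_normal((60, 100))
x_true = np.zeros(100)
x_true[[5, 40, 77]] = [1.5, -2.0, 1.0]
best_lam, x_hat, errs = sweep(A, A @ x_true, [10.0, 1.0, 0.1, 0.01])
print("selected lam:", best_lam)
```

Threads are used here only to keep the sketch short; on a real cluster each (solver, parameter) pair would simply be its own job.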

If nobody tells me they are sending something to the Google Exacycle program, I'll send a placeholder. This is important.


Bob et Carla said...

Hi Igor,

I have commented on this post and made more clear what I was saying: I also look at the case where I don't know which solution is the best given the real solution.

Igor said...


I just added a comment to the comment you made on this post.