Tuesday, December 21, 2010

The Monday Nuit Blanche Mailbag: A need for an automatic benchmarking tool?


Here is what I received in my mailbox recently. The author wants to remain anonymous:
Dear Igor.

Since your blog has become the go-to place for all information related to CS, it would help if you could maintain an up-to-date comparison of all the recovery algorithms (OMP family/Basis Pursuit/ ???) that are being proposed. This could take the form of a table that includes error bounds, time and space complexity, a chart showing the phase transitions for these algorithms, links to papers, source code, etc.

The chart could compare algorithm performance for standard matrices (e.g. Gaussian or Bernoulli) with respect to a single set of parameters (sparsity, undersampling ratio, signal type, etc.).

This would be a great help to anyone who wants to compare their work with the state-of-the-art, which is rapidly changing. Also, someone might just want to pick the best algorithm for a particular application, without being aware of all the work done in the field.

I am aware that individual papers do include such comparisons, but they go out of date quickly, or are not comprehensive enough.
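To make the request a little more concrete, here is a minimal sketch (in Python/NumPy, purely for illustration, not an existing tool) of the kind of sweep that would produce such a phase-transition chart: draw Gaussian problems over a grid of undersampling and sparsity ratios, run every registered solver, and record the empirical success rate and average run time. The `make_problem` and `benchmark` names and the `solve(A, y, k)` solver interface are my own conventions, not taken from any of the papers.

```python
# Hypothetical phase-transition benchmark harness (a sketch, not an existing tool).
# For each (undersampling, sparsity) cell it draws Gaussian problems, calls every
# registered solver, and records the empirical success rate and average run time.
import time
import numpy as np

def make_problem(n, m, k, rng):
    """Gaussian sensing matrix, k-sparse Gaussian signal, noiseless measurements."""
    A = rng.standard_normal((m, n)) / np.sqrt(m)
    x = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    x[support] = rng.standard_normal(k)
    return A, A @ x, x

def benchmark(solvers, n=200, deltas=(0.2, 0.5, 0.8), rhos=(0.1, 0.3, 0.5),
              trials=10, tol=1e-3, seed=0):
    """solvers: dict mapping a name to a callable solve(A, y, k) -> x_hat."""
    rng = np.random.default_rng(seed)
    results = {}   # (solver name, delta, rho) -> (success rate, mean seconds)
    for delta in deltas:                      # undersampling ratio m/n
        m = max(1, int(delta * n))
        for rho in rhos:                      # sparsity ratio k/m
            k = max(1, int(rho * m))
            for name, solve in solvers.items():
                successes, elapsed = 0, 0.0
                for _ in range(trials):
                    A, y, x_true = make_problem(n, m, k, rng)
                    t0 = time.perf_counter()
                    x_hat = solve(A, y, k)
                    elapsed += time.perf_counter() - t0
                    rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
                    successes += rel_err < tol
                results[(name, delta, rho)] = (successes / trials, elapsed / trials)
    return results
```

Plotting the success rates over a finer grid of (m/n, k/m) values would give the phase-transition chart the reader describes; the error bounds, links to papers and source code would still have to be curated by hand.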

Two thoughts come to mind:
  • One of you has had something along the lines of what is being described running in the cloud. For the moment it has remained private.
  • If this is just for greedy algorithms, there is a way to avoid relying on Matlab altogether (a small sketch follows below). So my question really is: how much are people willing to pay for the convenience of finding out these results? I can set up a cloud-based solution for this problem, but are there customers?
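On the Matlab point: a textbook greedy solver such as Orthogonal Matching Pursuit fits in a few lines of plain NumPy. This is only a sketch of the classic algorithm, not a tuned implementation, and it plugs into the hypothetical `benchmark` harness above.

```python
# Minimal sketch of Orthogonal Matching Pursuit in plain NumPy, to illustrate
# that a greedy recovery algorithm needs no Matlab; textbook version, not tuned.
import numpy as np

def omp(A, y, k):
    """Recover a k-sparse x from y ~= A @ x by greedy support selection."""
    m, n = A.shape
    residual = y.copy()
    support = []
    for _ in range(k):
        # pick the column most correlated with the current residual
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        # least-squares fit on the current support, then update the residual
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(n)
    x_hat[support] = coef
    return x_hat
```

Calling `benchmark({"OMP": omp})` from the sketch above would then fill in one row of the comparison.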
Relevant posts where this issue of benchmarks was mentioned:

2 comments:

Absalon said...

Comparing algorithms based on run time is a very unreliable process. The difference between a good implementation and a bad implementation of the same algorithm can easily be a factor of ten in run time.

Igor said...

Absalon,

Maybe so, but I don't believe it. If the algorithms are out in the open and running on the same machine, then, in the end, they will be optimized and become comparable. Speed is not the only issue either: reconstruction quality is another, and simply adding other assumptions is also a good dimension to take into account.

Cheers,

Igor.
