It's Friday afternoon, it's Hamming's time. Let us push the argument of About t'em Random Projections in Random Forests further.
Leo Breiman, and many others after him, were interested in linking AdaBoost with Random Forests (see 7. Conjecture: Adaboost is a Random Forest). Let us look at it another way: Random Forests are a set of randomized trees built node by node through the sequential determination of an optimal split of groups of elements according to a certain metric (entropy, variance reduction, etc.). If you focus on one tree, you'll notice that what you see is a greedy algorithm that computes the optimal splitting parameter at every step. The algorithm is randomized in the sense that the choice of features used at each iteration step is randomized. In effect, a Random Forest can be seen as a parallel implementation of several greedy randomized iterations. In a sense, a Random Forest is one instance of several randomized AdaBoosts.
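To make that greedy step concrete, here is a minimal sketch (my own illustration, not code from any of the works cited) of what one node of a randomized tree computes: among a random subset of the features, it exhaustively searches for the threshold that best reduces variance in the two child groups.

```python
import numpy as np

def best_split(X, y, n_features_to_try):
    """Greedy step at one tree node: among a random subset of features,
    find the (feature, threshold) pair that minimizes the summed
    variance of the two child groups (variance-reduction criterion)."""
    rng = np.random.default_rng(0)  # fixed seed for reproducibility
    features = rng.choice(X.shape[1], size=n_features_to_try, replace=False)
    best = (None, None, np.inf)
    for j in features:
        # candidate thresholds: the observed values of feature j
        for t in np.unique(X[:, j]):
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            if len(left) == 0 or len(right) == 0:
                continue
            cost = len(left) * left.var() + len(right) * right.var()
            if cost < best[2]:
                best = (j, t, cost)
    return best  # (feature index, threshold, cost)
```

Both the feature subsampling and the exhaustive threshold search above happen at every node of every tree, which is exactly where the greediness lives.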
Why is it interesting to say these things?
Well, ever since Ali Rahimi and Benjamin Recht came out with the idea of Random Kitchen Sinks, we have known that the greedy AdaBoost can be replaced with a more parallel algorithm. Whereas AdaBoost greedily searches for a new basis function at every step of its iteration, with the objective of reducing the l2 norm between functions of the training set and their classification results, Random Kitchen Sinks essentially removes the greedy step by choosing a random set of functions, thereby making the problem linear. The solution becomes a simple least squares problem.
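A small sketch of that idea (the toy target, feature scales, and feature count below are my own illustrative choices, not taken from the Rahimi-Recht papers): the random cosine features are drawn once and never optimized, so the only learning left is a linear least-squares fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression target on [0, 1]
X = rng.uniform(0, 1, size=(200, 1))
y = np.sin(6 * X[:, 0])

# Random Kitchen Sinks: draw the basis functions at random instead of
# fitting them greedily one at a time, as AdaBoost would.
D = 100                                # number of random features
W = rng.normal(0, 5, size=(1, D))      # random frequencies (never optimized)
b = rng.uniform(0, 2 * np.pi, size=D)  # random phases (never optimized)
Phi = np.cos(X @ W + b)                # fixed random basis evaluated on the data

# The only thing left to learn is linear: a least-squares fit.
alpha, *_ = np.linalg.lstsq(Phi, y, rcond=None)
err = np.abs(Phi @ alpha - y).max()    # training error of the linear fit
```

The contrast with AdaBoost is in where the work goes: the nonlinear search over basis functions is replaced by sampling, and all the fitting collapses into one convex linear problem.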
In my view, the problem with Random Forests is not that they are parallel (a good thing) but that they are greedy.
If one were interested in making Random Forests closer to a non-greedy approach, then this image of Random Forests as a collection of randomized AdaBoosts might help. How?
Imagine that each traditional tree in the Random Forest algorithm is replaced by a (large) set of different random branches. In effect, each branch resembles a Random Forest tree, except that not only do we choose the elements at each node in a randomized fashion (as in the traditional Random Forests algorithm) but we also choose the threshold in a randomized fashion. As in the AdaBoost/Random Kitchen Sinks dual approach, we now require many branches to describe one single tree of the Random Forests algorithm well (i.e. we replace one tree in the Random Forest algorithm with a large set of branches that look exactly like that one tree, except that each branch has a different set of random thresholds at every node). Yes, the problem gets much bigger as a result, but there is no free lunch: this is the price we would need to pay to transform a nonlinear greedy solver into a linear problem (as was the case when we went from AdaBoost to the Random Kitchen Sinks approach).
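Here is a minimal sketch of what such a collection of fully randomized branches might look like (the branch depth, the number of branches, and the final least-squares combination are my own illustrative assumptions, not a prescription from the post): every feature and every threshold is drawn at random, and learning reduces to a linear fit over the branch indicators, exactly in the Random Kitchen Sinks spirit.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy classification data in 2D with XOR-like labels
X = rng.uniform(-1, 1, size=(300, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float)

def random_branch(depth, n_features, lo=-1.0, hi=1.0):
    """One fully randomized branch: random feature, random threshold and
    random comparison direction at every node; no greedy optimization."""
    feats = rng.integers(0, n_features, size=depth)
    thrs = rng.uniform(lo, hi, size=depth)
    signs = rng.choice([-1.0, 1.0], size=depth)
    return feats, thrs, signs

def branch_indicator(X, branch):
    """1 for the samples that follow this whole root-to-leaf path, else 0."""
    feats, thrs, signs = branch
    out = np.ones(X.shape[0], dtype=bool)
    for f, t, s in zip(feats, thrs, signs):
        out &= (s * (X[:, f] - t) > 0)
    return out.astype(float)

# Many random branches stand in for one greedily grown tree;
# learning reduces to a linear least-squares fit over their outputs.
branches = [random_branch(depth=3, n_features=2) for _ in range(500)]
Phi = np.column_stack([branch_indicator(X, br) for br in branches])
alpha, *_ = np.linalg.lstsq(Phi, y - y.mean(), rcond=None)
acc = ((Phi @ alpha + y.mean() > 0.5) == (y > 0.5)).mean()  # training accuracy
```

The blow-up is visible: 500 branches replace what a single greedily grown tree would do, which is the "no free lunch" price mentioned above.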
Is there something similar in the compressive sensing literature ?
There is, actually. Recently, we mentioned the work on 1-bit compressive sensing with random biases (One-bit compressive sensing with norm estimation by Karin Knudson, Rayan Saab, Rachel Ward) and there is an implementation right here that allows one to reconstruct elements from several such measurements. Each node of the proposed randomized branch would constitute an instance of this measurement model.
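As a rough sketch of that measurement model (the crude linear estimator below is my own stand-in for illustration, not the algorithm of the Knudson/Saab/Ward paper): a known random bias at each one-bit measurement is equivalent to appending a coordinate fixed at 1 to the unknown vector, which is what makes the norm, and not just the direction, recoverable.

```python
import numpy as np

rng = np.random.default_rng(0)

n, m = 50, 10000
x = np.zeros(n)
x[:3] = [1.5, -0.7, 0.4]            # sparse signal with an unknown norm

A = rng.normal(size=(m, n))         # random Gaussian sensing vectors
tau = rng.normal(size=m)            # known random biases (dithers)
y = np.sign(A @ x + tau)            # one bit per measurement

# Augmented-variable viewpoint: sign(<a_i, x> + tau_i) equals
# sign(<[a_i, tau_i], [x, 1]>), so the dither behaves like one extra
# coordinate of the unknown that is pinned to 1.
A_aug = np.column_stack([A, tau])
v_hat = A_aug.T @ y / m             # crude linear estimate, proportional to [x, 1]/||[x, 1]||
x_hat = v_hat[:n] / v_hat[n]        # rescale so the last coordinate is 1
```

Without the biases, one-bit measurements sign(<a_i, x>) are invariant to scaling x, so only the direction of x is identifiable; the rescaling step in the last line is exactly where the random biases buy back the norm.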
More on that later.
Image Credit: NASA/JPL/Space Science Institute, Full-Res: W00088633.jpg
W00088633.jpg was taken on June 23, 2014 and received on Earth June 25, 2014. The camera was pointing toward SATURN at approximately 1,406,225 miles (2,263,099 kilometers) away, and the image was taken using the MT2 and CL2 filters.
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.