It started in 2009, when John Langford and colleagues noted that label vectors in multi-label problems are typically sparse, so that techniques from compressive sensing could reduce learning in a high-dimensional output space to learning in a much smaller, randomly projected one. Here is a new entrant in this area:
Random forests with random projections of the output space for high dimensional multi-label classification by Arnaud Joly, Pierre Geurts, Louis Wehenkel
We adapt the idea of random projections applied to the output space, so as to enhance tree-based ensemble methods in the context of multi-label classification. We show how learning time complexity can be reduced without affecting computational complexity and accuracy of predictions. We also show that random output space projections may be used in order to reach different bias-variance tradeoffs, over a broad panel of benchmark problems, and that this may lead to improved accuracy while reducing significantly the computational burden of the learning stage.
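Before getting to the passage that caught my attention, here is a minimal sketch of the general compress-then-learn pipeline, to fix ideas. To be clear, this is not the authors' exact algorithm (their trees score splits on the projected labels but keep the original label vectors at the leaves); the toy dataset, the projected dimension and the Gaussian projection matrix are illustrative choices on my part:

```python
# A minimal sketch of learning in a randomly projected output space.
# NOTE: this is NOT the authors' exact algorithm (their trees score splits
# on the projected labels but keep the original label vectors at the
# leaves); the dataset, the projected dimension m and the Gaussian
# projection are illustrative choices.
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Toy multi-label problem: n samples, d binary labels.
X, Y = make_multilabel_classification(n_samples=500, n_features=20,
                                      n_classes=50, random_state=0)
n, d = Y.shape
m = 15  # projected output dimension, m << d

# Gaussian random projection of the label matrix: Z = Y A, with A in R^{d x m}.
A = rng.standard_normal((d, m)) / np.sqrt(m)
Z = Y @ A

# Fit a multi-output forest on the m-dimensional projected targets,
# which is cheaper than fitting on all d labels when d is large.
forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, Z)

# Decode back to label space with the pseudo-inverse of A. Without extra
# structure (e.g. sparsity) this decoding is lossy when m < d, which is
# precisely the point debated below.
Y_hat = forest.predict(X) @ np.linalg.pinv(A)
print((Y_hat > 0.5).astype(int)[:2])
```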
I note the following passage:
On the other hand, since we are not concerned in this paper with the ‘reconstruction’ problem, we do not need to make any sparsity assumption ‘à la compressed sensing’.
This is all well and good, and if it works, then attempts like reference [3] might be helped further. But I don't fully buy the argument quoted above: the fact that you do not need a sparsity-seeking reconstruction solver does not mean the underlying sparsity assumption isn't there, as it is used in [1,2] (a sketch of that sparsity-based decoding follows the list below).
[1] Multi-Label Prediction via Compressed Sensing by Daniel Hsu, Sham M. Kakade, John Langford, Tong Zhang (attendant slides)
[2] Robust Bloom Filters for Large Multi-Label Classification Tasks by Moustapha M. Cisse, Nicolas Usunier, Thierry Artières, Patrick Gallinari
[3] Narrowing the Gap: Random Forests In Theory and In Practice by Misha Denil, David Matheson, Nando de Freitas
[4] Compact Representations for “Large-scale” Computation by Vidit Jain
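Since the distinction matters in the argument above, here is a hedged sketch of the sparsity-based decoding step at the heart of approaches like [1]: a sparse label vector is recovered from its random projection with a sparsity-seeking solver. The dimensions and the choice of orthogonal matching pursuit as solver are mine, purely for illustration:

```python
# Hedged sketch of the sparsity-based decoding used in approaches like [1]:
# a sparse binary label vector is recovered from its random projection with
# a sparsity-seeking solver (orthogonal matching pursuit here). Dimensions
# and the choice of solver are mine, purely for illustration.
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(1)
d, m, k = 200, 40, 5                 # labels, measurements, active labels

y = np.zeros(d)
y[rng.choice(d, size=k, replace=False)] = 1.0   # a k-sparse label vector

A = rng.standard_normal((m, d)) / np.sqrt(m)    # random projection matrix
z = A @ y                                       # compressed label vector

# OMP exploits the sparsity of y to invert the underdetermined system z = A y.
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False)
omp.fit(A, z)
y_hat = (omp.coef_ > 0.5).astype(int)
print("exact recovery:", np.array_equal(y_hat, y.astype(int)))
```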
It depends. For instance, the Johnson-Lindenstrauss Lemma does not need sparsity for random projections to work but only finiteness of the projected point set.
Yes, I agree, but I am not swayed by the argument. In particular, the labels have not magically become dense, for instance.
Igor.
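As a quick numerical check of the Johnson-Lindenstrauss point raised in the comment above, the following sketch (with arbitrarily chosen dimensions) projects a dense, decidedly non-sparse point cloud and looks at how well pairwise distances are preserved:

```python
# A quick numerical check of the comment's point: the Johnson-Lindenstrauss
# lemma preserves the pairwise distances of any finite point set under a
# random projection; the points need not be sparse. Dimensions are arbitrary.
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(2)
n, d, m = 100, 5000, 400   # n dense points in R^d, projected down to R^m

X = rng.standard_normal((n, d))             # dense: no sparsity whatsoever
A = rng.standard_normal((d, m)) / np.sqrt(m)

ratios = pdist(X @ A) / pdist(X)            # distortion over all n*(n-1)/2 pairs
print(f"distance ratios stay within [{ratios.min():.3f}, {ratios.max():.3f}]")
```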