The discussion on a post peer review model has drawn more than 40 comments, and one of the most important aspects is how to make the system as trustworthy as possible. If you have any thoughts on the matter, please share them with us.
In one of the threads, there was a discussion about recommender capabilities. Since we were looking at Arxaliv.org as a model (it is a Reddit clone), I went to the reddit discussion on the development of that open source platform and found that Reddit is actually looking for a recommender system, and that they have a nice dataset.
There are 23,091,688 votes from 43,976 users over 3,436,063 links in 11,675 reddits. (Interestingly, these ~44k users account for almost 17% of Reddit's total votes.) The dump is 2.2 GB uncompressed, 375 MB in bz2.
A reddit is a category. A link is a subject (in Arxaliv it would be a paper), so that matrix (43976 x 3436063) is very sparsely filled (about 1.5e-4 of its entries are nonzero). Some SVD has been tried, but I am sure they haven't looked at low rank solvers. Since Reddit is such a massive platform, if your algorithm provides good results, it will become known far beyond your expectations.
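To make the low rank idea concrete, here is a minimal sketch of how one might build the sparse user-by-link vote matrix and compute a truncated SVD without ever densifying it. The (user, link, vote) triple format is an assumption standing in for the real dump; only the shape of the computation matters here.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

# Toy stand-in for the Reddit dump: (user, link, vote) triples.
# The real dump's exact format is an assumption here.
rng = np.random.default_rng(0)
n_users, n_links, n_votes = 200, 500, 3000
users = rng.integers(0, n_users, n_votes)
links = rng.integers(0, n_links, n_votes)
votes = rng.choice([-1.0, 1.0], n_votes)  # up/down votes

# Sparse user-by-link matrix; duplicate (user, link) pairs sum.
V = csr_matrix((votes, (users, links)), shape=(n_users, n_links))

# Truncated SVD: only the k leading singular triplets are computed,
# so a 43976 x 3436063 matrix is never materialized densely.
k = 20
U, s, Vt = svds(V, k=k)

def predict(u, l):
    # Rank-k reconstruction as a predicted affinity score.
    return (U[u] * s) @ Vt[:, l]
```

Swapping `svds` for one of the low rank matrix completion solvers listed in the Matrix Factorization Jungle would be the natural next experiment.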
Liked this entry? Subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on LinkedIn.