Wednesday, September 05, 2012

Post Peer-Review Discussion continues and a remarkable dataset

The discussion on a post peer review model has drawn more than 40 comments, and one of the most important aspects is how to make the system as trustworthy as possible. If you have any thoughts on the matter, please share them with us.

In one of the threads, there was a discussion about recommender capabilities. Since we were looking at Arxaliv.org as a model (it is a Reddit clone), I went to the Reddit discussion on the development of that open source platform and found that they, Reddit, are actually looking for a recommender system and have a nice dataset.


There are 23,091,688 votes from 43,976 users over 3,436,063 links in 11,675 reddits. (Interestingly these ~44k users represent almost 17% of our total votes). The dump is 2.2gb uncompressed, 375mb in bz2.
A reddit is a category. A link is a subject (in Arxaliv it would be a paper), so that matrix (43,976 x 3,436,063) is very sparsely filled (about 1.5e-4). Some SVD approaches have been tried, but I am sure they haven't looked at low-rank solvers. Since Reddit is such a massive platform, if your algorithm provides good results, it will become known beyond your expectations.
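As a rough check on how sparse that matrix is, here is a minimal sketch in Python that recomputes the fill ratio from the counts quoted above. The variable names are my own; only the three counts come from the Reddit announcement, and the per-user average assumes at most one vote per user/link pair.

```python
# Counts quoted in the Reddit dataset announcement.
n_votes = 23_091_688
n_users = 43_976
n_links = 3_436_063

# Total number of cells in the users-by-links vote matrix.
n_cells = n_users * n_links

# Fraction of cells that actually hold a vote.
density = n_votes / n_cells

# Average number of votes per user (assumes one vote per user/link pair).
votes_per_user = n_votes / n_users

print(f"cells: {n_cells:,}")
print(f"density: {density:.2e}")
print(f"votes per user: {votes_per_user:.0f}")
```

The ratio comes out on the order of 1e-4, so only about one cell in several thousand is observed, which is the regime where low-rank completion methods are usually considered.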





6 comments:

Emmanuel d'Angelo said...

The identity thing is a good occasion to put societies like IEEE back into the game.

Seriously, why do you have a membership there now? Xplore access is usually granted by your university, along with an e-mail address, and more and more universities and countries require that papers have an open-access version.

Instead, societies like IEEE would act as a central ID and profile place. Your membership would come with a centralized profile page, certifying that you are who you claim, and giving the credentials to review and comment on papers.
Conferences could include "community sessions", where the community-voted best paper / code / data / reviewers would be featured.

Hence, societies would be places for real networking, and would grant you a life-long stable web page summarizing your career (who hasn't suffered from moving from univ. A to univ. B, having to take their research webpage with them, and then realizing that at B they don't have the tools to re-publish it?)

Research papers are not anonymous, and societies to connect researchers and link their ID with papers exist, so let's use them.

Peter said...

Apologies if this has come up before (I only browsed the g+ discussion). Since I can't reliably take part in the discussion atm, here are two links: the ORCID initiative will probably solve the identity problem in the near future. https://en.wikipedia.org/wiki/ORCID

And regarding societies, Cameron Neylon has written about challenges for those at http://cameronneylon.net/blog/the-challenge-for-scholarly-societies/

Emmanuel d'Angelo said...

ORCID is definitely interesting in this regard and could clearly be integrated / handled by the societies.

Igor said...

Peter,

Like you mentioned in an earlier thread, centralization may not be such a fantastic feature. I personally do not pay dues to any society, and we need to avoid the problem of outsiders being left out because they are not paying their "dues". The societies have not done such a great job with regard to peer review in my view, and it would be a mistake to hand over to them a part of the process that is in direct competition with their business model.


Igor.

Emmanuel d'Angelo said...

Igor, the point is that you need some way to organize your community.

A long time ago, this was the role of societies. Then they went into publishing, sponsorship... and became over-paid and lazy mammoths. While many people now complain about them, they are still powerful and can play a role, by leaving aside all this publishing stuff and becoming a kind of label for researchers.

You can't rely on national structures, because research careers are obviously international. You can't rely on your university, because you're likely to change maybe 4 or 5 times during your career.
You can still add some flexibility by making a meta-organization of societies, so that you can always be in a society of your field (biology, CS...).

By changing their role, you can make the societies useful again, and hence you'll eliminate some of the reluctance to change, which would be too hard to overcome otherwise (well established Profs and editors have too much to lose).

And yes, I really believe that you need to have a paid active membership to be an active commenter / reviewer because:
- you need a filter at the entrance
- the service costs some money, and when you don't pay you're not receiving a free product, you are the product.

Igor said...

Emmanuel,

When you say "they still are powerful and can play a role, by leaving aside all this publishing stuff and becoming a kind of label for researchers."

You are pointing to the main issue.

They are currently unable to change, as it simply goes against their business model. You cannot expect them to give that away anytime soon.

In the G+ thread there is talk of a decentralized system, which I am ok with.

Igor.
