Sunday, July 12, 2015

Sunday Morning Insight: The Challenges of Reddit's Sparse Admins/Mods Graphs and Sudden Phase Transitions

There is a large part of the internet industry that focuses on doing things faster by using the prior information that everything is on a graph (  this previous entry uses the graph to devise a reconstruction solver, companies like, Google, Facebook or Dato also use this as a central asumption) but there is very little care on finding out how the network behaves over time. Sure enough search engines like Google have developed ways to handle the dynamics of the web over time, social networks like Facebook or LinkedIn do in fact try their best in designing the right recommendation engines to avoid link/friend/job announcement fatigue but the reality is that predicting things on the dynamics of the graph is most of the time left as an empirical exercise. 
Sparse Graphs and Sudden Phase Transitions
It happened before on F##@Company, Digg, Myspace or even earlier ( see Clay Shirky's 2003 A group is its own worst enemy h/t Hugh McLeod), the Reddit on-going implosion (explained here and here) is just one of those seemingly sudden phase transitions. To put these stories in the context of what we read here on Nuit Blanche, the Reddit moderators belong to a sparse graph within the much denser population of users. That sparse graph also interacts, very modestly it seems, with the graph of paid administrators of the site. Given this tidbit of context, I wonder if some of the dynamics on Reddit could not be better foreseen using some of the user-mods-admins data and the statistical physics of networks as explained to us by Lenka Zdeborova at the Paris Machine Learning meetup #8 Season 1 ( slides How hard is it to find a needle in a haystack? ).

It is one thing to figure out that a graph is sparse - Duh! the number of moderators is low- it is yet another to figure out the dynamics between the sparse set of unpaid and volunteer moderators, the inner group of paid admins at Reddit and the rest of the much denser community.  The fate of Reddit hinges on its ability to mend fences between the first two communities while at the same time pleasing the third community. The first question one should ask is whether Reddit has the ability to build links and trust so that they get away from a perilious metastable region. The second question is algorithmic, as shown by Lenka and colleagues: if the wrong spectral operator is used to figure out the dynamics of these  graphs, you might be over/underestimating by a large margin the efforts it takes to get away from this metastable region.

It is one thing to build trust between communities in some holistic fashion (aka through PR and some information control), it is yet another one to quantify the fragility of these networks and the different ways to make them robust over time.
Recent work by Lenka and colleagues:

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

No comments: