Saturday, October 29, 2011

Sharing Code and Data is Compound Interest

In the following video, Victoria Stodden tries to define a path by which science can be made reproducible, i.e., to find ways of sharing data and code so that science can grow smoothly. This may include top-down decision making at the level of OSTP and NSF. I am a firm believer in bottom-up solutions, and if you look at the NIPS survey Victoria did, the compressive sensing community is doing a much better job at making its science known and reproducible. I believe this is in no small part due to some of the folks involved who have led by example. I can see some holdouts in the matrix factorization community, and it certainly is an issue I am addressing at the level of this blog: whoever shares her/his code generally gets one entry in the most interesting spot of the week. Remember, I want you to be rock stars.

But I also believe that the reason people share their code is self-interest. Astute Principal Investigators will have noticed that sharing is the only sustainable way to keep using these codes in their own groups long after the student behind that effort is gone. As Hamming, relaying Bode, put it:

What Bode was saying was this: ``Knowledge and productivity are like compound interest.'' Given two people of approximately the same ability and one person who works ten percent more than the other, the latter will more than twice outproduce the former. The more you know, the more you learn; the more you learn, the more you can do; the more you can do, the more the opportunity - it is very much like compound interest. I don't want to give you a rate, but it is a very high rate. Given two people with exactly the same ability, the one person who manages day in and day out to get in one more hour of thinking will be tremendously more productive over a lifetime.

The same goes for research groups. It's one thing to attract top talent to your group (ability); re-usable computational software is what allows you to go deeper. It's compound interest.
