The bad code and bad writer
by Yi Zhang
In the academic community a code writer is allowed to be sloppy, sometimes in the name of exploration, but apparently that is not the whole story. It turns out I am not the only one suffering from sloppy and uninspiring code in computational science: a recent article in Nature says that I am just a tiny part of a much deeper problem.
As a general rule, researchers do not test or document their programs rigorously, and they rarely release their codes, making it almost impossible to reproduce and verify published results generated by scientific software, say computer scientists. At best, poorly written programs cause researchers such as Harry to waste valuable time and energy. But the coding problems can sometimes cause substantial harm, and have forced some scientists to retract papers.
Well, the CFD code I am working on is just such a case: almost no comments or notes from the writer whatsoever, let alone any verification of its implementation. And I certainly agree with one comment in the article:
“There are terrifying statistics showing that almost all of what scientists know about coding is self-taught,” says Wilson. “They just don’t know how bad they are.”
Well, there is nothing wrong with “self-taught coding”; actually, I believe that when it comes to writing code, self-teaching is almost the only way to do it. In “Programming in Emacs Lisp”, the writer describes how a friend of his learns a new language:
I prefer to learn from reference manuals. I “dive into” each paragraph, and “come up for air” between paragraphs.
When I get to the end of a paragraph, I assume that that subject is done, finished, that I know everything I need (with the possible exception of the case when the next paragraph starts talking about it in more detail). I expect that a well written reference manual will not have a lot of redundancy, and that it will have excellent pointers to the (one) place where the information I want is.
I believe that is the approach many computational scientists take. What should be taught by others, on the other hand, is coding style, which is indeed what programmers learn in school. I was surprised to find out that many people around me have no idea about it, and some of them are even from the computer (no, NOT computational) science community. The article gives five tips for “amateur” coding:
- Version control
- Tracking raw material
- Write testable code
- Test it
- Share it
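To make tips 3 and 4 a bit more concrete, here is a minimal sketch in Python. The function `stable_time_step` and its parameters are hypothetical names made up for illustration; they are not taken from the CFD code I am working on or from the article. The idea is simply that a small function with explicit inputs and outputs, and no hidden global state, can be checked against values worked out by hand.

```python
def stable_time_step(dx, velocity, cfl=0.5):
    """Return a time step from the CFL relation dt = cfl * dx / |velocity|.

    Hypothetical helper for illustration only: a pure function with
    explicit arguments, so it can be tested in isolation.
    """
    if dx <= 0:
        raise ValueError("dx must be positive")
    if velocity == 0:
        raise ValueError("velocity must be non-zero")
    return cfl * dx / abs(velocity)


def test_stable_time_step():
    # Value worked out by hand: 0.5 * 0.1 / 2.0 = 0.025
    assert abs(stable_time_step(0.1, 2.0) - 0.025) < 1e-12
    # The sign of the velocity must not change the result
    assert stable_time_step(0.1, -2.0) == stable_time_step(0.1, 2.0)
    # Halving the grid spacing should halve the time step
    assert abs(stable_time_step(0.05, 2.0) - 0.0125) < 1e-12


if __name__ == "__main__":
    test_stable_time_step()
    print("all tests passed")
```

Had the original author left even a handful of checks like these behind, I would at least know which parts of the code were once believed to work.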
Leaving the last one aside, I am at level 4, and unfortunately whoever wrote the code missed almost all of the previous three levels, and I am paying for that. Git comes in pretty handy for version control, especially since someone has already written a post on using it in research. I tip my hat to No. 6 of “10 reasons to use Git for Research”:
Keep track of your grad students.
Suspect your grad students are slacking? Check the commit logs! And now I prepare for hate mail from grad students. However, I think that if I had this form of accountability, it would have made me more productive. Of course, you don’t need Git for this, any version control system would do. Of all the systems I’ve used, Git’s presentation of changes is the user-friendliest.
Well, that’s pretty… evil.