Introduction
As we write more code to support scientific research, we'd like to put a system in place that demonstrably reduces the number of bugs in that code, thereby upholding a high standard of quality for the underlying science, and we'd like to do so without an overwhelming investment of time or resources. What's more, we would like to minimize the amount of time we as scientists spend writing code, by making the sharing, reuse, and collaborative development of scientific code both viable and convenient. How can we design a process that speaks to all these concerns?
We can start by looking to our traditions surrounding writing manuscripts. The process of manuscript review gives us two major things: a set of customs and expectations for scholarly writing that participants in a field can at least roughly adhere to even before the review process begins; and a formal and accountable process for reviewing a manuscript once it is written, to push it towards standards of excellence. The process of code review consists of similar parts: first, we must establish a set of basic customs that we believe will predispose our code to quality (the equivalent of good grammar and conventional structure); and second, we must formulate a process for examining code, to see whether it not only meets these basic standards but also rises to the more sophisticated standards of quality we seek. We are free to define the details of those standards, but we must define them clearly.
All of which raises the question: what customs and standards should we define for our code? What makes code good? If our goal is to catch and eliminate bugs during development, we must be able to clearly define what we intend a piece of code to do, and present it in a way that is legible enough for a reader to assess whether it (likely) lives up to the task at hand. And if our goal is to enable collaboration on and reuse of code, we must find a way to build rapport with our collaborators, help them understand the goals of our code, and give them a simple but precise way of communicating back to us what contributions they would like to make to support those goals.
In other words, we need a procedure for clearly communicating about code, in order to make errors more obvious and to smooth the process of working together. This is exactly what a pattern of code review sets out to do. In what follows, we'll examine the three major steps in this pattern.
Further Reading & References
Many of the ideas and techniques described here were proposed in the following research:
- Code Review For & By Scientists, M. Petre, G. Wilson
- 11 Best Practices for Peer Code Review, SmartBear
- Code Reviews: the Lab Meeting for Code, F. Perez
Next Lesson
In Designing a Project, we'll learn how to set up and communicate a high-level plan for our project, in order to set the stage for the contribution and review process.