Principles of resilient coding for plant ecophysiologists

Abstract

Plant ecophysiology is founded on a rich body of physical and chemical theory, but it is challenging to connect theory with data in unambiguous, analytically rigorous, and reproducible ways. Custom scripts written in computer programming languages (coding) enable plant ecophysiologists to model plant processes and fit models to data reproducibly using advanced statistical techniques. Since many ecophysiologists lack formal programming education, we have yet to adopt a unified set of coding principles and standards that could make coding easier to learn, use, and modify. We identify eight principles to help plant ecophysiologists without much programming experience write resilient code: 1) standardized nomenclature, 2) consistency in style, 3) increased modularity/extensibility for easier editing and understanding, 4) code scalability for application to large datasets, 5) documented contingencies for code maintenance, 6) documentation to facilitate user understanding, 7) extensive tutorials, and 8) unit testing. We illustrate these principles using a new R package, {photosynthesis}, which provides a set of analytical and simulation tools for plant ecophysiology. Our goal with these principles is to advance scientific discovery in plant ecophysiology by making it easier to use code for simulation and data analysis, reproduce results, and rapidly incorporate new biological understanding and analytical tools.

Publication
Annals of Botany PLANTS