Revisiting the building of past snapshots — a replication and reproduction study
Context: Building past source code snapshots of a software product is necessary both for research (analyzing the past state of a program) and industry (increasing trustability by reproducibility of past versions, finding bugs by bisecting, backporting bug fixes, among others). A study by Tufano et al. showed in 2016 that many past snapshots cannot be built. Objective: We replicate Tufano et al.’s study in 2020, to verify its results and to study what has changed during this time in terms of compilability of a project. Also, we extend it by studying a different set of projects, using additional techniques for building past snapshots, with the aim of extending the validity of its results. Method: (i) Replication of the original study, obtaining past snapshots from 79 repositories (with a total of 139,389 commits); and (ii) Reproduction of the original study on a different set of 80 large Java projects, extending the heuristics for building snapshots (300,873 commits). Results: We observed degradation of compilability over time, due to vanishing of dependencies and other external artifacts. We validated that the most influential error causing failures in builds are missing external artifacts, and the less influential is compiling errors. We observed some facts that could lead to the effect of the build tool on past compilability. Conclusions: We provide details on what aspects have a strong and a shallow influence on past compilability, giving ideas of how to improve it. We could extend previous research on the matter, but could not validate some of the previous results. We offer recommendations on how to make this kind of studies more replicable.
- Artículos de Revista