Browsing by Author "Robles, Gregorio"
Showing 1 - 13 of 13
Item: A reflection on the impact of model mining from GitHub (Elsevier, 2023). Robles, Gregorio; Chaudron, Michel R.V.; Jolak, Rodi; Hebig, Regina.
Context: Since 1998, the ACM/IEEE International Conference on Model Driven Engineering Languages and Systems (MODELS) has been studying all aspects surrounding modeling in software engineering, from languages and methods to tools and applications. In order to enable empirical studies, the MODELS community developed a need for examples of models, especially of models used in real software development projects. Such models may be used for a range of purposes, mostly related to domain analysis and software design (at various levels of abstraction). However, finding such models was very difficult. The most used ones had their origin in academic books or student projects, which addressed "artificial" applications, i.e., were not based on real-case scenarios. To address this issue, the authors of this reflection paper, members of the modeling and of the mining software repositories fields, came together with the aim of creating a dataset with an abundance of modeling projects by mining GitHub. To scope our effort, we targeted models represented using the UML notation, because this is the lingua franca in practice for software modeling. As a result, almost 100k models from 22k projects were made publicly available, known as the Lindholmen dataset. Objective: In this paper, we analyze the impact of our research and compare it to what we envisioned in 2016. We draw practical lessons gained from this effort, reflect on the perils and pitfalls of the dataset, and point out promising avenues of research. Method: We base our reflection on the systematic analysis of recent research literature, especially papers citing our dataset and its associated publications. Results: What we envisioned in the original research when making the dataset available has to a major extent not come true; however, fellow researchers have found alternative uses of the dataset. Conclusions: By understanding the possibilities and shortcomings of the current dataset, we aim to i) offer the research community future research avenues for how the data can be used; and ii) raise awareness of its limitations, not only to point out threats to the validity of research, but also to encourage fellow researchers to find ideas to overcome them. Our reflections can also be helpful to researchers who want to perform similar mining efforts.

Item: BabiaXR: Facilitating experiments about XR data visualization (Elsevier, 2023). Moreno-Lumbreras, David; Gonzalez-Barahona, Jesus M.; Robles, Gregorio.
BabiaXR is a toolset for conducting experiments about 3D data visualizations in extended reality (XR) in the browser. BabiaXR provides components both for building complex data visualizations and for easily transforming them into scenes suitable for running experiments with subjects. For data visualization, it provides components to retrieve, filter, select, and visualize data. To facilitate empirical experiments with human subjects, it provides components for showing information to subjects, controlling their interaction with data, and recording their reactions. This enables the easy transformation of a given data visualization scene into an experiment directly usable by subjects. BabiaXR is extensible, based on the A-Frame JavaScript framework for XR. As such, it is easily composable with other A-Frame components, and complex data visualization scenes can be created using only HTML constructs. BabiaXR can be used in any XR device supporting WebXR, and, with limited capabilities, also on desktop and mobile devices.

Item: Can instability variations warn developers when open-source projects boost? (Springer, 2024-06-14). Capilla, Rafael; Salamanca, Victor; Valdezate, Alejandro; Robles, Gregorio.
Although architecture instability has been studied and measured using a variety of metrics, a deeper analysis of which project parts are less stable and how such instability varies over time is still needed. While having more information on architecture instability is, in general, useful for any software development project, it is especially important in Open Source Software (OSS) projects, where supervision of the development process is more difficult to achieve. In particular, we are interested in when OSS projects grow from a small controlled environment (i.e., the cathedral phase) to a community-driven project (i.e., the bazaar phase). In such a transition, the project often explodes in terms of software size and number of contributing developers. Hence, the complexity of the newly added features and the frequency of commits and modified files may cause significant variations in the instability of the structure of classes and packages. Consequently, in this article we analyze instability in OSS projects, especially during the sensitive phase in which they become community-driven. Our results show that instability metrics can be easily obtained in this type of transition. We also observed from our case studies that instability metrics can help find a balance between adding new functionality and performing refactoring. As a conclusion, we state that instability metrics offer relevant information in the transition phase from the cathedral to the bazaar.
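The abstract above does not name a specific instability metric. A common choice in architecture analysis is Martin's instability, the ratio of outgoing (efferent) coupling to total coupling of a package; the short Python sketch below assumes that definition purely for illustration, and its names and example values are hypothetical rather than taken from the paper.

```python
def instability(efferent: int, afferent: int) -> float:
    """Martin's instability I = Ce / (Ca + Ce).

    efferent (Ce): outgoing dependencies of a package.
    afferent (Ca): incoming dependencies on the package.
    I ranges from 0.0 (maximally stable) to 1.0 (maximally unstable).
    """
    total = efferent + afferent
    if total == 0:
        return 0.0  # isolated package; treated here as stable by convention
    return efferent / total

# Hypothetical (Ce, Ca) pairs per release: tracking how I spreads across
# releases is the kind of variation the study relates to the
# cathedral-to-bazaar transition.
releases = {"v1.0": (3, 9), "v2.0": (12, 8)}
for tag, (ce, ca) in releases.items():
    print(tag, round(instability(ce, ca), 2))
```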
Item: Development effort estimation in free/open source software from activity in version control systems (Springer, 2022-07-20). Robles, Gregorio; Capiluppi, Andrea; Gonzalez-Barahona, Jesus M.; Lundell, Björn; Gamalielsson, Jonas.
Effort estimation models are a fundamental tool in software management, used to forecast the resources, constraints, and costs associated with software development. For Free/Open Source Software (FOSS) projects, effort estimation is especially complex: professional developers work alongside occasional, volunteer developers, so the overall effort (in person-months) becomes non-trivial to determine. The objective of this work is to develop a simple effort estimation model for FOSS projects, based on the historic data of developers' effort. The model is fed with direct developer feedback to ensure its accuracy. After extracting the personal development profiles of several thousand developers from 6 large FOSS projects, we asked them to fill in a questionnaire to determine whether they should be considered full-time developers in the project they work on. Their feedback was used to fine-tune the value of an effort threshold, above which developers can be considered full-time. With the help of the over 1,000 questionnaires received, we were able to determine, for every project in our sample, the threshold of commits that separates full-time from non-full-time developers. We finally offer guidelines and a tool to apply our model to FOSS projects that use a version control system.
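The model summarized above rests on a commit-activity threshold separating full-time from non-full-time developers. The following is a minimal sketch of that idea, assuming a per-period commit count as the activity measure; the threshold value, the period granularity, and the way partial effort is prorated are hypothetical choices for illustration, not the values calibrated from the questionnaires in the paper.

```python
from collections import Counter

# Hypothetical input: one (author, period) pair per commit, e.g. per month.
commits = [
    ("alice", "2021-03"), ("alice", "2021-03"), ("alice", "2021-04"),
    ("bob", "2021-03"), ("bob", "2021-04"),
]

THRESHOLD = 2  # commits per period above which a developer counts as full-time
               # (hypothetical; the paper fine-tunes this with developer feedback)

activity = Counter(commits)  # commits per (author, period)

effort_person_months = 0.0
for (author, period), n_commits in activity.items():
    if n_commits >= THRESHOLD:
        effort_person_months += 1.0                    # full-time: one person-month
    else:
        effort_person_months += n_commits / THRESHOLD  # prorated partial effort

print(f"Estimated effort: {effort_person_months:.2f} person-months")
```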
Item: Diving into Software Evolution: Virtual Reality vs. On-Screen (2024-09-04). Moreno-Lumbreras, David; González Barahona, Jesus M.; Robles, Gregorio.
Background: Traditional 2D visualizations have been widely used for software metrics and evolution analysis, offering structured views of complex data. However, the advent of Virtual Reality (VR) technologies introduces new possibilities for immersive and interactive software visualization, potentially enhancing comprehension and user engagement. Objective/Aim: This report aims to evaluate the effectiveness of immersive VR visualizations compared to traditional 2D on-screen visualizations for understanding code metrics across software releases. Specifically, we seek to determine if VR provides better accuracy and speed for comprehending high-level software evolution tasks. Method: We will conduct a controlled experiment with 30 participants from academia and industry, using GitLab, GitHub, or an IDE for on-screen visualizations and a VR setup using Meta Quest 3. Tasks related to software evolution will be completed in both settings. Accuracy and time will be measured and analyzed using mixed linear models and non-parametric tests to compare the two approaches. Data will be sourced from GitHub repositories with similar project characteristics.

Item: dtwParallel: A Python package to efficiently compute dynamic time warping between time series (Elsevier, 2023). Escudero-Arnanz, Óscar; G. Marques, Antonio; Soguero-Ruiz, Cristina; Mora-Jiménez, Inmaculada; Robles, Gregorio.
dtwParallel is a Python package that computes the Dynamic Time Warping (DTW) distance between a collection of (multivariate) time series (MTS). dtwParallel incorporates the main functionalities available in current DTW libraries and novel functionalities such as parallelization, computation of similarity (kernel-based) values, and consideration of data with different types of features (categorical, real-valued, etc.). A low-floor, high-ceiling, and wide-walls software design principle has been adopted, envisioning uses in education, research, and industry. The source code and documentation of the package are available at https://github.com/oscarescuderoarnanz/dtwParallel.
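Rather than guessing the exact API of dtwParallel, the sketch below shows the classic dynamic-programming recurrence behind the DTW distance the package computes, for two univariate series with absolute difference as the local distance. It is an independent illustration, not the package's interface; see the repository linked above for the real API.

```python
import numpy as np

def dtw_distance(x: np.ndarray, y: np.ndarray) -> float:
    """Classic O(len(x) * len(y)) dynamic-programming DTW distance."""
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])  # local distance between aligned points
            # Extend the cheapest of the three neighboring alignments.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

a = np.array([0.0, 1.0, 2.0, 1.0])
b = np.array([0.0, 0.0, 1.0, 2.0, 1.0])
print(dtw_distance(a, b))  # small value: the series are time-shifted copies
```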
Item: Hunting bugs: Towards an automated approach to identifying which change caused a bug through regression testing (Springer, 2024-05-04). Maes-Bermejo, Michel; Serebrenik, Alexander; Gallego, Micael; Cortázar, Francisco; Robles, Gregorio; González Barahona, Jesús María.
Context: Finding code changes that introduced bugs is important both for practitioners and researchers, but doing it precisely is a manual, effort-intensive process. The perfect test method is a theoretical construct aimed at detecting Bug-Introducing Changes (BIC) through a theoretical perfect test. This perfect test always fails if the bug is present, and passes otherwise. Objective: To explore a possible automatic operationalization of the perfect test method. Method: To use regression tests as substitutes for the perfect test. For this, we transplant the regression tests to past snapshots of the code, and use them to identify the BIC, on a well-known collection of bugs from the Defects4J dataset. Results: From the 809 bugs in the dataset, when running our operationalization of the perfect test method, the BIC was identified precisely for 95 of them, and in the remaining 4 cases a list of candidates including the BIC was provided. Conclusions: We demonstrate that the operationalization of the perfect test method through regression tests is feasible and can be completely automated in practice when tests can be transplanted and run on past snapshots of the code. Given that implementing regression tests when a bug is fixed is considered a good practice, developers who follow it can effortlessly detect bug-introducing changes by using our operationalization of the perfect test method.
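The operationalization described above amounts to transplanting a regression test to past snapshots and locating the commit at which it starts to fail. The sketch below illustrates that search loop under stated assumptions: the run_transplanted_test.sh script, the linear (rather than bisection-based) walk over history, and the repository layout are hypothetical simplifications, not the authors' tooling.

```python
import subprocess

def commits_oldest_first(repo: str) -> list[str]:
    """List commit hashes of the current branch, oldest first."""
    out = subprocess.run(
        ["git", "-C", repo, "rev-list", "--reverse", "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.split()

def test_passes(repo: str, commit: str) -> bool:
    """Check out a past snapshot and run the transplanted regression test.

    run_transplanted_test.sh is a placeholder for whatever copies the fixed
    version's regression test into the snapshot and executes it.
    """
    subprocess.run(["git", "-C", repo, "checkout", "--quiet", commit], check=True)
    result = subprocess.run(["./run_transplanted_test.sh"], cwd=repo)
    return result.returncode == 0

def find_bug_introducing_change(repo: str) -> str | None:
    """Return the first commit where the test flips from passing to failing."""
    previous_passed = None
    for commit in commits_oldest_first(repo):
        passed = test_passes(repo, commit)
        if previous_passed is True and not passed:
            return commit  # candidate bug-introducing change (BIC)
        previous_passed = passed
    return None
```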
Item: Revisiting the building of past snapshots — a replication and reproduction study (Springer, 2022-03-17). Maes-Bermejo, Michel; Gortázar, Francisco; Gallego, Micael; Robles, Gregorio; González-Barahona, Jesús M.
Context: Building past source code snapshots of a software product is necessary both for research (analyzing the past state of a program) and industry (increasing trustability by reproducibility of past versions, finding bugs by bisecting, backporting bug fixes, among others). A study by Tufano et al. showed in 2016 that many past snapshots cannot be built. Objective: We replicate Tufano et al.'s study in 2020, to verify its results and to study what has changed during this time in terms of the compilability of a project. We also extend it by studying a different set of projects, using additional techniques for building past snapshots, with the aim of extending the validity of its results. Method: (i) Replication of the original study, obtaining past snapshots from 79 repositories (with a total of 139,389 commits); and (ii) reproduction of the original study on a different set of 80 large Java projects, extending the heuristics for building snapshots (300,873 commits). Results: We observed degradation of compilability over time, due to the vanishing of dependencies and other external artifacts. We validated that the most influential cause of build failures is missing external artifacts, and the least influential is compilation errors. We also observed some evidence suggesting an effect of the build tool on past compilability. Conclusions: We provide details on which aspects have a strong and which a weak influence on past compilability, giving ideas of how to improve it. We could extend previous research on the matter, but could not validate some of the previous results. We offer recommendations on how to make studies of this kind more replicable.
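A minimal sketch of the kind of measurement behind this replication: check out each past snapshot, attempt a build, and classify the outcome. The Maven build command and the coarse string matching used to separate missing-artifact failures from compilation errors are assumptions for illustration; the study's actual heuristics for building snapshots are richer.

```python
import subprocess

def try_build(repo: str, commit: str) -> str:
    """Check out a snapshot, attempt a Maven build, and classify the outcome.

    Returns "ok", "missing-artifact", "compilation-error", or "other".
    The string matching below is a rough simplification for illustration.
    """
    subprocess.run(["git", "-C", repo, "checkout", "--quiet", commit], check=True)
    result = subprocess.run(["mvn", "-q", "compile"], cwd=repo,
                            capture_output=True, text=True)
    if result.returncode == 0:
        return "ok"
    log = result.stdout + result.stderr
    if "Could not resolve dependencies" in log:
        return "missing-artifact"
    if "COMPILATION ERROR" in log:
        return "compilation-error"
    return "other"

def compilability(repo: str, commits: list[str]) -> dict[str, int]:
    """Tally build outcomes over a list of commit hashes (oldest first)."""
    outcomes: dict[str, int] = {}
    for commit in commits:
        kind = try_build(repo, commit)
        outcomes[kind] = outcomes.get(kind, 0) + 1
    return outcomes
```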
Item: Revisiting the reproducibility of empirical software engineering studies based on data retrieved from development repositories (Elsevier, 2023). Gonzalez-Barahona, Jesus M.; Robles, Gregorio.
Context: In 2012, our paper "On the reproducibility of empirical software engineering studies based on data retrieved from development repositories" was published. It proposed a method for assessing the reproducibility of studies based on mining software repositories (MSR studies). Since then, several initiatives have emerged with respect to the study of the reproducibility of this kind of study. Objective: To revisit the proposals of that paper, analyzing to what extent they remain valid, and how they relate to current initiatives and studies on reproducibility and validation of research results in empirical software engineering. Method: We analyze the most relevant studies affecting assumptions or consequences of the approach of the original paper, and other initiatives related to the evaluation of replicability aspects of empirical software engineering studies. We compare the results of that analysis with the results of the original study, finding similarities and differences. We also run a reproducibility assessment study on current MSR papers. Based on the comparison, and on the applicability of the method to current papers, we draw conclusions on the validity of the approach of the original paper. Main lessons learned: The method proposed in the original paper is still valid, and compares well with more recent methods. It matches the results of relevant studies on reproducibility, and a systematic comparison with them shows that our approach is aligned with their proposals. Our method has practical use, and complements well the current major initiatives on the review of reproducibility artifacts. As a side result, we learn that the reproducibility of MSR studies has improved during the last decade. Vision: We propose to use our approach as a fundamental element of a more profound review of the reproducibility of MSR studies, and of the characterization of validation studies in this realm.

Item: Software Development Metrics With a Purpose (IEEE Computer Society, 2022-04-08). Gonzalez-Barahona, Jesus M.; Izquierdo-Cortazar, Daniel; Robles, Gregorio.
A new generation of toolsets that are flexible enough to adapt to the data analytics needs of a given scenario is emerging to analyze free, open source software (FOSS). GrimoireLab is one such toolset; it meets many of the needs of foundations, developers, and companies.

Item: Software development metrics: to VR or not to VR (Springer, 2024). Moreno-Lumbreras, David; Robles, Gregorio; Izquierdo-Cortázar, Daniel; Gonzalez-Barahona, Jesus M.
Context: Current data visualization interfaces predominantly rely on 2-D screens. However, the emergence of virtual reality (VR) devices capable of immersive data visualization has sparked interest in exploring their suitability for visualizing software development data. Despite this, there is a lack of detailed investigation into the effectiveness of VR devices specifically for interacting with software development data visualizations. Objective: Our objective is to investigate the following question: "How do VR devices compare to traditional screens in visualizing data about software development?" Specifically, we aim to assess the accuracy of conclusions derived from exploring visualizations for understanding the software development process, as well as the time required to reach these conclusions. Method: In our controlled experiment, we recruited N=32 volunteers with diverse backgrounds. Participants interacted with similar data visualizations in both VR and traditional screen environments. For the traditional screen setup, we utilized a commercially available set of interactive dashboards based on Kibana, commonly used by Bitergia customers for data insights. In the VR environment, we designed a set of visualizations tailored to provide an equivalent dataset within a virtual room. Participants answered questions related to software evolution processes, specifically code review and issue tracking, in both VR and traditional screen environments, for two projects. We conducted statistical analyses to compare the correctness of their answers and the time taken for each question. Results: Our findings indicate that the correctness of answers in both environments is comparable. Regarding time spent, we observed similar durations, except for complex questions that required examining multiple interconnected visualizations. In such cases, participants in the VR environment were able to answer questions more quickly. Conclusion: Based on our results, we conclude that VR immersion can be as effective as traditional screen setups for understanding software development processes through visualization of relevant metrics in most scenarios. Moreover, VR may offer advantages in comprehending complex tasks that require navigating through multiple interconnected visualizations. However, further experimentation is necessary to validate and reinforce these conclusions.

Item: The influence of the city metaphor and its derivates in software visualization (Elsevier, 2024-04). Moreno-Lumbreras, David; Gonzalez-Barahona, Jesús M.; Robles, Gregorio; Cosentino, Valerio.
Context: The city metaphor is widely used in software visualization to represent complex systems as buildings and structures, providing an intuitive way for developers to understand software components. Various software visualization tools have adopted this approach. Objective: Identify the influence of the city metaphor on software visualization research, determine its state of the art, and identify derived tools and their main characteristics. Method: Conduct a systematic mapping study of 406 publications that reference the first paper on the use of the city metaphor in software visualization and/or the main paper of the CodeCity tool. Analyze the 168 publications from which valuable information could be extracted, and build a complete categorical analysis. Results: The field has grown considerably, with an increasing number of publications since 2001, and a changing research community with evolving interconnections between groups. Researchers have developed more tools that support the city metaphor, but fewer than 50% of the tools were referenced in their papers. Moreover, 85% of the tools did not use extended reality environments, indicating an opportunity for further exploration. Conclusion: The study demonstrates the active and continually growing presence of the city metaphor in research and its impact on software visualization and its derivatives.

Item: Virtual Reality vs. 2D Visualizations for Software Ecosystem Dependency Analysis – A Controlled Experiment (2024-08-30). Moreno-Lumbreras, David; Gonzalez-Barahona, Jesus M.; Robles, Gregorio.
Background/Context: Data is typically visualized using 2D on-screen tools. With the advent of devices capable of creating 3D and Virtual Reality (VR) scenes, there is a growing interest in exploring these technologies for data visualization, particularly for complex data like software dependencies. Despite this interest, there is limited evidence comparing VR with traditional 2D on-screen tools for such visualizations. Objective/Aim: This registered report aims to determine whether comprehension of software ecosystem dependencies, visualized through their metrics, is better when presented in VR scenes compared to 2D screens. Specifically, we seek to evaluate whether answers obtained through VR visualizations are more accurate, and whether it takes less time to derive these answers, compared to traditional 2D on-screen tools. Method: We will conduct an experiment with volunteers from various backgrounds, using two setups: a 2D on-screen tool and a VR scene created with A-Frame. The data will focus on web projects using the Node Package Manager (NPM) registry. Subjects will answer a series of questions in both setups, presented in random order. We will statistically analyze the correctness of their answers and the time taken to compare the two visualization methods.