Information Needs of Developers for Program Comprehension during Software Maintenance Tasks

Abstract

Software engineers undertaking maintenance tasks often work on unfamiliar code, which requires them to search for, relate, and collect information relevant to the maintenance task. The goal of this research is to create theories that describe the nature of information sought by developers and how that information is used by developers during two types of maintenance tasks: debugging (corrective maintenance) and enhancement (perfective maintenance). To meet this goal, six hypotheses are investigated regarding the navigation activities undertaken by developers to identify, relate, and collect information during software maintenance tasks. These hypotheses were investigated using data from two empirical studies of 18 developers performing enhancement and debugging tasks on three Java programs. Video recordings were used to annotate user interaction logs to create a history of user activities during the maintenance tasks. These data described the activities developers undertake during maintenance tasks, what source code elements the developers examined, and the amount of time developers spent performing various activities. These data were analyzed using a combination of statistical and qualitative methods to compare the different methods of searching for and collecting information relevant to the software maintenance tasks. Analysis of the data showed that the navigation styles used by developers (static navigation, normal navigation, and keyword searching) to find information differ significantly in the amount of time spent collecting information. Furthermore, static navigation techniques were significantly shorter in duration than keyword search techniques. No statistically significant differences were observed in the amount of time developers spent collecting information in debugging and enhancement tasks. During debugging tasks, developers focused on information that controlled the state and behavior of a particular element.
During enhancement tasks, developers focused on how an element used other elements, rather than how an element is used by other elements. The analysis of the code relationships motivated further study of the nature of the information gathered by developers in enhancement and debugging tasks. The information read by developers (source code, Java documentation, and web search results) was analyzed with respect to the content of the information, how the information was related to the task and code elements being investigated, and how the information was used. This qualitative analysis led to the following new theories on software maintenance: Theory 1: Developers are less likely to progress toward completing a maintenance task when the correct implementation of new code or correct editing of existing code requires logical connections and/or evaluations of other code elements. Theory 2: New code that has been duplicated from another source acts as a self-reference, thereby requiring developers to make fewer logical evaluations and increasing the likelihood the duplicated information will be successfully used in completing a task. Theory 3: Specific software behavior is often identified through analysis of a sequence of events and the control structures that propagate those events through the system, whereas a functional concept is often identified through comparisons, similarities, and references of existing functionality. These theories are new contributions to the field of software maintenance and program comprehension theory. These theories can be further evaluated to help guide the creation of tools and strategies for assisting developers in finding relevant information during software maintenance tasks. One such tool, the Mimec Spotlight, has been proposed and evaluated in this research.

Keywords

software maintenance, software engineering, psychology of programming, program comprehension

Degree

PhD

Discipline

Computer Science
