Predicting Attack-prone Components with Source Code Static Analyzers

dc.contributor.advisor Laurie Williams, Committee Chair en_US
dc.contributor.advisor Tao Xie, Committee Member en_US
dc.contributor.advisor Mladen Vouk, Committee Member en_US
dc.contributor.advisor Jason Osborne, Committee Member en_US
dc.contributor.author Gegick, Michael en_US
dc.date.accessioned 2010-04-02T19:11:10Z
dc.date.available 2010-04-02T19:11:10Z
dc.date.issued 2009-08-05 en_US
dc.identifier.other etd-05012009-141230 en_US
dc.description.abstract No single vulnerability detection technique can identify all vulnerabilities in a software system. However, the vulnerabilities that a detection technique does identify may be predictive of the residual vulnerabilities. We focus on creating and evaluating statistical models that predict which components contain the highest-risk residual vulnerabilities. The cost to find and fix faults grows with time in the software life cycle (SLC). A challenge with our statistical models is to make the predictions available early in the SLC to allow for cost-effective fortifications. Source code static analyzers (SCSA) are available during the coding phase and are capable of detecting code-level vulnerabilities. We use the code-level vulnerabilities identified by these tools to predict the presence of additional coding vulnerabilities and of vulnerabilities associated with the design and operation of the software. The goal of this research is to reduce the number of vulnerabilities that escape into the field by incorporating source code static analysis warnings into statistical models that predict which components are most susceptible to attack. The main independent variable for our statistical model is the count of security-related SCSA warnings. To determine whether additional metrics are required to increase the accuracy of the model, we also include the following as independent variables: non-security SCSA warnings, code churn and size, the count of faults found manually during development, and a measure of coupling between components. The dependent variable is the count of vulnerabilities reported by testing and those found in the field. We evaluated our model on three commercial telecommunications software systems. Two case studies were performed at an anonymous vendor and the third was performed at Cisco Systems. Each system uses a different technology and consists of over one million source lines of C/C++ code.
The results show positive and statistically significant correlations between the metrics and vulnerability counts. Additionally, the predictive models produce accurate probability rankings that indicate which components are most susceptible to attack. The models are evaluated with receiver operating characteristic curves; in each case study the area under the curve exceeded 92%. We also performed five-fold cross-validation to further demonstrate statistical confidence in the models. Based on these results we contribute the following theories: Theory 1: Large proportions of source code static analysis warnings are in the same components as other vulnerabilities that are likely to be exploited. Theory 2: Additional metrics, including non-security source code static analysis warnings, code churn and size, coupling, and faults found manually, increase the accuracy of a statistical model over one that uses security-related source code static analysis warnings alone. Components that contain security-related warnings identified by SCSA are also likely to contain other exploitable vulnerabilities. Software engineers should systematically inspect and test code for other vulnerabilities when a security-related warning is present. Fortifying against these vulnerabilities may help other techniques identify additional undetected vulnerabilities. en_US
dc.rights I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to NC State University or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report. en_US
dc.subject attack-prone en_US
dc.title Predicting Attack-prone Components with Source Code Static Analyzers en_US
dc.degree.name PhD en_US
dc.degree.level dissertation en_US
dc.degree.discipline Computer Science en_US

Files in this item

Files Size Format
etd.pdf 1.224Mb PDF
