Time filter

Source Type

Fraser G.,Saarland University | Arcuri A.,Certus Software V and nter
Proceedings - IEEE 5th International Conference on Software Testing, Verification and Validation, ICST 2012 | Year: 2012

Search-based techniques have been shown useful for the task of generating tests, for example in the case of object-oriented software. But, as for any meta-heuristic search, the efficiency is heavily dependent on many different factors, seeding is one such factor that may strongly influence this efficiency. In this paper, we evaluate new and typical strategies to seed the initial population as well as to seed values introduced during the search when generating tests for object-oriented code. We report the results of a large empirical analysis carried out on 20 Java projects (for a total of 1,752 public classes). Our experiments show with strong statistical confidence that, even for a testing tool that is already able to achieve high coverage, the use of appropriate seeding strategies can further improve performance. © 2012 IEEE.


Fraser G.,Saarland University | Arcuri A.,Certus Software V and nter
Proceedings - International Conference on Software Engineering | Year: 2012

Several promising techniques have been proposed to automate different tasks in software testing, such as test data generation for object-oriented software. However, reported studies in the literature only show the feasibility of the proposed techniques, because the choice of the employed artifacts in the case studies (e.g., software applications) is usually done in a non-systematic way. The chosen case study might be biased, and so it might not be a valid representative of the addressed type of software (e.g., internet applications and embedded systems). The common trend seems to be to accept this fact and get over it by simply discussing it in a threats to validity section. In this paper, we evaluate search-based software testing (in particular the EvoSuite tool) when applied to test data generation for open source projects. To achieve sound empirical results, we randomly selected 100 Java projects from SourceForge, which is the most popular open source repository (more than 300,000 projects with more than two million registered users). The resulting case study not only is very large (8,784 public classes for a total of 291,639 bytecode level branches), but more importantly it is statistically sound and representative for open source projects. Results show that while high coverage on commonly used types of classes is achievable, in practice environmental dependencies prohibit such high coverage, which clearly points out essential future research directions. To support this future research, our SF100 case study can serve as a much needed corpus of classes for test generation. © 2012 IEEE.


Carlier M.,French Institute for Research in Computer Science and Automation | Gotlieb A.,Certus Software V and nter
Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI | Year: 2011

Constraint solving over floating-point numbers is an emerging topic that found interesting applications in software analysis and testing. Even for IEEE-754 compliant programs, correct reasoning over floating-point computations is challenging and requires dedicated constraint solving approaches to be developed. Recent advances indicate that numerical properties of floating-point numbers can be used to efficiently prune the search space. In this paper, we reformulate the Marre and Michel property over floating-point addition/subtraction constraint to ease its implementation in real-world floating-point constraint solvers. We also generalize the property to the case of multiplication/division in order to benefit from its improvements in more cases. © 2011 IEEE.


Hervieu A.,French Institute for Research in Computer Science and Automation | Baudry B.,French Institute for Research in Computer Science and Automation | Gotlieb A.,Certus Software V and nter
Proceedings - International Symposium on Software Reliability Engineering, ISSRE | Year: 2011

Feature models are commonly used to specify variability in software product lines. Several tools support feature models for variability management at different steps in the development process. However, tool support for test configuration generation is currently limited. This test generation task consists in systematically selecting a set of configurations that represent a relevant sample of the variability space and that can be used to test the product line. In this paper we propose PACOGEN to analyze feature models and automatically generate a set of configurations that cover all pair wise interactions between features. PACOGEN relies on constraint programming to generate configurations that satisfy all constraints imposed by the feature model and to minimize the set of the tests configurations. This work also proposes an extensive experiment, based on the state-of-the art SPLOT feature models repository, showing that PACOGEN scales over variability spaces with millions of configurations and covers pair wise with less configurations than other available tools. © 2011 IEEE.


Marijan D.,Certus Software V and nter | Gotlieb A.,Certus Software V and nter | Sen S.,Certus Software V and nter
IEEE International Conference on Software Maintenance, ICSM | Year: 2013

Regression testing in continuous integration environment is bounded by tight time constraints. To satisfy time constraints and achieve testing goals, test cases must be efficiently ordered in execution. Prioritization techniques are commonly used to order test cases to reflect their importance according to one or more criteria. Reduced time to test or high fault detection rate are such important criteria. In this paper, we present a case study of a test prioritization approach ROCKET (Prioritization for Continuous Regression Testing) to improve the efficiency of continuous regression testing of industrial video conferencing software. ROCKET orders test cases based on historical failure data, test execution time and domain-specific heuristics. It uses a weighted function to compute test priority. The weights are higher if tests uncover regression faults in recent iterations of software testing and reduce time to detection of faults. The results of the study show that the test cases prioritized using ROCKET (1) provide faster fault detection, and (2) increase regression fault detection rate, revealing 30% more faults for 20% of the test suite executed, comparing to manually prioritized test cases. © 2013 IEEE.


Yue T.,Certus Software V and nter | Ali S.,Certus Software V and nter
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2012

Requirements are often structured and documented as use cases while UML state machine diagrams often describe the behavior of a system. State machines capture rich and detailed behavior of a system, which can serve as a basis for many automated activities such as automated test case and code generation. The former is of interest in this paper. Non-functional behavior can be modeled using standard UML state machines, but usually results in complex state machines. To cope with such complexity, Aspect-Oriented Modeling (AOM) is often recommended. AspectSM is a UML profile defined to model crosscutting behavior on UML state machines called as aspect state machines with the focus of supporting model-based test case generation for non-functional behavior. Hence, an automatic transition from use cases to aspect state machines would provide significant, practical help for testing system requirements. In this paper, we propose an approach to automatically generate aspect state machines from use cases for the purpose of non-functional testing. Our approach is implemented in a tool, which we used for two industrial case studies. Results show that high quality aspect state machines can be generated, which can be manually refined at a reasonable cost to support testing. © 2012 Springer-Verlag.


Fraser G.,University of Sheffield | Arcuri A.,Certus Software V and nter | McMinn P.,University of Sheffield
GECCO 2013 - Proceedings of the 2013 Genetic and Evolutionary Computation Conference | Year: 2013

Genetic Algorithms have been successfully applied to the generation of unit tests for classes, and are well suited to create complex objects through sequences of method calls. However, because the neighborhood in the search space for method sequences is huge, even supposedly simple optimizations on primitive variables (e.g., numbers and strings) can be ineffective or unsuccessful. To overcome this problem, we extend the global search applied in the EvoSuite test generation tool with local search on the individual statements of method sequences. In contrast to previous work on local search, we also consider complex datatypes including strings and arrays. A rigorous experimental methodology has been applied to properly evaluate these new local search operators. In our experiments on a set of open source classes of different kinds (e.g., numerical applications and text processing), the resulting test data generation technique increased branch coverage by up to 32% on average over the normal Genetic Algorithm. Copyright © 2013 ACM.


Sen S.,Certus Software V and nter | Gotlieb A.,Certus Software V and nter
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2013

Testing data-intensive systems is paramount to increase our reliance on e-governance services. An incorrectly computed tax can have catastrophic consequences in terms of public image. Testers at Norwegian Customs and Excise reveal that faults occur from interactions between database features such as field values. Taxation rules, for example, are triggered due to an interaction between 10,000 items, 88 country groups, and 934 tax codes. There are about 12.9 trillion 3-wise interactions. Finding interactions to uncover specific faults is like finding a needle in a haystack. Can we surgically generate a test database for interactions that interest testers? We address this question with a methodology and tool Faktum to automatically populate a test database that covers all T-wise interactions for selected features. Faktum generates a constraint model of interactions in Alloy and solves it using a divide-and-combine strategy. Our experiments demonstrate scalability of our methodology and we project its industrial applications. © 2013 Springer-Verlag.


Fraser G.,University of Sheffield | Arcuri A.,Certus Software V and nter
ACM Transactions on Software Engineering and Methodology | Year: 2014

Research on software testing produces many innovative automated techniques, but because software testing is by necessity incomplete and approximate, any new technique faces the challenge of an empirical assessment. In the past, we have demonstrated scientific advance in automated unit test generation with the EvoSuite tool by evaluating it on manually selected open-source projects or examples that represent a particular problem addressed by the underlying technique. However, demonstrating scientific advance is not necessarily the same as demonstrating practical value; even if EvoSuite worked well on the software projects we selected for evaluation, it might not scale up to the complexity of real systems. Ideally, one would use large "real-world" software systems to minimize the threats to external validity when evaluating research tools. However, neither choosing such software systems nor applying research prototypes to them are trivial tasks. In this article we present the results of a large experiment in unit test generation using the EvoSuite tool on 100 randomly chosen open-source projects, the 10 most popular open-source projects according to the SourceForge Web site, seven industrial projects, and 11 automatically generated software projects. The study confirms that EvoSuite can achieve good levels of branch coverage (on average, 71% per class) in practice. However, the study also exemplifies how the choice of software systems for an empirical study can influence the results of the experiments, which can serve to inform researchers to make more conscious choices in the selection of software system subjects. Furthermore, our experiments demonstrate how practical limitations interfere with scientific advances, branch coverage on an unbiased sample is affected by predominant environmental dependencies. The surprisingly large effect of such practical engineering problems in unit testing will hopefully lead to a larger appreciation of work in this area, thus supporting transfer of knowledge from software testing research to practice. © 2014 ACM.


Fraser G.,University of Sheffield | Arcuri A.,Certus Software V and nter
IEEE Transactions on Software Engineering | Year: 2013

Not all bugs lead to program crashes, and not always is there a formal specification to check the correctness of a software test's outcome. A common scenario in software testing is therefore that test data are generated, and a tester manually adds test oracles. As this is a difficult task, it is important to produce small yet representative test sets, and this representativeness is typically measured using code coverage. There is, however, a fundamental problem with the common approach of targeting one coverage goal at a time: Coverage goals are not independent, not equally difficult, and sometimes infeasible-the result of test generation is therefore dependent on the order of coverage goals and how many of them are feasible. To overcome this problem, we propose a novel paradigm in which whole test suites are evolved with the aim of covering all coverage goals at the same time while keeping the total size as small as possible. This approach has several advantages, as for example, its effectiveness is not affected by the number of infeasible targets in the code. We have implemented this novel approach in the EvoSuite tool, and compared it to the common approach of addressing one goal at a time. Evaluated on open source libraries and an industrial case study for a total of 1,741 classes, we show that EvoSuite achieved up to 188 times the branch coverage of a traditional approach targeting single branches, with up to 62 percent smaller test suites. © 1976-2012 IEEE.

Loading Certus Software V and nter collaborators
Loading Certus Software V and nter collaborators