Using machine learning to classify test outcomes

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution book



In software testing, approaches that exercise unusual or unexplored interactions with a system (techniques such as random testing, fuzzing, and exploratory testing) have been shown to bring substantial benefits. However, such approaches have a drawback: the outputs of the tests need to be manually checked for correctness, which places a significant burden on the software engineer. This paper presents a strategy to support the process of identifying which tests have passed or failed by combining clustering and semi-supervised learning. We have shown that by using machine learning it is possible to cluster test cases in such a way that those corresponding to failures concentrate into smaller clusters. Examining the test outcomes in order of cluster size, smallest first, has the effect of prioritising the results: those checked early on have a much higher probability of being failing tests. As the software engineer examines the results (and confirms or refutes the initial classification), this information is used to bootstrap a secondary learner that further improves the accuracy of the classification of the (as yet) unchecked tests. Results from experimenting with a range of systems demonstrate the substantial benefits that can be gained from this strategy, and show that remarkably accurate test output classifications can be derived from examining a relatively small proportion of results.
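The workflow described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not name the clustering algorithm, the features, or the secondary learner, so this sketch assumes a greedy leader clustering and a 1-nearest-neighbour classifier as simple stand-ins. The function names, the `radius` parameter, the inspection `budget`, and the toy feature vectors are all hypothetical.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def leader_clusters(vectors, radius):
    """Greedy leader clustering (an assumed stand-in for the paper's clusterer).

    Each vector joins the first cluster whose leader lies within `radius`;
    otherwise it starts a new cluster and becomes its leader.
    """
    clusters = []  # each entry: (leader_vector, [member indices])
    for idx, vec in enumerate(vectors):
        for leader, members in clusters:
            if dist(vec, leader) <= radius:
                members.append(idx)
                break
        else:
            clusters.append((vec, [idx]))
    return [members for _, members in clusters]


def inspection_order(clusters):
    """Flatten the clusters smallest-first, so rare behaviours are checked early."""
    return [idx for members in sorted(clusters, key=len) for idx in members]


def classify_unchecked(vectors, checked):
    """Label every unchecked test with the verdict of its nearest checked test.

    `checked` maps test index -> "pass"/"fail" as confirmed by the engineer.
    A 1-nearest-neighbour rule is an assumption, not the paper's learner.
    """
    predictions = {}
    for idx, vec in enumerate(vectors):
        if idx in checked:
            continue
        nearest = min(checked, key=lambda j: dist(vec, vectors[j]))
        predictions[idx] = checked[nearest]
    return predictions


# Toy feature vectors for seven test executions (hypothetical data).
vectors = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),  # common behaviour
    (5.0, 5.0), (5.1, 5.0),                                        # unusual behaviour
]
# Stands in for the engineer's manual verdicts, consulted only for checked tests.
oracle = ["pass", "pass", "pass", "pass", "pass", "fail", "fail"]

clusters = leader_clusters(vectors, radius=1.0)
order = inspection_order(clusters)

budget = 3  # number of results the engineer inspects by hand
checked = {idx: oracle[idx] for idx in order[:budget]}
predictions = classify_unchecked(vectors, checked)
```

In this toy run the two unusual executions form the smallest cluster, so they are inspected first, and the three manual verdicts are enough to label the remaining four tests. With real test outputs, the quality of the result would depend heavily on how executions are turned into feature vectors, which this sketch does not address.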
Original language: English
Title of host publication: 2019 IEEE International Conference On Artificial Intelligence Testing (AITest)
Place of Publication: Piscataway, N.J.
Number of pages: 2
ISBN (Electronic): 978-1-7281-0492-8
ISBN (Print): 978-1-7281-0493-5
Publication status: Published - 20 May 2019


  • semisupervised learning
  • pattern classification
  • software engineering
  • machine learning
  • program testing


