Verification methods

If there is one SDLC practice that is deemed important, it is testing, or more general, software verification. The need for software verification is evident from all the buggy software we encounter and software manufacturers spend a substantial portion of their IT budgets on verifying their software ([CioInsight], [ComputerWeekly]).

While the need for software verification is evident, it has also become clear that software verification is really difficult or at least time consuming. Due to the complexity of most software, complete verification is infeasible.

Verifying software security adds a few extra difficulties, making the process even harder. Since the threats we are looking for in software security verification are relatively invisible, we must spend extra effort in finding them.

We can verify software manually, automatically or semi-automatically. We can verify different artefacts: the running program, the source code, the design or the specification. Note that such artefacts must be mature enough in order to be verified, there is no point in testing a system that has barely been built. Verifications on the running program are known as dynamic application security testing (DAST), while reviews of source code are often called static application security testing (SAST).

This chapters lists some common verifcation methods. Which one to apply depends heavily on the developer expertise within your team, and the available resources.

Functional tests

The best-known software verification technique is testing. Testing software is about as old as developing software and many developers have testing experience.

Testing security however is special. While we can easily translate some security requirements into test cases, for other security requirements this is difficult. For example, the requirement “In the order search results, I should only see orders from customers in my portfolio” is straightforward to translate into a test: create a test set with orders in the portfolio and not in the portfolio, perform a search and verify that the search results only contain orders that are in the portfolio. However, the requirement “It should not be possible to see orders outside my portfolio” is vague and requires us to know and test all possible ways in which we can see orders. A better verification technique for the latter requirement may be model checking (see Formal methods below).

Many of the security requirements that result from the STRIDE requirements analysis technique (see Finding user interaction level security requirements) are like functional requirements and translate very well into test scenarios. The same holds for mitigations.

For the more vague security requirements however, our tests will never be complete.

Writing security test scenarios can be tricky, as the following example shows:

Consider the following requirement: “the input XML message should only contain one bankaccount element.” You write a test that tries to process a message containing two bankaccount elements and see that an error occurs during processing. Therefore, you conclude that the validation is implemented correctly.

But the truth is that you cannot tell what caused the error. Perhaps a typo in the test message resulted in an invalid syntax. If you expect something to go wrong in your test, you must be precise about how it should go wrong.

Formal methods

Formal methods are slowly finding their way out of the academic world into software development, mostly in highly safety or security critical systems. Most of these systems have academic roots and require a background in logic and computer science, which makes them difficult to apply. The biggest difficulty is in formally specifying the security properties that need verification. Recent advances in the theory of formal methods, combined with the increase in computing power, convinced researchers in these fields that formally verified software will become feasible soon, such as the DeepSpec project, which wants to create a system of formally verified software and hardware [DeepSpec].

Formal methods work as long as we can specify a system and its properties. Security is challenging to specify, because a specification is an abstraction based on certain assumptions, and attacks on systems typically abuse the fact that such assumptions do not hold. Keep Knuth’s famous words in mind: “I have only proved it correct, not tried it.” [Knuth].

A study by the German Federal Office for Information Security (BSI) gives a good overview of formal methods and their applications to security [BSI]. Many verification tools combine formal methods with one of the other verification methods described here.

David Wheeler gives a good overview of formal method tools, many of which have an open source license [Wheeler].

Code review

We have discussed code reviews in the chapter Threats in the coding phase, as a practice that software developers can perform while writing code. An independent code review can be a good idea, as developers may have developed certain ‘blind spots’ in their own code.


While a penetration test or pentest for short, often refers to the process of applying many of the other verification methods described here, I would like to use it to refer to the process of manually testing a system for security issues. The advantage of manual testing is that a creative tester can find bugs that were not expected and other verification methods may have missed. The disadvantage is that it is more time consuming. Because most pentests have a limited time budget, a tester will have to make choices, based on intuition and expertise. As a result, two pentests on the same system can give very different results. In order to standardize the process, several pentest methodologies have been created. OWASP describes a few of them [OWASPPTM].

The OWASP Testing Guide describes how to execute many (web) application specific security tests [OWASPTG]. The last version dates from 2014, and parts of the guide are a bit dated.

A pentest is a good way to get a first impression of the security of an application, especially if you did not build it. However, as a developer on a project with enough security knowledge, you can use more effective ways to verify an application’s security than a pentest. Having an outsider look at your software can however give good insights, so do not write off penetration tests immediately.

Vulnerability scan

Vulnerability scanners are automated test tools that will detect known security issues in software. They are fast and predictable, but will only look for problems that are known. Pentesters often use vulnerability scanners as part of a pentest. Scanning for vulnerabilities is effective for finding security misconfigurations, although some scanners will look for implementation bugs.

Vulnerability scanners may be fast, but tend to generate a lot of output that someone needs to interpret. Like static code analysis tools, they will report false positives.

OWASP keeps a list of vulnerability scanners, both open source and commercial [OWASPDAST].


Fuzzing is an automated test technique which feeds many malformed inputs to a program and observes its behaviour to detect bugs in the implementation. Contrary to vulnerability scanning, fuzzers are able to find previously unknown security issues.

Fuzzing can be as simple as randomly flipping bits on a valid input and running the program on those inputs. Despite its simplicity, it is often surprisingly effective. When Google applied fuzzing to the ffmpeg software during two years, this resulted in over 1,000 bug fixes in the ffmpeg code [JurczykColdwind].

Good fuzzing targets are often programs that parse input. The difficulty in fuzzing is not so much the generation of test data, but detecting erroneous behaviour. Most fuzzing software is therefore limited to finding bugs that are easy to detect automatically at runtime, such as memory access errors and program crashes.

Many fuzzing tools have come and gone. Fuzzing frameworks let you create your own fuzzing setup. Sulley (now superseded by Boofuzz [Boofuzz]) is built on a fuzzing architecture. The data generation module generates the input data, which the driver then delivers to the target. Several agents monitor the target system, for example to see if it has crashed. The session manager coordinates the whole fuzzing program. Several utilities, such as a crashbin explorer, help in analyzing the reported bugs [AminiPortnoy].

A very popular fuzzer is American Fuzzy Lop, which uses code instrumentation and techniques from genetic algorithms to get a high code coverage [AFL]. Hanno Böck’s Fuzzing Project gives some good tutorials to get started [Böck].

Google started OSS-Fuzz, a service that will fuzz open source projects that are interesting enough [OSSFuzz].


After we have created software artefacts, such as requirements, design, code, or a running program, we can verify them. We can choose from a number of verification methods, depending on the available expertise and resources. Static techniques such as code reviews and design reviews look at artefacts such as code and design, while dynamic techniques look at a running system. Dynamic techniques include functional tests, manual pentests, vulnerability scanning and fuzzing. In all verification methods, formal methods can help to achieve a high assurance.