catch2/docs/command-line.md
Martin Hořeňovský 2a19ae16b8
Rewrite commandline test spec docs
Closes #2738
2023-08-30 16:18:34 +02:00

Command line

Contents
Specifying which tests to run
Choosing a reporter to use
Breaking into the debugger
Showing results for successful tests
Aborting after a certain number of failures
Listing available tests, tags or reporters
Sending output to a file
Naming a test run
Eliding assertions expected to throw
Make whitespace visible
Warnings
Reporting timings
Load test names to run from a file
Specify the order test cases are run
Specify a seed for the Random Number Generator
Identify framework and version according to the libIdentify standard
Wait for key before continuing
Skip all benchmarks
Specify the number of benchmark samples to collect
Specify the number of resamples for bootstrapping
Specify the confidence-interval for bootstrapping
Disable statistical analysis of collected benchmark samples
Specify the amount of time in milliseconds spent on warming up each test
Usage
Specify the section to run
Filenames as tags
Override output colouring
Test Sharding
Allow running the binary without tests
Output verbosity

Catch works quite nicely without any command line options at all - but for those times when you want greater control the following options are available. Click one of the following links to take you straight to that option - or scroll on to browse the available options.

<test-spec> ...
-h, -?, --help
-s, --success
-b, --break
-e, --nothrow
-i, --invisibles
-o, --out
-r, --reporter
-n, --name
-a, --abort
-x, --abortx
-w, --warn
-d, --durations
-f, --input-file
-c, --section
-#, --filenames-as-tags


--list-tests
--list-tags
--list-reporters
--list-listeners
--order
--rng-seed
--libidentify
--wait-for-keypress
--skip-benchmarks
--benchmark-samples
--benchmark-resamples
--benchmark-confidence-interval
--benchmark-no-analysis
--benchmark-warmup-time
--colour-mode
--shard-count
--shard-index
--allow-running-no-tests
--verbosity


Specifying which tests to run

<test-spec> ...

By providing a test spec, you filter which tests will be run. If you call Catch2 without any test spec, then it will run all non-hidden test cases. A test case is hidden if it has the [!benchmark] tag, or any tag with a dot at the start, e.g. [.] or [.foo].

There are three basic test specs that can then be combined into more complex specs:

  • Full test name, e.g. "Test 1".

This allows only test cases whose name is "Test 1".

  • Wildcarded test name, e.g. "*Test", or "Test*", or "*Test*".

This allows any test case whose name ends with, starts with, or contains in the middle the string "Test". Note that the wildcard can only be at the start or end.

  • Tag name, e.g. [some-tag].

This allows any test case tagged with "[some-tag]". Remember that some tags are special, e.g. those that start with "." or with "!".

You can also combine the basic test specs to create more complex test specs. You can

  • Concatenate specs to apply all of them, e.g. [some-tag][other-tag].

This allows test cases that are tagged with both "[some-tag]" and "[other-tag]". A test case with just "[some-tag]" will not pass the filter, nor will a test case with just "[other-tag]".

  • Comma-join specs to apply any of them, e.g. [some-tag],[other-tag].

This allows test cases that are tagged with either "[some-tag]" or "[other-tag]". A test case with both will obviously also pass the filter.

Note that concatenation binds more tightly than commas. This means that [a][b],[c] accepts tests that are tagged with both "[a]" and "[b]", as well as tests that are tagged with just "[c]".

  • Negate the spec by prepending it with ~, e.g. ~[some-tag].

This rejects any test case that is tagged with "[some-tag]". Note that rejection takes precedence over other filters.

Note that negation always binds to the basic test spec that follows it. This means that ~[foo][bar] negates only the "[foo]" tag and not the "[bar]" tag.

Note that when Catch2 is deciding whether to include a test, first it checks whether the test matches any negative filters. If it does, the test is rejected. After that, the behaviour depends on whether there are positive filters as well. If there are no positive filters, all remaining non-hidden tests are included. If there are positive filters, only tests that match the positive filters are included.
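The decision procedure above can be sketched in plain C++. This is a simplified model, not Catch2's implementation: it handles only a single comma-free spec made of tag filters, and the names `TestCaseModel`, `hasTag` and `isSelected` are hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical model of a test case: its name, its tags (without brackets),
// and whether it is hidden.
struct TestCaseModel {
    std::string name;
    std::vector<std::string> tags;
    bool hidden;
};

// True if `tc` carries the tag `tag`.
bool hasTag(const TestCaseModel& tc, const std::string& tag) {
    return std::find(tc.tags.begin(), tc.tags.end(), tag) != tc.tags.end();
}

// Decide whether `tc` is selected. `positive` tags must all match;
// `negative` tags (written with ~ on the command line) must not match.
bool isSelected(const TestCaseModel& tc,
                const std::vector<std::string>& positive,
                const std::vector<std::string>& negative) {
    // 1. A match against any negative filter rejects the test outright.
    for (const auto& tag : negative) {
        if (hasTag(tc, tag)) return false;
    }
    // 2. Without positive filters, all remaining non-hidden tests are included.
    if (positive.empty()) return !tc.hidden;
    // 3. With positive filters, only tests matching them are included.
    //    A hidden test that matches is included as well.
    for (const auto& tag : positive) {
        if (!hasTag(tc, tag)) return false;
    }
    return true;
}
```

Under this model, a hidden test tagged only with "bar" is selected by ~[foo][bar], while a test tagged with both "foo" and "bar" is rejected.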

You can also match test names with special characters by escaping them with a backslash ("\"), e.g. a test named "Do A, then B" is matched by the "Do A\, then B" test spec. Backslash also escapes itself.

Examples

Given these TEST_CASEs,

TEST_CASE("Test 1") {}

TEST_CASE("Test 2", "[.foo]") {}

TEST_CASE("Test 3", "[.bar]") {}

TEST_CASE("Test 4", "[.][foo][bar]") {}

this is the result of these filters

./tests                      # Selects only the first test, others are hidden
./tests "Test 1"             # Selects only the first test, others do not match
./tests ~"Test 1"            # Selects no tests. Test 1 is rejected, other tests are hidden
./tests "Test *"             # Selects all tests.
./tests [bar]                # Selects tests 3 and 4. Other tests are not tagged [bar]
./tests ~[foo]               # Selects test 1, because it is the only non-hidden test without [foo] tag
./tests [foo][bar]           # Selects test 4.
./tests [foo],[bar]          # Selects tests 2, 3, 4.
./tests ~[foo][bar]          # Selects test 3. 2 and 4 are rejected due to having [foo] tag
./tests ~"Test 2"[foo]       # Selects test 4, because test 2 is explicitly rejected
./tests [foo][bar],"Test 1"  # Selects tests 1 and 4.
./tests "Test 1*"            # Selects test 1, wildcard can match zero characters

Note: Using plain asterisk on a command line can cause issues with shell expansion. Make sure that the asterisk is passed to Catch2 and is not interpreted by the shell.

Choosing a reporter to use

-r, --reporter <reporter[::key=value]*>

Reporters are how the output from Catch2 (results of assertions, tests, benchmarks and so on) is formatted and written out. The default reporter is called the "Console" reporter and is intended to provide relatively verbose and human-friendly output.

Reporters are also individually configurable. To pass configuration options to the reporter, you append ::key=value to the reporter specification as many times as you want, e.g. --reporter xml::out=someFile.xml.

The keys must either be prefixed by "X", in which case they are not parsed by Catch2 and are only passed down to the reporter, or be one of the options hardcoded into Catch2. Currently there are only 2, "out" and "colour-mode".

Note that the reporter might still check the X-prefixed options for validity, and throw an error if they are wrong.
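To make the shape of a reporter spec concrete, here is a minimal sketch that splits a spec such as xml::out=someFile.xml into the reporter name and its key=value options. It is only an illustration: `ReporterSpecModel` and `parseReporterSpec` are hypothetical names, and the sketch does not perform the validation that Catch2 itself does.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result type: the reporter name plus its key/value options.
struct ReporterSpecModel {
    std::string name;
    std::vector<std::pair<std::string, std::string>> options;
};

// Split `spec` on "::". The first part is the reporter name; every later
// part is expected to look like key=value. A part without '=' is kept as a
// key with an empty value (an assumption of this sketch, not Catch2 behaviour).
ReporterSpecModel parseReporterSpec(const std::string& spec) {
    ReporterSpecModel result;
    std::size_t start = 0;
    bool first = true;
    while (true) {
        const std::size_t sep = spec.find("::", start);
        const std::size_t len =
            (sep == std::string::npos) ? std::string::npos : sep - start;
        const std::string part = spec.substr(start, len);
        if (first) {
            result.name = part;
            first = false;
        } else {
            const std::size_t eq = part.find('=');
            if (eq == std::string::npos) {
                result.options.emplace_back(part, std::string());
            } else {
                result.options.emplace_back(part.substr(0, eq),
                                            part.substr(eq + 1));
            }
        }
        if (sep == std::string::npos) break;
        start = sep + 2;  // skip past the "::" separator
    }
    return result;
}
```

Because the split happens on every "::", the sketch also shows why neither reporter names nor option values can contain that sequence.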

Support for passing arguments to reporters through the -r, --reporter flag was introduced in Catch2 3.0.1

There are multiple built-in reporters; you can see what they do by using the --list-reporters flag. If you need a reporter providing a custom format outside of the already provided ones, look at the "write your own reporter" part of the reporter documentation.

This option may be passed multiple times to use multiple (different) reporters at the same time. See the reporter documentation for details on what the resulting behaviour is. Also note that at most one reporter can be provided without the output-file part of the reporter spec. This reporter will use the "default" output destination, based on the -o, --out option.

Support for using multiple different reporters at the same time was introduced in Catch2 3.0.1

Note: There is currently no way to escape :: in the reporter spec, and thus the reporter names, or configuration keys and values, cannot contain ::. As :: in paths is relatively obscure (unlike ':'), we do not consider this an issue.

Breaking into the debugger

-b, --break

Under most debuggers Catch2 is capable of automatically breaking on a test failure. This allows the user to see the current state of the test during failure.

Showing results for successful tests

-s, --success

Usually you only want to see reporting for failed tests. Sometimes it's useful to see all the output (especially when you don't trust that the test you just added worked first time!). To see successful, as well as failing, test results just pass this option. Note that each reporter may treat this option differently. The JUnit reporter, for example, logs all results regardless.

Aborting after a certain number of failures

-a, --abort
-x, --abortx [<failure threshold>]

If a REQUIRE assertion fails the test case aborts, but subsequent test cases are still run. If a CHECK assertion fails even the current test case is not aborted.

Sometimes this results in a flood of failure messages and you'd rather just see the first few. Specifying -a or --abort on its own will abort the whole test run on the first failed assertion of any kind. Use -x or --abortx followed by a number to abort after that number of assertion failures.

Listing available tests, tags or reporters

--list-tests
--list-tags
--list-reporters
--list-listeners

The --list* options became customizable through reporters in Catch2 3.0.1

The --list-listeners option was added in Catch2 3.0.1

--list-tests lists all registered tests matching the specified test spec. Usually this listing also includes tags, and potentially also other information, like source location, based on verbosity and the reporter's design.

--list-tags lists all tags from registered tests matching the specified test spec. Usually this also includes the number of test cases they match and similar information.

--list-reporters lists all available reporters and their descriptions.

--list-listeners lists all registered listeners and their descriptions.

The --verbosity argument modifies the level of detail provided by the default --list* options as follows:

| Option | normal (default) | quiet | high |
|---|---|---|---|
| --list-tests | Test names and tags | Test names only | Same as normal, plus source code line |
| --list-tags | Tags and counts | Same as normal | Same as normal |
| --list-reporters | Reporter names and descriptions | Reporter names only | Same as normal |
| --list-listeners | Listener names and descriptions | Same as normal | Same as normal |

Sending output to a file

-o, --out <filename>

Use this option to send all output to a file, instead of stdout. You can use - as the filename to explicitly send the output to stdout (this is useful e.g. when using multiple reporters).

Support for - as the filename was introduced in Catch2 3.0.1

Filenames starting with "%" (percent symbol) are reserved by Catch2 for meta purposes, e.g. using %debug as the filename opens stream that writes to platform specific debugging/logging mechanism.

Catch2 currently recognizes 3 meta streams:

  • %debug - writes to platform specific debugging/logging output
  • %stdout - writes to stdout
  • %stderr - writes to stderr

Support for %stdout and %stderr was introduced in Catch2 3.0.1

Naming a test run

-n, --name <name for test run>

If a name is supplied it will be used by the reporter to provide an overall name for the test run. This can be useful if you are sending to a file, for example, and need to distinguish different test runs - either from different Catch executables or runs of the same executable with different options. If not supplied the name is defaulted to the name of the executable.

Eliding assertions expected to throw

-e, --nothrow

Skips all assertions that test that an exception is thrown, e.g. REQUIRE_THROWS.

These can be a nuisance in certain debugging environments that may break when exceptions are thrown (while this is usually optional for handled exceptions, it can be useful to have enabled if you are trying to track down something unexpected).

Sometimes exceptions are expected outside of one of the assertions that tests for them (perhaps thrown and caught within the code-under-test). The whole test case can be skipped when using -e by marking it with the [!throws] tag.

When running with this option any throw checking assertions are skipped so as not to contribute additional noise. Be careful if this affects the behaviour of subsequent tests.

Make whitespace visible

-i, --invisibles

If a string comparison fails due to differences in whitespace - especially leading or trailing whitespace - it can be hard to see what's going on. This option transforms tabs and newline characters into \t and \n respectively when printing.
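The transformation can be pictured with a small helper like the one below. It is a sketch of the described behaviour only; `makeWhitespaceVisible` is a hypothetical name and not a Catch2 function.

```cpp
#include <string>

// Replace each tab with the two characters "\t" and each newline with the
// two characters "\n", leaving all other characters untouched.
std::string makeWhitespaceVisible(const std::string& text) {
    std::string out;
    for (char c : text) {
        if (c == '\t') {
            out += "\\t";
        } else if (c == '\n') {
            out += "\\n";
        } else {
            out += c;
        }
    }
    return out;
}
```

With such a transformation, a trailing newline that would otherwise be invisible in the failure message shows up as a literal \n at the end of the printed string.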

Warnings

-w, --warn <warning name>

You can think of Catch2's warnings as the equivalent of the -Werror (/WX) flag for C++ compilers. It turns some suspicious occurrences, like a section without assertions, into errors. Because these might be intended, warnings are not enabled by default, but users can opt in.

You can enable multiple warnings at the same time.

There are currently two warnings implemented:

    NoAssertions        // Fail test case / leaf section if no assertion
                        // (e.g. `REQUIRE`) is encountered.
    UnmatchedTestSpec   // Fail test run if any of the CLI test specs did
                        // not match any tests.

UnmatchedTestSpec was introduced in Catch2 3.0.1.

Reporting timings

-d, --durations <yes/no>

When set to yes Catch will report the duration of each test case, in milliseconds. Note that it does this regardless of whether a test case passes or fails. Note also that certain reporters (e.g. JUnit) always report test case durations regardless of this option being set or not.

-D, --min-duration <value>

--min-duration was introduced in Catch2 2.13.0

When set, Catch will report the duration of each test case that took more than <value> seconds, in milliseconds. This option is overridden by both -d yes and -d no, so that either all durations are reported, or none are.

Load test names to run from a file

-f, --input-file <filename>

Provide the name of a file that contains a list of test case names, one per line. Blank lines are skipped.

A useful way to generate an initial instance of this file is to combine the --list-tests flag with the --verbosity quiet option. You can also use test specs to filter this list down to what you want first.

Specify the order test cases are run

--order <decl|lex|rand>

Test cases are ordered one of three ways:

decl

Declaration order (this is the default order if no --order argument is provided). Tests in the same translation unit are sorted using their declaration orders, different TUs are sorted in an implementation (linking) dependent order.

lex

Lexicographic order. Tests are sorted by their name, their tags are ignored.

rand

Randomly ordered. The order is dependent on Catch2's random seed (see --rng-seed), and is subset invariant. What this means is that as long as the random seed is fixed, running only some tests (e.g. via tag) does not change their relative order.

The subset stability was introduced in Catch2 v2.12.0

Since the random order was made subset stable, we promise that given the same random seed, the order of test cases will be the same across different platforms, as long as the tests were compiled against an identical version of Catch2. We reserve the right to change the relative order of test cases between Catch2 versions, but it is unlikely to happen often.
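One way to obtain a subset-invariant random order is to sort tests by a hash of their name mixed with the seed, because each test's sort key then depends only on that test and the seed. The sketch below illustrates the idea under that assumption; the hash function (FNV-1a) and the names `seededHash` and `shuffledOrder` are chosen for illustration, and Catch2's actual scheme may differ.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// FNV-1a hash of `name`, with `seed` mixed into the starting state.
// Illustrative choice only: simple and deterministic across platforms.
std::uint64_t seededHash(std::uint64_t seed, const std::string& name) {
    std::uint64_t h = 14695981039346656037ULL ^ seed;
    for (unsigned char c : name) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

// Order test names by their seeded hash. Removing tests from `names` does
// not change the relative order of the ones that remain.
std::vector<std::string> shuffledOrder(std::vector<std::string> names,
                                       std::uint64_t seed) {
    std::sort(names.begin(), names.end(),
              [seed](const std::string& a, const std::string& b) {
                  const std::uint64_t ha = seededHash(seed, a);
                  const std::uint64_t hb = seededHash(seed, b);
                  return ha != hb ? ha < hb : a < b;  // tie-break on the name
              });
    return names;
}
```

A conventional shuffle of the whole list would not have this property, since removing one element changes which random draws the remaining elements receive.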

Specify a seed for the Random Number Generator

--rng-seed <'time'|'random-device'|number>

Sets the seed for random number generators used by Catch2. These are used e.g. to shuffle tests when user asks for tests to be in random order.

Using time as the argument asks Catch2 to generate the seed through a call to std::time(nullptr). This provides very weak randomness and multiple runs of the binary can generate the same seed if they are started close to each other.

Using random-device asks for std::random_device to be used instead. If your implementation provides a working std::random_device, it should be preferred to using time. Catch2 uses std::random_device by default.
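The three argument forms map onto three seed sources, roughly as in the following sketch. `resolveSeed` is a hypothetical helper, not part of Catch2, and the 32-bit seed width is an assumption made for the example.

```cpp
#include <cstdint>
#include <ctime>
#include <random>
#include <string>

// Turn the --rng-seed argument into a numeric seed.
// Hypothetical helper; the 32-bit width is an assumption of this sketch.
std::uint32_t resolveSeed(const std::string& arg) {
    if (arg == "time") {
        // Weak: two runs started within the same second get the same seed.
        return static_cast<std::uint32_t>(std::time(nullptr));
    }
    if (arg == "random-device") {
        // Preferred when the platform provides a working std::random_device.
        return static_cast<std::uint32_t>(std::random_device{}());
    }
    // Otherwise treat the argument as an explicit number.
    // std::stoul throws std::invalid_argument if it is not numeric.
    return static_cast<std::uint32_t>(std::stoul(arg));
}
```

Passing an explicit number is what makes a randomly ordered run reproducible.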

Identify framework and version according to the libIdentify standard

--libidentify

See The LibIdentify repo for more information and examples.

Wait for key before continuing

--wait-for-keypress <never|start|exit|both>

Will cause the executable to print a message and wait until the return/enter key is pressed before continuing, either before running any tests, after running all tests, or both, depending on the argument.

Skip all benchmarks

--skip-benchmarks

Introduced in Catch2 3.0.1.

This flag tells Catch2 to skip running all benchmarks. Benchmarks in this case mean code blocks in BENCHMARK and BENCHMARK_ADVANCED macros, not test cases with the [!benchmark] tag.

Specify the number of benchmark samples to collect

--benchmark-samples <# of samples>

Introduced in Catch2 2.9.0.

When running benchmarks a number of "samples" is collected. This is the base data for later statistical analysis. Per sample a clock resolution dependent number of iterations of the user code is run, which is independent of the number of samples. Defaults to 100.

Specify the number of resamples for bootstrapping

--benchmark-resamples <# of resamples>

Introduced in Catch2 2.9.0.

After the measurements are performed, statistical bootstrapping is performed on the samples. The number of resamples for that bootstrapping is configurable but defaults to 100000. Due to the bootstrapping it is possible to give estimates for the mean and standard deviation. The estimates come with a lower bound and an upper bound, and the confidence interval (which is configurable but defaults to 95%).

Specify the confidence-interval for bootstrapping

--benchmark-confidence-interval <confidence-interval>

Introduced in Catch2 2.9.0.

The confidence-interval is used for statistical bootstrapping on the samples to calculate the upper and lower bounds of mean and standard deviation. Must be between 0 and 1 and defaults to 0.95.
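To illustrate how the number of resamples and the confidence interval interact, here is a sketch of a plain percentile bootstrap for the mean. This is a generic textbook variant written for illustration; it is not necessarily the bootstrap method Catch2 uses, and `bootstrapMeanInterval` and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Percentile bootstrap for the mean (illustrative only).
// Preconditions: `samples` is non-empty, `resamples` > 0,
// and 0 < `confidence` < 1.
// Returns the {lower, upper} bounds of the interval.
std::pair<double, double> bootstrapMeanInterval(
        const std::vector<double>& samples,
        std::size_t resamples,
        double confidence,
        unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, samples.size() - 1);

    // Each resample draws samples.size() values with replacement
    // and records their mean.
    std::vector<double> means;
    means.reserve(resamples);
    for (std::size_t i = 0; i < resamples; ++i) {
        double sum = 0.0;
        for (std::size_t j = 0; j < samples.size(); ++j) {
            sum += samples[pick(rng)];
        }
        means.push_back(sum / static_cast<double>(samples.size()));
    }

    // The bounds are the percentiles that cut off the two tails.
    std::sort(means.begin(), means.end());
    const double tail = (1.0 - confidence) / 2.0;
    const double last = static_cast<double>(resamples - 1);
    const std::size_t lo = static_cast<std::size_t>(tail * last);
    const std::size_t hi = static_cast<std::size_t>((1.0 - tail) * last);
    return {means[lo], means[hi]};
}
```

In this sketch, a confidence value of 0.95 discards the lowest and highest 2.5% of the resampled means, and a larger number of resamples makes those cut-off points more stable.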

Disable statistical analysis of collected benchmark samples

--benchmark-no-analysis

Introduced in Catch2 2.9.0.

When this flag is specified no bootstrapping or any other statistical analysis is performed. Instead the user code is only measured and the plain mean from the samples is reported.

Specify the amount of time in milliseconds spent on warming up each test

--benchmark-warmup-time

Introduced in Catch2 2.11.2.

Configure the amount of time spent warming up each test.

Usage

-h, -?, --help

Prints the command line arguments to stdout

Specify the section to run

-c, --section <section name>

To limit execution to a specific section within a test case, use this option one or more times. To narrow to sub-sections use multiple instances, where each subsequent instance specifies a deeper nesting level.

E.g. if you have:

TEST_CASE( "Test" ) {
  SECTION( "sa" ) {
    SECTION( "sb" ) {
      /*...*/
    }
    SECTION( "sc" ) {
      /*...*/
    }
  }
  SECTION( "sd" ) {
    /*...*/
  }
}

Then you can run sb with:

./MyExe Test -c sa -c sb

Or run just sd with:

./MyExe Test -c sd

To run all of sa, including sb and sc use:

./MyExe Test -c sa

There are some limitations of this feature to be aware of:

  • Code outside of sections being skipped will still be executed - e.g. any set-up code in the TEST_CASE before the start of the first section.
  • At time of writing, wildcards are not supported in section names.
  • If you specify a section without narrowing to a test case first then all test cases will be executed (but only matching sections within them).

Filenames as tags

-#, --filenames-as-tags

This option adds an extra tag to all test cases. The tag is # followed by the unqualified filename the test case is defined in, with the last extension stripped out.

For example, tests within the file tests\SelfTest\UsageTests\BDD.tests.cpp will be given the [#BDD.tests] tag.
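The rule for deriving the tag can be sketched as follows. `filenameTag` is a hypothetical helper written for illustration, not Catch2's own code; it accepts both / and \ as path separators.

```cpp
#include <cstddef>
#include <string>

// Build the tag for a source file path: take the unqualified filename,
// strip only its last extension, and prefix the result with '#'.
std::string filenameTag(const std::string& path) {
    // Drop everything up to and including the last path separator.
    const std::size_t slash = path.find_last_of("/\\");
    std::string file =
        (slash == std::string::npos) ? path : path.substr(slash + 1);

    // Remove the last extension only, so "BDD.tests.cpp" becomes "BDD.tests".
    const std::size_t dot = file.find_last_of('.');
    if (dot != std::string::npos) {
        file.erase(dot);
    }
    return "[#" + file + "]";
}
```

The resulting tag can then be used in a test spec like any other tag, which gives a quick way to run all tests defined in one file.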

Override output colouring

--colour-mode <ansi|win32|none|default>

The --colour-mode option replaced the old --colour option in Catch2 3.0.1

Catch2 supports two different ways of colouring terminal output, and by default it attempts to make a good guess on which implementation to use (and whether to even use it, e.g. Catch2 tries to avoid writing colour codes when writing the results into a file).

--colour-mode allows the user to explicitly select what happens.

  • --colour-mode ansi tells Catch2 to always use ANSI colour codes, even when writing to a file
  • --colour-mode win32 tells Catch2 to use colour implementation based on Win32 terminal API
  • --colour-mode none tells Catch2 to disable colours completely
  • --colour-mode default lets Catch2 decide

--colour-mode default is the default setting.

Test Sharding

--shard-count <#number of shards>, --shard-index <#shard index to run>

Introduced in Catch2 3.0.1.

When --shard-count <#number of shards> is used, the tests to execute will be split evenly into the given number of sets, identified by indices starting at 0. The tests in the set given by --shard-index <#shard index to run> will be executed. The default shard count is 1, and the default index to run is 0.

Shard index must be less than number of shards. As the name suggests, it is treated as an index of the shard to run.

Sharding is useful when you want to split test execution across multiple processes, as is done with the Bazel test sharding.
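As an illustration of an even split, the sketch below assigns contiguous chunks of the test list to shards, giving one extra test to the first few shards when the count does not divide evenly. This is one possible scheme, shown under that assumption; `selectShard` is a hypothetical helper and the exact assignment Catch2 uses may differ.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Return the tests belonging to shard `shardIndex` out of `shardCount`.
// Preconditions: shardCount > 0 and shardIndex < shardCount.
std::vector<std::string> selectShard(const std::vector<std::string>& tests,
                                     std::size_t shardCount,
                                     std::size_t shardIndex) {
    const std::size_t base = tests.size() / shardCount;   // minimum per shard
    const std::size_t extra = tests.size() % shardCount;  // leftover tests

    // The first `extra` shards each take one leftover test.
    const std::size_t begin = shardIndex * base + std::min(shardIndex, extra);
    const std::size_t end = begin + base + (shardIndex < extra ? 1 : 0);

    return std::vector<std::string>(
        tests.begin() + static_cast<std::ptrdiff_t>(begin),
        tests.begin() + static_cast<std::ptrdiff_t>(end));
}
```

With the defaults (shard count 1, shard index 0), the single shard contains every test, which matches running without sharding.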

Allow running the binary without tests

--allow-running-no-tests

Introduced in Catch2 3.0.1.

By default, Catch2 test binaries return non-0 exit code if no tests were run, e.g. if the binary was compiled with no tests, the provided test spec matched no tests, or all tests were skipped at runtime. This flag overrides that, so a test run with no tests still returns 0.

Output verbosity

-v, --verbosity <quiet|normal|high>

Changing verbosity might change how many details Catch2's reporters output. However, you should consider changing the verbosity level as a suggestion. Not all reporters support all verbosity levels, e.g. because the reporter's format cannot meaningfully change. In that case, the verbosity level is ignored.

Verbosity defaults to normal.
