This removes about 200 pointless copies when printing the help
message (the original motivation for the change), and also nicely
improves the performance of the various reporters that depend on
TextFlow.
The code is now an even worse mess than before, due to the ad-hoc
implementation of a Result-ish type based on virtual functions in
Clara, but it has dropped the allocations for an empty binary down
to 151.
Because TokenStream is copied around a lot, moving it to use
`StringRef` removes _a lot_ of allocations per `Opt` in the parser.
Args are not copied around much, but changing them as well makes it
obvious that they do not participate in ownership.
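As a rough illustration (not Clara's actual definitions, and assuming the
Catch2 v3 header layout), a token that stores a `StringRef` only views the
underlying argument storage, so copying tokens around no longer allocates:
```cpp
#include <catch2/internal/catch_stringref.hpp>

// Illustrative only: copying a Token (or a whole TokenStream of them) just
// copies a pointer + length pair instead of a std::string.
struct Token {
    enum class Type { Option, Argument };
    Type type;
    Catch::StringRef token; // non-owning view into argv, not an owned string
};
```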
The changes add up to removing ~180 allocations for an "empty"
invocation of the test binary.
(`./tests/ExtraTests/NoTests --allow-running-no-tests -o /dev/null`
is down to 317 allocs)
This prevents the full construction from being O(N^2) in the number
of `Opt`s, and also significantly reduces the number of allocations
for running no tests:
`tests/SelfTest`: 7705 -> 6095
`tests/ExtraTests/NoTests`: 2215 -> 605
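To illustrate why the construction used to be quadratic, here is a toy model
(not Clara's actual code) of composing a parser from options:
```cpp
#include <string>
#include <utility>
#include <vector>

struct Opt { std::string spec; };
struct Parser { std::vector<Opt> opts; };

// Copying the accumulated parser on every `|` means building it from N
// options copies O(N^2) options in total.
Parser operator|( Parser const& p, Opt o ) {
    Parser copy = p;                      // copies every previously added Opt
    copy.opts.push_back( std::move( o ) );
    return copy;
}

// An rvalue overload that appends in place keeps construction linear.
Parser operator|( Parser&& p, Opt o ) {
    p.opts.push_back( std::move( o ) );
    return std::move( p );
}
```
With the second overload, a chain like `Parser{} | opt1 | opt2 | ...` moves
the accumulated parser through each step instead of copying it.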
* Clara::Opt::getHelpColumns returns a single item
It could never return multiple items, but for some reason it
was wrapping that single item in a vector.
* Use ReusableStringStream in Clara
* Reserve HelpColumns ahead of time
* Use StringRef for descriptions in HelpColumn type
The combination of these changes ends up removing about 7% (~200)
of allocations when Catch2 has to prepare output for `-h`.
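A hypothetical sketch of the shape of these changes (the real types in Clara
may differ):
```cpp
#include <string>
#include <catch2/internal/catch_stringref.hpp>

// One help entry per Opt, with the description held as a non-owning view.
struct HelpColumns {
    std::string left;              // e.g. "-o, --out <filename>"
    Catch::StringRef descriptions; // StringRef instead of std::string
};

// before: std::vector<HelpColumns> getHelpColumns() const; // always size 1
// after:  HelpColumns getHelpColumns() const;
```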
There is no good reason for these to be `std::string`s, as they
are just (optional) constants for nice user output. This ends up
reducing allocations significantly.
Measuring allocations when running no tests, the changes are:
`tests/SelfTest`: 9213 -> 7705
`tests/ExtraTests/NoTests`: 3723 -> 2215
By moving to our `uniform_integer_distribution`, which is
reproducible across different platforms, instead of the stdlib one,
which is not, we can provide reproducible results for `float`s
and `double`s. There is still no reproducibility for `long double`s,
because those differ too much across platforms.
* Utility for extended multiplication: n bits x n bits -> 2n bits (sketched below)
* Utility to adapt output from a URBG to a target (unsigned) integral
type
* Utility to reorder signed values into an unsigned type while keeping
their order
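A minimal sketch of the widening-multiplication utility, assuming 64-bit
operands and splitting them into 32-bit halves so no intermediate product
overflows (names and interface here are illustrative, not the actual Catch2
helpers):
```cpp
#include <cstdint>

struct UInt128 { std::uint64_t upper, lower; };

UInt128 extendedMult( std::uint64_t lhs, std::uint64_t rhs ) {
    const std::uint64_t lhs_lo = lhs & 0xFFFFFFFF, lhs_hi = lhs >> 32;
    const std::uint64_t rhs_lo = rhs & 0xFFFFFFFF, rhs_hi = rhs >> 32;

    const std::uint64_t lo_lo = lhs_lo * rhs_lo;
    const std::uint64_t hi_lo = lhs_hi * rhs_lo;
    const std::uint64_t lo_hi = lhs_lo * rhs_hi;
    const std::uint64_t hi_hi = lhs_hi * rhs_hi;

    // Sum the overlapping middle terms; this cannot overflow 64 bits.
    const std::uint64_t cross = ( lo_lo >> 32 ) + ( hi_lo & 0xFFFFFFFF ) + lo_hi;

    return { hi_hi + ( hi_lo >> 32 ) + ( cross >> 32 ),   // upper 64 bits
             ( cross << 32 ) | ( lo_lo & 0xFFFFFFFF ) };  // lower 64 bits
}
```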
Specifically, we add
* `gamma(a, b)`, which returns the magnitude of the largest 1-ULP
step in the range [a, b].
* `count_equidistant_float(a, b, distance)`, which returns the
number of equidistant floats in the range [a, b] (both sketched below).
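A rough sketch of the underlying idea (not the actual implementation, and
ignoring the precision and overflow concerns the real helpers must handle):
```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// The largest 1-ULP step inside [a, b] sits next to the endpoint with the
// larger magnitude, so it is the bigger of the step up from a and the step
// down from b.
double gamma_sketch( double a, double b ) {
    const double inf = std::numeric_limits<double>::infinity();
    const double step_up   = std::nextafter( a, inf ) - a;
    const double step_down = b - std::nextafter( b, -inf );
    return std::fmax( step_up, step_down );
}

// Number of floats spaced `distance` apart in [a, b], counting both endpoints.
std::uint64_t count_equidistant_sketch( double a, double b, double distance ) {
    return static_cast<std::uint64_t>( ( b - a ) / distance ) + 1;
}
```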
Together with liberal use of the `_sr` UDL to convert string
literals into StringRefs at compile time, this will reduce the number
of allocations and remove most of the strcpy calls inherent in
converting string literals into `std::string`s.
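For example, assuming the Catch2 v3 internal `catch_stringref.hpp` header,
usage looks like this:
```cpp
#include <catch2/internal/catch_stringref.hpp>

using Catch::StringRef;
using Catch::operator""_sr;

// The pointer/length pair refers directly to the string literal: no strlen,
// no allocation, no strcpy at runtime.
constexpr StringRef helpText = "display usage information"_sr;
```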
Because the issue comes from the expansions of `UNSCOPED_INFO`,
surrogate TUs could not catch this bug, and in common usage, the
include transitively comes from `catch_test_macros.hpp`.
Fixes #2758
Technically, the declaration should not have a space between
the quotes and the underscore, because `_foo` is a reserved
identifier, while `""_foo` is not. In general the spaced form works,
but newer Clang versions warn about it, because WG21 wants to
deprecate and later remove that form completely.
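For illustration, with a made-up `_foo` suffix:
```cpp
#include <cstddef>

struct Foo {}; // illustrative result type

// Deprecated spelling: the space makes `_foo` a standalone identifier, which
// is reserved; newer Clang versions warn about this form.
Foo operator"" _foo( const char*, std::size_t );

// Preferred spelling: `""_foo` is a single literal suffix identifier and is
// not reserved.
Foo operator""_foo( const char*, std::size_t );
```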
The basic idea was to reduce the number of things dependent on the `Clock`
type. To that end, I replaced `Duration<Clock>` with an `IDuration` typedef
for `std::chrono::nanoseconds`, and `FloatDuration<Clock>` with an `FDuration`
typedef for `std::chrono::duration<double, std::nano>`. We can generally
assume that any clock's duration can be expressed in nanoseconds, as long as
we insert `duration_cast`s into the right places.
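A minimal sketch of the aliases and of converting at the measurement boundary
(the `measure_once` helper is made up for illustration):
```cpp
#include <chrono>

using IDuration = std::chrono::nanoseconds;                  // integral nanoseconds
using FDuration = std::chrono::duration<double, std::nano>;  // floating-point nanoseconds

// Only the code taking the timestamps needs to know the Clock; everything
// downstream can work in plain nanoseconds.
template <typename Clock>
IDuration measure_once() {
    const auto start = Clock::now();
    // ... code under measurement ...
    const auto end = Clock::now();
    return std::chrono::duration_cast<IDuration>( end - start );
}
```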
Note that we cannot remove all dependence on `Clock` as a template
argument, because the functions that actually measure the elapsed time
have to use the Clock.
We also changed some template function arguments to pass plain function
pointers, so that the actual implementation can be placed into a cpp file.
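A hypothetical illustration of that pattern (names and signatures made up):
```cpp
#include <chrono>

using FDuration = std::chrono::duration<double, std::nano>;

// Taking a plain function pointer instead of a deduced callable type means
// the function no longer needs to be a template, so its definition can move
// out of the header and into a .cpp file.
using sample_fn = FDuration ( * )();

FDuration run_and_analyse( sample_fn sample, int resamples ); // defined in a .cpp file

// before, the deduced callable forced the definition into the header:
// template <typename Fun>
// FDuration run_and_analyse( Fun&& sample, int resamples );
```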
This means that the user will see the estimated total benchmark
running time as soon as it is available, unlike now, when it often
only ends up flushed after the benchmark has fully finished.
In practice, the user will almost immediately see the start
of a table like this:
```
benchmark name                       samples       iterations    estimated
                                     mean          low mean      high mean
                                     std dev       low std dev   high std dev
-------------------------------------------------------------------------------
Fill vector generated                          100            54   3.0834 ms
```
This is a significant improvement in user experience, especially
for long-running benchmarks.