This keeps it out of the main include path when benchmarking is
enabled, somewhat reducing the compilation-time penalty.
Also moved some other functions into the .cpp file, especially
helpers that could be given internal linkage, and concretized some
iterator-templated code that only ever used
`std::vector<double>::iterator`.
Changes done to Nonius:
* Moved things into "Catch::Benchmark" namespace
* Benchmarks were integrated with `TEST_CASE`/`SECTION`/`GENERATE` macros
* Removed Nonius's parameters for benchmarks, Generators should be used instead
* Added relevant methods to the reporter interface (default-implemented, to avoid
breaking existing 3rd party reporters)
* Async processing is guarded with `_REENTRANT` macro for GCC/Clang, used by default
on MSVC
* Added a macro `CATCH_CONFIG_DISABLE_BENCHMARKING` that removes all traces of
benchmarking from Catch