Commit Graph

2 Commits

Author SHA1 Message Date
Martin Hořeňovský
d99eb8bec8 Optimize 64x64 extended multiplication implementation
Now we use intrinsics when possible, and fallback to optimized
implementation in portable C++. The difference is about 4x when
we can use intrinsics and about 2x when we cannot.

This should speed up our Lemire's algorithm implementation nicely.
2024-04-03 13:28:25 +02:00
Martin Hořeňovský
04a829b0e1 Add helpers for implementing uniform integer distribution
* Utility for extended mult n x n bits -> 2n bits
* Utility to adapt output from URBG to target (unsigned) integral
  type
* Utility to reorder signed values into unsigned type while keeping
  the order.
2023-12-10 19:53:36 +01:00