Commit graph

2 commits

Author SHA1 Message Date
Evgenii
c3cfe63e64 Add the ability to use different implementations in the CLI
Also make API smaller and more consistent and fix typos in documentation.
2019-07-19 18:24:30 +03:00
Evgenii
c68e38b943 Inline asm function in loop for AVX2 implementation
Right now the AVX2 implementation loses to the C binding in speed.
This is probably because of two things:
1. Go does not inline `mulBitRightx2` in loop iteration.
2. `minmax` is loaded every time from memory.

In this PR:
1. Unroll `mulBitRightx2` manually and use `mulByteRightx2` instead.
2. Generate `minmax` in place without `LOAD/LEA` instructions.
2019-07-19 16:11:06 +03:00
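
The unrolling idea in c68e38b943 can be illustrated with a minimal pure-Go sketch. The real change is in AVX2 assembly, which is not shown in this log, so everything here is an assumption made for illustration: `stepBit` is a hypothetical stand-in for a per-bit routine such as `mulBitRightx2`, its body (a rotate plus a conditional XOR) is invented, and `processByteUnrolled` only mirrors the shape of a per-byte routine such as `mulByteRightx2`.

```go
package main

import "fmt"

// stepBit is a HYPOTHETICAL per-bit update, standing in for a routine
// like mulBitRightx2. The arithmetic is made up for illustration only.
func stepBit(state uint64, bit bool) uint64 {
	state = state<<1 | state>>63 // rotate left by one bit
	if bit {
		state ^= 0x87 // arbitrary constant, not taken from the source
	}
	return state
}

// processByteLoop calls stepBit once per bit, most significant bit first.
// If stepBit were an assembly function, each iteration would pay for a call.
func processByteLoop(state uint64, b byte) uint64 {
	for i := 7; i >= 0; i-- {
		state = stepBit(state, b&(1<<uint(i)) != 0)
	}
	return state
}

// processByteUnrolled performs the same eight steps with the loop
// unrolled by hand, so one call handles a whole byte.
func processByteUnrolled(state uint64, b byte) uint64 {
	state = stepBit(state, b&0x80 != 0)
	state = stepBit(state, b&0x40 != 0)
	state = stepBit(state, b&0x20 != 0)
	state = stepBit(state, b&0x10 != 0)
	state = stepBit(state, b&0x08 != 0)
	state = stepBit(state, b&0x04 != 0)
	state = stepBit(state, b&0x02 != 0)
	state = stepBit(state, b&0x01 != 0)
	return state
}

func main() {
	// Both variants must agree for every input byte.
	same := true
	for b := 0; b < 256; b++ {
		if processByteLoop(1, byte(b)) != processByteUnrolled(1, byte(b)) {
			same = false
		}
	}
	fmt.Println(same)
}
```

In pure Go the compiler can usually inline a small function like `stepBit` on its own. The commit message points at the case where the per-bit routine is written in assembly, which the Go compiler does not inline, so moving the whole per-byte sequence into a single routine removes the per-bit call overhead.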