tzhash

Evgenii Stratonikov c8a32b25ec Optimize AVX2 implementation		2021-12-29 13:23:05 +03:00

We use 6 instructions only to calculate mask based on single bit value.
Use only 3 now and calculate multiple masks in parallel.
Also `VPSUB` is faster than VPBROADCAST,
see https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html .

```
name                     old time/op    new time/op    delta
Sum/AVX2Inline_digest-8  1.83ms ± 0%    1.62ms ± 1%    -11.23%  (p=0.000 n=46+42)

name                     old speed      new speed      delta
Sum/AVX2Inline_digest-8  54.7MB/s ± 0%  61.6MB/s ± 1%  +12.65%  (p=0.000 n=46+42)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
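The commit message above mentions computing a mask from a single bit and preferring a subtraction over a broadcast. As a rough scalar illustration only (this is not the assembly from `avx2_amd64.s`, and the names `bitMask` and `condXor` are hypothetical), one common way to turn a bit into a full-width mask is to subtract it from zero:

```go
package main

import "fmt"

// bitMask returns all ones if bit i of b is set and zero otherwise.
// Hypothetical helper for illustration; not part of tzhash.
func bitMask(b byte, i uint) uint64 {
	bit := uint64(b>>i) & 1 // isolate the bit: 0 or 1
	return 0 - bit          // unsigned wrap-around: 1 becomes 0xFFFFFFFFFFFFFFFF
}

// condXor XORs x into acc only when the mask is all ones, without branching.
// Hypothetical helper for illustration; not part of tzhash.
func condXor(acc, x, mask uint64) uint64 {
	return acc ^ (x & mask)
}

func main() {
	fmt.Printf("%016x\n", bitMask(0b0000_0100, 2)) // prints ffffffffffffffff
	fmt.Printf("%016x\n", bitMask(0b0000_0100, 1)) // prints 0000000000000000
	fmt.Printf("%x\n", condXor(0xF0, 0x0F, bitMask(0b0000_0100, 2))) // prints ff
}
```

The vectorized code presumably applies the same idea across several lanes at once, which would match the "multiple masks in parallel" wording, but that reading is an assumption rather than something the listing confirms.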
avx.go	Alias gf127.GF127	2019-10-15 13:22:36 +03:00
avx2.go	Alias gf127.GF127	2019-10-15 13:22:36 +03:00
avx2_amd64.s	Optimize AVX2 implementation	2021-12-29 13:23:05 +03:00
avx2_inline.go	Alias gf127.GF127	2019-10-15 13:22:36 +03:00
avx_amd64.s	Replace all SSE instructions with AVX ones	2021-12-29 13:23:05 +03:00
avx_inline.go	Add AVX implementation with inlined multiplication	2019-10-16 15:11:53 +03:00
hash.go	Use golang.org/x/sys instead of self-implemented detector	2020-01-16 11:30:46 +03:00
hash_test.go	Update benchmark result in README.md	2019-10-16 15:11:57 +03:00
pure.go	Alias gf127.GF127	2019-10-15 13:22:36 +03:00
sl2.go	Remove non-AVX parts from avx package	2019-10-15 13:22:36 +03:00
sl2_test.go	Remove non-AVX parts from avx package	2019-10-15 13:22:36 +03:00