2 commits

Author SHA1 Message Date
Evgenii Stratonikov
3de3046074 tz: optimize AVX2 implementation
1. Perform masking with 2 instructions instead of 3 (use arithmetic
   shift).
2. Broadcast the data byte in one instruction at the start of byte processing.
3. Reorder instructions to reduce data hazards and resource
   contention.
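
Editorial illustration, not part of the commit: the arithmetic-shift masking idea from item 1 can be sketched in scalar Go. The function names `maskLogical` and `maskArithmetic` and the constant `poly` are hypothetical, and the actual AVX2 instruction sequence used in the commit is not reproduced here. An arithmetic right shift copies the sign bit into every position, so it yields an all-ones or all-zeros mask without a separate negate step.

```go
package main

import "fmt"

// maskLogical derives the mask by isolating the top bit with a logical
// shift and then negating it (0 -> 0x00, 1 -> 0xFF).
func maskLogical(b uint8) uint8 {
	bit := b >> 7  // logical shift: 0 or 1
	return 0 - bit // unsigned wrap-around: 0x00 or 0xFF
}

// maskArithmetic derives the same mask with one arithmetic shift:
// reinterpreting the byte as signed makes >> replicate the top bit.
func maskArithmetic(b uint8) uint8 {
	return uint8(int8(b) >> 7)
}

func main() {
	const poly uint8 = 0x1B // hypothetical reduction constant, for illustration only
	acc := uint8(0x55)
	for _, b := range []uint8{0x80, 0x7F} {
		m := maskArithmetic(b)
		// Branchless conditional XOR: poly is applied only when the top bit of b is set.
		fmt.Printf("b=%#02x mask=%#02x acc^(mask&poly)=%#02x\n", b, m, acc^(m&poly))
	}
}
```

Both helpers return the same mask for every byte value; the second simply needs one fewer operation, which is the kind of saving item 1 describes.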

```
name               old time/op    new time/op    delta
Sum/AVX2_digest-8    1.39ms ± 0%    1.22ms ± 0%  -12.18%  (p=0.000 n=9+7)

name               old speed      new speed      delta
Sum/AVX2_digest-8  71.7MB/s ± 0%  81.7MB/s ± 0%  +13.87%  (p=0.000 n=9+7)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-03-22 12:25:13 +03:00
Evgenii Stratonikov
0e0d28e82f tz: use build tags for different implementations
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-03-21 12:30:08 +03:00
Renamed from tz/avx2_amd64.s