Implement HMAC based on GOST 34.11-2012 Streebog-512 as well as a test
case for it. Both the PyGOST + hmac python module and the VeraCrypt HMAC
for Streebog-512 were used as references. The kernels expect the digests
to be in big-endian order according to the RFC examples for Streebog.
Fix two bugs from commit 224315dd62.
- Add hash-mode 11850: HMAC-Streebog-512 (key = $pass), big-endian
- Add test case for hash-mode 11850
- Bugfix for a3-pure Streebog kernels (modes 11700 and 11800)
- Rename a few Streebog constants in interface.h
Complete Streebog support with pure kernels that allow for passwords
longer than 64 characters. Provide generic inc_hash_streebog files
for future Streebog-based hash modes (HMAC, PBKDF2, VeraCrypt).
Include streebog support in the test suite. For this, python module
PyGOST is needed. Also add clarification to hash mode description
stating that Streebog hashes are expected in big-endian byte order.
There are several implementations, including PyGOST, which default
to little-endian byte order, while the RFC examples are big-endian.
- Add pure kernels for hash-mode 11700 (Streebog-256)
- Add pure kernels for hash-mode 11800 (Streebog-512)
- Tests: Add hash-modes 11700 (Streebog-256) and 11800 (Streebog-512)
Added hash-mode 16801 = WPA-PMKID-PMK
Renamed lot's of existing WPA related variables to WPA-EAPOL in order to distinguish them with WPA-PMKID variables
Renamed WPA/WPA2 to WPA-EAPOL-PBKDF2
Renamed WPA/WPA2 PMK to WPA-EAPOL-PMK
Something weird happend here, read on!
I've expected some performance drop because this algorithm is using the password data itself inside the iteration loop.
That is different to PBKDF2, which I've converted in mode 2100 before and which did not show any performance as expected.
So after I've finished converting this kernel and testing everything works using the unit test, I did some benchmarks to see how much the
performance drop is.
On my 750ti, the speed dropped (minimal) from 981kH/s -> 948kH/s, that's mostly because of the SIMD support i had to drop.
If I'd turn off the SIMD support in the original, the drop would be even less, that us 967kH/s -> 948kH/s which is a bit of a more reasable
comparison in case we just want to rate the drop that is actually caused by the code change itself.
The drop was acceptable for me, so I've decided to check on my GTX1080.Now the weird thing: The performance increased from 6619kH/s to
7134kH/s!!
When I gave it a second thought, it turned out that:
1. The GTX1080 is a scalar GPU so it wont suffer from the drop of the SIMD code as the 750ti did
2. There's a change in how the global data (password) is read into the registers, it reads only that amount of data it actually needs by using
the pw_len information
3. I've added a barrier for CLK_GLOBAL_MEM_FENCE as it turned out to increase the performance in the 750ti
Note that this kernel is now branched into password length < 40 and larger.
There's a large drop on performance where SIMD is really important, for example CPU.
We could workaround this issue by sticking to SIMD inside the length < 40 branch, but I don't know yet how this can be done efficiently.