AMD HD 5870 - MD5 : 3185M/sec
I was curious enough to test SHA-1 performance today. As I expected, the benefit of bitalign is even more noticeable for SHA-1 than for MD5. Theoretically the speed-up can be as large as 50%; however, as always, there are some details.
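To show why bitalign matters here: both MD5 and SHA-1 are full of 32-bit rotations, and a bitalign-style instruction does a rotation in one operation instead of two shifts plus an OR. Below is a minimal Python sketch that only emulates the semantics. The function names are hypothetical, and the `bitalign` definition (take the 64-bit value hi:lo, shift it right, keep the low 32 bits) is my assumption about how the hardware instruction behaves, not code from ighashgpu.

```python
MASK32 = 0xFFFFFFFF

def bitalign(hi, lo, shift):
    """Emulated bitalign (assumed semantics): low 32 bits of (hi:lo) >> (shift & 31)."""
    return (((hi << 32) | lo) >> (shift & 31)) & MASK32

def rotl_plain(x, n):
    """Classic rotate-left: two shifts and an OR."""
    return ((x << n) | (x >> (32 - n))) & MASK32

def rotl_bitalign(x, n):
    """Rotate-left expressed as a single bitalign with both inputs equal to x."""
    return bitalign(x, x, 32 - n)

if __name__ == "__main__":
    x = 0x80000001
    print(hex(rotl_plain(x, 5)), hex(rotl_bitalign(x, 5)))
```

SHA-1 has more rotations per round than MD5 (two on the state plus one in the message expansion), which fits with the gain being larger there.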
At first, my SHA-1 in ighashgpu v0.62 wasn't good enough. By slightly changing the algorithm I got 15% better results. Then I added bitalign, which gave another 40% on HD 5XXX, and finally I removed the last 4 rounds from SHA-1 ("reversed" them, in other words). This last optimization was already done earlier for the CUDA code; now I've just applied it to the ATI code. It's another 5%. So, all in all, performance for single SHA-1 hashes on HD 5XXX is now 71% better than it was. Impressive, isn't it?
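For readers wondering what "reversing" the last 4 rounds means, here is a rough Python sketch of the general idea, not the actual ighashgpu kernel. In SHA-1 the value of `a` produced by round 75 (counting from 0) is only rotated and moved through the state during rounds 76-79, so it can be recovered from the last word of the target hash once, up front. Each candidate then needs only 76 rounds and one comparison. All function names are hypothetical, and the sketch assumes a single unsalted hash of a password that fits in one 64-byte block.

```python
import hashlib
import struct

MASK32 = 0xFFFFFFFF
IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)

def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK32

def pad_single_block(msg):
    """Pad a message of at most 55 bytes into one 64-byte SHA-1 block."""
    assert len(msg) <= 55
    return msg + b"\x80" + b"\x00" * (55 - len(msg)) + struct.pack(">Q", 8 * len(msg))

def sha1_rounds(block, rounds):
    """Run the first `rounds` SHA-1 rounds on one block; return (a, b, c, d, e)."""
    w = list(struct.unpack(">16I", block))
    for t in range(16, 80):
        w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    a, b, c, d, e = IV
    for t in range(rounds):
        if t < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif t < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif t < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        temp = (rotl(a, 5) + f + e + k + w[t]) & MASK32
        a, b, c, d, e = temp, a, rotl(b, 30), c, d
    return a, b, c, d, e

def full_sha1(msg):
    """Reference: all 80 rounds plus the final addition of the IV."""
    state = sha1_rounds(pad_single_block(msg), 80)
    return struct.pack(">5I", *((s + i) & MASK32 for s, i in zip(state, IV)))

def reversed_target(digest):
    """Done once per target: undo the IV addition and the rotl-by-30 on the last word."""
    h4 = struct.unpack(">5I", digest)[4]
    return rotl((h4 - IV[4]) & MASK32, 2)  # rotr by 30 == rotl by 2

def quick_match(candidate, target_a):
    """Per candidate: 76 rounds and one 32-bit comparison."""
    return sha1_rounds(pad_single_block(candidate), 76)[0] == target_a

if __name__ == "__main__":
    target = reversed_target(hashlib.sha1(b"password").digest())
    print(quick_match(b"password", target))
```

The quick check compares only 32 bits, so a candidate that passes still has to be confirmed with the full hash; false positives are rare enough that this costs almost nothing.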
As I feel too lazy to test all these major changes and big speed-ups, I've decided to release an intermediate version of ighashgpu (call it alpha?). You can download it here. Not all kernels were changed; basically only single MD5 and all SHA-1 related ones were updated. By the way, the ATI version now supports passwords (+ optional salt) up to 48 symbols long (== Joomla). The nVidia code wasn't updated for this.
And one more thing about the nVidia CUDA code: I've slightly changed the way passwords are distributed among threads / blocks. As a result, there is a small speed-up, around 2-3%, for all CUDA kernels. It doesn't look like a huge thing, but when utilization is already over 95%, these 2-3% are actually very nice.
Also, I've finally fixed the /sf + /m usage bug (I hope so, at least).
So, I'm interested in some feedback, especially results on 5970s and GTX285/295.