Bitslice DES with support for 3-input logical operations
Taking advantage of 3-input logical instructions
Matthew Kwan's code is a sequence of strictly two-input logical instructions. However, some SIMD instruction set architectures support a kind of three-input instruction, variously called MUX, SELB (select bits), or 3-source bitwise select. Expressed in a language like C, it computes:
D = (A & ~C) | (B & C);
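As a minimal sketch of the select operation described above, the following portable C models it on 64-bit words, where each bit position is treated independently (as in a bitslice implementation). The helper names `mux` and `mux_two_input` are hypothetical and chosen for illustration only. The second function spells the same expression out as single operations so the instruction count is visible.

```c
#include <stdint.h>

/* Hypothetical helper: 3-input bitwise select (MUX) on a 64-bit word.
   For every bit position, take the bit from b where c is 1, else from a. */
static inline uint64_t mux(uint64_t a, uint64_t b, uint64_t c)
{
    return (a & ~c) | (b & c);
}

/* The same function written as one operation per step, to make the
   cost visible when no select instruction is available: four steps. */
static inline uint64_t mux_two_input(uint64_t a, uint64_t b, uint64_t c)
{
    uint64_t t1 = ~c;      /* 1: NOT */
    uint64_t t2 = a & t1;  /* 2: AND */
    uint64_t t3 = b & c;   /* 3: AND */
    return t2 | t3;        /* 4: OR  */
}
```

To my understanding, this is the operation that intrinsics such as `vec_sel(a, b, c)` on AltiVec and `spu_sel(a, b, c)` on the SPU perform in a single instruction. The sketch itself uses only plain C, so it runs anywhere.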
Instruction set architectures that support a MUX instruction include the following.
- VMX / AltiVec (PowerPC; IBM / Freescale)
- SSE5 (AMD64; AMD)
- NEON (ARMv7; ARM)
- SPU ISA (Cell BE SPE; IBM / Sony / Toshiba)
With two-input logical operators this cannot be done in a single instruction and takes four instructions, so there is no reason not to use it.
Marc Bevand had already created an implementation for Cell before this one. Its generator recursively searches in round-robin (brute-force) fashion and chooses the best selects, and I think that approach has a certain versatility and many possible applications.
Here, by using MUX more aggressively to reduce the gate count, I achieved an average of fewer than 40 gates (Cell BE SPU). The Cell and AltiVec implementations are about 20% faster overall than the ones based on Matthew Kwan's code.
Comparison of S-box gate counts
File | S1 | S2 | S3 | S4 | S5 | S6 | S7 | S8 | Average | Remarks |
sboxes.c (M. Kwan) | 63 | 56 | 57 | 42 | 62 | 57 | 57 | 54 | 56.0 | standard gate |
nonstd.c (M. Kwan) | 56 | 50 | 53 | 39 | 56 | 53 | 51 | 50 | 51.0 | non-standard gate |
best.c (M. Bevand) | 50 | 46 | 46 | 34 | 50 | 47 | 46 | 45 | 45.5 | SPU (Cell BE) |
sboxes-alti.c | 45 | 42 | 43 | 32 | 44 | 43 | 42 | 41 | 41.5 | AltiVec (PowerPC) |
sboxes-spu.c | 44 | 41 | 41 | 30 | 42 | 41 | 40 | 40 | 39.9 | SPU (Cell BE) |
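The averages in the table can be re-derived from the per-S-box counts. The sketch below does that in C. The array layout and the function names `average_gates` and `gate_reduction` are my own illustrative choices. Note that it measures the reduction in gate count only, which is not the same quantity as the roughly 20% performance improvement reported above.

```c
#define NUM_SBOXES 8

/* Gate counts for S1..S8, copied from the comparison table. */
static const int gate_counts[5][NUM_SBOXES] = {
    {63, 56, 57, 42, 62, 57, 57, 54},  /* 0: sboxes.c (M. Kwan)     */
    {56, 50, 53, 39, 56, 53, 51, 50},  /* 1: nonstd.c (M. Kwan)     */
    {50, 46, 46, 34, 50, 47, 46, 45},  /* 2: best.c (M. Bevand)     */
    {45, 42, 43, 32, 44, 43, 42, 41},  /* 3: sboxes-alti.c          */
    {44, 41, 41, 30, 42, 41, 40, 40},  /* 4: sboxes-spu.c           */
};

/* Mean number of gates per S-box for one table row. */
static double average_gates(const int counts[NUM_SBOXES])
{
    int total = 0;
    for (int i = 0; i < NUM_SBOXES; i++)
        total += counts[i];
    return (double)total / NUM_SBOXES;
}

/* Fractional reduction in average gate count of `newer` relative to `older`. */
static double gate_reduction(const int older[NUM_SBOXES],
                             const int newer[NUM_SBOXES])
{
    double before = average_gates(older);
    double after = average_gates(newer);
    return (before - after) / before;
}
```

For example, `average_gates(gate_counts[4])` gives 39.875, which the table rounds to 39.9, and `gate_reduction(gate_counts[0], gate_counts[4])` comes out a little under 0.29.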
Download
I distribute versions optimized specifically for Cell SPU and for AltiVec. They are dual-licensed under the GNU GPL and a BSD-style license.
4th Edition (Updated 2009.01.02)
The archive is now split into separate SPU and AltiVec versions.
For the AltiVec version, an improvement to the algorithm reduces the S2 function by 2 gates. For the SPU version, the code was fine-tuned: the third argument a3 passed to the S4 function is now passed bit-inverted, by changing spu_xor to spu_eqv on the caller side, so that inside the function x1 = a3 can be used as is, which saves one instruction.
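The spu_xor → spu_eqv tuning described above can be illustrated with a small portable sketch. This is my reading of the idea, not the actual S4 code: `fragment_original` and `fragment_tuned` are hypothetical stand-ins for an S-box whose first step complements its argument, and `eqv` models what I assume spu_eqv computes, namely the bitwise complement of XOR.

```c
#include <stdint.h>

/* Two-input XNOR ("equivalence"): assumed model of spu_eqv, ~(a ^ b). */
static inline uint64_t eqv(uint64_t a, uint64_t b)
{
    return ~(a ^ b);
}

/* Hypothetical S-box fragment that first complements its argument. */
static inline uint64_t fragment_original(uint64_t a3, uint64_t other)
{
    uint64_t x1 = ~a3;          /* costs one instruction */
    return x1 & other;
}

/* Same fragment, expecting the caller to pass a3 already inverted. */
static inline uint64_t fragment_tuned(uint64_t a3_inverted, uint64_t other)
{
    uint64_t x1 = a3_inverted;  /* plain rename, no instruction needed */
    return x1 & other;
}

/* Caller side: the S-box input is data XOR key, so the inversion can be
   folded into the key mixing by using XNOR instead of XOR. */
static inline uint64_t call_original(uint64_t data, uint64_t key, uint64_t other)
{
    return fragment_original(data ^ key, other);
}

static inline uint64_t call_tuned(uint64_t data, uint64_t key, uint64_t other)
{
    return fragment_tuned(eqv(data, key), other);
}
```

Both call paths return the same value for any inputs, but the tuned one performs one operation fewer, because XNOR costs the same as XOR when it is available as a single instruction.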
- For PowerPC VMX / AltiVec
- For Cell BE SPU (SPE)
- For AMD SSE5
If you would like different licensing terms, please contact me.
Software using the enhanced Bitslice DES
- Vectripper by Mr. Tmkk
- It uses my core AltiVec routine (sboxes-alti.c), as of December 27, 2008.