========================================================================
A Paper describing the zx8(ps) Stream Cipher Algorithm in Detail. +++ Preliminary Note: +++
echo $cHTML_Attention ?>
========================================================================= Abstract ======== A proposed algorithm in symmetric cryptography is described here. The intention on the proposed stream cipher is designing an efficient and secure algorithm for implementations in software and to eliminate the known weaknesses of the alleged RC4 keystream generator. It has been extended in its complexity on the cost of a minimal loss of speed against the alleged RC4 keystream generator. But it offer greater resistance against known key schedule and key related attacks, show no weakness regarding its pseudo-random byte output, even for huge amount of data, but still retaining a reasonable speed and simplicity of implementation effort. Cipher Type =========== Finite state random number generator Short Description ================= Consisting of two independent self-modifying substitution lookup tables Each lookup table contains a permutation of byte in the range (0..255) Each lookup table has 256! possible independent internal states Initialisation of both lookup tables is done by using a given key for swapping the element values and calling the pseudo-random generating algorithm (PRGA) 256 times on each of the 256 steps over both lookup tables Each permutation of the two lookup tables is key dependent and used for symmetric encryption and decryption Each output step of the PRGA produce a byte For encryption and decryption the output is used like a One-Time-Pad The quality of the output is keeping unpredictable, balanced, stable and unbiased even for huge amounts of data (up to 64GB recently tested) Mathematical Operations of the PRGA per Byte ============================================ 6 x Addition 4 x Swap 3 x XOR 1 x Cross Table Lookup Detailed Description of the KSA for Initialising the internal State =================================================================== Fill the lookup tables (z,x) with a sequential byte permutation for i=0 to i<256 z[i] = i x[i] = i end for Set the pseudo-random column selector (a) and the sequential column selector (b) to zero a = 0 b = 0 Walk through the entire byte length of both lookup tables (z,x) for i=0 to i<256 On each step calculate the index pointer of the given key byte (k) k = (i MOD KeyLength) On each step call the PRGA for 128 times and store the final pseudo-random byte output value in the temporary variable (t) for n=0 to n<128 t = PRGA() end for Calculate the new value of the next swapping index pointer (j) by adding its current byte value, the byte value of the temporary variable (t), the current byte value of the element at index position (i) from lookup table (z) and the element byte value of the key at index position (k) j = (t + j + z[i] + KeyWord[k]) MOD 256 The element with column value (i) of the lookup table (z) is swapped against the element with swapping index pointer value (j) swap(z[i], z[j]) Again step through the PRGA for 128 times and store the final pseudo-random byte output value in the temporary variable (t) for n=0 to n<128 t = PRGA() end for Calculate the new value of the next swapping index pointer (j) by adding its current byte value, the byte value of the temporary variable (t), the current byte value of the element at index position (i) from lookup table (x) and the lookup byte value from (z) which index was selected by the value from lookup table (x) having index value of the former swapping index pointer (j) j = (t + j + x[i] + z[x[j]]) MOD 256 The element with column value (i) of the lookup table (x) is swapped against the element with swapping index pointer value (j) swap(x[i], x[j]) End of walking through the entire byte length of both lookup tables (z,x) end for Reset the pseudo-random column selector (a) to zero and the sequential column selector (b) to one a = 0 b = 1 Detailed Description of the PRGA on generating one Byte of Output ================================================================= The byte value for the index pointer (n1) is calculated by adding the byte values of elements with column value (a) from the lookup tables (z,x) n1 = (z[a] + x[a]) MOD 256 The byte value for the index pointer (n2) is calculated by adding the byte values of elements with column value (b) from the lookup tables (z,x) n2 = (z[b] + x[b]) MOD 256 The element with column value (a) of the lookup table (z) is swapped against the element with index pointer value (n1) swap(z[a], z[n1]) The element with column value (a) of the lookup table (x) is swapped against the element with index pointer value (n2) swap(x[a], x[n2]) The byte value of the pseudo-random column selector (a) is calculated by adding its current byte value, the byte value of the sequential column selector (b) and the exclusive-or-calculation resulting byte values from the column sum (n1,n2) a = (a + b + (n1 XOR n2)) MOD 256 The byte value of the sequential column selector (b) is increased by 1 b = (b + 1) MOD 256 The element with column value (b) of the lookup table (z) is swapped against the element with index pointer value (n1) swap(z[b], z[n1]) The element with column value (b) of the lookup table (x) is swapped against the element with index pointer value (n2) swap(x[b], x[n2]) The byte value for selecting the final table lookup (y) is calculated by an exclusive-or-calculation of the element byte value from lookup table (z) with index value (n1) and the element byte value from lookup table (x) with index value (n2) y = (z[n1] XOR x[n2]) MOD 256 The byte value of the index pointer for masking the final table lookup result (m) is calculated by adding both byte values column sums (n1,n2) m = (n1 + n2) MOD 256 The value for the final output byte is calculated by performing an exclusive-or-calculation of the masking byte value (m) with the table lookup byte value from (z) which index was selected by the value from lookup table (x) having index value (y) output byte = (z[x[y]] XOR m) MOD 256 Design Rationale ================= First let us look how RC4 was and still is cryptanalysed over time since the algorithm is commonly known, remarkable steps have being made finding a way in breaking it. I like to mention the weak key problem, related key attacks and finally the solution to retrieve the secret key from the known internal state after the initialisation. Even though just one practical attack exist in our days (2015). When I am designing zx8 (ps) my intention was to avoid as much as possible that any of these known attacks on RC4 can be successful against zx8. For example if we look at the KSA of zx8 (ps) it becomes obvious that it take a great effort to fill these two lookup tables from a given key. Including the output of the PRGA within the key schedule process is quite something new to RC4 like stream ciphers, but one of the important steps in order to shuffle away any correlation between key and final internal state as much as possible. Furthermore by including the PRGA output a this early stage of recursively filling the two lookup tables a reverting and/or retrieving of the secret key after the KSA has finished become extremely difficult. Due to the usage of two separate lookup tables (z,x), a pseudo-random index column pointer (a), a sequential index column pointer (b), three pseudo-random index pointer (n1,n2,y) and an output masking value (m) we can take the advantage in letting the output of the PRGA at an early stage of the key scheduling process influence the complexity of the KSA byte shuffling. In fact it has the great effect that there is no need to drop any byte after the KSA has finished before using the keystream for encryption. This excessive shuffling at the key schedule stage is very important to ensure a great distance between key and internal state, because most attacks concentrate on finding some weakness here and therefore breaking the KSA and not the PRGA. Due to the fact that the KSA of zx8 (ps) however depend on calling the PRGA implies that attacking the KSA means also attacking the PRGA, which is a far more difficult task. Both lookup tables (z,x) get updated twice each independently on the call of the PRGA before every byte output in order to move the permutation of the internal state very fast. Even more, if an adversary could ever guess the complete internal state at any time correctly while the encryption process, he has to do a great effort just to calculate the previous keystream (of course he can now generate the future one) or finding the secret key. Another important point is not to use just one but instead *two* separate lookup tables, unlike some other RC4 derivatives like RC4A, RC4+ or VMPC. And it is also very important to update these two lookup tables independently, therefore not to mix them up and always keep them independent. Of course this means these two lookup tables need to get initialised and updated in a well balanced way - and exactly that is what I have achieved with zx8 (ps). Security ======== Reverse calculating of the key from the internal state after initialisation is computationally extremely difficult. There is no need for dropping any byte of the output first in order to increase security against correlation of the keystream and internal state or the key. Immediately after the initialisation process the output can be used for encryption and decryption. It is very difficult to know or guess where any particular byte value currently is in one of the two lookup tables. It also appear to be very difficult to know or guess which location in both lookup tables is used to select each next value in the sequences. Considered resistant against related key attacks due to the effect of recursively filling the lookup tables and swapping of each lookup table element at the initialisation process (KSA) based on a combination of adding an increasing value as swapping index pointer (j) as well as a pseudo-random value (t), which is calculated from the byte output of several calls of the pseudo-random generating algorithm (PRGA) on each step over the 2 x 256 lookup table elements. Attacking the KSA implies attacking the PRGA, because the KSA depend on the PRGA. No byte value of the internal state two lookup tables will be revealed at any time of the KSA or the PRGA output. There is currently no way known in revealing the internal state of an RC4 like stream cipher and therefore it is even more difficult to guess or reveal the internal state of zx8 (ps). The fact that the internal state of zx8 (ps) contains two independent lookup tables in addition to several index and column pointer which byte values are unknown at any time they are extremely difficult to guess by an adversary all together at a certain point, which make an attack on the PRGA infeasible or at least only applicable with an enormous computer power at very high costs. The output of the PRGA is to be considered unbiased and unpredictable even if it might be possible that we can distinguish it from true randomness.
<?//======================================================================== #################################################### Implementation Example for File Encryption/Decryption ===================================================== P = Plaintext K = memorable Keyword C = Ciphertext P_h = File Content Hash seed = keyed HMAC_md5 dkLen = 64 byte (512 bit) #i = Iteration IV = Initialization Vector R = Pseudo-random Data from external Source time = POSIX Timestamp with Millisecond Precision HMAC = with MD5Digest PRF = HMAC KBF = Keystream Byte Function key = Intermediate Key for KBF seeding to derive Ckey and Mkey Ckey = Plaintext/Ciphertext Encryption/Decryption Key Mkey = Message Authentication Key MAC = Message Authentication Code || = Concatenation of Strings Encryption ========== P_h = MD5(P) seed = HMAC(P_h, netinfo || fileinfo || time || R) || R IV = KBF(zx8, seed, 1024, dkLen) key = PBKDF2(PRF, K, IV, #i, dkLen) Ckey = KBF(zx8, key, 1024, dkLen) Mkey = KBF(zx8, key, 2048, dkLen) C = zx8(Ckey, P) MAC = HMAC(Mkey, IV || C) Decryption ========== key = PBKDF2(PRF, K, IV, #i, dkLen) Ckey = KBF(zx8, key, 1024, dkLen) Mkey = KBF(zx8, key, 2048, dkLen) MAC = HMAC(Mkey, IV || C)) P = zx8(Ckey, C) KBF Function Description: ========================= KBF(Algorithm, Key, Drop, Byte) Algorithm = zx8 Stream Cipher Key = defined Keyword Drop = Byte to drop after KSA Byte = Amount of Byte captured from generated Keystream r = Return Value zx8_KSA(Key) for i=1 to Drop zx8_PRGA() next for i=1 to Byte r = r || zx8_PRGA() next #################################################### Just some further simplified explanation on how zx8 (ps) works. Two more detailed step-by-step example explanations are to be found at http://www.freecx.co.uk/crypto/zx8/Docs/ In short terms it uses two arrays of S-Boxes (z,x) which has to be imagined one below the other. That way we can calculate the sum of S-Box elements from both rows (z,x) in column (a) for example into the index pointer (n1) and the S-Box elements in column (b) into the index pointer (n2). For example a = 4 b = 35 n1 = z[a] + x[a] n2 = z[b] + x[b] 0 a = 4 b = 35 255 | | | | +---------------------------------------------------------------+ z | 23|198| 7| 30|202| ... | 67| 15| 0| 71| 39|162| ... | 89| +---------------------------------------------------------------+ x |221| 4|224| 9| 45| ... | 8| 19|109|193| 52| 3| ... |235| +---------------------------------------------------------------+ n1 = 39 n2 = 34 n1 = 30 + 9 n2 = 15 + 19 The values of (n1,n2) are used later in further calculations but moreover they are also making sure that the two lookup tables never get in a synchronised state, which I consider very important. Now the element z[a] is swapped against z[n1] and the element x[a] is swapped against x[n2] swap(z[a], z[n1]) 30 <==> 162 swap(x[a], x[n2]) 9 <==> 8 0 a = 4 b = 35 255 | | | | +---------------------------------------------------------------+ z | 23|198| 7|162|202| ... | 67| 15| 0| 71| 39| 30| ... | 89| +---------------------------------------------------------------+ x |221| 4|224| 8| 45| ... | 9| 19|109|193| 52| 3| ... |235| +---------------------------------------------------------------+ n1 = 39 n2 = 34 As we can see the whole cipher algorithm is row and column based and the elements regarding the calculated index pointer are either swapped, used in further calculations or their value is used to select the next element from which a value is drawn. The values (a) and (b) are the column marker, where (a) is updated in a pseudo-random fashion and (b) increased by 1. So (a) jumps around and (b) moves straight forward and cycles around to 0 when it reaches over the right end. n1 = 39 n2 = 34 y = z[n1] XOR x[n2] 0 n2 = 34 n1 = 39 255 | | | | +--------------------------------------------------------------+ z | 23| ... | 73| 62|107|207| 45| 10|171| 29|184| ... | 71| 89| +--------------------------------------------------------------+ x |221| ... | 22| 17|245| 19| 11| 38| 12|251| 73| ... | 71|235| +--------------------------------------------------------------+ y = 10 XOR 22 And so on. It does a bit more swapping and calculating than RC4 and its derivatives RC4A, RC4+ and VMPC of course, but that's what I would see as a trade of speed and simplicity against far more security. Just another remark on those two variables (n1,n2) which play an important role throughout the whole PRGA. (n1,n2) are used in very different ways but maintain their initial values over the generation of every byte output. 1) they are important to keep both lookup tables in a stable, balanced and unsynchronised state 2) they are accessed several times in different calculations and as lookup index pointers 3) in each calculation they are involved the way they are used is never the same, either they are added, involved in a exclusive-or-calculation or acting as index pointer #################################################### # zx8 Stream Cipher: Bit Statistic Test Results Below some results running several bit statistic tests on zx8 (ps). These including tests for bias, a bit analysis testing for 1/0 ratio and deviance from unity, the ENT test and finally Maurer's Universal Statistical Test. Here now some example results: **************************************************** * SUMMARY OF TEST RESULTS (Bit Statistics Test) * **************************************************** Balanced = 0 Chi Square = 0 Bit = 1 (likely non-uniform) Bit = 0 (almost certainly non-uniform) Overall = 0 (likely non-uniform) Overall = 5 (might be non-uniform) SIGMA = 0 --------------------------------------------------- Start time : Fr Okt 19 16:40:18 GMT 2012 Finished time: Fr Okt 19 16:45:13 GMT 2012 ==================================================== Test Parameters ./zx8_bitstat_TestSeries.sh 100 25 256 65432912 Bit values 2**25 bit file size 256 bit key PRNG Start Parameters Seed = 65432912 --------------------------------------------------- Tests in Total = 100 Tests passed = 94 Tests not passed = 6 #################################################### Test No.: 1 ---------------------------------------------------- stdin contains 4194304 bytes (33554432 bits) BIT ANALYSIS Value Count Percentage ----- ----- ---------- one 16777356 50.000416 zero 16777076 49.999584 Bit Count Percentage --- ----- ---------- msb 7 2095831 49.968506 6 2096704 49.989319 5 2097777 50.014900 4 2097656 50.012016 3 2097546 50.009392 2 2096672 49.988556 1 2098375 50.029160 lsb 0 2096795 49.991489 1/0 ratio 1.000017 Deviance from unity 0.000017 #################################################### ENT Test No.: 1 ---------------------------------------------------- Entropy = 7.999960 bits per byte. Optimum compression would reduce the size of this 4194304 byte file by 0 percent. Chi square distribution for 4194304 samples is 234.65, and randomly would exceed this value 81.50 percent of the times. Arithmetic mean value of data bytes is 127.5153 (127.5 = random). Monte Carlo value for Pi is 3.140118733 (error 0.05 percent). Serial correlation coefficient is 0.000256 (totally uncorrelated = 0.0). #################################################### Maurer Test Test No.: 1 ---------------------------------------------------- L=8 256 258560 -- Expected value for L=8 is 7.1836656 7.1778 ********************************************************* 7.1837 ********************************************************* 7.1840 ********************************************************* 7.1810 ********************************************************* 7.1855 ********************************************************* 7.1882 ********************************************************** 7.1827 ********************************************************* 7.1839 ********************************************************* 7.1875 ********************************************************* 7.1856 ********************************************************* 7.1838 ********************************************************* 7.1864 ********************************************************* 7.1805 ********************************************************* 7.1830 ********************************************************* 7.1849 ********************************************************* 7.1796 ********************************************************* EOF; read 95745 blocks. 7.1799 ********************************************************* The source code for running these tests and some test results can be downloaded here http://www.freecx.co.uk/zx8/bitstat/ #################################################### # zx8 Stream Cipher: Randomness "Book Stack" Test Results Below some results running the "Book Stack" test written by Alexey Lubkin. [1] The test were run in comparison of zx8 (ps) to either RC4 or ZK-Crypt. In order to verify the correctness of the implementation in general and the results produced by zx8 (ps) an implementation of RC4 is included and the test runs produce the expected results for RC4 as describe in the paper - The experimental distinguishing attack on RC4 - [2] # Comparing zx8 against RC4 Table 1: -------- Number of files generated by zx8 and recognized as non-random (from 100) Test run with the book stack parameter "-w 16 -u 16 -q" zx8 (ps) +------------------------------------------------------------+ |Length(bit)|2**24 |2**25 |2**26 |2**27 |2**28 |2**29 |2**30 | +------------------------------------------------------------+ |Non-random | 10 | 5 | 4 | 5 | 6 | 4 | 7 | +------------------------------------------------------------+ zx8 (ps) # b=1 +------------------------------------------------------------+ |Length(bit)|2**24 |2**25 |2**26 |2**27 |2**28 |2**29 |2**30 | +------------------------------------------------------------+ |Non-random | 6 | 4 | 5 | 4 | 4 | 5 | 5 | +------------------------------------------------------------+ zx8 (ps) +--------------------------------------------------------------------------+ |Length(bit)|2**31 |2**32 |2**33 |2**34 |2**35 |2**36 |2**37 |2**38 |2**39 | +--------------------------------------------------------------------------+ |Non-random | 9 | 9 | 9 | 9 | 9 | 6 | 5 | 5 | 6 | +--------------------------------------------------------------------------+ zx8 (ps) # b=1 +--------------------------------------------------------------------------+ |Length(bit)|2**31 |2**32 |2**33 |2**34 |2**35 |2**36 |2**37 |2**38 |2**39 | +--------------------------------------------------------------------------+ |Non-random | 1 | 7 | 5 | 2 | 4 | 6 | 5 | 8 | | +--------------------------------------------------------------------------+ Table 2: -------- Number of files generated by RC4 and recognized as non-random (from 100) Test run with the book stack parameter "-w 16 -u 16 -q" RC4 +--------------------------------------------------------------------------+ |Length(bit)|2**31 |2**32 |2**33 |2**34 |2**35 |2**36 |2**37 |2**38 |2**39 | +--------------------------------------------------------------------------+ |Non-random | 4 | 6 | 10 | 23 | 44 | 70 | 98 | 99 | 100 | +--------------------------------------------------------------------------+ # Comparing zx8 against ZK-Crypt Table 3: -------- Number of files generated by zx8 and recognized as non-random (from 100) Test run with the book stack parameter "-w 32 -u 65536 -q" zx8 (ps) +-------------------------------------------------------------------+ |Length(bit)|2**24 |2**26 |2**28 |2**30 |2**31 |2**32 |2**33 |2**34 | +-------------------------------------------------------------------+ |Non-random | 1 | 3 | 4 | 7 | 5 | 7 | 8 | 4 | +-------------------------------------------------------------------+ zx8 (ps) # b=1 +-------------------------------------------------------------------+ |Length(bit)|2**24 |2**26 |2**28 |2**30 |2**31 |2**32 |2**33 |2**34 | +-------------------------------------------------------------------+ |Non-random | 3 | 3 | 3 | 0 | 4 | 3 | 5 | 3 | +-------------------------------------------------------------------+ Table 4: -------- Number of files generated by ZK-Crypt and recognized as non-random (from 100) This test had used the book stack parameter "-w 32 -u 65536 -q" Parameters and results of ZK-Crypt as described in [3] ZK-Crypt +-------------------------------------------------------------------+ |Length(bit)|2**24 |2**26 |2**28 |2**30 |2**31 |2**32 |2**33 |2**34 | +-------------------------------------------------------------------+ |Non-random | 25 | 51 | 97 | 100 | n/t | n/t | n/t | n/t | +-------------------------------------------------------------------+ [ n/t = not tested ] Table 5: -------- Number of files generated by zx8 and recognized as non-random (from #1000) Test run with the book stack parameter "-w 32 -u 65536 -q" zx8 (ps) +-------------------------------------------------------------------+ |Length(bit)|2**24 |2**26 |2**28 |2**30 |2**31 |2**32 |2**33 |2**34 | +-------------------------------------------------------------------+ |Non-random | 54 | 28 | 46 | 51 | 51 | 54 | 58 | 57 | +-------------------------------------------------------------------+ The source code for running these tests and some test results can be downloaded here http://www.freecx.co.uk/zx8/bookstack/ The book stack test itself is written by Alexey Lubkin http://web.ict.nsc.ru/~ldp/download.html References: ----------- [1] - On ZK-Crypt, Book Stack, and Statistical Tests - Doroshenko, Fionov, Lubki, Monarev, Ryabko http://eprint.iacr.org/2006/196 [2] - The experimental distinguishing attack on RC4 - Doroshenko, S. and Ryabko, B. (2006) http://eprint.iacr.org/2006/070 [3] - The distinguishing attack on ZK-Crypt cipher - by Alexey Lubkin and Boris Ryabko http://www.ecrypt.eu.org/stream/papersdir/076.pdf ^ Top The source code of zx8(ps) as example implementation for ANSI-C, Java, Python and PureBasic can be found here here The source code for running these tests and some test results can be downloaded here here A step-by-step explantation on how zx8(ps) works can be found here and here The book stack test itself is written by Alexey Lubkin here Copyright (c) 2012, Karl-Uwe Frank This Scheme, Source Code and Algorithm of zx8(ps) is released under the BSD 2-Clause License https://opensource.org/licenses/BSD-2-Clause Last update: 2016-04-03, 20:23 GMT |