p8 - a multi-purpose hash function build around a stream cipher algorithm

=========================================================================

Abstract:
---------
In this paper I propose a new hash function concept which is build
around the stream cipher algorithm of SBox8 and it is called "p8". This
proposed hash function produces variable length hash output from 2 bytes
up to any desired byte length. The p8 hash function has several
advantages over many popularly known hash functions. Its efficiency and
speed, even in a non-optimised version, is comparable with widely used
and known SHA-256 hash function. In the light of recent attacks on MD5
and SHA-1 it seems that there is a serious need to consider other hash
function design strategies. Presented here is a new and very different
hash function design with a completely new internal structure and
concept. The p8 hash function is very simple and should potentially
rules out possible generic attacks on hash functions based on the
Merkle-Damgard construction. To my current knowledge the design criteria
of this hash function is different from all previously known hash
functions. One main difference is that the p8 hash function may be used
not only for hashing of files, but also as a key stretching function
(password based key derivation function) as well as for keyed message
authentication. If this hash function can be considered secure in ways
for cryptographic uses has to be based on future security analysis.

Basically p8 is in the class of Non-linear Table Lookup Hash Functions.

https://en.wikipedia.org/wiki/Hash_function#Hashing_by_nonlinear_table_lookup

Some of its highlights

- Every Input will be hashed to an Output of any arbitrary Byte Size
  using a shuffled 256 Byte Permutation, preferably to a Multiple of 2 Byte,
  like 16, 32, 64, 96, 128, 160, ...

- Generating an unbiased Output having a very good Avalanche Effect

- Hashing a Byte at a Time prevents Endianess Conversion

- It does not need any Padding of the Input

- Even if the Input consist of all Zero the resulting Hash will be
  different because the Length of the Input File define how often the
  Internal State is updated

- Additionally a 32bit Seed and a Keyword can be passed through

- By generating a keyed Permutation the Algorithm can be used as a PBKDF,
  as HMAC Function, as Signature Hash or to build a Stream or Block Cipher
 
- Non-Optimized C Code of the Algorithm run as fast as SHA-256

- Due to its Nature of generating an Output based on a shuffled 256 Byte
  Permutation it passes all Tests of the SMHasher Test Suite, as well as
  all Tests for Bias, ENT, TestU01 and the Book Stack Test

- Difficult to Cryptanalyse because the resulting Hash is generated with
  extensive Byte Swapping based in the Input


By constantly injecting the Byte Information of the Data to hash, the
Permutation of the internal State is being kept unique in the range of
all possible 256! Permutations.

Regarding collision finding, it seems to be infeasible crafting any
duplicate file that will produce the exact same hash.

In contrast to the collision finding for MD5 or the RC4-Based Hash
Function, the algorithm of p8 does not use any padding nor does it hash
the input in blocks. Instead a continuous injection on a one-byte basis
from the input is used to shuffle the internal state permutation of
sboxes as well as updating the key array. Therefore the permutation is
difficult to predict at any give state.

Additionally the final byte of the key array will be injected into the
internal state sboxes using the key schedule algorithm. After that parts
of the internal state, depending on the desired final hash size is
dropped before the output of the final hash.

An attacker would therefore need to find a way crafting a file that will
produce the same internal state and the same key array before applying
of the KSA. It is considered nearly impossible to generate such a second
file that will generate the same internal state, because every single
different bit of input will lead to a complete different internal state.


The only way imaginable would be crafting a "special" internal state and
a "special" key array that will lead to the same hash as a another
complete different combination of internal state and key array. The
assumption however is that it would be infeasible crafting such two
complete different files and have them generate the same internal state
and key array before applying the KSA.



When it comes to message authentication it is considered impossible to
compromise the generated hash because a secret pre-shared keyword (and
additionally a seed) is injected into the internal state before
generating the hash.

Obviously we know the start values of the initial state of the sboxes if
no secret seed or keyword is passed through. But as we do not know the
byte input to be hashed, we do not know how the internal state of the
sboxes evolve. And therefore we do not know the internal state of the
key array, because it evolve from the input and the internal state of
the sboxes.

Of course it would be possible to generate a rainbow table of certain
common possible byte input up to a certain length. But this is possible
for all known cryptographically secure hash functions, no matter how
complex they are designed.


Pseudo-Code:
------------
In short terms the following is the core part of the p8 hash function,
leaving out the pseudo-code of SBox8[1] - in the source code examples
the SBox8 algorithm is included however.

[1] SBox8 source code available here.


<?//========================================================================

  // All Variables used are 8bit unsigned Integer
  //
  // Define the Hash Size in Byte
  hSize := 16

  // Initialise the SBoxes and Internal State Variables
  key[256]  := 0
  SBox[256] := 0
  a := 0
  b := 0
  c := 0
  d := 0

  // Initialise the SBoxes
  SBox8_PIA(Seed)
 
  // If a Seed is passed, but no Key,
  // use the Seed additionally as Initial Key
  if Seed > 0 and not additionalKey
    // Build the 4 Byte SeedKey[] always Big Endian 0x04030201
    SeedKey[0..3] := Seed >> .. & 0xff;

    // Shuffle the Key into the SBox Permutation
    SBox8_KSA(SeedKey, 4)
  endif
 
  // Initialise the Counter j
  j = 0
 
  while not End Of Data
    // Inject the Data into the Internal State
    // and generate a unique Internal State
    // even if all Input Byte are Zero
    inByte := read Byte from STDIN
    
    a := (a + b + inByte) mod 256
        
    // Collect the Key Byte over the Length of the Input
    // by filling up the Key Array with Pseudo-Random Values
    // drawn from the continuously changing Internal State
    key[j] := key[j] XOR SBox[a]
    
    // Update the Internal State by calling the PRGA
    // and dropping the returned Value
    SBox8_PRGA()
  loop

  // First Drop of the complete Internal State
  // in order to mix it completely for once
  // in Case there were too few Input Byte for
  // a complete Update of the Internal State
  for i from 0 to 255
    SBox8_PRGA()
  end for
 
  // Update the Key Array by adding some Pseudo-Random Values
  // drawn from the Internal State because this way it is
  // guaranteed that we get a proper unique Internal State
  for i from 0 to 255
   key[i] := key[i] XOR SBox8_PRGA()
  end for

  // Shuffle the Key into the SBox Permutation
  SBox8_KSA(key[], size of key[])

  // If an additional Key was passed through
  if additionalKey
    // Second Drop of the complete Internal State
    for i from 0 to 255
      SBox8_PRGA()
    end for
 
    // Shuffle the additional Key into the SBox Permutation
    SBox8_KSA(additionalKey, length of additionalKey);
  endif

  // Final Drop of Parts of the Internal State
  // eight times the Byte Size of the final Hash
  // in order to mix up Elements once again.
  // Also useful for the Avalanche Effect
  for i from 1 to hSize*8
    SBox8_PRGA()
  end for

  // Final Hash Output by printing the returned Byte Values
  // of the desired Hash Size from the Internal State of the SBoxes
  final_hash := ""
 
  for i from 1 to hSize
    append hexDigit( SBox8_PRGA() ) as String to final_hash
  end for

  print final_Hash    
//========================================================================?>

Remarks:
--------
We should never simply output the final Internal State in order to
prevent following up Hash Values like

for (int i=0; i<hSize; i++) fprintf(stdout, "%02x", SBox[i]);

echo -en "test" | ./p8 16
db6e

echo -en "test" | ./p8 32
db6ed6c5

echo -en "test" | ./p8 48
db6ed62fa7d7

echo -en "test" | ./p8 64
db6ed62fa7d72cd2


but instead always move on the Internal State Update Process and then
draw the successive Byte Values that will form the Final Hash.

for (int i=0; i<hSize; i++) fprintf(stdout, "%02x", SBox8_PRGA());

echo -en "test" | ./p8 16
51c0

echo -en "test" | ./p8 32
b817b818

echo -en "test" | ./p8 48
124a018800f5

echo -en "test" | ./p8 64
58f77d5fd493f2a9




Test Results:
-------------
The p8 hash function has been tested for output quality with the
SMhasher[2] test suite. All test results as well as the used source code
of p8 can be found at the location mentioned below. You may need CMake[3]
in order to compile SMhasher.

The source code of p8 as example implementation for
ANSI-C and Python can be found here

http://www.freecx.co.uk/crypto/p8/



[1] The source code of SBox8 as example implementation for
ANSI-C, Java and Python can be found here
http://www.freecx.co.uk/crypto/SBox8/

[2] https://github.com/aappleby/smhasher


[3] https://cmake.org/download/

^ Top