BytePad encryption

© Francisco Ruiz, 2022

This page discusses a simple way to extract entropy from any file so it can be used in a Vernam-style cipher. This can be very useful in practice: a 5 TB drive (about $100 in early 2022) holds enough bits to encrypt a high-definition video feed (about 1000 kbits/s) continuously for longer than a year. The catch is that those bits must be truly random, or at least appear random enough that no cryptanalysis is possible, per Shannon's criterion. This page is all about taking any file and turning its non-random bytes (which nevertheless contain a lot of entropy) into bytes that pass stringent randomness tests, so we can use them for encryption. A good source is any file that has already been shared with others, such as a photo or a document in the cloud.
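The figure for the drive can be checked with a little arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 kbit = 1000 bits), which the text does not spell out.

```python
# Back-of-the-envelope check of the drive-capacity figure.
# Decimal units are an assumption: 1 TB = 10**12 bytes, 1 kbit = 1000 bits.
DRIVE_BYTES = 5 * 10**12
FEED_BITS_PER_SECOND = 1000 * 1000   # about 1000 kbits/s

seconds = DRIVE_BYTES * 8 / FEED_BITS_PER_SECOND
days = seconds / 86_400              # seconds in a day

print(f"{days:.0f} days of continuous feed")
```

That comes to roughly 463 days when every byte of the drive ends up in the keystream. The optional xor step described below halves the keystream length, so with it the same drive would last about half as long.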

The algorithms used are:

  1. Columnar transposition: write down the byte stream in rows of a given length, then read the result by columns. This separates bytes that are correlated because they are near each other in the file. Separation is maximized if the row length is the square root of the block length.
  2. Lagged Fibonacci generator (LFG): write the last byte under the first byte, then fill in the rest of a second row from left to right, each new byte being the sum (mod 256) of the top and bottom bytes immediately to its left. The process can be repeated as many times as desired, always resulting in a byte array of the same length as the original. The effect is that every original byte affects all bytes that follow it if the process is applied once, or all bytes in the output if it is applied two or more times.
  3. Xor the two halves of the byte stream. This makes it almost impossible to reverse the above processes, which are intrinsically reversible, provided the byte stream prior to this step has good randomness statistics. It also reduces the length by a factor of two.
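The three operations above can be sketched in a few lines of Python. This is only one reading of the descriptions, not the code behind this page: the function names are made up for illustration, and the handling of an incomplete last row and of an odd number of bytes are assumptions.

```python
def transpose(data: bytes, row_len: int) -> bytes:
    """Write data in rows of row_len bytes, then read it off column by column."""
    # Column c holds the bytes at positions c, c + row_len, c + 2*row_len, ...
    # A short last row just leaves the rightmost columns one byte shorter
    # (assumption: the page may treat incomplete rows differently).
    return b"".join(data[c::row_len] for c in range(row_len))


def lfg(data: bytes) -> bytes:
    """One pass of the running-sum step, as read from the description."""
    if not data:
        return b""
    out = bytearray(len(data))
    out[0] = data[-1]                              # last byte goes under the first
    for i in range(1, len(data)):
        # Sum of the top and bottom bytes immediately to the left, mod 256.
        out[i] = (data[i - 1] + out[i - 1]) % 256
    return bytes(out)


def xor_halves(data: bytes) -> bytes:
    """Xor the first half with the second, halving the length."""
    half = len(data) // 2                          # an odd trailing byte is dropped (assumption)
    return bytes(a ^ b for a, b in zip(data[:half], data[half:2 * half]))


print(transpose(b"ABCDEFGHI", 3))                  # b'ADGBEHCFI'
print(list(lfg(bytes([1, 2, 3, 4]))))              # [4, 5, 7, 10]
print(list(xor_halves(bytes([1, 2, 3, 4]))))       # [2, 6]
```

In the second example the output starts with the last input byte (4), and each later byte is a running sum, which is why a change in one input byte carries forward to every byte after it.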

LFG operations tend to introduce spurious correlations that need to be ironed out by transposition. Also, bytes in a single LFG (except the last) only affect those that follow them. Therefore the basic algorithm is two LFGs with a transposition between them. The result is still reversible so, if you don't want that, apply the xor step (there's a checkbox for that below).
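To illustrate why the basic algorithm is still reversible, here is a minimal sketch of the LFG, transposition, LFG sequence together with its inverse. The function names, the square-root row length rounded down, and the treatment of incomplete rows are assumptions made for the example, not details taken from this page's code.

```python
import math


def transpose(data: bytes, row_len: int) -> bytes:
    # Rows of row_len bytes, read off by columns.
    return b"".join(data[c::row_len] for c in range(row_len))


def untranspose(data: bytes, row_len: int) -> bytes:
    # Put each byte back where transpose() took it from.
    out = bytearray(len(data))
    pos = 0
    for c in range(row_len):
        for i in range(c, len(data), row_len):
            out[i] = data[pos]
            pos += 1
    return bytes(out)


def lfg(data: bytes) -> bytes:
    # Last byte first, then running sums mod 256 (one reading of the text).
    if not data:
        return b""
    out = bytearray(len(data))
    out[0] = data[-1]
    for i in range(1, len(data)):
        out[i] = (data[i - 1] + out[i - 1]) % 256
    return bytes(out)


def unlfg(data: bytes) -> bytes:
    # Differences of neighbouring bytes undo the running sums.
    if not data:
        return b""
    out = bytearray(len(data))
    out[-1] = data[0]
    for i in range(1, len(data)):
        out[i - 1] = (data[i] - data[i - 1]) % 256
    return bytes(out)


def row_length(n: int) -> int:
    # Square root of the block length, rounded down (rounding is an assumption).
    return max(1, math.isqrt(n))


def forward(data: bytes) -> bytes:
    return lfg(transpose(lfg(data), row_length(len(data))))


def backward(data: bytes) -> bytes:
    return unlfg(untranspose(unlfg(data), row_length(len(data))))


sample = bytes(range(0, 250, 5))
mixed = forward(sample)
assert backward(mixed) == sample   # every step can be undone
```

Each step is a one-to-one mapping on byte arrays of a given length, so anyone who knows the output can recover the input. Only the xor of the two halves discards information.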

 

Step 1. Key file input

We begin by taking a file from the computer and loading it in the box below. Although invisible here, the file is loaded as a byte array. Uncheck the "Show as image" box to save memory for large files.

Key File

      Show as image

If there is any problem with the file, a warning will appear here
As an image:

If we use the result to encrypt another file, we can save computation by processing only the number of bytes we need rather than the whole file:

     Process only required bytes     Process whole file

We can process the whole file in one block, or split it into smaller blocks and process each one separately. The box below sets the block size in bytes (0 means process everything at once; 1 means no processing).

Block size

    Allowed values are integers 0 to file length (default 23).
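The block-size rule can be sketched as follows. The names `split_blocks` and `process_in_blocks` are hypothetical, and the byte-reversing `process` used in the example is only a stand-in for the real per-block operations. How this page treats a short last block, or carries anything over from one block to the next, is not stated here, so the sketch simply processes each block independently.

```python
from typing import Callable, List


def split_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Split data into blocks; a block size of 0 means a single block."""
    if not data:
        return []
    if not 0 <= block_size <= len(data):
        raise ValueError("block size must be between 0 and the file length")
    size = block_size or len(data)
    # The last block may be shorter than the others (assumption).
    return [data[i:i + size] for i in range(0, len(data), size)]


def process_in_blocks(data: bytes, block_size: int,
                      process: Callable[[bytes], bytes]) -> bytes:
    """Apply a length-preserving process to each block and rejoin the results."""
    return b"".join(process(block) for block in split_blocks(data, block_size))


def reverse_block(block: bytes) -> bytes:
    # Stand-in for the real mixing step, used only to make the blocks visible.
    return block[::-1]


print(process_in_blocks(b"abcdef", 3, reverse_block))   # b'cbafed'
print(process_in_blocks(b"abcdef", 0, reverse_block))   # b'fedcba'
print(process_in_blocks(b"abcdef", 1, reverse_block))   # b'abcdef'
```

With a block size of 1 every block is a single byte, which rearranging cannot change. That matches the note that 1 means no processing.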

Adding a repeating sequence at the start of the process for the first block helps with files of poor randomness. If you check the Whitening box, the sequence will also be xored in after processing each block, which adds extra security if the sequence is kept secret.

Initial sequence

    Whitening        Preferred values are integers 0 to 255, separated by commas (default 1), though it will take any list of integers.

We can keep reusing the key file so long as we "cut" it at a different point each time, just as a deck of cards is cut in two and the bottom part is then placed on top. The place for the cut is entered as a percentage of the total, from 0% (no cut) to 100% (again, no cut), decimals included. The cut operation is applied right at the start.

Cut location (percent)

%     Allowed values are 0 to 100. Decimals are OK.
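A cut is just a rotation of the byte array. The sketch below is a guess at the details: the function name is made up, the percentage is measured from the start of the file, and the cut point is rounded down to a whole byte.

```python
def cut(data: bytes, percent: float) -> bytes:
    """Move everything after the cut point to the front, like cutting a deck."""
    if not 0 <= percent <= 100:
        raise ValueError("cut location must be between 0 and 100")
    point = int(len(data) * percent / 100)   # rounding down is an assumption
    return data[point:] + data[:point]


print(cut(b"ABCDEFGHIJ", 30))    # b'DEFGHIJABC'
print(cut(b"ABCDEFGHIJ", 0))     # b'ABCDEFGHIJ'
print(cut(b"ABCDEFGHIJ", 100))   # b'ABCDEFGHIJ'
```

Both 0 and 100 leave the file as it was, in line with the description above.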


Step 2. Plain file / encrypted file

Here we can optionally input a second file, which also loads as a byte array. If the key file is too small to encrypt this file, a warning will appear above. If the file is encrypted and the key file and all settings are the same as for encryption, it will be decrypted below. A cut operation using the percentage below will be applied to the plain file content at the start, and the reverse cut to the resulting encrypted material at the end of the process. In order to overcome the malleability of the encryption process, you can make a message authentication code (MAC). The algorithm for this is explained below.

File to be encrypted/decrypted

               

Plain cut location (percent)

%     Allowed values are 0 to 100. Decimals are OK.



Step 3. Keystream file

The final step is to generate the "keystream file" by applying the operations selected above to the key file. We can encrypt a plaintext file (or decrypt an encrypted file) as a bonus, as soon as the button below is clicked. We can use the regular forward process or its reverse. The forward process is much better than the reverse at creating randomness.

     Forward     Reverse

And now, the all-important button. To make the process non-reversible, keep the first box checked, which splits the keystream into two halves, does a transposition on the first, and xors it with the second.

      Non-reversible      Randomness tests        

And here is the resulting keystream file, followed by some analysis of its randomness. You can save it with the button:

Keystream file

  

Information about keystream quality will appear here
As an image:

Step 4. Encrypted / decrypted file

And here is the second file after its bytes have been xored with an equal number of bytes from the keystream file, starting from the beginning. There is a button to save this one too, as well as one to make a message authentication code (MAC) for this file. For added security, take the MAC of the input file (button in Step 2 above) when encrypting and send it along with the encrypted file. The MAC of the decrypted file, taken with the button below, should be the same. The algorithm that makes the MAC is described below.

Encrypted/decrypted file
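The xor step, together with the plain cut from Step 2, can be sketched as below. The names are hypothetical, the cut point is rounded down to a whole byte as an assumption, and the keystream in the example is a fixed stand-in rather than anything derived from a key file.

```python
def cut(data: bytes, percent: float) -> bytes:
    point = int(len(data) * percent / 100)       # rounding down is an assumption
    return data[point:] + data[:point]


def uncut(data: bytes, percent: float) -> bytes:
    # Reverse cut: move the last `point` bytes back to the front.
    point = int(len(data) * percent / 100)
    split = len(data) - point
    return data[split:] + data[:split]


def xor_with_keystream(data: bytes, keystream: bytes) -> bytes:
    if len(keystream) < len(data):
        raise ValueError("keystream is too short for this file")
    return bytes(d ^ k for d, k in zip(data, keystream))


def crypt(data: bytes, keystream: bytes, plain_cut_percent: float = 0.0) -> bytes:
    """Cut, xor with the keystream from its beginning, then reverse the cut."""
    mixed = xor_with_keystream(cut(data, plain_cut_percent), keystream)
    return uncut(mixed, plain_cut_percent)


# Stand-in keystream for the example only; a real one comes from Step 3.
keystream = bytes((i * 37 + 11) % 256 for i in range(32))
message = b"attack at dawn"

encrypted = crypt(message, keystream, 25)
assert crypt(encrypted, keystream, 25) == message   # same call decrypts
```

Because xor is its own inverse and the cut is undone at the end, running the same function twice with the same keystream and cut percentage returns the original bytes. That is why one set of settings serves for both encryption and decryption.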

              

This is how the MAC is made: take the initial byte sequence, expand it to 32 bytes, and xor it with the first 32 bytes of the input file (no cut); then apply the LFG and a transposition with row length 6, xor the result with the first 32 bytes of the key bytes (after cut), and apply another LFG. Repeat the process for all the remaining file bytes, using the previous 32-byte result instead of the initial sequence, together with the following 32 bytes of the plain and key files, padded with zeros to 32 bytes if necessary. At the end, split the result in the middle and xor the two halves to get 16 bytes, which are then converted to hexadecimal for display.
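The description above leaves some details open, so the following is only a sketch of one possible reading, and it should not be expected to reproduce the values shown on this page. The function names are made up. The assumptions are that "expanding" the initial sequence means repeating it cyclically, that an incomplete row in the transposition just shortens the rightmost columns, and that `key_bytes` is the key file content after its cut.

```python
BLOCK = 32


def lfg(data: bytes) -> bytes:
    # Last byte first, then running sums mod 256 (one reading of the text).
    out = bytearray(len(data))
    out[0] = data[-1]
    for i in range(1, len(data)):
        out[i] = (data[i - 1] + out[i - 1]) % 256
    return bytes(out)


def transpose(data: bytes, row_len: int) -> bytes:
    # Rows of row_len bytes, read off by columns.
    return b"".join(data[c::row_len] for c in range(row_len))


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def expand(seq: bytes) -> bytes:
    # Repeat the sequence cyclically up to 32 bytes (assumption).
    if not seq:
        return bytes(BLOCK)
    return bytes(seq[i % len(seq)] for i in range(BLOCK))


def mac_sketch(plain: bytes, key_bytes: bytes, initial: bytes = b"\x01") -> str:
    """Return a 16-byte tag as 32 hexadecimal characters."""
    state = expand(initial)
    for start in range(0, len(plain), BLOCK):
        # Short final chunks are padded with zeros to 32 bytes.
        plain_chunk = plain[start:start + BLOCK].ljust(BLOCK, b"\0")
        key_chunk = key_bytes[start:start + BLOCK].ljust(BLOCK, b"\0")
        state = xor(state, plain_chunk)
        state = transpose(lfg(state), 6)
        state = xor(state, key_chunk)
        state = lfg(state)
    half = BLOCK // 2
    return xor(state[:half], state[half:]).hex()


tag = mac_sketch(b"attack at dawn", bytes(range(64)))
print(tag)
```

The sender would compute the tag over the plain file and send it along; the receiver recomputes it over the decrypted file with the same key bytes and initial sequence and compares the two strings.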