BookPad encryption

© Francisco Ruiz, 2015

This page implements the paper-and-pencil "BookPad" cipher by F. Ruiz for those who wish to use a computer as a convenience. All steps can be performed by hand without excessive effort. The process is described in detail in this article.

BookPad uses ordinary text taken from a book or similar source to generate a one-time pad of sorts (hence the name), which is then used to encrypt a message. Theoretical unbreakability is approached when the key text has the same entropy as a random string of the same length as the encoded message. BookPad uses a straddling checkerboard to encode text into decimal digits, each of which carries log(10)/log(2) = 3.32193 bits of entropy. The encoded decimal strings are typically 1.33 times the length of the original English text (the ratio is similar for other Western languages). Since English text, according to Shannon, has an average entropy of 1.58 bits per character, meeting the information-theoretic criterion for perfect secrecy would require a key text that is 3.32193 x 1.33 / 1.58 = 2.8 times the length of the plaintext to be encrypted. To make sure that sufficient entropy is collected, BookPad takes a piece of encoded key text that is three times the length of the encoded plaintext, then hashes it (optionally together with a serial) down to about twice the message length before generating a keystream with a lagged Fibonacci generator.
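The arithmetic behind the 2.8 figure can be checked with a few lines of Python. This is only a sketch of the calculation; the constant and function names are made up for illustration, and the 1.33 and 1.58 figures are simply the ones quoted in the text.

```python
import math

BITS_PER_DIGIT = math.log2(10)  # entropy of one uniformly random decimal digit
EXPANSION = 1.33                # encoded digits per plaintext character (figure from the text)
BITS_PER_CHAR = 1.58            # entropy of English per character (figure from the text)

def key_text_ratio(bits_per_digit=BITS_PER_DIGIT,
                   expansion=EXPANSION,
                   bits_per_char=BITS_PER_CHAR):
    """Key-text characters needed per plaintext character (hypothetical helper)."""
    return bits_per_digit * expansion / bits_per_char

print(round(BITS_PER_DIGIT, 5))    # 3.32193
print(round(key_text_ratio(), 1))  # 2.8
```

Rounding the ratio up to 3 is what gives the "three times the length" rule used for the key text.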

The process is nearly identical for encryption and decryption. The difference is that a ciphertext may end in an extra digit that could not occur in encoded plaintext; when the ciphertext is converted to letters, this extra digit is written as ! or ?. Numbers in the plaintext are converted to letters and then encoded.

 

Step 1. Encoding set-up

To cover the case where the encoding scheme depends on the key text taken from a book or similar source, let us enter the key text ahead of everything else, in the box below, which is shaded blue like all the other boxes where you can enter something.

Key Text

We will encode this into decimal digits later on. For the time being, we need to decide whether to use the default "arsenio" checkerboard encoding or a similar one derived from the key text, as follows. Count the letters in each of the first two words of the key text (mod 10); these counts become N1 and N2. If there is only one word, or if both words have the same length, take N2 = N1+1 mod 10. Then order the letters in "arsenio", plus the space, in the order in which they appear in the first paragraph of the key text, and assign them the single digits 1-9,0, skipping N1 and N2. Order the remaining letters of the English alphabet the same way, follow them with the '=' character, and then with the letters that do not appear in the key text, in reverse alphabetical order; place these 20 characters in two rows of ten. Characters in the first row encode to the two digits N1,column (1 to 0), and those in the second row to N2,column (1 to 0). The encoding pattern is displayed below (+ represents a space, = a generic punctuation mark).
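The key text-derived checkerboard can be sketched in Python as follows. This is one reading of the rules above, not the page's actual code: the function names are hypothetical, letters are counted ignoring attached punctuation, and top-row characters that never occur in the key text are assumed to keep their "arsenio" order (the text does not say what happens to them).

```python
import string

def derive_checkerboard(key_text):
    """Return (N1, N2, table); table maps each character to its digit code."""
    text = key_text.lower()
    words = text.split()

    def letters(word):
        return sum(c in string.ascii_lowercase for c in word)

    n1 = letters(words[0]) % 10
    n2 = letters(words[1]) % 10 if len(words) > 1 else n1
    if n2 == n1:                       # one word only, or equal lengths
        n2 = (n1 + 1) % 10

    def by_appearance(chars):
        # Characters of `chars` found in the key text, in order of first occurrence.
        return [c for c in dict.fromkeys(text) if c in chars]

    top_set = "arsenio "
    top = by_appearance(top_set)
    # Assumption: missing top-row characters are appended in "arsenio" order.
    top += [c for c in top_set if c not in top]

    rest = [c for c in string.ascii_lowercase if c not in top_set]
    present = by_appearance(rest)
    absent = sorted((c for c in rest if c not in present), reverse=True)
    lower = present + ["="] + absent   # 20 characters, two rows of ten

    columns = "1234567890"
    singles = [d for d in columns if d not in (str(n1), str(n2))]
    table = dict(zip(top, singles))
    for i, c in enumerate(lower):
        prefix = str(n1) if i < 10 else str(n2)
        table[c] = prefix + columns[i % 10]
    return n1, n2, table

def encode(text, table):
    """Encode already-prepared text; characters outside the table are skipped."""
    return "".join(table[c] for c in text if c in table)

n1, n2, table = derive_checkerboard("It was the best of times, it was the worst of times")
print(n1, n2)                   # 2 3
print(encode("it was", table))  # 12142256
```

Because N1 and N2 are never used as single-digit codes, any digit string produced this way can be decoded unambiguously from left to right.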

Settings

Let us first of all tell the program what we want to do by checking one of these two buttons:

     Encrypt     Decrypt

And now, whether or not we are basing the encoding on the key text.

     Default encoding     Key text-derived encoding

Since we are checking boxes, we might as well tell the program whether or not the serial and MAC are to be included in the ciphertext, plus their lengths.

     Separate serial     Included serial   Number of digits (1-9)

     Separate MAC     Included MAC    Number of digits (1-9)

Here is the resulting pattern that is used for the text to digits conversion:

Encoding Pattern

 

Step 2. Plaintext encoding / Ciphertext preparation

Now we write the plaintext that we wish to encrypt, which will be converted to lowercase. Punctuation marks other than periods, colons, exclamation marks and question marks are ignored. Spaces are encoded with a single digit, like the high-frequency letters. Diacritical marks (accents) are ignored. Any numbers are first converted to letters, with a = 1, ... i = 9, j = 0. These letters are not converted back to numbers upon decryption, but hopefully the user can spot them easily.
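The clean-up rules above might look like this in Python. It is a minimal sketch: the name `prepare_plaintext` is made up, non-English letters without an ASCII base are simply dropped (an assumption), and the retained punctuation marks are left as they are rather than mapped to the generic '=' symbol.

```python
import unicodedata

DIGIT_TO_LETTER = dict(zip("1234567890", "abcdefghij"))  # a = 1, ..., i = 9, j = 0
KEPT_PUNCTUATION = ".:!?"

def prepare_plaintext(text):
    """Lowercase, strip accents, turn digits into letters, drop other symbols."""
    # NFD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", text.lower())
    out = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue                      # drop the accent itself
        if ch in DIGIT_TO_LETTER:
            out.append(DIGIT_TO_LETTER[ch])
        elif (ch.isascii() and ch.isalpha()) or ch == " " or ch in KEPT_PUNCTUATION:
            out.append(ch)
    return "".join(out)

print(prepare_plaintext("Café at 10, OK?"))  # cafe at aj ok?
```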

Plain Text / Ciphertext as letters

And this is the same text, encoded as decimal digits. If you now type into the Encoded Plain Text box, its contents are automatically decoded and the result is placed in the Plain Text box. When decrypting a ciphertext already encoded as digits, we start here.

Encoded Plain Text / Ciphertext as digits

If a serial or MAC is included in a ciphertext, it is now extracted and placed in the serial box below or the included MAC box. This is not done for encoded plaintext.

After serial and MAC extraction

 

Step 3. Keystream generation

The next step is to generate the keystream. Since this may depend on a serial number, we need to enter it now in the box below, if there is one. The default is 4 decimal digits, but this can be changed above.

Serial Number

 Serial number:

To generate the keystream, we first encode the key text into decimal digits, take a piece three times the length of the encoded plaintext, and append the serial to it, if any. We split the result into two parts so that the first has twice the length of the encoded plaintext, plus one. We write the second part below the first, starting from the left, and add the two digit by digit without carries to obtain the Seed, which has the same length as the first part. If the key text is not long enough, it will be repeated and a warning will appear below this line.
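The seed computation can be sketched as below. The function name `make_seed` and its parameters are hypothetical; the sketch also assumes that the second part is never longer than the first (true unless the message is very short compared to the serial) and that first-part digits with nothing written below them are copied unchanged.

```python
def make_seed(key_digits, msg_len, serial=""):
    """Fold 3*msg_len key digits (plus serial) into a seed of 2*msg_len + 1 digits."""
    needed = 3 * msg_len
    repeats = -(-needed // len(key_digits))           # ceiling division
    # Repeat the key digits if there are too few (the page warns when this happens).
    material = (key_digits * repeats)[:needed] + serial
    split = 2 * msg_len + 1
    first, second = material[:split], material[split:]
    if len(second) > len(first):
        raise ValueError("message too short for this sketch")
    seed = [int(d) for d in first]
    for i, d in enumerate(second):                    # second part aligned at the left
        seed[i] = (seed[i] + int(d)) % 10             # addition without carries
    return "".join(map(str, seed))

print(make_seed("123456789012", 4, serial="5678"))    # 135913589
```

In this example the material is 1234567890125678; the first part is 123456789 and the second is 0125678, which only reaches the seventh column.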

This is where the warning will appear

Encoded Key Text

Seed

Then the seed above is used to initialize a lagged Fibonacci generator, where each digit after the seed is the carryless sum of the previous digit and the one located a number of spaces equal to the encoded plaintext length before the current digit. This is best done by rows, adding the numbers immediately above and to the left of the one we are filling. We stop when three rows are completed.

We then take from the end a number of digits equal to the length of the encoded plaintext. This is the keystream.
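A minimal sketch of this lagged Fibonacci step follows. It assumes, as one reading of the description, that the rows are as long as the encoded plaintext and that the seed has twice that length plus one, so the generator stops at three times the plaintext length; `keystream_from_seed` is a made-up name.

```python
def keystream_from_seed(seed, msg_len):
    """Extend the seed to three rows of msg_len digits and return the last row."""
    digits = [int(d) for d in seed]
    while len(digits) < 3 * msg_len:
        # previous digit (to the left) + digit msg_len places back (directly above)
        digits.append((digits[-1] + digits[-msg_len]) % 10)
    return "".join(map(str, digits[-msg_len:]))

print(keystream_from_seed("135913589", 4))  # 9275
```

Written by rows, the example reads 1359 / 1358 / 9275: the first digit of the third row is the last digit of the seed, and each later digit is the sum (mod 10) of its neighbors above and to the left.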

Keystream

 

Step 4. Encrypted Ciphertext / Decrypted Plaintext

Finally, we subtract, again digit by digit without carries (that is, mod 10), the encoded plaintext (encoded ciphertext, when decrypting) from the keystream, resulting in the raw ciphertext below (plaintext, when decrypting), which is ready to be sent out. The next couple of steps only serve to harden the encryption further, or to decode the encoded plaintext.
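Because the message is subtracted from the keystream (rather than the other way around), the very same operation both encrypts and decrypts. A short sketch, with a hypothetical function name:

```python
def carryless_subtract(keystream, digits):
    """Return keystream - digits, column by column, mod 10."""
    if len(keystream) != len(digits):
        raise ValueError("keystream and message must have the same length")
    return "".join(str((int(k) - int(d)) % 10) for k, d in zip(keystream, digits))

cipher = carryless_subtract("9275", "1214")
print(cipher)                              # 8061
print(carryless_subtract("9275", cipher))  # 1214
```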

Raw Ciphertext / Plaintext

The serial, if used, can be hidden within the ciphertext rather than sent in the clear, as set by a radio button near the top of this page. The same goes for the MAC. We will assume they are inserted after a number of digits equal to the number of characters in the first two words of the key text (or in the first word only, if there is just one), so those who know the key text can extract them easily. This step is not taken when decrypting; instead, the serial and MAC are found and extracted before generating the keystream.
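Hiding and recovering the serial could be sketched as follows. All names here are hypothetical, words are measured as typed (including any attached punctuation), and clamping the position for very short ciphertexts is an assumption of this sketch rather than something the text specifies. The MAC would be handled the same way, but the relative order of serial and MAC is not stated, so only the serial is shown.

```python
def insertion_offset(key_text):
    """Number of characters in the first two words of the key text."""
    return sum(len(word) for word in key_text.split()[:2])

def insert_serial(cipher_digits, serial, key_text):
    # Assumption: if the ciphertext is shorter than the offset, append at the end.
    pos = min(insertion_offset(key_text), len(cipher_digits))
    return cipher_digits[:pos] + serial + cipher_digits[pos:]

def extract_serial(digits, serial_len, key_text):
    """Return (serial, ciphertext without the serial)."""
    pos = min(insertion_offset(key_text), len(digits) - serial_len)
    return digits[pos:pos + serial_len], digits[:pos] + digits[pos + serial_len:]

packed = insert_serial("8061739", "5678", "it was the best")
print(packed)                                          # 80617567839
print(extract_serial(packed, 4, "it was the best"))    # ('5678', '8061739')
```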

Ciphertext with added Serial

And finally, we decode the result back to letters using the checkerboard. When encrypting, the result may end in a lone N1 or N2 digit, which does not decode to any character on its own. In this case, we write N1 as '!' and N2 as '?'.

Text-based Ciphertext / Final Plaintext

Step 5 (optional). Message Authentication Code (MAC)

We were basically done in the previous step, but there is still something we can do in order to prevent an adversary from altering an encrypted message without knowledge of the key text. We can make a Message Authentication Code (MAC) based on the plaintext and a suitably large piece of the key text (three times the length of the MAC, as encoded into digits). To do this, we first pick a piece of key text that has not been used yet and encode it, resulting in what is in the box below.

A warning will appear here if there is not enough key text

Then we take the encoded plaintext and append the above to the end of it, and then the serial, if there is any, resulting in this:

Now we divide it into chunks of size 2N+1, where N is the length of the MAC (which is set near the top of the page; default is 3 digits), and add them all without carries. The last piece may be too short, so we pad it with zeroes up to the right length before we add it. Here is the result:
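The chunk-and-add step can be sketched like this, with `fold_chunks` as a made-up name and the default MAC length of 3 digits taken from the text:

```python
def fold_chunks(digits, mac_len=3):
    """Add all chunks of 2*mac_len + 1 digits without carries."""
    size = 2 * mac_len + 1
    total = [0] * size
    for start in range(0, len(digits), size):
        chunk = digits[start:start + size].ljust(size, "0")  # zero-pad the last chunk
        total = [(t + int(c)) % 10 for t, c in zip(total, chunk)]
    return "".join(map(str, total))

print(fold_chunks("12345678901234567"))  # 4705791
```

Here the 17 input digits split into 1234567, 8901234 and 567, the last of which is padded to 5670000 before the three are added column by column.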

And finally we run three rows of a lagged Fibonacci generator of length 2N, taking the last digit as the first on the second row and placing the result of carryless addition of every two numbers to the right of the lower one. This is the result:

The last half of the third row is the calculated MAC, which should match the MAC attached to the ciphertext. If we are decrypting and the MAC was included with the ciphertext, it is displayed in the last box so we can compare the two.

Calculated MAC:    

MAC included with the message: