Three new ciphers from the early XIII century

Back in 1202, the Italian mathematician Leonardo Bonacci, also known as Fibonacci, included in his book “Liber Abaci” (Book of Calculation) a sidebar illustrating how quickly rabbits breed. It seems that his primary goal, in addition to raising some awareness about the population explosion experienced by those animals back then, was to show how Indo-Arabic numbers (which for the first time included the zero) could be used in a calculation of practical importance.

The rest is history. The zero caught on and the sidebar calculation, which became known as the “Fibonacci sequence,” occupied mathematicians for the next eight centuries. We don’t know, however, what exactly happened to the rabbits; but their population growth must have been checked somehow, otherwise now we’d be swimming in a sea of rabbits hundreds of meters deep.

One of the things that the Fibonacci sequence is good for is to generate a series of apparently random digits, if we only keep the last digit of every operation. This can be used for encryption, although it has to be done right. Well, after a couple of false starts, which you can read about in this article, I think I finally cracked it, and the result is three new ciphers: “Numeracci”, “Letteracci”, and “Subtracci.”

Numeracci and Letteracci are very similar to BookPad and TripleText, respectively. The difference is that, while the latter two use a large piece of text, possibly taken from a book, as encrypting key (and then the key text must be three times the length of the plaintext), the first two can use a much shorter key and still give good security, thus reducing the amount of calculation required. Subtracci is a variation on Letteracci that can be done much quicker by hand, and is therefore the counterpart of Snake.

Now, using a key whose entropy is smaller than the entropy of a random string of plaintext length (that’s why the factor of three) means that the Fibonacci-derived ciphers are not close to theoretically unbreakable by brute force as BookPad and TripleText try to be. A powerful adversary could try every possible text of the recommended length (plaintext length plus one) and thus decipher the message and know that it is the correct answer, but since key entropy grows linearly with message length he/she/it would still have to spend an impracticable amount of computer time for anything beyond very short messages. In any case, an optional compression step, discussed at the end of this article, gets back much of the same effect for little extra effort.

I have made JavaScript versions of NumeracciLetteracci, and Subtracci, which would run on just about any browser, but they remain paper-and-pencil ciphers. I will now present the mechanics of Numeracci, which outputs decimal digits, using as example the encryption of the plaintext “Hello World” with the key text “It was the best of times”:

  1. Encoding set-up. Since we need to convert the plaintext into decimal digits, we need an encoding table. Numeracci uses the same encoding method as BookPad, which involves a straddling checkerboard for all letters plus the space and a generic punctuation mark (total of 28), with the letters in “Arsenio” and the space being replaced by a single digit, and the rest by two digits. The checkerboard could be the default (unscrambled “arsenio+” at the top, followed by the rest of the letters in alphabetical order), or one that derives from the key text itself (usually just the first sentence: “arsenio+” letters and the rest as they appear in the sentence, followed by the period and those that do not appear in reverse order).
  2. The key-derived checkerboard is recommended, as we discuss below, but we’ll use the default checkerboard for the example, resulting in these encodings:

plaintext = 96499997806729993 (17 digits)

key text = 603806138039648914303879480369043 (33 digits)

  1. Keystream generation. Here’s where the Fibonacci series is used. We first take a piece of the encoded key text that is one digit longer than the encoded plaintext. Therefore, we take the first 18 digits = Now we take the last digit and place it below the first digit add them neglecting any carries (6 + 4 = 0), and write the result on the right of the last-written digit. Then we keep adding the digits on the first and second rows and writing them on the right of the last one until the row is complete. This is the result (seed digits are bolded):
 60380613803964891
 40031178199217198

We could take the last row as keystream, but tests show that its randomness is poor, so we do this for one more row (1st digit is the sum of the last two in 1st  and 2nd rows), resulting in this:

 60380613803964891
 40031178199217198
 93336785343245232

And we take the last row as keystream = 93336785343245232

  1. We obtain the ciphertext by subtracting the encoded plaintext from the keystream, again without carries, which can be done from left to right. This is the result:
93336785343245232 – 96499997806729993 = 07947898547526349

So we take that and send it out: 07947 89854 75263 49. The recipient performs steps 1 to 3 (except plaintext encoding, obviously) using the key text, which he/she knows, resulting in the same keystream. Then he/she subtracts the ciphertext from the keystream, resulting in the encoded plaintext:

93336785343245232 – 07947898547526349 = 96499997806729993

And finally he/she decodes this back to the plaintext, using the checkerboard (the spaces are encoded):  96499997806729993 = “hello world”

 

Now that we’ve got some momentum, let’s take advantage of it and describe Letteracci. This time the sums and subtractions will be done directly on letters, so the user’s knowledge of decimal operations won’t be of much help. Therefore, we need to set up a special table for sums and subtractions, which fortunately was invented a long time ago. Its name is Tabula Recta (“straight table,” in Latin). If a straight alphabet (no substitutions) is in use, this table is:

    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
    ---------------------------------------------------
A | A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
B | B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
C | C D E F G H I J K L M N O P Q R S T U V W X Y Z A B
D | D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
E | E F G H I J K L M N O P Q R S T U V W X Y Z A B C D
F | F G H I J K L M N O P Q R S T U V W X Y Z A B C D E
G | G H I J K L M N O P Q R S T U V W X Y Z A B C D E F
H | H I J K L M N O P Q R S T U V W X Y Z A B C D E F G
I | I J K L M N O P Q R S T U V W X Y Z A B C D E F G H
J | J K L M N O P Q R S T U V W X Y Z A B C D E F G H I
K | K L M N O P Q R S T U V W X Y Z A B C D E F G H I J
L | L M N O P Q R S T U V W X Y Z A B C D E F G H I J K
M | M N O P Q R S T U V W X Y Z A B C D E F G H I J K L
N | N O P Q R S T U V W X Y Z A B C D E F G H I J K L M
O | O P Q R S T U V W X Y Z A B C D E F G H I J K L M N
P | P Q R S T U V W X Y Z A B C D E F G H I J K L M N O
Q | Q R S T U V W X Y Z A B C D E F G H I J K L M N O P
R | R S T U V W X Y Z A B C D E F G H I J K L M N O P Q
S | S T U V W X Y Z A B C D E F G H I J K L M N O P Q R
T | T U V W X Y Z A B C D E F G H I J K L M N O P Q R S
U | U V W X Y Z A B C D E F G H I J K L M N O P Q R S T
V | V W X Y Z A B C D E F G H I J K L M N O P Q R S T U
W | W X Y Z A B C D E F G H I J K L M N O P Q R S T U V
X | X Y Z A B C D E F G H I J K L M N O P Q R S T U V W
Y | Y Z A B C D E F G H I J K L M N O P Q R S T U V W X
Z | Z A B C D E F G H I J K L M N O P Q R S T U V W X Y
    ---------------------------------------------------
    a z y x w v u t s r q p o n m l k j i h g f e d c b

To add two letters, find the first letter on the left bar and the second on the top, then follow the respective row and column to find the result at the intersection. Example: R + H = Y. To subtract, find the first letter on the left bar and the second (the one being subtracted) on the bottom, then follow the respective row and column to find the result at the intersection. Example: R – h = K.

Often we will use a key-derived Tabula Recta rather than the straight one. We start by making a scrambled alphabet this way: take every new letter found as the first sentence of the key text is read; then add the rest of the letters (those not found in that sentence) in reverse alphabetical order. If the key text is “It was the best of times”, the scrambled alphabet will be:

ITWASHEBOFMZYXVURQPNLKJGDC

And then we make the Tabula Recta by shifting that to the left by one letter per row. The bottom line, used for subtractions, is made this way: first letter of the scrambled alphabet, followed by the rest of the letters in reverse order. This is the resulting Tabula Recta:

    I T W A S H E B O F M Z Y X V U R Q P N L K J G D C
    ---------------------------------------------------
A | I T W A S H E B O F M Z Y X V U R Q P N L K J G D C
B | T W A S H E B O F M Z Y X V U R Q P N L K J G D C I
C | W A S H E B O F M Z Y X V U R Q P N L K J G D C I T
D | A S H E B O F M Z Y X V U R Q P N L K J G D C I T W
E | S H E B O F M Z Y X V U R Q P N L K J G D C I T W A
F | H E B O F M Z Y X V U R Q P N L K J G D C I T W A S
G | E B O F M Z Y X V U R Q P N L K J G D C I T W A S H
H | B O F M Z Y X V U R Q P N L K J G D C I T W A S H E
I | O F M Z Y X V U R Q P N L K J G D C I T W A S H E B
J | F M Z Y X V U R Q P N L K J G D C I T W A S H E B O
K | M Z Y X V U R Q P N L K J G D C I T W A S H E B O F
L | Z Y X V U R Q P N L K J G D C I T W A S H E B O F M
M | Y X V U R Q P N L K J G D C I T W A S H E B O F M Z
N | X V U R Q P N L K J G D C I T W A S H E B O F M Z Y
O | V U R Q P N L K J G D C I T W A S H E B O F M Z Y X
P | U R Q P N L K J G D C I T W A S H E B O F M Z Y X V
Q | R Q P N L K J G D C I T W A S H E B O F M Z Y X V U
R | Q P N L K J G D C I T W A S H E B O F M Z Y X V U R
S | P N L K J G D C I T W A S H E B O F M Z Y X V U R Q
T | N L K J G D C I T W A S H E B O F M Z Y X V U R Q P
U | L K J G D C I T W A S H E B O F M Z Y X V U R Q P N
V | K J G D C I T W A S H E B O F M Z Y X V U R Q P N L
W | J G D C I T W A S H E B O F M Z Y X V U R Q P N L K
X | G D C I T W A S H E B O F M Z Y X V U R Q P N L K J
Y | D C I T W A S H E B O F M Z Y X V U R Q P N L K J G
Z | C I T W A S H E B O F M Z Y X V U R Q P N L K J G D
    ---------------------------------------------------
    i c d g j k l n p q r u v x y z m f o b e h s a w t

If you find it tedious to fill the whole table and have a straight table already filled, you can leave the body as in the straight table, and add one step to each sum or subtraction, consisting in locating on the second row the letter found at the intersection, then go one step up to find its equivalent in the scrambled alphabet. Subtracci, described below, simplifies this step considerably by using a Tabula Recta where only the headings change.

Now that we have seen what the Tabula Recta is and how it is used, let’s see the process, illustrated with the same example as before (plaintext: “Hello World”).

  1. Tabula Recta set-up. We have just covered how to do that, so we will assume we are using the key-derived table.
  2. Keystream generation. First take a piece of the key text containing as many letters as the plaintext (no spaces or punctuation), plus one, which we will call the seed. Since the plaintext would be “HELLOWORLD” (10 letters), we take 11 letters from the key text: “ITWASTHEBES”. Now pass the last letter of the seed to a location right below the first, and add them, using the Tabula Recta: I + S = Y. Write this directly to the right of the last-written letter and keep adding the letters on the first and second row until the row is filled. This is the result (seed letters are bolded):
 ITWASTHEBE
 SYHTTNYNGD

We could take the last row as keystream, but tests show that its randomness is not perfect, so we repeat the process for one more row (first letter is the sum of the last two in rows one and two), reaching this:

 ITWASTHEBE
 SYHTTNYNGD
 WLPCPZDJFU

And take the last row as keystream:  WLPCPZDJFU

  1. Finally we subtract the plaintext from the keystream letter by letter, using the Tabula Recta (the letters to be subtracted are entered at the bottom line). Resulting in this:
WLPCPZDJFU – HELLOWORLD = QHKOBGKNZJ

So we take that and send it out: QHKOB GKNZJ. The recipient, who knows the key text, will perform steps 1 and 2 exactly as we did, resulting in the same Tabula Recta and the same keystream, but then will subtract the ciphertext from the keystream, resulting in the original plaintext (minus spaces and punctuation) like this:

WLPCPZDJFU – QHKOBGKNZJ = HELLOWORLD

Subtracci is a variant of Letteracci that makes human calculation quite a bit faster than what you see above. The idea is to use the Tabula Recta always to subtract, which is actually easier since it involves starting at the top and going down until a certain letter is found, and then to the left or right following that row, rather than working out an intersection by starting at the top and at one side at the same time. It turns out that you can also do a LFG by subtracting rather than adding, and its statistical properties are just as good. On top of that, the body of the Tabula Recta is always the same even if a key-derived alphabet is used, thus eliminating the tedium of having to make a new one for each message. You can find a Javascript version of Subtracci at this link, which also explains the process step by step.

Which cipher is better, Numeracci, Letteracci, or Subtracci? This is likely a toss-up. Numeracci has one additional step because every piece of text must be encoded into numbers, and then decoded upon decryption. This can get tedious, but on the other hand the math can be done very quickly with decimal digits instead of the rather special Tabula Recta. But again, Subtracci makes using the Tabula Recta almost as fast as adding figures in your head.

The three ciphers are quite comparable in terms of security, and there a few considerations to bear in mind:

  1. Default encoding or Tabula Recta vs. key-derived. Deriving this from the key takes extra effort, especially if you are writing a whole new Tabula Recta, though there are ways to save some work as we saw earlier. On the other hand, a key-derived encoding or Tabula Recta makes it very difficult to obtain the keystream from the ciphertext if the plaintext is known (known plaintext attack). In Letteracci/Subtracci, it also makes it next to impossible to reverse the keystream back to the seed (we saw in this article that this is quite easy to do) because there is an unknown substitution (since it depends on the key) involved in every sum. Cryptographer Manuel Blum proposes a very similar trick to generate a unique password for any given web site, involving a memorized secret permutation, equivalent to our scrambled alphabet, plus a mental calculation that looks very similar to our lagged Fibonacci generator.
  2. Malleability. As stream ciphers, all three ciphers presented here are malleable in principle, meaning that a powerful adversary could change the encrypted plaintext (if it is known) without knowing the key text. But this becomes impossible if key-derived encoding or Tabula Recta are in use, especially since the adversary has to get it right on the first try. Conclusion: use key-derived encoding or Tabula Recta if at all possible; the increase in security is well worth the little extra effort involved.
  3. Protection against brute force. None of the ciphers comes close to a one-time pad in this category (or to BookPad or TripleText, for that matter) because the key is only slightly longer than the plaintext, and therefore its entropy falls way short of that of a random string of plaintext length, which is what guarantees perfect protection. But it is still a lot of entropy. If we consider that common English has around 13 bits of entropy per word, since in our example the key text used was “It was the bes”, this comes from 4 words, or 52 bits of entropy. A perfectly random string of letters (case-insensitive, no spaces) has 4.7 bits of entropy per letter, so that would be 47 bits for the plaintext in the example, which means that our example ciphertext is perfectly undecipherable (without the key) in practice. For a longer plaintext, the random entropy quickly outstrips the entropy contained in the words, but there is still a lot of it in those words. At 1.58 bits/letter, a 50-letter plaintext (51-letter key) would require a key that has 51 x 1.58 = 80.6 bits of entropy. The computer-standard 128 bits would be reached when the plaintext is 128/1.58 – 1 = 80 letters, and would only get larger as the plaintext gets longer. Still, an additional compression step, described below, can increase the entropy of the key to nearly one-time pad standard.
  4. Short keys. The Fibonacci-based ciphers are much more resistant to the bad effect of short keys than those based strictly on the entropy of the key text. Tests show that, if the key is longer than one-tenth the length of the plaintext, the keystream generated in Numeracci normally will not repeat; the factor is one-twentieth for Letteracci or Subtracci. Anything shorter, though, and the statistical randomness of the keystream goes down the tubes, to say nothing of the fact that it becomes periodic. But BookPad and TripleText are not nearly as tolerant, and things begin to break down if the key text is not at least twice as long as the plaintext. You can see this for yourself because the JavaScript code for those has a little input for changing the key text to plaintext length ratio, plus you can also input short keys, which are automatically repeated to reach the required length.
  5. Insufficient computation. One might be tempted to take the second row of the lagged Fibonacci calculation as keystream, thus saving half of the math effort invested on this step. Unfortunately, the result retains some of the correlation between letters (or numbers, after encoding) of the original key text, which shows as a poor score in the two-digit Chi-squared test. The effect goes away on the third row and thereafter, so that computing more than three rows doesn’t really add any security. The third row also has the nice property that all the digits or letters of the seed affect all the digits or letters of that row, and vice-versa, changing just one digit or letter in the row would mean a complete change in all the digits or letters in the seed for anyone trying to reverse the computation. You can see this for yourself by running the JavaScript codes, which have an input to change the number of rows of the lagged Fibonacci calculation.

Ok, so what is this compression thing that I have mentioned twice already, which can help a lot against brute force? Simply this: rather than a piece of key text that is only one digit or character longer than the plaintext, take a piece that is three times as long as that, then compress it to plaintext length so it can be used as seed for the LFG. Since a piece of English text contains 1.58 bits of entropy per character, the result, if compressed without entropy loss, will have the 4.7 bits/character of true random text. The trick is to compress without losing entropy, but here Shannon’s original paper on the entropy of language comes to the rescue. Early in the paper, Shannon describes a method to represent text by using fewer characters. In essence, the first few characters of each word are written until a subject well acquainted with the language is able to guess the rest. This method is what ultimately yields the experimentally derived 1.58 bits/character of entropy for common text, which comes from being able to compress text by a factor of three in actual tests. Since the average length of a word in English text is 4.5 characters, compressing by a factor of three means reducing this to 1.5 characters/word. That is, some words will compress to a single character, some to two. A way to achieve this without involving an actual subject is to take one character for words shorter than the average (1 to 4 letters), and two for the rest. In the programs posted here, and here, the first letter of words less than five letters long is taken when compression is enabled, and for the rest we take the first and last letters, since other studies show that those are the most essential for comprehension. Spaces and punctuation usually don’t add comprehension and are ignored.

For instance, the key text “It was the best of times, it was the worst of times”  becomes: I W T B O TS I W T WT O TS (spaces added to make it easier to read). Clearly, this is not the same compression that Shannon used since it is not truly reversible and therefore loses some entropy, so I call it “pseudo-Shannon” compression. I have performed tests that show that the key text is indeed shortened by a factor slightly above three, both in English and in Spanish, and although the resulting seed fails randomness tests and therefore is not suitable to be used directly as keystream, the result after the LFG process passes them quite easily. This way, those seeking to crack the encryption by brute-forcing the key will have to look through a keyspace with the same entropy as that of a random keystream of plaintext length, which ensures unbreakability by Shannon’s criterion.

To finish up, bear in mind that neither Numeracci, Letteracci, nor Subtracci (nor BookPad or TripleText, for that matter) contain anything that had not been invented by the beginning of the XVI century, perhaps as early as Fibonacci’s time, three centuries earlier. That was the time when the so-called Vigenère cipher (though Blaise de Vigenère wasn’t related to it), and its cousin the running-key cipher were invented. It was also the time of the Black Chambers in England and other countries, specializing in the cryptanalysis of those ciphers, with important historical consequences. Had anything like what I’ve described in this and other articles been available at the time—and there’s no reason why it couldn’t have been, as far as the math goes—the history of the world might have been quite different indeed for the last 500 years.

Chew on it as you try your hand with these relatively simple paper and pencil ciphers.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.