Paper and pencil ciphers are fun and can be useful in a pinch, when all computers around are suspect. In a previous article, I presented “Root,” a cipher that gives decent security and only requires a dumb calculator. In this article, we’re going to try and do the same without even that. Only paper and pencil. And you don’t need to learn Restonian.
“Restonia” is a cipher evolved from the VIC cipher used during the Cold War. The VIC cipher is called this way because its user, Reino Häyhänen, was a Russian spy with code name VICTOR. Even though it is a paper and pencil sort of cipher, the NSA was not able to crack it from 1953 until 1957, when Reino defected to the West. There is an excellent Wikipedia article on this cipher.
Now, the VIC cipher would be cracked easily today because its key was too short. This is a problem shared with just about every classical cipher from before computers came on the scene, with the exception of one-time pads. Ever since a team of volunteers cracked the 56-bit DES cipher in just over 22 hours by simple brute-force trying of every possible key, we know that keys need to be pretty long in order to withstand a computer-supported attack. The problem with long keys, though, is that they are hard to remember.
Unless some mnemonics are used. Dictionary words, for instance, have a minimum of 13 bits of entropy per word, on the average. Thus, a phrase containing just five words has roughly 65 bits of entropy, which in principle is 2^{9} = 512 times harder to brute-force than DES. Phrases with five or more words are still easy to remember. Some examples: “I’ll have the chicken soufflé with fries,” “Don’t tell me you got fired,” “I love the smell of napalm in the morning,” just to mention some possibilities.
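To make the arithmetic above concrete, here is a tiny sketch. The 13 bits per word and the 56-bit DES key are the figures quoted in the text; the function names are mine, for illustration only.

```python
# Back-of-the-envelope key strength, using the figures quoted in the text.
BITS_PER_WORD = 13   # rough lower bound per dictionary word (from the text)
DES_KEY_BITS = 56


def passphrase_bits(n_words, bits_per_word=BITS_PER_WORD):
    """Estimated entropy of a passphrase made of n_words dictionary words."""
    return n_words * bits_per_word


def times_harder_than_des(n_words):
    """How many times more keys a brute-force search must try, compared to DES."""
    return 2 ** (passphrase_bits(n_words) - DES_KEY_BITS)


for n in (5, 6, 7):
    print(n, "words:", passphrase_bits(n), "bits,",
          times_harder_than_des(n), "times the DES keyspace")
```

Each extra word multiplies the work by 2^13 under this estimate, which is why a slightly longer phrase pays off so quickly.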
The trick is how to turn a longish set of words into a usable encryption key, and how to use such a key in a pen-and-paper cipher. Well, after thinking about it for a while and looking at the lessons of the past (many can be found on Wikipedia, practicalcryptography.com, and the unabridged version of Kahn’s classic history “The Codebreakers”), a new chiffre indéchiffrable has been produced, codenamed “Restonia.” This is the name of a hypothetical small country where most children grow up to be spies. The name happens to contain the most frequently used letters in English (and most Latin-character languages, for that matter).
BTW, everything I’ve said so far also applies to the very similar Aphid cipher, which you can also find on this blog. Aphid is more secure because it has a larger degree of plaintext diffusion, but it is also more tedious and takes one extra step. Your pick. Back to Restonia.
Encrypting involves three obligatory steps, plus a couple optional steps. We are going to illustrate the method by enciphering “MEET ME AT THE STATION MONDAY AT 5PM” using “secret code” (too short, but will serve for illustration purposes) as passphrase and serial “123.” At this point, you may want to click on this link for a JavaScript version of the cipher (with keyword “arsenio” rather than “restonia”). These are the steps:
- Serial number (optional). Security is enhanced if each message is given a different serial number, which is never reused (at least for messages of equal length). A three-digit serial will allow us to encrypt hundreds of messages with one passphrase. Let the serial number for our example be “123.”
- Code generation. Two codes are made from the serial + passphrase: one for encoding the plaintext, and the other for scrambling the encoded result.
The easiest to make is the scrambling code: within each word of the serial + passphrase, number the letters according to their alphabetical order, taking into account numerals that may already be contained in the serial (which is prepended to the first word). If a word contains fewer than four letters, which would lead to insufficient scrambling, merge it with the next before finding the order. If a letter or number is repeated, begin numbering from the right, for better scrambling. For the serial “123” and passphrase “secret code” = “123secret code”, this results in “123 864759, 1423”.
The encoding table is a 3×10 “straddling checkerboard,” which contains the letters of the alphabet, plus a catch-all symbol for punctuation, and a numbers shift. The first row contains only high-frequency letters (coincidentally, those in the name “Restonia”), with two spaces blanked out. The other two rows contain the rest of the alphabet and the special symbols. We fill it this way: Take the passphrase and count the number of letters in the first two words that have different lengths (don’t merge them with the next even if they are short; if they are all equally long, go up by one for the second digit) = 6,4. Consequently, blank out cells 6 and 4 on the first row. Take the passphrase again, and place each new letter you find, in sequence, either in the first row if it is part of “restonia” or in the second and third, beginning from the left. This is the result in our example:
S | E | R | blank | T | blank | O |  |  |  |
C | D |  |  |  |  |  |  |  |  |
Now fill the rest of the table this way: Top row, the rest of the letters in “restonia”, in reverse order (for less predictability). Remaining rows, the special characters, followed by the rest of the alphabet, also in reverse order. This is the result with labels added at top and left:
  | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 |
  | S | E | R | blank | T | blank | O | A | I | N |
6 | C | D | – | # | Z | Y | X | W | V | U |
4 | Q | P | M | L | K | J | H | G | F | B |
The top labels are just the digits 1 to 0 in sequence, those on the left, starting with the second row, are the lengths of the first two words of the passphrase (consecutive, if all words are equally long). This is our straddling checkerboard. If we are going to be decoding the ciphertext back into letters before sending it out, we may want to generate a second checkerboard for this purpose, using a separate passphrase.
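For readers who want to check their pencil work, the checkerboard construction can be sketched in a few lines of Python. This is only my reading of the rules above, not the JavaScript version linked earlier: the names are hypothetical, “-” stands in for the punctuation symbol, and the sketch assumes passphrase words of 1 to 9 letters, since longer words are not covered by the description.

```python
import string

TOP_LETTERS = "restonia"
SPECIALS = "-#"          # "-" = punctuation catch-all (stand-in), "#" = numbers shift
COLUMNS = "1234567890"   # column labels, left to right


def row_labels(passphrase):
    """Lengths of the first two words that differ in length.

    Assumption: word lengths are 1-9; if all words are equally long,
    the second label is the first plus one, as the text says.
    """
    lengths = [len(w) for w in passphrase.split()]
    first = lengths[0]
    second = next((n for n in lengths[1:] if n != first), first + 1)
    return str(first), str(second)


def build_checkerboard(passphrase):
    """Return a dict mapping each symbol to its one- or two-digit code."""
    row2, row3 = row_labels(passphrase)

    # Distinct letters of the passphrase, in order of first appearance.
    seen = []
    for ch in passphrase.lower():
        if ch in string.ascii_lowercase and ch not in seen:
            seen.append(ch)

    # Top row: passphrase letters that belong to "restonia", then the
    # remaining "restonia" letters in reverse order.
    top = [c for c in seen if c in TOP_LETTERS]
    top += [c for c in reversed(TOP_LETTERS) if c not in seen]

    # Lower rows: other passphrase letters, the specials, then the rest
    # of the alphabet in reverse order.
    rest = [c for c in seen if c not in TOP_LETTERS]
    rest += list(SPECIALS)
    rest += [c for c in reversed(string.ascii_lowercase)
             if c not in TOP_LETTERS and c not in seen]

    board = {}
    free_cells = [c for c in COLUMNS if c not in (row2, row3)]  # skip blanks
    for col, letter in zip(free_cells, top):
        board[letter] = col
    for i, symbol in enumerate(rest):
        label = row2 if i < 10 else row3
        board[symbol] = label + COLUMNS[i % 10]
    return board


board = build_checkerboard("secret code")
print(board["e"], board["m"], board["#"])
```

With the passphrase “secret code” this reproduces the table above: single digits for the top row, and codes starting with 6 or 4 for everything else.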
- Encoding. Now we convert our plaintext into a numerical code. The letters on the first row of the checkerboard are replaced by the single digit above them. The letters and symbols on the other two rows are replaced by two digits: the one on the left, followed by the one above. If punctuation is needed, use the special character. To represent numerals, enter the numbers shift symbol, then the straight numeral, and end with two number shifts. Here’s the result:
M EETM EATTH ESTATIONM OND AY AT# 5# # P M |
4322543285547215859704370628668564564644243 |
If we want the final result to be a certain length—perhaps in order to disguise it as a list of phone numbers or whatnot, more on this at the end of the article—this is the place to add as many gibberish digits as needed, usually at the end of the code.
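The encoding step is mechanical enough to sketch in code. The block below hard-codes the example checkerboard so that it stands on its own; the function name and the choice to silently drop spaces are my assumptions, and “-” again stands in for the punctuation symbol.

```python
COLUMNS = "1234567890"
TOP_ROW = "ser t oain"   # spaces mark the two blanked cells (4 and 6)
ROW_6 = "cd-#zyxwvu"
ROW_4 = "qpmlkjhgfb"

# Build the symbol -> code table for the example passphrase "secret code".
BOARD = {ch: col for col, ch in zip(COLUMNS, TOP_ROW) if ch != " "}
BOARD.update({ch: "6" + col for col, ch in zip(COLUMNS, ROW_6)})
BOARD.update({ch: "4" + col for col, ch in zip(COLUMNS, ROW_4)})


def encode(plaintext):
    """Turn plaintext into a digit string using the straddling checkerboard."""
    out = []
    in_number = False
    for ch in plaintext.lower():
        if ch in "0123456789":
            if not in_number:
                out.append(BOARD["#"])        # one shift opens a numeral
                in_number = True
            out.append(ch)                    # numerals are written as-is
            continue
        if in_number:
            out.append(BOARD["#"] * 2)        # two shifts close it
            in_number = False
        if ch.isspace():
            continue                          # assumption: spaces are dropped
        out.append(BOARD.get(ch, BOARD["-"]))  # unknown symbols -> catch-all
    if in_number:
        out.append(BOARD["#"] * 2)
    return "".join(out)


print(encode("MEET ME AT THE STATION MONDAY AT 5PM"))
```

Run on the example message, this gives the same 43-digit string as the table above.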
- Scrambling. The numerical code is now scrambled with each code we obtained earlier. To do this, we first write the first scrambling code, and then underneath it the long numerical code we just obtained in the previous step, left to right row by row. Then we read the scrambled code vertically, beginning with the column marked as “1”, then “2” and so on until the end. Then we scramble the result with the second scrambling code, and so on until there are no more scrambling codes. For our example, we obtain these two tables:
1 | 2 | 3 | 8 | 6 | 4 | 7 | 5 | 9 |
4 | 3 | 2 | 2 | 5 | 4 | 3 | 2 | 8 |
5 | 5 | 4 | 7 | 2 | 1 | 5 | 8 | 5 |
9 | 7 | 0 | 4 | 3 | 7 | 0 | 6 | 2 |
8 | 6 | 6 | 8 | 5 | 6 | 4 | 5 | 6 |
4 | 6 | 4 | 4 | 2 | 4 | 3 |
Result: 45984 35766 24064 41764 2865 52352 35043 27484 8526
1 | 4 | 2 | 3 |
4 | 5 | 9 | 8 |
4 | 3 | 5 | 7 |
6 | 6 | 2 | 4 |
0 | 6 | 4 | 4 |
1 | 7 | 6 | 4 |
2 | 8 | 6 | 5 |
5 | 2 | 3 | 5 |
2 | 3 | 5 | 0 |
4 | 3 | 2 | 7 |
4 | 8 | 4 | 8 |
5 | 2 | 6 |
Result: 44601252445 95246635246 8744455078 53667823382, which can be transmitted out as is, preceded by the serial: 1234460125244595246635246874445507853667823382.
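Here is a minimal sketch of the scrambling step, assuming the key words have already been prepared (serial prepended, short words merged). The function names are hypothetical. Ranking digits before letters reproduces the example key “123 864759”, but how the author intends serials with other digits to be numbered is a guess on my part.

```python
def column_order(key):
    """Number the characters of a key word in sorted order.

    Digits sort before letters; among repeated characters the rightmost
    one gets the lowest number, as described in the code generation step.
    """
    positions = sorted(range(len(key)), key=lambda i: (key[i], -i))
    ranks = [0] * len(key)
    for rank, pos in enumerate(positions, start=1):
        ranks[pos] = rank
    return ranks


def transpose(digits, ranks):
    """Write digits in rows under the key, then read the columns in key order."""
    width = len(ranks)
    columns = {rank: digits[i::width] for i, rank in enumerate(ranks)}
    return "".join(columns[rank] for rank in sorted(columns))


def scramble(digits, key_words):
    """Apply one columnar transposition per key word, in order."""
    for word in key_words:
        digits = transpose(digits, column_order(word))
    return digits


encoded = "4322543285547215859704370628668564564644243"
print(column_order("123secret"), column_order("code"))
print(scramble(encoded, ["123secret", "code"]))
```

The slice `digits[i::width]` picks out column `i` from top to bottom, so a short last row needs no special handling.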
- Decoding (optional). Sometimes we may want to transmit letters rather than decimal digits. The simple way to do this would be to just use the checkerboard to get the letter equivalent of the result, but this would cause many of the letters to go back to plaintext, which is bad, unless we are using a second checkerboard, generated by a different passphrase. Therefore, we will be a little more devious: before decoding back to letters, we multiply the numerical ciphertext (without the serial) by the number of letters in the last word of the passphrase, up to 9. If the number of letters is 10 or more, take its last digit. If this last digit is 0 or 1, take the second-last word instead, and so on. If every word gives 0 or 1, multiply by 7 instead. If the last digit of the product requires one more digit after it (6 or 4 in our example), double it up and then add ‘XX’ at the end of the output. If the first digit of the numerical ciphertext is 0, prepend a 0 to the multiplication result before decoding. Then put the serial at the beginning and send it. Here’s what we get for our example:
Numerical code |
4460125244595246635246874445507853667823382 |
Multiplied by 4 |
17840500978380986540987497782031414671293528 |
Decoded |
SOAB TNNIOARANIAZ B IAOF OOAENRSQ J OSEIRTEA |
Sent |
123SO ABTNN IOARA NIAZB IAOFO OAENR SQJOS EIRTE A |
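The optional letter-decoding step involves a long multiplication that is easy to get wrong by hand, so here is a sketch for checking it. The board is the example one, hard-coded; the names are mine. Python integers have arbitrary precision, so the 43-digit product is exact, which would not be true of a pocket calculator or of floating-point arithmetic.

```python
COLUMNS = "1234567890"
ROWS = {"": "ser t oain", "6": "cd-#zyxwvu", "4": "qpmlkjhgfb"}

# code -> symbol table for the example checkerboard
DECODE = {label + col: ch.upper()
          for label, row in ROWS.items()
          for col, ch in zip(COLUMNS, row) if ch != " "}
ROW_DIGITS = ("6", "4")


def multiplier(passphrase):
    """Last digit of the last word's length, skipping words that give 0 or 1."""
    for word in reversed(passphrase.split()):
        digit = len(word) % 10
        if digit > 1:
            return digit
    return 7  # fallback stated in the text


def to_letters(digits, passphrase):
    """Multiply the numerical ciphertext and read it off the checkerboard."""
    product = str(int(digits) * multiplier(passphrase))
    if digits.startswith("0"):
        product = "0" + product   # restore the zero lost in the multiplication
    out, i, doubled = [], 0, False
    while i < len(product):
        d = product[i]
        if d in ROW_DIGITS:
            if i + 1 == len(product):   # dangling row digit: double it up
                code, doubled = d + d, True
            else:
                code = product[i:i + 2]
            i += 2
        else:
            code = d
            i += 1
        out.append(DECODE[code])
    if doubled:
        out.append("XX")
    return "".join(out)


print(to_letters("4460125244595246635246874445507853667823382", "secret code"))
```

The recipient reverses this by encoding the letters back to digits and dividing by the same multiplier.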
Decryption is roughly the reverse of encryption. This way:
- Code generation. First get the serial from the ciphertext and use it in combination with the passphrase to generate the straddling checkerboard and the scrambling codes as described above. This step is identical to step 2 for encryption. If two checkerboards were used for encryption, we need to make two now, and use them in reverse order.
- Encoding (optional). This is the reverse of encryption step 5. If the ciphertext consists mostly of letters, we’ll need to turn it into a numerical code using the straddling checkerboard (the second one, if there were two). Then we figure out the multiplier from the passphrase (length of last word) as above, and divide the numerical code by it. It should be an exact multiple; otherwise there is an error.
- Descrambling. This is the reverse of encryption step 4. We begin making a table for the last scrambling code, with as many columns as digits in the code. The number of rows is the numerical ciphertext length divided by the code length, rounded up. The remainder of the division tells us how many cells in the last row are active, and the rest should be blanked out. Then write the ciphertext numbers from top to bottom, starting on column 1, then 2, and so forth, filling all active cells in each column. The resulting table should be identical to the last scrambling table made when encrypting, but this time the result will be read off by rows from the top, left to right. Then the result will be written in columns on a table made for the second-last scrambling code, and so on until the first scrambling code is used. The final result is the decimal-encoded plaintext.
- Decoding. Now take the encoded plaintext and decode it back to letters (and numbers, if any) using the checkerboard, thus reversing encryption step 3 (if two checkerboards are used, this step involves the first one). If gibberish digits were added, they will usually decode to gibberish so it’s easy to spot them and ignore them.
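The descrambling step is the fiddliest part of decryption, because the short last row decides how long each column is. A minimal sketch, with hypothetical function names and the two example scrambling codes written out as lists of column numbers:

```python
def untranspose(digits, ranks):
    """Undo one columnar transposition.

    ranks[i] is the number written above column i of the scrambling table.
    """
    width = len(ranks)
    full_rows, active = divmod(len(digits), width)
    # The first `active` columns have one extra cell (the partial last row).
    lengths = [full_rows + (1 if i < active else 0) for i in range(width)]

    # Fill the columns top to bottom, starting with the one numbered 1.
    columns = [""] * width
    pos = 0
    for rank in range(1, width + 1):
        i = ranks.index(rank)
        columns[i] = digits[pos:pos + lengths[i]]
        pos += lengths[i]

    # Read the table by rows, left to right.
    out = []
    for row in range(full_rows + 1):
        for col in columns:
            if row < len(col):
                out.append(col[row])
    return "".join(out)


def descramble(digits, all_ranks):
    """Undo the transpositions, last scrambling code first."""
    for ranks in reversed(all_ranks):
        digits = untranspose(digits, ranks)
    return digits


KEYS = [[1, 2, 3, 8, 6, 4, 7, 5, 9], [1, 4, 2, 3]]   # "123secret", "code"
print(descramble("4460125244595246635246874445507853667823382", KEYS))
```

Applied to the example ciphertext, this returns the decimal-encoded plaintext from encryption step 3, ready to be read off the checkerboard.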
The strength of this cipher lies in several features:
- Fractionation and diffusion: many letters are split into two-digit codes, and then the digits are separated widely at the scrambling step. High-frequency letters are converted into low-frequency ones by combining with separated digits. This makes it very difficult to do any kind of frequency analysis on the ciphertext.
- Compression: often-used characters (r,e,s,t,o,n,i,a) are transformed into shorter codes than less-used ones. This tends to flatten the frequency distribution and helps to keep the ciphertext within the bounds given by the “unicity distance” (keylength in bits divided by 3.5, for English), which is the ciphertext length below which it is theoretically impossible to ever decipher the message without the proper key because it can be deciphered to other equally possible plaintexts.
- Mechanization: the process can be done quickly on paper without having to think much. In fact, computer-based encryption algorithms usually do something very similar: break characters into bits that are then scrambled according to the key.
- Irregularity: there is no fixed length for any scrambling operation, which makes it hard to attack by simple anagramming. The overall length of the passphrase cannot be easily guessed from the result of the scrambling even if the plaintext is known.
- Unlimited key: arbitrarily long passphrases can be used, each word adding more complexity to the scrambling process and more diffusion to the characters. It is like writing on a piece of dough that gets repeatedly kneaded according to each word of the passphrase.
- Independence: the way the passphrase material is used for making the checkerboard is quite different from the way it is used for making the scrambling codes. The scrambling codes only depend on the ordering of the word letters, not on the letters themselves. The letters matter when making the checkerboard, beyond their alphabetical order. Therefore, there is little danger that one operation would undo the other, even partially, and so it is safe to use the same passphrase as a starting point for both (except for a few very bad choices, which should be obvious to the user right away).
- Reusability: since the scrambling operation with irregular key fragments is hard to reverse, there is little danger that the passphrase be recovered from plaintext samples and their corresponding ciphertexts. Still, it is a good practice to include a serial number that never gets used again with the same passphrase, since messages of identical length scrambled with identical scrambling codes are subject to the anagramming attack.
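As a quick illustration of the unicity-distance rule of thumb mentioned under “Compression,” here is the division spelled out. The 3.5 bits per character is the figure given above for English; the function name is mine.

```python
def unicity_distance(key_bits, redundancy_bits_per_char=3.5):
    """Ciphertext length (in characters) below which several plaintexts remain plausible."""
    return key_bits / redundancy_bits_per_char


# A five-word passphrase at 13 bits per word, versus a 56-bit DES key.
print(round(unicity_distance(65), 1), "characters for a 65-bit passphrase")
print(unicity_distance(56), "characters for a 56-bit key")
```

Longer passphrases push this bound up, so short messages are the ones that benefit most.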
Now let’s put on our hacker’s hat and try to crack this baby. The algorithm has been published on this blog, so this is not a secret. The secret is the passphrase and the plaintext. We have the numerical ciphertext from step 4 above: serial 123 + 44601 25244 59524 66352 46874 44550 78536 67823 382. These are the digit frequencies:
digit | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 |
occurrences | 1 | 6 | 4 | 9 | 7 | 6 | 3 | 4 | 1 | 2 |
Since in English the most frequent letters go in this order: e-t-a-o-i-n-h-r, we might feel tempted to say 4=e, 5=t, 6=a, which would be incorrect because 4 and 6 are actually used for the least-frequent letters. Or, taking this into account, we might say 6=e, 3 or 8=t or a, which ends up being correct only in the guess for “a”. The problem here is that the straddling checkerboard makes the low-frequency letters pool their frequencies into the two row digits, which puts those digits on a par with the high-frequency letters and makes frequency analysis almost useless.
Let us now look for digraphs and trigraphs. We see “44”, “46” and “52” three times which, knowing the encoding, could be interpreted as “L”, “J”, and “TE” respectively. None of them are actually present in the plaintext. All the frequent digraphs are artifacts resulting from scrambling. There isn’t a single trigraph to sink our teeth into, even though the plaintext is full of repetitions. Clearly a longer piece would have contained some repeated trigraphs, but again it is very likely that they would bear no relationship with the plaintext because they result from figures brought together from very distant original locations.
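The counting above is tedious by hand, so here is a small sketch a would-be attacker might use. It relies only on `collections.Counter` and `str.count` from the standard library; the helper names are mine. Note that `str.count` counts non-overlapping occurrences, which is the convention I am assuming for the digraph tallies.

```python
from collections import Counter

CIPHERTEXT = "4460125244595246635246874445507853667823382"


def digit_counts(text):
    """How many times each digit occurs in the ciphertext."""
    return Counter(text)


def digraph_count(text, pair):
    """Non-overlapping occurrences of a two-digit pattern."""
    return text.count(pair)


counts = digit_counts(CIPHERTEXT)
for digit in "1234567890":
    print(digit, counts[digit])
for pair in ("44", "46", "52"):
    print(pair, digraph_count(CIPHERTEXT, pair))
```

Since transposition only moves digits around, the same digit counts come out if we run this on the encoded plaintext from before the scrambling step.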
Normally, a straight transposition cipher is attacked by “multiple anagramming,” which consists of switching characters around in several ciphertexts of the same length that are suspected to be scrambled in the same way. By detecting which permutations maintain “fitness” with the patterns of the language in all ciphertexts, and which don’t, one can eventually descramble all the ciphertexts at once. But our cipher includes a serial number for the express purpose of defeating this technique, since no two messages are scrambled in the same way. If the serial was a permutation that was applied near the end, there might be some hope of reversing it first, and then working with the partially de-scrambled ciphertexts, but unfortunately for the cryptanalyst the permutations encoded by the serial number are the first to be applied when scrambling, and the last when descrambling. Additionally, there are more sophisticated attacks based on dictionaries or “hill climbing,” but these are still based on fitness measures, which are obscured by artificially enhancing the probability of low-frequency letters by means of the straddling checkerboard.
A similar cipher used during World War I, ADFGVX, was broken by frequency analysis because the Polybius square method used as a first step did not completely obscure the underlying frequencies of the language. Also, ADFGVX used only one transposition step, which induces considerably less unpredictability than even a double transposition.
How about a “known plaintext” attack, where both the ciphertext and its matching plaintext are known? Can we then recover the passphrase, or something like it? This would clearly succeed if only the substitution had been done, for then we could match letters with their codes immediately and thus reconstruct the checkerboard. It would be harder if only transposition had been used because there would be many identical figures in the encoded plaintext competing for each position in the ciphertext, but there would be only so many of these ambiguities, so that eventually the scrambling code could be figured out. If both solution methods proceed in parallel but are independent of each other, they can help one another. The problem is that the solution of one affects the solution of the other, so that we don’t know the starting position of the figures until we know the checkerboard, and we cannot know the substitutions that make up the checkerboard until we know how the numbers move during the scrambling step. The two problems have their difficulty compounded when they need to be solved at the same time.
The diffusion effect is not helping one bit here. Let’s do a quick experiment and switch just two numbers in the first scrambling code (in our example, which involves only two codes), say, the 1 and 2 columns. Then the second scrambling table becomes this:
1 | 4 | 2 | 3 |
3 | 5 | 7 | 6 |
6 | 4 | 5 | 9 |
8 | 4 | 2 | 4 |
0 | 6 | 4 | 4 |
1 | 7 | 6 | 4 |
2 | 8 | 6 | 5 |
5 | 2 | 3 | 5 |
2 | 3 | 5 | 0 |
4 | 3 | 2 | 7 |
4 | 8 | 4 | 8 |
5 | 2 | 6 |
Result: 36801252445 75246635246 6944455078 54467823382. As a table, with the digits that differ shown in brackets:
original |
[446]01252445[9]5246635246[87]444550785[36]67823382 |
altered |
[368]01252445[7]5246635246[69]444550785[44]67823382 |
Out of 43 digits, 8 digits or about 20% have moved to different locations. This is hardly surprising since the first scrambling involved nine columns and two of them were switched, resulting in 2/9 = 22.2% of digits moving to different locations. If the two columns had been switched in the second table (four columns), we would have altered around 50% of the ciphertext. This means that we won’t necessarily know that we are getting close to a solution until we almost run into it. But solution methods invariably rely on some sort of fitness indicator for incomplete solutions, which allows the cryptanalyst to find the plaintext by successive approximations. If the indicator shows no better fitness until the very end of a successful de-scrambling, it becomes very hard to get even started.
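The experiment above is easy to repeat with other column swaps. Here is a minimal sketch: it redefines the columnar transposition so that it runs on its own, writes the scrambling codes as lists of column numbers, and uses names of my own choosing.

```python
def transpose(digits, ranks):
    """Write digits in rows under the key, then read the columns in key order."""
    width = len(ranks)
    columns = {rank: digits[i::width] for i, rank in enumerate(ranks)}
    return "".join(columns[rank] for rank in sorted(columns))


def scramble(digits, all_ranks):
    for ranks in all_ranks:
        digits = transpose(digits, ranks)
    return digits


ENCODED = "4322543285547215859704370628668564564644243"
TRUE_KEYS = [[1, 2, 3, 8, 6, 4, 7, 5, 9], [1, 4, 2, 3]]
NEAR_MISS = [[2, 1, 3, 8, 6, 4, 7, 5, 9], [1, 4, 2, 3]]   # columns 1 and 2 swapped

original = scramble(ENCODED, TRUE_KEYS)
altered = scramble(ENCODED, NEAR_MISS)

# Positions where the two ciphertexts disagree.
changed = [i for i, (a, b) in enumerate(zip(original, altered)) if a != b]
print(len(changed), "of", len(original), "digits differ, at positions", changed)
```

Swapping two entries in the second list instead of the first shows how much more damage a small error in the last, narrower table does.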
It is a lot like walking in the dark to find an exit, except that picking a direction and staying with it until a wall is found, then following the wall until a door appears, is not going to work. There is no wall that we can follow. Instead, it’s more like finding a trapdoor in the middle of a dark field. We won’t know where it is until we’ve almost fallen into it. We might be walking only a few feet away and never know it was there.
A final word on how to deliver the ciphertext. Often it will be desirable to hide the fact that encryption is being used. A ciphertext consisting entirely of decimal digits can be disguised as a list of phone numbers, for instance, or as some sort of statistics. If we want to make the output appear as a series of 10-digit phone numbers, it will be best if the final ciphertext length is a multiple of 10. In order to make this happen, we will add some gibberish digits to the just-encoded plaintext (after step 3) to reach the appropriate length (remember that the serial, if any, is added at the end of the process), which will be scrambled along with the real message. The gibberish digits, which are technically named “nulls,” will be easy to distinguish from the plaintext at the end of the decryption process because they likely won’t decrypt as anything meaningful. Nulls make cryptanalysis harder because they tend to disguise the statistical properties of plaintext.
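If a computer happens to be at hand when preparing the message, padding with nulls can be sketched like this. On paper you would simply make the digits up; here I assume Python’s standard `secrets` module as the source of random digits, and the function name is hypothetical.

```python
import secrets


def pad_with_nulls(encoded, block=10):
    """Append random digits so the length becomes a multiple of `block`."""
    shortfall = -len(encoded) % block
    nulls = "".join(str(secrets.randbelow(10)) for _ in range(shortfall))
    return encoded + nulls


encoded = "4322543285547215859704370628668564564644243"   # 43 digits
padded = pad_with_nulls(encoded)
print(len(padded), padded)
```

The padded string is what goes into the scrambling step, so the nulls end up spread throughout the ciphertext rather than sitting at the end.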