And I’d be adding Visionnaire and Worm as well. All of these ciphers resist ciphertext-only attacks quite well because the ciphertext they generate looks quite random (increasingly so as the number of letters per operation increases) and trying to decrypt with the wrong key yields a “plaintext” that looks completely random even if the key is off by a single character, but they fall to a known-plaintext attack right away. In this article, I discuss how this would be done, and what can the sender do to counteract the attack.
Consider the simplest case, which is the Visionnaire cipher with the same mixed alphabet used on all sides of the Tabula Recta. The mixed alphabet input is equivalent to applying a function f(P), where P is a plaintext letter, and the output would be equivalent to doing f-1(P), involving the inverse of the previous function. Now, we are the attacker and have come up with a ciphertext C for a plaintext P that we happen to know as well. Can we deduce the function f(x) from this, which would be the same as deducing the substitution key?
Sure. All letters in the ciphertext after the initial segment, which involves a separate “seed,” obey the formula:
Ci = f-1(Pi-n – f(Pi))
where i is the index to a given letter (plaintext or ciphertext), n the length of the seed, and all additions and subtractions are modulo 26. We then obtain:
f(Ci) = Pi-n – f(Pi)
If the text is sufficiently long, we will have all possible letters in the plaintext and also in the ciphertext. Now, the function f(x) only has 26 values, one for each letter in the alphabet. Therefore, we need only 26 equations like the one above this paragraph to have all the values of f(x). This is a linear system that is solved in no time, and then we’ve got the substitution key. Never mind what the seed happens to be so long as we know its length (which can be guessed or obtained from statistical measures on the ciphertext). Once we know the key, we can decipher the bulk of any other message encrypted with the same key even if the seed is unknown, because the seed letters only affect the first n letters of the ciphertext. The seed itself can be recovered from applying the same process to the first n letters of the ciphertext, yielding equations of the form:
f(Ci) = Si – f(Pi)
where S is the seed. If we use two different keys on the input and output sides of the Tabula Recta, then we are dealing with two functions f(x) and g(x), each with 26 possible values and the equations to be solved are of this form:
g(Ci) = Pi-n – f(Pi)
In this case we will need 26 x 2 = 52 equations, but the process is still trivial. But what if the cipher involves more letters in each operation? Let’s look at Serpentine, this is the most complex one presented on this blog, and let’s use different keys for input and output. The equations become:
g(Ci) = Pi-n – Pi-n+1 + Ci-n – Ci-n+1 – f(Pi)
which leads to a linear system of 52 equations that is still just as trivially solved as Visionnaire. Even worse, Worm involves the seed in the operations after the initial phase, yielding equations of the form:
g(Ci) = Pi-n – Si mod n + Ci-n – f(Pi)
which would allow the attacker to retrieve the seed by adding another n equations to the set. What’s to be done, then?
The sender’s only hope is to use a plaintext that is different from what the attacker has. Some strategies:
- For starters, prepend the plaintext with a number of random “null” characters equal to the seed length. This will prevent the attacker from obtaining the seed even if the substitution keys have been obtained from solving a linear system (except for Worm, where the linear system also gives the seed). If the cipher involves ciphertext letters in the enciphering operation (Zigzag, Skink, and Serpentine do, Visionnaire does not), then the attacker won’t be able to decrypt another message encrypted with the same substitution keys and seed so easily, even if he/she/it is able to obtain the substitution keys. But still the cipher would have been greatly weakened if the seed is short, so use a long seed containing several dozen characters.
- Cut the plaintext at some point in the middle, like a deck of cards, and place the second part ahead of the first, perhaps with an easily distinguishable code separating the two parts. This will force the attacker to solve the system for every possible cut position. Of course, a computer can do this in very little time unless you also do the item below.
- Even better, throw some nulls characters into the plaintext at irregular intervals. This will force the attacker to try every possibility before he/she/it can find the good one. The more nulls and the more “random” their locations (sometimes they’ll need to be next to each other), the better.
- And better still, do a keyed transposition on the plaintext or the ciphertext (or both). This can be done quite easily since a mixed alphabet contains no repeated letters and therefore can also be used quite well as a transposition key. Then the ciphertext characters won’t be matched to any particular plaintext characters, and making guesses is as difficult as guessing the key itself. Anagramming, which is typically used in the cryptanalysis of transposition ciphers, won’t work because of the additional encryption steps.
Option 4 actually works quite well. I have made versions of Sink and Worm that add transposition steps so that there is no way to know (without knowing the separate transposition key, which is also a mixed alphabet having 26! combinations) which ciphertext letter matches which plaintext letter. Transpositions involve quite a bit less work than operations on a Tabula Recta, so the amount of work added is not too great. The enhanced Skink, called SuperSkink, adds a single transposition of the plaintext when encrypting, which needs to be reversed at the end when decrypting. The enhanced Worm, which I’m calling SuperWorm, has a transposition at the beginning and a reverse transposition at the end under the same key, so that the encryption and decryption still involve identical processes. The same trick is involved in SuperFileine, which is the version of Serpentine for files.
Bottom line, there’s no free lunch. These are simple hand ciphers, and quite secure against ciphertext-only attacks, but proofing them against known-plaintext attacks takes some extra work. Of course, you could always use a computer-base cipher, which obtain their resistance against known-plaintext attacks from operations that are so complicated that reversing them takes more computation than simply trying to brute-force the key.
Or you could use a long text key, like Snake and other ciphers of its kind. Snake is not susceptible to the attack we’ve just seen because inverting the output substitution gives us this equation:
f(Ci) = Ki – f(Pi)
where K is the key text obtained by combining three pieces of text on the Tabula Recta. Since K is pretty close to being statistically random, the value of Ki is never known a priori, and every time we add an equation by looking at a new position in the plaintext and the ciphertext we also add an unknown, so we can never make a solvable system of equations.