The Visionnarie cipher

Blaise-Vigenere.0 (1)The autokey cipher was invented nearly five hundred years ago by Blaise the Vigenère, pictured at left, but was almost immediately forgotten in favor of a much weaker repeating-key cipher invented by Bellaso, once upon a time known as “the undecipherable cipher,”  which Vigenère somehow got credit for. Given how many important secrets were revealed when that cipher was broken, the history of the world might have been quite different if Vigenére’s true creation had been the one people actually used. And this is the Visionnaire cipher: a simple combination of Vigenére’s autokey cipher with a substitution, made quite seamless to the user by means of a Tabula Recta. It turns out to be almost as strong as Worm, and much simpler to do. We can only speculate what might have been if this variation had been used back then. Nothing really prevented it.

Visionnaire involves two alphabetic keys plus an alphabetic seed. The keys are converted into mixed alphabets, which control simple letter substitutions. The seed is used to initialize the process before the autokey operations begin. Keys and seed can be all the same, but they also can be different, for a truly enormous key space. For our example, let us use the following: key 1 = “wonderful”, key 2 = “counselor”, seed = “marvelous” (this is from Handel’s “Messiah,” in case you were wondering). And let the secret message be: “we are discovered, flee at once.” I have made a Javascript version of it, which you can run as you read what follows. These are the steps for encryption:

  1. Make a mixed alphabet from each of the two keys and place them surrounding a standard Tabula Recta. To make a mixed alphabet from  a key, write down every new letter as you find it in the key; if a key letter has appeared before, write instead the first letter in the alphabet that has not yet been written; after reading all the letters in the key, write the rest of the alphabet in reverse order. Thus from “wonderful” we obtain “WONDERFULZYXVTSQPMKJIHGCBA” (no letter repeats), and from “counselor” we obtain “COUNSELARZYXWVTQPMKJIHGFDB” (“O” appears twice in the key, so the second time we write “A” instead). Then we place those on the edges of a conventional Tabula Recta, this way: 1st alphabet at top and bottom, 2nd alphabet at left and right. This is what we obtain in our example:
    W O N D E R F U L Z Y X V T S Q P M K J I H G C B A
    C | A B C D E F G H I J K L M N O P Q R S T U V W X Y Z | C
    O | B C D E F G H I J K L M N O P Q R S T U V W X Y Z A | O
    U | C D E F G H I J K L M N O P Q R S T U V W X Y Z A B | U
    N | D E F G H I J K L M N O P Q R S T U V W X Y Z A B C | N
    S | E F G H I J K L M N O P Q R S T U V W X Y Z A B C D | S
    E | F G H I J K L M N O P Q R S T U V W X Y Z A B C D E | E
    L | G H I J K L M N O P Q R S T U V W X Y Z A B C D E F | L
    A | H I J K L M N O P Q R S T U V W X Y Z A B C D E F G | A
    R | I J K L M N O P Q R S T U V W X Y Z A B C D E F G H | R
    Z | J K L M N O P Q R S T U V W X Y Z A B C D E F G H I | Z
    Y | K L M N O P Q R S T U V W X Y Z A B C D E F G H I J | Y
    X | L M N O P Q R S T U V W X Y Z A B C D E F G H I J K | X
    W | M N O P Q R S T U V W X Y Z A B C D E F G H I J K L | W
    V | N O P Q R S T U V W X Y Z A B C D E F G H I J K L M | V
    T | O P Q R S T U V W X Y Z A B C D E F G H I J K L M N | T
    Q | P Q R S T U V W X Y Z A B C D E F G H I J K L M N O | Q
    P | Q R S T U V W X Y Z A B C D E F G H I J K L M N O P | P
    M | R S T U V W X Y Z A B C D E F G H I J K L M N O P Q | M
    K | S T U V W X Y Z A B C D E F G H I J K L M N O P Q R | K
    J | T U V W X Y Z A B C D E F G H I J K L M N O P Q R S | J
    I | U V W X Y Z A B C D E F G H I J K L M N O P Q R S T | I
    H | V W X Y Z A B C D E F G H I J K L M N O P Q R S T U | H
    G | W X Y Z A B C D E F G H I J K L M N O P Q R S T U V | G
    F | X Y Z A B C D E F G H I J K L M N O P Q R S T U V W | F
    D | Y Z A B C D E F G H I J K L M N O P Q R S T U V W X | D
    B | Z A B C D E F G H I J K L M N O P Q R S T U V W X Y | B
    W O N D E R F U L Z Y X V T S Q P M K J I H G C B A
  2. Next we process the plaintext so that only letters are left. This involves removing spaces, punctuation, and accents of several kinds, and converting digits to letters according to a prearranged code (such as 1 = A, 2 = B, and so on, it doesn’t matter what). Optionally, we can further strengthen the cipher by prepending a number of gibberish characters equal to the seed length. Since the seed “marvelous” has 9 letters, we’ll put 9 random letters at the start (it doesn’t matter what). Result: “QXFVXGOIV WEAREDISCOVEREDFLEEATONCE”.
  3. Now the initial phase of the encryption. Write the seed above the processed plaintext and do this on the Tabula Recta for each letter of the plaintext until all seed letters have been used: Look up the plaintext letter on the top or bottom of the Tabula Recta, go along that column until the seed letter is found, then left or right to the edge and write down the letter you find there under the plaintext letter. For instance: Q + M = F, X + A = Q, F + R = X. In the end, we have started making a work table, which at this point looks like this:
  4. And now the main phase of the encryption. We are going to fill the spaces below each plaintext letter this way: take the plaintext letter above the spot to be filled and look it up at the top or bottom of the Tabula Recta (“W”), then go along that column until you find the first plaintext letter not yet used in this phase (“Q”, all the way to the left), then right or left to read the ciphertext letter at the edges (“P”) and write it down in its spot, then mark the position of the second letter you used so you don’t use it again. Keep doing this until the bottom row is filled. Thus, W + Q = P, E + X = J, A + F = L, and so on. In the end, this is the way the work table looks, including dots added to mark second-letter positions that have been used:
    ......... ................         

The bottom row is the ciphertext, which you can send out through insecure channels: “FQXZJQVCLPJLPJNIIDHKGWCCUYDYGMPULO”.

Decryption is quite similar, though not identical. The first two steps are the same, except that we do not add any gibberish letters in step 2. In step 3, we write the ciphertext, not the plaintext, immediately below the seed. And then, we look up ciphertext letters on the left or right side, not the top, and read them at the top or bottom. This is our work table after step 3:


The operations in step 4 are also similar, but we take plaintext (bottom row) rather than ciphertext (second row) for the second letter, and we start on the left or right side and read the result of the operation at the top or bottom. Thus, P + Q = W, J + X = E, L + F = A, and so on. In the end the work table, including markers, looks like this:

......... ................         

Since we know that the first nine letters are nulls (because the seed has nine letters), we strip them now from the beginning of the bottom row, to read the original processed plaintext: “WEAREDISCOVEREDFLEEATONCE”.

Some explanation on why we are following these patterns. In each encryption operation, we are applying the first mixed alphabet to substitute the first plaintext letter of each operation, and the second to obtain the ciphertext letter. This means that we need to reverse the substitutions in the opposite order when we are decrypting, and this is why we enter the Tabula Recta on the left rather than the top; the first substitution to be reversed is the one involving the second alphabet, and the last to be reversed is that involving the first alphabet. Of course, if the same key is used for both alphabets, it doesn’t matter whether you start at the top or on the side.

And now the important question, how secure is this? Let’s start with the obvious: brute force. Each key is equivalent to a complete mixed alphabet, which maxes out at 26! = 4E26 possibilities. Then the seed can be as long as we want to make it, with no limit. If we take the same number of variations for the seed, we come up with a key space with (26!)^3 = 6.56E79 possibilities, which in terms of binary bits is log10(6.56E79)/log10(2) = 265 bits, quite competitive with today’s best computer ciphers (256 bits). In other words, forget about trying every key until one works.

Ah, but computer ciphers are designed so that just changing one bit in the key changes the result completely (actually, 50% of the bits, on the average), so that there is no way for someone trying keys to tell whether he/she/it (most likely “it”) is getting closer to the good one. Otherwise, a computer program could converge step by step to the correct set of keys, as this paper shows for the so-called Vigenère cipher (with a repeating keystream) with a substitution applied after the keystream is added. Does Visionnaire have a similar property? Well, it does. Let’s say we are decrypting the ciphertext in the example and have already gotten right the first key and the seed, but for the second key we try “counseloP” (only one letter at the end is different, involving just one substitution). This is the “plaintext” we obtain, after removing nulls: “LRATEEISCPAEUEEFLERETVNBE”, which contains some of the correct plaintext, but not much, just 11 letters or about half. Most likely the cryptanalyst is using some statistical measure applied to the resulting plaintext, such as Friedman’s Index of Coincidence, to detect whether or not it is a human language and not random letters. The Index of Coincidence of this plaintext (with the nulls still on) is 0.0659, which would detect as normal English. If we make the key “counselo” (one letter missing), the IC goes down to 0.0338, completely in the random range. Now, this is somewhat to be expected with our example plaintext because it is very short, but the result is the same with plaintexts tens of thousands of letters long, as well as for any other error introduced in keys or seed. A computer trying to “get closer” to the correct keys and seed by looking at statistical measures is going to get very frustrated.

The most sensitive measure I’ve found so far is the chi-squared test for independence of two letters separated by a certain distance. In human languages, consecutive letters fail this test quite readily, and less so as the distance grows. Letters separated by ten spaces and beyond are essentially independent (chi-square value below 671). When we apply this measure to Visionnaire-generated ciphertext, it usually goes up to a little over 900 for texts containing around 20,000 letters (higher for longer texts) for consecutive letters, and again for separated a number of letters equal to the seed length. Not a lot, but distinguishable from random. This means that an attacker could run this test and thus determine right away: 1, that the message was likely encrypted with Visionnaire; 2, the length of the seed. This is no good because, even though the actual letters making up the seed only affect a few letters near the beginning of the ciphertext (most letters are combined with letters from the plaintext itself), this gives away the distance between plaintext letters that are combined to make the ciphertext. It then becomes a matter of finding the correct substitution keys (which should be different, or at lest of different length, from the seed).

However, if we apply this measure to “plaintexts” generated by trying to decrypt a ciphertext using keys that are off by a single letter, the measure falls below the random threshold (671), no matter what separation between letters we test. We can be just one letter away from the solution and the statistical tests won’t give a hint that we are so close. This is like trying to putt blind on a perfectly flat green as big as the universe.

Known plaintext attack? Unfortunately, Visionnaire would succumb to this attack right away, as I discuss in this article. The attacker only needs to set up a system of at most 52 linear equations in order to obtain the substitution keys, and then doing the same for the letters in the initial portion of the ciphertext would give the seed. So be very wary of stereotyped beginnings and endings and boilerplate text. Don’t do it. Adding some nulls at the beginning as recommended protects against an attack obtaining the seed. Since the text combined with the seed in the initial phase is random, or at least not guessable, there is no way the attacker can tell a good guess from a bad guess by just looking at that piece of ciphertext. Then sprinkle a lot of nulls throughout the plaintext, and cut it into two places which are arranged in reverse order before encrypting.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.