Aphid cipher

aphidAphids are those pesky green bugs that eat the plants in your garden. But here the name designates a high-security paper-and-pencil cipher similar to Restonia, based on a couple ciphers that are more than a hundred years old.

Aphid derives from the “bifid” cipher invented by Philip Delastelle in 1895, to which it adds an extended columnar transposition step. Bifid was never used in practice despite its excellent properties, but its inner design influenced cryptography in a profound way, so that today’s computer-based symmetric encryption algorithms are a longer, more complex implementation of the same basic ideas: fractionation and diffusion. Fractionation means that every plaintext character is broken apart into several pieces, and diffusion means that the pieces are sent to different spots on the ciphertext, where they might be recombined into something else. Unfortunately, the original bifid cipher was easy to break because the plaintext pieces were sent to predictable locations, which makes it possible to reconstruct the original before encoding, at which point it can be solved like any monoalphabetic cipher. Like all ciphers devised before the computer age, bifid also suffered from short keys, liable to be found by exhaustive brute-force trial of all possible combinations.

Another classical cipher in Aphid’s family is ADFGX, which actually saw use during World War I on behalf of the German army. This cipher also used a 5×5 Polybius square for its substitution step, and then it had a simple columnar transposition. ADFGX was soon broken, however, because a single transposition does not really obscure the plaintext very much. We know now that just a double transposition is much better. In addition, there was no recombination of the 2 parts made from each plaintext letters, which wasted much of the potential of fractionation. Still, cryptanalysts were only able to break it on days of great traffic, when operators got overwhelmed and began reusing keys for a large number of messages, some of them with stereotyped contents known to the Allies.

In order to strengthen bifid and ADFGX into a derivative that can resist modern computer-aided cryptanalysis, we are going to follow a similar approach as in the Restonia cipher, where transposition based on a long passphrase is combined with substitution into digits based on a straddling checkerboard. The main difference between Aphid and Restonia is that Aphid uses a “Polybius square” substitution rather than a straddling checkerboard, and that the encoded result is converted back to letters at the end of the whole process. To illustrate the instructions, we will encipher “MEET ME AT THE STATION MONDAY AT 5 PM” using the passphrase “secret code” (too short, but will serve for illustration purposes) as the serial “ABC”. These are the steps:

  1. 1. Serial code (optional). In order to prevent multiple anagramming attacks, which are possible when two messages of the same length are encrypted with the same key, it is a good practice to use a different serial code for each message. The length of the serial code depends on how many messages will be exchanged. Three characters are good enough for a few hundred messages. For our example, we will use “ABC” as serial.
  2. 2. Code generation. Two codes are made from the passphrase and the serial: one for encoding the plaintext by substitution, and the other for scrambling the encoded result.
  3. The easiest to make is the scrambling code: just number each word in the serial + passphrase, according to their order in the dictionary; if a letter is repeated, begin from the right (for better scrambling). If a word contains less than four letters, which would lead to insufficient scrambling, merge it with the next one before finding the order. For the serial + passphrase “ABC” + “secret code” = “abcsecret code”, this results in: “124863759,1423”
  4. The encoding table is made into a 5×5 “Polybius square”, which contains the letters of the alphabet (J is combined with I). We fill it this way:
  5. Take the passphrase and place each new letter you find, in sequence, on the first row beginning from the left, and continuing on the other rows as needed. This is the first partial table in our example :
1 2 3 4 5
1 S E C R T
2 O D
3
4
5
  1. Now fill the table with the rest of the dictionary in reverse order, for better security. This is the result:
1 2 3 4 5
1 S E C R T
2 O D Z Y X
3 W V U Q P
4 N M L K IJ
5 H G F B A

To ease the encoding of long messages, this can also be expressed as an alphabet, with the coordinates for each (table,row,column) written below each letter:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
5 5 1 2 1 5 5 5 4 4 4 4 4 4 2 3 3 1 1 1 3 3 3 2 2 2
5 4 3 2 2 3 2 1 5 5 4 3 2 1 1 5 4 4 1 5 3 2 1 5 4 3
  1. 3. Encoding. Now we convert our plaintext into a numerical code. Below each plaintext letter we write vertically the two-digit code for each letter. In order to write numerals, we’ll write “X”, followed by two digits for each figure according to the following table (in the text, we’ll write “X” as “CS” or similar), and the code “55” at the end of the digits. The rest of the codes can be used for punctuation signs, etc.:
1 2 3 4 5
1 1 2 3 4 5
2 6 7 8 9 0
3 . , etc.
4
5 end

Here’s the result of encoding:

MEET ME AT THE STATION MONDAY AT X5end PM
4111 41 51 151 1151424 424252 51 215   34
2225 22 55 512 1555511 211254 55 455   52

If we want the final result to be a certain length—perhaps in order to disguise it as something else—this is the place to add as many gibberish pairs as needed, usually at the end of the text.

  1. 4. Scrambling. The numerical code is now scrambled with each code we obtained earlier. To do this, we first write the first scrambling code, and then underneath it we write left to right by rows the code we just obtained in the previous step, starting from the first row, then the second row, then the third. Then we read the scrambled code vertically, beginning with the column marked as “1”, then “2” and so on until the end. Then we scramble the result in the same way with the second scrambling code, and so on until there are no more scrambling codes. For our example, we obtain these two tables, whre the top row in both cases is the scrambling code:
1 2 4 8 6 3 7 5 9
4 1 1 1 4 1 5 1 1
5 1 1 1 5 1 4 2 4
4 2 4 2 5 2 5 1 2
1 5 3 4 2 2 2 5 2
2 5 5 5 1 2 1 5 5
5 5 1 1 2 1 1 2 5
4 5 5 4 5 5 5 2

Result read off by columns in the given sequence: 4541254 1125555 1122215 1143515 1215522 4552125 5452115 1124514 142255

1 4 2 3
4 5 4 1
2 5 4 1
1 2 5 5
5 5 1 1
2 2 2 1
5 1 1 4
3 5 1 5
1 2 1 5
5 2 2 4
5 5 2 1
2 5 5 4
5 2 1 1
5 1 1 2
4 5 1 4
1 4 2 2
5 5

Result read off by columns in the given sequence: 4215253155255415 445121112251112 115114554141242 5525215225521545

  1. Decoding. We write this code into two rows of equal length, filling one row at a time, and find the corresponding letters in the Polybius square by reading from top to bottom, taking the upper figure as row and the lower figure as column. This is the result (here we leave numerical shifts alone, if encountered, without trying to shift into numbers and special characters):
4 2 1 5 2 5 3 1 5 5 2 5 5 4 1 5 4 4 5 1 2 1 1 1 2 2 5 1 1 1 2
1 1 5 1 1 4 5 5 4 1 4 1 2 4 2 5 5 2 5 2 1 5 2 2 5 5 2 1 5 4 5
N O T H O B P T B H Y H G K E A I M G E O T E E X X G S T R X

So the final result, which we can transmit by insecure channels, is this preceded by the serial:

ABCNO THOBP TBHYH GKEAI MGEOT EEXXG STRX

Decryption is roughly the reverse of encryption. This way:

  1. 1. Retrieve the serial (if any) from the beginning of the ciphertext.
  2. 2. Code generation. This step is identical to step 2 for encryption, starting from the serial and the passphrase.
  3. 3. Encoding. This is the reverse of encryption step 5. We find the 2-digit codes for each letter in the ciphertext and write it in two rows underneath, then we read it off by rows, to be descrambled in the next step.
  4. 4. Descrambling. This is the reverse of encryption step 4. We begin making a table for the last scrambling code, with as many columns as digits in the code. The number of rows is the numerical ciphertext length divided by the code length, rounded up. The remainder of the division tells us how many cells in the last row should be blanked out. Then write the ciphertext numbers from top to bottom, starting on column 1, then 2, and so forth, filling all active cells in each column. The resulting table should be identical to the last scrambling table made when encrypting, but this time the result will be read off horizontally from the top, left to right. Then the result will be written in columns on a table made for the second-last scrambling code, and so on until the first scrambling code is used. The final result read by rows is the encoded plaintext, which we divide into two equal rows.
  5. 5. Decoding. Now take the encoded plaintext and decode it back to letters (and numbers, if any) where the top number is the row and the bottom number the column in the Polybius square, thus reversing encryption step 3.

This cipher shares many features with the Restonia cipher, but in a different way. Fractionation and diffusion are better because every plaintext letter gets split into two pieces, and the pieces are sent to distant spots within the ciphertext where they recombine to form letters that are most likely different from those in the plaintext. Aphid doesn’t take into account expected plaintext frequencies as Restonia does, but the combination of fractionation and scrambling is very effective to destroy any language-derived statistics present in the plaintext. All the properties having to do with the scrambling are the same as for Restonia, and there are two additional column to row shuffles (one when encoding to numbers, and one when decoding back to text), which come for free with the Polybius encoding. Fractionation can be made even stronger by using a 3x3x3 “Polybius cube” rather than the 5×5 square, and using three-digit codes for all letters. This is what Delastelle’s “trifid” cipher, an evolution of bifid, did back then. The problem is that, having only three different values for the encoded digits, we end up with long, tedious repetitions, and the likelihood of mistakes becomes much greater.

Unlike with the Restonia cipher, the last decoding step is necessary in order to collect the diffusion effect and transform each plaintext letter into two different letters in the ciphertext. Because of this, Aphid encryption and decryption takes a little longer than Restonia, but its theoretical strength is greater due to the more complete diffusion of the plaintext. Check it out and compare it with Restonia, and see which one you like better. I use Aphid when I want the end result to be letters, Restonia when I want it to be numerals.

Just about everything said about attacking the Restonia cipher can be said about Aphid. The main difficulty lies in having to descramble a code whose alphabetic conversion isn’t known. The ciphertext before conversion back to letters contains very few different characters, so that any type of frequency analysis is bound to fail. Frequency analysis would be effective once those have been (correctly) encoded as letters, but this cannot be done before the number codes are descrambled. Anagramming will fail form the same reason: there’s too few different numerical characters to be able to eliminate unlikely solutions. Many grammatically correct solutions are possible for a given ciphertext if one only has to descramble the figures and convert them back to letters. For a typical five-word passphrase (around 60 bits of entropy), the “unicity distance” for English plaintext is 60/3.5 = 17.1. This means that a ciphertext consisting of 17 or fewer characters would be theoretically impossible to crack, no matter how much computational effort is put into it. Which doesn’t mean that longer texts would be easy to crack, either.

The classical method used to crack ADFGX won’t work with Aphid for two reasons: 1, there will normally be more than one scrambling step, which is way better than one, and 2, the final decoding back to letters effectively obfuscates any language-dependent frequency that might have remained, which was needed in order to break ADFGX. In this regard, Restonia still suffers from this weakness, which is only patched up (quite effectively, though) by the fact that the straddling checkerboard used in the initial substitution messes up the most important letter frequencies.

How about using methods for attacking Delastelle’s bifid cipher? In essence, the conventional attack tries first to determine the “period” of the cipher (bifid converts columns to rows by blocks of a certain size) by evaluating certain statistics on the ciphertext. When the period is known, descrambling of the blocks can be done immediately, and then what is left is a digraphic substitution involving a single alphabet, which is readily solved by standard frequency analysis. But this will not work for Aphid because the blocks are not predictably scrambled and, in fact, the whole text is a single block. The closest thing to a “period” to be determined is the number of columns of each scrambling step, and there are several of them all mixed together. Unless the passphrase was very badly chosen, the Aphid “period”, resulting from all the scrambling steps, will likely be longer than the ciphertext length.

I believe Aphid and Restonia are actually stronger than the famous Enigma code, which the Germans used during World War II and was the first routinely cracked with the help of the first digital computers. It was quite an epic effort, nicely rendered in the 2014 film “The Imitation Game.” Enigma was essentially a stream cipher based on a mechanical pseudo-random number generating machine. Aphid and Restonia are block ciphers operating on the full message length. Even today’s digital block ciphers, such as AES, don’t operate on full-message blocks, but on fixed-length blocks, and they use fixed substitution and permutation operations, while in Restonia and Aphid, they are key-dependent.

Their superior resistance to cryptanalysis of modern ciphers lies in the fact that another substitution is made before each permutation, and several “rounds” including a substitution and a permutation are performed before the final ciphertext is produced. As a result, each character in the plaintext ends up influencing every character in the ciphertext, in ways that depend nonlinearly on the key. Aphid is meant to be simple enough to do with pencil and paper, so that the number of operations is reduced to the minimum necessary while providing as much diffusion as possible, which is two positions per character. Its superior strength relative to bifid and other classic ciphers lies primarily in its ability to use arbitrarily long keys, having a much larger entropy, so it stands a chance against computer-aided cryptanalysis.

Restonia cipher

492px-Smaller_Coat_of_Arms_of_Austria_(1815).svg (1)Paper and pencil ciphers are fun and can be useful in a pinch, when all computers around are suspect. In a previous article, I presented “Root,” a cipher that gives decent security and only requires a dumb calculator. In this article, we’re going to try and do the same without even that. Only paper and pencil. And you don’t need to learn Restonian.

“Restonia” is a cipher evolved from the VIC cipher used during the Cold War. The VIC cipher is called this way because its user, Reino Häyhänen, was a Russian spy with code name VICTOR. Even though it is a paper and pencil sort of cipher, the NSA was not able to crack it from 1953 until 1957, when Reino defected to the West. There is an excellent Wikipedia article on this cipher.

Now, the VIC cipher would be cracked easily today because its key was too short. This is a problem shared with just about very classical cipher before computer came in the scene, with the exception of one-time pads. Ever since a team of volunteers cracked the 56-bit DES cipher in less than 22 hours by simple brute-force trying of every possible key, we know that key length needs to be pretty long in order to withstand a computer-supported attack. The problem with long keys, though, is that they are hard to remember.

Unless some mnemonics are used. Dictionary words, for instance, have a minimum of 13 bits of entropy per word, on the average. Thus, a phrase containing just five words has roughly 65 bits of entropy, which in principle is 29 = 512 times harder to brute-force than DES. Phrases with five or more words are still easy to remember. Some examples: “I’ll have the chicken soufflé with fries,” “Don’t tell me you got fired,” “I love the smell of napalm in the morning,” just to mention some possibilities.

The trick is how to turn a longish set of words into a usable encryption key, and how to use such a key in a pen-and-paper cipher. Well, after thinking about it for a while and looking at the lessons of the past—many can be found on Wikipedia, practicalcryptograhy.com, and Kahn’s classic history “The Codebreakers” unabridged version—a new chiffre indechiffrable has been produced, codenamed “Restonia.” This is the name of a hypothetical small country where most children grow up to be spies. The name happens to contain the most frequently used letters in English (and most Latin-character languages, for that matter).

BTW, everything I’ve said so far also applies to the very similar Aphid cipher, which you can also find on this blog. Aphid is more secure because it has a larger degree of plaintext diffusion, but it is also more tedious and takes one extra step. Your pick. Back to Restonia.

Encrypting involves three obligatory steps, plus a couple optional steps. We are going to illustrate the method by enciphering “MEET ME AT THE STATION MONDAY AT 5PM” using “secret code” (too short, but will serve for illustration purposes) as passphrase and serial “123.” These are the steps:

  1.    Serial number (optional). Security is enhanced if each message is given a different serial number, which is never reused (at least for messages of equal length). A three-digit serial will allow us to encrypt hundreds of messages with one passphrase. Let the serial number for our example be “123.”
  2.    Code generation. Two codes are made from the serial + passphrase: one for encoding the plaintext, and the other for scrambling the encoded result.
  3. The easiest to make is the scrambling code: just number each word in the serial + passphrase, according to their order in the dictionary, taking into account numerals that may already be contained in the serial (which is pre-pended to the first word). If a word contains less than four letters, which would lead to insufficient scrambling, merge it with the next before finding the order. If a letter or number is repeated, begin from the right, for better scrambling. For the serial “123” and passphrase “secret code” = “123secret code”, this results in: “123 864759,1423”
  4. The encoding table is made into a 3×10 “straddling checkerboard,” which contains the letters of the alphabet, plus a catch-all symbol for punctuation, and a numbers shift. The first row contains only high-frequency letters (coincidentally, those in the name “Restonia”), with two spaces blanked out. The other two contain the rest of the alphabet, etc. We fill it this way:
  5. Take the passphrase and count the number of letters in the first two words that have different lengths (don’t merge them with the next even if they are short; if they are all equally long, go up by one for the second digit) = 6,4. Consequently, blank out cells 6 and 4 on the first row.
  6. Take the passphrase again, and place each new letter you find, in sequence, either in the first row if it is part of “restonia” or in the second and third, beginning from the left. This is the result in our example:
S E R blank T blank O
C D

iii.     Now fill the rest of the table this way: Top row, the rest of the letters in “restonia”, in reverse order (for less predictability). Rest of rows, the special characters, followed by the rest of the dictionary, also in reverse order. This is the result with labels added at top and left:

1 2 3 4 5 6 7 8 9 0
S E R blank T blank O A I N
6 C D # Z Y X W V U
4 Q P M L K J H G F B

The top labels are just the digits 1 to 0 in sequence, those on the left, starting with the second row, are the lengths of the first two words of the passphrase (consecutive, if all words are equally long). This is our straddling checkerboard.

  1. Encoding. Now we convert out plaintext into a numerical code. The letters on the first row of the checkerboard are replaced by the single digit above them. The letters and symbols on the other two rows are replaced by two digits: the one on the left, followed by the one above. If punctuation is needed, use the special character. To represent numerals, enter the numbers shift symbol, then the straight numeral, and end with two number shifts. Here’s the result:
M EETM EATTH ESTATIONM OND AY AT# 5# # P M
4322543285547215859704370628668564564644243

If we want the final result to be a certain length—perhaps in order to disguise it as a list of phone numbers or whatnot, more on this at the end of the article—this is the place to add as many gibberish digits as needed, usually at the end of the code.

  1. Scrambling. The numerical code is now scrambled with each code we obtained earlier. To do this, we first write the first scrambling code, and then underneath it the long numerical code we just obtained in the previous step, left to right row by row. Then we read the scrambled code vertically, beginning with the column marked as “1”, then “2” and so on until the end. Then we scramble the result with the second scrambling code, and so on until there are no more scrambling codes. For our example, we obtain these two tables:
1 2 3 8 6 4 7 5 9
4 3 2 2 5 4 3 2 8
5 5 4 7 2 1 5 8 5
9 7 0 4 3 7 0 6 2
8 6 6 8 5 6 4 5 6
4 6 4 4 2 4 3

Result: 45984 35766 24064 41764 2865 52352 35043 27484 8526

1 4 2 3
4 5 9 8
4 3 5 7
6 6 2 4
0 6 4 4
1 7 6 4
2 8 6 5
5 2 3 5
2 3 5 0
4 3 2 7
4 8 4 8
5 2 6

Result: 44601252445 95246635246 8744455078 53667823382, which can be transmitted out as is, preceded by the serial: 1234460125244595246635246874445507853667823382.

Decoding (optional). Sometimes we may want to transmit letters rather than decimal digits. The simple way to do this would be to just use the checkerboard to get the letter equivalent of the result, but this would cause many of the letters to go back to plaintext, which is bad. Therefore, we will be a little more devious and before decoding back to letters we’ll multiply the numerical ciphertext (without the serial) by the number of letters in the last word of the passphrase, up to 9. If the number of letters is 10 or more, take the last digit. If this last digit is 0 or 1, take the second-last word instead, and so on. If for all words we get 0 or 1, multiply by 7 instead. If the last number requires one more digit after (6 or 4 in our example), double it up and then add ‘XX’ at the end of the output. If the first digit is 0, pre-pend a 0 to the multiplication result before decoding. Then put the serial at the beginning and send it. Here’s what we get for our example:

Numerical code  4460125244595246635246874445507853667823382
Multiplied by 4 17840500978380986540987497782030414671293528
Decoded SOAB TNNIOARANIAZ B IAOF OOAENRNQ J OSEIRTEA
Sent 123SO ABTNN IOARA NIAZB IAOFO OAENR NQJOS EIRTE A

Decryption is roughly the reverse of encryption. This way:

  1. Code generation. First get the serial from the ciphertext and use it in combination with the passphrase to generate the straddling checkerboard and the scrambling codes as described above. This step is identical to step 2 for encryption.
  2. Encoding (optional). This is the reverse of encryption step 5. If the ciphertext consists mostly of letters, we’ll need to get that into a numerical code using the straddling checkerboard. Then we figure out the multiplier from the passphrase (length of last word) as above, and divide the numerical code by it. It should be an exact multiple, otherwise there is an error.
  3. Descrambling. This is the reverse of encryption step 4. We begin making a table for the last scrambling code, with as many columns as digits in the code. The number of rows is the numerical ciphertext length divided by the code length, rounded up. The remainder of the division tells us how many cells in the last row are active, and the rest should be blanked out. Then write the ciphertext numbers from top to bottom, starting on column 1, then 2, and so forth, filling all active cells in each column. The resulting table should be identical to the last scrambling table made when encrypting, but this time the result will be read off by rows from the top, left to right. Then the result will be written in columns on a table made for the second-last scrambling code, and so on until the first scrambling code is used. The final result is the decimal-encoded plaintext.
  4. Decoding. Now take the encoded plaintext and decode it back to letters (and numbers, if any) using the checkerboard, thus reversing encryption step 3. If gibberish digits were added, they will usually decode to gibberish so it’s easy to spot them and ignore them.

The strength of this cipher lies in several features:

  1. Fractionation and diffusion: many letters are split into two-digit codes, and then the digits are separated widely at the scrambling step. High-frequency letters are converted into low-frequency ones by combining with separated digits. This makes it very difficult to do any kind of frequency analysis on the ciphertext.
  2. Compression: often-used characters (r,e,s,t,o,n,i,a) are transformed into shorter codes than less-used ones. This tends to flatten the frequency distribution and helps to keep the ciphertext within the bounds given by the “unicity distance” (keylength in bits divided by 3.5, for English), which is the ciphertext length below which it is theoretically impossible to ever decipher the message without the proper key because it can be deciphered to other equally possible plaintexts.
  3. Mechanization: the process can be done quickly on paper without having to think much. In fact, computer-based encryption algorithms usually do something very similar: break characters into bits that are then scrambled according to the key.
  4. Irregularity: there is no fixed length for any scrambling operation, which makes it hard to attack by simple anagramming. The overall length of the passphrase cannot be easily guessed from the result of the scrambling even if the plaintext is known.
  5. Unlimited key: arbitrarily long passphrases can be used, each word adding more complexity to the scrambling process and more diffusion to the characters. It is like writing on a piece of dough that gets repeatedly kneaded according to each word of the passphrase.
  6. Independence: the way the passphrase material is used for making the checkerboard is quite different from the way it is used for making the scrambling codes. The scrambling codes only depend on the ordering of the word letters, not on the letters themselves. The letters matter when making the checkerboard, beyond their alphabetical order. Therefore, there is little danger that one operation would undo the other, even partially, and so it is safe to use the same passphrase as a starting point for both (except for a few very bad choices, which should be obvious to the user right away).
  7. Reusability: since the scrambling operation with irregular key fragments is hard to reverse, there is little danger that the passphrase be recovered from plaintext samples and their corresponding ciphertexts. Still, it is a good practice to include a serial number that never gets used again with the same passphrase, since messages of identical length scrambled with identical scrambling codes are subject to the anagramming attack.

 

Now let’s put on our hacker’s hat and try to crack this baby. The algorithm has been published on this blog, so this is not a secret. The secret is the passphrase and the plaintext. We have the numerical ciphertext from step 3 above: serial 123 + 44601 25244 59524 66352 46874 44550 78536 67823 382. These are the digit frequencies:

digit 1 2 3 4 5 6 7 8 9 0
occurrences 1 5 4 9 7 6 3 4 1 2

Since in English the most frequent letters go in this order: e-t-a-o-i-n-h-r, we might feel tempted to say 4=e, 5=t, 6=a, which would be incorrect because 4 and 6 are actually used for the least-frequent letters. Or, taking this into account, we might say 6=e, 3 or 8=t or a, which ends up being correct only in the guess for “a”. The problem here is that the low frequency letters are summing their frequencies because of the straddling checkerboard method, thus putting them on a par with the high-frequency letters, thus making frequency analysis almost useless.

Let us now look for digraphs and trigraphs. We see “44”, “46” and “52” three times which, knowing the encoding, could be interpreted as “L”, “J”, and “TE” respectively. None of them are actually present in the plaintext. All the frequent digraphs are artifacts resulting from scrambling. There isn’t a single trigraph to sink our teeth into, even though the plaintext is full or repetitions. Clearly a longer piece would have contained some repeated trigraphs, but again it is very likely that they would bear no relationship with the plaintext because they result from figures brought together from very distant original locations.

Normally, a straight transposition cipher is attacked by “multiple anagramming,” which consists of switching characters around in several ciphertexts of the same length that are suspected to be scrambled in the same way. By detecting which permutations maintain “fitness” with the patterns of the language in all ciphertexts, and which don’t, one can eventually descramble all the ciphertexts at once. But our cipher includes a serial number for the express purpose of defeating this technique since not two messages are scrambled in the same way. If the serial was a permutation that was applied near the end, there might be some hope of reversing it first, and then working with the partially de-scrambled ciphertexts, but unfortunately for the cryptanalyst the permutations encoded by the serial number are the first to be applied when scrambling, and the last when descrambling. Additionally, there are more sophisticated attacks based on dictionaries or “hill climbing,” but these are still based on fitness measures, which are obscured by artificially enhancing the probability of low-frequency letters by means of the straddling checkerboard.

A similar cipher used during World War I, ADFGVX, was broken by frequency analysis because the Polybius square method used as a first step did not obscure completely the underlying frequencies of the language. Also, ADFGVX used only one transposition step, which induces considerably less unpredictability than a mere double transpositions.

How about a “known plaintext” attack, where both the ciphertext and its matching plaintext are known? Can we then recover the passphrase, or something like it? This would clearly succeed if only the substitution had been done, for then we could match letters with their codes immediately and thus reconstruct the checkerboard. It would be harder if only transposition had been used because there would be many identical figures in the encoded plaintext competing for each position in the ciphertext, but there would be only so many of these ambiguities, so that eventually the scrambling code could be figured out. If both solution methods proceed in parallel but are independent of each other, they can help one another. The problem is that the solution of one affects the solution of the other, so that we don’t know the starting position of the figures until we know the checkerboard, and we cannot know the substitutions that make up the checkerboard until we know how the numbers move during the scrambling step. The two problems have their difficulty compounded when they need to be solved at the same time.

The diffusion effect is not helping one bit here. Let’s do a quick experiment and switch just two numbers in the first scrambling code (in our example, which involves only two codes), say, the 1 and 2 columns. Then the second scrambling table becomes this:

1 4 2 3
3 5 7 6
6 4 5 9
8 4 2 4
0 6 4 4
1 7 6 4
2 8 6 5
5 2 3 5
2 3 5 0
4 3 2 7
4 8 4 8
5 2 6

Result: 36801252445 75246635246 6944455078 54467823382. As a table, with differences highlighted:

original 4460125244595246635246874445507853667823382
altered 3680125244575246635246694445507854467823382

Out of 43 digits, 8 digits or about 20% have moved to different locations. This is hardly surprising since the first scrambling involved nine columns and two of them were switched, resulting in 2/9 = 22.2% of digits moving to different locations. If the two columns had been switched in the second table (four columns), we would have altered around 50% of the ciphertext. This means that we won’t necessarily know that we are getting close to a solution until we almost run into it. But solution methods invariably rely on some sort of fitness indicator for incomplete solutions, which allows the cryptanalyst to find the plaintext by successive approximations. If the indicator shows no better fitness until the very end of a successful de-scrambling, it becomes very hard to get even started.

It is a lot like walking in the dark to find an exit, except that picking a direction and staying with it until a wall is found, then following the wall until a door appears, is not going to work. There is no wall that we can follow. Instead, it’s more like finding a trapdoor in the middle of a dark field. We won’t know where it is until we’ve almost fallen into it. We might be walking only a few feet away and never know it was there.

A final word on how to deliver the ciphertext. Often it will be desirable to hide the fact that encryption is being used. A ciphertext consisting entirely of decimal digits can be disguised as a list of phone numbers, for instance, or as some sort of statistics. If we want to make the output appear as a series of 10-digit phone numbers, it will be best if the final ciphertext length is a multiple of 10. In order to make this happen, we will add some gibberish digits to the just-encoded plaintext (after step 3) to reach the appropriate length (remember that the serial, if any, is added at the end of the process), which will be scrambled along with the real message. The gibberish digits, which are technically named “nulls” will be easy to distinguish from the plaintext at the end of the decryption process because likely they won’t decrypt as anything meaningful. Nulls make cryptanalysis harder because they tend to disguise the statistical properties of plaintext.

Remember strong passwords with this keyboard trick

keyboardEveryone knows that real people suck at coming up with strong passwords. They are either easy to remember and laughably weak, or decently strong and impossible
to recall. On top of that, it is highly recommended to use different passwords for different sites, so that compromising one won’t compromise the others. In this article, I follow Nobel laureate Manuel Blum’s recommendation of using not a password, but an easy to remember algorithm to come up with a way to generate strong, specific passwords for each site, and be able to remember them all.

In this talk, Manuel Blum asks four volunteers from the audience, who we presume not to have been prepared before the lecture, and explains to them a method which, when given a name to apply it to, leads them all to the same, apparently random result. The video does not reveal the method used, but some articles by Blum speak about mapping the alphabet (plus numbers) into a secret scrambled grid, and applying a secret walk to the successive letters of the name (presumably a website name) to be converted into something else. Thus, the user only needs to memorize the scrambled alphabet and the steps taken in the walk.

I don’t think I could do that, though, so here’s my counter-proposal: use the computer keyboard as my grid, and just memorize the method, plus maybe a simple code that I can change from time to time. Let’s say I want to come up with a strong password for amazon.com. I start, therefore, from the word “amazon”, which I am going to turn into something else. In order to increase security, I memorize the secret code “1234” (maybe I can’t memorize a scrambled alphabet, but this I can memorize). Now I do the following on an American qwerty keyboard like this:

keyboard

  1. Starting with the first letter in the original word, move down (and slightly to the right) on the keyboard as many keys as the first digit of the code. If this causes me to fall off the lower edge, I continue on the top row, on the key directly above the last one I touched. Since the first letter is “a” and the first digit is “1”, I move one key down from “a”, which is “z”.
  2. Repeat with the second letter of the name and the second digit, then do the same until all the letters have been transformed. If I run out of numbers, I take the first number again, and so on. Therefore the other letters are:
    1. “m” + 2 = “i” (wrap to “8” on the first step, and then down to “i”)
    2. “a” + 3 = “w” (wrap from “z” to “2” on the second step, and then down to “w”)
    3. “z” + 4 = “x” (wrap from “z” to “2” on the first step, and then down to “x”)
    4. “o” + 1 = “l” (observe that we go back to “1”, since we ran out of digits on the key)
    5. “n” + 2 = “u” (wrap from “n” to “7” on the first step, and then down to “u”)

Final result: “ziwxlu”, which likely does not appear in any dictionary and is therefore as hard to crack as a group of random letters. If the website demands that you add numbers, go ahead and add a few that you can easily remember (except “1234”, which would compromise the key). This time I will add “1111”, making the final password  “ziwxlu1111”. Never mind that the numerical part is weak; the strength is in the first part, which is one out of 366 = 2,176,782,336 possibilities (numbers are also part of the set).

What we have done is essentially to apply the Vigenère encryption algorithm to the word “amazon”, using “1234” as key and the qwerty keyboard read column by column as starting alphabet. Not secure by today’s standards, but again, we are using it to generate a password, which itself has a small probability of being revealed. Additionally, anyone having access to the password will still have to figure out the algorithm, since what I’ve presented above is just a sample. There are many other ways you can apply a key to a website name. For instance:

  • Do as above but instead moving up, or left, or right on the keyboard. Or maybe alternating directions, or even switching directions in a more complex pattern that is still easy to remember: a cross, a circle, etc.
  • Use an alphanumerical key, and then find the result key by doing a “Playfair” combination. In the classic Playfair cipher, two letters at a time combine into one, this way: if there are in the same row or column, take the letter following the first, right or down respectively, depending on whether they are on the same row (or is the same letter) or column. If they are on different row and column, trace a rectangle with the two letters as opposite corners, and then choose the upper new corner (or lower, or right, or left) as result. For instance: “”q” + “t” = “w” (same row), “r” + “v” = “f” (same column), “i” + “c” = “e” (neither). This method also can be subject to direction changes, if so desired.
  • Use no key at all, and just rely on direction changes to get the result. For instance, Blum used a one-step north-east-south-west repeating walk, which would turn “amazon” into “q,z/9m” on the qwerty keyboard, excluding non-character keys and the space bar.

There are several reasons why this method, even when not using a key at all, is easier and more secure than others.

  1. It is certainly easier than having to remember Blum’s secret square containing a scrambled alphabet. The keyboard is there, so why not use it? Otherwise, I might feel tempted to write down the square on a piece of paper because it is hard to do the translations all in the head. It is still hard to extract the key from a compromised password, since the details of the algorithm are unknown and the text sample very short.
  2. It is more secure than memorizing a single high-security password because, if that is compromised, then all logins are compromised. The result of the algorithm is very random-looking, which makes it hard to crack using a dictionary. Cracking by trying to guess the algorithm is hopeless, since you can use so many different possibilities for that too.
  3. Even if one password is compromised, that does not necessarily reveal the master key (“1234” in the above example) because the attacker does not know the exact series of steps in your algorithm. If he does, of course, he’ll get your key in no time at all, so don’t reveal your method. This is probably why Blum did not reveal his in the video.

Is self-destruct email possible?

Earlier this week, my new app SeeOnce was rejected (hopefully only temporarily) by the iOS app store for allegedly misleading users into thinking that it could make self-destructing messages. Leaving aside what exactly “self-destruct” means for a digital message and whether or not SeeOnce actually achieves this, a number of current and past apps have claimed precisely this. In this article, companion to the one on Privnote vs. SeeOnce, I go over these apps, how they work, and how they can be used most profitably. Read More

Privnote vs. SeeOnce

In this post, I review Privnote and compare it withSeeOnce. Both apps claim to generate self-destructing messages. Which one will be the winner?

Ladies and gentlemen, on this corner is Privnote, a really, really simple app to send self-destructing messages to someone. It works like this:

  1. Navigate to https://privnote.com. The interface contains a single box where you write your message and a big red button to create the note. Nobody asks you for any password or private info of any kind. There is an additional button for some options, such as encrypting it with a password (using AES), changing the default time for self-destruction, or getting an emailed confirmation that the note has been read and destroyed.
  2. Write the note and click the big red button. The box is replaced by a link like this: https://privnote.com/cv0Lcrsw#IgdQIQnTL already selected for you to copy.
  3. Copy the link and paste it into an email or whatever.
  4. When the recipient clicks on the link, Privnote loads on a secure page and displays the original message, plus assurances that the message has been destroyed at the source. Sure enough, reloading the page displays a warning that the message no longer exists.

Privnote is beautifully simple and it seems to work. Can anyone beat it?

On this other corner is SeeOnce, which also claims to generate self-destructing messages in a pretty simple matter. SeeOnce works like this:

  1. Navigate to https://seeonce.net. You are immediately asked to come up with a Password, though you are assured that you are not making an account anywhere. After supplying this, the interface contains a single box with way more buttons than Privnote, though many of them are things like “Select”, “Clear”, “Help”, etc. (Privnote doesn’t have any help, perhaps because it doesn’t need any?). There are no options to set in SeeOnce.
  2.  Write the note and click the highlighted Lock button. Now another dialog pops up, asking you to identify the recipient on a list, or send it as an (insecure) invitation to a still unknown user. This dialog doesn’t appear if you loaded SeeOnce from a link sent by someone else.
  3. After you do this, a piece of text containing a longish random-looking link fills the box. The link may be something like this: https://SeeOnce.net#@/gq1wS2sus6zUegwYZQ7+AMOLEAqBnAyPTd1Fff1lxI1MIURDA6igSnUiHI8pByPtcUX3lfSUS/GqTovQa46NoSu3tFddibJKieDgFI7XyWFw= and it is already selected as in Privnote.
  4. You can copy and paste the contents (the link alone is enough) into email or texting, or simply click the Email button, which will put the whole thing into your default email. Alternatively, you can click a Hide button, which will convert the stuff into apparently normal text taken from a cover text (a popup asks you for the cover text, if you haven’t loaded one yet), before emailing it.
  5. When the recipient clicks on the link, SeeOnce loads on a secure page and asks for a Password. After supplying this, the original message is revealed. Reloading the link and re-typing the Password leads to a message stating that “unlocking has failed” (SeeOnce needs to exchange two messages between the same parties before this happens right away, otherwise the link does not fail immediately but rather after writing a reply).

A little more complicated than Privnote, but still quite manageable. Now, the devil is in the details, as they say. We need to look at what’s inside as well as the features and the overall simplicity of the process. Price is not much of an issue since both apps are free, but availability on different platforms might be.

Price. winner: SeeOnce

Both apps are free, but Privnote has ads. This is not only uncool, but poses a security risk since the ads could potentially inject malicious code into the page, compromising everything. SeeOnce, on the contrary, stays true to the open source ethos and contains no ads. SeeOnce can do this because it doesn’t rely on servers for its operation and therefore expenses are insignificant.

Simplicity. winner: Privnote

It’s hard to be simpler than Privnote: you click a link, enter a message, copy a link; on the receiving side, just click a link and you’re done. SeeOnce is almost there, but it does ask you to come up with a Password, which is extra work and requires the user to exercise his/her memory, never a good thing (we’ll see later that this isn’t as bad as it looks). On the other hand, emailing can be done without cut and paste by just clicking an Email button. Still, Privnote wins this one.

Features. winner: SeeOnce

Privnote does have a few extra settings, such as the ability to encrypt the message with a chosen password rather than the default 54-bit key (but then, how do you send the password to the recipient in a secure manner?), whereas SeeOnce encryption is always under user control (and this is why it asks you for a Password before it does anything). Privnote also has the ability to send a read receipt, which SeeOnce lacks (we see why below). Still, SeeOnce wins this one because it has a comprehensive Help system (to its credit, Privnote hardly needs one) and the ability to hide its output as innocent text, which can be life-saving in places where encryption is frowned upon. SeeOnce also has the ability to switch to secure real-time chat if the correspondents find themselves emailing one another every few minutes.

Availability. winner: tie

Both apps are available from secure links on a regular browser, though SeeOnce can run offline and Privnote cannot. SeeOnce is also available as a Chrome extension and in the Android store. So SeeOnce has an edge here, but I’m going to call it a tie since sending emails requires Internet and most likely a browser.

Security. winner: SeeOnce

Ah, here’s the biggie. Both apps stem from radically different approaches to achieve the same goal. Privnote is fundamentally server-based (except its encryption option, which appears to be client-based), whereas SeeOnce is strictly client-based (after the web server delivers the code, that is). Let’s see what’s underneath each one:

  • In Privnote, the message (encrypted with a symmetric key, which is sent in plaintext with the link but the server does not see) is sent to a server, where it is stored. Clicking on the sender-generated link first instructs the server to send the encrypted message the recipient’s machine, where it is decrypted with the key contained in the link. The Privnote server will deliver the data if this is the first time this particular link has been clicked by anyone, and the other optional conditions, such as expiration date, have been met. Then the server deletes the stored data, or so we are told, so that a repeated request using the same link cannot be met. Still, Privnote can tell the difference between an expired link and one that was never issued, which leads me to think that some memory of having stored the message remains for a while.
  • In SeeOnce, the message is locally encrypted with the result of combining a public key, which was received in a previous message from the same correspondent, and a random private key that is stored locally and is overwritten when a new message for this correspondent is encrypted. The underlying protocol is fairly complex but transparent to the user. SeeOnce never communicates with servers, so the reason why a message “self-destructs” (actually, no longer can be decrypted) is that at least one of the keys has been overwritten and cannot be obtained anywhere else, even if someone has been copying every message exchanged. This is also why SeeOnce cannot produce a read receipt: it was a different program that actually sent the message; the SeeOnce server never knew about the sender or any of his/her data.

There are three reasons why the approach followed by SeeOnce is much more secure:

  1. The first one is that Privnote displays the decrypting key in plaintext (or an equivalent, given that the client-side code can be viewed at any time) as part of the link. It needs to do this because it does not ask for any information about the recipient before preparing the link, so anyone should be able to follow the link. If the link is sent by email, for instance (and remember we are encrypting the message because we believe email to be insecure), the link can be nabbed by someone else, who then can read the message, instead of the intended recipient. Getting some control over who can actually read the message would require some sort of recipient authentication, a password at the minimum, which is what SeeOnce does.
  2. Whenever data is stored in a server, the user loses control over it. Privnote can say they have destroyed the message until they are blue in the face, but they cannot prove it. If a government agency serves them a request to make a copy, they might be doing it without the users’ knowledge. A hacker can break in and look at the data. The server itself may be saving the data as part of a scheduled backup. Now, Privnote states that this data is encrypted with a key that is not sent to the server, but since that key is included at the end of the link sent by email (otherwise the recipient would never be able to decrypt the message), if the link is compromised as we saw above, then the agency or hacker can decrypt the message. The only protection against complete loss of security is user-controlled symmetric local encryption with a separate key, which Privnote offers as an option, but then the user has the problem of transmitting the key. SeeOnce stores data only locally, and so this is much less of a problem. Stored data is encrypted by the user Password (is it beginning to look like this wasn’t such a hassle after all?) and can optionally be wiped or moved to a different machine. Anything transmitted is encrypted with a public-key algorithm, so that key transmission is never an issue.
  3. Code executing on a server is invisible to the user. Therefore, a Privnote user has no way of making sure that the code is honest and free from errors. In Privnote, this means the code that supposedly is keeping track of how many times a particular link has been followed, and which deletes the data on the server. On the other hand, the complete SeeOnce code is visible to the user by just typing ctrl-U on the browser. It is long and complex, to be sure, but it hides nothing. The program itself has a mechanism to ensure that the code has not been tampered with by a third party, fully documented in the Help pages.

Both programs have features to recommend them but in the end it comes down to a personal choice: do you value ease of use above anything else, or is it security what you value the most in a security product? Perhaps the only way to tell is to take both for a spin and decide for yourself.

The case for symmetric encryption

In this day and age, everything dealing with encryption seems to be of the more complex asymmetric kind: PGP, SSL, TLS, BlockChain, you name it. So, is the old-fashioned symmetric encryption, where the same key is used to encrypt and decrypt, obsolete and done with? “By no means!” say a number of users. In this article, I begin quoting an email I got recently, adding some of my own musings, and making an announcement after that.

This is what a faithful user sent me. He did not want his name to be published but was okay with my quoting what he said:

The case for symmetric encryption:

There are circumstances when symmetrical encryption, that is where both sender and recipient use the same secret key to encrypt and decrypt messages, is the most practical and safest method for encrypting email.

Whenever a message sender, who is known by cryptographic custom as Alice, wishes to write an end-to-end encrypted email to a recipient, customarily known as Bob, one of two cryptographic systems can be used.

The simpler is symmetric encryption in which Alice and Bob have a single secret key, which is used by both of them to encrypt and decrypt messages. The obvious shortcoming of symmetrical encryption is that before Alice and Bob can email, they need to meet up – or have some other safe channel – through which to communicate the secret key. Asymmetrical encryption solves that problem. Both Alice and Bob have two mathematically related keys, one private one public. For Alice to send Bob an encrypted message she ascertains his public key and encrypts her message using it. The message can only be decrypted using Bob’s private key, which he keeps secret and safe.

It would seem, then, that asymmetrical encryption, involving no prior secret exchange of keys, enjoys a clear advantage, and for many purposes it does. But there are a number of things that can go wrong with asymmetrical encryption, which can’t happen with symmetrical encryption – or at least when the secret symmetric key is agreed face-to-face. Let us look at these:

1. Alice is sending Bob a message encrypted with Bob’s public key. However she needs to authenticate it; i.e. prove the message is from her. Precisely because Bob’s public key is public anybody could encrypt a message using it and then impersonate Alice. To prevent that, Alice “signs” the message using her own private key. To read the message Bob uses his private key; and to verify the authenticity of the message he needs Alice’s public key. The difficulty is solved, but only at the expense of complexity. With symmetric encryption signing and verification are not necessary because the ability to encrypt and decrypt using the single secret key is proof of authenticity.

2. Before Alice can email Bob she needs to find Bob’s public key, which may be on his website or in some other public place. But how can Alice be sure that the website or key server has not been tampered with, and that she is not encrypting the message with a key that can be read by somebody else? Equally, when Bob needs to find Alice’s public key from a public place to verify the message, how can he know it is genuine? If Alice and Bob had agreed a symmetric key face-to-face the issue of tampering and impersonation would not arise.

3. It could happen that Alice or Bob believe that their private key is no longer secret and safe. If someone else had acquired it all his or her incoming mail could be read, but revoking the key from a public place is not easy. To be successful, everyone who has or might use it needs to know of the revocation and of the replacement. With symmetric encryption, compromising one key only affects the two parties involved; they can then easily set up a new key – and maintain different levels of security for each key used with different people. Alice’s communication with Bob can be kept more securely than her communication with Bill, for instance.

Asymmetric encryption therefore brings with it a number of difficulties not suffered by symmetric encryption. The sole disadvantage of symmetric encryption is that Alice 2 and Bob need to agree a secret key face-to-face or through some other safe channel. But in many cases that it is no difficulty at all. It may well be that Alice and Bob meet in person as well as send emails. Alice and Bob could be lovers, or they could be members of a political action group which is under surveillance by security agencies. There is no safer form of email encryption than for Alice and Bob to meet, agree a twenty-character long password consisting of random numbers and letters, and for both of them to keep it secret and safe and to use it to encrypt and decrypt their text emails.

This is what he said, and this is what I’d like to add:

I have been focused on asymmetric encryption for a while because symmetric encryption has the seemingly insurmountable obstacle of key distribution. Namely, that there is no good secure way to get a new key to those who should use it other than meeting face to face. But if meeting is possible, then it is hard to beat symmetric encryption, especially if the agreed-on key is a random string that is deleted from its storage location when a new key replaces it. In this case, old encrypted messages become undecryptable, and they possess the very valuable property of forward secrecy.

PassLok does retain the ability to use symmetric encryption by providing a shared Key, instead of a Lock, whenever the program does an encryption. If the shared Key is not stored in the directory (it can be stored, just like a Lock), then it takes a click of the Edit button to reveal the box where that shared Key should be placed, and another click to get back to the Main screen. But the author of the above piece, and I presume many others, still consider this too confusing when faced with so many other options and buttons to be pressed.

Therefore, I have updated URSA so it does only symmetric encryption, bringing it to version 4.0. You can find it at https://passlok.com/ursa.  Its interface is similar to that of PassLok, replacing the directory box with a shared Key input box. Help is similar to that in PassLok and SeeOnce, but called from a button rather than occupying its own tab. There are no Options to choose. Because it is based on the NaCl engine like PassLok 2.2 (unlike URSA 3.3, which is based on the SJCL engine), URSA 4.0 is not compatible with version 3.3, but it is compatible with PassLok 2.2 so that messages locked with URSA 4.0 can be unlocked in PassLok, and (short mode) messages locked with PassLok can be unlocked in URSA 4.0.

My hope is that URSA can serve as an introduction to practical encryption for many people. After they feel comfortable with shared Keys and the basics of encryption and decryption (“locking” and “unlocking” in URSA and PassLok), then they can move to the fuller experience that is PassLok. If they still feel overwhelmed, they can try SeeOnce for a while, which will introduce them to asymmetric encryption in a relatively simple way, and then try PassLok.

So, check out the new URSA at https://passlok.com/ursa. Perhaps this is all you’ll ever need for your encryption needs.

Which end-to-end encrypted email is best?

After the 2013 Snowden revelations, there has been a push to make email more private than it currently is (which is, essentially, like writing on a postcard). The big guns, Google and Yahoo, have wowed to make their respective email systems end-to-end  (E2E) encrypted but progress has been slow. The official page about the Google effort has not been updated for months (as of June 2015). In this article, I go over some options available today, while we wait for that final solution that, at this pace, might still take a while to come.

There are dozens of options, but here I am going to review only a few that are better known in order to compare their features. Here I compare the following: Enlocked 2, Virtru,Proton mail, Mailvelope and, of course, my own PassLok. All of them promise to make encryption easy, at least compared to what it used to be. Some take pretty drastic shortcuts in order to achieve this, as we shall see. They all perform encryption and decryption at the user end, rather than at the server, since otherwise the server would have access to the plain content, with the consequent risk if subpoenaed or simply hacked. They all run the encryption code in JavaScript, which is a no-no for some, but currently there is no other way to run code locally on a browser.

Given that they all do the essential thing in some way or another, I looked into other desirable features of an end-to-end email system. Here’s a description of each feature:

  1. No need to change address. It’s a hassle if you need to use a different email address in order to have privacy. Here a Yes means that you can continue using your current email provider and address, and still obtain the benefit of encryption.
  2. Provider doesn’t store secrets. If it does, there’s always the danger that they’ll be forced to reveal them. Now, some providers store them in encrypted form, which is better, but still they’re storing something sensitive essentially forever.
  3. Provider cannot read email. As ludicrous as this may sound, at least one of the providers featured in this article is able to read your emails, or enable someone else to do so. We are assuming the encrypted content is easy to intercept.
  4. Account not required. Because some of us are paranoid enough to prefer not being forced to make an account anywhere, to prevent being tracked when we use it, or whatever other reason.
  5. Encrypt attachments. Not all we send is text. Often the sensitive information will be a picture, document, or any other kind of file. Encryption should be able to handle this.
  6. Encryption visible. A study published in 2013 showed that, for the time being, users prefer to see their messages in encrypted form at some point, in order to be assured that they are indeed encrypted before they send them out. They were willing to go as far as cutting and pasting the material in order to get this. This feature does not need to be on 100% of the time, but at least as an option.
  7. Encryption can be disguised. A number of users need encryption because they fear they are under surveillance. They would be even happier if they could make their emails and attachments appear as if they were not encrypted at all. This is called steganography, and at least one of the systems reviewed adds this feature to the mix.
  8. Forward secrecy. This means that the encrypted content remains secure into the future, even if the encryption key or the user password is obtained by an enemy. This is considered essential for instant messaging apps, and would be nice if email could also pull this trick, perhaps as an option.
  9. Multi-platform. It is no good if users need to use a particular kind of device (PC, iPhone, whatever) in this fractioned market.
  10. Portable across machines. This means that a give user should be able to use his/her E2E encrypted email on different machines, possibly of different kinds, with a minimum of risk and hassle.
  11. Multiple identities. What if several family members or coworkers share a computer? Can they keep privacy from each other? What if you’re schizophrenic or have multiple personalities?
  12. Open source. A deal-breaker for many, if it is not. Some of us feel a lot more reassured if the underlying  code is available and experts can subject it to scrutiny.
  13. Crypto engine. There are a number of cryptographic engines out there, some more recent than others, but all the systems presented here use an engine that has accumulated at least ten years of scrutiny.

So here’s the comparison as a table. Yes is good, No is bad. Some entries have a footnote right below the table.

Features Enlocked 2 Virtru Proton mail Mailvelope PassLok
No need to change address Yes1 Yes1 No Yes1 Yes
Provider doesn’t store secrets No2 No No2 Yes Yes
Provider cannot read email Yes No Yes Yes Yes
Account not required No No No Yes Yes
Encrypt attachments Yes Yes3 Yes No Yes
Encryption visible No No No Yes Yes
Encryption can be disguised No No No No Yes
Forward secrecy No No3 No No Yes
Multi-platform Yes5 Yes5 Yes No Yes
Portable across machines Yes6 Yes6 Yes No Yes
Multiple identities No No Yes No Yes
Open source No No No Yes Yes
Crypto engine PGP AES PGP PGP NaCl
Overall score (# of yes) 5 4 5 6 12

1. The app works only with certain email providers, not all of them
2. They do store encrypted private keys, hence the bad score
3. Separate encryption and delivery from server
4. They deny access to the encryption key (paid feature), but the key is not deleted
5. Browser plugins plus apps in iOS/Android stores
6. User secret data saved in servers (encrypted) enables this

Now a short description of each E2E provider, and why they got these scores.

Enlocked 2

Their first version got slammed by reviewers because it was doing the encryption on a server instead of the client. This meant that their server got  to see the plain text of your private stuff, even though they promised (who doesn’t?) that they didn’t store it. Enlocked 2 is a browser plugin (standalone mobile apps exist) that performs PGP encryption on the client. Their server holds each user’s public key so it can be sent to other users who want to encrypt messages for the owners of those keys to read. The plugin automatically retrieves each public key belonging to a recipient in an email and uses it to encrypt the content before sending the encrypted result to the actual email provider. Because it is a plugin, it only works with Gmail, Yahoo mail, Outlook, and a handful more.

In order to achieve portability, Enlocked also stores and serves the user’s private key, previously encrypted by his/her password. This is a problem, since compromising that password or choosing a bad one (Enlocked accepts bad passwords without complaint) makes all your encrypted emails readable by Enlocked or those to whom they give your private key.

Enlocked is a commercial service, which costs $9.99 every month for sending 100 messages, and goes up from there. You can sign up for a free account, though, which allows you to decrypt an unlimited number of messages and encrypt 10 messages every month. Their system seems to be glitchy: I’ve tried for several days to make a free account without success, leading only to error screens that instruct me to contact support (without any link to support, though).

Virtru

On the surface, Virtru behaves very much like Enlocked. They have slick plugins and mobile apps. They encrypt text and attachments. You only need your account password to use it anywhere you go. The difference is that they use symmetric encryption (256-bit AES) instead of asymmetric encryption (PGP). When you encrypt something, you send the actual, per message encryption key to Virtru, and they store it so they can give it to the recipients of your encrypted message. The whole process is quite automatic and makes it possible for new users to be added in a flash, since no initial key generation and exchange process is needed.

The downside is that they can read your encrypted messages, if intercepted (they are sent through your regular provider, from a limited list of supported ones), and enable anyone who asks nicely to do so. This should be a deal breaker for anyone who has real confidential material to protect. To make matters more egregious, they charge for “forward secrecy”, meaning that Virtru promises to no longer give the decryption key to anyone except you (and, possibly, government agencies).

If you don’t want the paid features, Virtru is free to use, though making an account is a must. A Pro account with the extra features will normally set you back $4 a month.

Proton Mail

Proton Mail is Enlocked with an email server thrown in. When you sign up for Proton Mail, you sign up for the whole thing, and you get a username@protonmail.ch email address. No way to keep using your old address except by forwarding. Proton Mail requires the user to memorize two keys: one is for logging into the email server and getting your emails, plus obtaining the PGP public keys of other users automatically; the other is for encrypting your PGP private key, which is stored in their server so you can use different machines. Since they must know your login key in order to give you access, they’d also be able to read your encrypted mails if both keys were the same, hence the need for two separate keys.

Proton mail is accessed strictly as a website, so no plugins or mobile apps are involved. Interestingly, this approach makes it accessible from any device, running any operating system.

Proton Mail just finished beta and is available for signups, everything is free, though not open source. I believe that Google’s and Yahoo’s E2E solutions, when they come, will end up looking a lot like Proton Mail.

Mailvelope

If Proton Mail was Enlocked plus a mail server, Mailvelope is Enlocked minus a key server. It uses PGP like the other two, but key generation is kept strictly to your machine. The user is responsible for distributing his/her public key to others, and for safeguarding his/her private key, which is locally stored encrypted by a password. Mailvelope is only available as a Chrome extension and off the box it is limited to Gmail, Yahoo mail and a couple more, though apparently you can configure other clients manually (I didn’t try). Integration with those is quite slick, however, as Mailvelope detects encrypted incoming emails and offers to decrypt them if only you supply the password.

This extension may be all that many people will ever need, so long as they don’t use encrypted email with a lot of people (public key entry and selection is a bit involved) and don’t need portability (settings don’t sync across machines).

PassLok

Obviously it was a foregone conclusion that PassLok was going to be the winner in a comparison made by its own developer. I won’t dispute that, but the fact remains that PassLok has all the desirable features on the list. You can use it with your existing email (in fact, with any program at all). It doesn’t store your content or even anything secret. It doesn’t talk to servers so you’re not making any account anywhere. It handles files as well as text. It runs as a Chrome plugin, iOS/Android app, or just as a webpage that gets completely cached so you don’t even need an Internet connection after the first time. It includes a full complement of text and image steganography tools so your encrypted material can slip undetected. In its Chrome app incarnation, it even syncs all your stored public keys (which are not secret, but may be hard to obtain again) using Google’s servers. It is possibly the only web app available today capable of making a self-destructing message, where the key needed to decrypt it disappears from the face of the earth as soon as it is read once.

But it must have some defects, right? C’mon!

A lot of PassLok’s security lies in the fact that it is self-contained code, and precisely this is why some things are harder to do than in other systems. PassLok does not interact with email clients, and so the user must manually copy and paste the encrypted material between PassLok and whatever email client you are using. This is a hassle, but it has the advantage that you know without a doubt that your data is encrypted before it gets to your email. Some email providers, like Google, log every keystroke you type into the compose window, so it is important to encrypt your messages before the email program sees them.

PassLok does run a server to help users obtain other users’ public keys. It is called General Directory, and currently it is manual rather than automatic so that the main PassLok code is isolated from contact with any server. A lot of things in PassLok are automatic, but nothing is forced on the user. If the user decides not to make public his public key, but rather send it to a selected few, that’s okay with PassLok. Most PassLok users actually do this. They are a paranoid but dedicated bunch.

PassLok for Email enters beta

logo v2-440x280 emailJohnny can’t encrypt. It’s no use. . . . This is what has been said repeatedly about mere mortal users and encryption, which forever has been the domain of black chambers and mathematical geniuses. Scores of companies have tried to get around this problem by hiding encryption in their servers, far away from users’ eyes.

And yet, studies have shown that this creates another problem: if I can’t see any of the encryption, how can I relax and be sure that this message where I’m risking my career, maybe my life, is truly away from prying eyes? Just because the software tells me to relax?

PassLok does not hide encryption from users, and it tries hard to make it accessible. This is why the next step in its development is so important. PassLok for Email is a new Chrome extension that adds PassLok encryption to regular web-based email apps. Its beta, which supports Gmail, Yahoo, and Outlook online, has just been released for public testing. Read More

The FBI won’t like this post

I fully expect to start hearing funny clicks on my cellphone or see people in trench coats behind me after finishing this. Perhaps you, who are reading the article, will have a similar experience.
The reason? Here I’m telling you why all the current debate on whether the FBI and other law-enforcement agencies should have access to an individual’s encrypted information is moot, because that individual doesn’t really have to rely on anyone else in order to thwart that effort.

It is no secret that FBI Director James Comey isn’t happy about encryption being used in digital communications. To quote him:

Those charged with protecting our people aren’t always able to access the evidence we need to prosecute crime and prevent terrorism even with lawful authority. We have the legal authority to intercept and access communications and information pursuant to court order, but we often lack the technical ability to do so.

Consistent with this view, he insists on the need to provide some way for his agency, and possibly other agencies engaged in suppressing terrorism and other crimes, so that those communications can be legally decrypted. He is generous about it, too, since he is quite willing to seek court approval for each request, rather than demand carte blanche for the practice, to be used whenever the need arises without further approval, or simply ignore the courts as in the past. Technology leaders have come out in strong opposition to this desire, but bills have nevertheless been introduced in New York and California seeking to force them to provide the kind of access that Comey is requesting. The issue has been prominent in the recent presidential debates by both the Republican and the Democratic party.

So the battle lines are clearly drawn, right? The Government wants a back door because they want to protect the little guy from bad actors who are currently operating in darkness, thanks to encryption; companies don’t want to provide them because they would negate one of their biggest selling points, which makes the public like them and use their products. The sound bites hurled from both sides are well known: “You are protecting terrorists and child molesters!”, “You want to put a camera in my bedroom!”, “Security is paramount!”, “The right to Privacy must be held sacred!” And we are being pressured into falling into one camp or the other.

That is, if the whole issue depends on whether or not tech companies complied, more or less willingly, with the FBI’s request for a workable peephole. But a few important things are seldom mentioned:

  1. Tech companies are global, and so they have to deal with more than the US Government. If a peephole exists, could they deny its use to the government of China . . . Iran . . . North Korea? Are tech companies supposed to maintain a running record of who the good guys are at the moment? How would the US Department of State like it if they allowed such access to the Secretary of State’s encrypted email (not that this has necessarily happened already, but itmight happen ;-).
  2. It’s only terrorists and child molesters (and possibly drug dealers, money launderers and other high-profile criminals, I guess) that are to have their unalienable right to privacy suspended, so relax, the rest of you. But if this is the case, the normal government way to handle this would be to have users sign an affidavit to the effect that they are not seeking to use email or whatnot in order to engage in terrorism, child molestation, or whatnot. See, for instance, the current USCIS application for permanent residence. This is the way they control exports, gun ownership, and just about everything else. You may think I’m kidding (I am, of course), but there’s some hidden disingenuity here that is begging to be exposed.
  3. How would the Government protect the data being de-privatized (their record on keeping their own private data is frankly dismal) and, likely harder to do, how would they ensure that they would remain “the good guys” for as long as that data has a value, without abusing, stretching, or plain breaking the law they are charged to protect? If the past is the best indicator of the future, things don’t too rosy in this area. Not to mention governments in places like Europe, Latin America (I’m looking at you, Venezuela), or the Far East.
  4. Finally, the whole debate may be useless, since individual users already have the ability,protected by the US Constitution and tested multiple times in the past, to do their own encryption without having to rely on a service that might have a backdoor or peephole. This is why the FBI is not going to like the rest of this post, because I’m telling you how to do it.

The key here is that any information that humans can read and understand is “speech,” as narrowly defined by lawyers and legal experts. The free exchange of “speech” is protected by the First Amendment to the US Constitution and similar laws in all other democratic countries. Free speechis believed to help in keeping the government from spiraling down to totalitarianism, and therefore is here to stay, no question. The First Amendment also protects my telling you my awesome recipe for fried chicken if I choose to (which I won’t do, for reasons that should be obvious to everyone), or how to encrypt your own stuff in a way that the NSA will never be able to decrypt it. I didn’t steal this secret from the NSA and it is completely mine to reveal, in use of my free speech rights guaranteed by the First Amendment to the US Constitution. I say “guaranteed” and not “given.” Rights are given by God; the government merely recognizes and hopefully protects them.

It so happens that the “perfect cipher” that the NSA will never be able to break already exists. It is called one-time pad and only needs paper and pencil to do it. It also needs the eponymous one-time pad, which used to require a whole team of typists trained in moving their fingers as randomly as possible (monkeys were tried, but costs were too high ;-). In previous posts, however, I have described several methods that generate a one-time pad of acceptable quality (not perfect but certainly better than what Soviet spies used during the Cold War, whose messages remain unbroken today), staring from simple text. I have collected all of those into a new website, and this is the address:

http://chaosfromorder.weebly.com

The title has nothing to do with a New World Order or nonsense of that sort. Rather, it brings home the point that what makes those ciphers work is extracting the randomness that normal text contains, which is roughly 1.6 bits of entropy per letter. Since random text with a 26-character alphabet would contain log(26)/log(2) = 4.7 bits of entropy per letter, we need a piece of text at least three times the length of our message in order to generate a “one-time pad” to encrypt it.

Using these ciphers you can encrypt a message with utmost security and without the need for a computer, smartphone, or whatnot. It does not matter, therefore, whether Internet or communications providers are required to weaken their own encryption, since those who really want to keep the FBI from reading their messages can always do so no matter what. So, who does Comey think he can catch through all that legislative agitation? Likely, only stupid terrorists and child molesters, not the smart ones.

Chaos from order

Sounds like a play on words, doesn’t it? And yet, this is exactly what I mean. Sixty years ago, renowned mathematician John von Neumann, published a little trick that allowed using a biased coin, where heads and tails do not come out at 50-50, to generate a true, unbiased, 50-50 random sequence. It turns out that this trick can be extended to larger sets, such as alphabet letters, in order to generate what appears to be a true random sequence of letters (chaos) from common text (order, unless you’re starting from a political speech or your latest cellphone bill).

The result, as you probably have already guessed, is yet two more paper-and-pencil ciphers,DicePad, and LetterPad, that come dangerously close to perfect unbreakability.

This is how von Neumann fixes a bad coin, provided one throw does not affect the result of the next (that is, they are statistically independent):

  1. Throw the coin an even number of times and record the result. If we call Heads = 0 and Tails = 1, the result looks like a long binary number.
  2. Now take the digits in groups of two and make this conversion: 01 = 0, 10 = 1, 00 and 11 = no result.
  3. The resulting string, which will be at most one-quarter the length of the original, will have 0’s and 1’s in a truly random pattern and with frequency approaching 50%.

It is simple to understand how this works. If the throws are independent of each other, the sequences 01 and 10 will have the same probability since they only differ in the order in which the 0 and the 1 appear, which should be truly random if the throws are independent. This is true even if Heads is much more (or less) likely to appear than Tails. Say the probability of Heads is 90%, Tails 10%, then the probability of Heads-Tails is 0.9 x 0.1 = 0.09, and that of Tails-Heads is 0.1 x 0.9 = 0.09 also. The other two possibilities don’t work so well: Heads-Heads has probability 0.9 x 0.9 = 0.81, and Tails-Tails has probability 0.1 x 0.1 = 0.01, but this doesn’t matter because we are not using them.

Clément Pit-Claudel has extended this result to dice and published it on his blog. Now instead of taking the resulting numbers in groups of two, he takes them in groups of three and looks at the relative ordering of the numbers, rejecting any group where a number is repeated. For instance, the result 1,2,3 becomes 1, while 3,5,1 becomes 3. Since the final result depends on the order in which the low-middle-high numbers appear rather than the values themselves, and the tosses are supposed to be independent, then each possible permutation of three different low-middle-high numbers will appear with the same frequency, and there are 6 of them. Take only those and you’ve got perfect die performance.

Pit-Claudel does not mention permutations in that article, but this is what is at the heart of the process. Each permutation can be converted to a number by first converting it to a Lehmer code, which for a set of three throws would consist of three digits (0 to 2), counting the number of inversions from the natural order involving that throw. For instance, if the result is 3,5,1 (middle-high-low), the first digit of the Lehmer code would be 1 because the 3 is ahead of a smaller number once (5 is larger, but 1 is smaller, which would have appeared first in a natural ordering). The second digit is also 1 because the 5 is ahead of 1, which is smaller. The third and last digit is zero, which is always the case since nothing follows the last number. Thus the Lehmer code for 3,5,1 is 110.

Once the Lehmer code is found, it can be converted to decimal by multiplying the last digit by o! = 1, the second last by 1! = 1, third last by 2! = 2, fourth last by 3! = 6, and so on, and adding all the results. This is because Lehmer codes are factoradic numbers at heart. Thus for the example we get 1 x 2 + 1 x 1 + 0 = 3. The smallest result that can be obtained is 0, the largest 2 x 2 + 1 = 5, so these results work out exactly like dice whose faces are marked with numbers 0 to 5. Observe that numbering the permutations through Lehmer codes does not produce the same numbers as with Pit-Claudel’s scheme, but the expected frequencies are still the same.

The scheme can be extended to “coins” having more than two sides (say, fat coins that might fall edgewise) and “dice” with more than six sides, when used as source data. Suppose we have a three-sided coin like this, which can produce results heads=0, tails=1, and edge=2. On top of that, it is not a fair coin, so that the probability of heads is 70%, tails is 20%, and edge is 10%. We will take data in groups of two as before, but now we will assign the result “heads” if the assigned numerical value increases from first to second and “tails” if it decreases, and reject the pair if the two values are equal. Thus a “heads” result will be recorded for the sets heads-tails, tails-edge, and heads-edge, and “tails” will be recorded for tails-heads, edge-tails, and edge-heads. Then the probability of the result “heads” will be 0.7 x 0.2 + 0.2 x 0.1 + 0.7 x 0.1 = 0.23. The probability of the result “tails” will be 0.2 x 0.7 + 0.1 x 0.2 + 0.1 x 0.7 = 0.23 again, of course, because multiplication is commutative. Since the other results are rejected, “heads” will be recorded the same number of times as “tails” over a long run, which is a 50-50 split.

The Lehmer code scheme can also be extended to sets larger than 6. The next is permutations of four different symbols, which have 4! = 24 possible orderings, and then five symbols, with 5! = 120 possibilities. Four symbols comes close to alphabet size, and we will use it to encode messages with the help of a Tabula Recta. Next best is doing the dice trick twice so we have 6 x 6 = 36 different combinations, sufficient for 26 alphabet letters plus 10 decimal digits. Therefore I will discuss two ways to produce a near-random keystream starting from common text, which will be used as a one-time pad of sorts in order to encrypt a message with something approaching perfect unbreakability. The fact that there are 26 different letters, which is more than 6 or 24, shouldn’t bother us in light of what I discussed in the previous paragraph, but we have to worry about the strong correlation between letters.

This is because, if we start from common text as a source of randomness (which it does contain in relatively small amounts), we must deal not only with the fact that the letters appear with different frequencies (like imperfect coins or loaded dice), but also are not independent of each other. For instance, “Q” is nearly always followed by “U”, “X” almost never follows “K” or “B”, and so forth. There is a lot of order in speech, in the form of correlation between letters, which we must destroy first if we want to use the trick we’ve been dealing with.

Fortunately, the correlation between letters only goes so far. Can you predict what letter next sentence will begin with, even if you know the previous sentence perfectly? Can you predict the word that comes three words after the present one? If you could, you wouldn’t need to read hardly anything; all the meaning would be contained in the first few letters and the rest would follow from those. This correlation has been measured in different ways. This article, for instance, displays it in Figure 3 in the form of “conditional entropy,” which becomes smaller and smaller as we move away from the the root letter (“block length,” in that paper). After 8 to 12 characters, depending on the text being analyzed, the effect is zero within the bounds of statistics. Another very sensitive way to measure this correlation, which is implemented in the program shown below, is to calculate the chi-squared statistic of groups of two symbols separated by a certain distance, testing the hypothesis that the two symbols are independent of each other. If the hypothesis is correct, the statistic returns a low value that does not depend on the sample size. Here’s the program, in JavaScript language. It returns an array giving the chi-squared statistic (minus threshold value) with increasing distance between letters (first element: consecutive letters):

function indepTestArray(text,maxDistance){
 text = text.toUpperCase().replace(/[^A-Z]/g,'').trim();  //the text should be clear of accents, etc. before this
 var base26 = 'ABCDEFGHIJKLMNOPQRTSUVWXYZ',
     output = [],
     freqArray = [],
     data,result;

//first get frequencies for each letter
 for(var i=0; i<26; i++){
   data = 0;
   for(var j=0; j<text.length; j++){
     if(text.charAt(j) == base26.charAt(i)) data++
   }
   freqArray[i] = data
 }
 
//now get the chi-squared statistic, which will have a value smaller than 670.7 if the test passes at 10% error
 for(var l = 1; l <= maxDistance; l++){                        //for each shift, do a 2-letter chi-squared
   result = 0;
   for(var i=0; i<26; i++){                                    //each first letter
     for(var j=0; j<26; j++){                                  //each second letter
       data = 0;
       var expected = freqArray[i]*freqArray[j]/text.length;           //expected P(xy) = P(x)*P(y)
       if(expected > 0){                                      //in case a letter does not appear at all
         for(var k=0; k<text.length-l; k++){
           if((text.charAt(k) == base26.charAt(i)) && (text.charAt(k+l) == base26.charAt(j))) data++
         }
         result += Math.pow(data - expected,2) / expected
       }
     }
   }
   output[l-1] = result - 670.7                              //negative means a pass at 10% error
 }
 return output
}

I have applied this function to whole books from gutenberg.org (hundreds of thousands of characters), with the following result. The value for adjacent letters (shift 1) is always very high, meaning that adjacent letters are clearly not independent of one another. But the value drops as the distance between letters increases, and it typically dips below the threshold value for a shift of 14 and beyond, sometimes before. The same result is obtained both in English and Spanish. That is, letters that are 14 spaces apart from each other in regular text can be safely assumed to be statistically independent. Smaller spacings might work in practice, too, but this would have to be checked experimentally.

One way to ensure that we are using independent data is to take the source text in sufficiently large chunks. For instance, let’s say the source is “It was the best of times, it was the worst of times,” and we are going to take it in chunks of eight characters, ignoring spaces. We will wrap the text after eight letters until three rows are formed and then repeat with another block, obtaining the following:

ITWASTHE THEWORST
BESTOFTI OFTIMES
MESITWAS

where the second block is still incomplete. If we take the letters in groups of three vertically (I B M, T E E, and so forth), those letters are 8 spaces apart in the original text and stand a good chance at being independent of each other. Now, the letters in the first group have a fairly strong correlation with the letters in the second, since they are consecutive in the original text. This means that the result of evaluating a permutation in the first group might have a correlation with the result of doing the same in the second. In order to prevent this, a further step might be to scramble the letters in each row with different shifts. Let’s say we are going to leave the first row unscrambled, but that we will scramble the second by making 4 rows, and the third by making 2 (row numbers should divide the block length, 8 in this case). Therefore, we will write the second row over two columns and then read it off by columns, and the third row over four columns, and do the same. This is the process:

BE       MESI
ST       TWAS
OF
TI

Resulting in “BSOTETFI” for the second row and “MTEWSAIS” for the third. In practice, this kind of shuffling can be carried out as the letters are first written down, by leaving spaces that are filled as more letters are added. The first block then becomes:

ITWASTHE
BSOTETFI
MTEWSAIS

And the groups, read vertically, are “I B M,” “T S T,” “W O E,” and so forth. Now we can look at the permutation within each group in order to obtain a “die toss” from each one. Thus, “I B M” is of the form middle-low-high, which gives Lehmer code 100, which translates to decimal 2 x 1 + 0 = 2. “T S T” has the letter “T” repeated, so it does not yield a die toss, “W O E” gives Lehmer code 210, which in decimal is 2 x 2 + 1 = 5, and so forth. In the end, this block gives the following base 6 number, where dashes represent the groups that had at least one repeated letter and were not used: 2-50–20.

Observe that we are using letters and alphabetic order rather than numbers from a die in order to run the scheme. Since the result is based on the relative order of the symbols rather than on their absolute values, it works just as well as we saw earlier. In fact, the larger the set from which the symbols are drawn (26, in the case of letters versus 6 for die numbers), the smaller the probability of a repeated symbol in the group and the more efficient the scheme becomes.

If we use groups of four symbols, then we can go all the way to base 24, affording the possibility of operating directly with text. The sentence in our example becomes this when arranged in groups of four lines of length 8 characters (only the first block is complete):

ITWASTHE OFTIMES
BESTOFTI
MESITWAS
THEWORST

We’re not going to scramble the rows this time, but go directly to obtaining the base 24 permutation codes for each group of four letters, read vertically. Four-figure Lehmer codes are converted to decimal by multiplying the first figure by 6, the second by 2, and the third by 1 and adding all of this together. Thus the first vertical group “I B M T” gives Lehmer code 1000, which translates to decimal 1 x 6 + 0 x 2 + 0 = 6. The second and third groups include repeated letters so they won’t be used. The fourth group “A T I W” gives Lehmer code 0100, which in decimal is 0 x 6 + 1 x 2 + 0 = 2, and so forth. This is what the first block yields, with dashes again representing skipped groups: 6 – – 2 – 13 10 0.

And this is it. We have extracted chaos out of order. This is real uncertainty, not something that looks like it as in pseudo-random number generators because the source text does contain some information entropy that has been concentrated through the process. Some quick calculations: a base 6 random sequence contains log 6/log 2 = 2.585 bits/digit of entropy; the first block produced 5 digits, so that is 12.92 bits of entropy, was there enough entropy in the source text to justify this? Common English text averages 1.58 bits/character and we had 3 x 8 = 24 characters, for a total of 37.92 bits of entropy. So yes, we are not “creating” entropy that wasn’t there originally. The numbers for the base 24 example are: log 24 / log 2 = 4.585 bits/number for a total of 22.92 bits of entropy, and we started from 32 characters of text, which are expected to contain 50.56 bits of entropy. Even if no rejects had taken place, the entropy of the result would not have exceeded the input entropy, in both cases.

We can now move on to describe a couple new ciphers based on this algorithm. The first one, which I am calling DicePad, operates in base 6, and the second, called LetterPad, uses base 24 and a Tabula Recta for addition and subtraction. The links lead to Javascript versions of either one, which also explain the steps in detail. The only difference between these ciphers and what we have seen above is that the Lehmer codes obtained in the ciphers count the times that letters are in the normal order rather than in inverted order, which I feel is easier to do in the head. Statistical properties are not affected by this change.

Here’s how to encrypt a message with DicePad:

  1. Take a sufficiently long piece of key text (about five times the length of the message to be encrypted), which will never be used again for another message, and strip everything except letters and numerals. Do the same with the message.
  2. If key text-derived encoding is to be used (highly recommended, for it makes the ciphertext non-malleable), proceed to generate the encoding Polybius square this way:
    1. Take the first sentence of the key text and start writing different letters and numerals in the order they appear.
    2. When we come to the end of the sentence, write the rest of the numerals in normal order (0 is first), followed by the rest of the alphabet in reverse order.
    3. Fill the 6 x 6 Polybius Square with the result, row by row. Row and column headings are the numerals 0 to 5.

Example: for key text “It was the best of times, it was the worst of times.” the sequence obtained is: ITWASHEBOFMR0123456789ZYXVUQPNLKJGDC, leading to this encoding square:

     0 1 2 3 4 5
    ------------
 0 | I T W A S H
 1 | E B O F M R
 2 | 0 1 2 3 4 5
 3 | 6 7 8 9 Z Y
 4 | X V U Q P N
 5 | L K J G D C

If the square is not to be based on the key text, fill the square by rows with the digits 0 to 9, followed by the straight alphabet.

  1. Encode the message using the square, which will result in two digits per character. If the plaintext message is “Eve”, this results in encoded message: 10 41 10.
  2. Break up the key text into blocks comprising three rows, as seen above. Let’s say we use rows of 8 characters each which yields the result shown above.
  3. Optionally, shuffle each row with a constant period, also as shown above. The row length (8 in our case) is the least common multiple of the shuffle periods. We won’t shuffle for this example but it is recommended because it improves the statistical randomness of the keystream. Usually one of the rows will be left unshuffled, and the rest shuffled by short, different periods that divide the row length exactly (say, 2 and 4 in our example).
  4. Now take the letters (and digits) in vertical groups of three letters and compute a result for each group this way, starting from an initial value of zero:
    1. If the second or third letters follow the first in alphabetical order (digits come before letters), add 2 for each
    2. If the third letter follows the second in alphabetical order, add 2.
    3. If at least two letters or digits are the same, skip that group.
  5. The result of the previous step is the keystream, which is in base 6. We repeat the steps above until the number of keystream digits produced is at least the length of the encoded message. The last step is to subtract the encoded message from the keystream, neglecting any carries so it can be done from left to right, and stopping when the last digit of the encoded message is used. Bear in mind that this is a base 6 operation, so we must add 6 every time the result would have been negative. Examples: 4 – 2 = 2, 5 – 1 = 5, 1 – 5 = 2, 2 – 4 = 4. The result is the base 6 ciphertext.
  6. Optionally, decode the base 6 ciphertext back to letters and digits in groups of two, using the encoding square.

To decrypt the ciphertext the recipient, who knows the key used for encryption, performs steps 1 through 7 exactly like the sender, except for encoding the message. If the ciphertext came as letters and numbers rather that in base 6, it must now be encoded into base 6 by means of the Polybius square. He/she now subtracts the base 6 ciphertext from the keystream, resulting in the encoded message, which is then decoded back to letters and digits.

And here’s how to encrypt a message with LetterPad:

  1. Take a sufficiently long piece of key text (about five times the length of the message to be encrypted), which will never be used again for another message, and strip everything except letters and numerals. Do the same with the message. Additionally, the message must use a 24-letter alphabet, so convert all J’s to I’s and all Q’s to K’s, for instance. Any numerals in the message are to be encoded as letters; for instance, 0 = A, 1 = B, etc.
  2. If a key text-derived alphabet is to be used (highly recommended, for it makes the ciphertext non-malleable), proceed to generate it this way:
    1. Take the first sentence of the key text and start writing different letters in the order they appear, using the substitutions above so the alphabet contains only 24 letters.
    2. When we come to the end of the sentence, write the rest of the letters in reverse order.
    3. Place that alphabet at the top, left, and right of an otherwise normal Tabula Recta.

Example: for key text “It was the best of times, it was the worst of times.” the alphabet obtained is: ITWASHEBOFMRZYXVUPNLKGDC, leading to this Tabula Recta, which also contains digits on the left in order to convert between decimal and base 24:

        I T W A S H E B O F M R Z Y X V U P N L K G D C
        -----------------------------------------------
00: I | A B C D E F G H I K L M N O P R S T U V W X Y Z | I
01: T | B C D E F G H I K L M N O P R S T U V W X Y Z A | T
02: W | C D E F G H I K L M N O P R S T U V W X Y Z A B | W
03: A | D E F G H I K L M N O P R S T U V W X Y Z A B C | A
04: S | E F G H I K L M N O P R S T U V W X Y Z A B C D | S
05: H | F G H I K L M N O P R S T U V W X Y Z A B C D E | H
06: E | G H I K L M N O P R S T U V W X Y Z A B C D E F | E
07: B | H I K L M N O P R S T U V W X Y Z A B C D E F G | B
08: O | I K L M N O P R S T U V W X Y Z A B C D E F G H | O
09: F | K L M N O P R S T U V W X Y Z A B C D E F G H I | F
10: M | L M N O P R S T U V W X Y Z A B C D E F G H I K | M
11: R | M N O P R S T U V W X Y Z A B C D E F G H I K L | R
12: R | N O P R S T U V W X Y Z A B C D E F G H I K L M | Z
13: Y | O P R S T U V W X Y Z A B C D E F G H I K L M N | Y
14: X | P R S T U V W X Y Z A B C D E F G H I K L M N O | X
15: V | R S T U V W X Y Z A B C D E F G H I K L M N O P | V
16: U | S T U V W X Y Z A B C D E F G H I K L M N O P R | U
17: P | T U V W X Y Z A B C D E F G H I K L M N O P R S | P
18: N | U V W X Y Z A B C D E F G H I K L M N O P R S T | N
19: L | V W X Y Z A B C D E F G H I K L M N O P R S T U | L
20: K | W X Y Z A B C D E F G H I K L M N O P R S T U V | K
21: G | X Y Z A B C D E F G H I K L M N O P R S T U V W | G
22: D | Y Z A B C D E F G H I K L M N O P R S T U V W X | D
23: C | Z A B C D E F G H I K L M N O P R S T U V W X Y | C

If the Tabula Recta is not to be based on the key text, place the straight 24-character alphabet at top, left and right.

  1. Break up the key text into blocks comprising four rows, as seen above. Let’s say we use rows of 8 characters each which yields the result shown above. There is no need to delete the digits or reduce the alphabet to 24 letters.
  2. Optionally, shuffle each row with a constant period, also as shown above. The row length (8 in our case) is the least common multiple of the shuffle periods. We won’t shuffle for this example but it is recommended because it improves the statistical randomness of the keystream. Usually one of the rows will be left unshuffled, and the rest shuffled by short, different periods that divide the row length exactly (say, 2, 2 and 4 in our example).
  3. Now take the letters (and digits) in vertical groups of four letters and compute a result for each group this way, starting from an initial value of zero:
    1. If the second, third, or fourth letters follow the first in alphabetical order (digits come before letters), add 6 for each.
    2. If the third or fourth letters follow the second in alphabetical order (digits come before letters), add 2 for each.
    3. If the fourth letter follows the third in alphabetical order, add 1.
    4. If at least two letters or digits are the same, skip that group.
    5. Convert each result to base 24 letters using the pattern on the left side of the Tabula Recta. For instance: 2 = W, 13 = Y, etc.
  4. The result of the previous step is the keystream, which is in base 24. We repeat the steps above until the number of keystream digits produced is at least the length of the message. The last step is to subtract the message from the keystream, letter by letter from left to right using the Tabula Recta, stopping when the last message letter is used. Subtraction with the Tabula Recta is done this way: find the second letter (the one being subtracted) on the top and go down that column until the first letter is found, then follow that row to left or right to find the result. Examples with the given Tabula Recta: T – R = E, O – I = Y, O – P = K, A – R = Y. The result is the ciphertext.

To decrypt the ciphertext the recipient, who knows the key used for encryption, performs steps 1 through 5 exactly like the sender, except for encoding the message. He/she now subtracts the ciphertext from the keystream using the Tabula Recta, resulting in the original message (after reduction to 24 characters and numbers encoded as letters).

As you can see, LetterPad involves fewer steps than DicePad, but keystream generation is a little more involved. Subtraction with a Tabula Recta is also a little more complicated. On the other hand, the length involved is half, which may more than make up for the additional complexity. In my tests, I have found that LetterPad seldom needs shuffling of the key text in order to produce a keystream of good randomness, while DicePad sometimes does (we’re talking about 2-character Chi-squared statistic for messages containing thousands of characters; no problems appear for messages in the hundreds of characters) probably due to involving a larger number of uncorrelated symbols, so LetterPad has an edge here.

The Javascript versions of DicePad and LetterPad include an optional step after the keystream has been generated, consisting of a single-row Lagged Fibonacci Generator (LFG) in order to further improve the randomness of the keystream. To do this, repeat the last digit or letter of the keystream below the first and add base 6 (DicePad) or subtract first from second (LetterPad) every two numbers or letters in the vertical direction, placing the result on the right of the bottom number or letter until the bottom row is filled. That bottom row becomes the new keystream. The LFG step helps to improve the statistical quality of the keystream if the key text used is too short so it it is repeated at least partially. Its effect, like that of shuffling, is most noticeable for very long messages.

Let me finish this post with some musings concerning the security of these two ciphers. In order to do this, I will put on my hacker’s hat a pretend that I’m trying to break an encrypted message in various ways. I know the encryption method used but I don’t know the key text. Sometimes I may actually know the plaintext, and then what I’m trying to do is change the ciphertext so it decrypts to something else, or try to guess the source of the key text so I can break other encrypted messages.

  1. Brute force. This can be done in two different ways:
    1. I try every possible key text until I find one that “works,” meaning that the plaintext resulting from subtracting the ciphertext from the keystream derived from the guess makes sense in English or whatever language I suspect it was written in.
    2. I try every possible plaintext until the keystream resulting from subtracting the plaintext from the ciphertext “makes sense” as something that has derived from common text, possibly because of flaws in the keystream-generation process.

Method 1 is not going to work because the key text space is extremely large, much larger than the plaintext space, and has more entropy than  a random string of plaintext length. Per Shannon’s one-time pad theorem, I’ll always be able to find a key text that leads to any plaintext of the given length at all. Method 2 fares a little better because the keystream is generated by blocks. Say I know the block length; then I’ll try likely plaintext blocks of this length and obtain the corresponding piece of keystream by subtracting the plaintext from the ciphertext. That keystream piece likely will look like gibberish, but if it isn’t perfectly random and I know that the keystream-generating process has some flaws (for instance, some residual correlation between consecutive letters, as manifested by the two-letter chi-squared value), then I might be able to tell a bad guess, where the “keystream” piece obtained passed the 2-letter chi-squared test, from a decent one, where the result of the test is what I expect. The problem with this approach is that the blocks most likely won’t be long enough to provide meaningful statistics. Even in base 6, the minimum row length I’ll need is of the order of 6 x 6 x 5 = 180, which is unlikely to be used because it is too long. If care was taken to eliminate all correlation in the keystream (say, by appropriate shuffling of of sufficiently long key text blocks), then this method will never yield anything.

  1. Known plaintext. Let’s say I know the whole thing; can I get the key text, and so find its source so I can crack any subsequent ciphertexts from the same source? This would first require obtaining the keystream, which would be trivial if I knew the Polybius square (DicePad) or the headings of the Tabula Recta (LetterPad), which is the situation if the straight alphabet is used instead of one derived from the key text. If the alphabet derives from the key text this would be quite difficult, since I have no way to know which encoding is the correct one until the key text is found. But let’s say I make a bunch of guesses and try them all. For DicePad, I’d have to make 36! = 3.72E41 guesses, for LetterPad only 24! = 6.2E23. In any case, these are very large numbers. Even if a computer manages to check a billion every second, checking all the LetterPad alphabets would take 620 trillion seconds, which even if it gets distributed over millions of CPUs is many times the age of the Universe. And then, after I get the keystream, I’d have to invert it in order to find the key text. But the keystream generation process loses a lot of entropy because of the skipped groups and the reduction from full alphabet to essentially three or four symbols subject to a permutation, meaning that I’d have to make further guesses concerning those in order to set up those permutations. Very quickly I’ll find myself trying every possible key text, which is equivalent to brute force and, as we saw above, doomed to fail.
  2. Change plaintext. But what if I don’t want to invert the keystream but only change the plaintext? This would be trivial if the straight alphabet has been used, but otherwise I won’t know how to subtract the plaintext, as we saw above. Even worse, I must be right on the first try, so guessing is not an option. Might as well forget it.
  3. Reused key text. This means that I manage to get the sender to use the exact same key text for two different messages, which is something that should never be done. The keystream, therefore, will be identical, and I can subtract it out by subtracting the two ciphertexts, leaving the sum of the two ciphertexts. If the default square or straight alphabet were used, I only have to make educated guesses as to what words might be in one of the messages. Then I place the word at different shifts from the start and subtract it from the combined plaintexts. If the word is there and I have guessed the right location for it, the result will also be a word, or part of it. Then I can complete the word and subtract it, giving me another word, and so forth until I have recovered both plaintexts. If, on the other hand, a key-derived square or alphabet was in use I would have a lot of trouble separating the two plaintexts, for I would not know how to encode my guesses (DicePad) or would not be able to make the subtraction correctly to begin with (LetterPad).
  4. Errors. I change a few letters here and there in the ciphertext so the recipient cannot decrypt it (actually, not much of an attack, but it may happen spontaneously; the sender or the recipient may also make mistakes that amount to the same). If this were a typical computer-based cipher, any change typically leads to a complete decryption failure, but here a single-letter error on the plaintext or ciphertext does not propagate to the rest, which remains unaffected. A switched letter on the key text will lead to a single-letter error if there is no change on whether the group using that letter gets used or rejected by the keystream-generating algorithm, but will garble the rest of the block if it does. An addition or omission may garble the rest of the message after that point, but the recipient will be able to detect it and possibly correct the error so the rest of the message can be decrypted.
  5. Physical break-in. I enter the sender’s and the recipient’s houses and record the names of all the books in their possession. Since both must have the same key text, any repeated books will give me an excellent indication of what book they are using to draw the key text from. Unless, of course, they are not using any of those but rather an equally available Internet source, a newspaper, or some other source I cannot detect so easily. I might have no choice but to actually tie one of them to a chair and apply some physical persuasion until the title is revealed, at which point I will have killed all future usefulness of my efforts.