Is self-destruct email possible?

Earlier this week, my new app SeeOnce was rejected (hopefully only temporarily) by the iOS app store for allegedly misleading users into thinking that it could make self-destructing messages. Leaving aside what exactly “self-destruct” means for a digital message and whether or not SeeOnce actually achieves this, a number of current and past apps have claimed precisely this. In this article, companion to the one on Privnote vs. SeeOnce, I go over these apps, how they work, and how they can be used most profitably. (more…)

Privnote vs. SeeOnce

In this post, I review Privnote and compare it with SeeOnce. Both apps claim to generate self-destructing messages. Which one will be the winner?

Ladies and gentlemen, on this corner is Privnote, a really, really simple app to send self-destructing messages to someone. It works like this:

  1. Navigate to https://privnote.com. The interface contains a single box where you write your message and a big red button to create the note. Nobody asks you for any password or private info of any kind. There is an additional button for some options, such as encrypting it with a password (using AES), changing the default time for self-destruction, or getting an emailed confirmation that the note has been read and destroyed.
  2. Write the note and click the big red button. The box is replaced by a link like this: https://privnote.com/cv0Lcrsw#IgdQIQnTL, already selected for you to copy.
  3. Copy the link and paste it into an email or whatever.
  4. When the recipient clicks on the link, Privnote loads on a secure page and displays the original message, plus assurances that the message has been destroyed at the source. Sure enough, reloading the page displays a warning that the message no longer exists.

Privnote is beautifully simple and it seems to work. Can anyone beat it?

On this other corner is SeeOnce, which also claims to generate self-destructing messages in a pretty simple manner. SeeOnce works like this:

  1. Navigate to https://seeonce.net. You are immediately asked to come up with a Password, though you are assured that you are not making an account anywhere. After supplying this, the interface contains a single box with way more buttons than Privnote, though many of them are things like “Select”, “Clear”, “Help”, etc. (Privnote doesn’t have any help, perhaps because it doesn’t need any?). There are no options to set in SeeOnce.
  2.  Write the note and click the highlighted Lock button. Now another dialog pops up, asking you to identify the recipient on a list, or send it as an (insecure) invitation to a still unknown user. This dialog doesn’t appear if you loaded SeeOnce from a link sent by someone else.
  3. After you do this, a piece of text containing a longish random-looking link fills the box. The link may be something like this: https://SeeOnce.net#@/gq1wS2sus6zUegwYZQ7+AMOLEAqBnAyPTd1Fff1lxI1MIURDA6igSnUiHI8pByPtcUX3lfSUS/GqTovQa46NoSu3tFddibJKieDgFI7XyWFw= and it is already selected as in Privnote.
  4. You can copy and paste the contents (the link alone is enough) into email or texting, or simply click the Email button, which will put the whole thing into your default email. Alternatively, you can click a Hide button, which will convert the stuff into apparently normal text taken from a cover text (a popup asks you for the cover text, if you haven’t loaded one yet), before emailing it.
  5. When the recipient clicks on the link, SeeOnce loads on a secure page and asks for a Password. After supplying this, the original message is revealed. Reloading the link and re-typing the Password leads to a message stating that “unlocking has failed” (SeeOnce needs to exchange two messages between the same parties before this happens right away, otherwise the link does not fail immediately but rather after writing a reply).

A little more complicated than Privnote, but still quite manageable. Now, the devil is in the details, as they say. We need to look at what’s inside as well as the features and the overall simplicity of the process. Price is not much of an issue since both apps are free, but availability on different platforms might be.

Price. winner: SeeOnce

Both apps are free, but Privnote has ads. This is not only uncool, but poses a security risk since the ads could potentially inject malicious code into the page, compromising everything. SeeOnce, on the contrary, stays true to the open source ethos and contains no ads. SeeOnce can do this because it doesn’t rely on servers for its operation and therefore expenses are insignificant.

Simplicity. winner: Privnote

It’s hard to be simpler than Privnote: you click a link, enter a message, copy a link; on the receiving side, just click a link and you’re done. SeeOnce is almost there, but it does ask you to come up with a Password, which is extra work and requires the user to exercise his/her memory, never a good thing (we’ll see later that this isn’t as bad as it looks). On the other hand, emailing can be done without cut and paste by just clicking an Email button. Still, Privnote wins this one.

Features. winner: SeeOnce

Privnote does have a few extra settings, such as the ability to encrypt the message with a chosen password rather than the default 54-bit key (but then, how do you send the password to the recipient in a secure manner?), whereas SeeOnce encryption is always under user control (and this is why it asks you for a Password before it does anything). Privnote also has the ability to send a read receipt, which SeeOnce lacks (we see why below). Still, SeeOnce wins this one because it has a comprehensive Help system (to its credit, Privnote hardly needs one) and the ability to hide its output as innocent text, which can be life-saving in places where encryption is frowned upon. SeeOnce also has the ability to switch to secure real-time chat if the correspondents find themselves emailing one another every few minutes.

Availability. winner: tie

Both apps are available from secure links on a regular browser, though SeeOnce can run offline and Privnote cannot. SeeOnce is also available as a Chrome extension and in the Android store. So SeeOnce has an edge here, but I’m going to call it a tie since sending emails requires Internet and most likely a browser.

Security. winner: SeeOnce

Ah, here’s the biggie. Both apps stem from radically different approaches to achieve the same goal. Privnote is fundamentally server-based (except its encryption option, which appears to be client-based), whereas SeeOnce is strictly client-based (after the web server delivers the code, that is). Let’s see what’s underneath each one:

  • In Privnote, the message (encrypted with a symmetric key, which is sent in plaintext with the link but the server does not see) is sent to a server, where it is stored. Clicking on the sender-generated link first instructs the server to send the encrypted message to the recipient’s machine, where it is decrypted with the key contained in the link. The Privnote server will deliver the data if this is the first time this particular link has been clicked by anyone, and the other optional conditions, such as expiration date, have been met. Then the server deletes the stored data, or so we are told, so that a repeated request using the same link cannot be met. Still, Privnote can tell the difference between an expired link and one that was never issued, which leads me to think that some memory of having stored the message remains for a while.
  • In SeeOnce, the message is locally encrypted with the result of combining a public key, which was received in a previous message from the same correspondent, and a random private key that is stored locally and is overwritten when a new message for this correspondent is encrypted. The underlying protocol is fairly complex but transparent to the user. SeeOnce never communicates with servers, so the reason why a message “self-destructs” (actually, no longer can be decrypted) is that at least one of the keys has been overwritten and cannot be obtained anywhere else, even if someone has been copying every message exchanged. This is also why SeeOnce cannot produce a read receipt: it was a different program that actually sent the message; the SeeOnce server never knew about the sender or any of his/her data.

There are three reasons why the approach followed by SeeOnce is much more secure:

  1. The first one is that Privnote displays the decrypting key in plaintext (or an equivalent, given that the client-side code can be viewed at any time) as part of the link. It needs to do this because it does not ask for any information about the recipient before preparing the link, so anyone should be able to follow the link. If the link is sent by email, for instance (and remember we are encrypting the message because we believe email to be insecure), the link can be nabbed by someone else, who then can read the message, instead of the intended recipient. Getting some control over who can actually read the message would require some sort of recipient authentication, a password at the minimum, which is what SeeOnce does.
  2. Whenever data is stored in a server, the user loses control over it. Privnote can say they have destroyed the message until they are blue in the face, but they cannot prove it. If a government agency serves them a request to make a copy, they might be doing it without the users’ knowledge. A hacker can break in and look at the data. The server itself may be saving the data as part of a scheduled backup. Now, Privnote states that this data is encrypted with a key that is not sent to the server, but since that key is included at the end of the link sent by email (otherwise the recipient would never be able to decrypt the message), if the link is compromised as we saw above, then the agency or hacker can decrypt the message. The only protection against complete loss of security is user-controlled symmetric local encryption with a separate key, which Privnote offers as an option, but then the user has the problem of transmitting the key. SeeOnce stores data only locally, and so this is much less of a problem. Stored data is encrypted by the user Password (is it beginning to look like this wasn’t such a hassle after all?) and can optionally be wiped or moved to a different machine. Anything transmitted is encrypted with a public-key algorithm, so that key transmission is never an issue.
  3. Code executing on a server is invisible to the user. Therefore, a Privnote user has no way of making sure that the code is honest and free from errors. In Privnote, this means the code that supposedly is keeping track of how many times a particular link has been followed, and which deletes the data on the server. On the other hand, the complete SeeOnce code is visible to the user by just typing ctrl-U on the browser. It is long and complex, to be sure, but it hides nothing. The program itself has a mechanism to ensure that the code has not been tampered with by a third party, fully documented in the Help pages.

Both programs have features to recommend them, but in the end it comes down to a personal choice: do you value ease of use above anything else, or is security what you value most in a security product? Perhaps the only way to tell is to take both for a spin and decide for yourself.

The case for symmetric encryption

In this day and age, everything dealing with encryption seems to be of the more complex asymmetric kind: PGP, SSL, TLS, BlockChain, you name it. So, is the old-fashioned symmetric encryption, where the same key is used to encrypt and decrypt, obsolete and done with? “By no means!” say a number of users. In this article, I begin quoting an email I got recently, adding some of my own musings, and making an announcement after that.

This is what a faithful user sent me. He did not want his name to be published but was okay with my quoting what he said:

The case for symmetric encryption:

There are circumstances when symmetrical encryption, that is where both sender and recipient use the same secret key to encrypt and decrypt messages, is the most practical and safest method for encrypting email.

Whenever a message sender, who is known by cryptographic custom as Alice, wishes to write an end-to-end encrypted email to a recipient, customarily known as Bob, one of two cryptographic systems can be used.

The simpler is symmetric encryption, in which Alice and Bob have a single secret key, which is used by both of them to encrypt and decrypt messages. The obvious shortcoming of symmetrical encryption is that before Alice and Bob can email, they need to meet up – or have some other safe channel – through which to communicate the secret key. Asymmetrical encryption solves that problem. Both Alice and Bob have two mathematically related keys, one private, one public. For Alice to send Bob an encrypted message she ascertains his public key and encrypts her message using it. The message can only be decrypted using Bob’s private key, which he keeps secret and safe.

It would seem, then, that asymmetrical encryption, involving no prior secret exchange of keys, enjoys a clear advantage, and for many purposes it does. But there are a number of things that can go wrong with asymmetrical encryption, which can’t happen with symmetrical encryption – or at least when the secret symmetric key is agreed face-to-face. Let us look at these:

1. Alice is sending Bob a message encrypted with Bob’s public key. However she needs to authenticate it; i.e. prove the message is from her. Precisely because Bob’s public key is public anybody could encrypt a message using it and then impersonate Alice. To prevent that, Alice “signs” the message using her own private key. To read the message Bob uses his private key; and to verify the authenticity of the message he needs Alice’s public key. The difficulty is solved, but only at the expense of complexity. With symmetric encryption signing and verification are not necessary because the ability to encrypt and decrypt using the single secret key is proof of authenticity.

2. Before Alice can email Bob she needs to find Bob’s public key, which may be on his website or in some other public place. But how can Alice be sure that the website or key server has not been tampered with, and that she is not encrypting the message with a key that can be read by somebody else? Equally, when Bob needs to find Alice’s public key from a public place to verify the message, how can he know it is genuine? If Alice and Bob had agreed a symmetric key face-to-face the issue of tampering and impersonation would not arise.

3. It could happen that Alice or Bob believe that their private key is no longer secret and safe. If someone else had acquired it all his or her incoming mail could be read, but revoking the key from a public place is not easy. To be successful, everyone who has or might use it needs to know of the revocation and of the replacement. With symmetric encryption, compromising one key only affects the two parties involved; they can then easily set up a new key – and maintain different levels of security for each key used with different people. Alice’s communication with Bob can be kept more securely than her communication with Bill, for instance.

Asymmetric encryption therefore brings with it a number of difficulties not suffered by symmetric encryption. The sole disadvantage of symmetric encryption is that Alice and Bob need to agree a secret key face-to-face or through some other safe channel. But in many cases that is no difficulty at all. It may well be that Alice and Bob meet in person as well as send emails. Alice and Bob could be lovers, or they could be members of a political action group which is under surveillance by security agencies. There is no safer form of email encryption than for Alice and Bob to meet, agree a twenty-character long password consisting of random numbers and letters, and for both of them to keep it secret and safe and to use it to encrypt and decrypt their text emails.

This is what he said, and this is what I’d like to add:

I have been focused on asymmetric encryption for a while because symmetric encryption has the seemingly insurmountable obstacle of key distribution. Namely, that there is no good secure way to get a new key to those who should use it other than meeting face to face. But if meeting is possible, then it is hard to beat symmetric encryption, especially if the agreed-on key is a random string that is deleted from its storage location when a new key replaces it. In this case, old encrypted messages become undecryptable, and they possess the very valuable property of forward secrecy.

PassLok does retain the ability to use symmetric encryption by providing a shared Key, instead of a Lock, whenever the program does an encryption. If the shared Key is not stored in the directory (it can be stored, just like a Lock), then it takes a click of the Edit button to reveal the box where that shared Key should be placed, and another click to get back to the Main screen. But the author of the above piece, and I presume many others, still consider this too confusing when faced with so many other options and buttons to be pressed.

Therefore, I have updated URSA so it does only symmetric encryption, bringing it to version 4.0. You can find it at https://passlok.com/ursa.  Its interface is similar to that of PassLok, replacing the directory box with a shared Key input box. Help is similar to that in PassLok and SeeOnce, but called from a button rather than occupying its own tab. There are no Options to choose. Because it is based on the NaCl engine like PassLok 2.2 (unlike URSA 3.3, which is based on the SJCL engine), URSA 4.0 is not compatible with version 3.3, but it is compatible with PassLok 2.2 so that messages locked with URSA 4.0 can be unlocked in PassLok, and (short mode) messages locked with PassLok can be unlocked in URSA 4.0.

My hope is that URSA can serve as an introduction to practical encryption for many people. After they feel comfortable with shared Keys and the basics of encryption and decryption (“locking” and “unlocking” in URSA and PassLok), then they can move to the fuller experience that is PassLok. If they still feel overwhelmed, they can try SeeOnce for a while, which will introduce them to asymmetric encryption in a relatively simple way, and then try PassLok.

So, check out the new URSA at https://passlok.com/ursa. Perhaps this is all you’ll ever need for your encryption needs.

Which end-to-end encrypted email is best?

After the 2013 Snowden revelations, there has been a push to make email more private than it currently is (which is, essentially, like writing on a postcard). The big guns, Google and Yahoo, have vowed to make their respective email systems end-to-end (E2E) encrypted, but progress has been slow. The official page about the Google effort has not been updated for months (as of June 2015). In this article, I go over some options available today, while we wait for that final solution that, at this pace, might still take a while to come.

There are dozens of options, but here I am going to review only a few that are better known in order to compare their features. Here I compare the following: Enlocked 2, Virtru, Proton Mail, Mailvelope and, of course, my own PassLok. All of them promise to make encryption easy, at least compared to what it used to be. Some take pretty drastic shortcuts in order to achieve this, as we shall see. They all perform encryption and decryption at the user end, rather than at the server, since otherwise the server would have access to the plain content, with the consequent risk if subpoenaed or simply hacked. They all run the encryption code in JavaScript, which is a no-no for some, but currently there is no other way to run code locally on a browser.

Given that they all do the essential thing in some way or another, I looked into other desirable features of an end-to-end email system. Here’s a description of each feature:

  1. No need to change address. It’s a hassle if you need to use a different email address in order to have privacy. Here a Yes means that you can continue using your current email provider and address, and still obtain the benefit of encryption.
  2. Provider doesn’t store secrets. If it does, there’s always the danger that they’ll be forced to reveal them. Now, some providers store them in encrypted form, which is better, but still they’re storing something sensitive essentially forever.
  3. Provider cannot read email. As ludicrous as this may sound, at least one of the providers featured in this article is able to read your emails, or enable someone else to do so. We are assuming the encrypted content is easy to intercept.
  4. Account not required. Because some of us are paranoid enough to prefer not being forced to make an account anywhere, to prevent being tracked when we use it, or whatever other reason.
  5. Encrypt attachments. Not all we send is text. Often the sensitive information will be a picture, document, or any other kind of file. Encryption should be able to handle this.
  6. Encryption visible. A study published in 2013 showed that, for the time being, users prefer to see their messages in encrypted form at some point, in order to be assured that they are indeed encrypted before they send them out. They were willing to go as far as cutting and pasting the material in order to get this. This feature does not need to be on 100% of the time, but at least as an option.
  7. Encryption can be disguised. A number of users need encryption because they fear they are under surveillance. They would be even happier if they could make their emails and attachments appear as if they were not encrypted at all. This is called steganography, and at least one of the systems reviewed adds this feature to the mix.
  8. Forward secrecy. This means that the encrypted content remains secure into the future, even if the encryption key or the user password is obtained by an enemy. This is considered essential for instant messaging apps, and would be nice if email could also pull this trick, perhaps as an option.
  9. Multi-platform. It is no good if users need to use a particular kind of device (PC, iPhone, whatever) in this fragmented market.
  10. Portable across machines. This means that a given user should be able to use his/her E2E encrypted email on different machines, possibly of different kinds, with a minimum of risk and hassle.
  11. Multiple identities. What if several family members or coworkers share a computer? Can they keep privacy from each other? What if you’re schizophrenic or have multiple personalities?
  12. Open source. Not being open source is a deal-breaker for many. Some of us feel a lot more reassured if the underlying code is available and experts can subject it to scrutiny.
  13. Crypto engine. There are a number of cryptographic engines out there, some more recent than others, but all the systems presented here use an engine that has accumulated at least ten years of scrutiny.

So here’s the comparison as a table. Yes is good, No is bad. Some entries have a footnote right below the table.

Feature                         Enlocked 2   Virtru    Proton Mail   Mailvelope   PassLok
No need to change address       Yes (1)      Yes (1)   No            Yes (1)      Yes
Provider doesn’t store secrets  No (2)       No        No (2)        Yes          Yes
Provider cannot read email      Yes          No        Yes           Yes          Yes
Account not required            No           No        No            Yes          Yes
Encrypt attachments             Yes          Yes (3)   Yes           No           Yes
Encryption visible              No           No        No            Yes          Yes
Encryption can be disguised     No           No        No            No           Yes
Forward secrecy                 No           No (4)    No            No           Yes
Multi-platform                  Yes (5)      Yes (5)   Yes           No           Yes
Portable across machines        Yes (6)      Yes (6)   Yes           No           Yes
Multiple identities             No           No        Yes           No           Yes
Open source                     No           No        No            Yes          Yes
Crypto engine                   PGP          AES       PGP           PGP          NaCl
Overall score (# of Yes)        5            4         5             6            12

1. The app works only with certain email providers, not all of them
2. They do store encrypted private keys, hence the bad score
3. Separate encryption and delivery from server
4. They deny access to the encryption key (paid feature), but the key is not deleted
5. Browser plugins plus apps in iOS/Android stores
6. User secret data saved in servers (encrypted) enables this

Now a short description of each E2E provider, and why they got these scores.

Enlocked 2

Their first version got slammed by reviewers because it was doing the encryption on a server instead of the client. This meant that their server got  to see the plain text of your private stuff, even though they promised (who doesn’t?) that they didn’t store it. Enlocked 2 is a browser plugin (standalone mobile apps exist) that performs PGP encryption on the client. Their server holds each user’s public key so it can be sent to other users who want to encrypt messages for the owners of those keys to read. The plugin automatically retrieves each public key belonging to a recipient in an email and uses it to encrypt the content before sending the encrypted result to the actual email provider. Because it is a plugin, it only works with Gmail, Yahoo mail, Outlook, and a handful more.

In order to achieve portability, Enlocked also stores and serves the user’s private key, previously encrypted by his/her password. This is a problem, since compromising that password or choosing a bad one (Enlocked accepts bad passwords without complaint) makes all your encrypted emails readable by Enlocked or those to whom they give your private key.

Enlocked is a commercial service, which costs $9.99 every month for sending 100 messages, and goes up from there. You can sign up for a free account, though, which allows you to decrypt an unlimited number of messages and encrypt 10 messages every month. Their system seems to be glitchy: I’ve tried for several days to make a free account without success, leading only to error screens that instruct me to contact support (without any link to support, though).

Virtru

On the surface, Virtru behaves very much like Enlocked. They have slick plugins and mobile apps. They encrypt text and attachments. You only need your account password to use it anywhere you go. The difference is that they use symmetric encryption (256-bit AES) instead of asymmetric encryption (PGP). When you encrypt something, you send the actual, per message encryption key to Virtru, and they store it so they can give it to the recipients of your encrypted message. The whole process is quite automatic and makes it possible for new users to be added in a flash, since no initial key generation and exchange process is needed.

The downside is that they can read your encrypted messages if they intercept them (they are sent through your regular provider, from a limited list of supported ones), and can enable anyone who asks nicely to do so. This should be a deal breaker for anyone who has real confidential material to protect. To make matters worse, they charge for “forward secrecy”, meaning that Virtru promises to no longer give the decryption key to anyone except you (and, possibly, government agencies).

If you don’t want the paid features, Virtru is free to use, though making an account is a must. A Pro account with the extra features will normally set you back $4 a month.

Proton Mail

Proton Mail is Enlocked with an email server thrown in. When you sign up for Proton Mail, you sign up for the whole thing, and you get a username@protonmail.ch email address. No way to keep using your old address except by forwarding. Proton Mail requires the user to memorize two keys: one is for logging into the email server and getting your emails, plus obtaining the PGP public keys of other users automatically; the other is for encrypting your PGP private key, which is stored in their server so you can use different machines. Since they must know your login key in order to give you access, they’d also be able to read your encrypted mails if both keys were the same, hence the need for two separate keys.

Proton mail is accessed strictly as a website, so no plugins or mobile apps are involved. Interestingly, this approach makes it accessible from any device, running any operating system.

Proton Mail just finished beta and is available for signups; everything is free, though not open source. I believe that Google’s and Yahoo’s E2E solutions, when they come, will end up looking a lot like Proton Mail.

Mailvelope

If Proton Mail was Enlocked plus a mail server, Mailvelope is Enlocked minus a key server. It uses PGP like the other two, but key generation is kept strictly to your machine. The user is responsible for distributing his/her public key to others, and for safeguarding his/her private key, which is locally stored encrypted by a password. Mailvelope is only available as a Chrome extension and out of the box it is limited to Gmail, Yahoo mail and a couple more, though apparently you can configure other clients manually (I didn’t try). Integration with those is quite slick, however, as Mailvelope detects encrypted incoming emails and offers to decrypt them if you supply the password.

This extension may be all that many people will ever need, so long as they don’t use encrypted email with a lot of people (public key entry and selection is a bit involved) and don’t need portability (settings don’t sync across machines).

PassLok

Obviously it was a foregone conclusion that PassLok was going to be the winner in a comparison made by its own developer. I won’t dispute that, but the fact remains that PassLok has all the desirable features on the list. You can use it with your existing email (in fact, with any program at all). It doesn’t store your content or even anything secret. It doesn’t talk to servers so you’re not making any account anywhere. It handles files as well as text. It runs as a Chrome plugin, iOS/Android app, or just as a webpage that gets completely cached so you don’t even need an Internet connection after the first time. It includes a full complement of text and image steganography tools so your encrypted material can slip undetected. In its Chrome app incarnation, it even syncs all your stored public keys (which are not secret, but may be hard to obtain again) using Google’s servers. It is possibly the only web app available today capable of making a self-destructing message, where the key needed to decrypt it disappears from the face of the earth as soon as it is read once.

But it must have some defects, right? C’mon!

A lot of PassLok’s security lies in the fact that it is self-contained code, and precisely this is why some things are harder to do than in other systems. PassLok does not interact with email clients, and so you must manually copy and paste the encrypted material between PassLok and whatever email client you are using. This is a hassle, but it has the advantage that you know without a doubt that your data is encrypted before it gets to your email. Some email providers, like Google, log every keystroke you type into the compose window, so it is important to encrypt your messages before the email program sees them.

PassLok does run a server to help users obtain other users’ public keys. It is called General Directory, and currently it is manual rather than automatic so that the main PassLok code is isolated from contact with any server. A lot of things in PassLok are automatic, but nothing is forced on the user. If the user decides not to make public his public key, but rather send it to a selected few, that’s okay with PassLok. Most PassLok users actually do this. They are a paranoid but dedicated bunch.

PassLok for Email enters beta

Johnny can’t encrypt. It’s no use. . . . This is what has been said repeatedly about mere mortal users and encryption, which forever has been the domain of black chambers and mathematical geniuses. Scores of companies have tried to get around this problem by hiding encryption in their servers, far away from users’ eyes.

And yet, studies have shown that this creates another problem: if I can’t see any of the encryption, how can I relax and be sure that this message where I’m risking my career, maybe my life, is truly away from prying eyes? Just because the software tells me to relax?

PassLok does not hide encryption from users, and it tries hard to make it accessible. This is why the next step in its development is so important. PassLok for Email is a new Chrome extension that adds PassLok encryption to regular web-based email apps. Its beta, which supports Gmail, Yahoo, and Outlook online, has just been released for public testing. (more…)

The FBI won’t like this post

I fully expect to start hearing funny clicks on my cellphone or see people in trench coats behind me after finishing this. Perhaps you, who are reading the article, will have a similar experience.
The reason? Here I’m telling you why all the current debate on whether the FBI and other law-enforcement agencies should have access to an individual’s encrypted information is moot, because that individual doesn’t really have to rely on anyone else in order to thwart that effort.

It is no secret that FBI Director James Comey isn’t happy about encryption being used in digital communications. To quote him:

Those charged with protecting our people aren’t always able to access the evidence we need to prosecute crime and prevent terrorism even with lawful authority. We have the legal authority to intercept and access communications and information pursuant to court order, but we often lack the technical ability to do so.

Consistent with this view, he insists on the need to provide some way for his agency, and possibly other agencies engaged in suppressing terrorism and other crimes, to legally decrypt those communications. He is generous about it, too, since he is quite willing to seek court approval for each request, rather than demand carte blanche for the practice, to be used whenever the need arises without further approval, or simply ignore the courts as in the past. Technology leaders have come out in strong opposition to this desire, but bills have nevertheless been introduced in New York and California seeking to force them to provide the kind of access that Comey is requesting. The issue has been prominent in the recent presidential debates of both the Republican and the Democratic parties.

So the battle lines are clearly drawn, right? The Government wants a back door because they want to protect the little guy from bad actors who are currently operating in darkness, thanks to encryption; companies don’t want to provide them because they would negate one of their biggest selling points, which makes the public like them and use their products. The sound bites hurled from both sides are well known: “You are protecting terrorists and child molesters!”, “You want to put a camera in my bedroom!”, “Security is paramount!”, “The right to Privacy must be held sacred!” And we are being pressured into falling into one camp or the other.

That is, if the whole issue depends on whether or not tech companies complied, more or less willingly, with the FBI’s request for a workable peephole. But a few important things are seldom mentioned:

  1. Tech companies are global, and so they have to deal with more than the US Government. If a peephole exists, could they deny its use to the government of China . . . Iran . . . North Korea? Are tech companies supposed to maintain a running record of who the good guys are at the moment? How would the US Department of State like it if they allowed such access to the Secretary of State’s encrypted email (not that this has necessarily happened already, but it might happen ;-).
  2. It’s only terrorists and child molesters (and possibly drug dealers, money launderers and other high-profile criminals, I guess) that are to have their unalienable right to privacy suspended, so relax, the rest of you. But if this is the case, the normal government way to handle this would be to have users sign an affidavit to the effect that they are not seeking to use email or whatnot in order to engage in terrorism, child molestation, or whatnot. See, for instance, the current USCIS application for permanent residence. This is the way they control exports, gun ownership, and just about everything else. You may think I’m kidding (I am, of course), but there’s some hidden disingenuity here that is begging to be exposed.
  3. How would the Government protect the data being de-privatized (their record on keeping their own private data is frankly dismal) and, likely harder to do, how would they ensure that they would remain “the good guys” for as long as that data has a value, without abusing, stretching, or plain breaking the law they are charged to protect? If the past is the best indicator of the future, things don’t look too rosy in this area. Not to mention governments in places like Europe, Latin America (I’m looking at you, Venezuela), or the Far East.
  4. Finally, the whole debate may be useless, since individual users already have the ability, protected by the US Constitution and tested multiple times in the past, to do their own encryption without having to rely on a service that might have a backdoor or peephole. This is why the FBI is not going to like the rest of this post, because I’m telling you how to do it.

The key here is that any information that humans can read and understand is “speech,” as narrowly defined by lawyers and legal experts. The free exchange of “speech” is protected by the First Amendment to the US Constitution and similar laws in all other democratic countries. Free speech is believed to help in keeping the government from spiraling down to totalitarianism, and therefore is here to stay, no question. The First Amendment also protects my telling you my awesome recipe for fried chicken if I choose to (which I won’t do, for reasons that should be obvious to everyone), or how to encrypt your own stuff in a way that the NSA will never be able to decrypt. I didn’t steal this secret from the NSA and it is completely mine to reveal, in use of my free speech rights guaranteed by the First Amendment to the US Constitution. I say “guaranteed” and not “given.” Rights are given by God; the government merely recognizes and hopefully protects them.

It so happens that the “perfect cipher” that the NSA will never be able to break already exists. It is called the one-time pad and only needs paper and pencil to do it. It also needs the eponymous one-time pad, which used to require a whole team of typists trained in moving their fingers as randomly as possible (monkeys were tried, but costs were too high ;-). In previous posts, however, I have described several methods that generate a one-time pad of acceptable quality (not perfect but certainly better than what Soviet spies used during the Cold War, whose messages remain unbroken today), starting from simple text. I have collected all of those into a new website, and this is the address:

http://chaosfromorder.weebly.com

The title has nothing to do with a New World Order or nonsense of that sort. Rather, it brings home the point that what makes those ciphers work is extracting the randomness that normal text contains, which is roughly 1.6 bits of entropy per letter. Since random text with a 26-character alphabet would contain log(26)/log(2) = 4.7 bits of entropy per letter, we need a piece of text at least three times the length of our message in order to generate a “one-time pad” to encrypt it.

Using these ciphers you can encrypt a message with utmost security and without the need for a computer, smartphone, or whatnot. It does not matter, therefore, whether Internet or communications providers are required to weaken their own encryption, since those who really want to keep the FBI from reading their messages can always do so no matter what. So, who does Comey think he can catch through all that legislative agitation? Likely, only stupid terrorists and child molesters, not the smart ones.

Chaos from order

Sounds like a play on words, doesn’t it? And yet, this is exactly what I mean. Sixty years ago, renowned mathematician John von Neumann published a little trick that allowed using a biased coin, where heads and tails do not come out at 50-50, to generate a true, unbiased, 50-50 random sequence. It turns out that this trick can be extended to larger sets, such as alphabet letters, in order to generate what appears to be a true random sequence of letters (chaos) from common text (order, unless you’re starting from a political speech or your latest cellphone bill).

The result, as you probably have already guessed, is yet two more paper-and-pencil ciphers, DicePad and LetterPad, that come dangerously close to perfect unbreakability.

This is how von Neumann fixes a bad coin, provided one throw does not affect the result of the next (that is, they are statistically independent):

  1. Throw the coin an even number of times and record the result. If we call Heads = 0 and Tails = 1, the result looks like a long binary number.
  2. Now take the digits in groups of two and make this conversion: 01 = 0, 10 = 1, 00 and 11 = no result.
  3. The resulting string, which will be at most half the length of the original (and typically about a quarter, or less), will have 0’s and 1’s in a truly random pattern and with frequency approaching 50%.

It is simple to understand how this works. If the throws are independent of each other, the sequences 01 and 10 will have the same probability since they only differ in the order in which the 0 and the 1 appear, which should be truly random if the throws are independent. This is true even if Heads is much more (or less) likely to appear than Tails. Say the probability of Heads is 90%, Tails 10%, then the probability of Heads-Tails is 0.9 x 0.1 = 0.09, and that of Tails-Heads is 0.1 x 0.9 = 0.09 also. The other two possibilities don’t work so well: Heads-Heads has probability 0.9 x 0.9 = 0.81, and Tails-Tails has probability 0.1 x 0.1 = 0.01, but this doesn’t matter because we are not using them.
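For readers who prefer code to prose, here is a minimal JavaScript sketch of the trick (the function name and the string format are my own, purely for illustration): it takes a run of biased coin throws written as 0’s and 1’s and returns the debiased sequence.

//von Neumann debiasing: takes a string of '0' and '1' throws from a biased coin and
//returns a shorter, unbiased string, provided the throws are independent of each other
function vonNeumannDebias(throws){
  var output = '';
  for(var i = 0; i + 1 < throws.length; i += 2){     //take the throws in non-overlapping pairs
    var pair = throws.substr(i, 2);
    if(pair === '01') output += '0';                 //01 becomes 0
    else if(pair === '10') output += '1'             //10 becomes 1; 00 and 11 give no result
  }
  return output
}

//example: a run heavily loaded toward heads (0) still yields a balanced result
//vonNeumannDebias('0001000100100000010010') returns '00101'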

Clément Pit-Claudel has extended this result to dice and published it on his blog. Now instead of taking the resulting numbers in groups of two, he takes them in groups of three and looks at the relative ordering of the numbers, rejecting any group where a number is repeated. For instance, the result 1,2,3 becomes 1, while 3,5,1 becomes 3. Since the final result depends on the order in which the low-middle-high numbers appear rather than the values themselves, and the tosses are supposed to be independent, then each possible permutation of three different low-middle-high numbers will appear with the same frequency, and there are 6 of them. Take only those and you’ve got perfect die performance.

Pit-Claudel does not mention permutations in that article, but this is what is at the heart of the process. Each permutation can be converted to a number by first converting it to a Lehmer code, which for a set of three throws would consist of three digits (0 to 2), counting the number of inversions from the natural order involving that throw. For instance, if the result is 3,5,1 (middle-high-low), the first digit of the Lehmer code would be 1 because the 3 is ahead of a smaller number once (5 is larger, but 1 is smaller, which would have appeared first in a natural ordering). The second digit is also 1 because the 5 is ahead of 1, which is smaller. The third and last digit is zero, which is always the case since nothing follows the last number. Thus the Lehmer code for 3,5,1 is 110.

Once the Lehmer code is found, it can be converted to decimal by multiplying the last digit by 0! = 1, the second-last by 1! = 1, the third-last by 2! = 2, the fourth-last by 3! = 6, and so on, and adding all the results. This is because Lehmer codes are factoradic numbers at heart. Thus for the example we get 1 x 2 + 1 x 1 + 0 = 3. The smallest result that can be obtained is 0, the largest 2 x 2 + 1 = 5, so these results work out exactly like dice whose faces are marked with numbers 0 to 5. Observe that numbering the permutations through Lehmer codes does not produce the same numbers as with Pit-Claudel’s scheme, but the expected frequencies are still the same.
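Here is a short JavaScript sketch of this conversion for a group of three throws, so the worked example can be checked; the function name is mine, not something taken from Pit-Claudel’s article.

//converts a group of three independent throws (an array of numbers) into a value from 0 to 5
//through the Lehmer code of the permutation; returns null if any value is repeated
function lehmerDie(group){
  var a = group[0], b = group[1], c = group[2];
  if(a === b || a === c || b === c) return null;     //repeated values give no result
  var first = (a > b ? 1 : 0) + (a > c ? 1 : 0),     //inversions caused by the first throw
      second = (b > c ? 1 : 0);                      //inversions caused by the second throw
  return first * 2 + second                          //weights are 2! and 1!; the last digit is always 0
}

//the example from the text: 3,5,1 has Lehmer code 110, which is 1 x 2 + 1 x 1 = 3
//lehmerDie([3,5,1]) returns 3; lehmerDie([1,2,3]) returns 0; lehmerDie([3,3,1]) returns null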

The scheme can be extended to “coins” having more than two sides (say, fat coins that might fall edgewise) and “dice” with more than six sides, when used as source data. Suppose we have a three-sided coin like this, which can produce results heads=0, tails=1, and edge=2. On top of that, it is not a fair coin, so that the probability of heads is 70%, tails is 20%, and edge is 10%. We will take data in groups of two as before, but now we will assign the result “heads” if the assigned numerical value increases from first to second and “tails” if it decreases, and reject the pair if the two values are equal. Thus a “heads” result will be recorded for the sets heads-tails, tails-edge, and heads-edge, and “tails” will be recorded for tails-heads, edge-tails, and edge-heads. Then the probability of the result “heads” will be 0.7 x 0.2 + 0.2 x 0.1 + 0.7 x 0.1 = 0.23. The probability of the result “tails” will be 0.2 x 0.7 + 0.1 x 0.2 + 0.1 x 0.7 = 0.23 again, of course, because multiplication is commutative. Since the other results are rejected, “heads” will be recorded the same number of times as “tails” over a long run, which is a 50-50 split.
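A few lines of JavaScript (again, the names are mine) express this generalized pairing trick for a biased “coin” with any number of sides:

//takes an array of throws from a biased "coin" with any number of sides (e.g. 0, 1, 2 for the
//three-sided coin) and returns a string of fair bits: an increasing pair means heads ('0'),
//a decreasing pair means tails ('1'), and pairs with equal values are discarded
function pairDebias(throws){
  var bits = '';
  for(var i = 0; i + 1 < throws.length; i += 2){
    if(throws[i] < throws[i + 1]) bits += '0';
    else if(throws[i] > throws[i + 1]) bits += '1'
  }
  return bits
}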

The Lehmer code scheme can also be extended to sets larger than 6. The next is permutations of four different symbols, which have 4! = 24 possible orderings, and then five symbols, with 5! = 120 possibilities. Four symbols comes close to alphabet size, and we will use it to encode messages with the help of a Tabula Recta. Next best is doing the dice trick twice so we have 6 x 6 = 36 different combinations, sufficient for 26 alphabet letters plus 10 decimal digits. Therefore I will discuss two ways to produce a near-random keystream starting from common text, which will be used as a one-time pad of sorts in order to encrypt a message with something approaching perfect unbreakability. The fact that there are 26 different letters, which is more than 6 or 24, shouldn’t bother us in light of what I discussed in the previous paragraph, but we have to worry about the strong correlation between letters.

This is because, if we start from common text as a source of randomness (which it does contain in relatively small amounts), we must deal not only with the fact that the letters appear with different frequencies (like imperfect coins or loaded dice), but also with the fact that they are not independent of each other. For instance, “Q” is nearly always followed by “U”, “X” almost never follows “K” or “B”, and so forth. There is a lot of order in speech, in the form of correlation between letters, which we must destroy first if we want to use the trick we’ve been dealing with.

Fortunately, the correlation between letters only goes so far. Can you predict what letter the next sentence will begin with, even if you know the previous sentence perfectly? Can you predict the word that comes three words after the present one? If you could, you would hardly need to read anything; all the meaning would be contained in the first few letters and the rest would follow from those. This correlation has been measured in different ways. This article, for instance, displays it in Figure 3 in the form of “conditional entropy,” which becomes smaller and smaller as we move away from the root letter (“block length,” in that paper). After 8 to 12 characters, depending on the text being analyzed, the effect is zero within the bounds of statistics. Another very sensitive way to measure this correlation, which is implemented in the program shown below, is to calculate the chi-squared statistic of groups of two symbols separated by a certain distance, testing the hypothesis that the two symbols are independent of each other. If the hypothesis is correct, the statistic returns a low value that does not depend on the sample size. Here’s the program, in JavaScript. It returns an array giving the chi-squared statistic (minus threshold value) with increasing distance between letters (first element: consecutive letters):

function indepTestArray(text,maxDistance){
 text = text.toUpperCase().replace(/[^A-Z]/g,'').trim();  //the text should be clear of accents, etc. before this
 var base26 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
     output = [],
     freqArray = [],
     data,result;

//first get frequencies for each letter
 for(var i=0; i<26; i++){
   data = 0;
   for(var j=0; j<text.length; j++){
     if(text.charAt(j) == base26.charAt(i)) data++
   }
   freqArray[i] = data
 }
 
//now get the chi-squared statistic, which will have a value smaller than 670.7 if the test passes at 10% error
 for(var l = 1; l <= maxDistance; l++){                        //for each shift, do a 2-letter chi-squared
   result = 0;
   for(var i=0; i<26; i++){                                    //each first letter
     for(var j=0; j<26; j++){                                  //each second letter
       data = 0;
       var expected = freqArray[i]*freqArray[j]/text.length;           //expected P(xy) = P(x)*P(y)
       if(expected > 0){                                      //in case a letter does not appear at all
         for(var k=0; k<text.length-l; k++){
           if((text.charAt(k) == base26.charAt(i)) && (text.charAt(k+l) == base26.charAt(j))) data++
         }
         result += Math.pow(data - expected,2) / expected
       }
     }
   }
   output[l-1] = result - 670.7                              //negative means a pass at 10% error
 }
 return output
}
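As a quick illustration of how the function might be called (the variable bookText and the maximum distance of 20 are made up for this example), something like this prints which separations pass the independence test:

//hypothetical usage: bookText would hold the full text of a book from gutenberg.org
var scores = indepTestArray(bookText, 20);
for(var shift = 1; shift <= scores.length; shift++){
  //a negative value means letters this far apart pass the independence test at the 10% level
  console.log('shift ' + shift + ': ' + (scores[shift - 1] < 0 ? 'independent' : 'correlated'));
}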

I have applied this function to whole books from gutenberg.org (hundreds of thousands of characters), with the following result. The value for adjacent letters (shift 1) is always very high, meaning that adjacent letters are clearly not independent of one another. But the value drops as the distance between letters increases, and it typically dips below the threshold value for a shift of 14 and beyond, sometimes before. The same result is obtained both in English and Spanish. That is, letters that are 14 spaces apart from each other in regular text can be safely assumed to be statistically independent. Smaller spacings might work in practice, too, but this would have to be checked experimentally.

One way to ensure that we are using independent data is to take the source text in sufficiently large chunks. For instance, let’s say the source is “It was the best of times, it was the worst of times,” and we are going to take it in chunks of eight characters, ignoring spaces. We will wrap the text after eight letters until three rows are formed and then repeat with another block, obtaining the following:

ITWASTHE THEWORST
BESTOFTI OFTIMES
MESITWAS

where the second block is still incomplete. If we take the letters in groups of three vertically (I B M, T E E, and so forth), those letters are 8 spaces apart in the original text and stand a good chance at being independent of each other. Now, the letters in the first group have a fairly strong correlation with the letters in the second, since they are consecutive in the original text. This means that the result of evaluating a permutation in the first group might have a correlation with the result of doing the same in the second. In order to prevent this, a further step might be to scramble the letters in each row with different shifts. Let’s say we are going to leave the first row unscrambled, but that we will scramble the second by making 4 rows, and the third by making 2 (row numbers should divide the block length, 8 in this case). Therefore, we will write the second row over two columns and then read it off by columns, and the third row over four columns, and do the same. This is the process:

BE       MESI
ST       TWAS
OF
TI

Resulting in “BSOTETFI” for the second row and “MTEWSAIS” for the third. In practice, this kind of shuffling can be carried out as the letters are first written down, by leaving spaces that are filled as more letters are added. The first block then becomes:

ITWASTHE
BSOTETFI
MTEWSAIS

And the groups, read vertically, are “I B M,” “T S T,” “W O E,” and so forth. Now we can look at the permutation within each group in order to obtain a “die toss” from each one. Thus, “I B M” is of the form middle-low-high, which gives Lehmer code 100, which translates to decimal 1 x 2 + 0 = 2. “T S T” has the letter “T” repeated, so it does not yield a die toss; “W O E” gives Lehmer code 210, which in decimal is 2 x 2 + 1 = 5, and so forth. In the end, this block gives the following base 6 number, where dashes represent the groups that had at least one repeated letter and were not used: 2-50--20.

Observe that we are using letters and alphabetic order rather than numbers from a die in order to run the scheme. Since the result is based on the relative order of the symbols rather than on their absolute values, it works just as well as we saw earlier. In fact, the larger the set from which the symbols are drawn (26, in the case of letters versus 6 for die numbers), the smaller the probability of a repeated symbol in the group and the more efficient the scheme becomes.
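To make the keystream extraction concrete, here is a minimal JavaScript sketch (the function name is mine) that evaluates the vertical groups of a three-row block, counting inversions exactly as in the worked example above; chunking the key text and the optional row shuffling are left out for brevity.

//takes three rows of equal length (already chunked from the key text and, optionally, shuffled)
//and returns an array of base 6 digits, with null marking groups that contain a repeated character;
//JavaScript compares characters by code, so digits sort before uppercase letters
function keystreamFromRows(row1, row2, row3){
  var digits = [];
  for(var col = 0; col < row1.length; col++){
    var a = row1.charAt(col), b = row2.charAt(col), c = row3.charAt(col);
    if(a === b || a === c || b === c){ digits.push(null); continue }   //repeated characters: skip
    var first = (a > b ? 1 : 0) + (a > c ? 1 : 0),                     //inversions from the first character
        second = (b > c ? 1 : 0);                                      //inversions from the second character
    digits.push(first * 2 + second)                                    //Lehmer code to decimal: a base 6 digit
  }
  return digits
}

//the shuffled block from the example:
//keystreamFromRows('ITWASTHE', 'BSOTETFI', 'MTEWSAIS') returns [2, null, 5, 0, null, null, 2, 0],
//that is, 2-50--20 when the nulls are written as dashes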

If we use groups of four symbols, then we can go all the way to base 24, affording the possibility of operating directly with text. The sentence in our example becomes this when arranged in groups of four lines of length 8 characters (only the first block is complete):

ITWASTHE OFTIMES
BESTOFTI
MESITWAS
THEWORST

We’re not going to scramble the rows this time, but go directly to obtaining the base 24 permutation codes for each group of four letters, read vertically. Four-figure Lehmer codes are converted to decimal by multiplying the first figure by 6, the second by 2, and the third by 1 and adding all of this together. Thus the first vertical group “I B M T” gives Lehmer code 1000, which translates to decimal 1 x 6 + 0 x 2 + 0 = 6. The second and third groups include repeated letters so they won’t be used. The fourth group “A T I W” gives Lehmer code 0100, which in decimal is 0 x 6 + 1 x 2 + 0 = 2, and so forth. This is what the first block yields, with dashes again representing skipped groups: 6 – – 2 – 13 10 0.
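The same idea, sketched in JavaScript for groups of four (names mine), reproduces the numbers just obtained:

//converts a group of four distinct characters into a number from 0 to 23 through its Lehmer code,
//counting inversions as above; returns null if any character is repeated
function lehmer24(group){
  var weights = [6, 2, 1, 0],                                //3!, 2!, 1!, and 0 for the last digit
      code = 0;
  for(var i = 0; i < 4; i++){
    for(var j = i + 1; j < 4; j++){
      if(group.charAt(i) === group.charAt(j)) return null;   //repeated characters give no result
      if(group.charAt(i) > group.charAt(j)) code += weights[i]
    }
  }
  return code
}

//groups from the example: lehmer24('IBMT') returns 6, lehmer24('ATIW') returns 2,
//while lehmer24('TEEW') returns null because of the repeated E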

And this is it. We have extracted chaos out of order. This is real uncertainty, not something that merely looks like it, as in pseudo-random number generators, because the source text does contain some information entropy that has been concentrated through the process. Some quick calculations: a base 6 random sequence contains log 6/log 2 = 2.585 bits/digit of entropy; the first block produced 5 digits, so that is 12.92 bits of entropy. Was there enough entropy in the source text to justify this? Common English text averages 1.58 bits/character and we had 3 x 8 = 24 characters, for a total of 37.92 bits of entropy. So yes, we are not “creating” entropy that wasn’t there originally. The numbers for the base 24 example are: log 24 / log 2 = 4.585 bits/number, for a total of 22.92 bits of entropy for the 5 numbers produced, and we started from 32 characters of text, which are expected to contain 50.56 bits of entropy. Even if no rejects had taken place, the entropy of the result would not have exceeded the input entropy, in both cases.

We can now move on to describe a couple of new ciphers based on this algorithm. The first one, which I am calling DicePad, operates in base 6, and the second, called LetterPad, uses base 24 and a Tabula Recta for addition and subtraction. The links lead to JavaScript versions of either one, which also explain the steps in detail. The only difference between these ciphers and what we have seen above is that the Lehmer codes obtained in the ciphers count the times that letters are in the normal order rather than in inverted order, which I feel is easier to do in the head. Statistical properties are not affected by this change.

Here’s how to encrypt a message with DicePad:

  1. Take a sufficiently long piece of key text (about five times the length of the message to be encrypted), which will never be used again for another message, and strip everything except letters and numerals. Do the same with the message.
  2. If key text-derived encoding is to be used (highly recommended, for it makes the ciphertext non-malleable), proceed to generate the encoding Polybius square this way:
    1. Take the first sentence of the key text and start writing different letters and numerals in the order they appear.
    2. When we come to the end of the sentence, write the rest of the numerals in normal order (0 is first), followed by the rest of the alphabet in reverse order.
    3. Fill the 6 x 6 Polybius Square with the result, row by row. Row and column headings are the numerals 0 to 5.

Example: for key text “It was the best of times, it was the worst of times.” the sequence obtained is: ITWASHEBOFMR0123456789ZYXVUQPNLKJGDC, leading to this encoding square:

     0 1 2 3 4 5
    ------------
 0 | I T W A S H
 1 | E B O F M R
 2 | 0 1 2 3 4 5
 3 | 6 7 8 9 Z Y
 4 | X V U Q P N
 5 | L K J G D C

If the square is not to be based on the key text, fill the square by rows with the digits 0 to 9, followed by the straight alphabet.

  3. Encode the message using the square, which will result in two digits per character. If the plaintext message is “Eve”, this results in encoded message: 10 41 10.
  4. Break up the key text into blocks comprising three rows, as seen above. Let’s say we use rows of 8 characters each, which yields the result shown above.
  5. Optionally, shuffle each row with a constant period, also as shown above. The row length (8 in our case) must be a common multiple of the shuffle periods. We won’t shuffle for this example, but it is recommended because it improves the statistical randomness of the keystream. Usually one of the rows will be left unshuffled, and the rest shuffled by short, different periods that divide the row length exactly (say, 2 and 4 in our example).
  6. Now take the letters (and digits) in vertical groups of three and compute a result for each group this way, starting from an initial value of zero:
    1. If the second or third letter follows the first in alphabetical order (digits come before letters), add 2 for each.
    2. If the third letter follows the second in alphabetical order, add 1.
    3. If at least two letters or digits in the group are the same, skip that group.
  7. The result of the previous step is the keystream, which is in base 6. We repeat the steps above until the number of keystream digits produced is at least the length of the encoded message. The last step is to subtract the encoded message from the keystream, neglecting any carries so it can be done from left to right, and stopping when the last digit of the encoded message is used. Bear in mind that this is a base 6 operation, so we must add 6 every time the result would have been negative. Examples: 4 – 2 = 2, 5 – 1 = 4, 1 – 5 = 2, 2 – 4 = 4. The result is the base 6 ciphertext.
  8. Optionally, decode the base 6 ciphertext back to letters and digits in groups of two, using the encoding square.

To decrypt the ciphertext the recipient, who knows the key used for encryption, performs steps 1 through 7 exactly like the sender, except for encoding the message. If the ciphertext came as letters and numbers rather than in base 6, it must first be converted back to base 6 digits by means of the Polybius square. He/she then subtracts the base 6 ciphertext from the keystream, resulting in the encoded message, which is finally decoded back to letters and digits.
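
The keystream step and the base 6 arithmetic are easy to check by machine. Here is a minimal JavaScript sketch, assuming the key text has already been stripped and split into rows and the message has already been encoded with the Polybius square; the names are mine and the snippet only mirrors the rules above (normal-order counting, skipped repeats, no borrows):

```javascript
// DicePad keystream from one block of three equal-length key text rows.
// Per the rules above: +2 if the 2nd symbol follows the 1st, +2 if the 3rd
// follows the 1st, +1 if the 3rd follows the 2nd; digits sort before letters;
// groups containing a repeated symbol are skipped.
function dicePadBlock(rows) {
  const out = [];
  for (let c = 0; c < rows[0].length; c++) {
    const g = rows.map(r => r[c]);
    if (new Set(g).size < 3) continue;          // repeated symbol: skip the group
    let v = 0;
    if (g[1] > g[0]) v += 2;
    if (g[2] > g[0]) v += 2;
    if (g[2] > g[1]) v += 1;
    out.push(v);                                // one base 6 keystream digit
  }
  return out;
}

// Base 6 subtraction, digit by digit, no borrows (keystream minus encoded message).
const sub6 = (keystream, encoded) => encoded.map((d, i) => (keystream[i] - d + 6) % 6);

console.log(dicePadBlock(["ITWASTHE", "BESTOFTI", "MESITWAS"])); // [ 3, 4, 3, 3, 2, 5 ]
console.log(sub6([4, 5, 1, 2], [2, 1, 5, 4]));                   // [ 2, 4, 2, 4 ]
```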

And here’s how to encrypt a message with LetterPad:

  1. Take a sufficiently long piece of key text (about five times the length of the message to be encrypted), which will never be used again for another message, and strip everything except letters and numerals. Do the same with the message. Additionally, the message must use a 24-letter alphabet, so convert all J’s to I’s and all Q’s to K’s, for instance. Any numerals in the message are to be encoded as letters; for instance, 0 = A, 1 = B, etc.
  2. If a key text-derived alphabet is to be used (highly recommended, for it makes the ciphertext non-malleable), proceed to generate it this way:
    1. Take the first sentence of the key text and start writing different letters in the order they appear, using the substitutions above so the alphabet contains only 24 letters.
    2. When we come to the end of the sentence, write the rest of the letters in reverse order.
    3. Place that alphabet at the top, left, and right of an otherwise normal Tabula Recta.

Example: for key text “It was the best of times, it was the worst of times.” the alphabet obtained is: ITWASHEBOFMRZYXVUPNLKGDC, leading to this Tabula Recta, which also contains digits on the left in order to convert between decimal and base 24:

        I T W A S H E B O F M R Z Y X V U P N L K G D C
        -----------------------------------------------
00: I | A B C D E F G H I K L M N O P R S T U V W X Y Z | I
01: T | B C D E F G H I K L M N O P R S T U V W X Y Z A | T
02: W | C D E F G H I K L M N O P R S T U V W X Y Z A B | W
03: A | D E F G H I K L M N O P R S T U V W X Y Z A B C | A
04: S | E F G H I K L M N O P R S T U V W X Y Z A B C D | S
05: H | F G H I K L M N O P R S T U V W X Y Z A B C D E | H
06: E | G H I K L M N O P R S T U V W X Y Z A B C D E F | E
07: B | H I K L M N O P R S T U V W X Y Z A B C D E F G | B
08: O | I K L M N O P R S T U V W X Y Z A B C D E F G H | O
09: F | K L M N O P R S T U V W X Y Z A B C D E F G H I | F
10: M | L M N O P R S T U V W X Y Z A B C D E F G H I K | M
11: R | M N O P R S T U V W X Y Z A B C D E F G H I K L | R
12: Z | N O P R S T U V W X Y Z A B C D E F G H I K L M | Z
13: Y | O P R S T U V W X Y Z A B C D E F G H I K L M N | Y
14: X | P R S T U V W X Y Z A B C D E F G H I K L M N O | X
15: V | R S T U V W X Y Z A B C D E F G H I K L M N O P | V
16: U | S T U V W X Y Z A B C D E F G H I K L M N O P R | U
17: P | T U V W X Y Z A B C D E F G H I K L M N O P R S | P
18: N | U V W X Y Z A B C D E F G H I K L M N O P R S T | N
19: L | V W X Y Z A B C D E F G H I K L M N O P R S T U | L
20: K | W X Y Z A B C D E F G H I K L M N O P R S T U V | K
21: G | X Y Z A B C D E F G H I K L M N O P R S T U V W | G
22: D | Y Z A B C D E F G H I K L M N O P R S T U V W X | D
23: C | Z A B C D E F G H I K L M N O P R S T U V W X Y | C

If the Tabula Recta is not to be based on the key text, place the straight 24-character alphabet at top, left and right.

  3. Break up the key text into blocks comprising four rows, as seen above. Let’s say we use rows of 8 characters each, which yields the result shown above. There is no need to delete the digits or reduce the alphabet to 24 letters here.
  4. Optionally, shuffle each row with a constant period, also as shown above. The row length (8 in our case) must be a common multiple of the shuffle periods. We won’t shuffle for this example, but it is recommended because it improves the statistical randomness of the keystream. Usually one of the rows will be left unshuffled, and the rest shuffled by short, different periods that divide the row length exactly (say, 2, 2 and 4 in our example).
  5. Now take the letters (and digits) in vertical groups of four and compute a result for each group this way, starting from an initial value of zero:
    1. If the second, third, or fourth letter follows the first in alphabetical order (digits come before letters), add 6 for each.
    2. If the third or fourth letter follows the second in alphabetical order (digits come before letters), add 2 for each.
    3. If the fourth letter follows the third in alphabetical order, add 1.
    4. If at least two letters or digits in the group are the same, skip that group.
    5. Convert each result to base 24 letters using the pattern on the left side of the Tabula Recta. For instance: 2 = W, 13 = Y, etc.
  6. The result of the previous step is the keystream, which is in base 24. We repeat the steps above until the number of keystream letters produced is at least the length of the message. The last step is to subtract the message from the keystream, letter by letter from left to right using the Tabula Recta, stopping when the last message letter is used. Subtraction with the Tabula Recta is done this way: find the second letter (the one being subtracted) on the top and go down that column until the first letter is found, then follow that row to the left or right to find the result. Examples with the given Tabula Recta: T – R = E, O – I = Y, O – P = K, A – R = Y. The result is the ciphertext.

To decrypt the ciphertext the recipient, who knows the key used for encryption, performs steps 1 through 5 exactly like the sender, except for encoding the message. He/she then subtracts the ciphertext from the keystream using the Tabula Recta, resulting in the original message (after reduction to the 24-letter alphabet, with any numerals encoded as letters).
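
The table lookups reduce to simple index arithmetic. Here is a minimal JavaScript sketch of the subtraction and of the number-to-letter conversion, using the key-derived alphabet of the example (the names are mine):

```javascript
// LetterPad subtraction on the key-derived Tabula Recta shown above.
// The body rows are the straight 24-letter alphabet (no J, no Q) shifted left
// one place per row, while the headings carry the key-derived alphabet.
const STD = "ABCDEFGHIKLMNOPRSTUVWXYZ";     // straight 24-letter alphabet
const KEY = "ITWASHEBOFMRZYXVUPNLKGDC";     // derived from "It was the best..."

// first - second: find `second` on the top, go down to `first`, read the row label.
function sub24(first, second) {
  const col = KEY.indexOf(second);
  const row = (STD.indexOf(first) - col + 24) % 24;
  return KEY[row];
}

// Keystream numbers become letters through the left-hand column: n -> KEY[n].
console.log(sub24("T", "R"), sub24("O", "I"), sub24("O", "P"), sub24("A", "R")); // E Y K Y
console.log(KEY[2], KEY[13]);                                                    // W Y
```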

As you can see, LetterPad involves fewer steps than DicePad, but keystream generation is a little more involved, and so is subtraction with a Tabula Recta. On the other hand, LetterPad needs only one keystream symbol per message character instead of two, which may more than make up for the additional complexity. In my tests, I have found that LetterPad seldom needs shuffling of the key text in order to produce a keystream of good randomness, while DicePad sometimes does (we’re talking about the two-character chi-squared statistic for messages containing thousands of characters; no problems appear for messages in the hundreds of characters), probably because each LetterPad group combines a larger number of uncorrelated symbols, so LetterPad has an edge here.

The JavaScript versions of DicePad and LetterPad include an optional step after the keystream has been generated, consisting of a single-row lagged Fibonacci generator (LFG) intended to further improve the randomness of the keystream. To do this, repeat the last digit or letter of the keystream below the first; then combine each vertical pair (adding in base 6 for DicePad, or subtracting the top symbol from the bottom one for LetterPad) and place the result to the right of the bottom symbol, until the bottom row is filled. That bottom row becomes the new keystream. The LFG step helps to improve the statistical quality of the keystream if the key text is so short that it has to be repeated at least partially. Its effect, like that of shuffling, is most noticeable for very long messages.

Let me finish this post with some musings concerning the security of these two ciphers. In order to do this, I will put on my hacker’s hat and pretend that I’m trying to break an encrypted message in various ways. I know the encryption method used but I don’t know the key text. Sometimes I may actually know the plaintext, and then what I’m trying to do is change the ciphertext so it decrypts to something else, or guess the source of the key text so I can break other encrypted messages.

  1. Brute force. This can be done in two different ways:
    1. I try every possible key text until I find one that “works,” meaning that the plaintext resulting from subtracting the ciphertext from the keystream derived from the guess makes sense in English or whatever language I suspect it was written in.
    2. I try every possible plaintext until the keystream resulting from adding the plaintext back to the ciphertext “makes sense” as something that has derived from common text, possibly because of flaws in the keystream-generation process.

Method 1 is not going to work because the key text space is extremely large, much larger than the plaintext space, and has more entropy than a random string of plaintext length. Per Shannon’s one-time pad theorem, I’ll always be able to find a key text that leads to any plaintext of the given length. Method 2 fares a little better because the keystream is generated by blocks. Say I know the block length; then I’ll try likely plaintext blocks of this length and obtain the corresponding piece of keystream by adding the plaintext to the ciphertext. That keystream piece likely will look like gibberish, but if it isn’t perfectly random and I know that the keystream-generating process has some flaws (for instance, some residual correlation between consecutive letters, as manifested by the two-letter chi-squared value), then I might be able to tell a bad guess, where the “keystream” piece obtained passes the two-letter chi-squared test, from a decent one, where the result of the test is what I expect from the known flaw. The problem with this approach is that the blocks most likely won’t be long enough to provide meaningful statistics. Even in base 6, the minimum row length I’ll need is of the order of 6 x 6 x 5 = 180, which is unlikely to be used because it is too long. If care was taken to eliminate all correlation in the keystream (say, by appropriate shuffling of sufficiently long key text blocks), then this method will never yield anything.

  2. Known plaintext. Let’s say I know the whole thing; can I get the key text, and so find its source so I can crack any subsequent ciphertexts from the same source? This would first require obtaining the keystream, which would be trivial if I knew the Polybius square (DicePad) or the headings of the Tabula Recta (LetterPad), which is the situation if the straight alphabet is used instead of one derived from the key text. If the alphabet derives from the key text this would be quite difficult, since I have no way to know which encoding is the correct one until the key text is found. But let’s say I make a bunch of guesses and try them all. For DicePad, I’d have to make 36! = 3.72E41 guesses, for LetterPad only 24! = 6.2E23. In any case, these are very large numbers. Even if a computer manages to check a billion every second, checking all the LetterPad alphabets would take 620 trillion seconds (some 20 million years), and the DicePad count remains many times the age of the Universe even if the work gets distributed over millions of CPUs. And then, after I get the keystream, I’d have to invert it in order to find the key text. But the keystream generation process loses a lot of entropy because of the skipped groups and the reduction from the full alphabet to essentially three or four symbols subject to a permutation, meaning that I’d have to make further guesses in order to reconstruct those permutations. Very quickly I’ll find myself trying every possible key text, which is equivalent to brute force and, as we saw above, doomed to fail.
  3. Change plaintext. But what if I don’t want to invert the keystream but only change the plaintext? This would be trivial if the straight alphabet had been used, but otherwise I won’t know how to subtract the plaintext, as we saw above. Even worse, I must be right on the first try, so guessing is not an option. Might as well forget it.
  4. Reused key text. This means that I manage to get the sender to use the exact same key text for two different messages, which is something that should never be done. The keystream, therefore, will be identical, and I can subtract it out by subtracting the two ciphertexts, leaving the difference of the two encoded plaintexts. If the default square or straight alphabet were used, I only have to make educated guesses as to what words might be in one of the messages. Then I place the word at different shifts from the start and subtract it from the combined plaintexts. If the word is there and I have guessed the right location for it, the result will also be a word, or part of one. Then I can complete that word and subtract it, giving me another word, and so forth until I have recovered both plaintexts. If, on the other hand, a key-derived square or alphabet was in use, I would have a lot of trouble separating the two plaintexts, for I would not know how to encode my guesses (DicePad) or would not be able to make the subtraction correctly to begin with (LetterPad).
  5. Errors. I change a few letters here and there in the ciphertext so the recipient cannot decrypt it (actually, not much of an attack, but it may happen spontaneously; the sender or the recipient may also make mistakes that amount to the same). If this were a typical computer-based cipher, any change would typically lead to a complete decryption failure, but here a single-letter error in the plaintext or ciphertext does not propagate to the rest, which remains unaffected. A switched letter in the key text will lead to a single-letter error if there is no change in whether the group using that letter gets used or rejected by the keystream-generating algorithm, but will garble the rest of the block if there is. An addition or omission may garble the rest of the message after that point, but the recipient will be able to detect it and possibly correct the error so the rest of the message can be decrypted.
  6. Physical break-in. I enter the sender’s and the recipient’s houses and record the names of all the books in their possession. Since both must have the same key text, any repeated books will give me an excellent indication of what book they are using to draw the key text from. Unless, of course, they are not using any of those but rather an equally available Internet source, a newspaper, or some other source I cannot detect so easily. I might have no choice but to actually tie one of them to a chair and apply some physical persuasion until the title is revealed, at which point I will have killed all future usefulness of my efforts.

Three new ciphers from the early XIII century

Back in 1202, the Italian mathematician Leonardo Bonacci, also known as Fibonacci, included in his book “Liber Abaci” (Book of Calculation) a sidebar illustrating how quickly rabbits breed. It seems that his primary goal, in addition to raising some awareness about the population explosion experienced by those animals back then, was to show how Indo-Arabic numbers (which for the first time included the zero) could be used in a calculation of practical importance.

The rest is history. The zero caught on and the sidebar calculation, which became known as the “Fibonacci sequence,” occupied mathematicians for the next eight centuries. We don’t know, however, what exactly happened to the rabbits; but their population growth must have been checked somehow, otherwise now we’d be swimming in a sea of rabbits hundreds of meters deep.

One of the things that the Fibonacci sequence is good for is to generate a series of apparently random digits, if we only keep the last digit of every operation. This can be used for encryption, although it has to be done right. Well, after a couple of false starts, which you can read about in this article, I think I finally cracked it, and the result is three new ciphers: “Numeracci”, “Letteracci”, and “Subtracci.”

Numeracci and Letteracci are very similar to BookPad and TripleText, respectively. The difference is that, while the latter two use a large piece of text, possibly taken from a book, as encrypting key (and then the key text must be three times the length of the plaintext), the first two can use a much shorter key and still give good security, thus reducing the amount of calculation required. Subtracci is a variation on Letteracci that can be done much quicker by hand, and is therefore the counterpart of Snake.

Now, using a key whose entropy is smaller than the entropy of a random string of plaintext length (that’s why the factor of three) means that the Fibonacci-derived ciphers do not come as close to being theoretically unbreakable by brute force as BookPad and TripleText try to be. A powerful adversary could try every possible text of the recommended length (plaintext length plus one) and thus decipher the message and know that it is the correct answer, but since key entropy grows linearly with message length he/she/it would still have to spend an impracticable amount of computer time for anything beyond very short messages. In any case, an optional compression step, discussed at the end of this article, gets back much of the same effect for little extra effort.

I have made JavaScript versions of Numeracci, Letteracci, and Subtracci, which should run on just about any browser, but they remain paper-and-pencil ciphers. I will now present the mechanics of Numeracci, which outputs decimal digits, using as example the encryption of the plaintext “Hello World” with the key text “It was the best of times”:

  1. Encoding set-up. Since we need to convert the plaintext into decimal digits, we need an encoding table. Numeracci uses the same encoding method as BookPad, which involves a straddling checkerboard for all letters plus the space and a generic punctuation mark (total of 28), with the letters in “Arsenio” and the space being replaced by a single digit, and the rest by two digits. The checkerboard could be the default (unscrambled “arsenio+” at the top, followed by the rest of the letters in alphabetical order), or one that derives from the key text itself (usually just the first sentence: “arsenio+” letters and the rest as they appear in the sentence, followed by the period and those that do not appear in reverse order).
  2. The key-derived checkerboard is recommended, as we discuss below, but we’ll use the default checkerboard for the example, resulting in these encodings:

plaintext = 96499997806729993 (17 digits)

key text = 603806138039648914303879480369043 (33 digits)

  3. Keystream generation. Here’s where the Fibonacci series is used. We first take a piece of the encoded key text that is one digit longer than the encoded plaintext; in our case, the first 18 digits: 603806138039648914. Now we take the last digit and place it below the first digit, add them neglecting any carries (6 + 4 = 0), and write the result on the right of the last-written digit. Then we keep adding the digits on the first and second rows and writing them on the right of the last one until the row is complete. This is the result (the first row, plus the first digit of the second row, is the seed):
 60380613803964891
 40031178199217198

We could take the last row as keystream, but tests show that its randomness is poor, so we do this for one more row (1st digit is the sum of the last two in 1st  and 2nd rows), resulting in this:

 60380613803964891
 40031178199217198
 93336785343245232

And we take the last row as keystream = 93336785343245232

  4. We obtain the ciphertext by subtracting the encoded plaintext from the keystream, again without carries, which can be done from left to right. This is the result:
93336785343245232 – 96499997806729993 = 07947898547526349

So we take that and send it out: 07947 89854 75263 49. The recipient performs steps 1 to 3 (except plaintext encoding, obviously) using the key text, which he/she knows, resulting in the same keystream. Then he/she subtracts the ciphertext from the keystream, resulting in the encoded plaintext:

93336785343245232 – 07947898547526349 = 96499997806729993

And finally he/she decodes this back to the plaintext, using the checkerboard (the spaces are encoded):  96499997806729993 = “hello world”
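
The whole numeric part of the example can be checked with a few lines of JavaScript (a sketch of my own, assuming the checkerboard encoding and decoding are handled separately):

```javascript
// Numeracci keystream: two lagged Fibonacci rows computed from the seed,
// which is one digit longer than the encoded plaintext.
function numeracciKeystream(seed) {
  const n = seed.length - 1;
  const row1 = seed.slice(0, n);
  const row2 = [seed[n]];                            // last seed digit goes below the first
  for (let i = 1; i < n; i++) row2.push((row1[i - 1] + row2[i - 1]) % 10);
  const row3 = [(row1[n - 1] + row2[n - 1]) % 10];   // sum of the last digits of rows 1 and 2
  for (let i = 1; i < n; i++) row3.push((row2[i - 1] + row3[i - 1]) % 10);
  return row3;
}

// Digit-wise subtraction without carries; used both to encrypt and to decrypt.
const subDigits = (a, b) => a.map((d, i) => (d - b[i] + 10) % 10);
const toDigits = s => [...s].map(Number);

const seed      = toDigits("603806138039648914");    // first 18 encoded key text digits
const plaintext = toDigits("96499997806729993");     // "hello world", default checkerboard
const keystream = numeracciKeystream(seed);

console.log(keystream.join(""));                          // 93336785343245232
console.log(subDigits(keystream, plaintext).join(""));    // 07947898547526349 (ciphertext)
console.log(subDigits(keystream, subDigits(keystream, plaintext)).join(""));  // 96499997806729993
```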


Now that we’ve got some momentum, let’s take advantage of it and describe Letteracci. This time the sums and subtractions will be done directly on letters, so the user’s knowledge of decimal operations won’t be of much help. Therefore, we need to set up a special table for sums and subtractions, which fortunately was invented a long time ago. Its name is Tabula Recta (“straight table,” in Latin). If a straight alphabet (no substitutions) is in use, this table is:

    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
    ---------------------------------------------------
A | A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
B | B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
C | C D E F G H I J K L M N O P Q R S T U V W X Y Z A B
D | D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
E | E F G H I J K L M N O P Q R S T U V W X Y Z A B C D
F | F G H I J K L M N O P Q R S T U V W X Y Z A B C D E
G | G H I J K L M N O P Q R S T U V W X Y Z A B C D E F
H | H I J K L M N O P Q R S T U V W X Y Z A B C D E F G
I | I J K L M N O P Q R S T U V W X Y Z A B C D E F G H
J | J K L M N O P Q R S T U V W X Y Z A B C D E F G H I
K | K L M N O P Q R S T U V W X Y Z A B C D E F G H I J
L | L M N O P Q R S T U V W X Y Z A B C D E F G H I J K
M | M N O P Q R S T U V W X Y Z A B C D E F G H I J K L
N | N O P Q R S T U V W X Y Z A B C D E F G H I J K L M
O | O P Q R S T U V W X Y Z A B C D E F G H I J K L M N
P | P Q R S T U V W X Y Z A B C D E F G H I J K L M N O
Q | Q R S T U V W X Y Z A B C D E F G H I J K L M N O P
R | R S T U V W X Y Z A B C D E F G H I J K L M N O P Q
S | S T U V W X Y Z A B C D E F G H I J K L M N O P Q R
T | T U V W X Y Z A B C D E F G H I J K L M N O P Q R S
U | U V W X Y Z A B C D E F G H I J K L M N O P Q R S T
V | V W X Y Z A B C D E F G H I J K L M N O P Q R S T U
W | W X Y Z A B C D E F G H I J K L M N O P Q R S T U V
X | X Y Z A B C D E F G H I J K L M N O P Q R S T U V W
Y | Y Z A B C D E F G H I J K L M N O P Q R S T U V W X
Z | Z A B C D E F G H I J K L M N O P Q R S T U V W X Y
    ---------------------------------------------------
    a z y x w v u t s r q p o n m l k j i h g f e d c b

To add two letters, find the first letter on the left bar and the second on the top, then follow the respective row and column to find the result at the intersection. Example: R + H = Y. To subtract, find the first letter on the left bar and the second (the one being subtracted) on the bottom, then follow the respective row and column to find the result at the intersection. Example: R – h = K.

Often we will use a key-derived Tabula Recta rather than the straight one. We start by making a scrambled alphabet this way: take every new letter found as the first sentence of the key text is read; then add the rest of the letters (those not found in that sentence) in reverse alphabetical order. If the key text is “It was the best of times”, the scrambled alphabet will be:

ITWASHEBOFMZYXVURQPNLKJGDC

And then we make the Tabula Recta by shifting that to the left by one letter per row. The bottom line, used for subtractions, is made this way: first letter of the scrambled alphabet, followed by the rest of the letters in reverse order. This is the resulting Tabula Recta:

    I T W A S H E B O F M Z Y X V U R Q P N L K J G D C
    ---------------------------------------------------
A | I T W A S H E B O F M Z Y X V U R Q P N L K J G D C
B | T W A S H E B O F M Z Y X V U R Q P N L K J G D C I
C | W A S H E B O F M Z Y X V U R Q P N L K J G D C I T
D | A S H E B O F M Z Y X V U R Q P N L K J G D C I T W
E | S H E B O F M Z Y X V U R Q P N L K J G D C I T W A
F | H E B O F M Z Y X V U R Q P N L K J G D C I T W A S
G | E B O F M Z Y X V U R Q P N L K J G D C I T W A S H
H | B O F M Z Y X V U R Q P N L K J G D C I T W A S H E
I | O F M Z Y X V U R Q P N L K J G D C I T W A S H E B
J | F M Z Y X V U R Q P N L K J G D C I T W A S H E B O
K | M Z Y X V U R Q P N L K J G D C I T W A S H E B O F
L | Z Y X V U R Q P N L K J G D C I T W A S H E B O F M
M | Y X V U R Q P N L K J G D C I T W A S H E B O F M Z
N | X V U R Q P N L K J G D C I T W A S H E B O F M Z Y
O | V U R Q P N L K J G D C I T W A S H E B O F M Z Y X
P | U R Q P N L K J G D C I T W A S H E B O F M Z Y X V
Q | R Q P N L K J G D C I T W A S H E B O F M Z Y X V U
R | Q P N L K J G D C I T W A S H E B O F M Z Y X V U R
S | P N L K J G D C I T W A S H E B O F M Z Y X V U R Q
T | N L K J G D C I T W A S H E B O F M Z Y X V U R Q P
U | L K J G D C I T W A S H E B O F M Z Y X V U R Q P N
V | K J G D C I T W A S H E B O F M Z Y X V U R Q P N L
W | J G D C I T W A S H E B O F M Z Y X V U R Q P N L K
X | G D C I T W A S H E B O F M Z Y X V U R Q P N L K J
Y | D C I T W A S H E B O F M Z Y X V U R Q P N L K J G
Z | C I T W A S H E B O F M Z Y X V U R Q P N L K J G D
    ---------------------------------------------------
    i c d g j k l n p q r u v x y z m f o b e h s a w t

If you find it tedious to fill the whole table and have a straight table already filled, you can leave the body as in the straight table, and add one step to each sum or subtraction, consisting in locating on the second row the letter found at the intersection, then go one step up to find its equivalent in the scrambled alphabet. Subtracci, described below, simplifies this step considerably by using a Tabula Recta where only the headings change.

Now that we have seen what the Tabula Recta is and how it is used, let’s see the process, illustrated with the same example as before (plaintext: “Hello World”).

  1. Tabula Recta set-up. We have just covered how to do that, so we will assume we are using the key-derived table.
  2. Keystream generation. First take a piece of the key text containing as many letters as the plaintext (no spaces or punctuation), plus one, which we will call the seed. Since the plaintext would be “HELLOWORLD” (10 letters), we take 11 letters from the key text: “ITWASTHEBES”. Now pass the last letter of the seed to a location right below the first, and add them using the Tabula Recta: I + S = Y. Write this directly to the right of the last-written letter and keep adding the letters on the first and second rows until the row is filled. This is the result (the first row, plus the first letter of the second row, is the seed):
 ITWASTHEBE
 SYHTTNYNGD

We could take the last row as keystream, but tests show that its randomness is not perfect, so we repeat the process for one more row (first letter is the sum of the last two in rows one and two), reaching this:

 ITWASTHEBE
 SYHTTNYNGD
 WLPCPZDJFU

And take the last row as keystream:  WLPCPZDJFU

  3. Finally we subtract the plaintext from the keystream letter by letter, using the Tabula Recta (the letters to be subtracted are entered at the bottom line), resulting in this:
WLPCPZDJFU – HELLOWORLD = QHKOBGKNZJ

So we take that and send it out: QHKOB GKNZJ. The recipient, who knows the key text, will perform steps 1 and 2 exactly as we did, resulting in the same Tabula Recta and the same keystream, but then will subtract the ciphertext from the keystream, resulting in the original plaintext (minus spaces and punctuation) like this:

WLPCPZDJFU – QHKOBGKNZJ = HELLOWORLD
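
Because the body of the key-derived Tabula Recta is just the scrambled alphabet shifted one place per row, the whole example reduces to index arithmetic. Here is a JavaScript sketch of my own that reproduces the keystream, the ciphertext and the decryption above:

```javascript
// Letteracci with the key-derived Tabula Recta of the example. Adding looks
// up the first letter on the plain A-Z side and the second on the scrambled
// heading; the result is read off the scrambled alphabet.
const PLAIN = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const SCR   = "ITWASHEBOFMZYXVURQPNLKJGDC";     // from "It was the best of times"

const add = (a, b) => SCR[(PLAIN.indexOf(a) + SCR.indexOf(b)) % 26];
const sub = (a, b) => SCR[(PLAIN.indexOf(a) - SCR.indexOf(b) + 26) % 26];

// Lagged Fibonacci keystream: the seed is one letter longer than the plaintext.
function letteracciKeystream(seed) {
  const n = seed.length - 1;
  const row1 = seed.slice(0, n);
  let row2 = seed[n];                            // last seed letter goes below the first
  for (let i = 1; i < n; i++) row2 += add(row1[i - 1], row2[i - 1]);
  let row3 = add(row1[n - 1], row2[n - 1]);      // sum of the last letters of rows 1 and 2
  for (let i = 1; i < n; i++) row3 += add(row2[i - 1], row3[i - 1]);
  return row3;
}

const keystream = letteracciKeystream("ITWASTHEBES");
const cipher = [..."HELLOWORLD"].map((p, i) => sub(keystream[i], p)).join("");
console.log(keystream, cipher);                                          // WLPCPZDJFU QHKOBGKNZJ
console.log([...cipher].map((c, i) => sub(keystream[i], c)).join(""));   // HELLOWORLD
```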

Subtracci is a variant of Letteracci that makes human calculation quite a bit faster than what you see above. The idea is to use the Tabula Recta always to subtract, which is actually easier since it involves starting at the top and going down until a certain letter is found, and then to the left or right following that row, rather than working out an intersection by starting at the top and at one side at the same time. It turns out that you can also do an LFG by subtracting rather than adding, and its statistical properties are just as good. On top of that, the body of the Tabula Recta is always the same even if a key-derived alphabet is used, thus eliminating the tedium of having to make a new one for each message. You can find a JavaScript version of Subtracci at this link, which also explains the process step by step.

Which cipher is better, Numeracci, Letteracci, or Subtracci? This is likely a toss-up. Numeracci has one additional step because every piece of text must be encoded into numbers, and then decoded upon decryption. This can get tedious, but on the other hand the math can be done very quickly with decimal digits instead of the rather special Tabula Recta. But again, Subtracci makes using the Tabula Recta almost as fast as adding figures in your head.

The three ciphers are quite comparable in terms of security, and there are a few considerations to bear in mind:

  1. Default encoding or Tabula Recta vs. key-derived. Deriving this from the key takes extra effort, especially if you are writing a whole new Tabula Recta, though there are ways to save some work as we saw earlier. On the other hand, a key-derived encoding or Tabula Recta makes it very difficult to obtain the keystream from the ciphertext if the plaintext is known (known plaintext attack). In Letteracci/Subtracci, it also makes it next to impossible to reverse the keystream back to the seed (we saw in this article that this is quite easy to do) because there is an unknown substitution (since it depends on the key) involved in every sum. Cryptographer Manuel Blum proposes a very similar trick to generate a unique password for any given web site, involving a memorized secret permutation, equivalent to our scrambled alphabet, plus a mental calculation that looks very similar to our lagged Fibonacci generator.
  2. Malleability. As stream ciphers, all three ciphers presented here are malleable in principle, meaning that a powerful adversary could change the encrypted plaintext (if it is known) without knowing the key text. But this becomes impossible if key-derived encoding or Tabula Recta are in use, especially since the adversary has to get it right on the first try. Conclusion: use key-derived encoding or Tabula Recta if at all possible; the increase in security is well worth the little extra effort involved.
  3. Protection against brute force. None of the ciphers comes close to a one-time pad in this category (or to BookPad or TripleText, for that matter) because the key is only slightly longer than the plaintext, and therefore its entropy falls way short of that of a random string of plaintext length, which is what guarantees perfect protection. But it is still a lot of entropy. If we consider that common English has around 13 bits of entropy per word, then the key text used in our example, “It was the bes”, comes from 4 words and so carries some 52 bits of entropy. A perfectly random string of letters (case-insensitive, no spaces) has 4.7 bits of entropy per letter, so that would be 47 bits for the plaintext in the example, which means that our example ciphertext is perfectly undecipherable (without the key) in practice. For a longer plaintext, the random-string entropy quickly outstrips the entropy contained in the key words, but there is still a lot of it in those words. At 1.58 bits/letter, the 51-letter key that goes with a 50-letter plaintext supplies 51 x 1.58 = 80.6 bits of entropy. The computer-standard 128 bits would be reached when the plaintext is 128/1.58 – 1 = 80 letters long, and the key entropy only keeps growing as the plaintext gets longer. Still, an additional compression step, described below, can increase the entropy of the key to nearly one-time pad standard.
  4. Short keys. The Fibonacci-based ciphers are much more resistant to the bad effect of short keys than those based strictly on the entropy of the key text. Tests show that, if the key is longer than one-tenth the length of the plaintext, the keystream generated in Numeracci normally will not repeat; the factor is one-twentieth for Letteracci or Subtracci. Anything shorter, though, and the statistical randomness of the keystream goes down the tubes, to say nothing of the fact that it becomes periodic. But BookPad and TripleText are not nearly as tolerant, and things begin to break down if the key text is not at least twice as long as the plaintext. You can see this for yourself because the JavaScript code for those has a little input for changing the key text to plaintext length ratio, plus you can also input short keys, which are automatically repeated to reach the required length.
  5. Insufficient computation. One might be tempted to take the second row of the lagged Fibonacci calculation as keystream, thus saving half of the math effort invested in this step. Unfortunately, the result retains some of the correlation between letters (or numbers, after encoding) of the original key text, which shows as a poor score in the two-digit chi-squared test. The effect goes away on the third row and thereafter, so that computing more than three rows doesn’t really add any security. The third row also has the nice property that all the digits or letters of the seed affect all the digits or letters of that row, and vice versa: changing just one digit or letter in the row would mean a complete change of all the digits or letters in the seed for anyone trying to reverse the computation. You can see this for yourself by running the JavaScript codes, which have an input to change the number of rows of the lagged Fibonacci calculation.

Ok, so what is this compression thing that I have mentioned twice already, which can help a lot against brute force? Simply this: rather than a piece of key text that is only one digit or character longer than the plaintext, take a piece that is three times as long as that, then compress it to plaintext length so it can be used as seed for the LFG. Since a piece of English text contains 1.58 bits of entropy per character, the result, if compressed without entropy loss, will have the 4.7 bits/character of true random text. The trick is to compress without losing entropy, but here Shannon’s original paper on the entropy of language comes to the rescue. Early in the paper, Shannon describes a method to represent text by using fewer characters. In essence, the first few characters of each word are written until a subject well acquainted with the language is able to guess the rest. This method is what ultimately yields the experimentally derived 1.58 bits/character of entropy for common text, which comes from being able to compress text by a factor of three in actual tests. Since the average length of a word in English text is 4.5 characters, compressing by a factor of three means reducing this to 1.5 characters/word. That is, some words will compress to a single character, some to two. A way to achieve this without involving an actual subject is to take one character for words shorter than the average (1 to 4 letters), and two for the rest. In the programs posted here, and here, the first letter of words less than five letters long is taken when compression is enabled, and for the rest we take the first and last letters, since other studies show that those are the most essential for comprehension. Spaces and punctuation usually don’t add comprehension and are ignored.

For instance, the key text “It was the best of times, it was the worst of times”  becomes: I W T B O TS I W T WT O TS (spaces added to make it easier to read). Clearly, this is not the same compression that Shannon used since it is not truly reversible and therefore loses some entropy, so I call it “pseudo-Shannon” compression. I have performed tests that show that the key text is indeed shortened by a factor slightly above three, both in English and in Spanish, and although the resulting seed fails randomness tests and therefore is not suitable to be used directly as keystream, the result after the LFG process passes them quite easily. This way, those seeking to crack the encryption by brute-forcing the key will have to look through a keyspace with the same entropy as that of a random keystream of plaintext length, which ensures unbreakability by Shannon’s criterion.
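
Here is a minimal JavaScript sketch of this “pseudo-Shannon” compression as just described (it ignores numerals along with spaces and punctuation, and the function name is mine):

```javascript
// "Pseudo-Shannon" compression: one letter for words of up to four letters,
// first and last letters for longer words; spaces and punctuation are dropped.
function pseudoShannon(text) {
  return text.toUpperCase().match(/[A-Z]+/g)
    .map(w => (w.length < 5 ? w[0] : w[0] + w[w.length - 1]))
    .join("");
}

console.log(pseudoShannon("It was the best of times, it was the worst of times"));
// IWTBOTSIWTWTOTS  (the example above, without the added spaces)
```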

To finish up, bear in mind that neither Numeracci, Letteracci, nor Subtracci (nor BookPad or TripleText, for that matter) contains anything that had not been invented by the beginning of the XVI century, perhaps as early as Fibonacci’s time, three centuries earlier. That was the time when the so-called Vigenère cipher (though Blaise de Vigenère didn’t actually invent it) and its cousin the running-key cipher appeared. It was also the time of the Black Chambers in England and other countries, specializing in the cryptanalysis of those ciphers, with important historical consequences. Had anything like what I’ve described in this and other articles been available at the time—and there’s no reason why it couldn’t have been, as far as the math goes—the history of the world might have been quite different indeed for the last 500 years.

Chew on it as you try your hand with these relatively simple paper and pencil ciphers.

Extracting randomness from text

My BookPad cipher seems to be closely related to the running key cipher, since both take a long piece of text from a book and use it as key to encrypt a plaintext. Yet while the running key cipher can be broken easily, BookPad offers a level of security comparable to that of a one-time-pad. In this article, I try to explain why in layman’s terms. As a bonus, I introduce TripleText, a variant of BookPad where all the operations are done directly with letters.

A little spoiler: not much prevented a cryptographer of the early XVI century from discovering this. Had he discovered it, history might have turned out quite different.

This JavaScript version of BookPad displays the result of several randomness tests performed on the keystream right before it is used to encrypt a message, with and without using a lagged Fibonacci generator (LFG) step, or a key text shift. Big surprise: when a piece of real text is placed in the key text box, the keystream usually passes all the randomness tests even if the process of randomizing the digits by means of a LFG is skipped. This means that a user may be able to skip half the work and still get something approaching perfect security.

First, let’s try to define what “passing the tests” means. To do this, I headed to random.org, which provides strings of numbers from true random sources such as atmospheric noise, and got 1000 integers in the 0 to 999,999,999 range. I added the missing zeroes on the right of some numbers by hand in order to obtain a string of 9000 supposedly perfectly random decimal digits, which I won’t print here. Then I pasted it into the key text box of BookPad, opened up a JavaScript console, and applied several tests to this box. The tests were:

  1. Chi-squared statistic of single digits, compared to a uniform distribution (probability for each value = 1/10). Since a decimal digit string has 9 degrees of freedom (10 -1), theory predicts that a value of less than 14.7 means less than 10% probability that the stream is not random (see this table, for instance). The sample gave 11.65
  2. Chi-squared statistic of pairs of digits, again compared to a uniform distribution (probability for each pair = 1/100). Here the degrees of freedom are 9*9 = 81, so to have less than 10% probability that the stream is not random, the statistic should be less than 97.7. The sample gave 105.57
  3. Durbin-Watson statistic, which estimates the correlation between pairs of consecutive digits. If there is no correlation the value should be 2.0. Less than that, down to a minimum of 0, means that they are negatively correlated; more than that, up to a maximum of 4, means they are positively correlated. The sample gave 2.01
  4. Shannon entropy, which looks at single digits and determines how much unpredictability each one has, in bits per digit, up to a maximum of  log(10)/log(2) = 3.32193 for perfect randomness. This is related to the chi-squared statistic because it is derived strictly from single-digit frequencies. The sample gave 3.32099 bits/digit
  5. Runs test. Here we look at how often the digits switch from low (0 to 4) to high (5 to 9), or vice-versa. A perfectly random sequence, where each digit is truly independent of the previous digits, would switch every two digits on average, but one that is not random may switch more or less often due to lingering correlations between consecutive digits. The sample gave 2.0008 digits/run

So, the “true random” sample passed all the tests (at 10% confidence, for the chi-squared tests) except the two-digit chi-squared test, but this one was close. It is likely that the sample was not long enough for this test, which is extremely sensitive, to come down to its expected value for infinitely long samples, so we’ll take a value less than 105 to be acceptable for a sample of less than 10000 digits.
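
For readers who want to try this on their own strings, here is a rough JavaScript sketch of three of the statistics described above (single-digit chi-squared, Shannon entropy, and runs); it follows the descriptions in the list, not necessarily the exact code behind the BookPad page:

```javascript
// Rough implementations of three of the tests above for a string of decimal digits.
function digitStats(s) {
  const d = [...s].map(Number), n = d.length;
  const counts = Array(10).fill(0);
  d.forEach(x => counts[x]++);
  // Chi-squared against a uniform distribution (expected n/10 per digit value).
  const chi2 = counts.reduce((sum, c) => sum + (c - n / 10) ** 2 / (n / 10), 0);
  // Shannon entropy in bits/digit, from single-digit frequencies.
  const entropy = counts.reduce((sum, c) =>
    c ? sum - (c / n) * Math.log2(c / n) : sum, 0);
  // Runs: average run length of "low" (0-4) vs "high" (5-9) digits.
  let runs = 1;
  for (let i = 1; i < n; i++) if ((d[i] > 4) !== (d[i - 1] > 4)) runs++;
  return { chi2, entropy, digitsPerRun: n / runs };
}

console.log(digitStats("01234567890123456789"));
// { chi2: 0, entropy: 3.3219..., digitsPerRun: 5 }  (perfectly uniform toy input)
```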

So how does a piece of real text fare? I headed to gutenberg.org and got the text version of “Oliver Twist,” by Charles Dickens. I copied chapter 1 (no headings) into the plain text box of BookPad, resulting in a decimal-encoded text of length 8220 digits. I need at least three times that length in the key text box, so I copied chapter 2 (again no headings) into that box, which the JavaScript version of BookPad took as sufficiently long. Here are the statistics for the keystream generated from the key text with the default “Arsenio” encoding and no shift being used, before the lagged Fibonacci step, and after:

  • Before LFG: chi-squared = 12.476. Two-digit chi-squared = 114.75. DW statistic = 1.979. Shannon’s entropy = 3.3208 bits/digit. Runs = 1.9985 digits/run.
  • After LFG: chi-squared = 8.6155. Two-digit chi-squared = 118.93. DW statistic = 2.009. Shannon’s entropy = 3.3211 bits/digit. Runs = 2.0112 digits/run.

Both streams passed the same tests as the true random sample, missing two-digit chi-squared by a little more, but not a lot (possibly because the text sample is a little shorter). The LFG-processed keystream actually beat the “true random” sample in three tests.

In other words, they may not be truly random, but we can’t tell the difference. Even before the LFG step, which essentially doubles the amount of computation, the keystream is good enough for the purpose of encrypting a message by the one-time pad method.

I’ve performed the same test with a number of different texts (not always taken from Dickens) and in different languages: English, Spanish, German (lots of tildes and umlauts eliminated before encoding into numbers, in the last two). The result is always the same, except that oftentimes the keystream before LFG actually beats the one after the LFG. Therefore, I have revised the official description of the BookPad cipher to remove the LFG. The matching JavaScript code can be found here.

Now, how can this be possible? One of the first things you learn in spy school is to never reuse a one-time pad, because then an adversary can subtract the two ciphertexts, thus eliminating the keystream and obtaining as a result the difference between the two encoded plaintexts, which can then be attacked by statistical methods. In other words, the combination of two encoded plaintexts is expected to be sufficiently non-random to allow statistical attack. But BookPad takes the encoded key text, splits it into three parts of equal length, and simply adds them together (digit by digit, without carries) before the LFG step. So it seems that adding or subtracting two pieces of encoded text leaves statistical signatures in the result, but adding three pieces does not, if we have to believe the test results. How can this be?

Indeed, just encoding the key text does not make it random. The chi-squared statistic of Oliver Twist’s chapter 2, after encoding (29549 digits), is 5629.28 and the two-digit chi-squared is 28898.48. Durbin-Watson looks a little better at 1.9615, Shannon entropy is 3.1845 bits/digit, and runs take 2.10924 digits/run. If we split it into two halves and do a mod 10 addition of those, the chi-squared of the result is a much-lower-but-not-quite-low-enough 63.0326, the two-digit chi-squared drops to 330.27, Durbin-Watson is 2.02, Shannon entropy 3.3188 bits/digit, runs is 2.033 digits/run. If we subtract the parts rather than add them, we get statistics similar to those of the addition. That is, it is still possible to tell the result apart from random, at least through the chi-squared tests. This is the essential flaw of the running key cipher, so similar to BookPad, except that the key text is used straight, without trying to concentrate its entropy in any way. So straight encoded text, or text added or subtracted to another text retain exploitable statistics from the original texts, and yet if we involve three pieces of encoded text instead of two, the result appears to be as statistically random as a true random source. How come?

Here’s the way I would explain it. It is by no means a mathematically acceptable proof, but I think it captures the essence of what is going on:

Recall that we had decided to use a piece of key text that is three times the length of the message (after encoding) because of this calculation: 3.32193 (entropy in bits/digit of a random string of decimal digits) x 1.33 (digits/character in encoding) / 1.58 (average entropy of text in bits/character) = 2.8, so we chose 3 to be sure. The idea was that the piece of key text used would have the same entropy as a decimal random string of length equal to that of the encoded message, so we could use it to make a sort of one-time pad. Since the straddling checkerboard encoding we are using produces about 1.33 digits/character (determined experimentally), we wanted to satisfy this formula: 3.32193 x (1.33 x message length) = entropy needed = entropy supplied = message length x factor x 1.58 (average entropy of text) in order to find the factor indicating how many times longer (before or after encoding) the key text must be, relative to the plaintext.

In other words, a piece of key text three times the length of the plaintext (before or after encoding) is expected to have enough entropy to generate a keystream for that plaintext length that would be statistically indistinguishable from a true random keystream. This is what we observe. Typically, randomness extraction (this is the technical term for what we are doing) requires the use of a one-way function or hash possessing several strict properties, but here we are achieving a comparable result with a simple sum.

The LFG step in BookPad is intended to spread the effect of any single-digit changes in the sum to the complete keystream, which is one of the important properties of a hash, plus smoothing non-uniformities that might be present in the sum, and indeed it does improve the keystream considerably if the key text is too short. If we use Oliver Twist’s chapter 1 as key text (same as the plaintext), chi-squared becomes 23.058 and two-digit chi-squared is 118.38 before the LFG step, but after the LFG step they become 6.6204 and 74.450, respectively. The effect intensifies as the key text shortens, and becomes especially pronounced for keys containing 10 or fewer words (which, of course, should never be used to make a one-time pad replacement), so that the keystream after LFG passes the randomness tests even though it starts repeating after a certain point.

Going back to the result of the sum before the LFG is applied, the crucial question is whether or not the process of cutting the encoded key text into three parts and adding them preserves the entropy of the key text. Let’s say its average entropy is one-third of the random value, that is, 3.32193 / 3 = 1.1073 bits/digit (it is actually a little higher, as we saw earlier). Let’s further assume for the time being that consecutive digits are independent of each other (they are not, since in the encoding used consecutive digits often derive from a single letter), so that the entropy of a group of digits is the sum of the entropies of all the digits. If we take groups of three consecutive digits, each group will have an entropy of (3.32193 / 3) x 3 = 3.32193 bits, which is the same as that of a truly random decimal digit.

So let us now make a code that assigns a single-digit value to each group of three digits. Since there are only 10 different single digits but 1000 different groups of three digits (10 x 10 x 10), this code will convert 100 different groups to the same decimal digit, which is okay because each group of three isn’t found so often anyway. Any function that does this will work: sums, subtractions, more complex calculations, codebooks; what is essential is that each resulting digit is mapped to a set of groups having equal a priori probability and there are no collisions where a group is mapped to two different digits. We can come up with many such mappings, but a simple sum of the digits in the group, mod 10, does the trick just fine. This is what is done in BookPad.

In fact, in BookPad we are not adding three consecutive key text digits, which might not be independent of each other, but digits separated by the length of the encoded plaintext. It is highly unlikely that any correlation would extend that far in a real text, so we can safely assume that the digits to be added are indeed independent. Therefore, each digit after the addition potentially has as much unpredictability as three digits before the addition, potentially resulting in full random entropy.
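
In code form, the whole keystream step amounts to very little. The sketch below (my own names; the checkerboard encoding and the now-optional LFG are left out) splits the encoded key text into three parts of encoded-plaintext length and adds them column by column, without carries; the toy run uses the first six digits of the encoded key text from the Numeracci example and pretends the encoded plaintext is two digits long:

```javascript
// BookPad-style keystream: add three parts of the encoded key text, each of
// encoded-plaintext length, digit by digit and without carries, so that the
// digits being added are separated by the length of the encoded plaintext.
function bookPadKeystream(encodedKey, length) {
  const d = [...encodedKey].map(Number);
  const out = [];
  for (let i = 0; i < length; i++) {
    out.push((d[i] + d[i + length] + d[i + 2 * length]) % 10);
  }
  return out.join("");
}

console.log(bookPadKeystream("603806", 2));   // "94"  (6+3+0 = 9; 0+8+6 = 14 -> 4)
```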

If instead of cutting the text into three pieces we had cut it into two, which we then added together, we’d be talking about groups of two digits, each having entropy of 2 x 1.1073 = 2.2146 bits/group. If we map this to single digits by modular addition, each digit would have 2.21 bits of entropy, which is less than truly random, and it shows in the statistical tests.

But now, and here’s the kicker, if we add or subtract to this the encoded plaintext, which has the same average entropy per digit, the resulting ciphertext has a full 3.32 bits/digit of entropy. In other words, the ciphertext is statistically indistinguishable from a random string and cannot be decrypted except by brute force. Brute force would still be possible, however, because the key text is only twice as long as the plaintext and does not contain as much entropy as a random string of plaintext length, so it is possible to tell, in principle, whether a guess at the plaintext is correct or not. If it is correct, the keystream deduced by combining the plaintext and the ciphertext will have less than perfect randomness. If a triple-length key text, later divided into three parts and added together, is used, however, brute force is of no avail either, for then one would always be able to find a key for every tentative plaintext one can think of, which combined with it will yield the ciphertext at hand.

On the other hand, it is easy to prove that simply adding digits does not conserve their entropy. Let us see this with a simplified example involving binary strings made of ones and zeroes. Let’s say we have a particular string with exactly twice as many zeroes as ones. Then the probability of a zero is 2/3 and that of a one is 1/3. Now let us cut the string in two pieces and add them without carries, which for binary digits is the XOR operation, where 0 XOR 0 = 1 XOR 1 = 0, and 0 XOR 1 = 1 XOR 0 = 1. Then the probability of getting a zero at a given position in the result is 2/3 x 2/3 + 1/3 x 1/3 = 5/9, and the probability of getting a one is 2/3 x 1/3 + 1/3 x 2/3 = 4/9. That is, the likelihood of getting a one or a zero in the sum is closer to the 1/2 of the perfectly random sequence, but not quite there. What’s worse, cutting this into halves and XORing them together again does not get us there either, because then the probabilities of zero and one are 41/81 and 40/81 respectively. In fact, we can never get to a probability of 1/2 for either digit, because the denominator of a probability thus calculated will always be a power of three, which can never be simplified to 2 no matter what the numerator might be. The original Shannon entropy was -2/3 x log2(2/3) – 1/3 x log2(1/3) = 0.9183 bits/digit, so if entropy were conserved by the operation, each digit of the half-length result would have to carry more than the 1 bit/digit of a perfectly random string, which is impossible. Instead we get -5/9 x log2(5/9) – 4/9 x log2(4/9) = 0.9910, which is close to 1.0, but not quite. In other words, the entropy per digit of the sum is greater than that of the parts being added, but it can never reach the entropy of a perfectly random string.
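
A few lines of JavaScript reproduce these numbers and show the entropy creeping toward, but never reaching, 1 bit/digit as the folding is repeated:

```javascript
// Probability of a zero, and entropy per digit, as a 2/3-zeroes binary string
// is repeatedly cut in half and XORed with itself (addition without carries).
const h = p => -p * Math.log2(p) - (1 - p) * Math.log2(1 - p);
let p0 = 2 / 3;
for (let step = 0; step <= 3; step++) {
  console.log(step, p0.toFixed(4), h(p0).toFixed(4));
  p0 = p0 * p0 + (1 - p0) * (1 - p0);   // P(zero) after one more XOR folding
}
// 0 0.6667 0.9183 / 1 0.5556 0.9911 / 2 0.5062 0.9999 / 3 0.5001 1.0000
```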

This is because summing strings without carries is an irreversible operation, which loses entropy. It’s like thermodynamic entropy, only backwards; in thermodynamics, irreversible processes generate entropy rather than destroy it. If we wanted to conserve the information entropy of a string as it is shortened, we should have applied a reversible process such as data compression. The problem is that this takes a lot of processing, which isn’t the best thing for a paper and pencil cipher.

The encoding from letters to numbers used in BookPad does indeed involve a bit of compression. After all, one of the good features of a straddling checkerboard encoding is that it uses single digits for the most frequent characters and double digits for the rest, thus providing some compression that is bound to concentrate the entropy into fewer digits. Indeed, for perfectly random text containing 26 letters, the entropy per character is log10(26) / log10(2) = 4.70044 bits/letter, while for perfectly random decimal digits the entropy is log10(10) / log10(2) = 3.32193 bits/digit, that is, we need 4.70044 / 3.32193 = log10(26) = 1.415 times more digits than letters to represent the same entropy. Another way to obtain the same result is to consider the length of a long integer expressed in decimal digits or expressed in base 26 (A to Z, for instance). If the number has N decimal digits, all of them nines, its value is 10^N – 1, which is just one unit short of 10^N. The same number expressed in base 26 will span a number M of letters so that 10^N = 26^M, yielding N/M = log10(26) as before.

But typically a text encoded with the Arsenio checkerboard or one of its key-derived variants only takes 1.33 times more digits than letters (including the space). This is because about two-thirds of the characters in a typical text are contained in “arsenio” (or are spaces) and are therefore converted to single decimal digits, while the remaining third is converted to double digits. Consequently, checkerboard encoding increases the length of a typical text by a factor of 2/3 x 1 + 1/3 x 2 = 4/3 = 1.33, which is what we observe experimentally. Therefore, the encoding compresses the text by a factor of 1.415 / 1.33 = 1.064, making the result about 6% shorter than it would have been with a non-compressing encoding.

Intuitively, it doesn’t look like this small amount of compression could be responsible for the amazingly good randomness of the triple text combination, but in order to dispel all doubts I made a simplified cipher where no encoding to digits is used. I call it “TripleText”, and you can find a JavaScript implementation matching this article on this page (the most current version is here). Then I ran the same tests on its output.

Instead of adding and subtracting decimal digits as in BookPad, we are adding and subtracting the letters A to Z, which are base 26 digits. These operations can be performed easily with the help of a Tabula Recta:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
---------------------------------------------------
A | A B C D E F G H I J K L M N O P Q R S T U V W X Y Z | a
B | B C D E F G H I J K L M N O P Q R S T U V W X Y Z A | z
C | C D E F G H I J K L M N O P Q R S T U V W X Y Z A B | y
D | D E F G H I J K L M N O P Q R S T U V W X Y Z A B C | x
E | E F G H I J K L M N O P Q R S T U V W X Y Z A B C D | w
F | F G H I J K L M N O P Q R S T U V W X Y Z A B C D E | v
G | G H I J K L M N O P Q R S T U V W X Y Z A B C D E F | u
H | H I J K L M N O P Q R S T U V W X Y Z A B C D E F G | t
I | I J K L M N O P Q R S T U V W X Y Z A B C D E F G H | s
J | J K L M N O P Q R S T U V W X Y Z A B C D E F G H I | r
K | K L M N O P Q R S T U V W X Y Z A B C D E F G H I J | q
L | L M N O P Q R S T U V W X Y Z A B C D E F G H I J K | p
M | M N O P Q R S T U V W X Y Z A B C D E F G H I J K L | o
N | N O P Q R S T U V W X Y Z A B C D E F G H I J K L M | n
O | O P Q R S T U V W X Y Z A B C D E F G H I J K L M N | m
P | P Q R S T U V W X Y Z A B C D E F G H I J K L M N O | l
Q | Q R S T U V W X Y Z A B C D E F G H I J K L M N O P | k
R | R S T U V W X Y Z A B C D E F G H I J K L M N O P Q | j
S | S T U V W X Y Z A B C D E F G H I J K L M N O P Q R | i
T | T U V W X Y Z A B C D E F G H I J K L M N O P Q R S | h
U | U V W X Y Z A B C D E F G H I J K L M N O P Q R S T | g
V | V W X Y Z A B C D E F G H I J K L M N O P Q R S T U | f
W | W X Y Z A B C D E F G H I J K L M N O P Q R S T U V | e
X | X Y Z A B C D E F G H I J K L M N O P Q R S T U V W | d
Y | Y Z A B C D E F G H I J K L M N O P Q R S T U V W X | c
Z | Z A B C D E F G H I J K L M N O P Q R S T U V W X Y | b

To add two letters: find one letter on the top and the other on the left side; follow the column and row, respectively; the letter found at the intersection of the two is the sum. Example: R + H = Y. To subtract two letters: find the first letter on the top and the second letter (the one being subtracted) on the right side; follow the column and row as for addition. Example: R – h = K.
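
The Tabula Recta is, of course, just modular arithmetic on letters. A minimal JavaScript sketch of the same two operations:

// Addition and subtraction of letters mod 26, equivalent to the Tabula Recta.
const val = ch => ch.toUpperCase().charCodeAt(0) - 65;        // A -> 0 ... Z -> 25
const chr = k => String.fromCharCode(65 + k);
const addLetters = (x, y) => chr((val(x) + val(y)) % 26);
const subLetters = (x, y) => chr((val(x) - val(y) + 26) % 26);

console.log(addLetters('R', 'H'));   // "Y", as in the example above
console.log(subLetters('R', 'H'));   // "K"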

The rest is very similar to BookPad. Here are the steps for encrypting:

  1. Remove all spaces, punctuation, and diacritical marks from the plaintext and the key text so only the characters A to Z remain.
  2. Measure the length of the processed plaintext and take a piece of processed key text three times as long.
  3. Split the key text piece into three parts of equal length and add them letter by letter using the Tabula Recta, which implements addition and subtraction mod 26. The result is the keystream.
  4. Now subtract the processed plaintext from the keystream using the Tabula Recta. The result is the ciphertext.

The process for decrypting a ciphertext is identical, except that the ciphertext replaces the plaintext in all steps.
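
Here is a bare-bones JavaScript sketch of these steps (not the reference implementation linked above, just an illustration of the recipe):

// TripleText sketch: letters A-Z only, mod-26 arithmetic.
const code = ch => ch.charCodeAt(0) - 65;                         // A -> 0 ... Z -> 25
const toLetter = k => String.fromCharCode(65 + k);
const clean = text => text.toUpperCase().replace(/[^A-Z]/g, '');  // step 1

function tripleTextEncrypt(plaintext, keyText) {
  const pt = clean(plaintext);
  const n = pt.length;
  const key = clean(keyText).slice(0, 3 * n);                     // step 2: key piece 3x as long
  if (key.length < 3 * n) throw new Error('key text too short');
  let keystream = '', ciphertext = '';
  for (let i = 0; i < n; i++) {
    // step 3: add the three key parts letter by letter, mod 26
    const k = (code(key[i]) + code(key[i + n]) + code(key[i + 2 * n])) % 26;
    keystream += toLetter(k);
    // step 4: subtract the plaintext from the keystream, mod 26
    ciphertext += toLetter((k - code(pt[i]) + 26) % 26);
  }
  return { keystream, ciphertext };
}

// Decryption is the same operation with the ciphertext in place of the plaintext:
// tripleTextEncrypt(ciphertext, keyText).ciphertext recovers the plaintext.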

If we take the 9000 true random digits that we used earlier and convert them to base 26, we obtain a string of 6360 presumably random letters (check: 9000 / 6360 = 1.415, which is the expected ratio of digits to letters for random data). The statistical tests applied to this, slightly modified for base-26 letters rather than base-10 digits, give these results: chi-squared = 26.58113 (less than 34.4 means random with less than 10% chance of error), two-letter chi-squared = 720.0586 (less than 670.7 means random with less than 10% chance of error), Durbin-Watson = 2.0302 (should be close to 2.0), Shannon entropy = 4.69736 bits/letter (should approach 4.7 for perfectly random), runs at 1.9408 letters/run (should be close to 2.0). That is, all tests are a pass except the two-letter chi-squared test, which does not “pass” at a less than 10% chance of error.
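
Two of these tests are easy to reproduce; here is a JavaScript sketch of the single-letter chi-squared statistic and the Shannon entropy for a base-26 string (the two-letter chi-squared, Durbin-Watson, and runs tests follow the same pattern but are left out for brevity):

// Single-letter chi-squared and Shannon entropy for a string of letters A-Z.
function letterStats(s) {
  const counts = new Array(26).fill(0);
  for (const ch of s) counts[ch.charCodeAt(0) - 65]++;
  const n = s.length, expected = n / 26;
  let chi2 = 0, entropy = 0;
  for (const c of counts) {
    chi2 += (c - expected) ** 2 / expected;
    if (c > 0) entropy -= (c / n) * Math.log2(c / n);
  }
  // chi2 below about 34.4 and entropy close to 4.70 bits/letter suggest randomness
  return { chi2, entropy };
}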

Does a real text, cut into three parts and added together, fare any worse? Should it? If the argument I made for the digits in BookPad is correct, it should fare a little worse, since there is no compressing encoding here, but a triple-length key text should still be enough. The average entropy of English text (spaces included; if we take them out, the entropy should be a little higher) is measured at 1.58 bits/letter, so to get the same entropy as a random string of letters we need a piece of text that is 4.70044 / 1.58 = 2.975 times the length of that random string. Therefore, a triple-length piece of key text still suffices, though not by much.

We again take the 1st chapter of “Oliver Twist” as plaintext and the 2nd chapter as key text, resulting in these statistics for the keystream: chi-squared = 36.272, two-letter chi-squared = 725.51, DW statistic = 2.028, Shannon entropy = 4.6949 bits/letter, runs at 1.9706 letters/run. In other words, it scores very much like the truly random string: it “fails” the same two-letter chi-squared test at the prescribed confidence level, with a very similar score, and its remaining scores are close to those of the random string (the single-letter chi-squared ends up just slightly above the 10% threshold). It is, as far as our tests can tell, practically indistinguishable from a truly random keystream. The cipher is not theoretically unbreakable because, even though the entropy of the key text is sufficient to guarantee this, some entropy is lost in the keystream-generation process, so the result can never be perfectly random. It does come close, though, and it only takes a piece of text that is three times as long as the keystream we need, split into three parts of equal length, and added together with the help of a classic Tabula Recta.

Both the Tabula Recta (Trithemius, 1508 AD) and the running key cipher on which TripleText is based were devised back in the XVI century. Nothing prevented the cryptographers of that time from coming up with a nearly unbreakable cipher like TripleText, except perhaps scarcity of books to use as source material and lack of development of information theory (not much of it being needed, anyway). Had they invented something like this back then, history might have turned out quite different from the way it did.

Cracking the BookPad cipher

BookPad is a paper and pencil “one-time pad” cipher, described in this other post. Real cryptographers are leery of ciphers claiming to be incarnations of the unbreakable one-time pad for good reasons, the best of them being that true one-time pads must contain perfectly random numbers, which not even a computer can produce. In this post, therefore, I put on my cryptanalyst’s hat and attempt to break a longish message encrypted with BookPad.

Who will win? Find out after the break.

First, a clarification. I started writing this article with a version of BookPad that used serial numbers, which were combined with the key text to make the seed for a lagged Fibonacci generator (LFG). The research shown below concluded that serial numbers were insecure and the LFG was likely unnecessary, and so the BookPad spec was subsequently changed, along with the JavaScript program at this link. Serials were replaced by shifts and the LFG was made optional (but available), but again research showed that it was better to eliminate the shift and the LFG altogether (as well as the optional MAC, for that matter). We must, therefore, begin with some background on the components that were studied but are no longer in BookPad, using some older JavaScript code as we go along.

The whole point of BookPad is to generate a keystream starting from a piece of text, so that the keystream is as random and as easy to produce as possible. Additionally, it would be nice if the resulting keystream were non-invertible; otherwise an adversary might be able to obtain the key text from the keystream and, by recognizing the source, end up compromising all our communications. This can be done with the help of a PRNG, which does not need to be cryptographically strong (in the sense that a sequence of bits much longer than the seed cannot be predicted from earlier bits), because we are only going to use it to generate short sequences, shorter than the seed itself.

Of the various PRNGs available that can be run on paper and pencil, lagged Fibonacci generators are especially interesting because they are so easy to do by hand. LFGs are based on a variant of the Fibonacci series, where each number is the sum of the preceding two: 1, 1, 2, 3, 5, 8, 13, 21, 34, 55. . . . To make a PRNG, we don’t take the full numbers, but rather we subtract a multiple of 10 as needed so the result is less than 10 (or we neglect to carry digits as we add), like this: 1, 1, 2, 3, 5, 8, 3, 1, 4, 5. . . . Then, a lagged Fibonacci series doesn’t add the last two numbers, but something further back. For instance, if we add the last and third to last numbers (use zero if there is nothing there), we get: 1, 1, 1, 2, 3, 4, 6, 9, 3, 9, 8, 1, 0, 8. . . which is different from the straight Fibonacci series.

A better way to do this is to write the numbers in separate rows, so we are adding the top and bottom numbers to get the next one. For instance, if each number is the carryless sum of the previous and the one five spaces before, we can set them up into rows of five like this, where the first row and the first number of the second, in bold, come from the seed (496737) that I have used to initialize the PRNG:

4 9 6 7 3
7 1 0 6 3
6 3 4 4 0
3 9 2 6 0
0 3 2 4 0

And so on. You can see that, adding without carries, you get: 4+7=1, which I write to the right of the 7, then 9+1=0, 6+0=6, 7+6=3, 3+3=6 (this one goes on the next row), etc. You can check that, if you change a single digit in the seed, all the numbers in the third row change (although you could get the same third row by changing a different seed digit by the opposite amount; this is no longer true for the fourth row and beyond). This is a good property, which makes the result that much harder to predict. The fourth row is even better, because a single change alters the result in a more complex way (with three rows, all the numbers are simply shifted up or down by the same amount). But for the rest of our present discussion we are going to take as the “result” the digits in the third row. In our example, this will be: 63440.
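
Here is a minimal JavaScript sketch of this generator, reproducing the rows above (each new digit is the carryless sum of the previous digit and the digit five places before that one):

// Lagged Fibonacci generator on decimal digits: each new digit is the
// previous digit plus the one six positions back from the new one, mod 10.
function lfgRows(seed, rows) {
  const digits = seed.split('').map(Number);      // seed = first row plus first digit of row 2
  while (digits.length < 5 * rows) {
    const n = digits.length;
    digits.push((digits[n - 1] + digits[n - 6]) % 10);
  }
  return digits.join('').match(/.{5}/g);
}

console.log(lfgRows('496737', 5));
// [ '49673', '71063', '63440', '39260', '03240' ] -- third row 63440, as above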

To qualify as a “hash” of sorts, the operation must also be one-way, so that having the result will not reveal the seed. As it stands this LFG fails this test, for it is possible to nearly reconstruct the second row from the third, and then we have this, where the letters represent unknowns with possible values from 0 to 9:

a b c d e
7 1 0 6 f
6 3 4 4 0

And then we can establish this system of equations (addition is mod 10):

a + 7 = 1
b + 1 = 0
c + 0 = 6
d + 6 = f
e + f = 6

That is, five equations with six unknowns. It is not strictly solvable, but then we can guess one of the unknowns and then get the rest. If the seed obtained this way has some recognizable pattern (because it comes from actual text), then we might be able to tell which guess out of ten possible is the good one and then the problem is solved. The problem doesn’t get any harder to solve if we take the fourth row instead of the third one, only longer, adding five more unknowns but also five more equations.
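
The guessing can be done mechanically; here is a JavaScript sketch for the little system above, which tries every value of the unknown f and chains through the equations:

// Brute-force the system above: guess f (0-9) and solve a..e by modular
// subtraction. One of the ten candidates is the true first row of the seed.
const sub10 = (x, y) => (x - y + 10) % 10;
for (let f = 0; f <= 9; f++) {
  const a = sub10(1, 7), b = sub10(0, 1), c = sub10(6, 0);
  const d = sub10(f, 6), e = sub10(6, f);
  console.log(`f=${f}: candidate first row ${a}${b}${c}${d}${e}`);
}
// The guess f=3 yields 49673, which is indeed the first row of the example above.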

In other words, to make this non-invertible we need to remove information. One easy way is to output a string shorter than a full row. One unknown is added to the problem for every digit skipped. When the number of digits skipped equals the number of digits in the output (that is, when we output only half of the last row), our guesses carry as much uncertainty as the data we have. In that case, we can always reproduce the half of the last row that we do have by choosing an appropriate guess for the other half, regardless of what the seed might be, so we can never tell which seed is the correct one.

But ultimately there is no need to make this non-invertible, especially if the price to be paid for it is throwing away some of the key text, which then effectively gets used only for this purpose. In this other article, I show that the LFG might not be necessary at all, since the seed fed into the LFG is as random as the result of the LFG, provided the key text was indeed sufficiently long.

In Dirk Rijmenants’s excellent article on one-time pads, he insists that real one-time pads have failed in practice not because the cipher is not inherently unbreakable, but because people committed mistakes that reduced their security. This was also the case with the Enigma machine of World War II and many other ciphers, especially when some complexity was involved on the user’s side. Well, BookPad is not free of complexities, so let us begin by trying to exploit the different ways people can misuse the cipher. Here’s a list:

  1. BookPad is designed so the key text is at least three times the length (as encoded into numbers) of the plaintext, but nothing prevents a lazy user from taking a much shorter key text and just repeating it several times in order to achieve the required length.
  2. BookPad is a stream cipher, and so it will be easily broken if the keystream is re-used. This can happen in several ways: re-using the key text for a message of identical length as a previous one, re-using it for a message of a different length, or using a serial that is consecutive (or simply close) to a previously used serial.
  3. As with all stream ciphers, an adversary in possession of the plaintext (but not the key text) might be able to change the ciphertext so it decrypts to something of his choosing. Adding a message authentication code (MAC) makes this much more difficult, but the MAC process takes extra effort and opens up the possibility of leaking information, so the cipher is no longer theoretically unbreakable.

We will now look at these scenarios one by one. But first we need to have something to crack. In order to provide room for statistical analysis, we’ll encrypt something rather long. For instance, the first paragraph in the text-only version of Charles Dickens’s “Oliver Twist” at gutenberg.org, which reads:

Among other public buildings in a certain town, which for many reasons
it will be prudent to refrain from mentioning, and to which I will
assign no fictitious name, there is one anciently common to most towns,
great or small: to wit, a workhouse; and in this workhouse was born; on
a day and date which I need not trouble myself to repeat, inasmuch as
it can be of no possible consequence to the reader, in this stage of
the business at all events; the item of mortality whose name is
prefixed to the head of this chapter.

We will be encrypting this with different keys. If we use as key the second and third paragraphs from the same source, we will have enough key material. This is:

For a long time after it was ushered into this world of sorrow and
trouble, by the parish surgeon, it remained a matter of considerable
doubt whether the child would survive to bear any name at all; in which
case it is somewhat more than probable that these memoirs would never
have appeared; or, if they had, that being comprised within a couple of
pages, they would have possessed the inestimable merit of being the
most concise and faithful specimen of biography, extant in the
literature of any age or country.

Although I am not disposed to maintain that the being born in a
workhouse, is in itself the most fortunate and enviable circumstance
that can possibly befall a human being, I do mean to say that in this
particular instance, it was the best thing for Oliver Twist that could
by possibility have occurred.  The fact is, that there was considerable
difficulty in inducing Oliver to take upon himself the office of
respiration,--a troublesome practice, but one which custom has rendered
necessary to our easy existence; and for some time he lay gasping on a
little flock mattress, rather unequally poised between this world and
the next: the balance being decidedly in favour of the latter.  Now,
if, during this brief period, Oliver had been surrounded by careful
grandmothers, anxious aunts, experienced nurses, and doctors of
profound wisdom, he would most inevitably and indubitably have been
killed in no time.  There being nobody by, however, but a pauper old
woman, who was rendered rather misty by an unwonted allowance of beer;
and a parish surgeon who did such matters by contract; Oliver and
Nature fought out the point between them.  The result was, that, after
a few struggles, Oliver breathed, sneezed, and proceeded to advertise
to the inmates of the workhouse the fact of a new burden having been
imposed  upon the parish, by setting up as loud a cry as could
reasonably have been expected from a male infant who had not been
possessed of that very useful appendage, a voice, for a much longer
space of time than three minutes and a quarter.

In order to make things easier for myself, I start doing the cipher using this web app (a deprecated version implementing serials and LFGs) which goes through the same steps I’d follow to use BookPad with paper and pencil, only much more quickly. Go ahead and click the link (the app will load on a separate tab) in order to follow what I’ll be saying and experiment on your own.

If now we feed both into BookPad and do the encryption with the default “Arsenio” encoding and serial “0000”, the ciphertext, before any decoding back to letters, becomes:

49504560680084015661998496971386126345322028641775701660631118711542791176999963918180950242273730121545261879856530459624343588341942635310173838649537278287338241876062309030325371889683406237844253140131381115271171980084995803843604317040322957784187844717928566346098883557436085871528224684827122698669356179218594631560234600453394784007600430168768118427774015454962575355875003337486458086967754028911903728214916894115858013830641158728078374294705902700957404571860086035594822671548478906542226605209355983624829263526604419894755056810662664584223372640584840832996675770603231995849090181934841162880569602142655230141244650356624845842627970334117126810363172415890169789014

Some quick statistics of the ciphertext:

length = 689 digits (the plaintext was 522 characters, so the ciphertext is 1.32 times longer).
'0' appears 72 times, '1'=73, '2'=65, '3'=64, '4'=77, '5'=66, '6'=74, '7'=60, '8'=81, '9'=57, which is close to a uniform distribution

Now let’s go through the different scenarios.

Short key scenario.

If the user gets lazy and uses a very short key text, he may easily produce a repeating keystream. For instance, if the key text is simply “a”, the resulting keystream, ready to be combined with the encoded plaintext, becomes:

86556815063100136051865568150631001360518655681506310013605186556815063100136051865568150631001360518655681506310013605186556815063100136051865568150631001360518655681506310013605186556815063100136051865568150631001360518655681506310013605186556815063100136051865568150631001360518655681506310013605186556815063100136051865568150631001360518655681506310013605186556815063100136051865568150631001360518655681506310013605186556815063100136051865568150631001360518655681506310013605186556815063100136051865568150631001360518655681506310013605186556815063100136051865568150631001360518655681506310013605186556815063100136051865568150631001360518655681506310013605186556815063100136051865568150

Which is made up of the period “86556815063100136051” (20 digits) repeated 34 times, plus a fraction. The resulting ciphertext, therefore, is as if produced by a Vigenère cipher with key “86556815063100136051” acting on the encoded plaintext. To crack this, we would gather statistics for the digits found at positions 1, 1+n, 1+2n, and so forth, where n is our guess of the key length. When we try the correct length, all the digits in each set have been combined with the same keystream value, so their peculiar statistics are not affected by the keystream. If the plaintext is English, there will be a clear statistical bias toward certain letters (“e” being the most frequent after the space, then “t”, “a”, and so forth) that allows us to recognize that length as the right one, and likely to recover the corresponding plaintext, each letter shifted by the same amount in the alphabet, as in the Caesar cipher. The scanning and testing process can be automated, so it would take a fraction of a second on a computer. The only requirement is that each set is large enough to produce meaningful statistics and that the encoding does not obliterate the patterns present in the plaintext.
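
As an illustration of that scanning step (not taken from any existing BookPad tool), here is a JavaScript sketch that, for each candidate period n, groups the ciphertext digits found at positions i, i+n, i+2n, and so on, and scores how far each group’s digit frequencies are from uniform; the correct period shows up as a markedly higher average score:

// Scan candidate periods: for each n, split the ciphertext digits into n
// groups (positions i, i+n, i+2n, ...) and average a chi-squared score of
// each group's digit frequencies against a uniform distribution.
function scanPeriods(cipherDigits, maxPeriod) {
  const scores = {};
  for (let n = 2; n <= maxPeriod; n++) {
    let total = 0;
    for (let i = 0; i < n; i++) {
      const counts = new Array(10).fill(0);
      let m = 0;
      for (let j = i; j < cipherDigits.length; j += n) { counts[+cipherDigits[j]]++; m++; }
      const expected = m / 10;
      total += counts.reduce((s, c) => s + (c - expected) ** 2 / expected, 0);
    }
    scores[n] = total / n;     // peaks at the true period (and its multiples)
  }
  return scores;
}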

Let’s take a look at this last point. If I count letter frequencies (including space) in the plaintext, I find the following:

'space'=90,'e'=44,'t'=40,'a'=30,'o'=39,'i'=34,'n'=37,'s'=29,'h'=21,'r'=21,'l'=15,'d'=11,'u'=10,'c'=13,'m'=14,'f'=10,'w'=12,'y'=5,'p'=5,'v'=1,'b'=8,'g'=6,'k'=2,'j'=0,'q'=1,'x'=1,'z'=0

This has only a few inversions with respect to average English letter frequencies (which is the order used in the list above).

In the encoded plaintext, however, the frequencies are:

'0'=94,'1'=45,'2'=35,'3'=80,'4'=64,'5'=44,'6'=69,'7'=40,'8'=103,'9'=115

Here ‘0’, ‘8’ and ‘9’ are the most frequent digits by far. ‘0’ and ‘9’ encode the less-frequent letters in most Western languages (‘t’, which is frequent in English, is encoded as ’03’ in our scheme, sharing the digit ‘0’ with other much less frequent letters), and ‘8’ encodes the space. So it stands to reason that a real piece of English plaintext will have a lot of 8’s, 9’s, and 0’s, although it would be hard to tell which encodes which. Looking at the less frequent digits, we find ‘2’, which encodes ‘r’, not so frequent in English, and ‘7’, which encodes ‘o’. The most frequent ‘e’ is encoded as ‘4’, which is in the middle of the pack and doesn’t stand out in any way. Still, let us assume that a piece of encoded English might be recognizable because of the high frequency of three digits (encoding spaces and less-frequent letters) over the rest. Unfortunately, digraphs and trigraphs, which are usually much better at detecting encoded language than single letters, won’t help with Vigenère cryptanalysis (unless we add a lot of complexity to the process) because consecutive letters end up in different groups.

It is interesting to compare this statistic with what we would get if the plaintext was random. One way to obtain this is to encode the string “abcdefghijklmnopqrstuvwxyz .’” which contains all encodable characters once. If we repeat it 14 times, for a total of 672 digits after encoding, we obtain the following statistic:

'0'=168,'1'=42,'2'=42,'3'=42,'4'=42,'5'=42,'6'=42,'7'=42,'8'=42,'9'=168

so that only two digits, corresponding to the less-frequent letters, appear more frequently than the others. It seems, therefore, that it would be possible to detect the moment when the correct key length is guessed, since then frequency analysis would give three peaks, which is different from a random distribution of digits (no peaks) or of letters (two peaks). From then on, one would apply the standard cryptanalysis of the Vigenère cipher, obtaining the repeated keystream period in the process.

Okay, but that’s with an incredibly bad key. What if the key is a little better? Let’s assume now that the key is “ar”, which is encoded as “12” with the default straddling checkerboard. In this case the keystream consists of repetitions of the sequence “407668162090261866704076681620902618667040766816209026186670”, which comprises 60 digits, so that the string is repeated 11 times plus a fraction. It is a lot harder to come up with a meaningful frequency distribution of 10 different digits if there are only 11 or 12 samples. The expected three peaks might appear, or they might not, and then we’d never know what the true period is. This period is not the same for every key, either: if the key is “ae”, which also encodes to just two digits, the repeated keystream sequence is “7310137297781518779273101372977815187792”, which is 40 digits long, leading to an average of 17.2 samples per distribution. Nor is the period constant for a given key: if key “ar” is used for a plaintext equal to the original minus the final period (2 digits shorter after encoding), then the keystream period is 40, not 60. For key “ae”, it is 20 digits long.

Going back to the original plaintext, key “ab” produces a 60-digit keystream period, whereas key “ac” leads to a 45-digit period. Both are three-digit keys, as encoded, so the difference is to be found in the peculiarities of the lagged Fibonacci generator that produces the keystream. A number of four-digit keys also lead to 60- or 40-digit periods, but when we go to five digits, as in “bad”, the period jumps to 100 digits, whereas “bat” takes us back to 50 digits. A six-digit keyword such as “but” gives 60 digits. In general, one could expect the keystream period to be about ten times the encoded key length, so if the key is one-tenth the length of the plaintext (the spec calls for a key text at least three times the plaintext length, thirty times more than that), the keystream won’t repeat within the length of the message, and we may be back to the one-time pad situation if the resulting keystream has good randomness.

A quick example: since the plaintext encoded length is 689 digits, let’s try a key that encodes to one-tenth of that length, that is, 69 digits. For instance: “This is a not so short key. But not too long either.” Using this key text, we find that the keystream does not repeat. What’s worse, the distribution of digits in the resulting keystream is quite uniform. The shorter key text “This is a not so short key.” (35 digits after encoding) doesn’t lead to repetitions, either, but the resulting keystream does not look so random, with a superabundance of 2’s and 7’s and a dearth of other digits. However, “This is a short key” (25 digits) leads to a period of 125 digits, for a total of five and a half repetitions. So, it seems that in the worst case the keystream period is 5 times as long as the key (as far as we’ve found so far), with a factor of 10 being most frequent, and sometimes 20.

Conclusion: we can expect to be able to crack BookPad by conventional frequency analysis if the user does not follow instructions and uses a short key, with diminishing probability of success as the key gets longer. Once the user takes a key text that is at least one-tenth the length of the plaintext (after encoding), forget about it, even though technically there isn’t enough key entropy to meet the one-time pad criterion for unbreakability.

Reused key

Clearly, reusing the key text for two messages of identical length breaks all security since then an attacker only needs to subtract the two encoded ciphertexts in order to find the difference between the two messages, which then can be easily attacked in many ways. But what if the messages are of different length? What if a different serial was used with the same key text, but the two serials differ only a little?

To test the first scenario (which will be revisited later), let us re-encrypt the original message with the original key, which is shown at the top, and then use the same key to encrypt the original message minus the final period (two digits shorter). Here’s the ciphertext for the second encryption:

080653919354207573082946698279557637717783420159887064587363997801819098623886634621435181617149650881379081385808289648839232357686905572759273683733019242416409622104254425068549459634261536548307305258991049932848714094207982511166147214391626869846663488479725578102687486543909535691590189912834514950633492400878567613933785290007112246339237125515514603589605896248273434294515810209220596453282731762769121058678451926600535491183932264144604511586986598408786068577900862225732301214114919758307362469428955118631085987435666629232914607829656886750844664924781158738892072449040943422393867494120708503026553236782069739087152697883503740382282317330411073067932595532092161284

Even though the plaintexts only differ by two digits at the end, the ciphertexts are completely different from the start. No similarity between the two is evident anywhere. This is because the two keystreams, having different lengths, become completely different due to the properties of lagged Fibonacci generators. At first, it doesn’t appear promising at all for the cryptanalyst, so we’ll try an easier target.

How about serials that only differ by a little? Let’s encrypt the original plaintext with the original key text, but this time using “1000” as serial, so a “1” will appear where before there was a “0” in the process of generating the keystream. Then the ciphertext becomes:

51849138581218572451011842650398461913223252108565824016310120056110692300456753031536639254518308022779728669979986138636688156242176192100296284328549513855239475333852422486004383124251307461301043263587060127516749881218452693966050096052667525685311301507041912025000128125337219338318347030506134933237257303775384754916913612798962685231167220281114897439019683355196032145998459016498793654868988585701026174993928139783759247397431271174757386539373803934414294694216765047839490572772935796665672384211690551525053720316727865573767391488563898041013495096263852177564576904160021018295779193279419063014026492265001919153589228257858302632740326013129461488264306972680282135793

We do see some repeated digraphs between the two ciphertexts, but they are not at the same locations, so we must conclude that they arise from normal random coincidences. Prominent features like the “9999” found in the first line of the original ciphertext don’t appear in the second one, and vice versa. Both messages look as if they were encrypted with completely different key texts.

We hit pay dirt if the serial is “1900”, however, for then the ciphertext is:

40615671791195126772009507082497237456433139752886812771742229822653802287000074029291061353384841232656372980967641560735454699452053746421284949750648389398449352987173410141436482990794517348955364251242492226382282091195006914954715428151433068895298955828039677457109994668547196982639335795938233709770467280329605742671345711564405895118711541279879229538885126565073686466986114448597569197078865139022014839325027905226969124941752269839189485305816013811068515682971197146605933782659589017653337716310466094735930374637715520905866167921773775695334483751695951943007786881714342006950101292045952273991670713253766341252355761467735956953738081445228237921474283526901270890125

which is the same as the original ciphertext with 1 added (mod 10) to every digit except the first. This is because the shift in the LFG caused by the first serial digit ‘1’ is promptly canceled by the second digit ‘9’, resulting in a related keystream sequence. Serial “2800” shifts the whole thing by 2, which again essentially repeats the keystream, and there are a number of such serials that lead to results too close to the original. Now, of course, this only works if the two messages are of identical encoded length and use the same key text with serials whose digits add up to the same total (mod 10), which will happen one time in ten on average (not bad at all).

Now, if we go back to the ciphertext encrypted with the serial “1000”, we notice that it is more similar to the original (serial “0000”) than we had been led to believe at first. Indeed, if I add the repeated stream “1234567890…” to the original, I get the one with serial “1000”. The effect is similar if the serial is “0100”, “0010”, or “0001”. If the serial is “2000”, the keystream shifts by the sequence “24680…”; if the serial is “3000”, the string to be added is “3692581470…”, and so forth. Even better, the linearity of the LFG causes a serial composed of several figures to have an effect on the keystream that is the superposition of the effects of all its digits. In other words, utterly predictable, though apparently random at first. If instead of three rows of LFG the encryptor had used four, we’d still be able to predict the effect, though the sequences to be added would be a little more complex.
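
Here is a small JavaScript sketch of how an attacker could strip out that predictable contribution, assuming the pattern just observed, in which the first serial digit d adds d, 2d, 3d, … (mod 10) to successive ciphertext digits (the other serial positions add similar, offset streams that could be removed in the same way once their offsets are worked out):

// Undo the effect of the first serial digit d: position i of the ciphertext
// was shifted by d*(i+1) mod 10, so subtract that back out.
function removeFirstSerialDigit(cipherDigits, d) {
  return cipherDigits.split('')
    .map((ch, i) => (((+ch - d * (i + 1)) % 10) + 10) % 10)
    .join('');
}
// With d = 1, this turns the serial-"1000" ciphertext above back into the
// serial-"0000" ciphertext, digit for digit.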

So, advice to the encryptor: don’t use a serial as described in the old BookPad spec. It gives no extra security because the linearity of the LFG causes the effect of the serial to be completely predictable so it can be removed, leaving behind the original key text. If this is repeated with two messages of equal length, all will be lost.

Because of this, I altered the original spec of BookPad so that a key text shift, applied as soon as a piece of suitable length is extracted from the encoded key text, is used rather than a serial added to the digits. Shifts move all the digits so that, after three rows in the LFG, there is little apparent correlation between the keystreams produced from different shifts. The idea is to be able to reuse a particular key text, so long as the shift itself is not reused with the same key text and a key text-derived encoding is in use. To follow along with the explanation that follows, use this other JavaScript code (also deprecated), which uses shifts rather than serials.

Shifts move all the digits in the affected parts of the key text, so the result of the sum changes unpredictably, but are they really secure? Not if the attacker has the plaintexts. This is not as unlikely as it seems: perhaps the attacker has planted some news in order to get the user to include certain unusual words in the plaintext, which can be recognized later on, or perhaps he got the plaintext from a different source and is now trying to find the origin of the key text. This is known as a “known plaintext attack.” Let us, therefore, assume that I, the attacker, have subtracted the known plaintext from two messages encoded with two different shifts (say, one unshifted and another where the first piece is shifted by just one space to the left), so that I have both keystreams. Let’s say one is “12345” and the other is “67890”; then the process of making the first keystream (assuming no LFG step) will be:

 abcde
+fghij
------
 12345

where “abcde” is the key text part that is shifted in the next message, and “fghij” is the result of adding the two parts that are not shifted. The process of making the second keystream (again, no LFG) is:

 bcdea
+fghij
------
 67890

Putting the two together, we obtain the following linear system of equations, where the addition is modulo 10:

a + f = 1 ; b + g = 2 ; c + h = 3 ; d + i = 4 ; e + j = 5 ; b + f = 6 ; c + g = 7 ; d + h = 8; e + i = 9 ; a + j = 0

which is a system of 10 equations with 10 unknowns. Solving it we obtain the digits a through j. Then we decode it back to letters, resulting in a piece of the key text that a diligent search will identify as belonging to a certain book or other source. We’ve got it! Now, the example was too simplified, but it serves to illustrate that there is nothing the encryptor can do to stop us:

  1. If he shifts the part by a different amount, we can still set up the system of equations, so long as we know what the shift is. Now, the shift may be concealed within the ciphertext, but then we only have to try all likely locations (there’s only so many of them), and keep the one that leads to intelligible text in the solution.
  2. If the shift affects two parts rather than one, we’ll end up with a similar situation but involving three rows of unknowns instead of two. This only means that we’ll need three compromised messages rather than two in order to reach a solution. Since the point of the shift is to reuse a key text, it is very likely that three messages using the same key can be found if the encryptor has reused the key. Why use it only for two messages?
  3. Things won’t be much better for the encryptor if he uses an LFG, since the LFG can be reversed after making a few guesses, as we saw above. Nothing that a computer cannot handle.

But what if he does not shift the key text parts, but simply reuses the key text for messages of different lengths, which, as we saw above, leads to very different-looking keystreams? Using the example above, the first, second, and third keystreams, for messages that are 5, 4, and 3 digits long, would be, for instance:

 abcde
+fghij
+klmno
------
 12345

 abcd
+efgh
+ijkl
-----
 6789

 abc
+def
+ghi
-----
 012

which results in a system of 5 + 4 + 3 = 12 equations with 15 unknowns. It is not strictly solvable, but it will be if we guess three of the unknowns, for a total of 10 x 10 x 10 = 1000 possibilities. The right solution will be the one that yields intelligible text for “abcde”, “fghij”, “efgh”, and so forth. It is unlikely that two different solutions will pass this test. So it turns out that reusing a key text for messages of different lengths is not safe either, even though the keystreams may look quite different.

What can the encryptor do to thwart me? I can only think of two things:

  1. Never reuse a key text, rather than relying on a shift to make the keystream different each time, or on the fact that keystreams naturally look different for different lengths. This holds whenever an attacker is able to obtain the keystreams by subtracting known plaintexts with a known encoding (such as the default “Arsenio” encoding).
  2. If reusing a key is unavoidable, use a key text-derived encoding rather than the default, plus a shift. If the encoding is unknown, I won’t know what the correct keystream is, even if I know the plaintext, and the process I just described will be much more difficult to get started. I can still make guesses as to what the encoding is (essentially, it’s a permutation of the top row of the checkerboard and the other two rows), but there are a lot of possibilities to try: 10!/2 for the first row, times 19! for the other two, for a total of 2.2E23, which will take a while to go through even on a powerful machine. If I work for the NSA, though, I might still be able to muster enough computer power to go through every possible encoding, after which solving the system will hardly take any extra time, and I will still be able to tell when I’ve got the correct solution, because all parts of the key text will be readable (a machine can detect this by means of a dictionary).

In other words, the encryptor will do well never to reuse a key text, even with key text-derived encoding. After all, we are after perfect secrecy, and this implies NSA-proof secrecy, present and future. This is the reason why shifts were also removed from the BookPad spec. It is much easier to simply choose a different piece of key text for every message, rather than keep track of shifts and encodings in order to reuse a given key text.

Altered ciphertext

Here we have to deal with the issue of “malleability,” which plagues all stream ciphers. An attacker knowing the plaintext (or a part of it, along with its location) will be able to make a ciphertext that decrypts to a plaintext of his choosing, by simply shifting the values in the ciphertext so the plaintext letters shift in the same way. This is possible whenever the attacker knows the encoding used. For instance, the initial “among” of the message is encoded as “1907595” with the default encoding. “Sugar” encodes to “3049512”, which has the same length. Therefore, if I take the ciphertext and add to it the number 1907595 – 3049512 = 8968083 (digit by digit, mod 10), it should decrypt to what I want. The altered ciphertext begins (only the first seven digits change):

2818439068008401566199849697138612634532202864177570166063111...

which decrypts to:

sugar other public buildings in a certain town...

which is exactly what I wanted to do.
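
The arithmetic of this attack is trivial to automate; here is a JavaScript sketch, using the encodings given above:

// Malleability attack: add, digit by digit mod 10, the difference between
// the known and the desired encoded plaintexts to the ciphertext.
const subDigits = (a, b) => a.split('').map((d, i) => (+d - +b[i] + 10) % 10).join('');
const addDigits = (a, b) => a.split('').map((d, i) => (+d + +(b[i] ?? '0')) % 10).join('');

const known   = '1907595';                    // "among", per the default encoding
const desired = '3049512';                    // "sugar"
const delta = subDigits(known, desired);      // "8968083"
// addDigits(ciphertext, delta) alters only the first seven digits, so the
// message now decrypts to "sugar ..." instead of "among ...".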

To combat this, the user can either calculate a MAC and include it with the message, which is extra work, or use a key text-derived encoding checkerboard. This is how a message authentication code (MAC) can be calculated (I will assume it is going to be 3 digits long):

  1. Since the MAC length is 3 digits, I set N = 3. I take a piece of encoded key text that I have not used yet with length 3N = 9 digits. For instance: 672930060
  2. Now I take the encoded plaintext (“Alice”), followed by the piece of encoded key text and the shift or serial (if any), and write them row by row into N = 3 columns. Then I add each column without carries and get the result:
 199
+692
+467
+293
+006
+0(shift here, wrapped if necessary)
-----
 337

My friend on the other end goes through the exact same steps after he has obtained the encoded plaintext. Since he is using the same key text, he should calculate the same MAC (337), which I am sending along with the ciphertext. If the calculated value matches the received value, he knows there is only about one chance in a thousand that an altered message would still match; if it does not match, most likely there has been an error or an alteration.
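
A JavaScript sketch of this computation, reproducing the worked example above (the encoded plaintext digits, 1996924, are simply read off the first rows of the sum):

// BookPad-style MAC: write the digits into N columns and add each column
// without carries (mod 10).
function bookPadMac(digits, N) {
  const sums = new Array(N).fill(0);
  digits.split('').forEach((d, i) => { sums[i % N] = (sums[i % N] + +d) % 10; });
  return sums.join('');
}

// Encoded plaintext "Alice" (1996924) + unused key text piece (672930060) + shift (0):
console.log(bookPadMac('1996924' + '672930060' + '0', 3));   // "337", as above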

This process of making a MAC doesn’t give away anything about the plaintext, because secret data having as much entropy (remember, entropy is the same as “unpredictability”) as the MAC itself has been mixed in. Those who don’t possess the key text would have to guess this data, which means that they can produce any three-digit MAC just by changing their guess. No good at all for this poor hacker.

Even worse, if the encryptor uses a key-derived encoding (and we saw above there are 2.2E23 possibilities), I cannot begin to modify the ciphertext so the plaintext becomes something different, because I don’t know how the letters translate into digits. I could try to make some guesses, but for this attack I only have one chance, since the recipient will realize that something is wrong as soon as the first word decrypts as gibberish while the rest makes sense. I’m stuck.

Advice to the encryptor: making your own key-derived checkerboard takes very little time and protects you against this attack, even without a MAC. Much preferred.

Summary

We have subjected BookPad to three kinds of attacks that might be likely in practice. BookPad remains secure even with shortish keys, so long as they are at least one-tenth the length of the message; such short keys, on the other hand, don’t have the theoretical unbreakability of long key texts and can be attacked by brute force. The related-serial attack succeeded against users who reused the key text and relied on serials to make the keystream different, so serials were deprecated in the BookPad spec and replaced at first by key text shifts. Then shifts also fell to attack: easily when the encoding was known, not so easily for key-derived encoding, but still within the reach of an attacker with a lot of computing power, so it is better never to reuse a key text, period. Key-derived encoding works as well as calculating a MAC for protecting against known-plaintext alteration, and is much quicker to do. In this other article, I found that the LFG step does not add security when the key text is the necessary length, which is why the current BookPad spec does not include it.