Solving Running Key Ciphers (Manually/Digitally)

The Vigenere cipher, once called chiffre indechirable, can be solved by exploiting the repetitive use of the key. What if the key is not repetitive, that is, the key is as long as the whole message? Such a scheme is known as the running key cipher. If the key is a random sequence of letters, it is as good as a one-time pad and is theoretically unbreakable. But if the key is a meaningful text, the internal characteristics of natural language give rise to vulnerability to codebreaking.

The present article describes techniques (essentially manual, though helped by use of a PC) to break a running key cipher and also introduces some papers for algorithms to solve running key ciphers.

Manual Solution of Running Key Ciphers

The probable word method for solving the Vigenere cipher may be applied for breaking running key ciphers. First, one guesses at a word that is likely to occur somewhere in the plaintext and subtracts it at every possible starting point in the ciphertext. If the subtraction (K=C-P) reveals a fragment of a plausible key text, it may be part of the key in natural language. If so, a prefix or suffix to the fragment to complete a word may be inferred. Such an extension of the key may in turn be subtracted from the ciphertext (P=C-K), which gives an extension of the plaintext word guessed in the first stage. By such alternating extension of the key and plaintext, the plaintext and key may be recovered.

Friedman (1918)

In William Friedman (1918), "Methods for The Solution of Running-Key Ciphers", Riverbank Laboratories, Publication No.16 (University of Amsterdam), the illustrious codebreaker declares that the running key cipher "may nearly always be solved" and demonstrates a "very simple" case where the same running key is used in multiple messages.

(Write the first ciphertext letter from all the messages into a sequence called "generatrix", from which 25 generatrices are written below by shifting (the process called "run down"). Since all these letters are enciphered with the first letter of the key, one in the set of the 26 generatrices contains the correct plaintext letter for the first letter of all the messages. Similarly, 26 generatrices are generated for the second and third letters in the ciphertext of all the messages. The problem is then choosing the correct generatrix for each set. It would be done by trials and errors, but may be helped by picking up generatrices containing many high-frequency letters.)

Example 1

Wikipedia (s.v. Autokey cipher) gives a demonstration of solving a short ciphertext with a probable word "THE." Although this process is facilitated by the nature of the autokey cipher, the essence is the same for the running key cipher (the example is the plaintext "meet at the fountain" encrypted with the key "KILT meet at the foun").

Example 2

The following is my own trial with the probable word method. It was not successful, but may give some insights for similar attempts. (I used a tool here for enciphering and deciphering.)

P:The enemy penetrated still further into the country. The General, eager to regain the lost ground, gallantly drove back the enemy. But his troops were soon forced to retire and leave that rich country in the hands of the enemy.
K:Few persons can be made to believe that it is not quite an easy thing to invent a method of secret writing which shall baffle investigation. Yet it may be roundly asserted that human ingenuity cannot concoct a cipher which human ingenuity cannot resolve.
C:ylatrvemcwpegsefegwmwmpqcvolxyigbhbzrqhkhbkctuigwlxyiyktumeosexgmmgavhztkxiisnjuottygjatakjvvpmbcpysimazqquczhbahemsiapqrctsfbzbuncdlsixxlklagkfqaimgneglqvfeohahkawavvgaaplhvjkdlmplts

Assumed Word "THE"

First, the probable word "THE" is subtracted from every possible starting position of the ciphertext. (The result is given in three rows for starting positions 3n+1, 3n+2, and 3n. For the latter two, the repetitive sequence of the word is padded with A at the beginning. In the following, the first line (P) in each of the three rows is the assumed plaintext word. The second line (K) is the key revealed by the assumed plaintext. The third line (again labelled K) is the correct key, supposed to be yet unknown.)

P:THETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHE
K:fewakrlfydianlamxcdfstimjoksqupzxouvyjdraxrvpbbcdetfburmqtxkzxtnfintrosprqepljqnkamuncwatgqorwfxjiuzbihsmxnygaxhaatlehimyvpzyxguquvzszrcfibqdupoffryuqnqajdwhjqhzbydxncoqmjmtztquhernmb
K:FEWPERSONSCANBEMADETOBELIEVETHATITISNOTQUITEANEASYTHINGTOINVENTAMETHODOFSECRETWRITINGWHICHSHALLBAFFLEINVESTIGATIONYETITMAYBEROUNDLYASERTEDTHATHUMANINGENUITYCANNOTCONCOCTACIPHERWHICHHU

P:ATHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETH
K:ystpyoatvswxczxblzstpiwjychherenudisnxagougjmqpzssquprganilholqctfchodgmgebezgfbhparcqtphdfcoltuywropfwgjmbvvouwoxizbwwjnjmomuvinjjwhnortfqeajdlutonincexyrtwxnwnynrucqlfagbhwierwsocay
K:FEWPERSONSCANBEMADETOBELIEVETHATITISNOTQUITEANEASYTHINGTOINVENTAMETHODOFSECRETWRITINGWHICHSHALLBAFFLEINVESTIGATIONYETITMAYBEROUNDLYASERTEDTHATHUMANINGENUITYCANNOTCONCOCTACIPHERWHICHHU

P:AATHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHETHE
K:ylhmncxijpllzolyanpidflxvrvetfbciaxgkmoddidyanenpherefdpbfavlaezitzwcavadtpboucqvmpfzfhmwrcrciiivlflettvxjqjsditdlfoptlxkyalbisxbgykeccoitntogszrickxbztlvghtmbtcmkgizfzcpuywkftfthczpm
K:FEWPERSONSCANBEMADETOBELIEVETHATITISNOTQUITEANEASYTHINGTOINVENTAMETHODOFSECRETWRITINGWHICHSHALLBAFFLEINVESTIGATIONYETITMAYBEROUNDLYASERTEDTHATHUMANINGENUITYCANNOTCONCOCTACIPHERWHICHHU

Plausible plaintext fragments resulting from the assumed plaintext word THE is highlighted in blue and green, with the latter showing correct matches (though not yet recognized as such). Red highlights show correct matches I overlooked.) It should be remembered we are checking whether "THE" occurs anywhere in the plaintext. So, for example we mark "dia" aligned with THE but not "dian", even if the latter looks plausible in English. (For this reason, at this stage, it is convenient to work with the text split into units of three letters.)

We should remember that the plaintext and the key, both text in natural language, are equivalent in the running key cipher (C=P+K). So, in general, when the subtraction of THE reveals a plausible text fragment, we cannot know whether THE is a candidate for the plaintext or the key, which needs to be decided for every instance. (In this particular example, it so happened that the key text I picked up contained no "THE".)

Anyway, it can be seen that with the short candidate "THE", there are too many false candidates. Even if we focus on a correct match, it is often difficult to extend the revealed fragment to full word(s) ("vet" to "believe that"; "ane" to "an easy"; "fle" to "baffle"; "erw" to "cipher which"). In this example, the two viable candidates are those I overlooked. By some intuition, one may extend "isn" to "is not" and "hod" to "method."

Suppose we notice "hod" can be extended to "method" in K. Then, from this and the corresponding ciphertext "mmgavh", "THE" can be extended to "AINTHE".

P:THETHE AINTHE
C:eosexg mmgavh
K:lholqc method

If we can extend the suffix AIN, it in turn extends the plaintext. To do that, we need to check many candidates such as "main", "gain", "maintain", "train", "pain", "sprain", etc. etc.

So, even if we try a correct candidate (which we know is "gain"), it only extends "method" to "a method", which it is hard to extend further.

P:THETH G AINTHE
C:eosex g mmgavh
K:lholq a method

If we are really lucky to try "regain", we may get a little more.

P:THE REG AINTHE
C:eos exg mmgavh
K:lho nta method

Then, the next step would be to try to extend "nt" to various words ending in "nt" (esp. "ant" or "ent") to see if any of them yields a plausible plaintext.

If we are stuck, trying a different crib (other than THE) may reveal different parts of the message.

Assumed Word "ENEMY"

The above experiment shows that use of "THE" as the assumed plaintext word is likely to produce hits, but also tends to produce more false candidates. Moreover, even if there is a hit, it reveals only three letters, which is too short a fragment to help infer the full word in the key. It would be better to start with a word longer than THE.

Suppose we hypothesize the word "enemy" occurs somewhere in the plaintext.

At positions 5n+1, no plausible plaintext fragment (i.e., 5-letter fragment aligning with ENEMY) is found. ("firdo" may be close but does not seem right.)

P:ENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENE
K:uywhtrriqylrcggbrckoszleerbhlaetxvdvemvmdogqvqvckntlemmphisqortuoitwjjvgglkefjxwkgpmifnpomfirdoxplmuezwnsmhynjxndsoovwdsnppghxmxipyqhuxrziizdasgaiinpwckaqmwnmiccsnydcwoxvjswrotlwzxcgb
K:FEWPERSONSCANBEMADETOBELIEVETHATITISNOTQUITEANEASYTHINGTOINVENTAMETHODOFSECRETWRITINGWHICHSHALLBAFFLEINVESTIGATIONYETITMAYBEROUNDLYASSERTEDTHATHUMANINGENUITYCANNOTCONCOCTACIPHERWHICHH

At positions 5n+2, "inves" and "enlet" may be viable.

P:AENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYEN
K:yhnpfxazykratoshatsayicmqxkytmkcodpbnddyjxxyhwetszzuvuyvqzacuakcaocnrvbpxtwkoafiqpguulwgwylriladycugkinvesqpvvdwuaauenletygotdvoqbezycjxizqljjjomorexiitryycwdqoibegpifffhpbnzazunhjips
K:FEWPERSONSCANBEMADETOBELIEVETHATITISNOTQUITEANEASYTHINGTOINVENTAMETHODOFSECRETWRITINGWHICHSHALLBAFFLEINVESTIGATIONYETITMAYBEROUNDLYASSERTEDTHATHUMANINGENUITYCANNOTCONCOCTACIPHERWHICHH

From "inves", "investigation" will be a natural extension (after checking "investigated", "investigating", and "investigate" do not fit).

Applying "investigation" to the original plaintext in turn extends the assumed key ENEMY to ENEMYBUTHISTR, which is good.

P :ENEMYBUTHISTR
C :mazqquczhbahe
K :investigation

Can we extend "enemy but his tr"? We may guess "troops" and possibly even "were" after that. With this extended plaintext, the key can be extended.

P :ENEMYBUTHISTROOPSWERE
C :mazqquczhbahemsiapqrc
K :investigationyetitmay

Extending this further would be possible only by trying many candidates after "were" or "may".

At positions 5n+3, "rcipe" "llowo" may be possible but do not seem to fit.

P:AAENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYE
K:ylwgnjgipsdgcfatgcjikoldyjqhkuwixuxntmugvdgppikcjhlaelghwirkggttiaiwidnvgkewujwqcvplcxcpngxxrcipellowowmmewymdpcdrigkwcmfepfbpbxhjqfhtrjoihtvpsfuaxnoquzapgocmhwuhnxxulowpbhwqilawyruvb
K:FEWPERSONSCANBEMADETOBELIEVETHATITISNOTQUITEANEASYTHINGTOINVENTAMETHODOFSECRETWRITINGWHICHSHALLBAFFLEINVESTIGATIONYETITMAYBEROUNDLYASSERTEDTHATHUMANINGENUITYCANNOTCONCOCTACIPHERWHICHH

At positions 5n+4, "perso" looks good (and is actually correct), but would be hard to extend further than "person". There is another instance of ENEMY near the end, which reveals the correct key fragment "hichh" which I overlooked. It would be hard to recognize this is part of "which human".

P:AAAENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENE K:ylapersoyjlsiorbsiszsarmprcntleuddovfsdxdpmygqwisytmkuxpioaboszcziucruvhmtvegpfhkhvutfovwxfjxlzxqrufeacvdmievuxojazowcldnqvosxndqayrnciraoqkdbyolijtxhclgyxwosqncttgocxufgjtczztmchichh K:FEWPERSONSCANBEMADETOBELIEVETHATITISNOTQUITEANEASYTHINGTOINVENTAMETHODOFSECRETWRITINGWHICHSHALLBAFFLEINVESTIGATIONYETITMAYBEROUNDLYASSERTEDTHATHUMANINGENUITYCANNOTCONCOCTACIPHERWHICHHUMAN

At positions 5n, no plausible plaintext fragment is found.

P:AAAAENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENEMYENE
K:ylatniaaescauuasauyijidsyikzzuvcpjxmnejguxyepheuyhkuwaggqagkfaliizcoxdmpyzevoblqbphacwwhcgwrjrioydaoviobmdqqbdowvgifeormeyhubovpwjpzziriiawtujkuuzrfdqttsegnwewwtbfmxtfglpabofikuonrtpt
K:FEWPERSONSCANBEMADETOBELIEVETHATITISNOTQUITEANEASYTHINGTOINVENTAMETHODOFSECRETWRITINGWHICHSHALLBAFFLEINVESTIGATIONYETITMAYBEROUNDLYASSERTEDTHATHUMANINGENUITYCANNOTCONCOCTACIPHERWHICHHUMAN

Tips

From my experimenting, it is advisable to guess as long a word as possible. Moreover, after partial results are obtained, trying other candidate words will be worthwhile because it may reveal plaintext at other portions in the text.

In the example above, the candidate words I assumed occur only in the plaintext. If a candidate word occurs both in the plaintext and the key, there's nothing (other than the context) that can tell whether the candidate word belongs to the plaintext or the key.

Papers about Algorithms to Solve Running Key Ciphers

The manual solution of running key ciphers is based on whether an assumed plaintext word gives a plausible text fragment for the key. When computing power is available, it is natural to try the same with many candidates systematically. In order to let a computer determine how plausible a text fragment is, many scoring methods are known. Since such scoring may work for word fragments, the ciphertext can be broken into blocks of some tractable length and scores of candidates can be examined for each block.

Blockwise solution (2002)

Craig Bauer and Christian N.S. Tate (2002), "A Statistical Attack on the Running Key Cipher", Cryptologia, 26:4, 274-282. DOI:10.1080/0161-110291890939. (This seems to be followed by Craig P. Bauer and Elliott J. Gottloeb (2005), "Results of an Automated Attack on the Running Key Cipher", Cryptologia, 29:3, 248-254. DOI:10.1080/01611190508951301.)

According to the summary in Griffing (2006), the 1000-letter ciphertext is broken into blocks of length n (up to 6) and each block was solved separately by trying all n-letter keys. For each trial, a score was calculated as a product of the probabilities of the key and the revealed plaintext. The probability is defined by frequencies of n-grams in a training set. About a third of letter pairs were recovered.

Viterbi algorithm

Alexander Griffing (2006), "Solving the Running Key Cipher with the Viterbi Algorithm", Cryptologia, 30:4, 361-367, DOI:10.1080/01611190600789117.

Solving the running key cipher can be viewed as finding the pair of plaintext and key that maximizes the product of their probabilities. Such a combination can be efficiently found by the Viterbi algorithm, which is "a dynamic programming algorithm used for finding the most probable sequence of hidden states, assuming this sequence follows a Markov model". (With the assumption of the nth-order Markov model, the probability of the key letter at the ith position (ki) is determined from the previous n key letters ki-n, ..., ki-1, and the probability of the message letter at the ith position (m(ki)) is determined from the previous n message letters mi-n(ki-n), ..., mi-1(ki-1). The Viterbi algorithm was applied for n-grams with n=1-6.

The author claims the Viterbi algorithm gave better results (median percent of about 87%) than Bauer and Tate (2002) for n-grams longer than four letters.

Reddy and Knight (2012) points out a disadvantage of this is that "it requires searching through about 265 states at each position in the ciphertext" (for 6-grams).

Gibbs sampling (2012)

Sravana Reddy, Kevin Knight (2012), "Decoding Running Key Ciphers", Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, p.80-84, Jeju, Republic of Korea, 8-14 July 2012 (ACL Anthology).

The authors propose "a blocked sampling algorithm that samples words rather than letters in the plaintext". That is, a word (rather than a letter) that maximizes the probability is to be found by sampling.

Specifically, the plaintext (P) is initialized with a random sequence and a corresponding key (R) is set. Then, iterations go through:
(a) sampling spaces (word boundaries) in P according to Pr(P) (the probability of the plaintext);
(b) sampling spaces in R according to Pr(R);
(c) sampling each word in P (with R updated accordingly) according to the joint probability Pr(P)Pr(R); and
(d) sampling each word in R (with P updated accordingly) according to Pr(P)Pr(R).

The authors claim that the Gibbs Sampler is orders of magnitude faster than the Viterbi algorithm and gives close to perfect decipherment for longer ciphertexts.



©2025 S.Tomokiyo
First posted on 30 August 2025. Last modified on 30 August 2025.
Articles on Historical Cryptography
inserted by FC2 system