From the sixteenth to the nineteenth centuries, the most commonly used ciphers employed number groups to represent letters, syllables, and words (though the exact timing of their adoption varied). This article presents some undeciphered numerical ciphers, mainly from the Austrian archives, with a particular focus on a kind in which the number groups are of variable length and written continuously without breaks. For such ciphers, the first step of deciphering is to separate the continuous stream of digits into number groups. Various schemes guide this separation, as I described in "Variable-Length Symbols in Italian Numerical Ciphers" (though sometimes there appears to be no particular rule for such separation, as noted in the section "Miscellaneous" therein). More elaborate schemes for such variable-length codes are documented in the Austrian archives.
Table of Contents:
DECODE R2159 (Non-decrypted, Italian (the language of the cleartext))
DECODE R1408 (Non-decrypted, Italian)
DECODE R2179 (Non-decrypted, Italian)
Adam Starhemberg (1758) (Non-decrypted, German)
Instructions for Continuously Written Figure Codes in Austria
•1. DECODE R2217 (Fixed Length)
•2. DECODE R2212 (False Thousands Digit)
•3. DECODE R2190 (False Thousands Digit) [1780]
•4. DECODE R2207 (Fixed Difference between Digits)
•5. DECODE R2202 (Fixed Number of Subgroups)
•6. DECODE R2197 (Constraint for First Two Digits) [1781]
•7. DECODE R2188 (Fixed Length with Redundant Ones Digit)
(ÖStA_HHStA, Staatskanzlei Interiora, Chiffrenschlüssel, Kt.20. Fasc.27. f.18-19)
This is from Fra Giovanni di Lucca to the Emperor (Ferdinand III), 30 May 1644. In Italian.
The number groups may consist of one or two digits, but there is no difficulty in identifying individual groups because the groups are delimited by dots. The ciphertext can be deciphered to read: "Al principe di trenti...." (Someone versed in Italian would be able to complete the decryption.)
(ÖStA_HHStA, Staatskanzlei Interiora, Chiffrenschlüssel, Kt.14. Fasc.20. f.176)
The only cleartext, at the beginning, "Di Varsouia 24 Xbre 1627" (From Warsaw, 24 December 1627), indicates that the language is Italian.
The following is my transcription. (It should be noted that the numbers and letters are written without spaces in the original manuscript, but I believe my grouping into two- or three-digit groups is fairly straightforward. It is possible that some letters should be grouped (e.g., "zg", "pr", "fi", "ll").)
(ÖStA_HHStA, Staatskanzlei Interiora, Chiffrenschlüssel, Kt.20. Fasc.27. f.54)
I think the beginning up to the symbol 🜨 is trivially parsed as "21 33 65 36 68 41 65 16 19" but I have difficulty in consistently breaking the rest into groups.
(Center dots which seem to separate digits are transcribed as "." There are also lower dots (not transcribed), of which some may separate digit groups and others, placed under a particular digit, may also have some meaning in parsing (see another article). Dots above appear to be simply a part of the digit 1.)
The letters "a r e" are written above the last groups "36 51 38". This may be plaintext, but this kind of note is occasionally just a trace of an unsuccessful hypothesis.
A letter partly in cipher from Georg Adam Starhemberg (Wikipedia) was kindly provided by Alexandre Pillon. This letter is from "Georg Graf und Herr von Starhemberg", Paris, 23 May 1758. Georg Adam Starhemberg was an Imperial envoy in Paris and worked with Count Kaunitz to bring about the rapprochement between Austria and France in 1756.
The cleartext part is in German.
Again, the figures are written without breaks, and parsing them into cipher groups is not trivial. In my transcription below, the division into groups is mine and may be wrong, though it seems certain that 310-314 form groups of three digits. (On line 7 on the verso, there is a large space after "31". At least this instance may not be grouped with the following digit.)
Punctuation marks may be irrelevant because ignoring them yields more consistent parsing. Linebreaks may also be irrelevant. (For example, look for where "31" is followed by a punctuation mark or a linebreak.)
It will be seen that the figures 310-314 are very frequent. There are several schemes for continuously written figures in Austrian codes (see the next section), but such code-based schemes cannot explain the high frequency of these five symbols. It seems more natural to assume they represent letters, e.g., vowels.
Some instructions for continuously written figure codes are found in ÖStA_HHStA (Österreichisches Staatsarchiv, Haus-, Hof- und Staatsarchiv). The following is solely based on the examples (illustrating parsing and deciphering of ciphertext) given in the instructions and the accompanying code tables. Further insights may be obtained by those who can read the text of the instructions.
Some of the examples show that linebreaks in the ciphertext can be irrelevant (though it is possible that some clerks chose not to take the liberty of writing a number group across a linebreak).
Some of the code tables include entries such as "1780", "1781", which seem to indicate when the code was prepared. Despite the variety of the parsing schemes, the similarity of the vocabulary of the code indicates these are all from ca. 1780.
(ÖStA_HHStA Staatskanzlei Interiora, Chiffrenschlüssel, Kt.20. Fasc.27. f.168-171)
This is parsed as follows.
This is a simple case of fixed-length code numbers. (In the example, code numbers beginning with an odd digit are marked "erz" instead of plaintext words. This may be because the first page of the code only includes numbers beginning with an even digit. One may speculate that the second page was not at hand when the instructions were drafted.)
The code used is R2215=R2216. There is a code group "ind: clav: 2" for indicating the key (419).
(ÖStA_HHStA Staatskanzlei Interiora, Chiffrenschlüssel, Kt.20. Fasc.27. f.156-159)
This is parsed as follows.
This is a variable-length scheme with code numbers of three or four digits. The ciphertext is parsed in units of three digits, but when the first two digits of a unit are the same, four digits are grouped into one code number, and the repeated second digit is dropped for table lookup. In this example, after reading "272", "451", "489", the next group is "4423", which is read as "423". That is, the ciphertext is parsed as if it consisted of three- and four-digit code numbers, but the code actually contains only three-digit numbers.
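The parsing rule described above can be sketched in Python as follows. This is only an illustration built from the single worked example; the function name `parse_false_thousands` is hypothetical, and the sketch does not address how the clerk wrote a three-digit code number whose own first two digits coincide.

```python
def parse_false_thousands(digits: str) -> list[str]:
    """Split a continuous digit stream into three-digit code numbers.

    Assumed rule (taken from the worked example only): when the next two
    digits are equal, four digits are consumed and the duplicated leading
    digit is dropped before table lookup.
    """
    groups = []
    i = 0
    while i < len(digits):
        doubled = i + 1 < len(digits) and digits[i] == digits[i + 1]
        size = 4 if doubled else 3
        chunk = digits[i:i + size]
        if len(chunk) != size:
            raise ValueError(f"incomplete group at position {i}: {chunk!r}")
        # "4423" is looked up as "423"; ordinary groups are used as written.
        groups.append(chunk[1:] if doubled else chunk)
        i += size
    return groups


print(parse_false_thousands("2724514894423"))
# ['272', '451', '489', '423']
```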
The code used is R2210=R2211.
There is an entry for "america" (690). While this per se cannot allow specific dating, the similarity of the vocabulary with the codes below indicates this was around 1780.
(ÖStA_HHStA Staatskanzlei Interiora, Chiffrenschlüssel, Kt.20. Fasc.27. f.98-101)
This is parsed as follows.
For four-digit groups, the thousands digit and the hundreds digit are always the same (as with R2212 above).
The code used is R2191=R2192.
There is a code group "indicans clavim 2" (279) for indicating the key.
The vocabulary (877 for "1780"; 811 for "1700") indicates when the code was prepared.
(ÖStA_HHStA Staatskanzlei Interiora, Chiffrenschlüssel, Kt.20. Fasc.27. f.144-147)
This is parsed as follows.
Here, the digits in the parentheses are not in the ciphertext. How can a decipherer know this? The answer is found in the structure of the code, which contains code numbers 0100-9099, in which the hundreds digit is always the thousands digit plus one (e.g., in 3402, 4=3+1). (The addition is cyclic, so 9+1=0.)
In parsing the ciphertext in units of three digits, when the next two digits do not meet this relation, the decipherer should insert a hundreds digit, which should be the thousands digit plus one.
(In encoding, the clerk can omit the hundreds place from time to time, but omission should not be made when the hundreds digit and the tens digit are the same. For example, for "5640", "6" can be safely omitted, but for "5660", if "6" in the hundreds place is omitted, the decipherer would not notice the omission because the remaining "560" satisfies the constraint for four-digit numbers: 6=5+1. A similar problem is addressed by reserving some numbers in R2197 below.)
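A minimal Python sketch of this decision rule, assuming it works exactly as described above (hundreds digit equal to thousands digit plus one, cyclically). The function name `parse_fixed_difference` is hypothetical, and the ciphertext strings are made up from the numbers mentioned in the text (3402, 5640, 5660), not taken from the archival example.

```python
def parse_fixed_difference(digits: str) -> list[str]:
    """Restore four-digit code numbers from a continuous digit stream.

    If the second digit equals the first plus one (mod 10), four digits
    are taken as written. Otherwise three digits are taken and the
    omitted hundreds digit is reinserted.
    """
    groups = []
    i = 0
    while i < len(digits):
        expected = str((int(digits[i]) + 1) % 10)
        if i + 1 < len(digits) and digits[i + 1] == expected:
            chunk, step = digits[i:i + 4], 4
        else:
            # Hundreds digit was omitted by the clerk: put it back.
            chunk, step = digits[i] + expected + digits[i + 1:i + 3], 3
        if len(chunk) != 4:
            raise ValueError(f"incomplete group at position {i}")
        groups.append(chunk)
        i += step
    return groups


# "3402" written in full, then "5640" with its hundreds digit omitted.
print(parse_fixed_difference("3402540"))   # ['3402', '5640']

# The pitfall noted above: "5660" written as "560" is not recognized.
print(parse_fixed_difference("5603402"))   # ['5603', '4502']
```

In the second call, the intended reading 5660, 3402 is lost because "56" already satisfies the constraint, which is why the omission is unsafe in that case.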
The code used is R2205=R2206.
(ÖStA_HHStA Staatskanzlei Interiora, Chiffrenschlüssel, Kt.20. Fasc.27. f.132-135)
Instead of this, the illustrated example is the following. (I have already inserted spaces.)
That is, the ciphertext looks as if the cipher consists of single-digit or two-digit numbers, but actually the decipherer needs to take three of those subgroups delimited by dots to form a cipher symbol.
Numbers such as "4.10.1." or "6.10.10." are not to be read as four- or five-digit groups "4101" or "61010". The code groups are arranged by these subgroups in the code table. That is, in the column for 4, the sequence is: 4.1.1., 4.1.2., ..., 4.9.10., 4.10.1., ..., 4.10.10.
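The grouping by subgroups can be sketched in Python as follows. The function name `parse_subgroups` is hypothetical, and the input string simply concatenates the two symbols quoted above rather than reproducing the archival example.

```python
def parse_subgroups(ciphertext: str, per_symbol: int = 3) -> list[tuple[int, ...]]:
    """Group dot-delimited subgroups into cipher symbols of three subgroups."""
    parts = [p for p in ciphertext.replace(" ", "").split(".") if p]
    if len(parts) % per_symbol:
        raise ValueError("number of subgroups is not a multiple of three")
    return [
        tuple(int(p) for p in parts[i:i + per_symbol])
        for i in range(0, len(parts), per_symbol)
    ]


symbols = parse_subgroups("4.10.1.6.10.10.")
print(symbols)  # [(4, 10, 1), (6, 10, 10)]

# Keeping the subgroups as integers preserves the table order described
# above: 4.9.10. precedes 4.10.1., which a digit-string comparison
# ("4910" vs "4101") would get wrong.
print((4, 9, 10) < (4, 10, 1))  # True
```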
The code used is R2200=R2201.
The entry "9.5.1." is for "1700" and "10.5.9." is for "178". The latter seems to provide for the years in the 1780s.
(ÖStA_HHStA Staatskanzlei Interiora, Chiffrenschlüssel, Kt.20. Fasc.27. f.116-119)
This is parsed as follows.
This may be considered as an extension of DECODE R2207 (Fixed Difference between Digits) above. That is, the code numbers have more variety than R2207 but the constraints for the first two digits allow the decipherer to know whether three or four digits should be taken for parsing.
While the code table consists of four-digit code numbers, the first two digits satisfy one of the following constraints:
(i) the first and second pages have the hundreds digit equal to 0;
(ii) the third and fourth pages have the thousands digit equal to the hundreds digit (leaving blank where the tens digit is also the same, which appears to address the need for special care in encoding mentioned for R2207 above); and
(iii) the fifth and sixth pages have the hundreds digit equal to the thousands digit plus one.
When the first two digits do not match any of the patterns A:0, A:A, or A:(A+1), three digits ABC are read instead of four. Since the code table does not provide for three-digit code groups, a digit needs to be inserted to form a four-digit symbol conforming to one of the constraints.
For the first six instances, 0 is inserted in the hundreds place to conform to the constraint (i). For example, "095" is padded to "0095", which reads "courier".
The next instance, 425, is padded according to (iii) to become 4525, which reads "coma" (this word also occurs at 2073 and 7741 in the code table). 427 is turned into 4427 according to (ii), which reads "in". I have not identified the rule that determines which constraint to apply for padding.
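The choice between three and four digits, and the possible paddings of a three-digit group, can be sketched in Python as follows. Since the rule for choosing among constraints (i) to (iii) is not identified, the sketch only lists all three candidates. The function names are hypothetical, and the test stream is assembled from numbers mentioned in this section (095, 425, 427, 4480), not copied from the archival example.

```python
def takes_four_digits(a: int, b: int) -> bool:
    """True when the leading pair AB matches A:0, A:A, or A:(A+1)."""
    return b == 0 or b == a or b == (a + 1) % 10


def split_groups(digits: str) -> list[str]:
    """Split a digit stream into groups as written (three or four digits)."""
    groups = []
    i = 0
    while i < len(digits):
        if i + 1 >= len(digits):
            raise ValueError("dangling digit at end of stream")
        a, b = int(digits[i]), int(digits[i + 1])
        size = 4 if takes_four_digits(a, b) else 3
        chunk = digits[i:i + size]
        if len(chunk) != size:
            raise ValueError(f"incomplete group at position {i}")
        groups.append(chunk)
        i += size
    return groups


def padding_candidates(group: str) -> dict[str, str]:
    """Four-digit numbers obtainable from ABC under constraints (i)-(iii)."""
    a, rest = group[0], group[1:]
    successor = str((int(a) + 1) % 10)
    return {"i": a + "0" + rest, "ii": a + a + rest, "iii": a + successor + rest}


for group in split_groups("0954254274480"):
    if len(group) == 3:
        print(group, padding_candidates(group))
    else:
        print(group, "(complete)")
```

For 425 the candidates are 4025, 4425, and 4525; the instructions' example uses the last one.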
Apart from this, the code is faulty because the alternative constraints overlap, so that the same code number can arise under more than one constraint! For example, 0095 occurs on the first page (for "courier") and the third page (for "li"), because "00" satisfies both constraints (i) and (ii). The number 9032 occurs on the second page and the sixth page, because "90" satisfies both (i) and (iii) (9+1=0).
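The extent of this overlap can be checked with a short enumeration. The following Python sketch assumes constraints (i) to (iii) exactly as stated above; the function name `constraints_met` is hypothetical.

```python
from itertools import product


def constraints_met(number: str) -> set[str]:
    """Return which of the constraints (i)-(iii) the first two digits satisfy."""
    a, b = int(number[0]), int(number[1])
    met = set()
    if b == 0:
        met.add("i")
    if b == a:
        met.add("ii")
    if b == (a + 1) % 10:
        met.add("iii")
    return met


# Leading digit pairs that satisfy more than one constraint.
ambiguous = sorted(
    f"{a}{b}"
    for a, b in product(range(10), repeat=2)
    if len(constraints_met(f"{a}{b}00")) > 1
)
print(ambiguous)                # ['00', '90']
print(constraints_met("0095"))  # constraints i and ii
print(constraints_met("9032"))  # constraints i and iii
```

Under these assumptions, only numbers beginning with "00" or "90" are affected, which matches the two examples given above.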
The code used is R2194=R2193. The vocabulary (0069, 9028, 9948 for "america"; 2069, 6710, and 8870 for "1781"; 4480, 6778, and 8090 for "1700") suggests this was prepared at the time of the American Revolutionary War.
The example message includes code groups for "ind: clav: 2" and "ind: clav: 3" for indicating the key.
(ÖStA_HHStA Staatskanzlei Interiora, Chiffrenschlüssel, Kt.20. Fasc.27. f.93-94)
This is parsed as follows.
All the code numbers consist of four digits, so there is no problem in parsing.
Let us denote the four-digit code numbers as ABCD (where A is the thousands digit etc.). The code numbers are arranged in the code table in the order of: those with B=A, followed by those with B=A+1, B=A+2, B=A+3, B=A+4, B=A+5, B=A+6, B=A+7, B=A+8, and B=A+9. (Again, the addition is cyclic.) Although the words in the code table are in alphabetical order, this special arrangement of numbers introduces some irregularity in the correspondence between the words and numbers.
Thus, the code number has no redundancy in terms of parsing, and omission of a digit is not allowed. However, code numbers differing only in the ones digit mean the same word. For example, the numbers 1010-1019 all mean "vor".
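A minimal Python sketch of these two properties, assuming that table lookup can be modeled by simply ignoring the ones digit (consistent with 1010-1019 all meaning "vor"). The function names are hypothetical, and the ordering of numbers within each difference class is not modeled.

```python
def split_fixed(digits: str, width: int = 4) -> list[str]:
    """Split a digit stream into fixed-length code numbers."""
    if len(digits) % width:
        raise ValueError("stream length is not a multiple of the group width")
    return [digits[i:i + width] for i in range(0, len(digits), width)]


def difference_class(code: str) -> int:
    """Cyclic difference B - A that determines the section of the table."""
    a, b = int(code[0]), int(code[1])
    return (b - a) % 10


def lookup_key(code: str) -> str:
    """Drop the redundant ones digit (an assumption of this sketch)."""
    return code[:-1]


for code in split_fixed("10101019"):
    print(code, difference_class(code), lookup_key(code))
# 1010 9 101
# 1019 9 101
```

Both numbers fall in the last section of the table (B=A+9) and share the same lookup key.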
The code used is R2185=R2186.
Héder, M.; Megyesi, B. The DECODE Database of Historical Ciphers and Keys: Version 2. In: Dahlke, C.; Megyesi, B. (eds.) Proceedings of the 5th International Conference on Historical Cryptology HistoCrypt 2022. Linköping, Sweden: LiU E-Press (2022), pp. 111-114.
Megyesi, B.; Esslinger, B.; Fornés, A.; Kopal, N.; Láng, B.; Lasry, G.; Leeuw, K. de; Pettersson, E.; Wacker, A.; Waldispühl, M. Decryption of historical manuscripts: the DECRYPT project. Cryptologia 44:6 (2020), pp. 545-559.
Megyesi, B.; Blomqvist, N.; Pettersson, E. The DECODE Database: Collection of Historical Ciphers and Keys. In: Proceedings of the 2nd International Conference on Historical Cryptology, HistoCrypt 2019, June 23-25, 2019, Mons, Belgium. NEALT Proceedings Series 37, Linköping Electronic Press (2019).