!filename when prompted for a string. The output is written to filename.vig
(suggest ...), just press enter to use the suggested value
PLAINTEXTMESSAGE
KEYKEYKEYKEYKEYK
Then, each character has a Caesar shift applied, with the amount of shift determined by the corresponding key character. The table below shows exactly what happens. To encrypt a character, find the column with the plaintext character at the top and the row with the key character on the left. The ciphertext character is where this row and column intersect. To decrypt, locate the row with the key character on the left and look along it until you find the ciphertext character. The plaintext character is at the top of that column.
  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
A A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
B B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
C C D E F G H I J K L M N O P Q R S T U V W X Y Z A B
D D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
E E F G H I J K L M N O P Q R S T U V W X Y Z A B C D
F F G H I J K L M N O P Q R S T U V W X Y Z A B C D E
G G H I J K L M N O P Q R S T U V W X Y Z A B C D E F
H H I J K L M N O P Q R S T U V W X Y Z A B C D E F G
I I J K L M N O P Q R S T U V W X Y Z A B C D E F G H
J J K L M N O P Q R S T U V W X Y Z A B C D E F G H I
K K L M N O P Q R S T U V W X Y Z A B C D E F G H I J
L L M N O P Q R S T U V W X Y Z A B C D E F G H I J K
M M N O P Q R S T U V W X Y Z A B C D E F G H I J K L
N N O P Q R S T U V W X Y Z A B C D E F G H I J K L M
O O P Q R S T U V W X Y Z A B C D E F G H I J K L M N
P P Q R S T U V W X Y Z A B C D E F G H I J K L M N O
Q Q R S T U V W X Y Z A B C D E F G H I J K L M N O P
R R S T U V W X Y Z A B C D E F G H I J K L M N O P Q
S S T U V W X Y Z A B C D E F G H I J K L M N O P Q R
T T U V W X Y Z A B C D E F G H I J K L M N O P Q R S
U U V W X Y Z A B C D E F G H I J K L M N O P Q R S T
V V W X Y Z A B C D E F G H I J K L M N O P Q R S T U
W W X Y Z A B C D E F G H I J K L M N O P Q R S T U V
X X Y Z A B C D E F G H I J K L M N O P Q R S T U V W
Y Y Z A B C D E F G H I J K L M N O P Q R S T U V W X
Z Z A B C D E F G H I J K L M N O P Q R S T U V W X Y
ZPYSRROBRWIQCEEO
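The table lookup is equivalent to adding the key letter's position (A = 0, B = 1, ...) to the plaintext letter's position, modulo 26. The following is a minimal Python sketch of that arithmetic, not the program's own code. The function name `vigenere` and its `decrypt` flag are made up for illustration, and the sketch assumes the text and key contain only the uppercase letters A to Z.

```python
from itertools import cycle

A = ord("A")

def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Shift each letter of `text` by the matching letter of the repeated key.

    Assumes `text` and `key` contain only uppercase A-Z.
    """
    sign = -1 if decrypt else 1
    out = []
    # cycle(key) repeats the key so it lines up under the whole message.
    for ch, k in zip(text, cycle(key)):
        shift = sign * (ord(k) - A)
        out.append(chr((ord(ch) - A + shift) % 26 + A))
    return "".join(out)

ciphertext = vigenere("PLAINTEXTMESSAGE", "KEY")
print(ciphertext)                                  # ZPYSRROBRWIQCEEO
print(vigenere(ciphertext, "KEY", decrypt=True))   # PLAINTEXTMESSAGE
```

Decryption simply subtracts the same shift, which corresponds to reading the table from the key row back up to the column header.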
The first step is to determine the length of the key, using the method of coincidence indices. To calculate a coincidence index, shift the ciphertext and compare against itself, counting character matches:
CIPHERTEXT
   CIPHERTEXT
   ----*-*
In this example, for a shift of three, there are two matches out of a possible seven, so the coincidence index is 28.57%. For random text the expected value is 3.85%, but this number tends to be higher for non-random text in any language, because some letters are more frequent than others. For Vigenere ciphertexts, the coincidence index is higher for shifts that are a multiple of the key length than otherwise (can you see why?). The program calculates the first 256 shifts, and finds the shortest length for which all shifts that are a multiple of this length have a coincidence index greater than 6%, which it suggests as the key length.
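The calculation described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the program's actual code: the names `coincidence_index` and `suggest_key_length` are hypothetical, the 6% threshold is written as the fraction 0.06, and how the real program treats shifts close to the length of the text is not known, so here the shifts are simply capped at one less than the text length.

```python
def coincidence_index(text: str, shift: int) -> float:
    """Fraction of overlapping positions where text matches itself shifted."""
    overlap = len(text) - shift
    if overlap <= 0:
        return 0.0
    matches = sum(text[i] == text[i + shift] for i in range(overlap))
    return matches / overlap

def suggest_key_length(text: str, max_shift: int = 256, threshold: float = 0.06):
    """Shortest length whose every multiple has an index above the threshold."""
    limit = min(max_shift, len(text) - 1)
    indices = {s: coincidence_index(text, s) for s in range(1, limit + 1)}
    for length in range(1, limit + 1):
        multiples = range(length, limit + 1, length)
        if all(indices[m] > threshold for m in multiples):
            return length
    return None  # no candidate length passed the test

# The worked example: two matches out of seven for a shift of three.
print(round(coincidence_index("CIPHERTEXT", 3) * 100, 2))  # 28.57
```

On short ciphertexts the indices for large shifts are computed from very few character pairs, so the suggestion is only likely to be reliable when there is plenty of ciphertext.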
Knowing the key length, you can break the ciphertext up into chunks, each of which has been encrypted by the same character in the key. For each chunk, try decrypting it with each possible key letter (A to Z). Compare the letter frequencies in each candidate plaintext with a table of English letter frequencies. The key letter that gives the smallest sum of squared deviations is the most likely candidate, and this is the one the program suggests.
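A rough Python sketch of this step is shown below. It is not taken from the program. The names `squared_deviation` and `suggest_key` are hypothetical, the frequency table holds commonly quoted approximate percentages for English (the program's own table may differ), and the ciphertext is assumed to contain only uppercase A to Z.

```python
from collections import Counter

# Approximate English letter frequencies in percent, A through Z.
# These are commonly quoted values, assumed here for illustration.
ENGLISH_PERCENT = [
    8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15, 0.77,
    4.03, 2.41, 6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06, 2.76, 0.98,
    2.36, 0.15, 1.97, 0.07,
]

def squared_deviation(chunk: str, key_index: int) -> float:
    """Decrypt `chunk` with one key letter and compare with English."""
    counts = Counter((ord(c) - 65 - key_index) % 26 for c in chunk)
    n = len(chunk)
    return sum(
        (100 * counts[i] / n - ENGLISH_PERCENT[i]) ** 2 for i in range(26)
    )

def suggest_key(ciphertext: str, key_length: int) -> str:
    """Pick, for each key position, the letter with the least deviation."""
    key = []
    for offset in range(key_length):
        # Every key_length-th character was encrypted by the same key letter.
        chunk = ciphertext[offset::key_length]
        best = min(range(26), key=lambda k: squared_deviation(chunk, k))
        key.append(chr(best + 65))
    return "".join(key)
```

Each chunk is treated independently, so a wrong guess for one key position does not affect the others and can be corrected by hand.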
To test the cracking code, I used words.txt, a list of many English words, found in the Word Patterns download. With this much ciphertext, keys as long as 200 characters can be uncovered without the user overriding any automatic suggestions.