The Code Book: The Secret History of Codes and Code-breaking

Author: Simon Singh

Year written: 2018

Although Suetonius mentions only a Caesar shift of three places, it is clear that by using any shift between 1 and 25 places it is possible to generate 25 distinct ciphers. In fact, if we do not restrict ourselves to shifting the alphabet and permit the cipher alphabet to be any rearrangement of the plain alphabet, then we can generate an even greater number of distinct ciphers. There are over 400,000,000,000,000,000,000,000,000 such rearrangements, and therefore the same number of distinct ciphers.
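The two counts above can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `caesar_alphabet` is made up for the example, and the 26-letter English alphabet is assumed.

```python
import math
import string

ALPHABET = string.ascii_lowercase  # assumption: 26-letter English alphabet

def caesar_alphabet(shift: int) -> str:
    """Return the cipher alphabet produced by a Caesar shift of `shift` places."""
    return (ALPHABET[shift:] + ALPHABET[:shift]).upper()

# Shifts of 1 to 25 places each give a different cipher alphabet.
shift_alphabets = {caesar_alphabet(s) for s in range(1, 26)}
print(caesar_alphabet(3))    # DEFGHIJKLMNOPQRSTUVWXYZABC
print(len(shift_alphabets))  # 25

# Any rearrangement of the 26 letters is allowed in the general case: 26!
print(math.factorial(26))    # 403291461126605635584000000
```

The factorial counts every ordering of the alphabet, including the 26 that are plain shifts, which is why the text rounds it to "over 400,000,000,000,000,000,000,000,000".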

Each distinct cipher can be considered in terms of a general encrypting method, known as the algorithm, and a key, which specifies the exact details of a particular encryption. In this case, the algorithm involves substituting each letter in the plain alphabet with a letter from a cipher alphabet, and the cipher alphabet is allowed to consist of any rearrangement of the plain alphabet. The key defines the exact cipher alphabet to be used for a particular encryption. The relationship between the algorithm and the key is illustrated in Figure 4.

Figure 4 To encrypt a plaintext message, the sender passes it through an encryption algorithm. The algorithm is a general system for encryption, and needs to be specified exactly by selecting a key. Applying the key and algorithm together to a plaintext generates the encrypted message, or ciphertext. The ciphertext may be intercepted by an enemy while it is being transmitted to the receiver, but the enemy should not be able to decipher the message. However, the receiver, who knows both the key and the algorithm used by the sender, is able to turn the ciphertext back into the plaintext message.

An enemy studying an intercepted scrambled message may have a strong suspicion of the algorithm, but would not know the exact key. For example, they may well suspect that each letter in the plaintext has been replaced by a different letter according to a particular cipher alphabet, but they are unlikely to know which cipher alphabet has been used. If the cipher alphabet, the key, is kept a closely guarded secret between the sender and the receiver, then the enemy cannot decipher the intercepted message. The significance of the key, as opposed to the algorithm, is an enduring principle of cryptography. It was definitively stated in 1883 by the Dutch linguist Auguste Kerckhoffs von Nieuwenhof in his book La Cryptographie militaire: ‘Kerckhoffs’ Principle: The security of a cryptosystem must not depend on keeping secret the crypto-algorithm. The security depends only on keeping secret the key.’

In addition to keeping the key secret, a secure cipher system must also have a wide range of potential keys. For example, if the sender uses the Caesar shift cipher to encrypt a message, then encryption is relatively weak because there are only 25 potential keys. From the enemy’s point of view, if they intercept the message and suspect that the algorithm being used is the Caesar shift, then they merely have to check the 25 possibilities. However, if the sender uses the more general substitution algorithm, which permits the cipher alphabet to be any rearrangement of the plain alphabet, then there are 400,000,000,000,000,000,000,000,000 possible keys from which to choose. One such is shown in Figure 5. From the enemy’s point of view, if the message is intercepted and the algorithm is known, there is still the horrendous task of checking all possible keys. If an enemy agent were able to check one of the 400,000,000,000,000,000,000,000,000 possible keys every second, it would take roughly a billion times the lifetime of the universe to check all of them and decipher the message.
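The contrast between the two key spaces can be sketched in Python. The intercepted message, the function name `caesar_decrypt`, and the figures used for the length of a year and the age of the universe are assumptions chosen for illustration, not values taken from the text.

```python
import math

def caesar_decrypt(ciphertext: str, shift: int) -> str:
    """Undo a Caesar shift of `shift` places; non-letters pass through unchanged."""
    out = []
    for ch in ciphertext.upper():
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") - shift) % 26 + ord("a")))
        else:
            out.append(ch)
    return "".join(out)

# Brute force against the Caesar shift: only 25 candidates to read through.
intercepted = "YHQL, YLGL, YLFL"  # illustrative message, shifted by three places
candidates = {shift: caesar_decrypt(intercepted, shift) for shift in range(1, 26)}
print(candidates[3])  # veni, vidi, vici

# The same attack on the general substitution cipher, at one key per second.
total_keys = math.factorial(26)
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumption: Julian year
UNIVERSE_AGE_YEARS = 13.8e9               # assumption: roughly 13.8 billion years
years_needed = total_keys / SECONDS_PER_YEAR
universe_lifetimes = years_needed / UNIVERSE_AGE_YEARS
print(f"{years_needed:.1e} years, about {universe_lifetimes:.1e} universe lifetimes")
```

Under these assumptions the second figure comes out on the order of a billion universe lifetimes, consistent with the estimate in the paragraph above.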

Plain alphabet:  a b c d e f g h i j k l m n o p q r s t u v w x y z
Cipher alphabet: J L P A W I Q B C T R Z Y D S K E G F X H U O N V M

Plaintext:  et tu, brute?
Ciphertext: WX XH, LGHXW?

Figure 5 An example of the general substitution algorithm, in which each letter in the plaintext is substituted with another letter according to a key. The key is defined by the cipher alphabet, which can be any rearrangement of the plain alphabet.
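The example in Figure 5 can be reproduced with a short Python sketch. The function names `encrypt` and `decrypt` are illustrative; the key is the cipher alphabet from the figure, and only the 26 unaccented letters are handled.

```python
import string

PLAIN = string.ascii_lowercase
KEY = "JLPAWIQBCTRZYDSKEGFXHUONVM"  # cipher alphabet from Figure 5

def encrypt(plaintext: str, key: str) -> str:
    """Replace each plain letter with the letter in the same position of the key."""
    table = str.maketrans(PLAIN, key)
    return plaintext.lower().translate(table)

def decrypt(ciphertext: str, key: str) -> str:
    """Reverse the substitution by mapping key letters back to plain letters."""
    table = str.maketrans(key, PLAIN)
    return ciphertext.upper().translate(table)

print(encrypt("et tu, brute?", KEY))  # WX XH, LGHXW?
print(decrypt("WX XH, LGHXW?", KEY))  # et tu, brute?
```

The algorithm (letter-for-letter substitution) lives in the two functions, while the key is just the 26-letter string passed in, which mirrors the separation between algorithm and key described earlier.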

The beauty of this type of cipher is that it is easy to implement, but provides a high level of security. It is easy for the sender to define the key, which consists merely of stating the order of the 26 letters in the rearranged cipher alphabet, and yet it is effectively impossible for the enemy to check all possible keys by the so-called brute-force attack. The simplicity of the key is important, because the sender and receiver have to share knowledge of the key, and the simpler the key, the less the chance of a misunderstanding.

In fact, an even simpler key is possible if the sender is prepared to accept a slight reduction in the number of potential keys. Instead of randomly rearranging the plain alphabet to achieve the cipher alphabet, the sender chooses a keyword or keyphrase. For example, to use JULIUS CAESAR as a keyphrase, begin by removing any spaces and repeated letters (JULISCAER), and then use this as the beginning of the jumbled cipher alphabet. The remainder of the cipher alphabet is merely the remaining letters of the alphabet, in their correct order, starting where the keyphrase ends. Hence, the cipher alphabet would read as follows.

Plain alphabet:  a b c d e f g h i j k l m n o p q r s t u v w x y z
Cipher alphabet: J U L I S C A E R T V W X Y Z B D F G H K M N O P Q
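The keyphrase construction can be sketched in Python as follows. The function name `keyphrase_alphabet` is illustrative, and the sketch assumes a non-empty keyphrase written with unaccented letters.

```python
import string

def keyphrase_alphabet(keyphrase: str) -> str:
    """Build a cipher alphabet from a keyphrase, as described above."""
    # Drop spaces and repeated letters, keeping the first occurrence of each.
    letters = (c for c in keyphrase.upper() if c in string.ascii_uppercase)
    prefix = "".join(dict.fromkeys(letters))
    # Continue through the alphabet from where the keyphrase ends, wrapping at Z.
    start = string.ascii_uppercase.index(prefix[-1]) + 1
    rotated = string.ascii_uppercase[start:] + string.ascii_uppercase[:start]
    rest = "".join(c for c in rotated if c not in prefix)
    return prefix + rest

print(keyphrase_alphabet("JULIUS CAESAR"))  # JULISCAERTVWXYZBDFGHKMNOPQ
```

The result can be used as the key for any general substitution routine, since it is still a rearrangement of all 26 letters.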

The advantage of building a cipher alphabet in this way is that it is easy to memorise the keyword or keyphrase, and hence the cipher alphabet. This is important, because if the sender has to keep the cipher alphabet on a piece of paper, the enemy can capture the paper, discover the key, and read any communications that have been encrypted with it. However, if the key can be committed to memory it is less likely to fall into enemy hands. Clearly the number of cipher alphabets generated by keyphrases is smaller than the number of cipher alphabets generated without restriction, but the number is still immense, and it would be effectively impossible for the enemy to unscramble a captured message by testing all possible keyphrases.

This simplicity and strength meant that the substitution cipher dominated the art of secret writing throughout the first millennium AD. Codemakers had evolved a system for guaranteeing secure communication, so there was no need for further development – without necessity, there was no need for further invention. The onus had fallen upon the codebreakers, those who were attempting to crack the substitution cipher. Was there any way for an enemy interceptor to unravel an encrypted message? Many ancient scholars considered that the substitution cipher was unbreakable, thanks to the gigantic number of possible keys, and for centuries this seemed to be true. However, codebreakers would eventually find a shortcut to the process of exhaustively searching all keys. Instead of taking billions of years to crack a cipher, the shortcut could reveal the message in a matter of minutes. The breakthrough occurred in the East, and required a brilliant combination of linguistics, statistics and religious devotion.

The Arab Cryptanalysts

At the age of about forty, Muhammad began regularly visiting an isolated cave on Mount Hira just outside Mecca. This was a retreat, a place for prayer, meditation and contemplation. It was during a period of deep reflection, around AD 610, that he was visited by the archangel Gabriel, who proclaimed that Muhammad was to be the messenger of God. This was the first of a series of revelations which continued until Muhammad died some twenty years later. The revelations were recorded by various scribes during the Prophet’s life, but only as fragments, and it was left to Abū Bakr, the first caliph of Islam, to gather them together into a single text. The work was continued by Umar, the second caliph, and his daughter Hafsa, and was eventually completed by Uthmān, the third caliph. Each revelation became one of the 114 chapters of the Koran.

The ruling caliph was responsible for carrying on the work of the Prophet, upholding his teachings and spreading his word. Between the appointment of Abū Bakr in 632 and the death of the fourth caliph, Alī, in 661, Islam spread until half of the known world was under Muslim rule. Then in 750, after a century of consolidation, the start of the Abbasid caliphate (or dynasty) heralded the golden age of Islamic civilisation. The arts and sciences flourished in equal measure. Islamic craftsmen bequeathed us magnificent paintings, ornate carvings, and the most elaborate textiles in history, while the legacy of Islamic scientists is evident from the number of Arabic words that pepper the lexicon of modern science, such as algebra, alkaline and zenith.

The richness of Islamic culture was in large part the result of a wealthy and peaceful society. The Abbasid caliphs were less interested than their predecessors in conquest, and instead concentrated on establishing an organised and affluent society. Lower taxes encouraged businesses to grow and gave rise to greater commerce and industry, while strict laws reduced corruption and protected the citizens. All of this relied on an effective system of administration, and in turn the administrators relied on secure communication achieved through the use of encryption. As well as encrypting sensitive affairs of state, it is documented that officials protected tax records, demonstrating a widespread and routine use of cryptography. Further evidence comes from many administrative manuals, such as the tenth-century Adab al-Kuttāb (‘The Secretaries’ Manual’), which include sections devoted to cryptography.

The administrators usually employed a cipher alphabet which was simply a rearrangement of the plain alphabet, as described earlier, but they also used cipher alphabets that contained other types of symbols. For example, a in the plain alphabet might be replaced by # in the cipher alphabet, b might be replaced by +, and so on. The monoalphabetic substitution cipher is the general name given to any substitution cipher in which the cipher alphabet consists of either letters or symbols, or a mix of both. All the substitution ciphers that we have met so far come within this general category.

Had the Arabs merely been familiar with the use of the monoalphabetic substitution cipher, they would not warrant a significant mention in any history of cryptography. However, in addition to employing ciphers, the Arab scholars were also capable of destroying ciphers. They in fact invented cryptanalysis, the science of unscrambling a message without knowledge of the key. While the cryptographer develops new methods of secret writing, it is the cryptanalyst who struggles to find weaknesses in these methods in order to break into secret messages. Arabian cryptanalysts succeeded in finding a method for breaking the monoalphabetic substitution cipher, a cipher that had remained invulnerable for several centuries.

Cryptanalysis could not be invented until a civilisation had reached a sufficiently sophisticated level of scholarship in several disciplines, including mathematics, statistics and linguistics. The Muslim civilisation provided an ideal cradle for cryptanalysis, because Islam demands justice in all spheres of human activity, and achieving this requires knowledge, or ilm. Every Muslim is obliged to pursue knowledge in all its forms, and the economic success of the Abbasid caliphate meant that scholars had the time, money and materials required to fulfil their duty. They endeavoured to acquire the knowledge of previous civilisations by obtaining Egyptian, Babylonian, Indian, Chinese, Farsi, Syriac, Armenian, Hebrew and Roman texts and translating them into Arabic. In 815, the Caliph al-Ma’mūn established in Baghdad the Bait al-Hikmah (‘House of Wisdom’), a library and centre for translation.

At the same time as acquiring knowledge, the Islamic civilisation was able to disperse it, because it had procured the art of paper-making from the Chinese. The manufacture of paper gave rise to the profession of warraqīn, or ‘those who handle paper’, human photocopying machines who copied manuscripts and supplied the burgeoning publishing industry. At its peak, tens of thousands of books were published every year, and in just one suburb of Baghdad there were over a hundred bookshops. As well as such classics as Tales from the Thousand and One Nights, these bookshops also sold textbooks on every imaginable subject, and helped to support the most literate and learned society in the world.

In addition to a greater understanding of secular subjects, the invention of cryptanalysis also depended on the growth of religious scholarship. Major theological schools were established in Basra, Kufa and Baghdad, where theologians scrutinised the revelations of Muhammad as contained in the Koran. The theologians were interested in establishing the chronology of the revelations, which they did by counting the frequencies of words contained in each revelation. The theory was that certain words had evolved relatively recently, and hence if a revelation contained a high number of these newer words, this would indicate that it came later in the chronology. Theologians also studied the Hadīth, which consists of the Prophet’s daily utterances. They tried to demonstrate that each statement was indeed attributable to Muhammad. This was done by studying the etymology of words and the structure of sentences, to test whether particular texts were consistent with the linguistic patterns of the Prophet.

Significantly, the religious scholars did not stop their scrutiny at the level of words. They also analysed individual letters, and in particular they discovered that some letters are more common than others. The letters a and l are the most common in Arabic, partly because of the definite article al-, whereas the letter j appears only a tenth as frequently. This apparently innocuous observation would lead to the first great breakthrough in cryptanalysis.

Although it is not known who first realised that the variation in the frequencies of letters could be exploited in order to break ciphers, the earliest known description of the technique is by the ninth-century scientist Abū Yūsūf Ya’qūb ibn Is-hāq ibn as-Sabbāh ibn ‘omrān ibn Ismaīl al-Kindī. Known as ‘the philosopher of the Arabs’, al-Kindī was the author of 290 books on medicine, astronomy, mathematics, linguistics and music. His greatest treatise, which was rediscovered only in 1987 in the Sulaimaniyyah Ottoman Archive in Istanbul, is entitled A Manuscript on Deciphering Cryptographic Messages; the first page is shown in Figure 6. Although it contains detailed discussions on statistics, Arabic phonetics and Arabic syntax, al-Kindī’s revolutionary system of cryptanalysis is encapsulated in two short paragraphs:

One way to solve an encrypted message, if we know its language, is to find a different plaintext of the same language long enough to fill one sheet or so, and then we count the occurrences of each letter. We call the most frequently occurring letter the ‘first’, the next most occurring letter the ‘second’, the following most occurring letter the ‘third’, and so on, until we account for all the different letters in the plaintext sample.

Then we look at the ciphertext we want to solve and we also classify its symbols. We find the most occurring symbol and change it to the form of the ‘first’ letter of the plaintext sample, the next most common symbol is changed to the form of the ‘second’ letter, and the following most common symbol is changed to the form of the ‘third’ letter, and so on, until we account for all symbols of the cryptogram we want to solve.
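The two steps in this passage can be sketched in Python. This is a bare-bones illustration of rank-for-rank matching only, not a complete codebreaking tool: the function names are invented for the example, only alphabetic symbols are counted, and on a short message the guess will usually need correcting by hand because real letter frequencies never match a sample exactly.

```python
from collections import Counter

def rank_symbols(text: str) -> list:
    """List the letters of `text` from most to least frequent."""
    counts = Counter(c for c in text.lower() if c.isalpha())
    # most_common() sorts by count, highest first; ties keep first-seen order.
    return [symbol for symbol, _ in counts.most_common()]

def frequency_guess(ciphertext: str, sample: str) -> str:
    """Swap the n-th most common cipher symbol for the n-th most common sample letter."""
    mapping = dict(zip(rank_symbols(ciphertext), rank_symbols(sample)))
    # Symbols with no counterpart in the sample, and non-letters, are left as they are.
    return "".join(mapping.get(c.lower(), c) for c in ciphertext)

# Toy data with unambiguous frequencies, chosen so the ranking is clear.
sample_plaintext = "aaaabbbccd"
cryptogram = "XXXXYYYZZW"
print(frequency_guess(cryptogram, sample_plaintext))  # aaaabbbccd
```

The sample text plays the role of the "different plaintext of the same language" in the quotation, and the cryptogram is classified in the same way before the two rankings are paired off.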

Figure 6 The first page of al-Kindī’s manuscript On Deciphering Cryptographic Messages, containing the oldest known description of cryptanalysis by frequency analysis. Ibrahim A. Al-Kadi and Mohammed Mrayati, King Saud University, Riyadh.
