This essay will be centered about an ignorant layman's approach to the design of symmetric ciphers, and I apologize in advance for the terminology, which may seem to some people simply strange, to others outright offensive. Some time ago I got hold of Bruce Schneier's book "Applied Cryptography", and as it usually happens, along with the answers to some of my questions, I got new questions, which are for the moment without answers. Main question - should the algorithm of the cipher be open for analysis or should it be unavailable to cryptanalyst? The consensus is that the algorithm should be made public, and all the security must be based on the key. This seems reasonable, because such requirement closes the door for "snake-oil" promoters, whose algorithms cannot be scrutinized. On the other hand, exact knowledge of the algorithm allows very finely tuned attacks, which exploit the minute details of the ciphering mechanism. Quoting Schneier: "...chosen-key attack, which exploits the fact that all rounds are identical and that the key schedule is just a cyclic shift by 32 bits" (p.327) "...examined Blowfish with known S-boxes..." (p.339) and so on. In course of this reading, by an unknown memory twist, I remembered a funny word combination - "Problem of a Drunken Sailor". Actually this is a serious mathematical problem, conventionally known as the problem of random walk. The essence of this problem can be visualized if one imagines a drunken sailor on a street crossing in an unknown city. The guy has about the equal probability to go in any of the 4 possible directions. On the next crossing there will be again 1 out of 4 choice, then again and again, until he comes to the harbor (if this ever happens). And then a crazy idea came to my mind - why not introduce some trick which will deny the cryptanalyst the access to his main weapon? Why not make the algorithm itself key-dependent and plaintext-dependent. In this case our friend cryptanalyst will be thrown back to square zero, because there will be nothing certain to cryptanalyze - only the "C" code which will explain how the choices are made between different algorithms. Elaborating a little further, the number of possible choices must be greater than the combined number of all possible keys and all possible plaintexts. Then the brute-force attack will really be the only way - all other ways will require more plaintexts or more keys than can exist. Now the purists are advised to close their eyes and to read the rest of the text with the eyes closed, because the main concept which will be discussed here is the concept of BOOZE. There are two variable parts processed by the encryption algorithm in order to produce ciphertext - key and plaintext. Accordingly, if the choices in encryption procedure will be dictated by the key alone, we will call this Master Booze, if the choices will be dictated by the plaintext, we will call this Peer Booze. Both are necessary in varying proportions, and it is my intent to draw attention to the fact that a big amount of high quality Peer Booze is present in many successful block ciphers. Neglect of the amount or quality of Booze (both Master and Peer) facilitates the cryptanalysis enormously. --- MASTER BOOZE BREWERY (Stuffing the cellar) --- BOOZE (orig. boo-zah) * word of Mongolian origin, denotes a beverage which makes people aggressive and delirious. * (amer.slang) strong liquor. Conventionally this procedure is known as Key Schedule, and its purpose boils down to generating some bytes or words which will be used for subsequent encryption. The generated stuff is known as Subkeys, and different ciphers use different number of those and different generating procedures. IDEA generates its subkeys by the 25-bit cyclic shift of the key, FEAL uses a special "Fk" function, BLOWFISH uses itself and the hex digits of Pi, and so on. As a result the generated subkeys vary greatly in several respects: For some ciphers there exist weak keys which produce subkeys with complementation properties, symmetry, etc. Insufficient number of subkeys restricts possibilities for plaintext-dependent choices in the algorithm (little to choose from). Not always do all the subkeys depend on every bit of the key, (which is conventionally called the avalanche property). Sometimes it is necessary to recompute all the preceding subkeys in order to compute a particular subkey needed at the moment (this can be important in case of memory-restricted applications like smartcards, where you cannot keep all Booze in memory). Without elaborating the question in more detail, I am going here to propose a procedure with the following properties: * Computationally easy - the only operations used are addition and multiplication. * Capable of producing an arbitrary number of randomized subkeys. * Change of any bit of the key changes the entire Master Booze array beyond recognition - complete avalanche. * No keys produce the subkeys with symmetry, complementation or other undesirable properties - at least to my knowledge. * Any subkey can be esily recomputed without computing all the previous ones. This ease and features come at a price - some additional material is necessary for this procedure to work. Fortunately, courtesy of Bruce Schneier, we have this required ingredient - the hex digits of Pi, and a reference to a Web site, where one can find more. The proposed procedure is inspired by the convolution algorithm, which is very common in digital filtering studies. Essentially the convolution is described by the following formula (it is not easy to reproduce it in the ASCII text file) If K[n] is the key, and G[i] is some random byte sequence: ********************************************************* n=N CONV[m] = SUM {G[n+m]*K[n]} ; where N is the number of points in K[n] n=1 ********************************************************* Obviously, the number of points in G[i] must be big enough to allow this procedure. The file with the hex digits of Pi referenced by B. Schneier has 65536 hex digits, which should be sufficient for any realistic purpose. In practice, the algorithm works equally well for bytes, short or long integers, the random sequence just does not care. Both cipher packages include a "C" program which implements the algorithm in byte version. Multiplication is unsigned Mod 255, all eventual carries and overflows are disregarded (anyway, this is meant to be a one-way procedure). For purists who don't like the mentioning of Booze, I have an alternative name for this program - PUKS, which stands for "Proposed Universal Key Scheduler". Having computed as many subkeys as desired, we can now proceed further and develop some more sophisticated beverages. These will provide vast opportunities for some parts of the plaintext to present random choices to another parts, choices dependent on the plaintext itself and on the assortment of strong Booze in the cellar. --- Taking them to the Saloon (two case studies) --- Twelve drunken men On the chest of the dead... Yo-ho-ho! And a bottle of rum! In order to appreciate the difference between a sober cipher and a drunken one, I decided to take two well known ciphers as guinea pigs and pour a good amount of Booze down their throat. Again thanks to the courtesy of Mr. Schneier the choice was obvious - FEAL-8, abandoned even by its creators in favor of FEAL-NX, and subject to all kinds of cryptoanalysis, and MMB, which Bruce Schneier declared simply "dead". It was interesting to see, could these two "corpses" be revitalized by offering them a good drink. The source code is easily found on the WWW, as well as digits of Pi and other random stuff. Case study 1: FEAL-8 This is a classical Feistel cipher with 64-bit block and 8 rounds of encryption. Even the superficial study of the encryption procedure reveals an important shortcoming - this cipher used only Master Booze, the subkeys for consecutive rounds were predetermined, and although each round used different subkeys, the plaintext peers had no voice at all in this process, all the sequence was determined by Master. This was wrong, so the changes were to happen precisely at this level. The heart of the change is in one single line of code in the routine for the evaluation of "f" function: WAS: B.All = BB ; Subkey provided by external routine IS NOW: B.All = K[A.Byte[0]] ^ K[A.Byte[1]] ^ K[A.Byte[2]] ^ K[A.Byte[3]]; Subkey selected from an array, using plaintext as a source of indices. Well, in order to make this one change possible, several other changes had to be done as well. Obviously, since the indices are 8 bits long, the array of subkeys must have at least 256 elements (practically it has 264, the extra 8 used in the initial and final XOR-ing). The subkeys used in FEAL-8 are 16 bits long, so this means that 528 bytes of Booze needed to be generated. It made sense to generate all the Booze in the setup procedure, the array dimensions were chosen to accomodate the increased number of entries. Subkeys K[0] to K[255] are used in the rounds, subkeys K[256] to K[263], as mentioned, are used for XOR-ing. The "SetKey" routine takes over after the "MasterBooze" routine, uses the generated subkey bytes and "stuffs the cellar". The "FK" function from the original FEAL-8 is not needed anymore, so it was deleted. The revamped cipher was compiled and tested (very superficially, I am not a pro in cipher testing). The change of any bit in the key changes all the subkeys, that is natural, it could not be otherwise. Change of one bit in the plaintext produces an avalanche-like change in the ciphertext starting from the 2-nd round, and every 2 rounds thereafter. It is verified that all bits of the plaintext have a fair and equal chance to vote in the Peer Booze selection - they choose their Peer Booze out of 2^16 possible variants, give or take a dozen. Now we can make a quick estimate - how many choices does the cipher present to a cryptanalyst. Sure, this is not the clear-cut case of random algorithm, only parameters are random, but this is already something that is worth estimating. First, there is initial and final XOR-ing with unknown 64-bit numbers (remember the 16 extra subkey bytes). This gives 128 bits of Master Booze, already a significant amount of confusion. Then the 8 rounds with 16-bit subkeys contribute additional 128 bits of Peer Booze. In my humble view, the total amount of Booze consumed in this cipher is close to 256 bits, so the only way of breaking it will be the brute force guessing of the plaintext. Remember that the key can be up to 64 bytes long (512 bits), and it has more security than the Master Booze and Peer Booze combined. Brute force cracking the key is out of question. Case study 2: MMB This was a far more demanding customer, and it provided opportunity for some fine points which I hope can be used in other areas beyond this case study. First of all, even if classical MMB is "dead", there is already a new MMB-20 with "Strengthened Key Schedule", and this is what I took for a starting point (just to discard all this Key Schedule in favor of MasterBooze a.k.a. PUKS) Nevertheless, this version had a neat routine for Modular Multiplication, and that alone justified the choice. Constants defined in the program also turned out to be quite useful. Before we turn on to the cipher itself, a couple of facts about modular multiplication need to be reminded. Suppose we have a number relatively prime to N which for our purposes we shall call SEED. Then two numbers: SEED^k mod N and SEED^(T(N) - k) mod N ( where T(N) is Euler totient function ) are in fact the multiplicative inverses of each other. So for any random "k" it is possible to compute a corresponding multiplier-inverse pair. BINGO !! This called for some experimentation. I wrote a short program which took the value of SEED from input, then calculated the subsequent powers of SEED mod (2^32 - 1), and counted the number of iterations until the result was equal to 1. Some rudimentary care was taken to assure that SEED and (2^32 - 1) would be relatively prime. (For those interested in the source code, this program was written in FORTH, and I cannot sing enough praise to the interactivity and ease of programming in this miraculous language). Two important facts were discovered. * The maximum number of iterations needed to come down to 1 turned out to be equal to 2^16 (65536). Remaining on the realistic side, I was very happy with this result. It meant that for an appropriately chosen SEED there exist 2^16 different numbers which can be generated by raising SEED to a random power "k" < 2^16. Each of these numbers has a multiplicative inverse which in turn can be generated by raising SEED to the power (2^16 - k). * Two of these multipliers are trivial - they are themselves their own multiplicative inverses. These are SEED^(2^16) = 1 and SEED^(2^15) = 0x10000, so care should be taken in order to avoid using them in the encryption program. Readers are welcome to look into the MMBOOZE source code and look into "Make_2_Multi" routine to find out the exact procedure. A number chosen to be the SEED is equal to 0x2F8E6D85, this was chosen for no particular reason other than the fact that it generates a 2^16 long cycle of its subsequent powers. Actually the Make_2_Multi routine generates two numbers at once - the multiplier and its inverse, both are generated within the same loop and routed to the adjacent cells in the "Mult" array. The total of 128 multipliers and 128 inverses are generated based on subkey bytes from "ss[128]" to "ss[383]". The encryption process consists of 3 rounds of modular multiplication with initial, two intermediate and final XOR-ing of the text with the Master Booze bytes. Since the maximum length of the plaintext can be up to 32 bytes, 128 bytes of Master Booze are generated for this purpose. Some of them may stay unused in case of shorter plaintext blocks. The modular multiplication round differs significantly from the one used in the original MMB-20 routine in two respects. First, when 4 bytes are taken in order to form the 32-bit number, the multiplier is chosen based on the preceding two bytes in the plaintext sequence, just by XOR-ing these bytes together. This makes the choice of the multiplier plaintext-dependent and gives all bits in the plaintext a fair and equal chance of voting. Second, after the multiplication, when the bytes of the product are returned to their places and it is time to proceed with the subsequent multiplication, the two MS bytes of the previous product provide the number for the choice of subsequent multiplier, the two LS bytes of the previous product become the two MS bytes of the new number to be multiplied, and two adjacent bytes of the plaintext are appended to them in order to make this new 32-bit number. Graphically a round can be visualized in the following sketch: ( the array index is circular mod "ArrayLength" ) _________________________________________________________________________ 1-st multiplication XXXXxxxxxxxxxxxxxxxxxxxxxxxxxxMM ( XXXX - 32-bit number to multiply, MM - bytes for multiplier choice) _________________________________________________________________________ 2-nd multiplication MMPPXXxxxxxxxxxxxxxxxxxxxxxxxxxx ( PPXX - 32-bit number to multiply, MM - bytes for multiplier choice) _________________________________________________________________________ 3-d multiplication ppMMPPXXxxxxxxxxxxxxxxxxxxxxxxxx ( PPXX - 32-bit number to multiply, MM - bytes for multiplier choice) ......................................................................... Last multiplication XXppppppppppppppppppppppppppMMPP ( The index is cyclically wrapped over - PPXX - 32-bit number to multiply, MM - bytes for multiplier choice) _________________________________________________________________________ This arrangement allows an ultra-rapid propagation of all changes across the entire plaintext - there is a complete avalanche already after the 2-nd round, and in my humble opinion, 3-d round is close to an overkill, taking into account all the intermediate XOR-ings. However, I decided to let it stay, mainly because the overlapping doubles the number of multiplications per round, so the total number of multiplications is the same as in original MMB-20 cipher in 6 rounds without overlapping. It should be also noted that this arrangement effectively molds the entire plaintext-ciphertext array into an inseparable entity, where the junctions between the adjacent bytes and words don't exist any more, they are all blurred by overlapping and by cyclical wrapover of the index. This strongly discourages the attacks based on different behavior of the distinct bytes and words during encryption. In this cipher the entire ciphertext is just a circular sequence of bits without beginning or end, without the junctions between bytes or words, without any realistic possibility to trace the output changes back to the changes in any particular word or byte. The cryptanalyst would either be forced to attack the ciphertext as a whole (128 or even 256 bits), or just give up, which I hope will precisely happen. Decryption is the same as encryption, the XOR-ing Master Booze is just used in reverse order, during the multiplication the index rolls backward, and in the number for multiplier choice the LSB is flipped over, so that instead of multiplier used for encryption, its inverse is selected from the adjacent cell. If multiplier #94 was used in the encryption, multiplier #95 will be selected for decryption, and vice versa. I must confess that I shamelessly borrowed from FEAL two neat routines, one for assembling the 32-bit number from 4 bytes, another one for breaking the 32-bit number into 4 bytes. I just wanted the program to work identically on all kinds of machines - big-endian and little-endian alike. This slows the computations quite a bit, but anyway, this was meant to be a model, not a racing car. Those who will have the drive and desire to build MMBOOZE into their products, are welcome to write the Assembler routines for modular multiplication, revise the byte positioning in memory, roll the indices in reverse order, generate the Master Booze from higher indices towards the lower ones, in short, do all kinds of amendments in order to speed up the execution. --- EPILOGUE... AES submission, etc. --- After this essay was completed, I thought that MMBOOZE would make a worthy submission to the AES competition, but on the second thought I stopped regretting. Imagine a completely drunken cipher among the bunch of its sober colleagues, you get the idea? Besides, AES submission required a voluminous pseudoscientific description both in Acrobat and in Postscript formats, some reference and not-so-reference implementations, strict adherence to NIST guidelines, inclusion of NIST headers, estimates of performance on different platforms, etc, etc. This was certainly not a job for an individual amateur with an old 486/100 computer and 14400 link to the Net. And I was simply late, the AES submissions were ended when the MMBOOZE model made its first steps. To end this all on an optimistic note: I leave you drunken MMBOOZE, Which everyone is free to use... If you find there a trapdoor, Push it! Don't ask me any more ! With best regards Boris Kazak.