DATA COMPRESSION
As is well known, the amount of memory available in a smart card is severely limited. Consequently, application providers repeatedly express the desire to improve this situation by using data compression. There are certain hurdles that must be overcome before data compression can be used. The algorithm must not take up too much memory space, and in particular it must require very little RAM. In addition, an acceptable compression speed must be achieved. The compression factor is less critical, since the data volume is at most a few hundred bytes.

The methods most frequently used for smart cards are run-length encoding and variable-length encoding. With run-length encoding, a contiguous string of identical data objects is replaced by the combination of a repetition count and the object (such as a character) to be repeated. With variable-length encoding, the frequency of occurrence of characters having a fixed length (e.g., one byte) is analyzed, and the most frequently occurring characters are replaced by shorter codes, while less frequently occurring characters are encoded using longer codes (the Huffman algorithm). With static variable-length encoding, replacements are made using a previously defined table. The dynamic version of variable-length encoding first analyzes the frequency distribution of the characters in the original data and then constructs a replacement table based on the results of this analysis. A third variant is adaptive variable-length encoding, in which the replacement table is continuously updated during the compression process to achieve optimum compression. Both dynamic and adaptive variable-length encoding are out of the question for smart cards, due to the complexity of their algorithms and their large memory requirements.

Run-length encoding and static variable-length encoding are thus the only real alternatives for use in smart cards. The algorithm for run-length encoding does not need much program code, but it has the drawback that it can only be used with repetitive data. Image data, for example, are particularly suitable, since images often contain large areas with the same value. Keys for symmetric cryptographic algorithms, by contrast, are completely unsuitable for compression with this algorithm, since they have the characteristics of random numbers.
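The principle of run-length encoding can be made concrete with a minimal C sketch. The one-byte count format and the function name are purely illustrative assumptions; an actual card operating system would use whatever run format its file management defines.

#include <stdint.h>
#include <stddef.h>

/* Minimal run-length encoder (illustrative format: each run is stored as a
 * count byte 1..255 followed by the byte value to be repeated).
 * Returns the number of bytes written to 'out', or 0 if 'out' is too small. */
size_t rle_encode(const uint8_t *in, size_t in_len, uint8_t *out, size_t out_max)
{
    size_t i = 0, o = 0;
    while (i < in_len) {
        uint8_t value = in[i];
        size_t run = 1;
        while (i + run < in_len && in[i + run] == value && run < 255)
            run++;
        if (o + 2 > out_max)
            return 0;                 /* output buffer exhausted */
        out[o++] = (uint8_t)run;      /* repetition count */
        out[o++] = value;             /* repeated byte value */
        i += run;
    }
    return o;
}

The sketch also makes the drawback visible: data without runs, such as a cryptographic key, would be expanded to twice their original length rather than compressed.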

Static variable-length encoding is the second compression method used with smart cards. It is quite suitable for files containing telephone directory information, for instance, since the structure of the stored data is known and the replacement table can be permanently built into the algorithm. Telephone numbers consist of only the numerals 0 through 9 and a few special characters, such as ‘*’ and ‘#’. If only capital letters are allowed for names, the replacement table only has to accommodate the 26 characters of the alphabet. Furthermore, certain letters occur significantly less often in names than others, which also affects the encoding. With telephone directories, a memory space reduction of 30% (compared with the uncompressed data) can certainly be achieved, although this does not take into account the memory occupied by the compression algorithm.

However, certain things must be considered with regard to data compression for smart cards. Ideally, data compression should be performed in the operating system in a manner that is fully transparent to the outside world, so that uncompressed data can be read and written in the usual way using standard commands. Compression can also only be applied to certain types of data. The results of attempting to compress program code and keys are usually unsatisfactory. This must be taken into account in the design of the application, since otherwise the anticipated reduction in memory space can, in the worst case, turn into a need for even more memory as the result of ‘compression’. For all of these reasons, data compression has been used only sparingly in smart cards up to now. In special applications, such as telephone directories in cards used in the telecommunications sector, compression algorithms are sometimes used. With general-purpose operating systems and applications in which the structure of the data is not known in advance, data compression does not produce satisfactory results. It should thus be avoided in such cases, due to the additional memory space required by the compression algorithm.
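To make the static replacement-table approach described above more concrete, the following C sketch shows a possible encoder. The table entries are invented purely for illustration (they are not actual letter frequencies in names); a real table would be derived from the known character statistics of directory entries.

#include <stdint.h>
#include <stddef.h>

/* Illustrative static code table (values invented for this example):
 * frequent characters receive short codes, rare characters longer ones.
 * The codes form a prefix code, so the bit stream can be decoded again. */
typedef struct {
    char    ch;
    uint8_t bits;   /* number of code bits        */
    uint8_t code;   /* code value, right-aligned  */
} vlc_entry;

static const vlc_entry vlc_table[] = {
    { 'E', 2, 0x0 },  /* 00   */
    { 'N', 2, 0x1 },  /* 01   */
    { 'A', 3, 0x4 },  /* 100  */
    { 'R', 3, 0x5 },  /* 101  */
    { 'S', 3, 0x6 },  /* 110  */
    { 'X', 4, 0xE },  /* 1110 */
    { 'Q', 4, 0xF },  /* 1111 */
};

/* Appends the code for 'ch' to a bit stream in 'out' (which must be
 * zero-initialized); '*bitpos' counts the bits already written.
 * Returns 0 on success, -1 if the character is not in the table or
 * the output buffer is full. */
static int vlc_put(char ch, uint8_t *out, size_t out_max, size_t *bitpos)
{
    for (size_t i = 0; i < sizeof(vlc_table) / sizeof(vlc_table[0]); i++) {
        if (vlc_table[i].ch != ch)
            continue;
        for (int b = vlc_table[i].bits - 1; b >= 0; b--) {
            size_t byte = *bitpos / 8, bit = 7 - (*bitpos % 8);
            if (byte >= out_max)
                return -1;
            if ((vlc_table[i].code >> b) & 1)
                out[byte] |= (uint8_t)(1u << bit);
            (*bitpos)++;
        }
        return 0;
    }
    return -1;  /* character not covered by the static table */
}

Since the table is fixed, no frequency analysis has to be performed on the card itself, which keeps both the RAM usage and the code size small – the property that makes the static variant acceptable for smart cards.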

CRYPTOLOGY
In addition to their function as data storage media, smart cards are also used as authorization media and encryption modules. As a result, cryptography took on central significance even in the early days of smart cards. Nowadays, the procedures and methods of this discipline are firmly established components of smart card technology. Cryptology can be divided into two areas of activity, namely cryptography and cryptanalysis. Cryptography is the study of the methods used for encrypting and decrypting data, while cryptanalysis is concerned with attempting to break existing cryptographic systems. In the smart card realm, the principal task with regard to cryptography is the practical use of existing cryptographic procedures and methods. Consequently, here we concentrate more on the practical aspects of cryptography than on the theoretical aspects. However, we do not entirely neglect the application of the procedures and the basic features of their theoretical foundations.

The four objectives of cryptography are maintaining the secrecy of messages (confidentiality), ensuring the integrity of messages, ensuring their authenticity and ensuring their binding force (non-repudiation). These objectives are mutually independent, and they place different demands on the system in question. Confidentiality means that only the intended recipient of a message can decrypt its contents. Integrity and authenticity mean that the recipient can verify that the received message has not been altered in the course of being transmitted and that it actually originates from the stated sender. Non-repudiation means that the sender can verify that a certain recipient has received a particular message, which means that the message has binding force.

The notation used in this book for cryptographic procedures is illustrated in Figures 4.20 and 4.21. The terms and principles described below form the basis of cryptology and are a prerequisite for understanding the procedures described in the rest of this section. In simplified terms, there are three types of data in encryption technology. The first is plaintext, which is unencrypted data. Encrypted data are referred to as ciphertext. Finally, there are keys; one or more keys are required for encryption and decryption. These three types of data are processed by an encryption algorithm. The algorithms currently used in smart cards are generally block-oriented, which means that the plaintext and ciphertext can only be processed in blocks with fixed lengths (such as 8 bytes with DES).

Modern cryptographic algorithms are generally based on Kerckhoffs’ principle. This principle, named after Auguste Kerckhoffs (1835–1903), states that the entire security of an algorithm should be based only on the secrecy of the key, and not on the secrecy of the cryptographic algorithm itself. The consequence of this generally known but often-disregarded principle is that many algorithms used in the civil sector have been published, and in some cases also standardized. The opposite of Kerckhoffs’ principle is the principle of security by concealment. With this principle, the security of a system is based on the idea that a would-be attacker does not know how the system works. This principle is very old, and it is still frequently used even today. However, you should take care not to develop a cryptographic system (or any other system) based on this principle alone. Up to now, every system based solely on this principle has been broken, usually in a very short time. In our information society, it is generally not possible to keep the technical details of a system secret for a long time, and that is precisely what this principle requires.
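The block orientation mentioned above can be illustrated with a short C sketch. The function block_encrypt is a hypothetical placeholder standing in for whatever block primitive a card operating system actually provides (its dummy body merely XORs the key into the block so that the example compiles and runs; it is not DES), and the simple loop is shown only to emphasize that data are always handled in complete 8-byte blocks.

#include <stdint.h>
#include <stddef.h>

#define BLOCK_LEN 8   /* block length in bytes, e.g. 8 bytes for DES */

/* Placeholder for the card's actual block cipher primitive: transforms one
 * complete plaintext block into one ciphertext block under the secret key.
 * The key is the only secret element (Kerckhoffs' principle). */
static void block_encrypt(const uint8_t key[BLOCK_LEN],
                          const uint8_t plain[BLOCK_LEN],
                          uint8_t cipher[BLOCK_LEN])
{
    for (int i = 0; i < BLOCK_LEN; i++)
        cipher[i] = plain[i] ^ key[i];   /* dummy operation, NOT a real cipher */
}

/* Encrypts 'n_blocks' complete blocks one after another, illustrating that a
 * block-oriented algorithm never processes partial blocks. */
void encrypt_blocks(const uint8_t key[BLOCK_LEN],
                    const uint8_t *plain, uint8_t *cipher, size_t n_blocks)
{
    for (size_t i = 0; i < n_blocks; i++)
        block_encrypt(key, plain + i * BLOCK_LEN, cipher + i * BLOCK_LEN);
}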

Of course, the consequences of the unintentional interception of messages can certainly be limited by using concealment. This principle is thus repeatedly used in combination with Kerckhoffs’ principle. In many large systems, it is also incorporated as a supplementary security level. Since the security of modern, published cryptographic algorithms is primarily based on the limited processing capacity of current computers, concealing the procedure that is used increases the level of protection against attacks. If you rely only on the protection provided by the assumption that a potential attacker does not have access to sufficient processing power, you may be quickly overtaken by the rapid pace of technical progress. Statements such as ‘it would take a thousand years to break this cryptographic system’ are unreliable, since they are based on currently available processing capacities and algorithms. They cannot take future developments into account, since such developments are generally unknown. The arithmetic processing capacity of processors doubles around every 18 months, which means that the capacity per processor has increased by a factor of approximately 25,000 over the last 25 years. Recently, the increasing networking of computers has created another option for mounting serious attacks on keys or cryptographic systems. For instance, a request to help break a DSS key, if posted on the Internet, would be forwarded to millions of users by the snowball effect. If only 1% of all current users2 participated in such an action, the potential attacker would have access to a parallel computer composed of 300,000 individual computers.

Cryptographic algorithms are divided into two types, symmetric and asymmetric, according to the type of key that is used. Here ‘symmetric’ means that the algorithm uses the same key for encryption and decryption. By contrast, asymmetric algorithms (which were postulated in 1976 by Whitfield Diffie and Martin E. Hellman) use different keys for encryption and decryption. A term that often comes up in connection with cryptographic algorithms is the magnitude of the key space. This refers to the number of possible keys that can be used with a particular cryptographic algorithm. A large key space is one of several criteria for a secure cryptographic algorithm.
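As a worked illustration of the magnitude of the key space (the figures below are standard values and are not taken from the text): an algorithm with an n-bit key has 2^n possible keys, so

    56-bit DES key:    2^56  ≈ 7.2 × 10^16 possible keys
    128-bit key:       2^128 ≈ 3.4 × 10^38 possible keys

The size of this space directly determines the effort required for the exhaustive key search described below.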

A requirement that has only recently become prominent with regard to the technical implementation of cryptographic algorithms in smart cards is freedom from noise. In this context, this means that the execution time of the algorithm must not depend on the key or on the plaintext and ciphertext. If this requirement is not met, it may be possible to discover the key in a relatively short time by measuring and analyzing execution times, which would mean that the entire cryptographic system was broken.

In cryptology, a strong distinction is made between the theoretical and practical security of a system or an algorithm. A system is theoretically secure if an attacker cannot break it even when given unlimited time and technical resources. Consequently, even a system that an attacker would need 100 years and the aid of several supercomputers to break cannot be considered theoretically secure. If a system cannot be broken when the attacker has only a limited amount of time and technical resources, it is considered to be practically secure. A cryptographic system can assure the confidentiality and/or authenticity of a message. If the system has been broken, this means that confidentiality and/or authenticity are no longer guaranteed. If an attacker can discover the secret key of an encryption algorithm, for example, he can then decrypt data that have been protected by encryption in order to learn their content and modify them as desired.

Several different methods of attack can be used to break the key of a cryptographic algorithm. In a ‘ciphertext-only’ attack, the attacker knows only the ciphertext and attempts to determine the key or plaintext from it. A more promising method is the ‘known-plaintext’ attack, in which the attacker knows several plaintext–ciphertext pairs for a secret key. The ‘chosen-plaintext’ and ‘chosen-ciphertext’ attacks require the attacker to be able to generate his own plaintext–ciphertext pairs. If this is possible, the likelihood of success is improved, since the secret key can be discovered experimentally. Discovering a key by trial and error (a ‘brute-force’ attack) is naturally the least sophisticated method of attack. With this method, an attempt is made to find the correct key by employing a large amount of processing capacity to test all possible keys against a known plaintext–ciphertext pair. Obviously, processing capacity in the supercomputer range is normally a prerequisite for this method. Statistically speaking, on average half of the possible keys must be tested before the right one is found. Naturally, a large key space considerably increases the difficulty of such an attack.
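The principle of the brute-force attack can be shown with a deliberately tiny C sketch. The function toy_encrypt and its 16-bit key are purely hypothetical, chosen only so that the exhaustive loop is obviously feasible; with a realistic key space such as 2^56 keys, the same loop would be hopeless on a single processor, which is precisely the point of a large key space.

#include <stdint.h>

/* Deliberately weak demonstration cipher with a 16-bit key (NOT a real
 * algorithm): a few mixing rounds involving the key, sufficient to make
 * the exhaustive search below concrete. */
static uint16_t toy_encrypt(uint16_t key, uint16_t plaintext)
{
    uint16_t x = plaintext;
    for (int round = 0; round < 4; round++) {
        x ^= key;
        x = (uint16_t)((x << 3) | (x >> 13));   /* rotate left by 3 bits */
        x = (uint16_t)(x + 0x9E37u);
    }
    return x;
}

/* Tests every possible key until one maps the known plaintext to the known
 * ciphertext. On average, half of the 2^16 possible keys must be tested. */
int brute_force(uint16_t known_plain, uint16_t known_cipher, uint16_t *found_key)
{
    for (uint32_t k = 0; k <= 0xFFFF; k++) {
        if (toy_encrypt((uint16_t)k, known_plain) == known_cipher) {
            *found_key = (uint16_t)k;
            return 1;   /* candidate key found; further pairs may be needed to confirm it */
        }
    }
    return 0;           /* no key matched the given pair */
}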