Technical papers: Knowledge Authentication

ABSTRACT
Knowledge-based authentication, where the user is required to prove the knowledge of a single secret in order to authenticate her self, is by far the cheapest method to confirm one's identity. Because of its simplicity and low costs, it is one of the most popular authentication methods on the internet. By now, it has become quite natural to identify ourselves by typing in our user id and a password in order to gain access to remote resources or authorize various transactions.
However, knowledge-based authentication has a number of challenges and, in fact; it has become the primary target for on-line criminals. In this paper, I will present a novel approach to knowledge-based one-factor authentication that solves many problems, thwarting most common attacks against such systems, while retaining its simplicity and convenience. It is achieved by the means of identity-based public key cryptography: the public/private key pair is generated directly from the unique user id and a secret password. Both provable and zero-knowledge authentications are discussed.
In financial applications, it is essential that users can accurately estimate the value with which they are entrusting service providers. In particular, this value needs to be clearly bounded from above; the damage from any malicious or erroneous action on the service provider's part should not exceed this limit. The proposed authentication method does not let the service provider to unilaterally compromise the user's security with respect to other systems — a feature certainly lacking from many authentication schemes currently in use.
The proposed method has broader applications than authentication. For example, it allows for a digital signature scheme that matches paper-based signatures more closely in that the signer does not need to own any unique key for a signature, just have access to a general-purpose signing application (a pen) that can be shared by any number of users and know a secret that she can remember.

INTRODUCTION
Long before the invention of computers, passwords, pass phrases, lock combinations and signatures became widely used means of one-factor knowledge-based authentication for gaining access to resources or authorizing transactions. The public became familiar and comfortable with such things many generations ago. In most cases, computer technology attempts to imitate these methods as closely as possible. This is reasonable, because they are results of a long evolution of ideas and sufficient experience and trust has been accumulated with them.
In the computer world, one-factor authentication based on typing a single password dates back to the first multi-user systems. Its primary shortcoming is that passwords have to be assigned in a centralized fashion; the users cannot pick their passwords themselves. Otherwise, there would be no guarantee that no two users will pick the same password: if the password picked by a new user is rejected on the grounds that it has already been taken, the user would learn someone else's password, which is unacceptable. The probability of this happening by chance is uncomfortably high.
To mitigate against the above outlined risks, unique user ids have been introduced early on. In some cases they are centrally assigned, in other cases the user can pick one that is not already taken. But authorization requires both a valid user id and the corresponding password, meaning that learning someone else's user id does not immediately compromise the security of the system. Some systems make them public (e.g. UNIX™), others keep even the ids secret (e.g. most on-line banks). In such systems, users can change their passwords at will, thus protecting themselves against the possibility that their secret has leaked. The overwhelming majority of current knowledge-based authentication systems follow this paradigm (user id + password). Throughout this paper, we assume that the user id is known to potential attackers.
The attacker has two options to obtain the password: he can either steal or guess it. Also, in some protocols under certain circumstances it is possible to hijack a session without knowing the password. In this paper, we consider attacks on the communication channel and on the server. There are two kinds of attacks which lie outside of the scope of our research:
Compromise of the terminal, when the user cannot trust it anymore: this kind of attack can be prevented only on the hardware and/or the operating system level of the terminal. It is also an issue of physical security.
Guessing of the password through a number of attempts using the legitimate interface to access the system. This kind of attack can be foiled by introducing sufficient waiting times between unsuccessful attempts on the system's part and choosing a strong password on the user's part.
It is important to emphasize, that most humans are not able to remember a large number of high-entropy passwords; while the costs of protecting them in some recorded form is often unacceptably high. Losing access because of a forgotten password also constitutes a failure condition, while rarely used passwords have a tendency of being forgotten. Actually, one of the advantages of one-factor authentication is that there is only one item to protect from both theft and loss. Thus, it would be advantageous, if the user could use the same password for many services, without giving up too much security.

WHAT IS AN AUTHENTICATION?

Authentication binds a subject/principal outside the computer to an identity inside the the
Computer. Authentication is the process of determining whether someone or something is, in fact, who or what it is declared to be. In private and public computer networks (including the Internet), authentication is commonly done through the use of logon passwords. Knowledge of the password is assumed to guarantee that the user is authentic. Each user registers initially (or is registered by someone else), using an assigned or self-declared password. On each subsequent use, the user must know and use the previously declared password. The weakness in this system for transactions that are significant (such as the exchange of money) is that passwords can often be stolen, accidentally revealed, or forgotten. All subsequent stages assume the mapping is correct, so this is really important! For this reason, Internet business and many other transactions require a more stringent Authentication.

WHY AUTHENTICATION?
One-factor user authentication, where the user is required to prove the knowledge of a single secret in order to authenticate herself, is by far the cheapest method to confirm one's identity. Because of its simplicity and low costs, it is one of the most popular authentication methods on the internet. By now, it became quite natural to identify ourselves by typing in our user id and a password in order to gain access to remote resources or authorize various transactions.
However, knowledge-based authentication has a number of challenges and, in fact; it has become the primary target for on-line criminals. In this paper, I will present a novel approach to knowledge-based one-factor authentication that solves many problems, thwarting most common attacks against such systems, while retaining its simplicity and
Convenience. It is achieved by the means of identity-based public key cryptography: the public/private key pair is generated directly from the unique user id and a secret password. Both provable and zero-knowledge authentications are discussed.
In financial applications, it is essential that users can accurately estimate the value with which they are entrusting service providers. In particular, this value needs to be clearly bounded from above; the damage from any malicious or erroneous action on the service provider's part should not exceed this limit. The proposed authentication method does not let the service provider to unilaterally compromise the user's security with respect to their systems — a feature certainly lacking from many authentication schemes currently in use.
The proposed method has broader applications than authentication. For example, it allows for a digital signature scheme that matches paper-based signatures more closely in that the signer does not need to own any unique key for a signature, just have access to a general-purpose signing application (a pen) that can be shared by any number of users and know a secret that she can remember

TYPES OF KNOWLEDGE AUTHENTICATION
Knowledge authentication is broadly classified into following categories
· Without public key cryptography
· Using public key i.e. public key cryptography to the rescue
· Adding string to key pair
These classifications may be briefly explored as follows

WITHOUT PUBLIC KEY CRYPTOGRAPHY
In this case, at least at some point in the protocol, the user and the server share the secret password. This means that if the server has been compromised (which includes corrupt operators), the attacker immediately gains access to all the other systems, where the user uses the same password. This is a substantial security risk.
In the simplest case, the server has an authentication database relating each user id to the corresponding password. The user proves his knowledge of the password by transmitting it to the server. In this setup, even the passive compromise of any part of the system, that is eavesdropping on the communication or stealing the database results in compromised security.
Protecting either the database or the communication channel by means of symmetric (invertible) encryption does not add much in terms of security, as the key must be available on both ends, meaning that it can be stolen.
One-way encryption can protect the communication channel or the database, but not both:
The communication channel can be protected by a challenge-response protocol where the server provides the salt (which is unique for each session) for some one-way string-to-key transformation so that the terminal can perform it and send the result to the server, which can verify this result. The attacker has a very limited use of the information learned from eavesdropping, if the salt is unique for each session and the string-to-key transformation satisfies certain cryptographic assumptions. He can use it to verify his guesses of the passwords off-line. A man-in-the-middle attacker may hijack a session if the content is not integrity-protected with some MAC based on the shared secret, but cannot learn the passwords and use it for access to other systems or this same system at a later time. However, the entire unencrypted password database can be stolen, if the server gets compromised.
Alternatively, one can store a single string-to-key transformation's result for each user id with the corresponding salt in the database, but then the password needs to be sent through the communication channel, from where it can be stolen. This can happen either by eavesdropping the communication between the terminal and the server or by tricking the user into connecting to the attacker's server and providing his password. Similarly to the previous case, if the authentication database gets stolen, the attacker can launch a dictionary-attack offline.
In order to hamper dictionary attacks, the string-to-key transformation should require considerable computational effort even in the legitimate direction (about half a second or so), imposing a possibly prohibitive computational cost on the attacker. This is typically achieved by iterating the same salted hash-function several thousand times. Note, however, that the search can be performed in parallel on many computers. Unfortunately, gathering enormous computational power through viral infection is uncomfortably easy, due to the lax security of most internet-connected PCs.

PUBLIC KEY CRYPTOGRAPHY TO THE RESCUE
The most widespread use of public-key cryptography on the web is site-authentication through SSL, coupled with a secret key negotiation for protecting the communication channel. As seen in the previous section, one can protect the database using salted and iterated one-way string-to-key transformations. Thus, eaves dropping the communication or stealing the authentication database does not reveal the password to the attacker.
However, if the attacker succeeds in deceiving the user to connect to its server, the password gets delivered on a silver plate. Since verifying the site's authenticity requires effort on the user's part, one can always find some users who do not go through all these hoops. This is known as "phishing" and has become one of the most damaging criminal activities on the internet.
Also, there is nothing to prevent the server's corrupt administrator from logging the entered passwords and selling them on for profit. If the password gets used on a different system (remember, many people use the same password for accessing different systems), it is next to impossible to find the source of the leak.
In a more sophisticated authentication method (used, for example, by Hush Mail™ [Hush]), the server has the user's private key encrypted with a symmetric key derived from a password using a one-way string-to-key transformation and the public key in clear text in its authentication database. In order to authenticate, the user sends a unique id, the server answers with the encrypted private key, and from that point on the user signs every request with her private key. In case of Hush Mail™, the unique id serving as key to the encrypted private key database is also derived from the password, providing some preliminary authentication.
This solution, if the private key is generated and encrypted on a trustworthy terminal, prevents the server from ever seeing the user's password and forces the attacker to attack either the symmetric cipher protecting the private key or the public key itself. A good string-to-key transformation may slow down off-line dictionary attacks against the symmetric key.
It is very important to provide appropriate integrity protection for the private key, for otherwise a malicious server or an active attacker can mount an attack by tricking the user into signing with something else than her private key, and then infer her true private key from the resulting signatures. For example, the OpenPGP encrypted private key packet does not provide appropriate integrity protection. [Klima, Rosa]
Currently, such systems are implemented by passing a java applet to the client, providing another point of attack. In theory, however, this kind of authentication could be standardized and the necessary software installed on the terminals (e.g. as part of the web-browser). But this system has some drawbacks, too.
In fact, it is not a one-factor knowledge-based authentication but a faux two-factor knowledge and possession based authentication, where the "possession" part is stored by the server and given to the — yet to be fully authenticated — user. Thus, the security of the system depends on the security of the password, the security of the symmetric cipher and the security of the public key system together. The compromise of any one of these leads to the compromise of the security of the system. The loss of the password by the user or the encrypted private key by the server will lock the user out. Also, since the authentication is interactive, it can only be used over a duplex communication channel, even for one-way communication.
Finally, as counter-intuitive as it may sound, regularly changing the password in this setting results in deteriorating security: since the encrypted private key is given to the user before authentication, the attacker can collect the private keys encrypted with all the passwords and it suffices to break only one of them. In order to mitigate the risk of cracked passwords or public keys, the whole key has to be changed.
Therefore, the system is not pareto-secure [Grigg], since it can be improved without any tradeoff, as demonstrated in the next section.
STRING TO KEY PAIR
One can eliminate the symmetric cipher from the above scheme and generate the public/private key pair directly from the pass phrase by seeding a secure pseudorandom number generator with the user id and the password. Since the signing operation can be expected to be performed on different implementations, it is beneficial to use a digital signature scheme that does not have a subliminal channel, in order to prevent malicious implementations from leaking the secret key though signatures. For this reason, and its good reputation in the cryptographic community, I suggest the RSA signature algorithm.
This decreases both costs and risks of the defender compared to the stored encrypted private key scheme, while not decreasing the cost that it imposes on the attacker. The cost of storing encrypted private keys with high levels of reliability and integrity by the security provider are removed, as are the risks of successful attacks (of any kind) on the symmetric key encryption used to protect the private keys.
Since generating an RSA key pair requires a large amount of random data, it is instrumental to use a fast, yet secure pseudo-random number generator. The key-stream generator of the RC4 stream cipher is a good choice, if we discard the first 256 bytes, as suggested by RSADSI, to overcome the weakness of the key schedule. As the generation of an RSA key takes a considerable amount of time anyway, there is no reason for an additional string-to-key transformation; one can key the RC4 key stream generator directly with a concatenation of the user id and the password, with a unique separator in between and a unique terminator in the end, to prevent collisions. For separation, I suggest the line-feed character (ASCII 0x0A), since it cannot be entered from the keyboard anyway at an arbitrary position. For the same reason, I suggest the C string terminator (ASCII 0x00) for termination. Non-ASCII characters should be encoded in UTF-8. Of course, this limits the user id and the password to 254 bytes, but for most purposes this is enough.
Another important consideration is the key size (N). The amount of required random data is roughly proportional to the square of the key size, as the expected number of discarded prime candidates before finding a prime is proportional to the length of the prime. So is the number of required primarily tests, which is the computational bottleneck in a pure software implementation. Thus, the time required for generating a key pair from a string is clearly super linear in the key size. While the system can be attacked both on the public key level and on the password level and both attacks become more difficult with the increased key size, as the password can only be verified by checking whether or not the resulting first prime is a divisor of the public modulus, increasing the key size is not necessarily a rational decision.
The other parameter affecting security is the entropy of the password (H). The computational effort to break the password exceeds O (2HN). The computational effort to factor an RSA modulus is currently believed to be in excess of O (2N/8). It is clear that beyond some sufficiently large N; further increasing the key size strengthens the public key much more than the password to the point that increasing the password entropy becomes the rational choice.
Most humans can comfortably memorize a password with approximately 50 bits of entropy, especially if it is used on a regular basis. Factoring 1024 bit numbers seems to be on the far edge of feasibility for the forseable future. Using 1024-bit keys, verifying a password takes about a second on a modern PC. Thus, a million PCs dedicated to cracking our password would finish in about 35 years. This is about the same order of magnitude as the factorization of 1024-bit numbers. Remember, that this is a signature key, not an encryption key, so if it expires before it gets cracked, it cannot be abused. Thus, using 1024 bit keys and changing them together with the passwords each few years appears to be a balanced, secure decision.
Generating an RSA key pair involves generating two N/2 bit primes. If these primes are congruent to 3 mod 4, the public modulus will be a Blum integer. The advantage is that the best known zero-knowledge proof of factorization of Blum integers is a lot more efficient than the best known zero-knowledge proof of factorization. Thus, we propose the following method for generating an N or N-1 bit RSA key, where N is divisible by 16:
Initialize an RC4 key stream generator with a key consisting of the concatenation of the user id, the 0x0A separator, the password and the 0x00 terminator.
Discard the first 256 bytes from the key stream.
Read N/16 bytes from the key stream into p in big-endian manner.
Set the most significant bit and the two least significant bits of p
Test p for primality. If it fails, go to step 3.
Generate q the same way as p.
The public modulus becomes pq.
Using the system-wide public exponent e (e.g. F4=65537), calculate d as the multiplicative inverse of e mod (p-1)(q-1).
Calculating the optimization parameters is superfluous in most cases, as a signature operation is performed only once after generating the key (remember that the key is generated each time from the user id and the password), so we cannot gain anything by pre computations.
If the public key is available, the generation of the private key can be sped up by obtaining q by dividing the public modulus by p in step 6, rather than generating it from the key stream. This saves about half the time.
A pure java implementation of the above algorithm with N=1024 and e=65537 takes typically 15 seconds on a 500MHz PC. If the public key is available then half that time. In C, it is about an order of magnitude faster.
The resulting key pair can be used both for provable authentication through a signature and for zero-knowledge authentication in a very efficient manner.
Most public key infrastructures (PKIs) support RSA keys, thus such a key can be certified using any of these. Both OpenPGP and X.509 certificates can be supported with such a system.
PROVABLE AUTHENTICATION
Using the above key-generation process, one can RSA-sign each request to the server, as described in [PKCS#1]. The server can then verify whether the signature is correct and whether the request is not a replay of an earlier one. This is in direct analogy with the numbered and signed deposit and withdrawal slips used by banks for centuries.
Such a signed request constitutes a proof to a third party. This is desirable in many applications.
Since the RSA signature protects the integrity and the authenticity of the request, there is very little an attacker can do to make illegitimate requests on the legitimate user's behalf.
When registering for the service, the user generates the key pair using the above procedure and registers with her public key. It is worth noting that the service provider does not need to know how the public key was generated and how the user obtains the secret counterpart. Thus, the same server-side interface can be used for both one-factor and two-factor authentication.
Note, furthermore, that the registration and/or certification of the public key can happen after the first signature(s), just like in the case of paper-based ones. There is an important difference, however. Until the public key becomes available, there is no easy way to verify that two documents have been signed by the same person. Thus, it is recommended to transmit the public key with each signature, if deferred registration is allowed.

ZERO-KNOWLEDGE AUTHENTICATION
If the integrity of the communication-session is sufficiently protected by some other means, this same key can be used for zero-knowledge authentication as well (see e.g. [Freige, Fiat, Shamir] for details), when the service provider, while being satisfied about the identity of the user, is left with no third-party proof. This is analogous to verbal pass-phrases for accessing a secret bank account. Of course, this rules out any kind of arbitrated dispute resolution.
Using the same public key for provable and zero-knowledge authentication significantly simplifies the key-distribution problem.

ADVANTAGES

Authentication tools provide the ability to determine the identity of a party to an interaction and to ensure that a message came from who it claims to have come from. Like encryption, authentication is seldom used in isolation. Authentication is used as the basis for authorization (determining whether a privilege will be granted to a particular user or process), privacy (keeping information from becoming known to non-participants), and non-repudiation (not being able to deny having done something that was authorized to be done based on the authentication

Relatively simple administration
· Relatively cheap method of user authentication
· Identity confirmed — Both the server and the client can be sure of who they are communicating with.
Encryption — Both the request and the response are protected from intermediate prying eyes.

DISADVANTAGES

Weak security mechanism:
easy to guess (rules) or hard to remember (writing them down)
passwords are sometimes shared for convenience reasons
· Passwords can be stolen by monitoring keyboard keystrokes or network traffic, hacking increased load — Encrypting and decrypting communications is noticeably more CPU-intensive than unencrypted communications. Every request requires additional back and forth communications to set up the secure socket.
· Additional server requirements — The server must create a unique client certificate for each client that wishes to access the API. These APIs must be created and stored in a secure location and transmitted to the client via a secure channel

CASE STUDIES
Credit Card Verification for High-risk TransactionsIndustries: Financial ServicesApplications: Credit Card TransactionsProducts: ProCheck, ProID, ProMonitor CommercialA Gateway application vendor develops credit card payment processing software and also provides credit card transaction monitoring services for online merchants and credit card issuing banks. These Gateway vendors offer enhanced or packaged identity verification, authentication and monitoring features to online merchants as an additional service.Choice Point’s Authentication Solutions are integrated into the Gateway vendor's packaged application services to reduce "chargebacks." Chargebacks occur when a credit card customer successfully disputes fraudulent charges that appear on his or her credit card; the result is that the card-issuing bank is left holding the bad debt. If an online merchant receives too many chargebacks within a certain period, the credit card-issuing bank may choose to increase its "discount rate" (service fees) or close the merchant account.The authentication process is triggered for high-value or high-risk transactions (e.g., "big ticket" electronic items such as expensive stereo equipment or items that are shipped to an address where the credit card is not billed).Choice Point’s Authentication Solutions verifies the user's identity and then presents the user with a series of randomly generated questions, allowing the user to complete his or her transaction automatically online or via the telephone.The Choice Point authentication process could also be used as part of the credit card activation process, and monitoring services can be used throughout the credit card transaction process.

CONCLUSIONS AND FINAL REMARKS
We have presented a cryptographic technique that matches the standard procedures already widely used in the banking industry: the use of signed deposit and withdrawal slips to initiate transactions and that of proving identity without leaving evidence. As has been pointed out to me by a bank employee, performing authorization after having the request communicated is instrumental to security. This practice commonly followed in traditional banking and almost completely ignored in on-line financial services allows the security service to learn about an attacker's intent before the successful impersonification and aids detecting and assessing the nature of the attack while still in progress. The proposed technique, in the author's opinion, stands good chance of acceptance because it is more analogous to security procedures that people are familiar with.As building blocks, we have used time-honored cryptographic primitives and algorithms. The costs of assessing the security of the proposed technique should be acceptably low, as a large body of past experience can be leveraged. Similarly, the development costs of secure implementations are lowered by the fact that it can be built from pre-existing building blocks. The integration into existing systems can and should be facilitated by standards-compliance whenever possible

Technical papers

Saturday, February 9, 2008

Knowledge Authentication

1 comment:

Blog Archive