Understanding Perfect Secrecy in Cryptography: A Comprehensive Guide
Heads up!
This summary and transcript were automatically generated using AI with the Free YouTube Transcript Summary Tool by LunaNotes.
Generate a summary for freeIf you found this summary useful, consider buying us a coffee. It would help us a lot!
Introduction
In the realm of cryptography, one of the most critical concepts is that of perfect secrecy, initially formalized by Claude Shannon in his groundbreaking work from 1948. Perfect secrecy constitutes the strongest notion of secrecy, essentially ensuring that ciphertext provides no additional information about the corresponding plaintext, regardless of what prior knowledge an attacker possesses. This article dives deep into the concept of perfect secrecy, discussing its definitions, the underlying principles, and its importance in cryptographic systems.
Understanding Perfect Secrecy
What is Perfect Secrecy?
Perfect secrecy is a cryptographic notion where the ciphertext does not reveal any information about the plaintext message. More formally, an encryption scheme is considered perfectly secure if for every plaintext and corresponding ciphertext, the a priori distribution of the plaintext remains the same before and after the ciphertext is observed by the attacker. This means that even with complete knowledge of the encryption mechanism, the adversary cannot determine any more information about the plaintext just from observing the ciphertext.
The Attack Model
The wellknown attack model analyzed in the context of perfect secrecy is the ciphertextonly attack model, where:
 The sender and receiver agree on a random key value (k).
 The sender encrypts a single plaintext message (m) using the encryption algorithm under key (k).
 The ciphertext generated is intercepted by an adversary.
The unique aspect of this model is that the adversary is considered computationally unbounded, implying they can perform any computation, including brute force attacks, without any restrictions on time or resources.
Shannon’s Definition of Perfect Secrecy
According to Shannon, an encryption process is said to be perfectly secure if:
 Prior Information: This encompasses the information known by the attacker about the plaintext prior to seeing the ciphertext.
 No Additional Information: After observing the ciphertext, the attacker should gain no additional information regarding the plaintext.
Mathematically, this can be expressed using probability distributions over message (M), key (K), and ciphertext (C) spaces:

The probability of the plaintext being equal to m given the ciphertext is c should equal the a priori probability of the plaintext being m, i.e.,
P(M = m  C = c) = P(M = m)
This condition must hold for every plaintext message and every ciphertext output, effectively asserting that the ciphertext reveals nothing beyond the prior information already held by the adversary.
Probability Distributions in Perfect Secrecy
1. Distribution over Key Space
The key generation algorithm induces a probability distribution over the key space. Under Kerckhoffs’ principle, the workings of the key generation process are assumed to be public knowledge, meaning that the adversary knows how keys are generated and can make corresponding calculations.
2. Distribution over Plaintext Space
This is where the attacker’s prior knowledge about the plaintext comes into play. For example, if the attacker believes that certain messages have a higher likelihood of being sent, this distribution impacts their deduction about the intercepted ciphertext.
3. Distribution over Ciphertext Space
The probability distribution over the ciphertext space is determined by both the distributions over the message and key spaces. When the encryption process is known, adversaries can compute the probability of each possible ciphertext based on their knowledge.
Equivalent Definitions of Perfect Secrecy
1. Independence of Ciphertext Distribution
An equivalent definition states that the ciphertext distribution must be independent of the plaintext. This means that for any pair of messages (m0, m1) selected with nonzero probability, knowing the ciphertext should not reveal whether m0 or m1 was encrypted at any point in time.
2. GameBased Definition of Perfect Secrecy
The gamebased definition involves an adversary attempting to determine which of two plaintext messages has been encrypted. The encryption process, along with key generation, operates randomly. The adversary must produce a guess that aligns with the encrypted message's index. The encryption scheme is said to be perfectly secure if the adversary’s probability of guessing correctly remains at 0.5, regardless of any observed ciphertext.
The Importance of Perfect Secrecy
Achieving perfect secrecy is the lofty goal in the field of cryptography. Perfect secrecy implies that even the most sophisticated adversary with unlimited computational resources will be unable to derive any information about the plaintext solely based on the intercepted ciphertext. This is crucial for protecting sensitive information against unauthorized disclosures.
Case Study: The Vigenère Cipher
To illustrate the concepts of perfect secrecy, let’s evaluate the Vigenère cipher, which has been shown not to achieve perfect secrecy. When testing for perfect security, if an adversary encrypts the same or two different characters under specific keys, the output will reveal some information by allowing the adversary to draw conclusions based on patterns.
Example of Winning Probability
In the Vigenère cipher instance mentioned, an adversary can guess whether the encrypted text corresponds to plaintext “00” or “01” with a probability significantly higher than 0.5. This demonstrates that the Vigenère cipher does not satisfy the conditions for perfect secrecy, as it does not prevent an adversary from gaining additional information.
Conclusion
In this exploration of perfect secrecy, we uncovered its foundational principles, defined key terminologies, and examined the underlying cryptographic mechanics that govern secure communication. Understanding perfect secrecy is crucial for constructing robust encryption schemes that safeguard sensitive information against any form of cryptographic attack. In a world increasingly reliant on digital communication, the implications of perfect secrecy remain vital in protecting data integrity and confidentiality. Thank you for engaging in this comprehensive overview of one of cryptography’s cornerstone concepts.
Plan for this lecture is as follows: we will discuss perfect secrecy, which is the first formal notion of security, which is also the strongest notion of secrecy. We also will discuss various equivalent definitions of perfect secrecy.
So, the definition of perfect security was given by Claude Shannon in his classical work in 1948 and Shannon is often considered as the father of information theory and this notion of security is also called as unconditional security and information theoretic security.
The attack model that is considered in the definition of perfect security is the ciphertext only attack model, where we assume that we have a sender and a receiver who have agreed upon a random key value k.
We assume that sender has encrypted a single message m using a publicly known encryption process under that key k, and the ciphertext has been intercepted by the adversary. The interesting part of this attack model is that we assume here that the adversary is computationally
unbounded. That means we make no assumption whatsoever about his computing power. We assume that he can do any kind of computation with brute force or any other computation. So that is the interesting aspect of this security model.
Informally, perfect security says that irrespective of any prior information that the attacker has about the underlying plain text, seeing the cipher text provides no additional information to him about the underlying plain text. That means seeing the ciphertext is absolutely
useless for the attacker. Whatever it could infer about the message from the ciphertext, same it could have already inferred before any ciphertext would have been communicated. So here we have 2 entities, we have the prior information and we have the term no additional
information. We have to now learn a little bit of math to understand how to formalize these 2 notions namely that of prior information and that have no additional information. So, imagine we are given a publicly known cipher namely a triplet of algorithms key
generation algorithm, encryption algorithm and a decryption algorithm. Then any attacker has the following information about the encryption process. Namely, he knows 3 spaces, namely the message space, the key space and the cipher text space, where the message space is the
set of all legitimate messages which could be encrypted by the encryption process. Keys space denotes all possible keys which could be output by the key generation algorithm. And the ciphertext denotes all possible ciphertext, which could be generated by the encryption
algorithm. From the viewpoint of the attacker, any encryption scheme induces 3 probability distribution, one probability distribution over the message space, one probability distribution over the
key space and one probability distribution over the ciphertext space. So now let us go over this probability distributions one by one. The first probability distribution is over the key space and this is induced by the key
generation algorithm. So remember, as per the Kerckhoffs’ principle, we assume that steps of the key generation algorithm, encryption algorithm and decryption algorithm are publicly known to the attacker. We also discussed that key generational algorithm has to be a randomized
algorithm. That means, every time we run the key generation algorithm it is going to output a possible candidate key from the key space with certain probability. So that is what we mean by the probability distribution over the key space induced by the key generation
algorithm. Also, we discussed in one of the earlier lectures that in most of the cases, the key generation algorithm is going to output a uniformly random key from the key space. Namely, if the key
space size is cardinality of key space, then any candidate key could be generated with probability one over the key space. So that is the uniform distribution over the key space, which adversary will already have if he knows the steps of key generation algorithm. But
it is not necessary that your key generation algorithm always outputs uniformly random keys from the key space. So that would be inducing another kind of probability distribution. Whatever is the case since the steps of the key generation algorithms are publicly known.
We know that from the viewpoint of the attacker, there is a probability distribution over different values of keys, which could occur with different probabilities. And to capture that, we introduced this notation K, which is a random variable which denotes the value of candidate key,
which could be obtained by running the key generation algorithm. And since the key generation algorithm is going to be a randomized algorithm, it is not the case that every time we run the key generation algorithm, we obtain the same value of k. And different values of k
could occur with different probabilities to denote what value of k has been obtained by running the key generation algorithm, we introduced this random variable K. The notation probability, random variable k is equal to k denotes the probability with
which the key generation algorithm would output the candidate value of key to be k. That is the probability distribution that adversary has about the key space. The next important probability distribution is that over the plain text space, and this
models any kind of prior information that the attacker might have about the underlying plaintext. For example, if the attacker already knows or feels that depending upon the underlying context sender would either encrypt a message “attack” 70% times or the message “retreat”
30% times. Then that is a probability distribution over the message space that is induced and known to the attacker. But it might be the case that attacker is completely clueless what exactly is going
to be the message it could be any uniform the random message from the plain text space in that case, the probability distribution is a uniform probability distribution. We introduced this random variable M, which is a random variable to denote the candidate
plain text, which sender might encrypt using the encryption algorithm. Since attacker does not know what exactly the message is going to be encrypted, we have problems for different candidate messages to occur with different probabilities. The
notation probability random variable M equal to m denotes a probability with which attacker feels that sender would encrypt a plain text m using the encryption algorithm. So that is a probability distribution over the plain text space.
The third probability distribution is the probability distribution over the ciphertext space. This is determined by the probability distribution over the message space and a key space and the steps of the encryption process. Because once the attacker knows,
what is the probability that a certain plain text would be encrypted by the sender, and what is the probability that a candidate key would be obtained by the key generation algorithm. Then by running the steps of the encryption algorithm over that candidate plain text and
the candidate key adversary can find out what is the probability of occurrence of a certain ciphertext. We introduced this third random variable C to denote value of the candidate ciphertext and the notation probability the random variable C equal to c denotes a probability
with which the encryption algorithm would output a cipher text c. Notice that there are random variables, M and K are independent of each other because the key is obtained by running the key generation algorithm, which is run independent of what
message is going to be encrypted by the Enc algorithm. Whereas the random variable c, it depends upon the random value of the random variable m and a random variable k. So now let us see a numerical example to understand these concepts in a better fashion. Imagine
I have a candidate encryption process where the candidate messages could be either the character a or b, or c or d. In the same way, imagine I have a key generation algorithm which could output 3 keys with different probabilities and say the attacker has the following probability
distribution over the pain text space. That means it knows depending upon the underlying context, it feels or it knows somehow that the message could be either “a” with a probability 1/4, or the message could b with
a probability 3/10 and so on. In the same way, assume that the key generation algorithm outputs the candidate key to be k1, k2, k3 with probability 1/4, 1/2 and 1/4 respectively. So that is a probability distribution over the key space that attacker has.
Imagine the adversary knows the steps of the encryption algorithm. So, it could compute an encryption matrix where along the columns you have the candidate plain text namely a b c d, these are the candidate plain text. And here you have the candidate keys which
could be obtained by running the key generation algorithm. This value 3 denotes that the encryption algorithm is going to output cipher text 3 if the plain text would have been “a” and the key would have been k1.
In the same way, for example, this last entry denotes that the ciphertext will be 2, if the plain text would have been d, and the key would have been k3, and so on. Adversary could compute this matrix because it knows the steps of the algorithm and it knows the
different candidate values of plain text and candidate values of key. So that is the information that is available with the attacker. So now let us try to compute the probability distribution over the ciphertext space. So
as you can see from the encryption matrix, the ciphertext could be either 1 or 2, or 3 or 4. Let us try to compute the probability with which the ciphertext could be 1 or 2, or 3 or 4. So let us first compute the probability that what is the probability that encryption
algorithm produces ciphertext to be 1. If you see the encryption matrix, ciphertext could be 1, if the plain text is b and the keys k2, that is one option or one condition under which the cipher text will be 1. The
other condition in which the cipher text could be 1 is when the plain text is c, and the key would have been k3. The third condition under which the ciphertext could be 1 is when the plain text is d, and the key would have been k1.
And all these 3 conditions or events are independent of each other. So we apply the sum rule here. The first term here denotes the probability that the key would have been k2. That means by running the key generation algorithm sender would have obtained the key to b k2, and the
probability that sender would have used the message b for encryption. Both these events occur together, then the ciphertext would have been 1. The second term denotes the other possible condition in which the ciphertext could have
been 1 namely, plaintext at sender wants to encrypt is c. And the key that is obtained by running the key generation algorithm is k3. And the third term here denotes the probability of occurrence of the third condition, namely, the plain text is d and the key generation
algorithm outputs the k to the k1. And since we know the value of each of these individual probabilities, which is available to the attacker, attacker can compute this and hence it knows that the probability that
encryption algorithm could produce ciphertext to be 1 is 0.2625. So that is the value of probability of occurrence of 1 as the ciphertext. In the same way, let us try to compute what is the probability that the ciphertext could be 2.
Again, if you see the encryption matrix, there are 3 possible conditions in which the ciphertext could be 2, the first condition is if sender would have encrypted the plain text c, and the key that is used by the sender namely the key generation algorithm outputs the key
to k1. So that is first term. The second condition in which the ciphertext character would have been 2 is if the plain text would have been d and the key generation algorithm for sender would have output the
k to k2. So that is the second term. And the third condition under which the ciphertext character could have been 2 is if the plaintext would have been d, and the key would have been k3. Again, we know all these individual probabilities and we can sum them together
to obtain the probability of ciphertext being 2 being produced by the encryption algorithm. In the same way you can compute the remaining probabilities, namely, the probability of ciphertext being 3 and the probability that ciphertext character is 4. And that gives
you the probability distribution over the ciphertext space. Not only we can compute the probability distribution over the cipher text space, we can also compute some additional values, namely some kind of conditional probabilities. For example, let
us try to compute the conditional probability that given the plain text is m, what is the probability that ciphertext would have been c. So for instance, suppose let us try to compute the value of probability that ciphertext is 1 given that plaintext is equal to a.
So again, if we see the encryption matrix, you see the column under the plaintext a, then you can see that if the plain text would have been a, no way it is possible that the ciphertext could be 1 because the encryption of a could give either the ciphertext being
3 or ciphertext being 4. For no value of k, the encryption of a would have given you the ciphertext to be 1. That is why we can say that the conditional probability that ciphertext is 1 given the
plain text is a is 0. In the same way, the conditional probability that ciphertext is going to be 2 given the plaintext is 1 is again 0 because again, you can see the encryption matrix or the encryption algorithm, which states that for no value of k encryption of
a under that key would give you the ciphertext value 2. Now let us compute the conditional probability that given the message a, what is the probability ciphertext is equal to 3? Now, if you see the encryption matrix, under the plain text
a, there are 2 values of keys namely k1 and k2, both of which could produce an encryption of a to be 3. Since the key generation algorithm is a randomized algorithm, the probability that the key is equal to k1 is given to you the probability that key is equal to k2 is
given to you and if any of these 2 events occur, then the encryption of a under the key k1 or under the key k2 would have given you the ciphertext equal to 3. The conditional probability that given the plain text is a, you obtain the cipher text
3 is 0.75. In the same way, if you want to compute the conditional probability that given the plain text is “a” what is the probability ciphertext equal to 4? Then again, if you look into the encryption matrix, you can see that if the plain text is a then the only
condition in which you can obtain the ciphertext to be 4 is if the key generation algorithm would have produced the key to be k3. And the probability that the key generation algorithm would produce the key to be k3 is
0.25. In the same way, you could compute other conditional probabilities as well. So now let us see the original definition of perfect security as given by Shannon. Recall the informal requirement from the perfect secrecy is that: we will say an encryption
process to be perfectly secure, if irrespective of any prior information the attacker has about the underlying plaintext, cipher text that it intercepts, leak no additional information about the plaintext.
We know how to model this prior information. That is precisely the probability distribution over the plain text space that attacker has. We can also mathematically capture what exactly we mean by no additional information. Formally, an encryption process namely a collection
of algorithms key generation, Enc and Dec over a plain text space M is called perfectly secure if for every probability distribution over the plain text space and key space, and for every plain text m belonging to the plain text space, according to that probability
distribution, and for every ciphertext c belonging to the ciphertext space, the following equality holds. So let us try to understand what exactly the LHS and RHS of this equality stands for. If you consider the RHS part of this equality, namely the probability that the plain text
is equal to m that is precisely the apriori probability that message m could be communicated by the sender. That is a prior information about the underlying plain text before any communication has happened from the sender to receiver, before any key generation algorithm
has been used, and the message has been encrypted. That is the apriori information about the underlying plaintext. Whereas the LHS part of this equality namely the conditional probability that the plain
text is equal to m given the ciphertext equal to c that supposed posteriori probability that the message m is encrypted in c. That means, given that adversary has seen or intercepted a cipher text c, what is the probability that the plain text m is encrypted in this cipher
text c. So, intuitively, when we say that these 2 probabilities are equal, it means that whatever the adversary knew about the underlying plain text before any cipher text was communicated with same probability adversary knows that a plain text could be m and in
the given cipher text c. That means observing the ciphertext c does not change attacker’s knowledge about the distribution of the plaintext. Whatever the attacker’s knowledge was before seeing the
ciphertext, the same knowledge it has, even after seeing the ciphertext. Moreover, it holds even if adversary is computationally unbounded. That is the importance of this definition.
This definition does not put any kind of restriction on the computational power of the adversary. Even if adversary knows the steps of the algorithm, even if it knows what could be the candidate keys even if it is allowed to do brute force, its view or its knowledge about the underlying
plain text or the distribution of the plain text should not change before and after seeing the ciphertext. If that holds, then we say that our underlying encryption process is perfectly secure. Notice that in this definition, I have highlighted
few things namely, I have said that the condition should hold for every probability distribution over the plain text space. That means it does not matter whether the distribution over a plain text space is a
uniform distribution or a bias distribution, the condition should hold for any possible probability distribution over the message space. In the same way, the condition or equality should hold for any kind of probability distribution over the key space whether it is a uniformly
generated key, whether the key generation algorithm output uniformly generate random keys or bias keys still the condition should hold. Moreover, once we fixed up probability distribution of the plain text space and the key space,
the condition should hold for every plaintext belonging to the message space and every candidate ciphertext belonging to the ciphertext space. So now, what we will do is we will see some alternate equivalent definitions for perfect
security. This is the original definition of perfect security as given by Shannon and the interpretation of equality of these 2 probabilities is that the probability of knowing a plaintext remains the same before and after seeing the cipher text. That is the interpretation
of the original definition of perfect security as given by Shannon. Now let us see the first alternate definition. The first alternative definition says that we will say an encryption process or an encryption scheme to be perfectly secure if for every
probability distribution over the plaintext space and key space and for every pair of message m0, m1 which occurs with nonzero probability with respect to that probability distribution and every cipher text c the following equality should hold.
The equality says that the conditional probability that ciphertext is c given the plaintext is m0 is same as the conditional probability that ciphertext is c given the plaintext is m1. The interpretation of this equality is that the probability distribution of the ciphertext
is independent of what exactly is your underlying plain text. That means if the adversary sees the ciphertext c over the channel, then it does not matter whether the plain text is m0 or whether the plain text is m1 with equal probability from the viewpoint of the attacker,
the ciphertext c should be a candidate encryption of m0 as well as a candidate encryption of m1. There should not be any bias in the probability. But whether it is an encryption of m0 or whether
it is an encryption of m1, that means the ciphertext distribution should be independent of the underlying plaintext. So now what we are going to do next is we are going to prove that both these 2 definitions
are equivalent. Namely, we will show that if there is an encryption process which satisfies the original condition of Shannon's perfect security, then the same encryption process has to satisfy the alternate definition and vice versa. Let us first prove the equivalence
of the definition assuming that we have an encryption process which satisfies the condition of alternate definition. Namely, we assume that we have an encryption process where for every probability distribution,
the conditional probability that m0 is encrypted in c and m1 is encrypted in c is same. And we call that probability to be say delta, for all pair of messages m0, m1 in the plain text space, and for all ciphertext c belonging to the cipher text space. Given this, we will
prove that the original condition of the Shannon’s perfect security is also true for the encryption process. The conditional probability that the message is m in the ciphertext c is same as the probability
that plain text is m. So what we are going to do is assume we have an arbitrary plain text and arbitrary cipher text character. Now let us try to compute the conditional probability that message is m given the cipher text is c. So what I am going to do here is
I am going to apply the Bayes theorem here. By applying the Bayes theorem, I can change the numerator and denominator of the conditional probabilities. In the lefthand side, the numerator is message being m and the denominator
is ciphertext being c. I am applying the Bayes theorem and rewriting the same expression in LHS, where I am changing the numerator and denominator and as for the Bayes theorem, I will get the righthand side part.
Now let us try to compute the value of probability what is C = c. The probability that your ciphertext is c is going to be this summation. This summation state that you take all possible messages from the plain text space and find out what is the probability that the message is that
candidate plain text namely m’ and given that the message is m’, what is the probability that m’ is encrypted in c. That will give you the probability distribution over the cipher text c. Because what have
to basically do is imagine you have the plain text space you take each of the candidate plaintext that is precisely the first term in this summation. And once you have fixed that candidate plaintext you just compute what is the probability that that candidate
plaintext will take you to the ciphertext c, that is this second term. And if you multiply all this, if these 2 probabilities and take the summation over all candidate plain text, that will give you the probability distribution of ciphertext being c. Now, as
per our hypothesis of the condition, we know that the conditional probability that ciphertext is c given the plain text is m’ is same for all m’ namely, it is delta, because that is what is the assumption that we are making.
We are making the assumption that our encryption process satisfies the alternate condition. So for every m’, I can replace the second term in the summation by delta. As a result, I get this simplified form of the summation and I can take out delta common and what I
will be left with summation of probabilities over the plain text space, that plain text is little m’. As per the rules of probability, if I sum up all those probabilities I will get 1. So that means I can say that the probability that ciphertext is c is nothing but delta.
We have computed the value of probability of C = c, that is delta. Now, let us try to compute the conditional probability that ciphertext is c given the plain text is m, that is the numerator part of this RHS expression. And again, as per our hypothesis of this theorem,
or this lemma, we already know that the given the plain text is m, the probability that ciphertext will be c is nothing but delta. And that is irrespective of what is my m, it could be m0, it could be m1 it could be m2 for any candidate m from the plaintext
base, the probability that it could lead to the ciphertext when c is delta. So, that means the numerator part of this RHS expression is also delta. Hence, I can say that my original LHS namely, the conditional probability that plain text is m given the ciphertext is c
is nothing but this equality. And in the numerator as well as in denominator, I have the delta so I can cancel it out. And hence, I obtain that this conditional probability is nothing but the probability that plaintext
is m, which is precisely what exactly we wanted to prove. That means we have proved that if the encryption process satisfies the condition of the alternate definition, then it also has to satisfy the condition of original Shannon's definition. So, now let us do the proof in
the reverse direction. Namely, we assume that we have an encryption process which satisfies the condition of the original Shannon's definition. That is the conditional probability that message is m
given the cipher text is c is same as the probability that message is m, for all messages in the plain text space and all cipher text in the cipher text space. Given this will prove that the distribution of the cipher text is independent of the underlying
plaintext. Namely, it does not matter whether the plain text is m0 or whether the plain text is m1 with equal probability, it could lead to the cipher text c, and this holds for every pair of messages m0, m1 in from the plain text space and every cipher text
c from the cipher text space. Let us first try to compute the probability that ciphertext is c given the plain text is m0. For this we are going to use the fact that as per the given condition, since the
encryption scheme satisfies the original Shannon’s condition, we know that the conditional probability that message is m0 given ciphertext is c is same as the probability that message is m0. So now what I am going to do is I am going to expand my LHS part here using the Bayes
theorem, where I am just changing the numerator and the denominator of the conditional probability. And by applying the Bayes theorem, I get this equality. Now, what I can do is I can cancel out this common term from both the LHS and RHS. Hence
I obtain that the conditional probability that m0 will lead to an encryption of c is same as the probability that ciphertext is c. By applying the same logic, I can conclude that the conditional probability that ciphertext is c given the plain text is m1 is same as
the probability that ciphertext is c. As a result, both this conditional probability that m0 will lead to an encryption of c and the probability that m1 will lead to an encryption of c are same namely, it is probability that
ciphertext is equal to c. That means, we have proved that if the original Shannon's condition is satisfied, then the encryption process also has to satisfy the condition for the alternate definition. That means, both these definitions are equivalent to each other.
So, if you are given an encryption process and if you are asked to prove whether it is perfectly secure or not, then you can either prove it as per the original Shannon's condition or you can prove it as per the first alternate definition.
Now, let us see another equivalent definition of perfect security. Before going into this definition, let us first try to understand the underlying intuition that we want to capture through this definition. So, the goal of the perfect security is the following: imagine
a scenario where a key for the encryption process has been agreed upon between the sender and the receiver. Suppose the adversary knows that the candidate plaintext for the sender is going to be either m0 or m1 and that too with equal probability.
Suppose indeed sender is going to encrypt message m0 or m1 with equal probability. And it encrypts one of these messages randomly using the key k as per the encryption process, computes the cipher text c and sent it over the channel. Say, the attacker intercepts
the cipher text c, and the attacker has unbounded computing power. Intuitively perfect secrecy demands that the adversary’s knowledge about the underlying plain text should remain the same even after seeing the ciphertext c.
So what was this information about the plain text before any cipher text was communicated in this particular case: with probability half from the viewpoint of the attacker, the plain text could be m0, and with probability half the plain text would be m1. Now perfect
secrecy demands that even after seeing the cipher text c and even after knowing the steps of the encryption process, and even if the adversary has unbounded computing power, the advantage that adversity gets after seeing the cipher text and learning the underlying
plain text should be 0. That means, even after seeing the cipher text c, the probability that the underlying plain text would be m0 or m1 should be half. Adversary can do nothing better than guessing the underlying plain text. That is the underlying intuition
that we are now going to formalize. And this intuition is formalized by a challenge response game, which we call this experiment. And we are going to see that in the rest of this course, every security definition is
going to be formalized by this kind of challenge response experiment or a game which model something which we want, which can happen in reality. The experiment is as follows: we assume that we are given a publicly known cipher namely a triplet of algorithms and
the message space, and the game is played between 2 entities. The first entity here is an adversary or an algorithm for the attacker which we denote by this algorithm A and the algorithm or the adversary is computationally unbounded. This
models the fact that we are trying to capture the notion of perfect secrecy where the adversary is computationally unbound. The second entity here is the hypothetical verifier which is going to model the sender.
Now, the nomenclature that we are going to follow while naming this kind of experiments is as follows: this particular experiment is denoted by this name. Now let us try to decode each of the individual parts of this complicated name. So, the name PrivK denotes
here that we are trying to model an experiment for a symmetric encryption process or a private key encryption process. That is why the name PrivK, later when we will try to model the security requirement
for public key primitives. This privK will be replaced by PubK. The name of the string eav denotes here we are considering an adversary where the adversary is only allowed to eavesdrop or just listen the cipher text because we are in the cipher text only attack model.
The name A here denotes the name of the algorithm here and pi denotes the name of the encryption process with respect to which this game is going to be played. That is the nomenclature we are going to follow to denote this particular experiment.
Now, what are the rules of this game? This experiment is going to be a randomized experiment. The first step here is that adversary selects a pair of messages from the plain text space, say m0 and m1 with the restriction that their size has to be same. An adversary can choose
any pair of messages. There is absolutely no restriction we put on which pair of messages that it can submit to the verifier. The only restriction is that their length have to be same, so adversary can deterministically pick up a pair of message, it could randomly pick
a pair of message, it can use any strategy to pick the pair of message that it wants to send to the verifier. Now, once the pair of messages are communicated to the verifier, what the verifier is going
to do is the following: it is going to run the key generation algorithm which will output a uniformly random key or a key as per the steps of the key generation algorithm for the verifier.
And the verifier is going to randomly pick one of these two messages for encryption. The index of the message which is going to pick for encryption is denoted by b, which is picked uniformly randomly. That means, with probability half, it might be picking
the message m0 for encryption, or with probability half it might be picking the message m1 for encryption. Once it has decided the index of the message, it encrypts that message mb using the key k which is known only to the verifier and the ciphertext is given to the
adversary. And now the goal of the adversary is to find out the index of the message which has been encrypted in c, that means the adversary has to find out whether the c is an encryption
of m0 or whether it is an encryption of m1. After analyzing the cipher text c as per whatever strategy adversary wants to follow, it gives them prediction. Namely, it outputs a bit, which is either 0 or 1 corresponding to the index, which it feels that that particular
message has been encrypted in the cipher text c. That means b’ denotes the index, which according to the adversary is the index of the message which is encrypted in the ciphertext c. That is the experiment. As you can see, this is
a randomized experiment, because adversary could pick any pair of messages as per its randomness. In the same way the verifier is going to pick any random message out of this pair for encryption, and the key could be any random key as per the key generation algorithm.
Now the output of the experiment is decided as follows. We say that output of the experiment is 1 or which is interpreted as adversary has won the game, if it has correctly predicted what exactly is the message which is encrypted in the ciphertext c. That means if it correctly
output b’ equal to b then we say that the adversary has won this experiment or the adversary’s experiment’s output is 1. On the other hand, if the adversary incorrectly identifies what is encrypted in that cipher
text c that means it outputs b’ which is different from b, then we say that output of the experiment is 0, which is interpreted as adversary has lost the game. Now, this is the experiment and now we have a definition: we say that an encryption process
pi with respect to which this game has been played, is perfectly indistinguishable, if for every attacker who participates in this game, the probability that the output of the experiment is 1 that means probability that the adversary outputs b equal to b’ in the
experiment is half. If it is more than half, then we say that the encryption process is not perfectly indistinguishable. Also, the probability here is taken over the randomness over adversary and the randomness
over the verifier namely the value of b and the value of k, output by the key generation. So, that is our third equivalent definition of perfect secrecy. The original definition of perfect secrecy as given by Shannon was this, we saw the first
equivalent definition and now we have this gamebased definition. We have now 3 different definitions. It can be proved that all these 3 definitions are equivalent to each other. We already proved that the first 2 definitions are equivalent to each other. We can also
prove that this third definition is equivalent to the first definition. And the third definition is also equivalent to the second definition. That means, if you are given an encryption process, we can prove it to be secure as per any of these 3 conditions,
we can directly prove it using the Shannon's original definition, we can try to prove it using the equivalent definition, or we can try to prove it using the game based definition. So these are the 3 equivalent definitions of perfect security.
So now I would like to stress a little bit on this gamebased definition. This gamebased definition describes the notion of what we call as perfect indistinguishability. That means the ciphertext is completely indistinguishable from the viewpoint of the attacker, and it
leaks no information whatsoever, whether it is an encryption of m0 or whether it is an encryption of m1, and this notion of indistinguishability is very important. Because for the rest of the course, we will see that all the definitions that we are going to formulate will be in
terms of the notion of indistinguishability. So now let us do an example here. We will prove that the Vigenere cipher which we have discussed in the last lecture is not perfectly secure. So, we are going to consider an instance
of Vigenere cipher, where the message space and the cipher text space are going to be of length two, namely, the plaintext is going to be a two character string and the cipher text is going to be a two character string that is why the message space and the cipher
text space are strings over the set (0 – 25) X (0 – 25). The key generation algorithm for this instance of Vigenere cipher is going to do the following: Since the message string, plain text string is of length two, then the key which is going
to be used should be of length either one or it should be either length two. So that means the period of the key which is the value t is a uniformly random value, which could be either one or two. Once the period of the key has been selected, we select the key of
that much size uniformly, randomly from the set 025. So that is a key generation algorithm. Once the key has been picked, we encrypt the plain text that is available with the sender as per the encryption process of the Vigenere cipher. Namely, we divide the plain text into
blocks of size t. If t is equal to one, that means the same t is used to encrypt both the characters of the plain text, whereas if t is equal to two, we divide the plain text into one single block of size two and, encrypt the whole block in one row using the two different
characters using the two characters of the key. That is encryption process. The decryption process is just the reverse operation of the corresponding encryption process. So now, we are going to prove that this instance of Vigenere cipher is not perfectly
secure. That means we will give a concrete attack where the attacker can significantly find out what exactly is the underlying message which has been encrypted and we are going to use the gamebased definition here.
Namely, we will show an instance of the game here we will show that the distinguishing advantage of the attacker is significantly better than half. The instance of the experiment here is as follows: adversary submits the following pair of messages, it submits a message
m0, which consists of the same characters 00. And it submits another plain text m1 which consists of two different plain text characters. Now, as per the rules of the game, the verifier is going to randomly encrypt one of these
two messages as per the key generation algorithm. It decides whether it is going to encrypt the message m0 or whether it is going to encrypt the message m1 with probably half and half. Once it decides what is going to encrypt, it runs a key generation algorithm which is
going to output a period which could be either 1 or 2. And once the period is generated a key of that much size is uniformly randomly picked and given to the verifier. And now using that key the verifier encrypts and sends the ciphertext
to the attacker. Let the cipher text character be c1 and c2. Since in this case the plain text, which is encrypted consist of two characters, the cipher text is also going to contain two characters.
Now the goal of the attacker is to find out after seeing this cipher text c, whether the cipher text c corresponds to an encryption of 00, or whether it corresponds to an encryption of 01. And the adversary does that by applying the following strategy: it compares the two
cipher text characters in c, namely c1 and c2. If it finds that c1 and c2 are same, then it predicts that the message m0 is encrypted in c, whereas if the cipher text characters c1 and c2 are different then it predicts that the message m1 is encrypted in ciphertext.
That is analysis of the attack. So now let us see what the distinguishing advantage of the attacker is, whether it can significantly distinguish apart, whether the c that it is seeing is an encryption of 00
or whether it is an encryption of 01. We claim that the probability that adversary can win this experiment, namely, it can correctly identify what has been encrypted in the cipher text c is 0.75, which is significantly better than half. Then that means that this scheme,
this instance of Vigenere cipher, is not perfectly secure. So now let us prove our claim. What is the probability that adversary wins the experiment? The probability adversary wins to experiment is the following: we can view this instance
of the experiment as two individual experiments, the experiment where the index b is equal to 0, namely the verifier has selected the message m0 for encryption and c is the encryption of m0. The second experiment here is the version of the experiment where the value of b is
equal to 1, that means the verifier has selected the message m1 for encryption and c is the encryption of m1. The probability that the adversary wins the experiment is what is the probability that b is equal to 0 and probability of b equal to 0 is half. And given that b
equal to 0 adversary, indeed outputs b’ equal to 0. That is one case in which adversary can win the experiment. The second case is when b is equal to 1, which can occur with probability half. And given
that b is equal to 1, what is the probability that adversary correctly output the b’ equal to 1. If we sum up these two things, then that gives the winning probability of the adversary for this experiment.
So now let us compute each of these 2 individual probabilities one by one. So let us first compute the probability that adversary’s analysis outputs b’ equal to 0, given that indeed b equal to 0, that means we now want to analyze what is the probability that if
m0 is encrypted in c adversary’s analysis indeed outputs b’ equal to 0. If you see the adversary’s analysis, adversary outputs b’ equal to 0 only if both the ciphertext characters are same.
If indeed 00 is encrypted, then there are two possibilities. The key could be either of period 1, or the key could be of period 2. If the plain text 00 is encrypted with a period of size 1, that means that the key size is 1 and the same character of the key
is used to encrypt first 0 as well as a second 0, then of course the both the cipher text characters are going to be same. Whereas if the key is of size 2, then the probability that both the cipher text characters are same is same as probability with which the key
of size 2 satisfies the following condition, the first character of the key is the same as the second character of the key. If that happens, then again, the encryption of the message m0 will produce the ciphertext in both the characters being the same.
These are the two individual events under which an encryption of m 00 encryption of the message 00 will lead to a cipher text we are both the cipher text characters are same. So now what is the probability that the key generation algorithm of the verifier
outputs a key of size 1. Well, it is 1/2. And what is the probability that the key generation algorithm for the verifier outputs a random key of size 2 where both the key characters are the same. The
probability of that is 1/2 * 1/26. If we sum these 2 probabilities, then we get the probability that adversary’s analysis correctly output b’ equal to 0, given that verifier has indeed encrypted the message m0. So that is the first probability we have computed.
Now let us try to compute the second probability here. Namely, what is the probability that adversary outputs b’ equal to 1, given that indeed b equal to 1, that means we are now trying to analyze that assume the verifier has encrypted the message (0, 1), what is
the probability that adversary’s analysis also outputs b’ equal to 1. For this we are actually going to subtract the complimentary probability namely, we will compute the probability that adversary incorrectly output b’ equal to 0 given that b equal
to 1 and we are going to subtract this probability from 1 and that will give us the required probability. Now let us try to compute the probability that adversary’s analysis or adversary’s algorithm outputs b’ equal to 0 given that b equal to 1.
The probability of this event is same as a key of size period 2 is selected with the first character being 1 more than the second character. Because, for example, if the key is k1, k2 where k1 and k2 differs by 1 character, or whether the difference in the sense that
say, the first character is the 1 and the second character is 0 of the key, or say the first character of the key is 2 and the second character of the key is 1 and so on. If this kind of relationship holds between the 2 characters of the key, and if indeed
the message (0, 1) is encrypted then after encryption both the cipher text characters will be same. Only if this event happens then adversary will incorrectly end up outputting b’ equal to 0, even though b is equal to 1. Now what is the probability that a key
period 2 is selected with the first character being 1 more than the second character. And the probability of this is 1/2 * 1/26. Because with probability 1/2, a key of period 2 could be selected, because the probability half the key could be a period 1 probability
half the key could be period 2. The 1/2 here denotes the probability that key is of period 2. Given this, what is the probability that the first character of the key is 1 more than the second character of the key, it is same as 1/26.
If we multiply these 2 probabilities, we get this complimentary probability. And if we now subtract this complimentary probability from 1, we get our required probability. Now we have both this individual probabilities computed, and if we sum them together, and
we have outside 1/2 as the common thing, we get that the probability with which adversary analysis in this case could correctly identify what is encrypted is 3/4, which is significantly more than 1/2. That means this instance of Vigenere cipher is not perfectly secure. So
that brings me to the end of this lecture. To summarize, in this lecture we discussed the notion of perfect secrecy, which is the first formal notion of secrecy, and which is also the most strongest form of secrecy
because this notion of security is against an adversary which is computationally unbounded. So, if you achieve perfect secrecy, then that is the best possible notion of secrecy that you can think of.
We also discuss some alternate definitions of perfect secrecy, namely, the distribution of the ciphertext should be independent of the underlying plaintext. We also saw the gamebased definition, which models perfect indistinguishability. I hope you enjoyed this