
\documentclass[a4paper]{article}
\usepackage[english]{babel}
\usepackage{amsmath,amssymb,amsthm}
\usepackage{color}
\usepackage{units}
\newcommand{\TODO}{\textcolor{red}{TO DO}}
\begin{document}
\begin{center}
\textbf{\Large NWI-IMC061 -- Applied Cryptography}\\[4pt]
\textbf{\large Final Exam, Academic Year 2021--2022}
\end{center}
\bigskip
\hrule
\bigskip
\noindent \textbf{Last Name:} Eidelpes

\medskip\noindent \textbf{First Name:} Tobias

\medskip\noindent \textbf{Student Number:} s1090746

\medskip\noindent \textbf{Personalized Appendix Sequence Number:} 30
\bigskip
\hrule
\bigskip
\begin{enumerate}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%% SYMMETRIC - LITERATURE %%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\item \textbf{(18 points)}
\begin{enumerate}
\item EWCDM stands for \emph{Encrypted Wegman-Carter with Davies-Meyer}. As
the name implies, EWCDM is based on the Wegman-Carter construction, which
XORs the hash of a message $M$ with the output of a pseudorandom function
(PRF) applied to a nonce $N$. This construction is very efficient and has
a strong security bound, but it is very vulnerable to \emph{nonce misuse}.
To deal with that problem, the Wegman-Carter construction is wrapped in
another call to the PRF under a different key. A further disadvantage is
that PRFs are hard to come by, so pseudorandom permutations are used
instead. If a pseudorandom permutation (i.e., a block cipher) is used, the
security of the construction drops to the birthday bound ($2^{n/2}$
queries). The authors replace the inner call to the PRF with the
\emph{Davies-Meyer} construction
\[ \mathrm{DM}[E]_K(N) = E_K(N)\oplus N \]
and then encrypt the result (XORed with the hashed message) in another call
to the block cipher. The resulting EWCDM construction is
\[ E_{K'}(E_K(N)\oplus N\oplus H_{K_h}(M)) \]
and is secure \emph{beyond} the birthday bound against nonce-respecting
adversaries while still offering birthday-bound security against
nonce-misusing adversaries.
\item The type of symmetric cryptographic scheme introduced is a Message
Authentication Code (MAC).
\item The size of the key(s) depends on the block cipher and the keyed hash
function. In total, three keys are needed: two distinct keys ($K$ and
$K'$) for the block cipher calls and one key ($K_h$) for the hash function.
\item Since EWCDM is based on a block cipher and a hash function and because
those usually operate on fixed-length inputs, the construction also
operates on fixed-length inputs. Messages come in variable lengths and
need to be padded to the specified block size before they are processed.
\item Depending on the number of input blocks, the construction will
generate multiples of the block size as outputs. The outputs are
variable-length.
\item EWCDM is based on a pseudorandom permutation (i.e. block cipher) and
an almost xor-universal (AXU) hash function (one-way function).
\item Yes, the authors delivered a security proof. The proof assumes that
the encryption function $E$ is a secure pseudorandom permutation for the
case of a nonce-misusing adversary. This requirement on the security of
$E$ is not present if the adversary is nonce-respecting. Additionally, the
distinguisher is computationally unbounded and never repeats a query.
\item The practical relevance is high, in my opinion, because the EWCDM
construction remains secure against nonce-misusing adversaries up to the
birthday bound. Handling nonces correctly has proven difficult in
practice, and a scheme that is easily broken by mishandled nonces offers
no \emph{fallback} security guarantee. The EWCDM construction does provide
such a guarantee.
\item Poly1305 is also a message authentication code (MAC), which we
discussed in the lecture.
\item One advantage of EWCDM over Poly1305 is that the former is
nonce-misuse resistant up to the birthday bound while Poly1305 is not.
\item One disadvantage of EWCDM is that it requires two calls to the
underlying block cipher. This can have potentially serious performance
implications for small, low-resource embedded devices.
\end{enumerate}
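As a supplementary sketch for (a), the nonce-misuse weakness of the plain
Wegman-Carter construction can be made explicit. The notation
$T = H_{K_h}(M)\oplus F_K(N)$, with $F_K$ denoting the PRF, is an assumption
introduced here for illustration and is not taken from the paper. If the same
nonce $N$ is used for two messages $M\neq M'$ with tags $T$ and $T'$, then
\begin{align*}
T\oplus T' &= \bigl(H_{K_h}(M)\oplus F_K(N)\bigr)\oplus\bigl(H_{K_h}(M')\oplus F_K(N)\bigr)\\
           &= H_{K_h}(M)\oplus H_{K_h}(M'),
\end{align*}
so the PRF output cancels and the difference of the two hash values is
revealed. In EWCDM the same situation yields
$T = E_{K'}(\mathrm{DM}[E]_K(N)\oplus H_{K_h}(M))$ and
$T' = E_{K'}(\mathrm{DM}[E]_K(N)\oplus H_{K_h}(M'))$. Since $E_{K'}$ is a
permutation, $T = T'$ holds only if $H_{K_h}(M) = H_{K_h}(M')$, and as long as
$E_{K'}$ behaves like a random permutation the XOR difference stays hidden
behind the outer encryption.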
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%% SYMMETRIC - KEYED %%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\item \textbf{(16 points)}
\begin{enumerate}
\item $\mathsf{CrAp}_K^{-1}$ operates by taking the ciphertext blocks
$C_1,\ldots,C_l$ and passing each of them to the decryption function
$\widetilde{E}^{-1}(K,N,\cdot)$. The decryption function takes a 128-bit
input and produces a 128-bit output. Each output has to be stripped of the
counter (the last 26 bits) to obtain the 102-bit message blocks
$M_1,\ldots,M_l$. Finally, the padding (if any) has to be removed from
$M_1,\ldots,M_l$ to obtain the original message.
\item The length of the message $M$ is limited by the counter, which is at
most 26 bits long. Since the very first counter ($\langle 0\rangle_{26}$)
is reserved for the tag, $2^{26}-2$ message blocks remain. Every block
(without the counter) is at most 102 bits long which gives a maximum
message length of $102\cdot (2^{26}-2) = \unit[6845103924]{bits}$.
\item $\widetilde{E}$ should behave like a pseudorandom permutation in order
to be able to prove the security of $\mathsf{CrAp}$. If it does not, a
distinguisher is able to gain a significant advantage because the block
cipher does not actually generate \emph{random} outputs. Further, if the
security of the underlying primitive is broken, the whole scheme falls
apart.
\item \TODO
\item \TODO
\item The length of the random nonce $N$ is $\unit[96]{bits}$. By the
birthday bound, the expected number of evaluations an attacker has to make
to obtain a repeated nonce is about $2^{96/2} = 2^{48}$.
\item After $2^b = 2^{62}$ forgery attempts, the attacker has exhausted the
tag space because the tag $T$ is $\unit[62]{bits}$ long. The distinguisher
submits the ciphertext with a candidate tag and checks whether it is
accepted. If it is not, the tag is incremented by one, for at most
$2^{62}$ queries. Eventually, the distinguisher will hit the valid tag and
is then able to tell whether it is in the real world or in the ideal
world.
\item \TODO
\end{enumerate}
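The arithmetic behind (b) and (f) can be spelled out as follows. The
collision estimate assumes that the nonces are drawn independently and
uniformly at random, which is an assumption made here for illustration; $q$
denotes the number of nonces drawn.
\begin{align*}
128 - 26 &= 102,\\
102\cdot(2^{26}-2) &= 102\cdot 67108862 = 6845103924,\\
\Pr[\text{collision among } q \text{ nonces}] &\le \binom{q}{2}\cdot 2^{-96} = \frac{q(q-1)}{2^{97}}.
\end{align*}
For $q = 2^{48}$ the bound evaluates to just under $1/2$, so a repeated nonce
cannot be expected with constant probability before $q$ is on the order of
$2^{48}$, in line with the birthday estimate.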
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%% SYMMETRIC - UNKEYED %%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\item \textbf{(16 points)}
\begin{enumerate}
\item The chaining value size is $\unit[101]{bits}$ ($=g$) and the message
block size is $655-101=\unit[554]{bits}$.
\item If the message is of size $|M|=\unit[1234567]{bits}$ and the block
size is $\unit[554]{bits}$, we need $\frac{1234866}{554}=2229$ blocks with
a padding of $1234866-1234567=\unit[299]{bits}$. Additionally, the
message length is also encoded in a $\unit[554]{bit}$ block and so the
total number of blocks is $2230$. The total number of blocks corresponds
to the total number of evaluations needed, which is $2230$ evaluations of
$P$.
\item In order for $(x,y)$ to be a valid preimage for the compression
function $F^P$, $x$ must be of size $\unit[800]{bits}$ and contain 55
zeros at the beginning and 90 zeros at the end. The 655 bits in-between
can be modified by an adversary to achieve the required target. Similarly,
$y$ must be of size $\unit[800]{bits}$ where the first 50 and the last 649
bits are discarded. The bits in-between must be 101 zeros to satisfy our
target image. Furthermore, the following condition must be true to achieve
a valid preimage: $[F^P(x)=y]$ where $x$ and $y$ satisfy the
aforementioned conditions.
\item If the adversary makes one forward query, the probability that it hits
the target image is $1/2^{g} = 1/2^{101}$. The adversary wants to find a
$\unit[655]{bit}$ input to map to 101 zeros. Therefore, the whole search
space is $2^{101}$ and the probability with one query is $1/2^{101}$.
\item If the adversary makes one inverse query, the probability that it hits
a preimage is $1/2^{55+90} = 1/2^{145}$. This is due to the fact that the
first $\unit[55]{bits}$ and the last $\unit[90]{bits}$ have to be zeros.
\item An adversary does not gain additional information by using inverse
queries in addition to forward queries. Forward queries have a higher
probability of breaking preimage resistance, so an adversary should focus
on them. The total probability is thus
$1/2^{101}$.
\end{enumerate}
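The numbers used in (b) to (e) can be checked step by step. This is a worked
sketch based only on the values stated in the answers above.
\begin{align*}
2228\cdot 554 = 1234312 &< 1234567 \le 1234866 = 2229\cdot 554,\\
1234866 - 1234567 &= 299,\\
2229 + 1 &= 2230,\\
55 + 655 + 90 &= 800 = 50 + 101 + 649,\\
\frac{2^{-101}}{2^{-145}} &= 2^{44}.
\end{align*}
The last line shows that a single forward query is $2^{44}$ times more likely
to succeed than a single inverse query.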
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%% ASYMMETRIC - LITERATURE %%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\item \textbf{(17 points)}
\begin{enumerate}
\item LEDAcrypt is a post-quantum asymmetric suite of cryptosystems. It
contains a public-key encryption scheme and a key-encapsulation mechanism
(KEM). The underlying hard problem (arbitrary linear binary code decoding)
is currently believed to be secure against quantum adversaries.
\item The authors introduce a post-quantum public-key cryptosystem based on
linear codes.
\item IND-CCA2 is proven for both the KEM and the PKC. IND-CPA is proven for
the KEM.
\item LEDAcrypt is based on the hardness of the decoding problem for linear
codes. Given a parity-check matrix $H$ and a received word $y$, the
syndrome is $s=yH$. The decoding problem is to find a minimum-weight
solution $z_0$ of the equation $s=zH$; the best estimate for the
transmitted codeword is then $x=y+z_0$. Finding a minimum-weight solution
to $s=zH$ given $s$ and $H$ is $\mathsf{NP}$-hard.
\item The private key in LEDAcrypt consists of two binary matrices $Q$ and
$H$. The public key is constructed from the matrix $L=Q\cdot H$. The
security of the scheme relies on the fact that obtaining the original
information from a perturbed codeword is hard unless the factorization of
the public key ($Q\cdot H$) is known. If the aforementioned problem of
decoding linear codes has a polynomial-time solution, an attacker will
also easily be able to obtain the factorization of the public key. If that
was possible, the scheme would be broken.
\item The strongest type of security the authors claim to achieve is
IND-CCA2. The authors use the Fujisaki-Okamoto transform to achieve
IND-CCA2 security.
\item The scheme can be used to exchange symmetric keys between parties
with the usage of the key encapsulation mechanism (KEM). In that scenario,
the sender encrypts a symmetric key with LEDAcrypt and shares the
encrypted key with the other party. The other party then decrypts the
message to obtain the symmetric key which can be used for further
communication.
\item The lowest security level treated by the authors is level 1 of the
NIST security levels corresponding to AES-128. The parameters depend on
whether the scheme is used for ephemeral or long-term keys and what kind
of code rate ($n_0$) is needed. For ephemeral keys with $n_0=2$ the
authors suggest the values $p=14{,}939$, $t=136$, $d_v=11$ and $m=[4,3]$.
For long-term keys the authors suggest the values $p=35{,}899$, $t=136$,
$d_v=9$, $m=[5,4]$, $\overline{t}=4$ and $b_0=44$. These parameters are
chosen with respect to an adversary using Information Set Decoding (ISD)
to find a solution to the underlying hard problem.
\item The size for ephemeral keys is $\unit[452]{bytes}$ (in memory) for the
private key and $\unit[1872]{bytes}$ for the public key. The size for
long-term keys is $\unit[468]{bytes}$ (in memory) for the private key and
$\unit[4488]{bytes}$ for the public key.
\item Kyber512 is also a KEM and achieves the same level of (classical)
security.
\item One advantage of LEDAcrypt is that the key sizes are relatively small
compared to Classic McEliece, for example. Small key sizes are important
for transmission of public keys so that they can fit in commonly used
packet sizes.
\item One disadvantage of the scheme is that it inherently has a non-zero
decoding failure rate (DFR). For ephemeral keys and the lowest security
level, the authors report $14$ decoding failures out of $1.2\cdot
10^9$ decodes. The DFR can be lowered by choosing different parameters,
but the rate is arguably still too high for practical use.
For long-term keys the authors state that 95 out of 100 keys (lowest
security level) provide a DFR of $2^{-64}$, which is arguably still too
high for extended use well into the future.
\end{enumerate}
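To put the two failure rates quoted in (l) on a common scale, the empirical
figure for ephemeral keys can be converted to a power of two. This is a rough
calculation and the rounding is mine.
\[ \frac{14}{1.2\cdot 10^{9}} \approx 1.17\cdot 10^{-8} \approx 2^{-26.4} \]
This is roughly $2^{38}$ times larger than the DFR of $2^{-64}$ stated for
long-term keys.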
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%% ASYMMETRIC - SECURITY %%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\item \textbf{(33 points)}
\begin{enumerate}
\item Let there be an adversary $\mathcal{A}$ which breaks CGI. We can then
construct an adversary $\mathcal{B}$ which breaks CGI2.
Suppose $\mathcal{B}$ is given a CGI2 instance
$(\mathcal{G}_a,\mathcal{G}_b)$ where $a\neq b$ and $\mathcal{G}_a$ and
$\mathcal{G}_b$ are in the set of $2^{130}$ graphs isomorphic to
$\mathcal{G}$. The goal of $\mathcal{B}$ is to find, with non-negligible
advantage, an isomorphism $\phi$ such that $\mathcal{G}_a =
\phi(\mathcal{G}_b)$. $\mathcal{B}$ forwards
$(\mathcal{G}_a,\mathcal{G}_b)$ to $\mathcal{A}$, and $\mathcal{A}$
outputs an isomorphism $\phi$ which satisfies $\mathcal{G}_a =
\phi(\mathcal{G}_b)$. $\mathcal{B}$ then outputs this isomorphism as the
solution to its own instance.
\item First, the prover takes a random isomorphism and generates a
permutation of the given graph $\mathcal{G}$. The resulting graph is the
commitment which is sent to the verifier. The verifier then picks a random
graph from the set of graphs isomorphic to $\mathcal{G}$ and sends it to
the prover. The prover takes this graph and calculates the permutation
needed to arrive at the original graph $\mathcal{G}$. This is the response
which is sent to the verifier. The verifier can then use the response to
check if the graph it picked earlier (in the challenge) is actually
isomorphic to $\mathcal{G}$. If it is, the verifier accepts, otherwise it
rejects.
\item The domain of the commitment scheme is the set of graphs isomorphic to
$\mathcal{G}$ and the range is the number ($2^{130}$) of isomorphic
graphs. The scheme consists of three phases: setup, commitment and
opening. The setup phase consists of choosing an appropriate random
permutation $\psi$ from the set of isomorphisms on $\mathcal{G}$. The
commitment phase takes the isomorphism $\psi$ and the graph $\mathcal{G}$
as input and produces a commitment $\mathcal{G}'$. The opening phase takes
an isomorphism $\mathsf{resp}$ and another graph
$\mathcal{G}_{\mathsf{ch}}$ isomorphic to $\mathcal{G}$ as well as the
original commitment as input and outputs $\top$ if the result matches
$\mathcal{G}'$ and $\bot$ otherwise.
\item Computational binding: Suppose $\mathsf{Comm}(\psi,\mathcal{G}_0) =
\mathsf{Comm}(\psi,\mathcal{G}_1)$. This means that $\psi(\mathcal{G}_0) =
\psi(\mathcal{G}_1)$ and the adversary has found an isomorphism which maps
two different graphs to the same output which corresponds to solving the
CGI problem.
\item If $G_{ch}=\phi_{ch}(G)$ and $G'=\psi(G)$, it follows that
$G=\phi_{ch}^{-1}(G_{ch})$ and therefore $G'=\psi(\phi_{ch}^{-1}(G_{ch}))$
so the verifier will always accept.
\item Suppose $G_{ch}$ is not isomorphic to $G$. $\mathcal{P}$ prepares in
advance for a challenge $ch^*$ and so
$G'=\psi(\phi_{ch^*}^{-1}(G_{ch^*}))$. $\mathcal{P}$ commits to $G'$. If
the challenge by $\mathcal{V}$ is $ch^*$ (so $ch=ch^*$), $\mathcal{V}$ accepts,
otherwise it rejects. Because $ch\in\{0,\dots,2^{130}-1\}$, the
probability that $\mathcal{P}$ convinces $\mathcal{V}$ is
$1/2^{130}$ (soundness error).
\item The soundness error after one iteration is $1/2^{130}$. To achieve
a soundness error of at most $1/2^{192}$, the protocol has to be run
twice, which gives a soundness error of $1/2^{260}$, well below the
required $1/2^{192}$.
\item A simulator $\mathcal{S}$ is built as follows:
\begin{itemize}
\item $\mathcal{S}$ starts $\mathcal{V}^*$ with $G_i$ and
$i\in\{0,\dots,2^{130}-1\}$.
\item $\mathcal{S}$ makes a guess $\mathsf{ch}^*$ and calculates
$G'\leftarrow\psi(\phi_{ch^*}^{-1}(G_{ch^*}))$.
\item $\mathcal{S}$ gets a challenge $\mathsf{ch}$ from $\mathcal{V}^*$.
If $\mathsf{ch}=\mathsf{ch}^*$, $\mathcal{S}$ outputs
$(G',\mathsf{ch}^*,\phi_{ch^*}^{-1}\psi)$. If
$\mathsf{ch}\neq\mathsf{ch}^*$, $\mathcal{S}$ rewinds $\mathcal{V}^*$
and returns to the guessing step.
\end{itemize}
The simulator $\mathcal{S}$ is expected probabilistic polynomial-time with
$2^{130}n$ time and the protocol is zero-knowledge.
\item For completeness see 5e.
Special soundness: given two accepting transcripts for the same commitment
$\mathsf{trans} = (G',\mathsf{ch},\mathsf{resp})$ and $\mathsf{trans}'
= (G',\mathsf{ch'},\mathsf{resp}')$ with $\mathsf{ch}\neq\mathsf{ch'}$, we
have $\mathsf{resp}(G_{ch}) = G' = \mathsf{resp}'(G_{ch'})$ and therefore
\[ G_{ch'} = \bigl((\mathsf{resp}')^{-1}\circ\mathsf{resp}\bigr)(G_{ch}), \]
which means that an isomorphism between $G_{ch}$ and $G_{ch'}$, i.e.\ a
witness, can be extracted with probability 1.
For special HVZK: given $\mathsf{ch}\in\{0,\dots,2^{130}-1\}$ choose
$\mathsf{resp}\xleftarrow{\$}\mathcal{I}_{1107}$ and calculate
$G'\leftarrow\mathsf{resp}(G_{ch})$. The distributions of real transcripts
and simulated transcripts are the same. A given valid transcript occurs with
probability $1/2^{130}$.
\item $\mathsf{ID}_{\mathrm{CGI2}}$ can be used for authentication: a
client (prover) proves to a server (verifier) that it possesses a password
without actually revealing it. The client sends a commitment to the
server and, as soon as the client wants to log in, it receives a challenge
from the server. If the client answers the challenge correctly (i.e.,
the response is consistent with the commitment), it is
authenticated with the server.
The advantage of such a scheme over conventional password-based
authentication is that the secret is never transmitted to anyone.
Furthermore, the commitment is not vulnerable to dictionary attacks,
unlike the password hashes commonly stored on the server's side.
\item \TODO
\item The signer calculates a commitment with a predefined soundness error.
Then the signer calculates the challenge by taking the hash of the message
to be signed and the commitment. Afterwards, it will run the protocol
again and calculate a response for the created challenge (hash) and the
commitment. The signature is a tuple of the commitment and the response.
The verifier can calculate the challenge on its own from the message and
the commitment and then verifies that the response matches the commitment
for that challenge. If it does, the signature is valid, otherwise it is
invalid.
The signature is $\mathsf{EUF}$-$\mathsf{CMA}$ secure if
$\mathsf{ID}_{\mathrm{CGI2}}$ satisfies special soundness and honest
verifier zero-knowledge, which it does. Furthermore, it is secure if the
attacker has a negligible probability of finding a valid signature for a
message which has not been queried before. This rests on the fact that
finding an isomorphism for a specific commitment and challenge which
matches the response is hard.
\item The size of the signature comprises the commitment, which is a hash,
and the response. The hash function is chosen to be $\unit[256]{bits}$ and
the response is $\lceil\log_2 1107\rceil\cdot 1107 = \unit[12.177]{kbit} =
\unit[1522.125]{bytes}$. In total, the signature is
$\unit[1554.125]{bytes}$ in size.
\item The signature can be made smaller if the underlying graphs have fewer
vertices. With $n$ vertices the response takes $n\lceil\log_2 n\rceil$
bits, so the signature shrinks roughly linearly with the number of
vertices.
\end{enumerate}
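The figures in (g) and (m) follow from short calculations, sketched here
under the assumptions already made in those answers: independent repetitions
of the protocol for (g), and for (m) a $\unit[256]{bit}$ hash as commitment
together with a response encoded with $\lceil\log_2 1107\rceil$ bits per
vertex. The symbol $k$ for the number of repetitions is introduced here for
illustration.
\begin{align*}
\bigl(2^{-130}\bigr)^{k} \le 2^{-192} &\iff 130k \ge 192 \iff k \ge 2 \quad (k\in\mathbb{N}),\\
\lceil\log_2 1107\rceil &= 11 \quad\text{since } 2^{10} = 1024 < 1107 \le 2048 = 2^{11},\\
11\cdot 1107 &= 12177\ \text{bits} = 1522.125\ \text{bytes},\\
32 + 1522.125 &= 1554.125\ \text{bytes}.
\end{align*}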
\end{enumerate}
\end{document}