\documentclass[12pt,a4paper]{article}

\usepackage[cm]{fullpage}
\usepackage{amsthm}
\usepackage{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{xspace}
\usepackage[english]{babel}
\usepackage{fancyhdr}
\usepackage{titling}
\usepackage{hyperref}
\renewcommand{\thesection}{Exercise \Alph{section}:}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This part needs customization from you %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\newcommand{\groupnumber}{04}
\newcommand{\name}{Tobias Eidelpes, Mehmet Ege Demirsoy, Nejra Komic}
\newcommand{\matriculation}{01527193, 01641187, 11719704}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% End of customization %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\newcommand{\projnumber}{1}
\newcommand{\Title}{Analysing the Blockchain}
\setlength{\headheight}{15.2pt}
\setlength{\headsep}{20pt}
\setlength{\textheight}{680pt}
\pagestyle{fancy}
\fancyhf{}
\fancyhead[L]{Cryptocurrencies - Project \projnumber\ - Analysing the Blockchain}
\fancyhead[C]{}
\fancyhead[R]{\name}
\renewcommand{\headrulewidth}{0.4pt}
\fancyfoot[C]{\thepage}

\begin{document}
\thispagestyle{empty}
\noindent\framebox[\linewidth]{%
\begin{minipage}{\linewidth}%
\hspace*{5pt} \textbf{Cryptocurrencies (WS2021/22)} \hfill Prof.~Matteo Maffei \hspace*{5pt}\\

\begin{center}
{\bf\Large Project \projnumber~-- \Title}
\end{center}

\vspace*{5pt}\hspace*{5pt} \hfill TU Wien \hspace*{5pt}
\end{minipage}%
}
\vspace{0.5cm}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section*{Group \groupnumber}
Our group consists of the following members:
\begin{center}
\textbf{\name} %please fill the information above

\matriculation %please fill the information above
\end{center}

\section{Finding invalid blocks}

For this exercise we had to find all invalid blocks contained in the database
provided to us. There is an
official\footnote{\url{https://en.bitcoin.it/wiki/Protocol\_rules\#.22block.22\_messages}}
algorithm which allows network participants to verify whether a block is valid,
but the stripped-down version of the blockchain we received does not require
all of its steps. We therefore use a reduced version of the algorithm, which
imposes the following constraints on the data:

\begin{enumerate}
    \item All blocks which do not have the coinbase transaction as their first
        transaction are invalid. We first create a view which lists all
        coinbase transactions. Then we query the database for the first
        transaction of each block and check whether that transaction is in the
        view of coinbase transactions. If it is not, we reject the block and
        add it to the list of invalid blocks.
    \item All blocks which contain transactions without inputs or without
        outputs are invalid. We split this task into two queries: one checks
        whether a block contains transactions with zero inputs, the other
        whether it contains transactions with zero outputs.
    \item All blocks which have transactions with an invalid output value or
        where the sum of all output values exceeds the legal money range are
        invalid. This task is split into two queries as well: one checks
        whether individual output values are outside of the legal money range,
        the other whether the sum of all output values per transaction is
        outside of the legal money range.
    \item All blocks which have transactions with inputs that do not have a
        corresponding output are invalid. For this task we first create a view
        of all non-coinbase transactions. From it we derive the inputs which
        are not part of a coinbase transaction (the non-coinbase inputs).
        Finally, the non-coinbase inputs are outer-joined with the outputs;
        rows containing \texttt{NULL} as their \texttt{value} indicate an
        invalid block.
    \item All blocks which contain transactions where an input's
        \texttt{sig\_id} field is not the same as the \texttt{pk\_id} field of
        the output it spends are invalid. Since we are not interested in the
        coinbase transactions, the query again joins the non-coinbase inputs
        with the outputs. If the two fields do not match, the block is invalid.
    \item All blocks which have inputs that spend outputs which have already
        been spent are invalid. This task is split into three queries. First,
        we find all outputs which are referenced by more than one input.
        Second, for each of these outputs we find the input which spent it
        first. Third, the two tables are combined so that blocks containing an
        input which spends such an output, but is not listed as its first
        spending occurrence, are marked as invalid.
    \item All blocks containing input values which are not in the legal money
        range are invalid. First, we construct a view which gathers, for each
        transaction, the sum of the values of all its inputs. All blocks
        containing input sums outside of the legal money range are marked as
        invalid. Second, we reuse the view of all non-coinbase inputs and
        filter it for the inputs whose referenced output value is outside of
        the legal money range.
    \item All blocks where the sum of input values is smaller than the sum of
        output values are invalid. This task allows us to reuse the view of
        all input sums created earlier. The sum of output values is obtained
        in the same way. After joining input sums and output sums, we filter
        for blocks whose input sums are smaller than their output sums. Those
        blocks are invalid.
    \item All blocks where the coinbase value is larger than the sum of the
        block creation fee and all transaction fees are invalid. This task is
        split into four queries. First, we create a view which shows all block
        ids and their coinbase values. Second, we compute the sum of all input
        values per block. Third, we repeat that query for the sum of the
        output values per block. Lastly, these three tables are joined and all
        blocks which violate the constraint are marked as invalid.
\end{enumerate}
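
As a compact summary, the value constraints of items 3, 7, 8 and 9 can be
written as inequalities. The notation is introduced here purely for
illustration and does not appear in the database: $v$ denotes a single value in
satoshis, $\mathrm{in}(t)$ and $\mathrm{out}(t)$ the inputs and outputs of a
non-coinbase transaction $t$, $c_b$ the coinbase value of block $b$ and $s_b$
its block creation fee. We assume that the legal money range is Bitcoin's
usual bound of 21 million coins.
\begin{align*}
    0 \le v &\le V_{\max} = 21 \cdot 10^{6} \cdot 10^{8}
        && \text{(items 3 and 7)} \\
    \sum_{o \in \mathrm{out}(t)} v_o &\le \sum_{i \in \mathrm{in}(t)} v_i
        && \text{(item 8)} \\
    c_b &\le s_b + \sum_{t \in b} \Bigl( \sum_{i \in \mathrm{in}(t)} v_i
        - \sum_{o \in \mathrm{out}(t)} v_o \Bigr)
        && \text{(item 9)}
\end{align*}
A block is reported as invalid as soon as one of these inequalities is
violated. The bound $V_{\max}$ also applies to the per-transaction sums of
input and output values.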

Finally, the invalid blocks are written to the \texttt{invalid\_blocks} table
and all duplicates are removed.
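
The checks of items 2 and 3, together with the final de-duplication, can be
sketched as follows. This is a minimal illustration under assumptions, not the
exact queries of the solution: Python's \texttt{sqlite3} module stands in for
the course database, the table \texttt{transactions} and the columns
\texttt{tx\_id}, \texttt{block\_id} and \texttt{output\_id} are assumed names,
and the legal money range is assumed to end at 21 million bitcoins expressed in
satoshis.

```python
import sqlite3

# Assumed upper bound of the legal money range, in satoshis.
MAX_MONEY = 21_000_000 * 100_000_000


def find_invalid_blocks(conn):
    """Collect blocks violating items 2 and 3 and return their ids, sorted.

    Assumed schema (hypothetical names):
      transactions(tx_id, block_id)
      inputs(input_id, tx_id, sig_id, output_id)
      outputs(output_id, tx_id, pk_id, value)
    """
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS invalid_blocks (block_id INTEGER);

        -- Item 2: transactions without any input or without any output.
        INSERT INTO invalid_blocks
        SELECT t.block_id
        FROM transactions t
        WHERE NOT EXISTS (SELECT 1 FROM inputs i WHERE i.tx_id = t.tx_id)
           OR NOT EXISTS (SELECT 1 FROM outputs o WHERE o.tx_id = t.tx_id);
    """)

    # Item 3, first query: individual output values outside the money range.
    conn.execute("""
        INSERT INTO invalid_blocks
        SELECT t.block_id
        FROM transactions t JOIN outputs o ON o.tx_id = t.tx_id
        WHERE o.value < 0 OR o.value > ?
    """, (MAX_MONEY,))

    # Item 3, second query: per-transaction output sum above the money range.
    conn.execute("""
        INSERT INTO invalid_blocks
        SELECT t.block_id
        FROM transactions t JOIN outputs o ON o.tx_id = t.tx_id
        GROUP BY t.tx_id, t.block_id
        HAVING SUM(o.value) > ?
    """, (MAX_MONEY,))

    # Remove duplicates: keep one row per block id.
    conn.execute("""
        DELETE FROM invalid_blocks
        WHERE rowid NOT IN (
            SELECT MIN(rowid) FROM invalid_blocks GROUP BY block_id
        )
    """)
    rows = conn.execute(
        "SELECT block_id FROM invalid_blocks ORDER BY block_id")
    return [row[0] for row in rows]
```

Each check appends block ids independently, so a block that fails several
checks appears several times until the final \texttt{DELETE} runs. The
remaining items would follow the same pattern with their own queries.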
\section{UTXOs}
In this exercise we were given a smaller data set than in the first exercise
and had to work with unspent transaction outputs (UTXOs). A transaction output
is unspent if it is not used as an input to a later transaction.

The exercise asks for the following three tables:

\begin{enumerate}
    \item The table \texttt{utxos} with columns \texttt{output\_id} and \texttt{value} should contain all UTXOs as of the last block of the data set. To fill it, we select those rows of the \texttt{outputs} table whose \texttt{output\_id} is not referenced in the \texttt{inputs} table, using a \texttt{WHERE NOT EXISTS} clause.

    \item The table \texttt{number\_of\_utxos} with column \texttt{utxo\_count} should contain as single entry the total number of UTXOs. To obtain it, we count the \texttt{output\_id} values present in the \texttt{utxos} table filled for the previous constraint. A \texttt{COUNT(output\_id)} aggregate is sufficient here.

    \item The table \texttt{id\_of\_max\_utxo} with the column \texttt{max\_utxo} should contain as single entry the id of the UTXO with the highest associated value. To find it, we order the \texttt{utxos} table by value in descending order, so that the UTXO with the highest value comes first. Adding a \texttt{LIMIT 1} clause then returns only this top entry.
\end{enumerate}

For each constraint, we insert the result into the corresponding given table.
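
The three steps can be sketched as follows. This is a minimal illustration
that uses Python's \texttt{sqlite3} module as a stand-in for the course
database. The result tables and their columns are named as in the exercise,
while the exact layout of \texttt{inputs} and \texttt{outputs} beyond
\texttt{output\_id} and \texttt{value} is an assumption.

```python
import sqlite3


def build_utxo_tables(conn):
    """Fill utxos, number_of_utxos and id_of_max_utxo (a sketch).

    Assumes inputs(output_id, ...) references the spent output and
    outputs(output_id, value, ...) holds every transaction output.
    """
    conn.executescript("""
        -- The exercise provides these tables; they are created here only
        -- so that the sketch is self-contained.
        CREATE TABLE IF NOT EXISTS utxos (output_id INTEGER, value INTEGER);
        CREATE TABLE IF NOT EXISTS number_of_utxos (utxo_count INTEGER);
        CREATE TABLE IF NOT EXISTS id_of_max_utxo (max_utxo INTEGER);

        -- 1. Outputs that no input refers to are unspent.
        INSERT INTO utxos
        SELECT o.output_id, o.value
        FROM outputs o
        WHERE NOT EXISTS (
            SELECT 1 FROM inputs i WHERE i.output_id = o.output_id
        );

        -- 2. Total number of UTXOs.
        INSERT INTO number_of_utxos
        SELECT COUNT(output_id) FROM utxos;

        -- 3. Id of the UTXO with the highest value.
        INSERT INTO id_of_max_utxo
        SELECT output_id FROM utxos ORDER BY value DESC LIMIT 1;
    """)
```

If several UTXOs share the highest value, \texttt{LIMIT 1} returns one of them
without a guaranteed choice; adding \texttt{output\_id} as a second sort key
would make the result deterministic.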
\section{De-anonymization}
In this exercise we had to attempt a de-anonymization using the following two heuristics:

\begin{enumerate}
    \item Joint control: addresses used as inputs to a common transaction are controlled by the same entity.

    \item Serial control: the output address of a transaction with only a single input and a single output is usually controlled by the same entity that owns the input address.
\end{enumerate}

The first part of this exercise was to insert into the table \texttt{addressRelations} all pairs of addresses related by the two heuristics above. For this we first create two views, one per heuristic, containing the transactions to which the heuristic applies. We then use these views to find pairs of addresses by performing (multiple) joins with the \texttt{inputs} table and insert the result into a temporary table called \texttt{tempRelations}. Since the result contains reflexive and symmetric pairs, we define additional queries which delete these pairs from the \texttt{tempRelations} table. With the reflexive and symmetric pairs deleted, we insert the remaining \texttt{tempRelations} pairs into \texttt{addressRelations}, which concludes the first part.
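
The pair construction can be sketched as follows, again with Python's
\texttt{sqlite3} module as a stand-in for the course database. Several points
are assumptions made for illustration: addresses are taken to be the
\texttt{sig\_id} of an input and the \texttt{pk\_id} of an output, the columns
\texttt{addr1} and \texttt{addr2} are hypothetical names, and coinbase inputs
are not treated specially. Instead of separate delete queries, the sketch
drops reflexive pairs and normalises symmetric ones in a single
\texttt{INSERT}.

```python
import sqlite3


def build_address_relations(conn):
    """Fill addressRelations from both heuristics (a sketch).

    Assumed schema (hypothetical names):
      inputs(input_id, tx_id, sig_id, output_id)
      outputs(output_id, tx_id, pk_id, value)
    """
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS addressRelations (
            addr1 INTEGER, addr2 INTEGER
        );

        CREATE TEMP TABLE tempRelations AS
        -- Joint control: all pairs of input addresses of one transaction.
        SELECT a.sig_id AS addr1, b.sig_id AS addr2
        FROM inputs a JOIN inputs b ON a.tx_id = b.tx_id
        UNION ALL
        -- Serial control: transactions with exactly one input and one output.
        SELECT i.sig_id, o.pk_id
        FROM inputs i JOIN outputs o ON o.tx_id = i.tx_id
        WHERE (SELECT COUNT(*) FROM inputs x WHERE x.tx_id = i.tx_id) = 1
          AND (SELECT COUNT(*) FROM outputs y WHERE y.tx_id = i.tx_id) = 1;

        -- Drop reflexive pairs and store each symmetric pair only once,
        -- with the smaller address first.
        INSERT INTO addressRelations
        SELECT DISTINCT MIN(addr1, addr2), MAX(addr1, addr2)
        FROM tempRelations
        WHERE addr1 <> addr2;
    """)
```

The two-argument \texttt{MIN} and \texttt{MAX} used here are SQLite's scalar
functions; in another database system the same normalisation would typically
use \texttt{LEAST} and \texttt{GREATEST}.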

For the second part, the function \texttt{clusterAddresses()} was provided, which clusters the address pairs into entities with (artificial) ids and returns a table of entity ids and the addresses belonging to each entity. As a first step, we save the result of this function in a temporary table. After that, the following constraints have to be satisfied:

\begin{enumerate}
    \item The table \texttt{max\_value\_by\_entity} with column \texttt{value} should contain as single entry the maximum total value of (unspent) satoshis controlled by one cluster (one entity). To simplify the queries, we save all UTXOs with their addresses, transaction ids, \texttt{output\_id}s and values in a temporary table. We then join this table with the table that maps addresses to clusters, group the rows by entity id and apply the built-in \texttt{SUM} function to the values. Applying the built-in \texttt{MAX} function to this result yields the maximum value.

    (Note: for readability, our solution saves the \texttt{SUM} results in a temporary table and queries that table for the maximum value.)

    \item The table \texttt{min\_addr\_of\_max\_entity} with column \texttt{addr} should contain as single entry the (numerically) lowest address of the cluster (the entity) controlling the most total (unspent) bitcoins. To solve this, we first create a temporary table called \texttt{temp\_max\_entity} containing the entity id, addresses and UTXO values of the entity with the maximum UTXO value from constraint 1. We then use this table to select all addresses of this entity from the cluster table and save them in another temporary table called \texttt{max\_entity\_all\_addresses}. As the last step, we run a \texttt{MIN} query on \texttt{max\_entity\_all\_addresses} to satisfy the constraint.

    \item The table \texttt{max\_tx\_to\_max\_entity} with column \texttt{tx\_id} should contain as single entry the transaction sending the greatest number of bitcoins to the cluster (the entity) controlling the most total (unspent) bitcoins. We start by creating a temporary table called \texttt{max\_tx\_value\_to\_max\_entity}, which stores the largest value that a transaction sends to an address of the max entity. To obtain the transaction ids, we join \texttt{outputs} with \texttt{max\_entity\_all\_addresses}; this join is defined in a \texttt{WITH..AS} clause called \texttt{max\_entity\_join\_outputs}. We then use \texttt{max\_entity\_join\_outputs} again and additionally filter its rows by the value stored in \texttt{max\_tx\_value\_to\_max\_entity}, which leaves us with the desired transaction. The id of this transaction is then inserted into the given table.
\end{enumerate}
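
The three queries can be sketched together as follows, once more with Python's
\texttt{sqlite3} module as a stand-in for the course database. The table
\texttt{clusters} with columns \texttt{entity\_id} and \texttt{addr} is a
hypothetical name for the saved result of \texttt{clusterAddresses()}, the
temporary tables \texttt{utxo\_details} and \texttt{entity\_sums} are likewise
illustrative, and addresses are assumed to be stored in the \texttt{pk\_id}
column of \texttt{outputs}.

```python
import sqlite3


def entity_statistics(conn):
    """Return (max total UTXO value, lowest address, tx id) for the
    entity controlling the most unspent value (a sketch).

    Assumed schema (hypothetical names):
      clusters(entity_id, addr)
      inputs(input_id, tx_id, sig_id, output_id)
      outputs(output_id, tx_id, pk_id, value)
    """
    conn.executescript("""
        -- UTXOs with their receiving address and transaction.
        CREATE TEMP TABLE utxo_details AS
        SELECT o.output_id, o.tx_id, o.pk_id AS addr, o.value
        FROM outputs o
        WHERE NOT EXISTS (
            SELECT 1 FROM inputs i WHERE i.output_id = o.output_id
        );

        -- Total unspent value per entity.
        CREATE TEMP TABLE entity_sums AS
        SELECT c.entity_id, SUM(u.value) AS total
        FROM clusters c JOIN utxo_details u ON u.addr = c.addr
        GROUP BY c.entity_id;
    """)

    # Constraint 1: the largest per-entity total.
    max_value = conn.execute(
        "SELECT MAX(total) FROM entity_sums").fetchone()[0]
    entity_id = conn.execute(
        "SELECT entity_id FROM entity_sums ORDER BY total DESC LIMIT 1"
    ).fetchone()[0]

    # Constraint 2: the numerically lowest address of that entity.
    min_addr = conn.execute(
        "SELECT MIN(addr) FROM clusters WHERE entity_id = ?", (entity_id,)
    ).fetchone()[0]

    # Constraint 3: the transaction with the largest single output paid
    # to one of the entity's addresses.
    tx_id = conn.execute("""
        SELECT o.tx_id
        FROM outputs o JOIN clusters c ON c.addr = o.pk_id
        WHERE c.entity_id = ?
        ORDER BY o.value DESC LIMIT 1
    """, (entity_id,)).fetchone()[0]

    return max_value, min_addr, tx_id
```

The sketch ranks transactions by their largest single output to the entity. If
a transaction may pay the same entity through several outputs, summing the
values per transaction before ranking is an alternative reading of the task.
Addresses that do not appear in \texttt{clusters} are ignored.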
\section*{Work distribution}
%Fill in here an overview on which group member participated in which task and to which extent

\begin{description}
    \item[Tobias Eidelpes] Code and report for Exercise A.
    \item[Ege Mehmet Demirsoy] Code and report for Exercise C.
    \item[Nejra Komic] Code and report for Exercise B.
\end{description}

\end{document}