diff --git a/introduction.tex b/introduction.tex
index c54e41e..34acd80 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -25,4 +25,61 @@ used to track individuals on the Internet and which countermeasures exist?}
 
 \section{Terms and Scope}
 \label{sec:terms and scope}
+This thesis will focus on web tracking as employed by, for example, advertising
+companies. When users visit a web site which uses third-party content from
+advertisers, those advertisers collect bits of information about the user. These
+bits of information are not yet associated with a particular user but with an
+online identity which is usually tied to a unique identifier. The unique
+identifiers are by themselves not meaningful because the same user might get
+multiple unique identifiers, each associated with different bits of information.
+To aggregate these bits of information into one profile which
+approximates a user's personality, needs and wants, tracking mechanisms are
+used. In many cases the goal is to persist tracking identifiers on the user's
+computer for as long as possible and to avoid assigning multiple identifiers to
+the same person.
+The tracking mechanisms presented in this work are mechanisms which store
+information on the user's computer. In other words, they are \emph{stateful}
+mechanisms. Such mechanisms include \gls{HTTP} cookies and various forms of
+caches. In contrast to stateful mechanisms, \emph{stateless} mechanisms do not
+store information on the user's computer but attempt to infer information by
+reading the browser state. This can mean knowing which fonts are installed and
+inferring that a particular user is using a Windows operating system instead of
+Linux or that they are visiting with a mobile browser and not from a desktop.
+This type of tracking is also called \emph{device fingerprinting}. With enough
+such attributes combined, trackers can uniquely identify a user or device by
+knowing that no other entity uses the Internet with the same fingerprint.
+Stateless tracking mechanisms are not discussed in this work; instead, the focus
+is on stateful tracking mechanisms.
+
+\section{Methodology}
+\label{sec:methodology}
+
+This work gives an overview of tracking methods and defenses which have been
+studied in the literature. To this end, a comprehensive literature review of
+relevant research is performed, with a focus on recent developments. Papers are
+collected using digital libraries and search engines such as the
+\emph{ACM Digital Library}, the \emph{IEEE Xplore Digital Library},
+\emph{Google Scholar} and, for selected works yet to appear in peer-reviewed
+journals, \emph{arXiv.org}. Additionally, well-known journals and proceedings
+like \emph{Computers \& Security} and \emph{Proceedings on Privacy Enhancing
+Technologies} are manually searched for relevant papers. The search terms used
+include but are not limited to keywords such as \emph{Stateful Web Tracking},
+\emph{Web Tracking}, \emph{Tracking Measurement} and variants thereof.
+Furthermore, queries for the names of particular tracking methods are made. For
+information on \emph{Cookie Synchronization} (section~\ref{subsec:cookie
+synchronization}), for instance, separate search queries are performed.
+
+\section{Structure of the Thesis}
+\label{sec:structure of the thesis}
+
+The thesis is divided into two major parts: chapter~\ref{chap:tracking methods}
+is concerned with how web sites on the Internet track individuals and
+chapter~\ref{chap:defenses against tracking} presents ways for users to defend
+themselves against those tracking methods. Chapter~\ref{chap:tracking methods}
+is split into three parts, each focusing on a subset of tracking methods that
+can be grouped together. The chapter on defenses against tracking first presents
+ways in which users can use existing browser features to limit tracking. Its
+second part discusses specialized tools which focus on one aspect of tracking
+and summarizes research concerned with the effectiveness of these tools. The
+thesis concludes in chapter~\ref{chap:conclusion}.
 