\chapter{Defenses against Tracking}%
\label{chap:defenses against tracking}
The proliferation of tracking across the web has led to the development of a
myriad of tools that each have their own advantages and disadvantages. Some
tracking methods can be easily mitigated by changing browser settings or by
disabling certain technologies. More often than not, these measures not only stop
or limit tracking but also severely hamper the internet experience for end
users. Some of the more advanced tools in particular require user input to know
which items to block and which to let through. This in turn requires expertise
that few regular internet users possess, which further complicates defending
against tracking. This chapter introduces methods and tools that have been shown
to be effective against tracking on the web. It is split into two parts: the
first surveys techniques that can be applied to limit tracking and the second
presents tools for managing tracking on the web. The focus lies on defending
against the methods discussed in chapter~\ref{chap:tracking methods}.
\section{Techniques}
\label{sec:techniques}
The aim of this section is to present comparatively simple techniques that a
user can employ to limit tracking. The benefit of these methods is that they are
built into modern browsers and therefore do not require users to know how to
install any additional tools. Although their implementations vary from
one browser to another, the underlying functionality remains
the same.
\subsection{Opt-out and Opt-in}
\label{subsec:opt-out}
To opt out in the context of web tracking means to make use of the possibility
of turning off data collection by a web site. After the user has opted out of
either all data collection or only a subset of the data that a web site
collects, an opt-out cookie is set, indicating the user's preference. Whereas
opt-out generally means that data collection happens by default, opt-in
requires that data collection is turned off by default. In theory, this allows
users to have fine-grained control over which aspects of their online presence
they are comfortable with sharing by either opting out or opting in (depending
on how web sites ask for consent). In practice, however, the seemingly minor
difference between the two leads to very different outcomes with respect to the
number of users that are tracked.

For either opt-out or opt-in to work, a web site has to provide an option for
doing so. Because web sites increasingly use third parties to manage data
collection on their site, consent or rejection has to be passed on to these third
parties and they have to be willing to honor such a decision. Since the
European Union's \gls{GDPR} became applicable in 2018, service providers operating in
the European Union are required to ask users for explicit consent before
collecting any data, except when that data is absolutely necessary to ensure
basic functionality. Merely notifying users that they consent to data collection
by continuing to visit the web site is not sufficient. Furthermore, if consent
is not given, the web site provider is not allowed to block the user from
visiting the web site. Even before the \gls{GDPR}, the EU required web sites to
ask for informed consent via the ePrivacy Directive, which came into force in
2013. \citet{trevisanYearsEUCookie2019} use their tool \emph{CookieCheck} to
evaluate how many of the surveyed 35,000 sites comply with the legislation put
forth in the ePrivacy Directive. Their findings indicate that almost half (49\%)
of the web sites use profiling technologies without consent. Similarly,
\citet{sanchez-rolaCanOptOut2019a} show that, a year after the \gls{GDPR} took
effect, tracking is still prevalent and already happens before user consent is
given. \citet{huCharacterisingThirdParty2019} come to a similar
conclusion while only looking at third-party tracking: the number of cookies
stored on a user's computer has not changed significantly since before the
\gls{GDPR}. In yet another survey of the top 500 web sites as ranked by Alexa,
\citet{degelingWeValueYour2019} conclude that the amount of tracking before and
after the \gls{GDPR} stayed the same and only 37 sites ask for consent before
storing any cookies.

If users are given a choice whether they want to share their personal
information and web sites honor that choice, all of the methods discussed
in chapter~\ref{chap:tracking methods} can be defended against.
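The difference between the two consent models can be illustrated with a minimal
sketch of how a web site might evaluate the preference cookie on each request.
The sketch is an assumption for illustration only and does not describe any
particular provider: the cookie names \texttt{tracking\_opt\_out} and
\texttt{tracking\_opt\_in} as well as the function names are hypothetical. Only
Python's standard \texttt{http.cookies} module is used.

```python
from http.cookies import SimpleCookie

# Hypothetical cookie names; real providers choose their own.
OPT_OUT_COOKIE = "tracking_opt_out"
OPT_IN_COOKIE = "tracking_opt_in"


def parse_cookies(cookie_header: str) -> dict:
    """Parse the value of an HTTP Cookie request header into a dict."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    return {name: morsel.value for name, morsel in jar.items()}


def may_collect_data(cookie_header: str, consent_model: str) -> bool:
    """Decide whether data may be collected for this request.

    "opt-out": collect by default unless the opt-out cookie is present.
    "opt-in":  do not collect unless the opt-in cookie is present.
    """
    cookies = parse_cookies(cookie_header)
    if consent_model == "opt-out":
        return cookies.get(OPT_OUT_COOKIE) != "1"
    if consent_model == "opt-in":
        return cookies.get(OPT_IN_COOKIE) == "1"
    raise ValueError(f"unknown consent model: {consent_model}")


def opt_out_header() -> str:
    """Build a Set-Cookie header value that records an opt-out for one year."""
    jar = SimpleCookie()
    jar[OPT_OUT_COOKIE] = "1"
    jar[OPT_OUT_COOKIE]["max-age"] = 365 * 24 * 60 * 60
    jar[OPT_OUT_COOKIE]["path"] = "/"
    return jar[OPT_OUT_COOKIE].OutputString()
```

Under these assumptions, a request without any cookies is tracked in the opt-out
model but not in the opt-in model. It also follows that deleting the opt-out
cookie silently re-enables data collection in the opt-out model.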
\subsection{Clearing Browser History}
\label{subsec:Clearing Browser History}
For our purposes, clearing the browser history means not only clearing the web
sites that have been visited but also cookies and other relevant data that is
saved with a visit to a web site. All major browsers offer this function and what
they delete is similar. Firefox, for example, allows clearing the browsing and
download history, form and search history, cookies (including Flash cookies), the
cache, active logins, offline web site data and site preferences such as
permissions, zoom level and character encodings. This technique is only
beneficial in the long term if users apply it frequently to stop any accumulation
of tracking identifiers in caches, cookies or other site data. The downside is
that not having a history to go back to can hamper user experience depending on
the workflow of each user. Furthermore, opt-out or opt-in preferences are deleted
as well, making the technique in section~\ref{subsec:opt-out} less effective.

Clearing the browser history is effective against some storage-based tracking
methods. Evercookie (section~\ref{subsec:evercookie}) and cookie synchronization
(section~\ref{subsec:cookie synchronization}) are designed to respawn deleted
items in the browser history and can therefore not be mitigated in this way.
Almost all cache-based methods are also mitigated by frequently clearing the
browser history as long as users do not authenticate themselves with a web
service. An exception is the \gls{DNS} cache attack by
\citet{kleinDNSCacheBasedUser2019}, who demonstrate that it works across history
deletions. Session-based methods are not affected by
history clearing because they are intended to track a user for one session only.
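Why respawning defeats history clearing can be sketched with a toy model. It is
a simplified illustration under stated assumptions, not the actual Evercookie
implementation: the storage area names, the identifier \texttt{id-123} and all
function names are hypothetical. The model only assumes that the same identifier
is mirrored in several storage areas and that at least one of them may not be
covered by clearing.

```python
from typing import Dict, Iterable, Optional

Storages = Dict[str, Optional[str]]


def clear_history(storages: Storages, cleared_areas: Iterable[str]) -> None:
    """Delete the identifier from every storage area that clearing covers."""
    for area in cleared_areas:
        if area in storages:
            storages[area] = None


def respawn(storages: Storages) -> None:
    """Restore the identifier everywhere if at least one copy survived."""
    survivors = [value for value in storages.values() if value is not None]
    if survivors:
        for area in storages:
            storages[area] = survivors[0]


def identifier_after_clearing(cleared_areas: Iterable[str]) -> Optional[str]:
    """Return the cookie value a tracker sees after clearing and respawning."""
    # Hypothetical storage areas that all hold the same identifier.
    storages: Storages = {
        "http_cookie": "id-123",
        "web_storage": "id-123",
        "area_not_cleared": "id-123",
    }
    clear_history(storages, cleared_areas)
    respawn(storages)
    return storages["http_cookie"]
```

In this model, the identifier is only lost if every storage area is cleared at
the same time. A single surviving copy is enough to restore all others on the
next visit.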
\subsection{Private Browsing Mode}
\label{subsec:Private Browsing Mode}
The private browsing mode is a feature offered by all major browsers that is
intended to improve privacy by not allowing access to storage areas within the
browser. Users associate it with an increase in privacy compared to normal or
public mode. Unfortunately, implementations of the private browsing mode are inconsistent
across browsers and what is deemed worthy of protection is largely up to browser
vendors. \citet[p.~440]{xuUCognitoPrivateBrowsing2015} provide a comprehensive
overview of browsers and their private browsing mode practices. Most notably,
Safari allows access to earlier cookies, history and HTML5 storage while other
browsers disallow it. Table~\ref{tab:private browsing mode} provides a list of
browsers and their protection against tracking with the methods from
chapter~\ref{chap:tracking methods}.
\begin{sidewaystable}
\caption{Tracking in private browsing mode for major browsers (Yes: tracking remains possible; No: tracking is prevented; NA: not applicable)}
\label{tab:private browsing mode}
\centering
\begin{tabular}{|l|l|c|c|c|c|}
\hline
\multicolumn{1}{|c|}{\textbf{Section}} & \multicolumn{1}{c|}{\textbf{Tracking Method}} & \multicolumn{4}{c|}{ \textbf{Tracking in Private Browsing Mode}} \\
\hline
\multicolumn{2}{|l|}{} & \textbf{Safari} & \textbf{Firefox} & \textbf{Chrome} & \textbf{IE} \\
\hline
\multicolumn{6}{|l|}{\textbf{Session-based} } \\
\hline
\ref{subsec:passing information in urls} & Passing Information in URLs & NA & NA & NA & NA \\
\hline
\ref{subsec:hidden form fields} & Hidden Form Fields & NA & NA & NA & NA \\
\hline
\ref{subsec:http referer} & HTTP Referer & NA & NA & NA & NA \\
\hline
\ref{subsec:explicit authentication} & Explicit Authentication & NA & NA & NA & NA \\
\hline
\ref{subsec:window.name dom property} & window.name DOM property & NA & NA & NA & NA \\
\hline
\multicolumn{6}{|l|}{\textbf{Storage-based} } \\
\hline
\ref{subsec:http cookies} & HTTP cookies & Yes & No & No & No \\
\hline
\ref{subsec:flash cookies and java jnlp persistenceservice} & Flash Cookies and Java JNLP PersistenceService & Yes & Yes & Yes & Yes \\
\hline
\ref{subsec:evercookie} & Evercookie & Yes & No & No & No \\
\hline
\ref{subsec:cookie synchronization} & Cookie Synchronization & Yes & Yes & Yes & Yes \\
\hline
\ref{subsec:silverlight isolated storage} & Silverlight Isolated Storage & Yes & No & No & No \\
\hline
\ref{subsec:html5 web storage} & HTML5 Web Storage & Yes & No & No & No \\
\hline
\ref{subsec:html5 indexed database api} & HTML5 Indexed Database API & Yes & No & No & No \\
\hline
\ref{subsec:web sql database} & Web SQL Database & Yes & No & No & No \\
\hline
\multicolumn{6}{|l|}{\textbf{Cache-based} } \\
\hline
\ref{subsec:web cache} & Web Cache & Yes & No & No & No \\
\hline
\ref{subsec:cache timing} & Cache Timing & Yes & No & No & No \\
\hline
\ref{subsec:cache control directives} & Cache Control Directives & Yes & No & No & No \\
\hline
\ref{subsec:dns cache} & DNS Cache & Yes & Yes & Yes & Yes \\
\hline
\ref{subsec:tls session resumption} & TLS Session Resumption & Yes & No & No & No \\
\hline
\end{tabular}
\end{sidewaystable}
\subsection{Do Not Track}
\label{subsec:Do Not Track}
\subsection{Privacy-focused Search Engines}
\label{subsec:Privacy-focused Search Engines}
\section{Tools}
\label{sec:tools}
\subsection{Blacklists}
\label{subsec:blacklists}
\subsection{TOR}
\label{subsec:tor}
\subsection{Virtual Private Networks}
\label{subsec:virtual private networks}
\subsection{Privacy Badger}
\label{subsec:privacy badger}
\subsection{Request Policy}
\label{subsec:Request Policy}