Fix spelling and grammatical errors
parent
f31d2d4f9f
commit
0085461c32
12
defenses.tex
@@ -5,14 +5,14 @@ The proliferation of tracking across the web has led to the development of a
 myriad of tools that each have their own advantages and disadvantages. Some
 tracking methods can be easily mitigated by changing browser settings or by
 disabling certain technologies. More often than not, these methods not only stop
-or limit tracking but also severely hamper the internet experience for end
+or limit tracking but also severely hamper the Internet experience for end
 users. Especially some of the more advanced tools require user input to know
 which items to block and which to let through. This in turn requires expertise
-that few regular internet users possess, further complicating defending against
+that few regular Internet users possess, further complicating defending against
 tracking. This chapter introduces methods and tools that have been proven to be
 effective against tracking on the web. It is split into two parts, with the
 first surveying techniques that can be applied to limit tracking and the second
-presenting tools to managing tracking on the web. The focus lies on defending
+presenting tools to manage tracking on the web. The focus lies on defending
 against the methods discussed in chapter~\ref{chap:tracking methods}.
 
 \section{Techniques}
@@ -75,9 +75,9 @@ in chapter~\ref{chap:tracking methods} can be defended against.
 
 For our purposes, clearing the browser history means not only clearing the web
 sites that have been visited but also cookies and other relevant data that is
-saved with a visit to a web site. All major browser offer this function and what
-they delete is similar. Firefox for example allows clearing the browsing and
-search history, form and search history, cookies (also flash cookies), the
+saved with a visit to a web site. All major browser offer this functionality and
+what they delete is similar. Firefox, for example, allows clearing the browsing
+and search history, form and search history, cookies (also flash cookies), the
 cache, active logins, offline web site data and site preferences such as
 permissions, zoom level and character encodings. This technique is only
 beneficial in the long term if users do it frequently to stop any accumulation
62
methods.tex
@@ -23,16 +23,16 @@ identifiers.
 \label{sec:session-based tracking methods}
 
 One of the simplest and most used forms of tracking on the Internet relies on
-sessions. Since HTTP is a stateless protocol, web servers cannot by default keep
-track of any previous client requests. In order to implement specific features
-such as personalized advertising, some means to save current and recall previous
-states must be used. For this functionality, sessions were introduced. Sessions
-represent a temporary and interactive exchange of information between two
-parties. Due to their temporary nature, they have to be `brought up' at some
-point and `torn down' at a later point in time. It is not specified however,
-how long the period between establishing and stopping a session has to be. It
-could be only for a single browser session and terminated by the user manually,
-or it could be for as long as a year.
+sessions. Since \gls{HTTP} is a stateless protocol, web servers cannot by
+default keep track of any previous client requests. In order to implement
+specific features such as personalized advertising, some means to save current
+and recall previous states must be used. For this functionality, sessions were
+introduced. Sessions represent a temporary and interactive exchange of
+information between two parties. Due to their temporary nature, they have to be
+`brought up' at some point and `torn down' at a later point in time. It is not
+specified however, how long the period between establishing and stopping a
+session has to be. It could be only for a single browser session and terminated
+by the user manually, or it could be for as long as a year.
 
 
 \subsection{Passing Information in URLs}
@@ -55,7 +55,7 @@ specification to include where and how a particular resource can be found.
 \end{enumerate}
 
 To access a section called \texttt{introduction} in a blog post named
-\texttt{blog post} on a host with the domain name \texttt{example.com} over the
+\texttt{blog post} on a host with the domain name \texttt{example.com} over
 \gls{HTTP}, a user might use the following \gls{URI}:
 
 \begin{verbatim}
@@ -95,7 +95,7 @@ methods. Normally, a user would input data into a form and on clicking
 \emph{submit} the input would be sent to the server. Sometimes it is necessary
 to include additional information that the user did not enter. For this reason
 there exist \emph{hidden} web forms. Hidden web forms do not show on the web site
-and therefore the user cannot enter any information. Similar to \gls{URL}
+and therefore the user cannot enter any information. Similarly to \gls{URL}
 parameters, the value parameter in a hidden field contains additional
 information like the user's preferred language for example. Since almost
 anything can be sent in a value parameter, hidden form fields present another
@@ -200,7 +200,7 @@ storage or---more specifically---maintaining session variables. In order to
 store multiple variables in the window.name property, the values have first to
 be packed in some way because only a single string is allowed. A \gls{JSON}
 stringifier converts a normal string into a \gls{JSON} string which is then
-ready to be stored in the DOM property. Additionally, serializers can also
+ready to be stored in the \gls{DOM} property. Additionally, serializers can also
 convert JavaScript objects into a \gls{JSON} string. Normally JavaScript's
 same-origin policy prohibits making requests to servers in another domain, but
 the window.name property is accessible from other domains and resistant to page
@@ -218,8 +218,8 @@ store session data as well but are not limited to that use case. They generally
 enable more advanced tracking approaches because they have information about the
 current browser instance and the operating system the browser is running on. Due
 to their nature of residing on the user's computer, they are in most cases
-harder to circumvent, especially when two or more methods are combined resulting
-in better resilience against simple defences.
+harder to circumvent, especially when two or more methods are combined, resulting
+in better resilience against simple defenses.
 
 \subsection{HTTP Cookies}
 \label{subsec:http cookies}
@@ -263,7 +263,7 @@ Google's \texttt{analytics.js}) or by using the \gls{HTTP} Set-Cookie response
 header. Once a request to a web server has been issued, the server can set a
 cookie in the Set-Cookie header and sends the response back to the client. On
 the client's side the cookie is stored by the browser and sent with subsequent
-requests to the same domain via the Cookie \gls{HTTP} header. An example of a
+requests to the same domain via the cookie \gls{HTTP} header. An example of a
 cookie header is given in Listing~\ref{lst:session cookie header}. Because this
 example does not set an expiration date for the cookie, it sets a session
 cookie. Session cookies are limited to the current session and are deleted as
@@ -317,7 +317,7 @@ cookies which contain more than a unique identifier. This allows for a better
 understanding and interpretation of complex cookies as they are found in
 advertising networks with a lot of reach (e.g., doubleclick.net). This
 information is particularly useful for building applications that effectively
-detect and block cookies (see chapter~\ref{chap:defences against tracking}).
+detect and block cookies (see chapter~\ref{chap:defenses against tracking}).
 
 \subsection{Flash Cookies and Java JNLP PersistenceService}
 \label{subsec:flash cookies and java jnlp persistenceservice}
@@ -414,9 +414,9 @@ necessarily having to know the web site the user visits.
 
 \begin{figure}[ht]
 \centering
-\includegraphics[width=1\textwidth]{../figures/cookiesyncing.pdf}
-\label{fig:cookie synchronization}
+\includegraphics[width=1\textwidth]{figures/cookiesyncing.pdf}
 \caption{Cookie Synchronization in practice between two trackers
+\label{fig:cookie synchronization}
 \emph{cloudflare.com} and \emph{google.com}.}
 \end{figure}
 
@@ -762,7 +762,7 @@ is stale and needs to be updated. Commonly, a collision-resistant hash function
 is used to generate a unique hash of a cached resource which is sent along with
 the resource in the first \gls{HTTP} request. The resource and the hash---which is
 stored in the \gls{ETag} header---is then cached by the client. On subsequent
-retrievals of the same \gls{URL}, the client checks for an expire date on the
+retrievals of the same \gls{URL}, the client checks for an expiration date on the
 requested \gls{URL} via the Cache-Control and Expire headers. If the \gls{URL}
 has expired, the client sends a request with the \emph{If-None-Match} field set
 with the \gls{ETag}. The server then compares the \gls{ETag} received by the
@@ -780,7 +780,7 @@ the identifier has been placed in the \gls{ETag} header, the server can answer
 requests to check for an updated resource always with an \gls{HTTP} 301
 Not-Modified header, effectively persisting the unique identifier in the
 client's cache. During their 2011 survey of QuantCast.com's top 100 U.S. based
-web sites \citet{ayensonFlashCookiesPrivacy2011} found \texttt{hulu.com} to be
+web sites, \citet{ayensonFlashCookiesPrivacy2011} found \texttt{hulu.com} to be
 using \glspl{ETag} as backup for tracking cookies that are set by
 \texttt{KISSmetrics} (an analytics platform). This allowed cookies to be
 respawned once they had been cleared by checking the \gls{ETag} header.
@@ -803,15 +803,15 @@ operating system has it's own cache that applications can ask for name
 resolution. Some applications introduce another layer of caching by having their
 own cache (e.g., browsers).
 
-\citet{kleinDNSCacheBasedUser2019} demonstrated a tracking method which is
-using \gls{DNS} caches to assign unique identifiers to client machines. In
-order for the technique to work, the tracker has to have control over one web
-server (or multiple) as well as an authoritative \gls{DNS} server which
-associates the web servers with a domain name under the control of the tracker.
-The tracking process starts once a user agent requests a web site which loads a
-script from one of the web servers the attacker is controlling. The process
-can then be sketched out as follows (see
-\cite[p.~5]{kleinDNSCacheBasedUser2019} for a detailed description).
+\citet{kleinDNSCacheBasedUser2019} demonstrate a tracking method which uses
+\gls{DNS} caches to assign unique identifiers to client machines. In order for
+the technique to work, the tracker has to have control over one web server (or
+multiple) as well as an authoritative \gls{DNS} server which associates the web
+servers with a domain name under the control of the tracker. The tracking
+process starts once a user agent requests a web site which loads a script from
+one of the web servers the attacker is controlling. The process can then be
+sketched out as follows (see \cite[p.~5]{kleinDNSCacheBasedUser2019} for a
+detailed description).
 
 \begin{enumerate}
 \item The snippet loads a resource from muliple domains (\texttt{1.ex.com},
@@ -852,7 +852,7 @@ for example.
 \label{subsec:tls session resumption}
 
 \gls{TLS} is widely used today to securely encapsulate communication across the
-web. For bandwidth savings and better performance it is possible to cache a
+web. For bandwidth savings and better performance, it is possible to cache a
 \gls{TLS} session to allow reusing an already established secure connection at a
 later point in time. Versions prior to \gls{TLS} 1.3 used two mechanisms to
 accomplish this: \gls{TLS} session identifiers and session tickets. Session