Fix spelling and grammatical errors

Tobias Eidelpes 2020-07-29 17:10:41 +02:00
parent f31d2d4f9f
commit 0085461c32
2 changed files with 37 additions and 37 deletions

@ -5,14 +5,14 @@ The proliferation of tracking across the web has led to the development of a
myriad of tools that each have their own advantages and disadvantages. Some
tracking methods can be easily mitigated by changing browser settings or by
disabling certain technologies. More often than not, these methods not only stop
or limit tracking but also severely hamper the internet experience for end
or limit tracking but also severely hamper the Internet experience for end
users. Especially some of the more advanced tools require user input to know
which items to block and which to let through. This in turn requires expertise
that few regular internet users possess, further complicating defending against
that few regular Internet users possess, further complicating defending against
tracking. This chapter introduces methods and tools that have been proven to be
effective against tracking on the web. It is split into two parts, with the
first surveying techniques that can be applied to limit tracking and the second
presenting tools to managing tracking on the web. The focus lies on defending
presenting tools to manage tracking on the web. The focus lies on defending
against the methods discussed in chapter~\ref{chap:tracking methods}.
\section{Techniques}
@ -75,9 +75,9 @@ in chapter~\ref{chap:tracking methods} can be defended against.
For our purposes, clearing the browser history means not only clearing the web
sites that have been visited but also cookies and other relevant data that is
saved with a visit to a web site. All major browser offer this function and what
they delete is similar. Firefox for example allows clearing the browsing and
search history, form and search history, cookies (also flash cookies), the
saved with a visit to a web site. All major browser offer this functionality and
what they delete is similar. Firefox, for example, allows clearing the browsing
and search history, form and search history, cookies (also flash cookies), the
cache, active logins, offline web site data and site preferences such as
permissions, zoom level and character encodings. This technique is only
beneficial in the long term if users do it frequently to stop any accumulation

@ -23,16 +23,16 @@ identifiers.
\label{sec:session-based tracking methods}
One of the simplest and most used forms of tracking on the Internet relies on
sessions. Since HTTP is a stateless protocol, web servers cannot by default keep
track of any previous client requests. In order to implement specific features
such as personalized advertising, some means to save current and recall previous
states must be used. For this functionality, sessions were introduced. Sessions
represent a temporary and interactive exchange of information between two
parties. Due to their temporary nature, they have to be `brought up' at some
point and `torn down' at a later point in time. It is not specified however,
how long the period between establishing and stopping a session has to be. It
could be only for a single browser session and terminated by the user manually,
or it could be for as long as a year.
sessions. Since \gls{HTTP} is a stateless protocol, web servers cannot by
default keep track of any previous client requests. In order to implement
specific features such as personalized advertising, some means to save current
and recall previous states must be used. For this functionality, sessions were
introduced. Sessions represent a temporary and interactive exchange of
information between two parties. Due to their temporary nature, they have to be
`brought up' at some point and `torn down' at a later point in time. It is not
specified however, how long the period between establishing and stopping a
session has to be. It could be only for a single browser session and terminated
by the user manually, or it could be for as long as a year.
\subsection{Passing Information in URLs}
@ -55,7 +55,7 @@ specification to include where and how a particular resource can be found.
\end{enumerate}
To access a section called \texttt{introduction} in a blog post named
\texttt{blog post} on a host with the domain name \texttt{example.com} over the
\texttt{blog post} on a host with the domain name \texttt{example.com} over
\gls{HTTP}, a user might use the following \gls{URI}:
\begin{verbatim}
@ -95,7 +95,7 @@ methods. Normally, a user would input data into a form and on clicking
\emph{submit} the input would be sent to the server. Sometimes it is necessary
to include additional information that the user did not enter. For this reason
there exist \emph{hidden} web forms. Hidden web forms do not show on the web site
and therefore the user cannot enter any information. Similar to \gls{URL}
and therefore the user cannot enter any information. Similarly to \gls{URL}
parameters, the value parameter in a hidden field contains additional
information like the user's preferred language for example. Since almost
anything can be sent in a value parameter, hidden form fields present another
@ -200,7 +200,7 @@ storage or---more specifically---maintaining session variables. In order to
store multiple variables in the window.name property, the values have first to
be packed in some way because only a single string is allowed. A \gls{JSON}
stringifier converts a normal string into a \gls{JSON} string which is then
ready to be stored in the DOM property. Additionally, serializers can also
ready to be stored in the \gls{DOM} property. Additionally, serializers can also
convert JavaScript objects into a \gls{JSON} string. Normally JavaScript's
same-origin policy prohibits making requests to servers in another domain, but
the window.name property is accessible from other domains and resistant to page
@ -218,8 +218,8 @@ store session data as well but are not limited to that use case. They generally
enable more advanced tracking approaches because they have information about the
current browser instance and the operating system the browser is running on. Due
to their nature of residing on the user's computer, they are in most cases
harder to circumvent, especially when two or more methods are combined resulting
in better resilience against simple defences.
harder to circumvent, especially when two or more methods are combined, resulting
in better resilience against simple defenses.
\subsection{HTTP Cookies}
\label{subsec:http cookies}
@ -263,7 +263,7 @@ Google's \texttt{analytics.js}) or by using the \gls{HTTP} Set-Cookie response
header. Once a request to a web server has been issued, the server can set a
cookie in the Set-Cookie header and sends the response back to the client. On
the client's side the cookie is stored by the browser and sent with subsequent
requests to the same domain via the Cookie \gls{HTTP} header. An example of a
requests to the same domain via the cookie \gls{HTTP} header. An example of a
cookie header is given in Listing~\ref{lst:session cookie header}. Because this
example does not set an expiration date for the cookie, it sets a session
cookie. Session cookies are limited to the current session and are deleted as
@ -317,7 +317,7 @@ cookies which contain more than a unique identifier. This allows for a better
understanding and interpretation of complex cookies as they are found in
advertising networks with a lot of reach (e.g., doubleclick.net). This
information is particularly useful for building applications that effectively
detect and block cookies (see chapter~\ref{chap:defences against tracking}).
detect and block cookies (see chapter~\ref{chap:defenses against tracking}).
\subsection{Flash Cookies and Java JNLP PersistenceService}
\label{subsec:flash cookies and java jnlp persistenceservice}
@ -414,9 +414,9 @@ necessarily having to know the web site the user visits.
\begin{figure}[ht]
\centering
\includegraphics[width=1\textwidth]{../figures/cookiesyncing.pdf}
\label{fig:cookie synchronization}
\includegraphics[width=1\textwidth]{figures/cookiesyncing.pdf}
\caption{Cookie Synchronization in practice between two trackers
\label{fig:cookie synchronization}
\emph{cloudflare.com} and \emph{google.com}.}
\end{figure}
@ -762,7 +762,7 @@ is stale and needs to be updated. Commonly, a collision-resistant hash function
is used to generate a unique hash of a cached resource which is sent along with
the resource in the first \gls{HTTP} request. The resource and the hash—which is
stored in the \gls{ETag} header—is then cached by the client. On subsequent
retrievals of the same \gls{URL}, the client checks for an expire date on the
retrievals of the same \gls{URL}, the client checks for an expiration date on the
requested \gls{URL} via the Cache-Control and Expire headers. If the \gls{URL}
has expired, the client sends a request with the \emph{If-None-Match} field set
with the \gls{ETag}. The server then compares the \gls{ETag} received by the
@ -780,7 +780,7 @@ the identifier has been placed in the \gls{ETag} header, the server can answer
requests to check for an updated resource always with an \gls{HTTP} 304
Not Modified header, effectively persisting the unique identifier in the
client's cache. During their 2011 survey of QuantCast.com's top 100 U.S. based
web sites \citet{ayensonFlashCookiesPrivacy2011} found \texttt{hulu.com} to be
web sites, \citet{ayensonFlashCookiesPrivacy2011} found \texttt{hulu.com} to be
using \glspl{ETag} as backup for tracking cookies that are set by
\texttt{KISSmetrics} (an analytics platform). This allowed cookies to be
respawned once they had been cleared by checking the \gls{ETag} header.
@ -803,15 +803,15 @@ operating system has its own cache that applications can ask for name
resolution. Some applications introduce another layer of caching by having their
own cache (e.g., browsers).
\citet{kleinDNSCacheBasedUser2019} demonstrated a tracking method which is
using \gls{DNS} caches to assign unique identifiers to client machines. In
order for the technique to work, the tracker has to have control over one web
server (or multiple) as well as an authoritative \gls{DNS} server which
associates the web servers with a domain name under the control of the tracker.
The tracking process starts once a user agent requests a web site which loads a
script from one of the web servers the attacker is controlling. The process
can then be sketched out as follows (see
\cite[p.~5]{kleinDNSCacheBasedUser2019} for a detailed description).
\citet{kleinDNSCacheBasedUser2019} demonstrate a tracking method which uses
\gls{DNS} caches to assign unique identifiers to client machines. In order for
the technique to work, the tracker has to have control over one web server (or
multiple) as well as an authoritative \gls{DNS} server which associates the web
servers with a domain name under the control of the tracker. The tracking
process starts once a user agent requests a web site which loads a script from
one of the web servers the attacker is controlling. The process can then be
sketched out as follows (see \cite[p.~5]{kleinDNSCacheBasedUser2019} for a
detailed description).
\begin{enumerate}
\item The snippet loads a resource from multiple domains (\texttt{1.ex.com},
@ -852,7 +852,7 @@ for example.
\label{subsec:tls session resumption}
\gls{TLS} is widely used today to securely encapsulate communication across the
web. For bandwidth savings and better performance it is possible to cache a
web. For bandwidth savings and better performance, it is possible to cache a
\gls{TLS} session to allow reusing an already established secure connection at a
later point in time. Versions prior to \gls{TLS} 1.3 used two mechanisms to
accomplish this: \gls{TLS} session identifiers and session tickets. Session