From 0085461c32b88a3ccba1bac490d733e5a2a9407c Mon Sep 17 00:00:00 2001 From: Tobias Eidelpes Date: Wed, 29 Jul 2020 17:10:41 +0200 Subject: [PATCH] Fix spelling and grammatical errors --- defenses.tex | 12 +++++----- methods.tex | 62 ++++++++++++++++++++++++++-------------------------- 2 files changed, 37 insertions(+), 37 deletions(-) diff --git a/defenses.tex b/defenses.tex index f786fdf..fcbe3a8 100644 --- a/defenses.tex +++ b/defenses.tex @@ -5,14 +5,14 @@ The proliferation of tracking across the web has led to the development of a myriad of tools that each have their own advantages and disadvantages. Some tracking methods can be easily mitigated by changing browser settings or by disabling certain technologies. More often than not, these methods not only stop -or limit tracking but also severely hamper the internet experience for end +or limit tracking but also severely hamper the Internet experience for end users. Especially some of the more advanced tools require user input to know which items to block and which to let through. This in turn requires expertise -that few regular internet users possess, further complicating defending against +that few regular Internet users possess, further complicating defending against tracking. This chapter introduces methods and tools that have been proven to be effective against tracking on the web. It is split into two parts, with the first surveying techniques that can be applied to limit tracking and the second -presenting tools to managing tracking on the web. The focus lies on defending +presenting tools to manage tracking on the web. The focus lies on defending against the methods discussed in chapter~\ref{chap:tracking methods}. \section{Techniques} @@ -75,9 +75,9 @@ in chapter~\ref{chap:tracking methods} can be defended against. 
For our purposes, clearing the browser history means not only clearing the web sites that have been visited but also cookies and other relevant data that is -saved with a visit to a web site. All major browser offer this function and what -they delete is similar. Firefox for example allows clearing the browsing and -search history, form and search history, cookies (also flash cookies), the +saved with a visit to a web site. All major browsers offer this functionality and +what they delete is similar. Firefox, for example, allows clearing the browsing +and search history, form and search history, cookies (also flash cookies), the cache, active logins, offline web site data and site preferences such as permissions, zoom level and character encodings. This technique is only beneficial in the long term if users do it frequently to stop any accumulation diff --git a/methods.tex b/methods.tex index d60040d..d5f35fc 100644 --- a/methods.tex +++ b/methods.tex @@ -23,16 +23,16 @@ identifiers. \label{sec:session-based tracking methods} One of the simplest and most used forms of tracking on the Internet relies on -sessions. Since HTTP is a stateless protocol, web servers cannot by default keep -track of any previous client requests. In order to implement specific features -such as personalized advertising, some means to save current and recall previous -states must be used. For this functionality, sessions were introduced. Sessions -represent a temporary and interactive exchange of information between two -parties. Due to their temporary nature, they have to be `brought up' at some -point and `torn down' at a later point in time. It is not specified however, -how long the period between establishing and stopping a session has to be. It -could be only for a single browser session and terminated by the user manually, -or it could be for as long as a year. +sessions. 
Since \gls{HTTP} is a stateless protocol, web servers cannot by +default keep track of any previous client requests. In order to implement +specific features such as personalized advertising, some means to save current +and recall previous states must be used. For this functionality, sessions were +introduced. Sessions represent a temporary and interactive exchange of +information between two parties. Due to their temporary nature, they have to be +`brought up' at some point and `torn down' at a later point in time. It is not +specified, however, how long the period between establishing and stopping a +session has to be. It could be only for a single browser session and terminated +by the user manually, or it could be for as long as a year. \subsection{Passing Information in URLs} @@ -55,7 +55,7 @@ specification to include where and how a particular resource can be found. \end{enumerate} To access a section called \texttt{introduction} in a blog post named -\texttt{blog post} on a host with the domain name \texttt{example.com} over the +\texttt{blog post} on a host with the domain name \texttt{example.com} over \gls{HTTP}, a user might use the following \gls{URI}: \begin{verbatim} @@ -95,7 +95,7 @@ methods. Normally, a user would input data into a form and on clicking \emph{submit} the input would be sent to the server. Sometimes it is necessary to include additional information that the user did not enter. For this reason there exist \emph{hidden} web forms. Hidden web forms do not show on the web site -and therefore the user cannot enter any information. Similar to \gls{URL} +and therefore the user cannot enter any information. Similarly to \gls{URL} parameters, the value parameter in a hidden field contains additional information like the user's preferred language for example. Since almost anything can be sent in a value parameter, hidden form fields present another @@ -200,7 +200,7 @@ storage or---more specifically---maintaining session variables. 
In order to store multiple variables in the window.name property, the values have first to be packed in some way because only a single string is allowed. A \gls{JSON} stringifier converts a normal string into a \gls{JSON} string which is then -ready to be stored in the DOM property. Additionally, serializers can also +ready to be stored in the \gls{DOM} property. Additionally, serializers can also convert JavaScript objects into a \gls{JSON} string. Normally JavaScript's same-origin policy prohibits making requests to servers in another domain, but the window.name property is accessible from other domains and resistant to page @@ -218,8 +218,8 @@ store session data as well but are not limited to that use case. They generally enable more advanced tracking approaches because they have information about the current browser instance and the operating system the browser is running on. Due to their nature of residing on the user's computer, they are in most cases -harder to circumvent, especially when two or more methods are combined resulting -in better resilience against simple defences. +harder to circumvent, especially when two or more methods are combined, resulting +in better resilience against simple defenses. \subsection{HTTP Cookies} \label{subsec:http cookies} @@ -263,7 +263,7 @@ Google's \texttt{analytics.js}) or by using the \gls{HTTP} Set-Cookie response header. Once a request to a web server has been issued, the server can set a cookie in the Set-Cookie header and sends the response back to the client. On the client's side the cookie is stored by the browser and sent with subsequent -requests to the same domain via the Cookie \gls{HTTP} header. An example of a +requests to the same domain via the cookie \gls{HTTP} header. An example of a cookie header is given in Listing~\ref{lst:session cookie header}. Because this example does not set an expiration date for the cookie, it sets a session cookie. 
Session cookies are limited to the current session and are deleted as @@ -317,7 +317,7 @@ cookies which contain more than a unique identifier. This allows for a better understanding and interpretation of complex cookies as they are found in advertising networks with a lot of reach (e.g., doubleclick.net). This information is particularly useful for building applications that effectively -detect and block cookies (see chapter~\ref{chap:defences against tracking}). +detect and block cookies (see chapter~\ref{chap:defenses against tracking}). \subsection{Flash Cookies and Java JNLP PersistenceService} \label{subsec:flash cookies and java jnlp persistenceservice} @@ -414,9 +414,9 @@ necessarily having to know the web site the user visits. \begin{figure}[ht] \centering - \includegraphics[width=1\textwidth]{../figures/cookiesyncing.pdf} - \label{fig:cookie synchronization} + \includegraphics[width=1\textwidth]{figures/cookiesyncing.pdf} \caption{Cookie Synchronization in practice between two trackers + \label{fig:cookie synchronization} \emph{cloudflare.com} and \emph{google.com}.} \end{figure} @@ -762,7 +762,7 @@ is stale and needs to be updated. Commonly, a collision-resistant hash function is used to generate a unique hash of a cached resource which is sent along with the resource in the first \gls{HTTP} request. The resource and the hash—which is stored in the \gls{ETag} header—is then cached by the client. On subsequent -retrievals of the same \gls{URL}, the client checks for an expire date on the +retrievals of the same \gls{URL}, the client checks for an expiration date on the requested \gls{URL} via the Cache-Control and Expire headers. If the \gls{URL} has expired, the client sends a request with the \emph{If-None-Match} field set with the \gls{ETag}. 
The server then compares the \gls{ETag} received by the @@ -780,7 +780,7 @@ the identifier has been placed in the \gls{ETag} header, the server can answer requests to check for an updated resource always with an \gls{HTTP} 301 Not-Modified header, effectively persisting the unique identifier in the client's cache. During their 2011 survey of QuantCast.com's top 100 U.S. based -web sites \citet{ayensonFlashCookiesPrivacy2011} found \texttt{hulu.com} to be +web sites, \citet{ayensonFlashCookiesPrivacy2011} found \texttt{hulu.com} to be using \glspl{ETag} as backup for tracking cookies that are set by \texttt{KISSmetrics} (an analytics platform). This allowed cookies to be respawned once they had been cleared by checking the \gls{ETag} header. @@ -803,15 +803,15 @@ operating system has it's own cache that applications can ask for name resolution. Some applications introduce another layer of caching by having their own cache (e.g., browsers). -\citet{kleinDNSCacheBasedUser2019} demonstrated a tracking method which is -using \gls{DNS} caches to assign unique identifiers to client machines. In -order for the technique to work, the tracker has to have control over one web -server (or multiple) as well as an authoritative \gls{DNS} server which -associates the web servers with a domain name under the control of the tracker. -The tracking process starts once a user agent requests a web site which loads a -script from one of the web servers the attacker is controlling. The process -can then be sketched out as follows (see -\cite[p.~5]{kleinDNSCacheBasedUser2019} for a detailed description). +\citet{kleinDNSCacheBasedUser2019} demonstrate a tracking method which uses +\gls{DNS} caches to assign unique identifiers to client machines. In order for +the technique to work, the tracker has to have control over one web server (or +multiple) as well as an authoritative \gls{DNS} server which associates the web +servers with a domain name under the control of the tracker. 
The tracking +process starts once a user agent requests a web site which loads a script from +one of the web servers the attacker is controlling. The process can then be +sketched out as follows (see \cite[p.~5]{kleinDNSCacheBasedUser2019} for a +detailed description). \begin{enumerate} \item The snippet loads a resource from muliple domains (\texttt{1.ex.com}, @@ -852,7 +852,7 @@ for example. \label{subsec:tls session resumption} \gls{TLS} is widely used today to securely encapsulate communication across the -web. For bandwidth savings and better performance it is possible to cache a +web. For bandwidth savings and better performance, it is possible to cache a \gls{TLS} session to allow reusing an already established secure connection at a later point in time. Versions prior to \gls{TLS} 1.3 used two mechanisms to accomplish this: \gls{TLS} session identifiers and session tickets. Session