\documentclass[runningheads]{llncs}

\usepackage{graphicx}
\usepackage[backend=biber,style=numeric]{biblatex}
\usepackage{hyperref}
\usepackage{amsmath}

\hypersetup{
  colorlinks=true,
  linkcolor=black,
  urlcolor=blue,
  citecolor=black
}

\addbibresource{trustworthy-ai.bib}

\begin{document}

\title{Trustworthy Artificial Intelligence}
\author{Tobias Eidelpes}
\authorrunning{T. Eidelpes}

\institute{Technische Universität Wien, Karlsplatz 13, 1040 Wien, Austria\\
\email{e1527193@student.tuwien.ac.at}}

\maketitle

\begin{abstract}
The abstract should briefly summarize the contents of the paper in
150--250 words.

\keywords{Artificial Intelligence \and Trustworthiness \and Social Computing}
\end{abstract}

\section{Introduction}
\label{sec:introduction}

The use of artificial intelligence (AI) in computing has seen an unprecedented
rise over the last few years. From humble beginnings as a tool to aid humans in
decision-making to advanced use cases where human interaction is avoided as much
as possible, AI has transformed the way we live our lives today. The
transformative capabilities of AI are not just felt in the area of computer
science, but have spread into a diverse set of other disciplines such as
biology, chemistry, mathematics and economics. For the purposes of this work,
AIs are machines that can learn, make decisions autonomously and interact with
the environment~\cite{russell_artificial_2021}.

While the possibilities of AI are seemingly endless, the public is slowly but
steadily learning about its limitations. These limitations manifest themselves
in areas such as autonomous driving and medicine, fields where AI can have a
direct, potentially life-changing impact on people's lives. A self-driving car
operates on roads where accidents can happen at any time. Decisions made by the
car before, during and after an accident can have severe consequences for
everyone involved. In medicine, AIs are increasingly used to drive human
decision-making. The more critical the proper use and functioning of AI is, the
more trust in its architecture and results is required. Trust, however, is not
easily defined, especially in relation to artificial intelligence.

This work will explore the following question: \emph{Can artificial intelligence
be trustworthy, and if so, how?} To be able to discuss this question, trust has
to be defined and dissected into its constituent components.
Section~\ref{sec:modeling-trust} analyzes trust and molds the gained insights
into a framework suitable for interactions between humans and artificial
intelligence. Section~\ref{sec:taxonomy} approaches trustworthiness in
artificial intelligence from a computing perspective. There are various
technical means of making AIs more \emph{trustworthy}, and that section
discusses and summarizes important methods and approaches.
Section~\ref{sec:social-computing} discusses combining humans and artificial
intelligence into one coherent system which is capable of achieving more than
either of its parts on its own.

\section{Trust}
\label{sec:modeling-trust}

To define the requirements and goals of \emph{trustworthy AI}, it is important
to know what trust is and how we humans establish trust with someone or
something. This section therefore defines and explores different forms of
trust.

\subsection{Defining Trust}

Commonly, \emph{trusting someone} means having confidence in another person's
ability to do certain things. This can mean that we trust someone to speak the
truth to us or to competently do the things that we \emph{entrust} them to do.
We trust the person delivering the mail to do so on time and without mail
getting lost on the way to our doors. We trust people knowledgeable in a certain
field such as medicine to be able to advise us when we need medical advice.
Trusting in these contexts means ceding control over a particular aspect of our
lives to someone else. We do so in the expectation that the trustee does not
violate our \emph{social agreement} by acting against our interests. Often we
are not able to confirm that the trustee has indeed done their job. Sometimes we
find out only later that what was done was not in line with our own interests.
Trust is therefore also always a function of time. Previously entrusted people
can, depending on their track record, either continue to be trusted or lose
trust.

We do not only trust certain people to act on our behalf; we can also place
trust in things rather than people. Every technical device or gadget receives
our trust to some extent, because we expect it to perform its intended
function. This relationship encompasses \emph{dumb} devices such as vacuum
cleaners and refrigerators, as well as seemingly \emph{intelligent} systems such
as algorithms performing medical diagnoses. Artificial intelligence systems
belong to the latter category when they are functioning well, but can easily
slip into the former, as in the case of a poorly trained machine learning
algorithm that always classifies pictures of dogs and cats as dogs.

Scholars usually divide trust into either \emph{cognitive} or
\emph{non-cognitive} forms. While cognitive trust involves some sort of rational
and objective evaluation of the trustee's capabilities, non-cognitive trust
lacks such an evaluation. For instance, if a patient comes to a doctor with a
health problem which resides in the doctor's domain, the patient will place
trust in the doctor because of the doctor's experience, track record and
education. The patient thus consciously decides to trust the doctor to solve
the problem rather than a friend who does not have any expertise. Conversely,
non-cognitive trust allows humans to place trust in people they know well,
without a need for rational justification, simply because of their existing
relationship.

Due to the different dimensions of trust and its inherent complexity in
different contexts, frameworks for trust are an active field of research. One
such framework, proposed by \textcite{ferrario_ai_2020}, will be discussed in
the following sections.

\subsection{Incremental Model of Trust}

The framework by \textcite{ferrario_ai_2020} consists of three types of trust:
simple trust, reflective trust and paradigmatic trust. Their model thus consists
of the triple

\[ T = \langle\text{simple trust}, \text{reflective trust}, \text{paradigmatic
trust}\rangle \]

\noindent and a 5-tuple

\[ \langle X, Y, A, G, C\rangle \]

\noindent where $X$ (the trustor) and $Y$ (the trustee) denote the interacting
agents and $A$ the action to be performed by the agent $Y$ to achieve goal $G$.
$C$ stands for the context in which the action takes place.

\subsubsection{Simple Trust} is a non-cognitive form of trust and the least
demanding form of trust in the incremental model. $X$ trusts $Y$ to perform an
action $A$ to pursue the goal $G$ without requiring additional information about
$Y$'s ability to generate a satisfactory outcome. In other words, $X$
\emph{depends} on $Y$ to perform an action. $X$ has no control over the process
and also does not want to control it or the outcome. Many day-to-day
interactions happen in some form or another under simple trust: we (simply)
trust a stranger on the street to show us the right way when we are lost.
Sometimes simple trust is unavoidable because of the trustor's inability to
obtain additional information about the other party. Children, for example, have
to simply trust adults, not because they want to but out of necessity. This
changes when they get older and develop their ability to better judge other
people.

\subsubsection{Reflective Trust} adds an additional layer to the simple trust
model: trustworthiness. Trustworthiness can be defined as the cognitive belief
of $X$ that $Y$ is trustworthy. Reflective trust involves a cognitive process
which allows a trustor to obtain reasons for trusting a potential trustee. $X$
believes in the trustworthiness of $Y$ because there are reasons for $Y$ being
trustworthy. Contrary to simple trust, reflective trust includes the aspect of
control. For an agent $X$ to \emph{reflectively} trust another agent $Y$, $X$
has objective reasons to trust $Y$ but is not willing to do so without control.
Reflective trust does not have to be expressed in binary form but can also be
expressed by a subjective measure of confidence. The more likely a trustee $Y$
is to perform action $A$ towards a goal $G$, the higher $X$'s confidence in $Y$
is. Additionally, $X$ might have high reflective trust in $Y$ but still not
trust $Y$ to perform a given task because of other, potentially unconscious,
reasons.

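
The graded form of reflective trust mentioned above can be sketched as follows.
The symbols $c_X$, $p$ and $f$ are notation assumed here for illustration and
are not taken from \textcite{ferrario_ai_2020}. Let $p \in [0, 1]$ be $X$'s
estimate of how likely $Y$ is to perform $A$ towards $G$. The confidence of $X$
in $Y$ might then be written as

\[ c_X(Y, A, G) = f(p), \qquad f \colon [0, 1] \to [0, 1] \text{ non-decreasing}, \]

\noindent so that a higher estimated likelihood never lowers the confidence.
The binary form corresponds to the special case in which $f$ only takes the
values $0$ and $1$.
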
\subsubsection{Paradigmatic Trust} is the last form of trust in the incremental
model proposed by \textcite{ferrario_ai_2020}. In addition to having objective
reasons to trust $Y$, $X$ is also willing to do so without control. It is thus a
combination of simple trust and reflective trust. Simple trust provides the
non-cognitive, non-controlling aspect of trust and reflective trust provides the
cognitive aspect.

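
As a compact summary of the three forms described above, they can be laid out
along two properties: whether $X$ holds objective reasons for trusting $Y$, and
whether $X$ retains control over $Y$. The column labels are shorthand introduced
here for illustration and are not notation from \textcite{ferrario_ai_2020}.

\[
\begin{array}{lcc}
\text{Form of trust} & \text{Objective reasons} & \text{Control by } X \\
\hline
\text{simple}       & \text{no}  & \text{no}  \\
\text{reflective}   & \text{yes} & \text{yes} \\
\text{paradigmatic} & \text{yes} & \text{no}
\end{array}
\]
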
\subsection{Application of the Model}

Since the incremental model of trust can be applied to human-human as well as
human-AI interactions, an example which draws from both domains will be
presented. The setting is that of a company which ships tailor-made machine
learning (ML) solutions to other firms. On the human-human interaction side
there are multiple teams working on different aspects of the software. The
hierarchical structure between bosses, their team leaders and their developers
is composed of different forms of trust. A boss has worked with a specific team
leader in the past and thus knows from experience that the team leader can be
trusted without control (paradigmatic trust). The team leader has led this
particular team for a number of projects already but has recently hired a new
junior developer. Because of the new hire's impressive credentials, the team
leader has some objective evidence that the new colleague is capable of
delivering good work on time, but needs more time to be able to trust them
without control (reflective trust).

On the human-AI side, one developer is working on a machine learning algorithm
to achieve a specific goal $G$. Taking the 5-tuple from the incremental model,
$X$ is the developer, $Y$ is the machine learning algorithm and $A$ is the
action the machine learning algorithm takes to achieve the goal $G$. In the
beginning, $X$ does not yet trust $Y$ to do its job properly, because there is
no record yet of how well the algorithm performs in achieving $G$. While most,
if not all, parameters of $Y$ have to be controlled by $X$ in the beginning,
less and less control is needed if $Y$ achieves $G$ consistently. As accurate
performance metrics accumulate over time, the cognitive trust in $Y$ increases
as well.

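
Written out in the notation of the model, the human-AI example might be
instantiated as follows. The assignment of $C$ is an assumption made here,
since the example leaves the context implicit.

\begin{align*}
X &= \text{the developer (trustor)}, \\
Y &= \text{the machine learning algorithm (trustee)}, \\
A &= \text{the action $Y$ takes to achieve $G$}, \\
G &= \text{the specific goal the algorithm is built for}, \\
C &= \text{the company's development setting (assumed)}.
\end{align*}

\noindent One possible reading, which goes beyond what the example states, is
that the relationship starts out as reflective trust and moves towards
paradigmatic trust as $X$ gives up control while objective reasons accumulate.
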
\section{Computational Aspects of Trustworthy AI}
\label{sec:taxonomy}

\section{Social Computing}
\label{sec:social-computing}

\section{Conclusion}
\label{sec:conclusion}

\printbibliography

\end{document}