diff --git a/figures/social-compute-unit.png b/figures/social-compute-unit.png
new file mode 100644
index 0000000..129767d
Binary files /dev/null and b/figures/social-compute-unit.png differ
diff --git a/trustworthy-ai.tex b/trustworthy-ai.tex
index 57d2f6d..1768029 100644
--- a/trustworthy-ai.tex
+++ b/trustworthy-ai.tex
@@ -325,6 +325,7 @@ transformations can be a difficult endeavor because prediction accuracy can
 suffer.
 
 \subsection{Explainability}
+\label{ssec:explainability}
 
 Recent advances in artificial intelligence can mostly be attributed to an
 ever-increasing model complexity, made possible by massive deep neural networks
@@ -471,6 +472,137 @@ output \cite{hintonDistillingKnowledgeNeural2015}.
 
 \section{Social Computing}
 \label{sec:social-computing}
 
+So far, trustworthy AI has been defined by exploring how trust is formed
+between humans and other humans or AI. From a computational perspective,
+section~\ref{sec:taxonomy} provides various insights into how artificial
+intelligence can be made more trustworthy. Since implementing these
+recommendations for improved designs of artificial intelligence systems
+requires a thorough understanding of technical matters, some systems will not
+be sufficiently equipped for an environment in which trustworthiness is a
+requirement. Furthermore, a system---regardless of whether it falls into the
+AI category or not---is almost always embedded in a context which includes
+many other agents. These agents comprise not only technical machinery but
+also humans who are part of the larger system. Humans interact with these
+systems through interfaces specifically designed for this purpose. It is
+therefore reasonable to assume that the \emph{social context} in which a
+system operates plays an important part in its trustworthiness. Especially
+with regard to business environments, models which integrate both aspects
+well promise considerable benefits. This relationship---between an AI system
+and the human context in which it is situated---is the topic of this section.
+
+\subsection{Example Scenario}
+\label{sec:example-scenario}
+
+A common scenario in a monitoring environment is the following. A business
+operates multiple services which facilitate interactions with its customers.
+All of these services produce log messages which contain information about
+the service's current status, how many customers are currently being served,
+and potential errors the system is encountering. With a large number of
+services running in parallel, an increasing amount of log messages is
+produced. These log messages have to be inspected by either another system or
+a unit of humans so that only the most relevant messages are passed to the
+next stage. The next stage might be a manager or a similar entity responsible
+for decisions concerning the ongoing operation of the services.
+
+There are multiple dimensions to this scenario which require different
+approaches to the organization of the
+workflow~\cite{dustdarSocialComputeUnit2011}. The first dimension is the
+\emph{number of events}. A system which produces a relatively low number of
+events, owing to a low number of services, can most likely be managed
+appropriately by a handful of humans. If the number of events is high or
+crosses a certain threshold, the sheer amount of information arriving at the
+processors (in this case humans) becomes overwhelming.
+
+The second dimension is a product of the heterogeneous environment of an IT
+service provider. Since the services in operation likely do not follow the
+same standards for what log messages or events look like, when they are
+produced, and how they are reported, processors are faced with a high
+\emph{event variability}~\cite{dustdarSocialComputeUnit2011}. High event
+variability places a high cognitive load on any human responsible for
+processing the incoming events: the higher the variability, the higher the
+cognitive load.
+
+The third dimension concerns the likely necessity to \emph{upgrade} the
+systems once \emph{growth} happens~\cite{dustdarSocialComputeUnit2011}. A
+change in the underlying systems often requires the processors of the events
+to adapt to the changing environment as well. They have to know in which way
+the systems have changed and what kinds of different events they produce so
+that they can continue to fulfill their critical function. A failure to adapt
+downstream structures to the changing systems can stunt the business's growth
+and its ability to compete.
+
+To deal with the complexity of these three dimensions, a common solution is
+to have a system (often rule-based) which extracts relevant information from
+all events and forwards a processed version to human agents for further
+inspection and resolution. Additionally, a second group of human agents is
+necessary to change the rules of the system so that it can keep up with
+changes in its environment. \textcite{dustdarSocialComputeUnit2011} propose
+such a group and term it the \emph{Social Compute Unit}.
+
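+To make the role of this rule-based stage more concrete, the following
+sketch shows one possible, simplified realization of such an event filter.
+It is an illustrative assumption only: the rule format, field names and
+example rules are not prescribed by~\textcite{dustdarSocialComputeUnit2011}.
+
+\begin{verbatim}
+# Illustrative sketch of a rule-based event filter. All names and
+# rules are hypothetical; rules are plain data so that members of the
+# social compute unit can add, update or delete them at runtime.
+from dataclasses import dataclass
+from typing import Callable
+
+@dataclass
+class Event:
+    service: str   # originating service
+    severity: str  # e.g. "info", "warning", "error"
+    message: str   # raw log message
+
+@dataclass
+class Rule:
+    name: str
+    matches: Callable[[Event], bool]  # predicate maintained by the SCU
+
+rules = [
+    Rule("errors", lambda e: e.severity == "error"),
+    Rule("checkout issues", lambda e: "checkout" in e.message.lower()),
+]
+
+def filter_events(events: list[Event]) -> list[tuple[str, Event]]:
+    """Forward only rule-matching events to the human agents."""
+    return [(r.name, e) for e in events for r in rules if r.matches(e)]
+
+log = [Event("shop", "info", "served 1200 customers"),
+       Event("shop", "error", "checkout failed: timeout")]
+for rule_name, event in filter_events(log):
+    print(f"[{rule_name}] {event.service}: {event.message}")
+# -> [errors] shop: checkout failed: timeout
+# -> [checkout issues] shop: checkout failed: timeout
+\end{verbatim}
+
+In this reading, the monitoring agents consume the output of
+\texttt{filter\_events}, while the social compute unit maintains the
+\texttt{rules} list---the two roles sketched in the scenario above.
+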
+\subsection{Social Compute Unit}
+\label{sec:social-compute-unit}
+
+The social compute unit is composed of a team of resources possessing either
+\enquote{skills in the problem domain (event analysis) or the system domain
+(configuring the filtering
+software)}~\cite[p.~66]{dustdarSocialComputeUnit2011}.
+Figure~\ref{fig:social-compute-unit} shows the environment the social compute
+unit is embedded in and the interactions between the parties in the
+organizational unit. An important property of the social compute unit is that
+its members are not dedicated resources but rather act when the need to do so
+arises. For example, if a customer-facing service receives an update which
+introduces breaking changes, a knowledgeable member of the monitoring agents
+steps in to update the affected rules. Members also spend time working on the
+rule-based software to improve processes within the structure so that certain
+tasks are accomplished more efficiently. Members of the social compute unit
+are rewarded based on metrics such as the number of rules configured, time
+spent in hours, or hours saved by the introduction of new rules.
+
+\begin{figure}
+  \centering
+  \includegraphics[width=.7\textwidth]{figures/social-compute-unit.png}
+  \caption{Social Compute Unit (SCU). An IT service provider needs to monitor
+    the operations of multiple services/servers. Monitoring agents act to
+    resolve issues the software deems important. The social compute unit
+    updates, adds or deletes rules. Image
+    credit~\cite{dustdarSocialComputeUnit2011}.}
+  \label{fig:social-compute-unit}
+\end{figure}
+
+Since the social compute unit is not composed of dedicated resources, it only
+comes into effect when a specific problem has to be solved. Someone therefore
+has to request it in order to get the right team for the right job. The team
+is then assembled \emph{on demand} by a person with sufficient expertise in
+the problem domain, such that the unit is constructed while taking the
+members' skills and their ability to work together into account.
+Alternatively, the task of assembling the team does not have to be carried
+out by a human but can also be delegated to specialized (matching) software.
+
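+As a rough illustration of such matching software, the following sketch
+assembles a unit greedily from a pool of candidates until all skills
+required by a request are covered. The data model and the greedy strategy
+are assumptions made for illustration---the original proposal does not
+specify an algorithm, and a real matcher would also weigh the members'
+ability to work together.
+
+\begin{verbatim}
+# Hypothetical sketch of on-demand team assembly by matching software.
+def assemble_unit(required: set[str],
+                  candidates: dict[str, set[str]]) -> list[str]:
+    """Greedily pick candidates until every required skill is covered."""
+    team, uncovered = [], set(required)
+    while uncovered:
+        # Pick the candidate covering the most still-uncovered skills.
+        name = max(candidates,
+                   key=lambda n: len(candidates[n] & uncovered))
+        gained = candidates[name] & uncovered
+        if not gained:
+            raise ValueError(f"no candidate covers: {uncovered}")
+        team.append(name)
+        uncovered -= gained
+    return team
+
+pool = {
+    "alice": {"event analysis", "databases"},
+    "bob":   {"rule configuration"},
+    "carol": {"event analysis", "rule configuration"},
+}
+# A request covering both the problem and the system domain.
+print(assemble_unit({"event analysis", "rule configuration"}, pool))
+# -> ['carol']
+\end{verbatim}
+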
+\textcite{dustdarSocialComputeUnit2011} define a social compute unit's life
+cycle as consisting of six stages. In the first stage, a unit comes into
+effect once a \emph{request} has been issued. In the \emph{create} stage, the
+team is compiled. After that, it goes through an \emph{assimilation} phase
+wherein it becomes acquainted with the task at hand. The
+\emph{virtualization} stage follows, in which it is ensured that the unit can
+communicate effectively and has the necessary resources, such as a test
+environment. In the \emph{deployment} phase, the unit produces results and
+implements them in the production environment. After completion of the
+objective, the team is \emph{dissolved}.
+
+Due to the close integration of the monitoring agents, the social compute
+unit, and the software which initially processes the events generated by the
+services, a high degree of trust is required. Since the software is
+architected as a rule\nobreakdash-based system, it has the properties
+necessary for achieving explainability of its actions (see
+section~\ref{ssec:explainability}). Monitoring agents have the opportunity to
+become (temporarily) part of the social compute unit, which allows them to
+configure the rules of the software themselves. Working with the software and
+contributing a vital part of its function can increase the software's
+trustworthiness. Additionally, stakeholders outside of the system can be
+confident that the degree to which humans are involved is high, which
+arguably also increases the system's trustworthiness. The architecture
+proposed by~\textcite{dustdarSocialComputeUnit2011} is therefore a good model
+for building trust in the system's inner workings and results.
+
 \section{Conclusion}
 \label{sec:conclusion}