\documentclass[11pt,a4paper]{article}

\usepackage{termpaper}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{microtype}
\usepackage{setspace}
\usepackage{amssymb}
\usepackage{amsmath}
\usepackage[english]{babel}
\usepackage{csquotes}
\usepackage[style=ieee,backend=biber]{biblatex}
\usepackage{hyperref}

\setstretch{1.07}

\addbibresource{references.bib}

% opening
\title{Participatory Budgeting: Algorithms and Complexity}

\author{
  \authorname{Tobias Eidelpes} \\
  \studentnumber{01527193} \\
  \curriculum{033 534} \\
  \email{e1527193@student.tuwien.ac.at}
}

% Numbered example environment
\newcounter{example}[section]
\newenvironment{example}[1][]{\refstepcounter{example}\par\medskip
  \noindent \textbf{Example~\theexample. #1} \rmfamily}{\medskip}

\begin{document}

\maketitle

\begin{abstract}
  Participatory budgeting is a deliberative democratic process that allows
  residents to decide how public funds should be spent. By combining a form of
  preference elicitation with an aggregation method, a set of winning projects
  is determined and funded. This paper first gives an introduction to
  participatory budgeting methods and then focuses on approval-based models to
  discuss algorithmic complexity. Furthermore, a short overview of useful axioms
  that can help select a method in practice is presented. Finally, an outlook
  on future challenges surrounding participatory budgeting is given.
\end{abstract}

\section{Introduction}

\emph{Participatory Budgeting} (PB) is a process of democratic deliberation that
allows residents of a municipality to decide how a part of the public budget is
to be spent. It is a way to improve transparency and citizen involvement, which
are two important cornerstones of a democracy. PB was first realized in the
1990s in Porto Alegre in Brazil by the Workers' Party to combat the growing
divide between the rich city center and the poor living in the greater region.
Owing to its success in the south of Brazil, PB quickly spread to North America,
Europe, Asia and Africa. Although the process is heavily adapted by each
municipality to suit the environment in which the residents live, it generally
proceeds through the following stages
\autocite{participatorybudgetingprojectHowPBWorks}:

\begin{description}
  \item [Design the process] A rule book is crafted to ensure that the process
    is democratic.
  \item [Collect ideas] Residents propose and discuss ideas for projects.
  \item [Develop feasible projects] The ideas are developed into projects that
    can be undertaken by the municipality.
  \item [Voting] The projects are voted on by the residents.
  \item [Aggregating votes \& funding] The votes are combined to determine a set
    of winning projects which are then funded.
\end{description}

\noindent The last two stages, \emph{voting} and \emph{aggregating votes}, are of
main interest to computer scientists, economists and social choice theorists
because the outcome depends both on how voters express their preferences
(\emph{balloting} or \emph{input method}) and on how the votes are aggregated
through the use of algorithms. For this paper it is assumed that the first three
stages have already been completed: the rules of the process have been set,
ideas have been collected and developed into feasible projects, and the budget
limit is known. To study different ways of capturing votes and aggregating them,
the participatory process is modeled mathematically. This model will be called a
participatory budgeting \emph{scenario}. The aim of studying participatory
budgeting scenarios is to find ways to achieve a desirable outcome. A desirable
outcome can be one based on fairness, for example by making sure that each voter
has at least one approved project in the final set of winning projects.
Other approaches are concerned with maximizing social welfare or with
discouraging \emph{gaming the voting process}, where an outcome can be
manipulated by not voting truthfully; a rule that resists such manipulation is
called \emph{strategyproof}.

First, this paper will give a brief overview of common methods and show how a
participatory budgeting scenario can be modeled mathematically. To illustrate
these methods, one approach will be chosen and discussed in detail with respect
to algorithmic complexity and properties. Finally, the insights gained into
participatory budgeting algorithms will be summarized and an outlook on further
developments will be given.

\section{Mathematical Model}
\label{sec:mathematical model}

\textcite{talmonFrameworkApprovalBasedBudgeting2019} define a participatory
budgeting scenario as a tuple $E = (P,V,c,B)$, consisting of a set of projects
$P = \{ p_1,\dots,p_m \}$, a cost function $c : P\rightarrow\mathbb{R}$ that
assigns each project $p\in P$ its cost $c(p)$, a set of voters
$V = \{v_1,\dots,v_n\}$ and a budget limit $B$. The voters express preferences
over individual projects or over subsets of all projects. How the preferences of
voters are expressed has to be decided during the design phase of the process
and is a choice that has to be made in accordance with the method that is used
for aggregating the votes. After the voters have expressed their preferences, a
set of projects $A\subseteq P$ is selected as \emph{winning projects} according
to some rule and subject to the total budget limit $B$. For the case where
projects are indivisible, which is also called discrete, the sum of the winning
projects' costs is not allowed to exceed the limit $B$:
\begin{equation}\label{eq:1}
  \sum_{p\in A} c(p)\leq B.
\end{equation}
When projects are divisible, i.e., can be completed to a fractional degree, the
authors define a function $\mu : P\rightarrow [0,1]$ which maps every project to
a value between zero and one, representing the fractional degree to which this
project is completed.
Since the cost of each project is a function of its degree of completion, the
goal is to select a set of projects where the cost of the degree of completion
does not exceed the budget limit:
\begin{equation}\label{eq:2}
  \sum_{p\in A} c(\mu(p))\leq B.
\end{equation}

A common way to design the input method is to ask the voters to approve a subset
of projects $P_v\subseteq P$ where each individual project can be either chosen
to be in $P_v$ or not. This form is called \emph{dichotomous preferences} because
every project is put in one of two categories: \emph{good} or \emph{bad}.
Projects that have not been approved (are not in $P_v$) are assumed to be in the
bad category. This type of preference elicitation is known as approval-based
preference elicitation or balloting. It is possible to design variations of the
described scenario, for example by asking the voters to approve at most $k$
projects ($k$-Approval) \cite{goelKnapsackVotingParticipatory2019a}. These
variations typically do not take into account the cost that is associated with
each project at the voting stage. To alleviate this, approaches where the voters
are asked to approve projects while factoring in the cost have been proposed.
After asking the voters for their preferences, various aggregation methods can
be used. Section~\ref{sec:approval-based budgeting} will go into detail about
the complexity and axiomatic guarantees of these methods.

One approach that factors in the cost and benefit of each project is described
by \textcite{goelKnapsackVotingParticipatory2019a}, who term it \emph{knapsack
voting}. It allows voters to express preferences by factoring in the cost as
well as the benefit per unit of cost. The name stems from the well-known
knapsack problem in which, given a set of items, their associated weights and
values and a weight limit, a selection of items that maximizes the total value
subject to the weight limit has to be chosen.
In the budgeting scenario, the items correspond to projects, the weight limit to
the budget limit and the value of each item to the value that a project provides
to a voter. To have a suitable metric for the value that each voter gets from a
specific project, the authors introduce different \emph{utility models}. These
models make it possible to provide axiomatic guarantees such as strategyproofness
or welfare maximization. While their model assumes fractional voting (that is,
each voter can allocate the budget in any way they see fit), utility functions
are also used by \textcite{talmonFrameworkApprovalBasedBudgeting2019} to measure
the total satisfaction that a winning set of projects provides under an
aggregation rule.

A third possibility for preference elicitation is \emph{ranked orders}. In this
scenario, voters specify a ranking over the available choices (projects) with
the highest ranked choice receiving the largest share of the budget and the
lowest ranked one the smallest share. \textcite{langPortioningUsingOrdinal2019}
study a scenario in which the input method is ranked orders and the projects
that can be chosen are divisible. The problem of allocating the budget to a set
of winning projects under these circumstances is referred to as
\emph{portioning}. Depending on the desired outcome, multiple aggregation
methods can be combined with ranked orders.

% Cite municipalities using approval-based budgeting (Paris?)
Since approval-based methods are comparatively easy to implement and are being
used in practice by multiple municipalities, the next section will discuss
aggregation methods, their complexity as well as useful axioms for comparing the
different aggregation rules.

\section{Approval-based budgeting}
\label{sec:approval-based budgeting}

Although approval-based budgeting is also suitable for the case where the
projects are divisible, municipalities using this method generally assume
indivisible projects.
Moreover, as is the case with participatory budgeting in general, we want to
select not just one project as a winner but several. This is called a
multi-winner election and is in contrast to single-winner elections. Once the
votes have been cast by the voters, again assuming dichotomous preferences, a
simple aggregation rule is greedy selection. In this case the goal is to
iteratively select the project $p\in P$ that gives the maximum satisfaction
summed over all voters. Satisfaction can be viewed as a form of social welfare
where it is not only desirable to stay below the budget limit $B$ but also to
achieve a high score on some metric that quantifies the value that each voter
gets from the result.
\textcite{talmonFrameworkApprovalBasedBudgeting2019} propose three satisfaction
functions which provide this metric. Formally, they define a satisfaction
function as a function $sat : 2^P\times 2^P\rightarrow \mathbb{R}$, where $P$ is
a set of projects. A voter $v$ selects projects to be in her approval set $P_v$
and a bundle $A\subseteq P$ contains the projects that have been selected as
winners. The satisfaction that voter $v$ gets from a selected bundle $A$ is
denoted as $sat(P_v,A)$. The set $A_v = P_v\cap A$ denotes the set of projects
approved by $v$ that end up in the winning bundle $A$. A simple approach is to
count the number of projects that have been approved by a voter and which ended
up being in the winning set:
\begin{equation}\label{eq:3}
  sat_\#(P_v,A) = |A_v|
\end{equation}

Combined with the greedy rule for selecting projects, projects are iteratively
added to the winning bundle $A$, where at every iteration the project that gives
the maximum satisfaction to all voters and still fits within the budget limit
$B$ is selected. It is assumed that the voters' individual satisfaction can be
added together to provide the satisfaction that one project gives to all the
voters. This gives the rule $\mathcal{R}_{sat_\#}^g$ which seeks to maximize
$\sum_{v\in V}sat_\#(P_v,A\cup \{p\})$ at every iteration.

Another satisfaction function assumes a relationship between the cost of the
items and a voter's satisfaction. Namely, a project that has a high cost, is
approved by a voter $v$ and ends up in the winning bundle $A$ provides more
satisfaction than a lower cost project. Equation~\ref{eq:4} defines this
satisfaction function.
\begin{equation}\label{eq:4}
  sat_\$(P_v,A) = \sum_{p\in A_v} c(p) = c(A_v)
\end{equation}

The third satisfaction function assumes that voters are content as long as at
least one of the projects they have approved is selected to be in the winning
set. Therefore, a voter achieves satisfaction 1 when at least one approved
project ends up in the winning bundle, i.e., if $|A_v| > 0$, and satisfaction 0
otherwise (see equation~\ref{eq:5}).
\begin{equation}\label{eq:5}
  sat_{0/1}(P_v,A) =
  \begin{cases}
    1 & \text{if } |A_v|>0 \\
    0 & \text{otherwise}
  \end{cases}
\end{equation}

The satisfaction functions from equations~\ref{eq:4} and \ref{eq:5} can also be
combined with the greedy rule, potentially giving slightly different outcomes
than $\mathcal{R}_{sat_\#}^g$. An example demonstrating the greedy rule is given
in example~\ref{ex:greedy}.

\begin{example}\label{ex:greedy}
  Consider a set of projects $P = \{ p_2,p_3,p_4,p_5,p_6 \}$ where project $p_i$
  costs $i$, and a budget limit $B = 10$. Furthermore, five voters submit the
  approval sets $v_1 = \{ p_2,p_5,p_6 \}$, $v_2 = \{ p_2, p_3,p_4,p_5 \}$,
  $v_3 = \{ p_3,p_4,p_5 \}$, $v_4 = \{ p_4,p_5 \}$ and $v_5 = \{ p_6 \}$. Under
  $\mathcal{R}_{sat_\#}^g$ the winning bundle is $\{ p_4,p_5 \}$,
  $\mathcal{R}_{sat_\$}^g$ gives $\{ p_4,p_5 \}$ and
  $\mathcal{R}_{sat_{0/1}}^g$ gives $\{ p_2,p_3,p_5 \}$, assuming that ties are
  broken in favor of the project with the lower index.
\end{example}

Computing a solution to the problem of finding a winning set of projects by
using greedy rules can be done in polynomial time because each of the at most
$|P|$ iterations only has to evaluate the remaining projects once. The downside
to using a greedy selection process is that the provided solution might not be
optimal with respect to the satisfaction.
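
As a worked illustration (our own calculation, not taken from the cited paper),
the run of $\mathcal{R}_{sat_\#}^g$ on the instance of example~\ref{ex:greedy}
can be traced by hand, assuming that projects which no longer fit into the
remaining budget are skipped. Under $sat_\#$ the gain of adding a project is its
number of approvals:
\[
  p_2 : 2, \qquad p_3 : 2, \qquad p_4 : 3, \qquad p_5 : 4, \qquad p_6 : 2.
\]
The rule first selects $p_5$ (gain $4$, remaining budget $10-5=5$) and then
$p_4$ (gain $3$, remaining budget $5-4=1$). No further project fits, so
$A = \{ p_4,p_5 \}$ with a total satisfaction of $4+3=7$. The bundle
$\{ p_2,p_3,p_5 \}$ also respects the budget, since $2+3+5=10$, but reaches
$2+2+4=8$, which shows that the greedy outcome is not optimal on this instance.
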
To be able to compute optimal solutions,
\textcite{talmonFrameworkApprovalBasedBudgeting2019} suggest combining the
satisfaction functions with a maximization rule. The maximization rule always
selects a winning set of projects that maximizes the sum of the voters'
satisfaction among all sets that respect the budget limit, where
$c(A)=\sum_{p\in A}c(p)$:
\begin{equation}\label{eq:6}
  \max_{\substack{A\subseteq P \\ c(A)\leq B}}\;\sum_{v\in V}sat(P_v,A)
\end{equation}

The max rule can then be used with the three satisfaction functions in the same
way, giving $\mathcal{R}_{sat_\#}^m$, $\mathcal{R}_{sat_\$}^m$ and
$\mathcal{R}_{sat_{0/1}}^m$. Example~\ref{ex:max} shows that the selection of
winning projects is not as intuitive as when using the greedy rule. Whereas a
greedy solution can still be computed by hand, the max rule requires comparing
the feasible sets of projects in order to select the bundle with the maximum
satisfaction. This hints that the max rule is computationally harder than the
greedy rule. The authors confirm this by identifying $\mathcal{R}_{sat_\$}^m$ as
weakly \textsf{NP}-hard for the problem of finding a winning set that gives at
least a specified amount of satisfaction. The proof follows from a reduction
from the subset sum problem, which asks: given a set of numbers (in this case
the costs associated with the projects) and a number $B$ (the budget limit),
does any subset of the numbers sum to exactly $B$? Because the subset sum
problem is solvable by a dynamic programming algorithm in $O(B\cdot |P|)$ where
$P$ is the set of projects, $\mathcal{R}_{sat_\$}^m$ is solvable in
pseudo-polynomial time.

Finding a solution using the rule $\mathcal{R}_{sat_\#}^m$, however, is doable
in polynomial time due to the problem's relation to the knapsack problem.
Knapsack admits a dynamic programming algorithm whose running time is polynomial
in the number of items and the magnitude of the item values, that is, polynomial
when the values are represented in unary. Under $sat_\#$ the value of a project
is its number of approvals, which is at most $|V|$, so the running time is
bounded by a polynomial in the length of the input.
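
The $O(B\cdot |P|)$ bound can be illustrated with the textbook knapsack
recurrence. The following is a sketch under the assumption that all costs and
the budget limit are positive integers; the symbols $T$ and $w$ are introduced
here for illustration only and do not appear in the cited paper. Let $w(p_i)$
be the total satisfaction contributed by project $p_i$, i.e., its number of
approvals under $sat_\#$ or its number of approvals multiplied by $c(p_i)$ under
$sat_\$$, and let $T(i,b)$ be the maximum total satisfaction achievable with
projects from $\{p_1,\dots,p_i\}$ at a total cost of at most $b$. Then
$T(0,b)=0$ for all $0\leq b\leq B$ and
\[
  T(i,b) =
  \begin{cases}
    T(i-1,b) & \text{if } c(p_i) > b \\
    \max\bigl\{\, T(i-1,b),\; T(i-1,b-c(p_i)) + w(p_i) \,\bigr\} & \text{otherwise.}
  \end{cases}
\]
The optimal value is $T(m,B)$. The table has $(m+1)(B+1)$ entries and each entry
is computed in constant time, which gives the stated running time. It is
pseudo-polynomial because $B$ can be exponential in the number of bits needed to
write it down.
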

For $\mathcal{R}_{sat_{0/1}}^m$, finding a set of projects that gives at least a
certain amount of satisfaction is \textsf{NP}-hard. Assuming that all projects
have unit cost, the rule is equivalent to the max cover problem: we are
searching for a subset of projects whose size (which equals its total cost under
unit costs) is smaller than or equal to the budget limit $B$ and which maximizes
the number of voters that are represented by the subset.

\begin{example}\label{ex:max}
  Taking the initial setup from example~\ref{ex:greedy}, we have
  $P = \{ p_2,p_3,p_4,p_5,p_6 \}$ where project $p_i$ costs $i$, a budget limit
  $B = 10$ and the five voters $v_1 = \{ p_2,p_5,p_6 \}$,
  $v_2 = \{ p_2, p_3,p_4,p_5 \}$, $v_3 = \{ p_3,p_4,p_5 \}$,
  $v_4 = \{ p_4,p_5 \}$ and $v_5 = \{ p_6 \}$. We get $\{ p_2,p_3,p_5 \}$ for
  $\mathcal{R}_{sat_\#}^m$, $\{ p_4,p_5 \}$ for $\mathcal{R}_{sat_\$}^m$ and
  $\{ p_4,p_6 \}$ for $\mathcal{R}_{sat_{0/1}}^m$. The last rule is especially
  interesting because it provides the highest amount of satisfaction possible by
  covering each voter with at least one project. Project $p_6$ covers voters
  $v_1$ and $v_5$ and project $p_4$ covers voters $v_2$, $v_3$ and $v_4$.
\end{example}

The third aggregation rule, which places a heavy emphasis on cost versus
benefit, is similar to the greedy rule, but instead of disregarding the
satisfaction per cost that a project provides, it selects at every iteration the
project $p\in P$ that maximizes the gain in total satisfaction divided by the
project's cost:
\begin{equation}
  \frac{\sum_{v\in V}sat(P_v,A\cup\{p\}) - \sum_{v\in V}sat(P_v,A)}{c(p)}
\end{equation}
\textcite{talmonFrameworkApprovalBasedBudgeting2019} call this type of
aggregation rule the \emph{proportional greedy rule}.
Example~\ref{ex:prop greedy} shows what the outcome of a budgeting scenario
might look like compared to using a simple greedy rule or a max rule.
Since the proportional greedy rule is a variation of the simple greedy rule, it
is also computable in polynomial time. Computing the satisfaction per unit of
cost does not change the complexity since it only adds a division, which can be
done in constant time.

\begin{example}\label{ex:prop greedy}
  We again have the same set of projects $P = \{ p_2,p_3,p_4,p_5,p_6 \}$, the
  same budget limit of $B = 10$ and the five voters $v_1 = \{ p_2,p_5,p_6 \}$,
  $v_2 = \{ p_2, p_3,p_4,p_5 \}$, $v_3 = \{ p_3,p_4,p_5 \}$,
  $v_4 = \{ p_4,p_5 \}$ and $v_5 = \{ p_6 \}$. If we combine the satisfaction
  function $sat_\#$ from equation~\ref{eq:3} with the proportional greedy rule,
  the outcome differs from that of the simple greedy rule. The ratios of
  approvals to cost are $2/2$, $2/3$, $3/4$, $4/5$ and $2/6$ for
  $p_2,\dots,p_6$. While the simple greedy rule selects first $p_5$ and then
  $p_4$, the proportional greedy rule first selects $p_2$ and then $p_5$. At
  that point $p_4$ no longer fits into the remaining budget of $3$, so $p_3$ is
  added last, giving $\{ p_2,p_3,p_5 \}$. The rule $\mathcal{R}_{sat_\$}^p$
  yields the same result as $\mathcal{R}_{sat_\$}^g$ and
  $\mathcal{R}_{sat_\$}^m$ of $\{ p_4,p_5 \}$. $\mathcal{R}_{sat_{0/1}}^p$,
  however, gives $\{ p_2,p_3,p_4 \}$.
\end{example}

A benefit of the three discussed satisfaction functions is that the resulting
maximization problems can be viewed as constraint satisfaction problems (CSPs)
and can thus be formulated using integer linear programming (ILP). Although
integer programming is \textsf{NP}-hard in general, efficient solvers are
readily available for these types of problems.

\textcite{talmonFrameworkApprovalBasedBudgeting2019} show that the rule
$\mathcal{R}_{sat_{0/1}}^m$ is similar to the max cover problem, which can be
approximated with a $(1-\frac{1}{e})$-approximation algorithm. In fact,
\textcite{khullerBudgetedMaximumCoverage1999} show that an approximation
algorithm with the same ratio exists not only for the case where the projects
have unit cost but also for the general cost version.
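
To make the ILP view concrete, one possible formulation for
$\mathcal{R}_{sat_{0/1}}^m$ is sketched here. It is the standard program for
budgeted maximum coverage and is not quoted from the cited papers; the binary
variables $x_p$ (project $p$ is funded) and $y_v$ (voter $v$ is covered) are
names chosen for illustration only.
\begin{align*}
  \text{maximize}\quad   & \sum_{v\in V} y_v \\
  \text{subject to}\quad & \sum_{p\in P} c(p)\, x_p \leq B, \\
                         & y_v \leq \sum_{p\in P_v} x_p && \text{for all } v\in V, \\
                         & x_p \in \{0,1\},\; y_v \in \{0,1\} && \text{for all } p\in P,\ v\in V.
\end{align*}
The first constraint is the budget limit from equation~\ref{eq:1}, and the second
one allows $y_v = 1$ only if at least one project approved by $v$ is funded, so
the objective equals $\sum_{v\in V} sat_{0/1}(P_v,A)$ for
$A = \{ p \in P : x_p = 1 \}$. Solving this program yields an exact optimum,
whereas the approximation algorithms mentioned above give up exactness in
exchange for a polynomial running time.
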

Instead of sacrificing exactness to get a better running time,
\textcite{talmonFrameworkApprovalBasedBudgeting2019} show that the
$\mathcal{R}_{sat_{0/1}}^m$ rule is fixed-parameter tractable with respect to
the number of voters $|V|$. A problem is fixed-parameter tractable with respect
to a parameter $k$ if there exists an algorithm that decides each instance of
size $n$ in $O(f(k)\cdot p(n))$ time, where $p$ is a polynomial and $f$ is an
arbitrary computable function that depends only on $k$. It is crucial to note
that this excludes running times of the form $n^k$. The algorithm for the
maximum rule tries to guess the number of voters that are represented by the
same project. The estimation is then used to pick a project which has the lowest
cost and satisfies exactly the estimated number of voters.

\printbibliography

\end{document}