MSc Thesis TU Delft - AI governance in the city of Amsterdam

By Daniël Brom

AI governance in the city of Amsterdam: Scrutinising Vulnerabilities of Public Sector AI Systems

A thesis submitted to the Delft University of Technology in partial fulfillment of the requirements for the degree of Master of Science in Engineering and Policy Analysis by Daniël Brom.

Executive Summary

Scandals involving governmental automated decision-making (ADM) tools have recently sparked political and societal debate about the harms such systems can cause to citizens. In the Toeslagenaffaire, the Dutch Tax and Customs Administration used an algorithmic risk classification system to detect fraud in childcare benefit applications (Parlementaire Ondervragingscommissie Kinderopvangtoeslag, 2020). Until 2018, the nationality of the parents was one of the indicators used to check for fraud. The Dutch Data Protection Authority (DPA) concluded in its 2020 investigation report that using nationality as a fraud indicator was unnecessary, discriminatory, and unlawful with respect to the EU General Data Protection Regulation (Autoriteit Persoonsgegevens, 2020). In Amsterdam, law firm SOLV has raised questions about the lawfulness and effectiveness of a municipal ADM system for detecting Airbnb fraud (van Dorp, 2020).

This thesis focuses on ADM systems which contain an AI component. In AI, algorithms are the means to create a system of computational code with human-related competencies like perception, understanding, and action (Mnih et al., 2015; Wirtz, Weyerer, & Geyer, 2019). However, the AI in this thesis, as in most applications, is so-called Narrow AI (NAI). Narrow refers to the goal-focused and specialised character of the solutions: such AI technologies show no general intelligence, but intelligence in a specific area (Pennachin & Goertzel, 2007). AI is explored in the context of process governance as described by Pierre & Peters (2020): governance as steering and coordinating during the process of AI development. Governments seeking to design governance which safeguards citizens from potential harms encounter challenges inherent to AI, for example a lack of understanding among policymakers due to technological complexity (Wirtz et al., 2019). It is also well known that designing governance strategies for public sector AI systems (AISs) leads to value trade-offs, for example when decision-makers have to balance privacy protection against the accuracy of the system (Dobbe & Raji, 2019).

Despite the societal urgency of the subject and the large number of scientific articles on public sector AI governance, some knowledge gaps still have to be addressed. Zuiderwijk et al. (2021, p.16) call for multidisciplinary studies which develop and test theories about AI governance specifically for the public sector. The preliminary literature review also reveals a limited understanding of how potential harms to citizens due to AI result in governance requirements for governments, as the link between vulnerabilities of AISs and governance requirements is rarely made. Lastly, public sector AI governance scholarship needs to keep up with the practical development of AI in governments worldwide. Currently there is a lack of empirical research on governmental AISs (Zuiderwijk et al., 2021). Based on these knowledge gaps, the main research question for this study is:

In public sector AI systems, what are emerging vulnerabilities for citizens and how do these translate into governance requirements for decision-makers?

The approach to answering this question is an adjusted form of Theory Building from Case Study as proposed by Eisenhardt (1989). The adjustment is to create theory upfront, to be tested during the case study, where it is common not to do so. The theoretical model, based on an integrative literature review, is validated and improved with empirical insights from semi-structured interviews with case study actors from three cases in the City of Amsterdam (CoA): Reporting issues in public space, Illegal holiday rental housing risk, and Automated parking control. After the theoretical model has been assessed, case study insights also serve to find governance requirements from CoA practice which can be linked to the model. But first, the integrative literature review helps to better understand the implications of AI use in the public sector.

Three main types of reasons to use AI for governmental operations are: improving the efficiency of decision-making, improving the effectiveness of decision-making, and the inherent advantages it brings to citizens. But AISs bring disadvantages too, for example in the form of biased decision-making or potential corruption of automated systems. The introductory literature review also revealed several types of mitigation measures for the disadvantages of AISs: technological measures, data measures, monitoring and evaluation measures, legal measures, organisational measures, and user engagement and citizen agency. The overview of disadvantages points to directions for the theoretical framework of AIS vulnerabilities.

A layered ’onion model’ presents four relevant contexts to consider for AIS vulnerabilities specifically in the public sector: AI model, Model deployment, Political-administrative, and Societal. If a government decides to buy or develop an AIS, the first source of potential vulnerabilities is the model itself. A department then deploys the model for its daily operations, where other problems may occur. Next, there are the overarching political-administrative policy objectives, which bear on everything the specific government does. Lastly, the societal context represents the actors who are ultimately affected by the AIS, namely citizens, where further vulnerabilities appear. Together, the contextual layers form a whole which covers the scope of every governmental AIS. Vulnerabilities range from "Overfitting and underfitting" in the AI model context to "Negative impact on workforce" in the model deployment context to "Reward hacking" in the societal context. These examples illustrate the different nature of vulnerabilities: e.g. technological, socioeconomic and behavioural. Vulnerabilities emerge in different stages of the model development, but this research distinguishes two main phases: before and after implementation.

The expert interviews for the three case studies yield a total of 52 case-specific vulnerabilities of the AISs. Six of them (about 12%) did not fall within any of the vulnerabilities in the model contexts. This share is considered acceptably low, and the model is therefore considered valid for further use. The vulnerabilities that could not be interpreted using the model point towards three additions: a model being adopted without enough capacity to control it, inadequacies in the requisite security of the AI model, and unwanted strategic behaviour with the AI model. From the vulnerabilities that were found, several lessons are learned.
One: the AIS of Reporting issues in public space creates new forms of bias in favour of certain subgroups of citizens who are, for example, more outspoken or have higher trust in government. The involved actors did not acknowledge that it then becomes a political choice to what extent the AIS should influence the municipal strategy regarding public space management.
Two: as seen in the Automated parking control case, shifting discretionary power becomes a striking vulnerability when an important share of the AIS is outsourced to an external party. Questions were raised here, for example, about whether the current KPIs in the contract between Egis Parking Services and the CoA lead to fair routing of automated scanning cars for Amsterdam’s citizens.
Three: as seen in the Illegal holiday rental housing risk case, for fraud detection systems based on large amounts of data from different sources, it is questionable whether governments can find legal foundations or domain-knowledge arguments for all the relationships the AI model creates to come up with risk profiles.
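The validity check described above comes down to a simple share calculation. The sketch below reproduces it from the two counts reported in this summary (52 vulnerabilities found, 6 not covered by the model). The function and variable names are illustrative assumptions, and no numeric acceptance threshold is implied, since the summary states none.

```python
# Counts taken from the summary; everything else here is an illustrative assumption.
TOTAL_VULNERABILITIES = 52  # case-specific vulnerabilities found in the interviews
UNMATCHED = 6               # vulnerabilities that did not fit the model contexts


def coverage_share(total: int, unmatched: int) -> float:
    """Return the fraction of observed vulnerabilities that the model covers."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= unmatched <= total:
        raise ValueError("unmatched must lie between 0 and total")
    return (total - unmatched) / total


matched = TOTAL_VULNERABILITIES - UNMATCHED
share = coverage_share(TOTAL_VULNERABILITIES, UNMATCHED)
print(f"{matched} of {TOTAL_VULNERABILITIES} vulnerabilities fit the model ({share:.1%})")
```

Whether the remaining share is acceptable is a judgement for the researcher, not something the calculation decides.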

The case study results demonstrate that dealing with vulnerabilities in one of the four model contexts often complicates dealing with vulnerabilities in the other contexts. Hence, four governance requirement dilemmas found in CoA practice that relate to the vulnerabilities model are:

  • I-a Increase the impact of the AI model on the decision-making process to ensure its added value and create balance with the downsides of the AI system. vs
  • I-b Decrease the impact of the AI model on the decision-making process until the effects of model errors are acceptable or errors are still possible to mitigate.
  • II-a Leave model developers with enough time and professional freedom to create high-end technological products. vs
  • II-b Ensure that developers create models which are explainable and functional for governmental employees to be used in their daily practice.
  • III-a Be transparent about the AI systems you use. Communicate actively and provide opportunities for citizens’ idea contribution and participation. vs
  • III-b Foster objective representations of reality by your AI systems and prevent new forms of bias caused by citizen participation.
  • IV-a Stimulate innovation with regard to AI development within your organisation and do not restrict every innovative project upfront. vs
  • IV-b Ensure that AI development projects have a clear and proportional contribution to an agreed policy objective.

Reflecting on all results, the current challenge for governments is not the existence of AIS vulnerabilities, but rather the absence of a thorough process to settle considerations about them. If governments use reports like this one to understand the context-specific and interdependent vulnerabilities, this forms a first step towards such a process and creates room for politically responsible decision-makers to make the relevant trade-offs. The next step is to document such considerations and create overviews of best practices, so that development teams and their managers, within the CoA and in other governmental bodies, can learn from each other. Several policies would contribute to establishing this maturity in AI governance:

  1. Goals for the AIS may develop over time, but having at least a shared belief in what the system ought to do, and continuing to discuss this over the lifecycle of the system, helps to identify the relevant considerations about vulnerabilities as just described. Besides having an agreed purpose for the AIS, the distinction between efficiency goals, effectiveness goals, citizen well-being goals and innovation goals should always be made clear as well.
  2. Resolving challenges in one context often seems to lead to the emergence of vulnerabilities in others. Understanding such tensions and trade-offs should be the primary focus when assessing the risks of using ADM within governments. The vulnerabilities model can help here: multidisciplinary teams with actors from all contexts can use the model as a guide for thinking about the vulnerabilities they encounter.
  3. Although transparency alone is not enough to deal with AIS vulnerabilities, it is essential, and it is thus highly recommended to provide societal actors with more opportunities to get involved with governmental use of AI. Media, interpreted as societal actors, are then enabled to take up their role as well.
  4. Decision-makers must analyse the bias that arises from using the AI rather than only the bias of the AI itself. Bias of the AI itself is often not so much the problem, as in the cases Reporting issues in public space and Automated parking control, or it is the exact reason for the algorithmic system to be used in the first place, as in the Illegal holiday rental housing risk case. Only focusing on mitigation of the algorithmic bias itself would therefore miss the point.