Building trust for beneficial AI: Trustworthy systems
18 May 2018 02:00h
Event report
Dr Jess Whittlestone (Postdoctoral Research Associate, Leverhulme Centre for the Future of Intelligence (CFI), Cambridge) spoke about bridging the policy-technical gap for trustworthy AI. She stressed the importance of policy in shaping the way technology is used and the environment in which it is used.
She argued that AI policy-making is different from policy-making in other areas of science and technology, because it needs to be much more focused on anticipating challenges. The related pitfall is twofold: policy should not be too reactive, but at the same time it should not fall victim to the hype.
Whittlestone suggested that establishing policy requires broad and general thinking that recognises the complexities of the societies and environments in which technology is used. Achieving this requires input from a wide range of stakeholders. While technical experts cannot answer these questions alone, it is also clear that few senior policy makers have the necessary technical expertise. These two communities need to improve their communication and tackle the challenges arising from the very different languages they speak. In this regard, we also need to ask what level of technical understanding policy makers need in order to ask and answer the right questions.
Whittlestone suggested a number of ways to bridge the policy-technical gap: digital and technical training for policy makers, digital coaches for members of parliament (MPs), data ethics frameworks within governments, and scientific advisors in government.
She also stressed that terms such as trust, fairness, privacy, and transparency mean different things to different groups of people and are discussed in a variety of ways in relation to technical solutions. It will be important to connect the various communities to bridge the gaps in mutual understanding.
The next speaker, Dr Rumman Chowdhury (Senior Principal of AI, Accenture), spoke about ‘Trustworthy data: creating and curating a repository for diverse datasets’. She highlighted that in many cases, biases already enter at the stage of data collection. For example, natural language AI trained on broad input from the Internet often turns out sexist. Similarly, because of a lack of diversity in the data sets used to train facial recognition AI, these systems often work best for white and male persons while struggling with the rest of the population.
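One way such data-driven bias shows up in practice is as a gap in error rates between demographic groups. The sketch below is illustrative only and not from the talk: it assumes a hypothetical predictions.csv file with group, y_true, and y_pred columns, and simply compares a model's accuracy per group.

```python
# Minimal sketch (illustrative, not from the talk): check how a trained
# classifier's accuracy differs across demographic subgroups.
# The file name and column names (group, y_true, y_pred) are hypothetical.
import pandas as pd

df = pd.read_csv("predictions.csv")

# Accuracy per demographic group: large gaps suggest the training data
# under-represented some groups.
per_group = (df["y_true"] == df["y_pred"]).groupby(df["group"]).mean()
print(per_group.sort_values())

# Gap between the best- and worst-served groups.
print("accuracy gap:", per_group.max() - per_group.min())
```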
As one solution, Chowdhury and her collaborator suggested building a repository for open data. Data scientists need to rely on ‘what is out there’, and the challenge with open data and ‘available data’ approaches is convincing people to make part of their data open. For the repository to work, trust building and ethical principles need to be built into the process from the very beginning. Consent is of course an important aspect; however, she argued that with the rapid developments in AI, complications arise when people are asked to consent to their data being used for purposes yet unknown.
Chowdhury and her collaborator argued that there is no easy answer to the question of what trustworthy data is. However, they noted that the AI hype sometimes leads researchers and developers to disregard the basic principles of data collection. They also stressed that data collection is affected by policies: changes in policy can change the data available and introduce further biases into the algorithm, which then needs several further development iterations before it yields useful outcomes.
Chowdhury also stressed that bias can come from sources other than the data, including the data scientists themselves. This covers collection biases, measurement biases, and contextual societal biases. In the Q&A part of the session, Chowdhury and her collaborator stressed that the focus is not on creating unbiased data, which is impossible given how contextual bias is.
Dr Krishna Gummadi (Head of the Networked Systems Research Group, Max Planck Institute for Software Systems) focused on the question of assessing and creating fairness in algorithmic decision-making. He used the example of algorithms used in the US justice system (such as COMPAS) to assess the likelihood of relapse into criminal behaviour. These algorithmic predictions then play a role in decisions about granting bail.
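To make the notion of assessing fairness concrete, one check frequently discussed around COMPAS-style risk tools is whether the rate of false alarms differs between demographic groups. The following is a minimal sketch under that assumption; the arrays are purely illustrative and are not data from the talk.

```python
# Minimal sketch of one common fairness check for risk-prediction tools:
# comparing false positive rates across groups. All data below is made up.
import numpy as np

def false_positive_rate(y_true, y_pred):
    """Share of people who did not reoffend but were predicted high-risk."""
    negatives = (y_true == 0)
    return np.mean(y_pred[negatives] == 1) if negatives.any() else float("nan")

# Hypothetical outcomes (1 = reoffended) and predictions (1 = high risk)
# for two demographic groups.
y_true = {"group_a": np.array([0, 0, 1, 0, 1]), "group_b": np.array([0, 1, 0, 0, 0])}
y_pred = {"group_a": np.array([1, 0, 1, 1, 1]), "group_b": np.array([0, 1, 0, 0, 1])}

for group in y_true:
    print(group, "false positive rate:", false_positive_rate(y_true[group], y_pred[group]))
```

A large gap between the two printed rates would mean the tool raises false alarms more often for one group than the other, which is one of the disparities at the centre of the COMPAS debate.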
Gummadi and his collaborators were interested in perceptions of fairness in relation to these algorithms and conducted surveys with affected people as well as the general population. In broad terms, perceptions of what is fair were similar among respondents. However, differences emerged with regard to the relevance and reliability of some of the questions. For example, there was no agreement among those surveyed on whether the criminal history of a defendant’s parents or the defendant’s behaviour in their youth should play a role in the assessment. The survey also showed that the causal mechanisms between these factors and the likelihood of relapse were assessed in diverse ways. One interesting finding of Gummadi and his collaborators is that differences in political position (liberal vs. conservative) lead to differences in the extent to which behaviour is viewed as volitional or as a feature of the environment and social group membership.
One conclusion is that it seems difficult to find agreement among survey respondents on the causal mechanisms that underlie algorithmic decision-making in this example. This raises the question of the extent to which we can actually settle societal disagreements in moral reasoning in order to build algorithmic decision-making tools.