#AI #deeplearning is biased because of #datasets it is fed: a moral and legal issue

rajeev2007

Mar 03, 2017

my swarajya magazine piece from the mar 2017 issue. also online at

https://swarajyamag.com/magazine/fatal-flaw-in-ai-the-robots-will-probably-be-as-biased-as-their-masters

here is the original copy i sent them. they dropped the diagram and links, added a typo and dropped a crucial phrase in the web version :-)

Innovation nation: How do we deal with biased artificial intelligence?

Rajeev Srinivasan

At this point, it is conceivable that machine intelligences, at once omniscient and tireless, will in a short while take on human capabilities. The stories of the chess-playing computer Deep Blue, the quiz-show champ Watson, and the Go champion AlphaGo from DeepMind have enthralled us and shown us a vision of a future run by impartial and untiring machines. In fact, I wonder whether we should replace some judges, doctors and lawyers with machines, as they will be up to date, will not have bad hair days, and will not get burned out by work. But it turns out there is a potentially fatal flaw: these golems may be as biased as their masters.

This has to do with the nature of the new technique that has turned the latest machine intelligences into marvels of modern ingenuity: deep learning. Artificial intelligence as a discipline has been languishing for a couple of decades. In the late 1980s, there was much excitement about it, and I remember writing an optimistic paper about it in a journal, but then it failed to meet expectations, especially in relation to natural language processing. So AI became sort of a bad joke: always twenty years away in the future. So much so that people began to say ‘expert systems’ and ‘neural networks’ instead of ‘artificial intelligence’. See a report on AI in the June 2016 issue of The Economist http://www.economist.com/news/special-report/21700756-artificial-intelligence-boom-based-old-idea-modern-twist-not

In the last few years, there has been a renaissance in AI, and it started with novel applications of neural networks. These are modeled, obviously, on biological mechanisms, and in particular the human brain and its collection of interconnected neural pathways. Neurons that connect to each other are given ‘weights’ and an ‘activation function’ that control how they link up with the next neuron by firing an output signal. This is analogous to the ‘rules’ set up in other types of expert systems.
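For readers who like to see the mechanics, here is a minimal sketch of a single artificial neuron of the kind described above. The function names, the choice of a sigmoid as the activation function, and all the numbers are my own assumptions for illustration, not taken from any particular system.

```python
import math

def sigmoid(x):
    """Activation function: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)

# Hypothetical inputs and weights, chosen only to show the arithmetic.
signal = neuron_output(inputs=[1.0, 0.0], weights=[2.0, -1.0], bias=-1.0)
print(round(signal, 3))  # 0.731
```

The output signal is what gets passed on to the next neuron; changing the weights changes how strongly each input counts.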

Computer-simulated neural networks are built using multiple layers, each of which has a specific function. These layers are needed so that an input triggers multiple firings of the neurons, and finally results in the expected outcome based on the input. It turns out that simple problems may need a few layers, but as problems grow more complex, the problem of adjusting the weights by hand becomes intractable. These weights need to be just so to make sure that when an event happens, it triggers the activation function appropriately. These adjustments are termed ‘learning’.

Source: Economist.com

Recently, computer scientists figured out that the ‘learning’ could be automated: that is, the network can calculate by itself the weights needed to be imposed on individual neurons in the many layers of the network. This is done by presenting the network with a large data set, and by observing the data, it is able to figure out the right weights to impose. It is again analogous to a rule-based system subtly correcting the rules through experience, much as a human baby learns that “fire is hot”, but “if I sit a safe distance from a fire, it is pleasant”.
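To make the idea of automated ‘learning’ concrete, here is a toy sketch in which a single neuron works out its own weights from examples. It uses the classic perceptron rule, which is far simpler than the methods deep networks actually use, and the tiny data set (the logical AND of two inputs) and all names are assumptions made up for illustration.

```python
def step(x):
    """Fire (1) only if the weighted sum is positive."""
    return 1 if x > 0 else 0

def predict(inputs, weights, bias):
    return step(sum(i * w for i, w in zip(inputs, weights)) + bias)

def train(examples, epochs=20, rate=1):
    """Perceptron rule: nudge the weights whenever a guess is wrong."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - predict(inputs, weights, bias)
            weights = [w + rate * error * i for w, i in zip(weights, inputs)]
            bias += rate * error
    return weights, bias

# Hypothetical training data: output is 1 only when both inputs are 1.
examples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train(examples)
predictions = [predict(inputs, weights, bias) for inputs, _ in examples]
print(predictions)  # [0, 0, 0, 1]
```

Nobody tells the neuron what the weights should be; it arrives at them purely from the examples it is shown, which is exactly why the choice of examples matters so much.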

The giant data sets are available on the Internet through the massive troves maintained by companies such as Google and Facebook. Once the networks were programmed to ‘watch’ these data sets, they began to quickly ‘learn’ about the limited domains they were exposed to. Given their ability to crunch vast amounts of data, these ‘deep learning’ algorithms, which control multi-layered neural networks, in a sense became ‘self-aware’, and could quickly gain mastery in their domains: for example, such a system, when shown a lot of cat videos, eventually figures out what a cat is and what it will do.

If this sounds more than a little creepy, and reminds you of the self-aware ‘Skynet’ in the Terminator series, you are not alone. However, we are still far away from a Skynet takeover, because the deep learning systems have deep knowledge only in narrow specialized areas, and not a generalized intelligence that can take over civilization as we know it. At least not yet, although the New Scientist reported in October https://www.newscientist.com/article/2110522-googles-neural-networks-invent-their-own-encryption/?utm_medium=Social&utm_campaign=Echobox&utm_source=Twitter&utm_term=Autofeed&cmpid=SOC%257CNSNS%257C2016-Echobox that neural networks built by Google Brain invented an encryption method all by themselves. One of these days they may refuse to tell us the key. A scary thought, indeed.

Fortunately, at the moment, the worst example of the culture clash between narrow and general intelligence is the chatbot that Microsoft unleashed on Twitter. It learned rapidly from what it saw, and began to spew racist and sexist tweets, at which point Microsoft hastily pulled it offline.

That is funny, but therein lies a problem. The very fact that the data sets have to be chosen and fed to the network seems like a mundane and uninteresting point. But it is not. Who chooses the data sets for the neural networks? Typically some young, white, male, geeky Silicon Valley engineer. His (yes, his) world-view is likely to be appropriate to his social group: to venture a guess, Star Trek, hipster culture, Starbucks, and so on. Thus the values of the neural network will simply reflect his unconscious biases.
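A deliberately crude sketch shows how a skew in the data becomes a skew in the decisions. The ‘model’ here is just a majority vote per group, nothing like a real neural network, and the groups, outcomes and counts are entirely invented for illustration.

```python
from collections import Counter, defaultdict

def train(examples):
    """Learn, for each group, the outcome seen most often in the data."""
    counts = defaultdict(Counter)
    for group, outcome in examples:
        counts[group][outcome] += 1
    return {group: c.most_common(1)[0][0] for group, c in counts.items()}

# Hypothetical historical records: group B is both under-represented
# and mostly rejected in the data the model is shown.
skewed = (
    [("A", "approve")] * 90 + [("A", "reject")] * 10
    + [("B", "approve")] * 2 + [("B", "reject")] * 8
)
model = train(skewed)
print(model)  # {'A': 'approve', 'B': 'reject'}
```

No one programmed a rule against group B; the model simply echoes the pattern in the records it was given.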

The New Scientist ran a story recently https://www.newscientist.com/article/mg23230970-200-playing-can-teach-autonomous-cars-how-to-drive/?utm_medium=Social&utm_campaign=Echobox&utm_source=Twitter&utm_term=Autofeed&cmpid=SOC%257CNSNS%257C2016-Echobox about how playing the computer game Grand Theft Auto could teach autonomous cars to drive. Yes, in the US, but it certainly will not teach a neural network how to drive a car in India. That may seem like a trivial example, but it is symptomatic of what happens in other domains as well.

Another article in the World Economic Forum site talked about how algorithms can be sexist too. https://www.weforum.org/agenda/2016/09/algorithms-can-be-sexist-too-this-is-what-we-can-do-about-it?utm_content=buffer96b2c&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer Thus there is a concern that subtle biases can creep into the decision making apparatus that we increasingly rely on: and it is not even overt, but purely inadvertent.

Another article from the WEF talks specifically about the ethics of self-driving cars https://www.weforum.org/agenda/2016/08/the-ethics-of-self-driving-cars-what-would-you-do/?utm_content=buffere4ce0&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer and how an MIT project actually gives normal people a chance to wrestle with the ethical dilemmas (in terms of doing the least damage to the fewest people, a Utilitarian approach, or the ‘do no harm’ approach of Isaac Asimov’s ‘Laws of Robotics’).

In a sense, this reminds me of how the Western world-view assumes that it is the one and only correct way of doing things or even imagining what to do. Thus, Dreamtime of the Australian aborigines or maya of Indic thought are seen as deviant, infantile, or even plain downright wicked when compared to the obvious ‘truth’ of Western Cartesian science, although that itself is full of unconscious dogma.

How do these Western notions of how things should be affect our lives? Go no further than the Indian Constitution and judicial system. In 1947, it was assumed that the Western paradigms were the best, and therefore we grafted them, even if not wholly suitable, onto our society. This has led to numerous fault lines, but that is beyond our scope here.

Western assumptions about almost everything have been deeply internalized by Indians, and it is a truism that nothing of indigenous origin is given any respect unless it is ‘digested’ (as Rajiv Malhotra would say), repackaged and sold to us: even yoga is facing this fate these days. So how will this work with biased artificial intelligence?

Will the deep learning algorithm in your smartphone or data analytics system or your car assume that the life of an Indian is worth less than the life of a white person, based on Gunga-Din stereotypes or downright racist films it has accidentally seen as part of its data set? If there is a quick decision that has to be made in a life-and-death situation, will your self-driving car choose to kill two Indians rather than one white person? Such philosophical questions take on an added edge when combined with racism and colonialism.

There is no simple answer to this, other than to develop indigenous neural networks that, for instance, have been fed relevant data sets that are appropriate to our environment, a revival of that old idea of ‘appropriate technology’. There’s no malign human intelligence out there skewing the world in unfortunate ways, but the information asymmetry in the real world may well end up affecting the virtual world as well.

And given our tendency to take as the gospel truth (do you see what I mean about bias in that very phrase?) anything that a computer, especially an obviously clever deep learning system, spews out, it would be wise for us to pause a moment before jumping in feet first. This may also go back to the wretched question of education in India – whether it would be better for us to study in Indian languages or in English which brings its own baggage – but it will not be long before biased AI will be a legal, and not only moral, issue.

1420 words, 15 November 2016

1480 words, updated 3 February 2017

